10.000 participants, 2 weeks, performance?

1 year 6 months ago #136590 by neuhauserich
neuhauserich created the topic: 10.000 participants, 2 weeks, performance?
We are running a rather simple questionnaire containing about 15 questions and generating some 300 bytes per user. The questionnaire is run by a customer as a service for their customers, typically as some 40 identical versions at the same time, with about 30 to 200 filled-in questionnaires within 2 weeks each. Runs fine.

Now the customer asks us for a special run that should accept about 10.000 users within several days. As it's strictly anonymous, even without invitations, I see no way to prevent, let's say, 1000 users from logging in at the same time. I have the strong feeling that PHP will collapse, and I have no idea how this will look to the users. Does anyone here have experience with this kind of problem?

Some months ago, our working group discussed the idea of capturing the questionnaire's HTML pages as LimeSurvey generates them, making some minor changes to the URLs they call, and implementing a CGI program (C, Perl) that saves the answers in a simple text file with fixed record length. Session control and fast access could be handled via the record number in the URLs. All the administrative services that LimeSurvey provides for running questionnaires would be lost, but we don't use them today anyway, and the resulting CSV file is the same as the one we currently use. The advantage would be that the database is no longer accessed.
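To make the fixed-record idea concrete, here is a minimal sketch in Python rather than C or Perl. The 300-byte record size comes from the figure above; the file name, the function names and the `q1=3;q2=yes` answer format are purely illustrative assumptions. It does not handle locking between simultaneous writers, which a real CGI program would need.

```python
import os
import tempfile

RECORD_SIZE = 300  # bytes per respondent, the rough figure from the post


def append_record(path: str) -> int:
    """Reserve one blank record at the end of the file and return its number."""
    # Unbuffered append mode: the write goes straight to the end of the file,
    # so tell() afterwards reports the new end position.
    with open(path, "ab", buffering=0) as f:
        f.write(b" " * RECORD_SIZE)
        return f.tell() // RECORD_SIZE - 1


def write_record(path: str, recno: int, answers: str) -> None:
    """Overwrite record `recno` with the answers, padded to the fixed length."""
    data = answers.encode("utf-8")
    if len(data) > RECORD_SIZE:
        raise ValueError("answers do not fit into one record")
    with open(path, "r+b") as f:
        f.seek(recno * RECORD_SIZE)  # direct access, no scanning of the file
        f.write(data.ljust(RECORD_SIZE, b" "))


def read_record(path: str, recno: int) -> str:
    """Read record `recno` back and strip the padding."""
    with open(path, "rb") as f:
        f.seek(recno * RECORD_SIZE)
        return f.read(RECORD_SIZE).rstrip(b" ").decode("utf-8")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "answers.dat")
    recno = append_record(path)          # would be handed out on the first page
    write_record(path, recno, "q1=3;q2=yes")
    print(recno, read_record(path, recno))
```

The record number returned by `append_record` is what would travel in the URLs, so every later page only needs one seek and one write.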

Would the esteemed members of this forum see this as a realistic plan, or are we missing some general problem?

Thanks and kind regards
Erich

1 year 6 months ago #136594 by jelo
jelo replied the topic: 10.000 participants, 2 weeks, performance?

erich wrote: I have the strong feeling that PHP will collapse, and I have no idea how this will look to the users. Does anyone here have experience with this kind of problem? Some months ago, our working group discussed the idea of capturing the questionnaire's HTML pages as LimeSurvey generates them, making some minor changes to the URLs they call, and implementing a CGI program (C, Perl) that saves the answers in a simple text file with fixed record length.

So you would code your own survey tool with HTML pages generated by LimeSurvey. You choose C or Perl over PHP because you see PHP as the bottleneck.

Which LS version is used? I would put my money on server hardware and web stack optimization. Why do you expect your own code to be a lot faster?

PHP and MySQL are pretty robust. When you hit the resource limits of the web stack, you mostly lose a few sessions and have more users who cannot get onto the survey at all. But you won't lose all interviews; a lot of users will still get through. It depends on how much delay users accept. When the incentive isn't high, they drop out earlier.

The bottleneck is often the session part of LimeSurvey. If the server gets a lot of users per second, a new session file is created for each of them.
Not sure if LS 2.5 has optimized the session creation. LS 2.06 will create big session files.

The other part is the web server: how many concurrent connections can it handle? That won't improve easily by using a cgi-bin with a C or Perl program. I'm not sure whether you really want to use cgi-bin or other handlers like fcgi.
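For a rough feel of what "concurrent connections" could mean here, a back-of-envelope estimate may help. This is only a sketch: the 10.000 respondents, 15 questions and 1000 simultaneous users are figures from the first post, while the three-day window, one question per page, 20 seconds of think time and 2 seconds of response time are assumptions. It counts page requests only, not CSS, images or scripts.

```python
def average_request_rate(respondents: int, pages_each: int, window_seconds: float) -> float:
    """Page requests per second if the load were spread evenly over the window."""
    return respondents * pages_each / window_seconds


def burst_request_rate(simultaneous_users: int, think_seconds: float,
                       response_seconds: float) -> float:
    """Requests per second while many users are active at once.

    Each active user repeats a cycle: wait for the page, then read and answer it.
    """
    return simultaneous_users / (think_seconds + response_seconds)


def connections_in_flight(request_rate: float, response_seconds: float) -> float:
    """Little's law: requests being served at once = arrival rate * time in system."""
    return request_rate * response_seconds


if __name__ == "__main__":
    # Assumed: 3-day window, one question per page.
    print(average_request_rate(10_000, 15, 3 * 24 * 3600))
    # Assumed: 20 s think time, 2 s response time.
    rate = burst_request_rate(1000, 20, 2)
    print(rate, connections_in_flight(rate, 2))
```

The point of the comparison is that the evenly spread average is tiny, while a burst of simultaneous users is what the web server limits have to cover.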

1 year 6 months ago #136645 by neuhauserich
neuhauserich replied the topic: 10.000 participants, 2 weeks, performance?
Firstly, many thanks for your post.

jelo wrote: So you code your own survey tool with HTML pages generated by LimeSurvey. You choose C or Perl over PHP because you see PHP as the bottleneck.


Not only PHP, though I have no idea how it behaves regarding session data at a high user count; my bigger doubts concern the database. It seems to me that every single page of a questionnaire is composed anew from the user data and the description of the questionnaire each time a page is called. This means several accesses to the database and would explain the additional response time of an estimated (!) 1 to 2 seconds. Maybe I'm wrong on this point, but I couldn't find anything like a ready-to-go HTML file in the database. This is not meant as criticism, but as a concern. Using already existing HTML pages and just filling in user data read with a single access to a text file is far from the flexibility of LimeSurvey, but should be much more efficient in a limited application.
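As an illustration of "existing HTML pages, just filled in with user data", here is a minimal Python sketch using the standard `string.Template` class. The page fragment, the `/cgi-bin/save` URL and the field name `q1` are hypothetical; a real page would be the HTML captured from LimeSurvey with placeholders put in by hand.

```python
import html
from string import Template

# Hypothetical pre-built page fragment; $recno and $q1 are the only dynamic parts.
PAGE = Template(
    '<form method="post" action="/cgi-bin/save?rec=$recno">\n'
    '  <label>Question 1 <input name="q1" value="$q1"></label>\n'
    "</form>"
)


def render(recno: int, answers: dict) -> str:
    """Fill the stored answers into the static page, escaping them for HTML."""
    safe = {name: html.escape(value, quote=True) for name, value in answers.items()}
    safe["recno"] = str(recno)
    return PAGE.substitute(safe)


if __name__ == "__main__":
    print(render(7, {"q1": "yes"}))
```

No database and no questionnaire description are touched while rendering; the only input is the user's own record.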

jelo wrote: PHP and MySQL are pretty robust. When you hit the resource limits of the web stack, you mostly lose a few sessions and have more users who cannot get onto the survey at all. But you won't lose all interviews; a lot of users will still get through. It depends on how much delay users accept. When the incentive isn't high, they drop out earlier.


Good to know that a complete collapse need not be expected. Did you ever experience such a reaction to overload? As a first solution, we are talking about using invitations to have some control over the timing.

jelo wrote: The other part is the web server: how many concurrent connections can it handle?


I have no idea about this and no idea about where to look. But yes, this question must also be answered.

jelo wrote: That won't improve easily by using a cgi-bin with a C or Perl program. I'm not sure whether you really want to use cgi-bin or other handlers like fcgi.


The idea behind cgi/fcgi is to save the overhead of PHP, but possibly there is not so much overhead, because session data are stored in files. Whatever we use, PHP, C or Perl, the program should be very small, because it will have less work to do. I hope.

Thank you very much for your helpful answer!
Kind regards,
Erich
