Bias in rand function for random number generation?
- lea_crtl
- Topic Author
- Offline
- New Member
I've observed a weird phenomenon. I defined a random variable in my survey using an equation type question, defined as follows:
{floor(rand(1,2.9999))}
This is meant to carry out some A/B (here, 1/2) testing. However, out of 65 responses, I recorded 43 "1" versus 22 "2". I would have expected this to be much more balanced. Is something wrong in my code? Or could it simply be bad luck that will even out with more respondents?
(I know there are other ways to do A/B testing in LimeSurvey, using randomization groups for example, but since I have already launched the survey I would like to avoid such major modifications.)
Thank you very much for your help!
- holch
- Offline
- LimeSurvey Community Team
- Posts: 11660
- Thank you received: 2742
Afaik PHP's rand already returns random integers, so rand(1,2) should be enough without floor().
www.php.net/manual/en/function.rand.php
Of course we can't rule out a bias in PHP's rand function (afaik no random number generator is 100% unbiased), but I would attribute the distribution you are seeing to chance rather than to a bias of this size.
Test it for n=1000 or n=10000 and see what happens.
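Besides collecting more responses, one way to put a number on "chance vs. bias" is an exact binomial test. The sketch below is plain Python, not LimeSurvey code. It assumes every respondent is an independent, fair 50/50 draw, and the function names are made up for illustration.

```python
from math import comb


def upper_tail(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5), computed exactly."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def two_sided_p(k: int, n: int) -> float:
    """Probability of a split at least as lopsided as k vs. n - k under a fair draw."""
    larger = max(k, n - k)
    # The fair binomial is symmetric, so both tails carry the same probability.
    return min(1.0, 2 * upper_tail(larger, n))


if __name__ == "__main__":
    # Numbers from the first post: 43 times "1" and 22 times "2" out of 65.
    print(f"two-sided p-value for a 43/22 split: {two_sided_p(43, 65):.4f}")
```

A small value only says that such a split would be rare under a fair draw. It says nothing about where a bias would come from.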
I answer at the LimeSurvey forum in my spare time, I'm not a LimeSurvey GmbH employee.
No support via private message.
- holch
- LimeSurvey Community Team
You need to prevent that by checking whether the equation has already been filled: if so, keep the current number, and only generate a random number if it is empty.
There are many examples here in the forum how to do this.
- Joffm
- Online
- LimeSurvey Community Team
- Posts: 12941
- Thank you received: 3979
LimeSurvey uses the Mersenne Twister:
en.wikipedia.org/wiki/Mersenne_Twister
Joffm
Volunteers are not paid.
Not because they are worthless, but because they are priceless
- lea_crtl
- Topic Author
- Offline
- New Member
- Posts: 16
- Thank you received: 0
But you're right: I'm going to use rand(1,2) instead. And I will wait for more respondents to see if it balances.
Thanks again!
Léa
- holch
- LimeSurvey Community Team
"Thank you for this reply. Since I only use conditions based on this equation (e.g. I display a question group if x == 1), the dice is not rolled multiple times."
I think it is. Afaik the condition of the question will call the equation, and the function is called again. Let's see if Joffm agrees. I think he has tested this in practice in the past. I think I have tested it too, but I'm not sure anymore.
But you can easily test it. Create a small sample survey, let LimeSurvey generate the random number, then on another page create a condition for two questions, one shown if 1, the other if 2, and note down the results.
- holch
- LimeSurvey Community Team
Then I went through the survey 4 times.
Results:
1) (2) --> (2)
2) (2) --> (1)
3) (1) --> (2)
4) (2) --> (1)
Sorry for being the bearer of bad news. But better now than later. Do your own tests, but I think you will have to start over.
- Joffm
- LimeSurvey Community Team
But:
If your construct "rand(1,2)" is in a group with other questions, the value changes each time you click.
This is the same behaviour that you see in Excel, where a random number changes each time you enter something in any cell of the sheet.
In many cases this doesn't matter.
If you later display some groups with "randnum==1" or "randnum==2", you can see in your data which group was answered.
So the stored value of randnum is not really relevant.
BUT it is really relevant if you use tailoring, e.g. display images with something like <img src=".../myimage{randnum}.jpg" />
Therefore we use this construct {if(is_empty(randnum),rand(1,2),randnum)}
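The effect of that construct can be modelled outside LimeSurvey. The Python sketch below is only an illustration: `evaluate_equation` and `simulate_pages` are hypothetical names, `None` stands in for an empty answer, and it assumes the equation is re-evaluated on every page load, as described above.

```python
import random


def evaluate_equation(stored, rng, sticky):
    """Model one evaluation of the equation question.

    sticky=True mimics if(is_empty(randnum), rand(1,2), randnum):
    an already stored value is kept instead of being redrawn.
    sticky=False mimics a bare rand(1,2), which redraws every time.
    """
    if sticky and stored is not None:
        return stored
    return rng.randint(1, 2)  # inclusive on both ends, like rand(1,2)


def simulate_pages(page_loads, sticky, rng):
    """Return the value the equation holds after each page load."""
    value = None  # nothing stored before the first evaluation
    history = []
    for _ in range(page_loads):
        value = evaluate_equation(value, rng, sticky)
        history.append(value)
    return history


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only to make the demo repeatable
    print("bare rand(1,2):  ", simulate_pages(8, sticky=False, rng=rng))
    print("is_empty guarded:", simulate_pages(8, sticky=True, rng=rng))
```

With the guard, every entry in the history equals the first draw, which is what tailoring needs.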
Coming back to your distribution. As @holch said, it is not really surprising in a sample of N=65.
Did you do a test in Excel?
I did.
Here is the result of 50 runs (image not shown).
If you really have only a small sample, you have to monitor your survey.
If you notice that group 1 is full, you can always change the equation to "rand(2,2)" so that the rest of your participants are routed to group 2.
- lea_crtl
- Topic Author
And although I don't use tailoring for this specific variable, I do for others. I'm going to use your piece of code to circumvent the problem.
Thank you very much to both of you!
- holch
- LimeSurvey Community Team
"So if I understood you well the condition == 1 call does not change the value of the variable, right?"
In my opinion it does. Well, it doesn't necessarily change the number. It just triggers a new draw, which in your case has a 50/50 chance of changing the number or leaving it as is.
That is because in your case the condition doesn't refer to a stored number, but to a question which contains a formula.
What is strange is that you don't seem to see any changes in your test. In my opinion you should, because with every call to the equation the dice is rolled again unless you use the if(is_empty(...)) construct that Joffm posted. My test shows this too: in some cases the random number was 2 at the beginning, and on the next page, where the conditions were, it was a different one.
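The 50/50 figure follows from simply listing the cases. A tiny Python sketch, assuming the second evaluation is an independent rand(1,2) draw (this is the behaviour being argued here, not something confirmed from LimeSurvey's source):

```python
from itertools import product

VALUES = (1, 2)  # possible results of rand(1,2)

# Every (first draw, redraw) pair; all are equally likely for independent fair draws.
pairs = list(product(VALUES, repeat=2))

# Pairs where the redraw differs from the value seen on the first page.
changed = [(first, second) for first, second in pairs if first != second]

change_probability = len(changed) / len(pairs)

if __name__ == "__main__":
    print("all pairs:    ", pairs)
    print("changed pairs:", changed)
    print("P(value changes on a redraw) =", change_probability)
```

So over many respondents, roughly half of the unguarded redraws would show a different number than the one stored first.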
I will have a look at your survey later, but the group numbers already scare me a little....
- holch
- LimeSurvey Community Team
"The random variable is called "randDecision" and groups 24412 to 24424 and 24413 to 24433 are called only if it is equal to 1. Groups 25262 to 25271 and 25272 to 25281 are called only if it is equal to 2."
Those numbers don't help at all, because they are group IDs created by your LimeSurvey installation, and obviously the IDs in my installation are different from those in yours. I would need the names of the groups.
And you have about 5 different equations that generate a random number. I'm not sure I understand why.
It will take a while to understand what you are doing there.