TOPIC: Support complex conditions sans system slowdown via optional equation parser

Support complex conditions sans system slowdown via optional equation parser 3 years 5 months ago #61161

  • TMSWhite
  • LimeSurvey Team
  • Posts: 759
  • Thank you received: 82
  • Karma: 36
There seem to be many posts over the last few months seeking ways to support more complex conditional logic. Some requests have included (a) branching based on the sum of response values without needing Assessments, (b) allowing parentheses within conditions, and (c) allowing other comparison operators.

That sort of functionality is especially important for long surveys (for example, most of the ones I use have 300-3500 questions). And there must be a way to support complex conditions without slowing down system performance.

However, supporting this sort of functionality may require that users choose between the current GUI-based approach for specifying conditions and a new equation-like syntax for specifying the conditional logic.

I'm trying to help support such functionality by embedding an optional equation parser into LimeSurvey. More details can be found here: bugs.limesurvey.org/view.php?id=5103 . I've done this before (using the JavaCC compiler-compiler) in another product, so I know it works efficiently and end-users can easily encode and debug very complex conditions. My goal is to port that sort of functionality over to LimeSurvey.

I'd like to use this Forum thread as a way to collect feedback from developers and users as I learn the LimeSurvey code-base and start proposing a strategy.
Last Edit: 3 years 5 months ago by TMSWhite.
The administrator has disabled public write access.

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62067

  • TMSWhite
This first stage of the enhancement is complete - give it a try!

The embedded equation parser works, letting you do conditional micro-tailoring of questions (e.g. conjugating verbs and declining nouns) and embed complex scored results (e.g. as an alternative to using the Assessments module). The next steps include creating a new type of question for hidden equations, and optionally having LimeSurvey do branching and conditional question visibility using the equation parser (e.g. as an alternative to using conditions).

You can download the patch and a sample survey showing its capabilities here: bugs.limesurvey.org/view.php?id=5103

The sample survey asks four questions (name, age, numKids, numPets) in Group 1, then asks this question in the next Group:

{ANS:name}, you said that you are {ANS:age} years old, and that you have {ANS:numKids} {EVAL:if((numKids==1),'child','children')} and {ANS:numPets} {EVAL:if((numPets==1),'pet','pets')} running around the house. So, you have {EVAL:numKids + numPets} wild {EVAL:if((numKids + numPets ==1),'beast','beasts')} to chase around every day.

Since you have more {EVAL:if((numKids > numPets),'children','pets')} than you do {EVAL:if((numKids > numPets),'pets','children')}, do you feel that the {EVAL:if((numKids > numPets),'pets','children')} are at a disadvantage?


The {EVAL: } statements let you write arbitrarily complex equations and provide safe, read-only access to all variables (by SGQA code or by question.title as shown here), plus safe access to about 100 internal PHP math and string processing functions. You can also add your own functions. That's how I added the Excel-like if(test,do_if_true,do_if_false) function used above to display the singular or plural values for child/children, pet/pets, and beast/beasts.
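
To make the behaviour of the Excel-like if() concrete, here is a minimal sketch in Python (not the PHP from the patch). The names if_ and tailor are hypothetical, and Python f-strings stand in for the {ANS:} and {EVAL:} tags.

```python
def if_(test, if_true, if_false):
    """Excel-like if(): return if_true when test is truthy, else if_false."""
    return if_true if test else if_false


def tailor(name, num_kids, num_pets):
    """Build a tailored sentence; each if_() call plays the role of one {EVAL:} tag."""
    total = num_kids + num_pets
    return (
        f"{name}, you have {num_kids} {if_(num_kids == 1, 'child', 'children')} "
        f"and {num_pets} {if_(num_pets == 1, 'pet', 'pets')}, "
        f"so {total} wild {if_(total == 1, 'beast', 'beasts')} to chase."
    )


print(tailor("Ann", 1, 2))
# prints: Ann, you have 1 child and 2 pets, so 3 wild beasts to chase.
```

Note that, like a spreadsheet formula, this sketch evaluates both branches before choosing one, which is harmless for plain strings.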
Last Edit: 3 years 4 months ago by TMSWhite.

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62080

  • lemeur
  • LimeSurvey Team
  • Posts: 31
  • Karma: 15
Hi TMSWhite,

This is a great idea, and in fact this was the subject of a GSoC idea (docs.limesurvey.org/Project+ideas+for+GS...itions_in_LimeSurvey)
and was already discussed for LS2 in:
docs.limesurvey.org/tiki-index.php?page=...010+Condition+engine
docs.limesurvey.org/Expression+engine+for+conditions

Your work so far is a wonderful improvement over the "static" LimeReplacementFields feature (INSERTANS, ...), but from the description in your last post I don't see how it could be used for show/hide branching. Could you explain a little more (before I try to work it out from your patch)?

TIA,
Thibault

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62111

  • TMSWhite
Thibault,

Thanks for pointing me to those links. Sounds like we're thinking in a similar direction. The expression parser I've created is a full-featured recursive descent parser, so you don't need to worry about Reverse Polish Notation, you can easily add functions (one line of code per function), and you can register known variables (e.g. LimeReplacementFields).
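
For readers unfamiliar with the technique, here is a minimal sketch of a recursive descent evaluator with registered variables and functions. It is written in Python, not PHP, and is not the code from the patch: the Parser class, the grammar (comparison, additive, term, factor) and the token rules are all assumptions for illustration, and unary minus is left out for brevity.

```python
import operator
import re

TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|'([^']*)'|(\w+)|(==|!=|<=|>=|[-+*/<>(),]))")
COMPARE = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
           ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def tokenize(text):
    """Split an expression into (kind, value) pairs."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        num, string, name, op = m.groups()
        if num is not None:
            tokens.append(("num", float(num)))
        elif string is not None:
            tokens.append(("str", string))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("op", op))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, variables, functions):
        self.variables = variables  # name -> value, only ever read
        self.functions = functions  # name -> callable, the only callables exposed

    def evaluate(self, text):
        self.tokens, self.pos = tokenize(text), 0
        value = self.comparison()
        if self.pos != len(self.tokens):
            raise SyntaxError("unexpected trailing tokens")
        return value

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def accept(self, *ops):
        kind, value = self.peek()
        if kind == "op" and value in ops:
            self.pos += 1
            return value
        return None

    def expect(self, op):
        if self.accept(op) is None:
            raise SyntaxError(f"expected {op!r}")

    def comparison(self):  # additive [ compare-op additive ]
        left = self.additive()
        op = self.accept(*COMPARE)
        return left if op is None else COMPARE[op](left, self.additive())

    def additive(self):  # term { (+|-) term }
        value = self.term()
        while (op := self.accept("+", "-")):
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):  # factor { (*|/) factor }
        value = self.factor()
        while (op := self.accept("*", "/")):
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):  # literal | variable | name(args) | ( comparison )
        if self.accept("("):
            value = self.comparison()
            self.expect(")")
            return value
        kind, value = self.peek()
        self.pos += 1
        if kind in ("num", "str"):
            return value
        if kind == "name":
            if not self.accept("("):
                return self.variables[value]
            args = []
            if not self.accept(")"):
                args.append(self.comparison())
                while self.accept(","):
                    args.append(self.comparison())
                self.expect(")")
            return self.functions[value](*args)
        raise SyntaxError("unexpected token")


parser = Parser({"numKids": 2, "numPets": 1},
                {"if": lambda test, a, b: a if test else b})
parser.functions["max"] = max  # registering a function is one line
print(parser.evaluate("if((numKids + numPets == 1),'beast','beasts')"))  # prints beasts
```

Each grammar rule is one method that calls the rule below it, which is why operator precedence and parentheses fall out naturally and no conversion to Reverse Polish Notation is needed.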

I've been using this approach for over a decade with epidemiologists, and they (and the high school students and interns they use to encode the surveys) have found it very easy to encode their branching logic and micro-tailoring using that syntax.

Here's the quick overview about how to use this expression parser for branching logic and hide/show values - I'll provide more detail in the bug tracker.

The attached file is a 12-slide extract of a 120-slide tutorial I gave to the American Medical Informatics Association in 2007. It provides a scientific analysis of the types of branching/complex navigation and micro-tailoring needed, and how this sort of expression parser can help solve most of those needs.

So, briefly:
(1) Using Relevance to control question visibility, server-side. From slide 12, every question has an associated Relevance field (the default being "true"). For each page to be displayed to the user, the engine:
  1. collects the next group of possible questions
  2. evaluates the relevance equation for each question in that group
  3. if there is at least one question in that group whose relevance is true, pass that set of questions to the rendering engine for token replacement and display
  4. if no questions in the group are relevant, repeat from the first step
  5. if there are no more groups of questions, then the survey is done

So, the relevance equation serves the same purpose as conditions, but supports more complex relevance while also being easier to read and understand by research directors. The only downside is that users may need to hand-type these rather than use a graphical user interface.
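
The server-side loop in steps 1 to 5 can be sketched roughly as follows, in Python rather than PHP. The function name, the question dictionaries and the use of plain callables as relevance equations are assumptions for illustration. The sketch also assumes that only the relevant questions of a group are passed on for rendering.

```python
def next_relevant_group(groups, start, is_relevant):
    """Return (group index, relevant questions) for the next group that has at
    least one relevant question, or None when no groups remain."""
    for index in range(start, len(groups)):
        visible = [question for question in groups[index] if is_relevant(question)]
        if visible:
            return index, visible  # hand these to the rendering engine
    return None  # no more groups: the survey is done


answers = {"numKids": 0, "age": 30}
groups = [
    [{"code": "kidNames", "relevance": lambda a: a["numKids"] > 0}],
    [{"code": "job", "relevance": lambda a: a["age"] >= 18},
     {"code": "school", "relevance": lambda a: a["age"] < 18}],
]
result = next_relevant_group(groups, 0, lambda q: q["relevance"](answers))
print(result[0], [q["code"] for q in result[1]])  # prints: 1 ['job']
```

Here the first group is skipped entirely because its only question is irrelevant for a respondent with no children.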

(2) Using equation parser to Show/Hide questions client-side.
I haven't built this in yet, but now that we have our own equation parser, it would not be hard to extend it to generate JavaScript code from the equation. The equation parser already validates the equation, so we can ensure that only valid JavaScript would be generated. The upgrade process would include:
  1. Replace all references to variableName with the needed DOM object syntax for that variable
  2. Extend the registeredFunction syntax to include names of the JavaScript functions that are equivalent to the PHP ones; and add any missing JavaScript functions to a functions.js file
  3. Create an anonymous JavaScript function for each relevance equation - so Javascript would do the math directly without needing an eval() statement
  4. Make these functions control the visibility of their associated questions
  5. Add an on_blur or on_change event that recomputes the value of each JavaScript relevance equation each time a user changes or leaves a field
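
As a rough illustration of steps 1 to 4, here is a sketch of a translator that turns an already validated expression into the text of an anonymous JavaScript function. It is written in Python, and everything in it is hypothetical: the to_javascript name, the JS_FUNCTIONS table, the DOM ids, and the assumption that every variable is numeric.

```python
import re

# Hypothetical mapping from expression function names to JavaScript equivalents.
JS_FUNCTIONS = {"abs": "Math.abs", "max": "Math.max", "min": "Math.min"}


def to_javascript(expression, variables, question_id):
    """Return JavaScript source for a function that shows or hides one question.

    variables maps each variable name to the (assumed) DOM id of its input.
    """
    def replace(match):
        token = match.group(0)
        if token.startswith("'"):
            return token  # leave string literals untouched
        if token in variables:
            return f"Number(document.getElementById('{variables[token]}').value)"
        return JS_FUNCTIONS.get(token, token)

    # Match quoted strings first so identifiers inside them are not rewritten.
    body = re.sub(r"'[^']*'|[A-Za-z_]\w*", replace, expression)
    return (
        f"function() {{ document.getElementById('{question_id}').style.display = "
        f"({body}) ? '' : 'none'; }}"
    )


print(to_javascript(
    "numKids + numPets > 0",
    {"numKids": "answer_numKids", "numPets": "answer_numPets"},
    "question_beasts",
))
```

The generated function would then be attached to the change or blur handlers of the inputs it reads (step 5), so the browser re-evaluates relevance without any eval() call.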

The same process could be used to automate the display of assessment scale results (e.g. if you have a display-only field and want its value to be dynamically updated on the screen as the user fills out other sections of the screen).

In my own tool, I only ever used the server-side strategy. Given the experience the LimeSurvey team has in managing question visibility using JavaScript, I think this would be a nice feature to add, and would be pretty easy to do (easier, certainly, than building the expression parser in the first place).

/Tom

File Attachment:

File Name: AMIA 2007 Tutorial - subset for LimeSurvey.zip
File Size: 241 KB
Last Edit: 3 years 4 months ago by TMSWhite. Reason: last try to upload file - this time zipped .ppt

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62112

  • lemeur
TMSWhite wrote:
Thibault,

Thanks for pointing me to those links. Sounds like we're thinking in a similar direction.

Yes, seems so :)
This project is so old that I was beginning to think it was a utopia :silly:
TMSWhite wrote:
The expression parser I've created is a full featured recursive descent parser - so you don't need to worry about Reverse Polish Notation, and you can easily add functions (one line of code per function to add), and register known variables (e.g. LimeReplacementFields).

I've only had a very quick 30-second look at your code and I agree that your approach is very interesting. It would take me a lot of time to fully review your code because your coding skills are far ahead of mine actually :)
I'll focus on checking that it can be integrated in all cases (mostly all-in-one surveys, group-per-page surveys and in the future assessments).
TMSWhite wrote:
I've been using this approach for over a decade with epidemiologists, and they (and the high school students and interns they use to encode the surveys) have found it very easy to encode their branching logic and micro-tailoring using that syntax.

Yes I agree.
However, I don't think that building a GUI for your syntax would be impossible.
So we could have both a "text" expression, and a GUI that would translate the graphical input to this reference syntax.
TMSWhite wrote:
The attached file is a 16-slide extract of a 120-slide tutorial I gave to the American Medical Informatics Association in 2007. It provides a scientific analysis of the types of branching/complex navigation and micro-tailoring needed, and how this sort of expression parser can help solve most of those needs.
I see no attachment :( Maybe in the bugtracker?

==> Seeing it now. Thx
TMSWhite wrote:
(1) Using Relevance to control question visibility, server-side. From slide 16, every question has an associated Relevance field (the default being "true"). For each page to be displayed to the user, the engine:
  1. collects the next group of possible questions
  2. evaluates the relevance equation for each question in that group
  3. if there is at least one question in that group whose relevance is true, pass that set of questions to the rendering engine for token replacement and display
  4. if no questions in the group are relevant, repeat from the first step
  5. if there are no more groups of questions, then the survey is done

Yes, very interesting. It looks to me like exactly what I would like to see for assessments: using internal variables to control the survey behaviour.
I like it very much.
TMSWhite wrote:
So, the relevance equation serves the same purpose as Conditions, but supports more complex relevance while also being easier to read and understand by research directors. The only downside is that users may need to hand-type these rather than use a graphical user interface.
Maybe we could think of a limited GUI which could at least handle a part of the syntax because we can't afford to "lose" our user base ;-)
TMSWhite wrote:
(2) Using equation parser to Show/Hide questions client-side.
I haven't built this in yet, but now that we have our own equation parser, it would not be hard to extend it to generate JavaScript code from the equation. The equation parser already validates the equation, so we can ensure that only valid JavaScript would be generated. The upgrade process would include:
  1. Replace all references to variableName with the needed DOM object syntax for that variable
  2. Extend the registeredFunction syntax to include names of the JavaScript functions that are equivalent to the PHP ones; and add any missing JavaScript functions to a functions.js file
  3. Create an anonymous JavaScript function for each relevance equation - so Javascript would do the math directly without needing an eval() statement
  4. Make these functions control the visibility of their associated questions
  5. Add an on_blur or on_change event that recomputes the value of each JavaScript relevance equation each time a user changes or leaves a field
Yes, but this is still quite some work... unless you, as the software power-designer I know you are, could work on it!
TMSWhite wrote:
In my own tool, I only ever used the server-side strategy. Given the experience the LimeSurvey team has in managing question visibility using JavaScript, I think this would be a nice feature to add,
It is a required feature in fact. Implementing this without "runtime on-page" condition evaluation would not be enough in my opinion.

Of course, this could be a little more difficult if write-access variables are created.
TMSWhite wrote:
and would be pretty easy to do (easier, certainly, than building the expression parser in the first place).
Sure, would you help?

Thibault
Last Edit: 3 years 4 months ago by lemeur.

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62114

  • TMSWhite
Thibault-

Yes, I'd be happy to help with getting the JavaScript side working. I may need some guidance to ensure I get the DOM objects named correctly for all question types.

Here's a list of things you might want to test:

(1) Unit Testing of parser (http://localhost/limesurvey/classes/eval/Test_ExpressionManager_Evaluate.php)
  • Any "failed" test appears in red. The current set of failures are actually OK, as they are the result of rounding errors
  • If you want to add more tests, put them in the ExpressionManager::UnitTestEvaluator()
  • Is there a better way to register external functions? I initially put them in ExpressionManagerFunctions.php, but I couldn't get that file to be properly included() from dEvalFunction.php

(2) Integration Testing (classes/dTexts/dFunctions/dFunctionEval.php). See TODO comments in code:
  1. Is there an existing function that gets the full list of variable names and values - esp. including SubQuestions?
  2. Will this approach create and configure a single new ExpressionManager once per page request (as opposed to multiple times per page request, or only once per survey session)?

/Tom

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62119

  • TMSWhite
To help assess interest in this feature, I'm going to point Forum threads with similar goals to this thread so that we can broaden the discussion. The goal is to get a comprehensive list of places where end-users want to be able to apply conditional branching, visibility, validation, message tailoring, etc.

As a start, I've seen need for expression parsing in the following places:

(1) Deciding which questions to ask.
  • Question-Level relevance (so you could turn a question on or off within a group - e.g. when asking about the "bothersomeness" of depressive symptoms, first ask 15 other questions about depression, compute a depression scale based upon those items, then only ask this follow-up question if the depression score is high enough for the last 2 weeks)
  • Group-Level relevance (only ask any of the questions in the group if the group itself is relevant - e.g. U.S. Census - how many people live in the house - for each person, ask the following 20 questions; but only ask the subset of those 20 that are relevant for that particular person)

(2) Tailoring the Question text (or instructional message)
  • Conditional insertion of values or phrases
  • Proper spelling of plurals (e.g. "your child" vs. "your children", depending upon how many children the person has)
  • Conjugating verbs and declining nouns (e.g. you singular/plural for non-English verbs; gender matching of adjective suffix in Spanish)

(3) Tailoring the Answer Choices
  • Conditionally changing the wording of an Answer - just like (2)
  • Conditional adding to answer list (e.g. in a survey asking about stressors, if person fills out an "other-specify" question, and survey wants to rank severity of stressors, each "other-specify" answer should appear as one of the ranking options)
  • Conditional removal of options from answer list (e.g. filter answer list to only include ones relevant for the person's gender)

(4) Tailoring Conditional Validation
  • Set min and/or max range for a numeric input (e.g. when asking number of years of education, set max = year_part(today() - date_of_birth) - 5)
  • Set min and/or max range for dates (e.g. child's birth date must be at least 10 years later than mother's birthdate [and hopefully considerably longer than that])
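
The two validation examples in (4) can be sketched like this in Python. The helper names are hypothetical, and year_part(today() - date_of_birth) is interpreted here as the number of whole years elapsed, which is an assumption.

```python
from datetime import date


def years_between(earlier, later):
    """Whole years elapsed from earlier to later."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1  # the anniversary has not been reached yet this year
    return years


def max_years_of_education(date_of_birth, today):
    """Upper bound for a years-of-education answer: age in whole years minus 5."""
    return years_between(date_of_birth, today) - 5


def valid_child_birth_date(mother_birth, child_birth, min_gap_years=10):
    """True when the child was born at least min_gap_years after the mother."""
    return years_between(mother_birth, child_birth) >= min_gap_years


print(max_years_of_education(date(1980, 6, 15), date(2011, 6, 14)))  # prints 25
```

In both cases the limit depends on an earlier answer, which is why a fixed min/max setting is not enough and an expression is needed.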

/Tom

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62120

  • lemeur
TMSWhite wrote:
Thibault-

Yes, I'd be happy to help with getting the JavaScript side working. I may need some guidance to ensure I get the DOM objects named correctly for all question types.

I think that what you need is the retrieveJSidname function defined in qanda.php.
It is used in all-in-one surveys (handled by survey.php) and group-per-page surveys (handled by group.php).
Here's a list of things you might want to test:
...

Thanks, I'll try to find the time to test this. I must admit that I have very little free time these days (newborn@home), but I try to get back to the active LS life when possible ;-)

Many thanks for your wonderful work. It's really a wonderful contribution IMHO and it brings an old dream to life ;-)

Thibault

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62125

  • lemeur
TMSWhite wrote:

(3) Tailoring the Answer Choices
    ...
  • Conditional adding to answer list (e.g. in a survey asking about stressors, if person fills out an "other-specify" question, and survey wants to rank severity of stressors, each "other-specify" answer should appear as one of the ranking options)
  • Conditional removal of options from answer list (e.g. filter answer list to only include ones relevant for the person's gender)

This is a difficult part for now in LS1 because we set the response table structure at activation time. This means that we can't add/remove columns from the response table once activated. For instance for a multiple options question having 4 option-checkboxes, we need to have a fixed number of 4 columns in DB (this is different from single choice questions).

One goal of LS2 was to support cycling questions (dynamically adding columns depending on previous answers). This will be ported to LS1 in the future (I'm not sure whether the current "DBengine porting to CodeIgniter" GSoC project will handle this or not).

Re: Support complex conditions sans system slowdown via optional equation parser 3 years 4 months ago #62127

  • TMSWhite
Thibault-

Even LS1 database structure might work for this.
What I've done in the past for such optional answer choices is to set a maximum limit and create columns for each of them. So, if there were up to 10 "other" options, I'd have columns for each of them, but most of them would be null.
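
A minimal sketch of that fixed-maximum approach, in Python. The MAX_OTHER constant, the column naming scheme and the function name are hypothetical. The point is that the column set is known at activation time and unused slots stay null.

```python
MAX_OTHER = 10  # upper limit chosen when the response table is created


def other_columns(question_code, answers):
    """Map up to MAX_OTHER free-text answers onto a fixed set of columns,
    leaving the unused columns as None (NULL in the database)."""
    if len(answers) > MAX_OTHER:
        raise ValueError(f"at most {MAX_OTHER} 'other' answers are supported")
    padded = list(answers) + [None] * (MAX_OTHER - len(answers))
    return {f"{question_code}_other{i + 1}": value for i, value in enumerate(padded)}


row = other_columns("stressor", ["commute", "noise"])
print(row["stressor_other1"], row["stressor_other2"], row["stressor_other3"])
# prints: commute noise None
```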

Another option LS might consider (after other priorities calm down) is an Entity-Attribute-Value database design for responses. I started with the model that LS1 is currently using (and still use it as one of two ways I simultaneously store the result). However, to add flexibility, I have generic Datum and ItemUsage tables in addition to an InstrumentSession table. Say I have an InstrumentSession with 1000 variables: when I start a new one, I insert 1000 rows into the Datum table. Then, each time I collect new data, I store Datum-like values in the ItemUsage table, one row per question asked per page. This way I have a complete audit history in case respondents want to go back and change their answers, but the Datum rows always hold the most recent value for a given question. For one survey, hosted on an Amazon EC2 small instance with MySQL, the system has been live for nearly 4 years, has handled up to 50,000 surveys in a single day, and still has sub-second response time despite having nearly 500,000 completed surveys and 40 million Datum entries. So, if that system can support an Entity-Attribute-Value style storage of survey responses, LimeSurvey could too.
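
The Datum/ItemUsage idea can be sketched as follows, using Python with an in-memory SQLite database rather than MySQL. The table and column names are guesses based on the description above, not the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datum (          -- one row per variable per session: latest value
    session_id INTEGER, variable TEXT, value TEXT,
    PRIMARY KEY (session_id, variable));
CREATE TABLE item_usage (     -- one row per question asked per page: audit trail
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id INTEGER, variable TEXT, value TEXT);
""")


def start_session(session_id, variables):
    """Pre-create one empty datum row for every variable in the instrument."""
    conn.executemany("INSERT INTO datum VALUES (?, ?, NULL)",
                     [(session_id, name) for name in variables])


def record_answer(session_id, variable, value):
    """Append to the audit history and overwrite the current value."""
    conn.execute(
        "INSERT INTO item_usage (session_id, variable, value) VALUES (?, ?, ?)",
        (session_id, variable, value))
    conn.execute(
        "UPDATE datum SET value = ? WHERE session_id = ? AND variable = ?",
        (value, session_id, variable))


start_session(1, ["age", "numKids"])
record_answer(1, "age", "30")
record_answer(1, "age", "31")  # respondent went back and changed the answer
print(conn.execute(
    "SELECT value FROM datum WHERE session_id = 1 AND variable = 'age'"
).fetchone()[0])  # prints 31
```

Adding a question to a live survey then means adding rows, not columns, which is the flexibility the Entity-Attribute-Value layout buys.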

/Tom