TOPIC: Support complex conditions sans system slowdown via optional equation parser

Support complex conditions sans system slowdown via optional equation parser 5 years 3 months ago #61161

  • TMSWhite
There seem to be many posts over the last few months seeking ways to support more complex conditional logic. Some requests have included (a) branching based on the sum of response values without needing Assessments, (b) allowing parentheses within conditions, and (c) allowing other comparison operators.

That sort of functionality is especially important for long surveys (for example, most of the ones I use have 300-3500 questions). And there must be a way to support complex conditions without slowing down system performance.

However, supporting this sort of functionality may require that users choose between the current GUI-based approach for specifying conditions and a new equation-like syntax for specifying the conditional logic.

I'm trying to help support such functionality by embedding an optional equation parser into LimeSurvey. More details can be found here: bugs.limesurvey.org/view.php?id=5103 . I've done this before (using the JavaCC compiler-compiler) in another product, so I know it works efficiently and end-users can easily encode and debug very complex conditions. My goal is to port that sort of functionality over to LimeSurvey.

I'd like to use this Forum thread as a way to collect feedback from developers and users as I learn the LimeSurvey code-base and start proposing a strategy.
Last Edit: 5 years 3 months ago by TMSWhite.

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62067

  • TMSWhite
This first stage of the enhancement is complete - give it a try!

The embedded equation parser works, letting you do conditional micro-tailoring of questions (e.g. conjugating verbs and declining nouns) and embed complex scored results (e.g. as an alternative to using the Assessments module). The next steps include creating a new type of question for hidden equations, and optionally having LimeSurvey do branching and conditional question visibility using the equation parser (e.g. as an alternative to using conditions).

You can download the patch and a sample survey showing its capabilities here: bugs.limesurvey.org/view.php?id=5103

The sample survey asks four questions (name, age, numKids, numPets) in Group 1, then asks this question in the next Group:

{ANS:name}, you said that you are {ANS:age} years old, and that you have {ANS:numKids} {EVAL:if((numKids==1),'child','children')} and {ANS:numPets} {EVAL:if((numPets==1),'pet','pets')} running around the house. So, you have {EVAL:numKids + numPets} wild {EVAL:if((numKids + numPets ==1),'beast','beasts')} to chase around every day.

Since you have more {EVAL:if((numKids > numPets),'children','pets')} than you do {EVAL:if((numKids > numPets),'pets','children')}, do you feel that the {EVAL:if((numKids > numPets),'pets','children')} are at a disadvantage?


The {EVAL: } statements let you write arbitrarily complex equations and provide safe, read-only access to all variables (by SGQA code or by question.title as shown here), plus safe access to about 100 internal PHP math and string processing functions. You can also add your own functions. That's how I added the Excel-like if(test,do_if_true,do_if_false) function used above to display the singular or plural values for child/children, pet/pets, and beast/beasts.
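To make the tailoring idea concrete, here is a minimal sketch in Python (not the PHP patch itself) of how {ANS:...} and {EVAL:...} tokens could be expanded. It assumes a whitelist of operators and a small table of registered functions; the names evaluate, expand and FUNCTIONS are made up for illustration, and Python's ast module stands in for the real parser.

```python
import ast
import operator
import re

# Operators this sketch allows inside {EVAL: } expressions (an assumption).
BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
CMP_OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne,
           ast.Gt: operator.gt, ast.Lt: operator.lt,
           ast.GtE: operator.ge, ast.LtE: operator.le}

# Registered functions, one entry per function. "if" is the Excel-like helper.
FUNCTIONS = {"if": lambda test, a, b: a if test else b,
             "abs": abs, "max": max, "min": min}


def evaluate(expr, variables):
    """Evaluate an expression with read-only access to `variables`."""
    # "if" is a Python keyword, so calls to it are renamed before parsing.
    # Caveat: this would also rewrite "if(" inside a string literal.
    tree = ast.parse(re.sub(r"\bif\s*\(", "if_(", expr.strip()), mode="eval")

    def walk(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            return BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            compare = CMP_OPS[type(node.ops[0])]
            return compare(walk(node.left), walk(node.comparators[0]))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            name = node.func.id.rstrip("_")
            return FUNCTIONS[name](*[walk(arg) for arg in node.args])
        raise ValueError("unsupported syntax in expression")

    return walk(tree.body)


def expand(text, variables):
    """Replace {ANS:name} and {EVAL:expr} tokens in question text."""
    def repl(match):
        kind, body = match.group(1), match.group(2)
        if kind == "ANS":
            return str(variables[body.strip()])
        return str(evaluate(body, variables))
    return re.sub(r"\{(ANS|EVAL):([^{}]*)\}", repl, text)


answers = {"name": "Pat", "numKids": 1, "numPets": 2}
print(expand("{ANS:name} has {ANS:numKids} "
             "{EVAL:if((numKids==1),'child','children')} and "
             "{EVAL:numKids + numPets} beasts.", answers))
```

Anything outside the whitelisted node types raises an error, which is one way to get the "safe, read-only" behaviour described above.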
Last Edit: 5 years 2 months ago by TMSWhite.

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62080

  • lemeur
Hi TMSWhite,

This is a great idea, and in fact this was the subject of a GSoC idea (docs.limesurvey.org/Project+ideas+for+GS...itions_in_LimeSurvey)
and was already discussed for LS2 in:
docs.limesurvey.org/tiki-index.php?page=...010+Condition+engine
docs.limesurvey.org/Expression+engine+for+conditions

Your work so far is a wonderful improvement over the "static" LimeReplacementFields feature (INSERTANS, ...), but I don't get how it could be used in show/hide branching from your description in your last post. Can you explain a little more (before I try to work it out from your patch)?

TIA,
Thibault

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62111

  • TMSWhite
Thibault,

Thanks for pointing me to those links. Sounds like we're thinking in a similar direction. The expression parser I've created is a full-featured recursive descent parser - so you don't need to worry about Reverse Polish Notation, and you can easily add functions (one line of code per function to add), and register known variables (e.g. LimeReplacementFields).
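For readers who have not seen one, a recursive descent parser can be sketched in a few dozen lines. The Python below is only an illustration of the technique, not the code in the patch: the grammar, the Parser class and the token names are assumptions, and it evaluates while parsing, so both branches of if() get computed.

```python
import operator
import re

TOKEN = re.compile(
    r"\s*(?:(\d+\.?\d*)|'([^']*)'|([A-Za-z_]\w*)|(==|!=|<=|>=|[-+*/()<>,]))")
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "==": operator.eq, "!=": operator.ne,
       "<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        num, string, ident, op = match.groups()
        if num is not None:
            tokens.append(("num", float(num) if "." in num else int(num)))
        elif string is not None:
            tokens.append(("str", string))
        elif ident is not None:
            tokens.append(("id", ident))
        else:
            tokens.append(("op", op))
        pos = match.end()
    return tokens


class Parser:
    """Grammar levels: comparison, sum, term, factor. Evaluates while parsing."""

    def __init__(self, text, variables, functions):
        self.tokens, self.pos = tokenize(text), 0
        self.variables, self.functions = variables, functions

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("end", None)

    def take(self, expected=None):
        token = self.peek()
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected[1]!r}, got {token[1]!r}")
        self.pos += 1
        return token

    def parse(self):
        result = self.comparison()
        self.take(("end", None))  # reject trailing tokens
        return result

    def binary(self, operand, symbols):
        # Left-associative chain: operand (symbol operand)*
        left = operand()
        while self.peek()[0] == "op" and self.peek()[1] in symbols:
            left = OPS[self.take()[1]](left, operand())
        return left

    def comparison(self):
        return self.binary(self.sum, ("==", "!=", "<=", ">=", "<", ">"))

    def sum(self):
        return self.binary(self.term, ("+", "-"))

    def term(self):
        return self.binary(self.factor, ("*", "/"))

    def factor(self):
        kind, value = self.take()
        if kind in ("num", "str"):
            return value
        if (kind, value) == ("op", "("):
            inner = self.comparison()
            self.take(("op", ")"))
            return inner
        if (kind, value) == ("op", "-"):
            return -self.factor()
        if kind == "id" and self.peek() == ("op", "("):
            self.take()
            args = []
            while self.peek() != ("op", ")"):
                args.append(self.comparison())
                if self.peek() == ("op", ","):
                    self.take()
            self.take(("op", ")"))
            return self.functions[value](*args)
        if kind == "id":
            return self.variables[value]
        raise SyntaxError(f"unexpected token {value!r}")


functions = {"if": lambda test, a, b: a if test else b}
functions["max"] = max  # registering another function is one line
variables = {"numKids": 3, "numPets": 1}
print(Parser("if((numKids > numPets),'children','pets')", variables, functions).parse())
print(Parser("(numKids + numPets) * 2", variables, functions).parse())
```

Each grammar level is one method that calls the next-tighter level, which is why operator precedence and parentheses fall out naturally without any Reverse Polish step.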

I've been using this approach for over a decade with epidemiologists, and they (and the high school students and interns they use to encode the surveys) have found it very easy to encode their branching logic and micro-tailoring using that syntax.

Here's the quick overview about how to use this expression parser for branching logic and hide/show values - I'll provide more detail in the bug tracker.

The attached file is a 12-slide extract of a 120-slide tutorial I gave to the American Medical Informatics Association in 2007. It provides a scientific analysis of the types of branching/complex navigation and micro-tailoring needed, and how this sort of expression parser can help solve most of those needs.

So, briefly:
(1) Using Relevance to control question visibility, server-side. From slide 12, every question has an associated Relevance field (the default being "true"). For each page to be displayed to the user, the engine:
  1. collects the next group of possible questions
  2. evaluates the relevance equation for each question in that group
  3. if there is at least one question in that group whose relevance is true, pass that set of questions to the rendering engine for token replacement and display
  4. if no questions in the group are relevant, repeat from the first step
  5. if there are no more groups of questions, then the survey is done
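The five steps above can be sketched roughly as follows. This is a Python illustration under stated assumptions: each question carries a relevance callable (standing in for a parsed relevance equation), and next_relevant_group is a made-up name, not a LimeSurvey function.

```python
def next_relevant_group(groups, start, answers):
    """Return (index, questions to show) for the next group that has at
    least one relevant question, or (None, []) when the survey is done."""
    for index in range(start, len(groups)):
        # Step 2: evaluate the relevance equation of every question in the group.
        relevant = [q for q in groups[index] if q["relevance"](answers)]
        if relevant:
            # Step 3: hand the relevant questions to the renderer.
            return index, relevant
        # Step 4: nothing relevant here, so move on to the next group.
    # Step 5: no more groups.
    return None, []


groups = [
    [{"code": "numKids", "relevance": lambda a: True}],
    [{"code": "kidAges", "relevance": lambda a: a.get("numKids", 0) > 0}],
    [{"code": "thanks", "relevance": lambda a: True}],
]
answers = {"numKids": 0}
index, shown = next_relevant_group(groups, 1, answers)
print(index, [q["code"] for q in shown])
```

With numKids set to 0 the second group is skipped entirely and the engine lands on the third one.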

So, the relevance equation serves the same purpose as conditions, but supports more complex relevance while also being easier to read and understand by research directors. The only downside is that users may need to hand-type these rather than use a graphical user interface.

(2) Using equation parser to Show/Hide questions client-side.
I haven't built this in yet, but now that we have our own equation parser, it would not be hard to extend it to generate JavaScript code from the equation. The equation parser already validates the equation, so we can ensure that only valid JavaScript would be generated. The upgrade process would include:
  1. Replace all references to variableName with the needed DOM object syntax for that variable
  2. Extend the registeredFunction syntax to include names of the JavaScript functions that are equivalent to the PHP ones; and add any missing JavaScript functions to a functions.js file
  3. Create an anonymous JavaScript function for each relevance equation - so JavaScript would do the math directly without needing an eval() statement
  4. Make these functions control the visibility of their associated questions
  5. Add an on_blur or on_change event that recomputes the value of each JavaScript relevance equation each time a user changes or leaves a field
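Steps 1 to 3 could look something like the Python sketch below, which turns an already validated equation into the source of an anonymous JavaScript function. Several things here are assumptions for illustration: the JS_FUNCTIONS mapping, the emIf helper that would live in functions.js, the DOM ids, and the Number() coercion (which only suits numeric answers). A real implementation would walk the parse tree rather than use a regular expression.

```python
import re

# Expression-function name -> JavaScript equivalent. "emIf" is a hypothetical
# helper that would have to be supplied in functions.js.
JS_FUNCTIONS = {"abs": "Math.abs", "max": "Math.max",
                "min": "Math.min", "if": "emIf"}

# String literals, numbers, and identifiers, in that order of preference.
TOKEN = re.compile(r"'[^']*'|\d[\w.]*|[A-Za-z_]\w*")


def to_javascript(equation, dom_ids):
    """Translate a validated equation into an anonymous JS function."""
    def repl(match):
        token = match.group(0)
        if token.startswith("'") or token[0].isdigit():
            return token  # literals pass through unchanged
        if token in JS_FUNCTIONS:
            return JS_FUNCTIONS[token]
        if token in dom_ids:
            # Assumes numeric answers; string answers would skip Number().
            return f'Number(document.getElementById("{dom_ids[token]}").value)'
        raise KeyError(f"unknown name: {token}")

    return f"function () {{ return {TOKEN.sub(repl, equation)}; }}"


dom_ids = {"numKids": "answer1X2X3", "numPets": "answer1X2X4"}
print(to_javascript("max(numKids, numPets) > 2", dom_ids))
```

The generated function could then be attached to the change/blur handlers of the fields it reads, so no eval() is needed in the browser.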

The same process could be used to automate the display of assessment scale results (e.g. if you have a display-only field and want its value to be dynamically updated on the screen as the user fills out other sections of the screen).

In my own tool, I only ever used the server-side strategy. Given the experience the LimeSurvey team has in managing question visibility using JavaScript, I think this would be a nice feature to add, and would be pretty easy to do (easier, certainly, than building the expression parser in the first place).

/Tom

File Attachment:

File Name: AMIA 2007 ...rvey.zip
File Size: 241 KB
Last Edit: 5 years 2 months ago by TMSWhite. Reason: last try to upload file - this time zipped .ppt

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62112

  • lemeur
TMSWhite wrote:
Thibault,

Thanks for pointing me to those links. Sounds like we're thinking in a similar direction.

Yes, seems so :)
This project is so old that I was beginning to think it was a utopia :silly:
TMSWhite wrote:
The expression parser I've created is a full-featured recursive descent parser - so you don't need to worry about Reverse Polish Notation, and you can easily add functions (one line of code per function to add), and register known variables (e.g. LimeReplacementFields).

I've only had a very quick 30-second look at your code and I agree that your approach is very interesting. It would take me a lot of time to fully review your code because your coding skills are far ahead of mine actually :)
I'll focus on checking that it can be integrated in all cases (mostly all-in-one surveys, group-per-page surveys and in the future assessments).
TMSWhite wrote:
I've been using this approach for over a decade with epidemiologists, and they (and the high school students and interns they use to encode the surveys) have found it very easy to encode their branching logic and micro-tailoring using that syntax.

Yes I agree.
However, I don't think that building a GUI for your syntax would be impossible.
So we could have both a "text" expression, and a GUI that would translate the graphical input to this reference syntax.
TMSWhite wrote:
The attached file is a 16-slide extract of a 120-slide tutorial I gave to the American Medical Informatics Association in 2007. It provides a scientific analysis of the types of branching/complex navigation and micro-tailoring needed, and how this sort of expression parser can help solve most of those needs.
I see no attachment :( Maybe in the bugtracker?

==> Seeing it now. Thx
TMSWhite wrote:
(1) Using Relevance to control question visibility, server-side. From slide 16, every question has an associated Relevance field (the default being "true"). For each page to be displayed to the user, the engine:
  1. collects the next group of possible questions
  2. evaluates the relevance equation for each question in that group
  3. if there is at least one question in that group whose relevance is true, pass that set of questions to the rendering engine for token replacement and display
  4. if no questions in the group are relevant, repeat from the first step
  5. if there are no more groups of questions, then the survey is done

Yes, very interesting. This looks to me like exactly what I'd like to see for assessments: using internal variables to control the survey behaviour.
I like it very much.
TMSWhite wrote:
So, the relevance equation serves the same purpose as Conditions, but supports more complex relevance while also being easier to read and understand by research directors. The only downside is that users may need to hand-type these rather than use a graphical user interface.
Maybe we could think of a limited GUI that could handle at least part of the syntax, because we can't afford to lose our user base ;-)
TMSWhite wrote:
(2) Using equation parser to Show/Hide questions client-side.
I haven't built this in yet, but now that we have our own equation parser, it would not be hard to extend it to generate JavaScript code from the equation. The equation parser already validates the equation, so we can ensure that only valid JavaScript would be generated. The upgrade process would include:
  1. Replace all references to variableName with the needed DOM object syntax for that variable
  2. Extend the registeredFunction syntax to include names of the JavaScript functions that are equivalent to the PHP ones; and add any missing JavaScript functions to a functions.js file
  3. Create an anonymous JavaScript function for each relevance equation - so JavaScript would do the math directly without needing an eval() statement
  4. Make these functions control the visibility of their associated questions
  5. Add an on_blur or on_change event that recomputes the value of each JavaScript relevance equation each time a user changes or leaves a field
Yes, but this is still quite some work... unless you, as the software power-designer I know you are, could work on it!
TMSWhite wrote:
In my own tool, I only ever used the server-side strategy. Given the experience the LimeSurvey team has in managing question visibility using JavaScript, I think this would be a nice feature to add,
It is a required feature in fact. Implementing this without "runtime on-page" condition evaluation would not be enough in my opinion.

Of course, this could be a little more difficult if write-access variables are created.
TMSWhite wrote:
and would be pretty easy to do (easier, certainly, than building the expression parser in the first place).
Sure, would you help?

Thibault
Last Edit: 5 years 2 months ago by lemeur.

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62114

  • TMSWhite
Thibault-

Yes, I'd be happy to help with getting the JavaScript side working. I may need some guidance to ensure I get the DOM objects named correctly for all question types.

Here's a list of things you might want to test:

(1) Unit Testing of parser (http://localhost/limesurvey/classes/eval/Test_ExpressionManager_Evaluate.php)
  • Any "failed" test appears in red. The current set of failures is actually OK, as they are the result of rounding errors
  • If you want to add more tests, put them in the ExpressionManager::UnitTestEvaluator()
  • Is there a better way to register external functions? I initially put them in ExpressionManagerFunctions.php, but I couldn't get that file to be properly included() from dEvalFunction.php

(2) Integration Testing (classes/dTexts/dFunctions/dFunctionEval.php). See TODO comments in code:
  1. Is there an existing function that gets the full list of variable names and values - esp. including SubQuestions?
  2. Will this approach create and configure a single new ExpressionManager once per page request (as opposed to multiple times per page request, or only once per survey session)?

/Tom

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62119

  • TMSWhite
To help assess interest in this feature, I'm going to point Forum threads with similar goals to this thread so that we can broaden the discussion. The goal is to get a comprehensive list of places where end-users want to be able to apply conditional branching, visibility, validation, message tailoring, etc.

As a start, I've seen need for expression parsing in the following places:

(1) Deciding which questions to ask.
  • Question-Level relevance (so could turn on/off a question within a group - e.g. asking "bothersomeness" of depressive symptoms, first ask 15 other questions about depression, compute a depression scale based upon those items, then only ask this follow-up question if the depression score is high enough for the last 2 weeks)
  • Group-Level relevance (only ask any of the questions in the group if the group itself is relevant - e.g. U.S. Census - how many people live in the house - for each person, ask the following 20 questions; but only ask the subset of those 20 that are relevant for that particular person)

(2) Tailoring the Question text (or instructional message)
  • Conditional insertion of values or phrases
  • Proper spelling of plurals (e.g. "your child" vs. "your children", depending upon how many children the person has)
  • Conjugating verbs and declining nouns (e.g. you singular/plural for non-English verbs; gender matching of adjective suffix in Spanish)

(3) Tailoring the Answer Choices
  • Conditionally changing the wording of an Answer - just like (2)
  • Conditional adding to answer list (e.g. in a survey asking about stressors, if person fills out an "other-specify" question, and survey wants to rank severity of stressors, each "other-specify" answer should appear as one of the ranking options)
  • Conditional removal of options from answer list (e.g. filter answer list to only include ones relevant for the person's gender)

(4) Tailoring Conditional Validation
  • Set min and/or max range for a numeric input (e.g. when asking number of years of education, set max = year_part(today() - date_of_birth) - 5)
  • Set min and/or max range for dates (e.g. child's birth date must be at least 10 years later than mother's birth date [and hopefully considerably longer than that])
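As a small illustration of the numeric-range case, here is a Python sketch of the "years of education" rule. The function names are made up, and the whole-years age calculation is my reading of year_part(today() - date_of_birth).

```python
from datetime import date


def age_in_years(date_of_birth, today):
    """Whole years elapsed between date_of_birth and today."""
    before_birthday = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - before_birthday


def valid_years_of_education(value, date_of_birth, today):
    """The maximum is computed per respondent: age minus 5."""
    maximum = age_in_years(date_of_birth, today) - 5
    return 0 <= value <= maximum


dob, today = date(2000, 6, 15), date(2011, 6, 14)
print(age_in_years(dob, today), valid_years_of_education(5, dob, today),
      valid_years_of_education(6, dob, today))
```

The point is that the bound is an expression over earlier answers, not a constant fixed when the survey is designed.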

/Tom

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62120

  • lemeur
TMSWhite wrote:
Thibault-

Yes, I'd be happy to help with getting the JavaScript side working. I may need some guidance to ensure I get the DOM objects named correctly for all question types.

I think that what you need is the retrieveJSidname function defined in qanda.php.
It is used in all-in-one surveys (handled by survey.php) and group-per-page surveys (handled by group.php).
Here's a list of things you might want to test:
...

Thanks, I'll try to find the time to test this. I must admit that I have very little free time these days (newborn@home), but I try to get back to the active LS life when possible ;-)

Many thanks for your wonderful work. It's really a wonderful contribution IMHO and it brings an old dream to life ;-)

Thibault

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62125

  • lemeur
TMSWhite wrote:

(3) Tailoring the Answer Choices
    ...
  • Conditional adding to answer list (e.g. in a survey asking about stressors, if person fills out an "other-specify" question, and survey wants to rank severity of stressors, each "other-specify" answer should appear as one of the ranking options)
  • Conditional removal of options from answer list (e.g. filter answer list to only include ones relevant for the person's gender)

This is a difficult part for now in LS1 because we set the response table structure at activation time. This means that we can't add/remove columns from the response table once activated. For instance for a multiple options question having 4 option-checkboxes, we need to have a fixed number of 4 columns in DB (this is different from single choice questions).

One goal of LS2 was to support cycling questions (dynamically adding columns depending on previous answers). This will be ported to LS1 in the future (I'm not sure whether the current "DBengine porting to CodeIgniter" GSoC project will handle this or not).

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62127

  • TMSWhite
Thibault-

Even LS1 database structure might work for this.
What I've done in the past for such optional answer choices is to set a maximum limit and create columns for each of them. So, if there were up to 10 "other" options, I'd have columns for each of them, but most of them would be null.

Another option LS might consider (after other priorities calm down) is an Entity-Attribute-Value database design for responses. I started with the model that LS1 is currently using (and still use it as one of two ways I simultaneously store the result). However, to add flexibility, I have generic Datum and ItemUsage tables in addition to an InstrumentSession table. Say I have an InstrumentSession with 1000 variables: when I start a new one, I insert 1000 rows into the Datum table. Then, each time I collect new data, I store Datum-like values in the ItemUsage table - one row per question asked per page. This way I have complete audit history in case respondents want to go back and change their answers; but the Datum rows always hold the most recent value for a given question. For one survey, hosted on an Amazon EC2 small instance, with MySQL, the system has been live for nearly 4 years, has handled up to 50,000 surveys in a single day, and still has sub-second response time despite having nearly 500,000 completed surveys and 40 million Datum entries. So, if that system can support an Entity-Attribute-Value style storage of survey responses, LimeSurvey could too.
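A rough sketch of that Datum / ItemUsage layout, using Python's sqlite3 module with an in-memory database. The column names and helper functions are assumptions for illustration; the description above only names the three tables and what each holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instrument_session (id INTEGER PRIMARY KEY);
CREATE TABLE datum (              -- one row per variable: latest value only
    session_id INTEGER NOT NULL REFERENCES instrument_session(id),
    variable   TEXT NOT NULL,
    value      TEXT,
    PRIMARY KEY (session_id, variable)
);
CREATE TABLE item_usage (         -- append-only audit trail
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id INTEGER NOT NULL,
    variable   TEXT NOT NULL,
    value      TEXT
);
""")


def start_session(variables):
    """Create a session and pre-insert one NULL datum row per variable."""
    session_id = conn.execute(
        "INSERT INTO instrument_session DEFAULT VALUES").lastrowid
    conn.executemany(
        "INSERT INTO datum (session_id, variable) VALUES (?, ?)",
        [(session_id, name) for name in variables])
    return session_id


def record_answer(session_id, variable, value):
    """Append to the audit trail and overwrite the current value."""
    conn.execute(
        "INSERT INTO item_usage (session_id, variable, value) VALUES (?, ?, ?)",
        (session_id, variable, value))
    conn.execute(
        "UPDATE datum SET value = ? WHERE session_id = ? AND variable = ?",
        (value, session_id, variable))


sid = start_session(["age", "numKids"])
record_answer(sid, "numKids", "2")
record_answer(sid, "numKids", "3")  # respondent went back and changed it
current = conn.execute(
    "SELECT value FROM datum WHERE session_id = ? AND variable = 'numKids'",
    (sid,)).fetchone()[0]
history = [row[0] for row in conn.execute(
    "SELECT value FROM item_usage WHERE session_id = ? ORDER BY id", (sid,))]
print(current, history)
```

Because answers are rows rather than columns, adding an answer option after activation would not require altering the table structure.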

/Tom

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62144

  • TMSWhite
What is the preferred way to set values of variables in the database?

Several of the other threads seem like they could benefit from this, so I have upgraded the equation parser to support assigning registered variables within an equation (e.g. a = b, c += d, etc). The unit testing on the equation parser works fine, but all assignment is done locally within the equation parser. To integrate with LimeSurvey, I'll set a dirty flag so that we know which of the registered variables have had their values changed.

So, once I know which of the variables have changed values, how do I ensure that this gets recorded?
(1) Server-Side (PHP) - Do I have to update the database directly?
(2) Client-Side (JavaScript) - If I just update the value of the appropriate DOM object (e.g. document.getElementById("answerSGQA").value = new_value, where SGQA is the SGQA code), will LimeSurvey automatically update the associated database field, or do I need to do something else too?
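For what it's worth, the dirty-flag bookkeeping itself is simple. A minimal Python sketch, assuming a VariableStore class (a made-up name) that the parser would call whenever an equation assigns to a registered variable:

```python
class VariableStore:
    """Registered variables plus a record of which ones were reassigned."""

    def __init__(self, initial):
        self._values = dict(initial)
        self._dirty = set()

    def get(self, name):
        return self._values[name]

    def assign(self, name, value):
        if name not in self._values:
            raise KeyError(f"{name} is not a registered variable")
        if self._values[name] != value:  # only real changes count as dirty
            self._values[name] = value
            self._dirty.add(name)

    def changed(self):
        """Names and new values that still need to be persisted."""
        return {name: self._values[name] for name in sorted(self._dirty)}


store = VariableStore({"a": 1, "b": 2, "c": 10, "d": 5})
store.assign("a", store.get("b"))                   # a = b
store.assign("c", store.get("c") + store.get("d"))  # c += d
store.assign("b", 2)                                # no change, stays clean
print(store.changed())
```

Whatever does the saving would then only need to look at changed(), which is the part the questions above are about.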

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62147

  • lemeur
Hi Tom,

I'm not sure whether you propose to allow changing the values of already registered variables such as answers, or whether you also propose to let the survey admin register his own variables (which would be IMO a great asset).

For the first case, it is simple:
* make sure your expression evaluation is done before save.php, and then let save.php record to DB as if the input was coming from the web page. Caution: you may only update registered fields that correspond to questions already seen (they must have been displayed in previous pages).
* on the client side: just update the DOM element that corresponds to the variable. The DOM name differs depending on whether the variable corresponds to a question displayed on the current page or to a question that was displayed before.


If you intend to add expression-specific variables, then we may let the survey admin decide whether the variable needs to be recorded into DB or not. Then, if it is to be recorded to DB, we must make it a valid fieldname (see function createFieldMap).

(1) Server-Side: The best way would be to use the same piece of code that is used for answer recording. This is done in save.php. Basically we parse the POST variables and delete variables not in the authorized variable names list (function createFieldMap returns the list of registered variables).
So the process IMO would be to register the expression-variables in createFieldMap (this requires special care because the createFieldMap function is used all over LS code, but this may be the cleanest way).
Then in save.php add a piece of code to record the expression-variables changed at this step.
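In other words, the save step acts as a whitelist filter over the POSTed names. A tiny Python sketch of the idea (not the actual save.php logic; filter_posted and the field name 1X2Xtotal are invented for illustration):

```python
def filter_posted(posted, fieldmap):
    """Keep only POSTed values whose names are registered field names."""
    return {name: value for name, value in posted.items() if name in fieldmap}


# Field names for ordinary answers (SGQA-style codes) ...
fieldmap = {"1X2X3", "1X2X4"}
# ... plus a hypothetical expression variable registered before activation.
fieldmap.add("1X2Xtotal")

posted = {"1X2X3": "2", "1X2X4": "1", "1X2Xtotal": "3", "injected": "x"}
print(filter_posted(posted, fieldmap))
```

Once an expression variable is in the field map, it passes through the same filter as any answer, so no separate save path would be needed.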

This also implies checking the activate.php code (but it uses createFieldMap so it should create the corresponding variables). However, we can't create expression-variables once the survey is activated.

(2) For Client-Side: we need to POST the value in a DOM object. We thus need to define a naming convention for the new variables names in the DOM and update the retrieveJSidname function.

Just my 10-minute idea on this very interesting proposal.
Thibault

Support complex conditions sans system slowdown via optional equation parser 5 years 2 months ago #62150

  • TMSWhite
Thibault-

I was hoping to support the following:
(1) changing answers to questions that had already been asked
(2) changing any read-write LimeReplacementField
(3) changing any read-write Token
(4) letting admins register their own variables

To register the list of valid variables and their values, I'm already taking a read-only copy of createFieldMap($style='full'), and populating the current values via retrieve_answer(SGQA). So, sounds like save.php will already manage that subset of the variables.

Is there a consolidated function for retrieving the names and current values of all of the LimeReplacementFields? I see about 100 of them spread across a dozen files.

/Tom

Support complex conditions sans system slowdown via optional equation parser 5 years 1 month ago #62974

  • TMSWhite
ExpressionManager is now stable in the limesurvey_dev_tms branch. For people wanting to test it, it may be easier to try out that branch rather than dealing with patches to the main branch.

Assuming you install the branch on localhost, the unit tests are at http://localhost/limesurvey_dev_tms/classes/eval/ExpressionManagerTestSuite.php. They show all of the main features (e.g. complex equations, function calls, syntax highlighting when there are syntax errors).

You can also load the attached survey (effectively the Integration tests), which demonstrates how ExpressionManager can be used to compute values, tailor messages based upon those computations, and generate customized reports of questions asked and responses given.

More details about current and planned features can be found at those issue-tracking links.
Attachments:

Support complex conditions sans system slowdown via optional equation parser 5 years 1 month ago #63275

  • TMSWhite
For those who want a preview of ExpressionManager (but aren't able to download the branch or patch a local install), you can try these links:

(1) Sample Survey - Shows calculations, conditional tailoring of text, and summary of responses before submitting
(2) ExpressionManager Test Suite - shows all of the main functionality, including syntax highlighting when syntax errors are detected.
(3) ExpressionManager Log file - shows table of all current responses for the Sample Survey, dynamically generated from the running survey.