open-cogsci · domee-zawadzka · Mar 5, 2025 · Mar 5, 2025 · Apr 2, 2025 · Jun 18, 2025
diff --git a/tests/quality/test_opensesame_qa.py b/tests/quality/test_opensesame_qa.py
@@ -38,13 +38,18 @@ def init_testlog():
     testlog_folder = Path(__file__).parent / 'testlog'
     if not testlog_folder.exists():
         testlog_folder.mkdir()
-    testlog = Path(testlog_folder) / f'testlog.{datetime.now().strftime("%Y-%m-%d %H:%M")}.{config.settings_default["model_config"]}.log'
+    testlog = Path(testlog_folder) / f'testlog.{datetime.now().strftime("%Y-%m-%d %H-%M")}.{config.settings_default["model_config"]}.log'
 
 
 def read_testcases():
     output = {}
-    for path in (Path(__file__).parent / 'testcases').glob('*.md'):
-        test_case = path.read_text()
+    selected_case = os.getenv("TEST_CASE")  
+    test_cases_dir = Path(__file__).parent / 'testcases'
+
+    for path in test_cases_dir.glob('*.md'):
+        if selected_case and path.name != selected_case:
+            continue
+        test_case = path.read_text(encoding="utf-8")
         question, requirements = test_case.split('\n', 1)
         output[path.name] = {
             'question': question.strip(),
@@ -88,11 +93,13 @@ def score_testcase(description, question, requirements, n=3):
 def score_testcases(select_cases=None):
     results = []
     for description, testcase in read_testcases().items():
         if select_cases is not None and description not in select_cases:
             results.append((None, description))
             continue
+        print(f"Running test case: {description}")
         scores = score_testcase(description, **testcase)
         results.append((scores, description))
+
     with testlog.open('a') as fd:
         fd.write('\n\nSummary:\n')
         for scores, description in results:
@@ -119,7 +126,12 @@ def test_openai():
 def test_openai_o1():
     config.settings_default['model_config'] = 'openai_o1'
     init_testlog()
-    score_testcases()
+    selected_case = os.getenv("TEST_CASE")  # Read from environment variable
+    if selected_case:
+        score_testcases(select_cases=[selected_case])
+    else:
+        score_testcases()
+
 
 
 def test_openai_o3():

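Note on the `TEST_CASE` filter added above: the following is a minimal, self-contained sketch of the selection logic, not the code in this PR. It assumes the test-case directory and the selected name are passed as parameters (the PR reads the name from `os.getenv("TEST_CASE")`), and the helper `demo` and its temporary files are hypothetical, added only for illustration.

```python
import tempfile
from pathlib import Path


def read_testcases(test_cases_dir, selected_case=None):
    """Return {filename: {'question': ..., 'requirements': ...}} for .md files.

    If selected_case is given, only the file with exactly that name is read.
    """
    output = {}
    for path in sorted(Path(test_cases_dir).glob('*.md')):
        # Skip every file except the selected one, if a selection was made
        if selected_case and path.name != selected_case:
            continue
        test_case = path.read_text(encoding='utf-8')
        # The first line is the question, the remainder are the requirements
        question, requirements = test_case.split('\n', 1)
        output[path.name] = {
            'question': question.strip(),
            'requirements': requirements.strip(),
        }
    return output


def demo():
    """Hypothetical demo: write two test cases, then read all and one."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ('need-logger.md', 'csv_file.md'):
            (Path(tmp) / name).write_text(
                f'Question for {name}\n\nThe answer should:\n- ...',
                encoding='utf-8')
        all_cases = read_testcases(tmp)
        one_case = read_testcases(tmp, selected_case='need-logger.md')
        return sorted(all_cases), sorted(one_case)
```

Because the comparison is against `path.name`, the value of `TEST_CASE` has to match the filename exactly, including the extension and any space before `.md` (as in `OS-online .md`).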
diff --git a/tests/quality/testcases/Experiment-Crash.md b/tests/quality/testcases/Experiment-Crash.md
@@ -0,0 +1,22 @@
+In my trial sequence I have a multiple-choice test. The sequence includes: a loop item with 4 multiple choice questions, answer options and the correct\_answer variable; then there is a multiple\_choice form and after that there is a sampler with the .ogg file that is supposed to play when the answer is incorrect; the run-if expression of the form is set to response \!= correct\_answer. When I run the experiment, it crashes after the first response is given. The error message I receive is:  Error: ConditionalExpressionError
+
+Error while evaluating run-if expression
+
+This error occurred in the run phase of item question\_sequence.  
+Traceback (most recent call last):  
+  File "\<conditional statement\>", line 1, in \<module\>
+
+NameError: name 'response' is not defined
+
+How can I fix the error?
+
+The answer should:
+
+- Clearly explain why the NameError occurred  
+- Point the user to the multiple\_choice item to check what its “response variable” is, and to update the run-if expression to use that variable  
+- Emphasise that the expression should be in the Sampler’s run-if expression field
+
+The answer should not:
+
+- Suggest changing the expression to “correct \== 0”
+
diff --git a/tests/quality/testcases/Image-error.md b/tests/quality/testcases/Image-error.md
@@ -0,0 +1,13 @@
+I have an experiment where I want to present an image to a participant for 10 seconds. I used the code below to upload the image:  
+canvas \= Canvas()  
+canvas.image("stimulus.png", x=0, y=0)   
+canvas.show()  
+clock.sleep(10000)    
+However, when I run the experiment, I keep getting the error: FileNotFoundError: \[Errno 2\] No such file or directory: 'stimulus.png'  
+How can I fix it?
+
+The answer should:
+
+- Inform the user that the image file is not in the file pool of the experiment  
+- Provide the solution to the problem, either by adding the file to the pool or by adding a pool reference in the code 
+
diff --git a/tests/quality/testcases/Loop-purpose.md b/tests/quality/testcases/Loop-purpose.md
@@ -0,0 +1,7 @@
+What’s the purpose of using the loop in the experiment?
+
+The answer should:
+
+- Explain that the loop serves the following functions: running an item multiple times, defining independent variables, and randomizing the trial order  
+- Provide an example sequence where a loop is used
+
diff --git a/tests/quality/testcases/OS-online .md b/tests/quality/testcases/OS-online .md
@@ -0,0 +1,15 @@
+How can I run my OpenSesame experiment online? 
+
+The answer should: 
+
+- Provide a clear statement that experiments created with OpenSesame are primarily designed to run locally, but there is an extension, OSWeb, that allows experiments to be run in a browser.  
+- Provide a guide on how to run an experiment online:  
+  - Set experiment properties to “In a browser with OSWeb” which will check if the experiment is compatible with OSWeb  
+  - Test the experiment in the browser  
+  - If the test works properly, the user can export the experiment for online use  
+    - In Tools select Export for JATOS  
+  - The experiment then can be uploaded to JATOS via a JATOS server (own or via a website)  
+  - Create a link for participants  
+- Mention prerequisites and limitations for using OSWeb for online experiments.  
+  - Prerequisites: OpenSesame 4.0 or later, no Python inline\_script is used, access to a JATOS server, browser supporting JavaScript  
+  - Limitations: sound is not supported; response and scripting-language items (for example PyGaze or PsychoPy) are not supported
diff --git a/tests/quality/testcases/Overview-gone.md b/tests/quality/testcases/Overview-gone.md
@@ -0,0 +1,9 @@
+I have accidentally deleted my overview area from the interface, how can I return it? 
+
+The answer should:
+
+- Direct the user to the menu bar, to click “View” and then choose the option “Show overview area”.
+
+OR
+
+- Instruct the user to use “ctrl \+ \\” 
diff --git a/tests/quality/testcases/Sketchpad-new-stimuli.md b/tests/quality/testcases/Sketchpad-new-stimuli.md
@@ -0,0 +1,10 @@
+How do I add an image stimulus using the sketchpad in OpenSesame?
+
+The answer should:
+
+- Explain how to add an image to the file pool, either via drag-and-drop or the “＋” button.
+
+- Guide the user to insert a Sketchpad item and use “Draw image element” to place the image.
+
+- Mention how to configure the image and how to set the duration
+
diff --git a/tests/quality/testcases/Stroop-experiment.md b/tests/quality/testcases/Stroop-experiment.md
@@ -0,0 +1,14 @@
+How can I design a Stroop task in OpenSesame to measure participants’ reaction times based on a key press? I would like the participants to press “m” when the word is red, “z” when it’s blue and “v” when it’s yellow.
+
+The answer should:
+
+- Explain the basic layout of the experiment and where to define the variables:
+  - Sequence
+  - Loop to define variables
+  - Sketchpad with a fixation dot (a central “+”)
+  - Sketchpad to present the coloured word
+  - Keyboard_response
+  - Logger to record reaction time and accuracy
+- Specify that trial conditions should be defined in the Loop item, including a correct_response variable that matches the response key for each trial.
+- Mention how to use the colour variable (or similar variable) to set the font colour in the Sketchpad item.
+- Explain how to set the allowed responses to the desired keys (m, z and v) and make sure that reaction time is being recorded in the keyboard response item.
diff --git a/tests/quality/testcases/T_F Statements .md b/tests/quality/testcases/T_F Statements .md
@@ -0,0 +1,24 @@
+How can I design an experiment where a person is presented with a series of true/false statements with a keyboard response? I want the allowed responses to be T for true and F for false
+
+The answer should:
+
+- Provide a step-by-step guide for implementing a True/False task and emphasise the GUI method  
+- Mention other possible design methods, such as using an inline script.  
+- Specify the design steps:   
+  - Add a sequence  
+  - Add a loop and append to it:  
+    - Sequence  
+    - Sketchpad  
+    - Keyboard response  
+    - Logger  
+  - Then, go to the loop item and add a column “statement” with the statements, one statement per row  
+  - In the Sketchpad, add the text \[statement\]  
+  - For Keyboard Response set Allowed Keys to T and F and choose Store Response Time  
+  - In the logger, ensure that it contains: statement, response and response\_time  
+  - Run the experiment  
+- Specify how to configure the following elements: Loop, Sketchpad, Keyboard Response, Logger
+
+The answer should not:
+
+- Describe alternative design methods in detail  
+- Assume which method is used and provide only one possible answer
diff --git a/tests/quality/testcases/csv_file.md b/tests/quality/testcases/csv_file.md
@@ -0,0 +1,6 @@
+For my experiment, I’ve prepared the trial conditions in a .csv file. Is it possible to use this file in OpenSesame to run the experiment?
+
+The answer should:
+
+- Inform the user that it is possible to use the file instead of making a table.   
+- Direct the user to the loop item in their experiment sequence. In that item, they can change “Source” from table to file, and then select the file from either the file pool or their computer.
diff --git a/tests/quality/testcases/eye-tracking-fixation.md b/tests/quality/testcases/eye-tracking-fixation.md
@@ -0,0 +1,16 @@
+How can I design an eye-tracking experiment in OpenSesame using PyGaze to measure fixation duration during a feature search task? I want the participant to press the R key when they spot a red triangle in an image.
+
+The answer should:
+
+- Outline the basic elements necessary to create the experiment, either starting from scratch with a default template or using the eye-tracking template  
+- Instruct the user to configure the pygaze\_init item  
+- Describe what the trial sequence inside the loop should look like.  
+  - If the user is not using the eye-tracking template, go to the practice loop item and build the trial sequence with the following elements: pygaze\_drift\_correct (optional), pygaze\_start\_recording, sketchpad\_fixation (shows a cross or dot, 0 ms), inline\_script\_measure\_fixation (existing script), sketchpad\_stimulus (shows the main stimulus), keyboard\_response, pygaze\_log (logs fixation duration), pygaze\_stop\_recording  
+  - Explain how to adjust the following items: fixation, inline\_script, stimulus, response item and logger  
+  - Mention how to configure the loop item
+
+The answer should not:
+
+- Provide a single, straightforward instruction for designing the experiment  
+- Refer exclusively to inline script as the main design method
+
diff --git a/tests/quality/testcases/fail-to-record.md b/tests/quality/testcases/fail-to-record.md
@@ -0,0 +1,6 @@
+I am running an eye tracking study with the Eyelink tracker. However, I received an error message “Failed to start recording” after calibration. What would possibly cause the issue? 
+
+The answer should:
+
+- Explain to the user what the error means
+- List the most common causes of the error, such as the tracker being in the wrong mode (Calibration instead of Track), the signal being lost after calibration, or a previous experiment session not having been closed
diff --git a/tests/quality/testcases/naming-objects .md b/tests/quality/testcases/naming-objects .md
@@ -0,0 +1,35 @@
+How do I make an experiment where the participant has to name objects that appear on the screen?
+
+The answer should:
+
+- Guide the user through creating the experiment with the following steps:  
+* Open OpenSesame and create a new experiment.  
+* Choose a "Legacy" or "PsychoPy" backend for better response recording.  
+* Set the resolution and refresh rate.   
+* In the sequence tab, include:  
+  * practice\_loop (optional)  
+  * experiment\_loop  
+  * fixation\_cross  
+  * stimulus\_display  
+  * voice\_response  
+  * logger  
+* Add a loop item (experiment\_loop) that contains a table.  
+* Include a column for object names and image file names and set the repeat value (e.g., 1 if you want each item to appear once).  
+* Inside experiment\_loop, add a sketchpad item (stimulus\_display).  
+* Click "Image" and insert "\[image\]" (this dynamically loads images).  
+* Set duration to 0 (so it moves to the next step immediately).  
+* Add a "keyboard\_response" or "voice\_response" item after stimulus\_display.  
+* For voice responses:  
+  * Use the voice\_response plugin (ensure microphone access).  
+  * Set "Duration" to 3000 ms (3 sec for response).  
+  * Enable "Save sound?" to record audio.  
+  * Define filename format: response\_\[subject\_nr\]\_\[count\].wav.  
+* Before each trial, add a sketchpad (fixation\_cross) with a centered "+".  
+* Duration: 500 ms.  
+* Add a logger at the end of the loop to save:  
+  * image (stimulus presented)  
+  * object\_name (expected answer)  
+  * response\_time  
+  * voice\_response (filename of recorded audio)  
+* Run the experiment and check the log file (.csv) to see if responses are stored correctly.
+
diff --git a/tests/quality/testcases/need-logger.md b/tests/quality/testcases/need-logger.md
@@ -0,0 +1,6 @@
+Do I need a logger in order to keep the experiment data saved? 
+
+The answer should:
+
+- Indicate that a logger is necessary in order to save the data from the experiment
+
diff --git a/tests/quality/testcases/sketchpad-new-stimuli.md b/tests/quality/testcases/sketchpad-new-stimuli.md
@@ -0,0 +1,10 @@
+How do I add an image stimulus using the sketchpad in OpenSesame?
+
+The answer should:
+
+- Explain how to add an image to the file pool, either via drag-and-drop or the “＋” button.
+
+- Guide the user to insert a Sketchpad item and use “Draw image element” to place the image.
+
+- Mention how to configure the image and how to set the duration
+