Commit 51e2a93

Move LLMBENCH_WORKSPACE handling to server level
The server now checks for the LLMBENCH_WORKSPACE environment variable and, when it is set, passes its value as the workspace directory to the setup_problem and grade functions. If the variable is not set, the server falls back to the configured working_dir. This centralizes the environment variable handling in the server, making the benchmark modules simpler and more predictable.
1 parent 248ddea commit 51e2a93
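
The fallback described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the server code: `DemoTool`, `resolve_workspace`, and the directory paths are hypothetical names introduced here, and only the `working_dir` field and the `get(ENV, ...)` lookup are taken from the diff.

```julia
# Hypothetical stand-in for the tool structs touched by this commit;
# only the working_dir field is taken from the diff.
struct DemoTool
    working_dir::String
end

# Same lookup as in the diff: prefer the environment variable,
# otherwise use the configured directory.
resolve_workspace(tool::DemoTool) = get(ENV, "LLMBENCH_WORKSPACE", tool.working_dir)

tool = DemoTool("/srv/bench/default")  # example path, assumed for illustration

# withenv temporarily sets a variable for the duration of the block;
# passing `nothing` as the value unsets it instead.
unset_result = withenv("LLMBENCH_WORKSPACE" => nothing) do
    resolve_workspace(tool)
end

set_result = withenv("LLMBENCH_WORKSPACE" => "/tmp/run-1") do
    resolve_workspace(tool)
end

println("variable unset: ", unset_result)
println("variable set:   ", set_result)
```

Because the lookup happens on every call rather than once at startup, a change to the variable between calls would be picked up by the next setup or grade invocation.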

2 files changed: +9 -3 lines changed

src/tools/grade_problem.jl

Lines changed: 4 additions & 1 deletion
@@ -53,10 +53,13 @@ function ClaudeMCPTools.execute(tool::GradeProblemTool, params::Dict)
     # Push testset to capture test results (Test module won't print when inside a testset)
     Test.push_testset(ts)
     try
+        # Use LLMBENCH_WORKSPACE if set, otherwise use working_dir
+        workspace = get(ENV, "LLMBENCH_WORKSPACE", tool.working_dir)
+
         # Call the grade function with all arguments
         # Use invokelatest to handle world age issues when loading modules dynamically
         # Always pass all three parameters - the function has a default value for problem_id
-        result = Base.invokelatest(tool.grade_fn, tool.working_dir, transcript, problem_id)
+        result = Base.invokelatest(tool.grade_fn, workspace, transcript, problem_id)
     finally
         Test.pop_testset()
         # Restore TESTSET_PRINT_ENABLE

src/tools/setup_problem.jl

Lines changed: 5 additions & 2 deletions
@@ -32,11 +32,14 @@ function ClaudeMCPTools.execute(tool::SetupProblemTool, params::Dict)
     problem_id = get(params, "problem_id", "")
 
     try
-        # Call the setup function with the working directory and problem_id
+        # Use LLMBENCH_WORKSPACE if set, otherwise use working_dir
+        workspace = get(ENV, "LLMBENCH_WORKSPACE", tool.working_dir)
+
+        # Call the setup function with the workspace directory and problem_id
         # Use invokelatest to handle world age issues when loading modules dynamically
         # Always pass both parameters if we have a problem_id
         # The function can have a default value for problem_id
-        result = Base.invokelatest(tool.setup_fn, tool.working_dir, problem_id)
+        result = Base.invokelatest(tool.setup_fn, workspace, problem_id)
 
         # The setup function should return a problem description
         # Format it as a proper MCP response
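
One edge case of the `get(ENV, ...)` lookup used in both hunks: the default only applies when the variable is absent, so on Unix-like systems a variable that is set to an empty string would be returned as `""` instead of falling back to `working_dir`. The sketch below shows one possible way to treat an empty or whitespace-only value as unset. `workspace_or_default` is a hypothetical helper that is not part of this commit, and whether the server should behave this way is an assumption.

```julia
# Hypothetical helper, not part of the commit: treat an empty or
# whitespace-only LLMBENCH_WORKSPACE the same as an unset one.
function workspace_or_default(default_dir::AbstractString)
    value = get(ENV, "LLMBENCH_WORKSPACE", "")
    return isempty(strip(value)) ? default_dir : value
end

# Example directory, assumed for illustration.
default_dir = "/srv/bench/default"

empty_result = withenv("LLMBENCH_WORKSPACE" => "") do
    workspace_or_default(default_dir)
end

custom_result = withenv("LLMBENCH_WORKSPACE" => "/tmp/run-2") do
    workspace_or_default(default_dir)
end

println("empty value:  ", empty_result)
println("custom value: ", custom_result)
```

The trade-off is that an operator can no longer pass an empty path on purpose, which seems unlikely to be a meaningful workspace in any case.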
