Commit 686e7c7

Fold testcase_run.sh logic into judgedaemon.

The primary motivation is to significantly reduce the per-testcase overhead of judging a submission. Historically, with fewer test cases, the separation was carrying its weight. However, with modern problems requiring many test cases or passes, the overhead from dozens of forked simple programs (`cp`, `mv`, `chmod`, `grep` etc.) within the shell script has become a major performance bottleneck. Moving them into PHP effectively replaces them with simple system calls.

This change results in a substantial performance improvement:

* Roughly a 50% speed-up in total judging overhead per submission
* Example: A simple C++ solution to a multi-pass problem (NWERC 2025 practice) was reduced from ~70s to ~33s end-to-end.

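
The overhead described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not code from this commit, which moves the logic into PHP. The function names `stage_with_subprocesses`, `stage_in_process` and `time_staging`, the file names, and the run count are all hypothetical. The sketch assumes a POSIX system with `cp` and `chmod` on the `PATH`, and compares forking those programs for every file against performing the same work with in-process library calls.

```python
import os
import shutil
import subprocess
import tempfile
import time


def stage_with_subprocesses(src: str, dst: str) -> None:
    """Copy a file and make it read-only by forking external programs,
    roughly the way a shell script would (hypothetical helper)."""
    subprocess.run(["cp", src, dst], check=True)
    subprocess.run(["chmod", "0444", dst], check=True)


def stage_in_process(src: str, dst: str) -> None:
    """Same effect using library calls inside the current process,
    with no fork/exec per step (hypothetical helper)."""
    shutil.copyfile(src, dst)
    os.chmod(dst, 0o444)


def time_staging(stage, runs: int = 200) -> float:
    """Return the seconds needed to stage `runs` copies of a small file."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "testdata.in")
        with open(src, "w") as handle:
            handle.write("42\n")
        start = time.perf_counter()
        for i in range(runs):
            stage(src, os.path.join(workdir, f"copy{i}.in"))
        return time.perf_counter() - start


if __name__ == "__main__":
    print(f"forked cp/chmod:  {time_staging(stage_with_subprocesses):.3f}s")
    print(f"in-process calls: {time_staging(stage_in_process):.3f}s")
```

Absolute timings depend on the machine, so no expected output is given. The point of the comparison is only that the per-file cost of spawning a process is paid twice in the first variant and not at all in the second.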
1 parent e585e05 commit 686e7c7

19 files changed: +1490 -399 lines changed

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+name: Judge unit tests
+on:
+  merge_group:
+  pull_request:
+    branches:
+      - main
+      - '[0-9]+.[0-9]+'
+
+jobs:
+  judge-unit-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        php-version: ['8.2', '8.3', '8.4']
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup PHP
+        uses: shivammathur/setup-php@v2
+        with:
+          php-version: ${{ matrix.php-version }}
+
+      - name: Cache composer dependencies
+        uses: actions/cache@v4
+        with:
+          path: webapp/vendor
+          key: composer-${{ hashFiles('webapp/composer.lock') }}
+          restore-keys: |
+            composer-
+
+      - name: Install composer dependencies
+        working-directory: webapp
+        run: composer install --no-scripts --ignore-platform-reqs
+
+      - name: Run judge unit tests
+        working-directory: judge/tests
+        run: ../../webapp/vendor/bin/phpunit --configuration phpunit.xml --testdox

doc/manual/config-advanced.rst

Lines changed: 10 additions & 13 deletions
@@ -223,18 +223,16 @@ To allow for problems that do not fit within the standard scheme of
 fixed input and/or output, DOMjudge has the possibility to change the
 way submissions are run and checked for correctness.
 
-The back end script ``testcase_run.sh`` that handles
-the running and checking of submissions, calls separate programs
-for running submissions and comparison of the results. These can be
-specialised and adapted to the requirements per problem. For this, one
+The judgedaemon that handles the running and checking of submissions, calls
+separate programs for running submissions and comparison of the results. These
+can be specialised and adapted to the requirements per problem. For this, one
 has to create executable archives as described above.
-Then the executable must be
-selected in the ``special_run`` and/or ``special_compare``
-fields of the problem (an empty value means that the default run and
-compare scripts should be used; the defaults can be set in the global
-configuration settings). When creating custom run and compare
-programs, we recommend reusing wrapper scripts that handle the
-tedious, standard part. See the boolfind example for details.
+Then the executable must be selected in the ``special_run`` and/or
+``special_compare`` fields of the problem (an empty value means that the
+default run and compare scripts should be used; the defaults can be set in the
+global configuration settings). When creating custom run and compare programs,
+we recommend reusing wrapper scripts that handle the tedious, standard part.
+See the boolfind example for details.
 
 Compare programs
 ----------------
@@ -257,8 +255,7 @@ output. The validator program should not make any assumptions on its
 working directory.
 
 For more details on writing and modifying a compare (or validator)
-script, see the ``boolfind_cmp`` example and the comments at the
-top of the file ``testcase_run.sh``.
+script, see the ``boolfind_cmp`` example.
 
 Run programs
 ------------

etc/judgehost-static.php.in

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ define('CHROOTDIR', '@judgehost_chrootdir@');
 define('RUNUSER', '@RUNUSER@');
 define('RUNGROUP', '@RUNGROUP@');
 
-// Possible exitcodes from testcase_run.sh and their meaning.
+// Possible exitcodes from compile scripts and their meaning.
 $EXITCODES = array (
     0 => 'correct',
     101 => 'compiler-error',

example_problems/hello/submissions/no_output/test-timelimit-bug.c

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
  * This is issue #122 and fixed now, see old description below.
  *
  * The reason for TIMELIMIT was that program and runguard stderr are
- * mixed and searched by testcase_run.sh for the string 'timelimit exceeded'.
+ * mixed and searched by judgedaemon for the string 'timelimit exceeded'.
  * This a minor bug that doesn't provide a team any advantages. It
  * could be fixed by having runguard write the submission stderr to a
  * separate file.
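
The old bug described in this comment can be sketched as follows. This is a hypothetical Python model, not DOMjudge code: the function names, the `CORRECT` fallback verdict and the marker handling are assumptions made only to show why searching a mixed stderr stream is fragile and why a separate stream avoids the problem.

```python
TIMELIMIT_MARKER = "timelimit exceeded"


def verdict_from_mixed_stderr(program_stderr: str, runguard_stderr: str) -> str:
    """Search one combined stream, so the submission's own stderr output
    can trigger the marker (models the old behaviour; hypothetical)."""
    mixed = program_stderr + runguard_stderr
    return "TIMELIMIT" if TIMELIMIT_MARKER in mixed else "CORRECT"


def verdict_from_separate_stderr(program_stderr: str, runguard_stderr: str) -> str:
    """Search only the runguard stream, ignoring whatever the submission
    wrote to stderr (models the suggested fix; hypothetical)."""
    return "TIMELIMIT" if TIMELIMIT_MARKER in runguard_stderr else "CORRECT"


if __name__ == "__main__":
    spoofed = "timelimit exceeded\n"  # printed by the submission itself
    print(verdict_from_mixed_stderr(spoofed, ""))     # prints TIMELIMIT
    print(verdict_from_separate_stderr(spoofed, ""))  # prints CORRECT
```

As the comment notes, the false verdict only hurts the team that triggers it, which is why it was considered a minor bug.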

example_problems/hello/submissions/time_limit_exceeded/stress-test-fork-setsid.c

Lines changed: 3 additions & 4 deletions
@@ -1,15 +1,14 @@
 /*
  * When cgroups are enabled, this will hit timelimit and then be killed.
  *
- * Without cgroups however, this will crash the judging daemon: it
- * forks processes and places these in a new session, such that
- * testcase_run cannot retrace and kill these. They are left running
+ * Without cgroups however (not supported anymore), this will crash the judging
+ * daemon: it forks processes and places these in a new session, such that
+ * the judgedaemon cannot retrace and kill these. They are left running
  * and should be killed before restarting the judging daemon. The
  * cgroups code can detect this because the processes will belong to the
  * same cgroup.
  *
  * @EXPECTED_RESULTS@: TIMELIMIT
- * (or judgedaemon crash when cgroups disabled)
  */
 
 #include <unistd.h>

example_problems/hello/submissions/time_limit_exceeded/test-fork.c

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
  * timeout.
  *
  * The result should be a TIMELIMIT and the running forked programs
- * killed by testcase_run.
+ * killed by the judgedaemon.
  *
  * @EXPECTED_RESULTS@: TIMELIMIT
  */

judge/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -6,3 +6,4 @@
 /evict
 /create-cgroups.service
 
+/tests/.phpunit.result.cache

judge/Makefile

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ runpipe: runpipe.cc $(LIBOBJECTS)
 
 install-judgehost:
 	$(INSTALL_PROG) -t $(DESTDIR)$(judgehost_libjudgedir) \
-		compile.sh build_executable.sh testcase_run.sh chroot-startstop.sh \
+		compile.sh build_executable.sh chroot-startstop.sh \
 		check_diff.sh evict version_check.sh
 	$(INSTALL_DATA) -t $(DESTDIR)$(judgehost_libjudgedir) \
 		judgedaemon.main.php run-interactive.sh
