Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 32 additions & 28 deletions spec/2023-07-draft.md
Original file line number Diff line number Diff line change
Expand Up @@ -446,7 +446,7 @@ Markdown statements also support `.svg`.
- For problem statements provided as PDFs:
the judge system will display the PDF verbatim;
therefore any sample data must be included in the PDF.
The judge system is not required to reconcile sample data embedded in PDFs with the `sample` test data group nor to validate it in any other way.
The judge system is not required to reconcile sample data embedded in PDFs with the `sample` test data nor to validate it in any other way.

### LaTeX Environment and Supported Subset

Expand Down Expand Up @@ -573,10 +573,10 @@ The format of `test_group.yaml` is as follows:

Key | Type | Default | Comments
------------------------- | ----------------------------------- | ---------------------------------------------- | --------
`max_score` | Integer or `unbounded` | 100 in `secret`, otherwise `unbounded` | The maximum possible score of the test data group. Must be a non-negative integer or `unbounded`. This key is only permitted for the `secret` group and its subgroups.
`score_aggregation` | `pass-fail`, `sum`, or `min` | `sum` in `secret`, otherwise `pass-fail` | How the score is determined based on the scores of the contained groups or test cases. See [Result Aggregation](#result-aggregation). This key is only permitted for the `secret` group and its subgroups.
`static_validation_score` | Integer or `pass-fail` | | The maximum score of the static validation test case, or `pass-fail`. See [Static Validator](#static-validator).
`require_pass` | String or sequence of strings | empty sequence | Other test data groups whose test cases a submission must pass in order to receive a score for this test group. See [Result Aggregation](#result-aggregation). This key is only permitted for the `secret` group and its subgroups.
`max_score` | Integer or `unbounded` | 100 in `secret`, otherwise `unbounded` | The maximum possible score of the test data group or `secret`. Must be a non-negative integer or `unbounded`. This key is not permitted in `sample`.
`score_aggregation` | `pass-fail`, `sum`, or `min` | `sum` in `secret`, otherwise `pass-fail` | How the score is determined based on the scores of the contained groups or test cases. See [Result Aggregation](#result-aggregation). This key is not permitted in `sample`.
`static_validation_score` | Integer or `pass-fail` | | The maximum score of the static validation test case, or `pass-fail`. See [Static Validator](#static-validator).
`require_pass` | String or sequence of strings | empty sequence | Other test data groups (or `sample`) whose test cases a submission must pass in order to receive a score for this test data group. See [Result Aggregation](#result-aggregation). This key is not permitted in `sample`.
`args` | Sequence of strings | empty sequence | See [Test Case Configuration](#test-case-configuration).
`input_validator_args` | Sequence of strings or map of strings to sequences of strings | empty sequence | See [Test Case Configuration](#test-case-configuration).
`static_validator_args` | Sequence of strings | empty sequence | See [Static Validator](#static-validator).
Expand All @@ -592,7 +592,7 @@ Every subdirectory of `data/secret/` is a test data group and may contain a `tes
`data/secret` can only have test data groups *or* test cases, never both.
That is, if there are any directories under `data/secret/` there must not be any `.in` files directly in `data/secret/` and vice versa.

The test groups themselves can contain directories, but not further groups.
The test data groups themselves can contain directories, but not further groups.
This means that there are no `test_group.yaml` further down in the directory hierarchy.

A directory must not have the same name as a test case in the same directory.
Expand Down Expand Up @@ -720,7 +720,7 @@ Every submission is run on these test cases.
Sample test cases do not contribute to the problem score for [scoring problems](#scoring-problems).
If a `score.txt` file is produced on sample test cases on a scoring problem, it is not an error, but simply ignored.

`data/sample` must not contain test groups.
`data/sample` must not contain any test data groups.
It may be missing (for problems with no samples) or empty.

#### Samples Shown in the Problem Statement
Expand Down Expand Up @@ -845,13 +845,13 @@ Every submission matched by the glob pattern must satisfy:
The tooling should check the constraints for consistency,
such as that two disjoint `permitted` sets are never applied to a single `(submission, testcase)` pair.

### Groups
### Test Data Groups

The `permitted`, `required`, `score`, `message`, and `use_for_time_limit` requirements can also be given for only a subset of test cases,
by adding them under a key with the name of a test group (relative to `data/`).
by adding them under a key with the name of a test data group (relative to `data/`).
In this case, the `permitted`, `required`, `message`, and `use_for_time_limit` keys only apply to the set of test cases (recursively) in the given group.

The `score` key puts a constraint on the aggregated score of a given test group, _not_ on the _set_ of test cases the group contains.
The `score` key puts a constraint on the aggregated score of a given test data group, _not_ on the _set_ of test cases the group contains.

For example, the configuration below tests that the submission solves all cases in `group1`, but times out on at least one case in `group2`.
```yaml
Expand All @@ -867,8 +867,8 @@ solves_group_1.py:

#### Glob patterns

Glob patterns can be used to apply restrictions to a subset of submissions. It is also possible to use glob patterns to put restrictions on a subset of test
cases and test groups, for example, when test groups are not used:
Glob patterns can be used to apply restrictions to a subset of submissions.
It is also possible to use glob patterns to put restrictions on a subset of test cases and test data groups, for example, when groups are not used:
```yaml
time_limit_exceeded/solves_easy_cases.py:
sample:
Expand All @@ -883,9 +883,9 @@ This means that the submission must solve all samples and all easy cases,
but must time out on at least one of the hard cases.

Submission glob patterns are matched against all paths to files and directories of submissions inside and relative to the `submissions/` directory.
Test case glob patterns are matched against all paths of test groups and test cases relative to `data/`,
Test case glob patterns are matched against all paths of test data groups and test cases relative to `data/`,
excluding the trailing `.in`. Wildcards (`*`) only match within a file name (i.e., do not match `/`).
A test case is matched by the glob pattern if either itself or any of its parent test groups is matched by it,
A test case is matched by the glob pattern if either itself or any of its parent test data groups is matched by it,
and similarly a submission is matched if either itself or a parent directory is matched.

Using `**` to match any number of directories and `[xyz]` to match only a subset of characters is not supported.
Expand Down Expand Up @@ -977,7 +977,7 @@ Validation fails if any validator fails.

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of.
All input validators provided will be run on every test data file using the arguments specified for the test data group (or `sample`, or `secret`) they are part of.
Validation fails if any validator fails.

When invoked, the input validator will get the input file on stdin.
Expand Down Expand Up @@ -1017,17 +1017,17 @@ A static validator may be provided under the `static_validator` directory, simil

### Static Validation Test Cases

Each test group may define a static validation test case.
Each test data group (or `sample`, or `secret`) may define a static validation test case.
It is an error to define static validation test cases without providing a static validator.
A static validation test case is defined within a group's `test_group.yaml` file by specifying the key `static_validation_score`.
A static validation test case is defined in `test_group.yaml` by specifying the key `static_validation_score`.

If `static_validation_score` is specified as a non-negative integer, then it is the maximum score of the static validation test case (see [Scoring Problems](#scoring-problems) for details).
If it is specified as `pass-fail`, then the test group it is part of must have `pass-fail` aggregation, or the problem must be of type `pass-fail`.
If it is specified as `pass-fail`, then `score_aggregation` must be set to `pass-fail`, or the problem must be of type `pass-fail`.
If `static_validator_args` is given, then it defines arguments passed to the static validator.

It is an error to:
- provide a static validator for `submit-answer` type problems,
- specify a `static_validation_score` in a test group with `pass-fail` aggregation or a problem that does not have the type `scoring`,
- specify a `static_validation_score` in a test data group with `pass-fail` aggregation or a problem that does not have the type `scoring`,
- specify `static_validator_args` without specifying `static_validation_score`.

### Invocation
Expand Down Expand Up @@ -1340,31 +1340,35 @@ It is a judge error if:
- an output or static validator produces a `score.txt` for a test case with bounded maximum score with a value that exceeds this maximum score;
- an output or static validator produces a `score.txt` or `score_multiplier.txt` with invalid contents.

#### Scoring Test Groups
#### Scoring Test Data Groups

The score of `secret` is determined by its groups or test cases (it can only have one or the other).
The score of a test group is determined by its test cases.
The score of a test data group is determined by its test cases.
The score depends on the aggregation mode, which is either `pass-fail`, `sum`, or `min`.

- If a group uses `pass-fail` aggregation, the group must have bounded maximum score.
If the submission receives an accepted verdict for all test cases in the group,
the score of the group is equal to its maximum possible score.
Otherwise the group score is 0.
- If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases or groups.
- If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases.
- If a group uses `min` aggregation, then the group score is the minimum of these scores.

The submission score is the score of the `secret` group.
The score of `secret` is determined by its groups or test cases (it can only have one or the other).
- If `secret` uses `pass-fail` aggregation, then `secret` must have bounded maximum score. If the submission receives an accepted verdict for all test cases in `secret`, or for all test cases in its test data groups (which must also have `pass-fail` aggregation), then the score of `secret` is equal to its maximum possible score. Otherwise the score of `secret` is 0.
- If `secret` uses `sum` or `min` aggregation, then the score of `secret` is computed from the scores of its test cases or test data groups, analogously to test data group scoring above.

The submission score is the score of `secret`.

It is a judge error if the score of any group or subgroup exceeds its `max_score`.
It is a judge error if the score of `secret` or any test data group exceeds its `max_score`.

#### Required Dependent Groups

A group may specify that it depends on some other test data groups.
Each required group must be either `sample` or have `pass-fail` aggregation.
A test data group or `secret` may specify that it depends on some other test data groups or `sample`.
Each required group must have `pass-fail` aggregation.
The dependent group will only be run if the group being depended on receives an accepted verdict for all test cases in the group.
If the dependent group is not run, the group score is 0.

The paths of these required groups, relative to the `data` folder, are listed under the `require_pass` key.
Comment on lines 1367 to 1370
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, since secret and sample is not a "group", nothing has been stated for secret and sample. However, it will be replaced with something else in a future PR #487.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good points. We'll fix it in #487 as you say...

A path consists of zero or more directory names followed by a directory or file name, with a `/` character separating consecutive names. Each name must conform to the [general requirements](#general-requirements) on directory and file names and the specified test data group must exist.

The path of a group, relative to the `data/` folder, must come later lexicographically than the paths of all test cases and groups it depends on.
A group must not depend on a group that is lexicographically earlier than itself.
A group must not depend on a group that is hierarchally above itself.