Commit f672ad7

Populate dataset card metadata with schema information
1 parent 4ce39fd commit f672ad7

10 files changed: +138 −177 lines

.github/workflows/ci-validate-schema.yml (7 additions, 7 deletions)

```diff
@@ -1,4 +1,4 @@
-name: Validate dataset_infos.json
+name: Validate dataset_features.yml

 on:
   push:
@@ -22,19 +22,19 @@ jobs:
       - name: Install dependencies
         run: pip install -e .

-      - name: Check dataset_infos.json exists
+      - name: Check dataset_features.yml exists
         run: |
-          if [ ! -e dataset_infos.json ]; then
-            echo "dataset_infos.json is missing. Please run 'python scripts/update_schema.py' and commit the file to the repo." >&2
+          if [ ! -e dataset_features.yml ]; then
+            echo "dataset_features.yml is missing. Please run 'python scripts/update_schema.py' and commit the file to the repo." >&2
             exit 1
           fi

-      - name: Regenerate dataset_infos.json
+      - name: Regenerate dataset_features.yml
         run: python scripts/update_schema.py

       - name: Verify schema is up to date
         run: |
-          if ! git diff --quiet dataset_infos.json; then
-            echo "dataset_infos.json is out of date. Please run 'python scripts/update_schema.py' and commit the updated file." >&2
+          if ! git diff --quiet dataset_features.yml; then
+            echo "dataset_features.yml is out of date. Please run 'python scripts/update_schema.py' and commit the updated file." >&2
             exit 1
           fi
```

Makefile (1 addition, 1 deletion)

```diff
@@ -14,7 +14,7 @@ publish:
 	@echo "Uploading package to PyPI..."
 	@bash scripts/publish.sh

-# Update dataset_infos.json
+# Update HF dataset features
 update-schema:
 	@echo "Updating schema..."
 	python scripts/update_schema.py
```

README.md (6 additions, 3 deletions)

```diff
@@ -38,9 +38,12 @@ Upload the scored results to HuggingFace datasets.
 # Administer the HuggingFace datasets
 Prior to publishing scores, two HuggingFace datasets should be set up, one for full submissions and one for results files.

-If you want to call `load_dataset()` on the results dataset (e.g., for populating a leaderboard), you probably want to explicitly tell HuggingFace about the schema and dataset structure (otherwise, HuggingFace may fail to properly auto-convert to Parquet):
-- *Schema:* Upload the [results schema](https://github.com/allenai/agent-eval/blob/main/dataset_infos.json) to the root of the results dataset.
-- *Dataset structure:* Specify the `configs` attribute in the YAML metadata block at the top of the `README.md` file at the root of the results dataset. For example, see the [sample metadata block](sample-config-dataset-structure.yml) for the [sample config](sample-config.yml). Using `agenteval publish` will automatically add the corresponding config name and split to the YAML metadata if it is missing.
+If you want to call `load_dataset()` on the results dataset (e.g., for populating a leaderboard), you probably want to explicitly tell HuggingFace about the schema and dataset structure (otherwise, HuggingFace may fail to properly auto-convert to Parquet).
+This is done by updating the `configs` attribute in the YAML metadata block at the top of the `README.md` file at the root of the results dataset (the metadata block is delimited by lines containing just `---` above and below it).
+This attribute should contain a list of configs, each of which specifies the schema (under the `features` key) and the dataset structure (under the `data_files` key).
+See [sample-config-hf-readme-metadata.yml](sample-config-hf-readme-metadata.yml) for a sample metadata block corresponding to [sample-config.yml](sample-config.yml) (note that the metadata references the [raw schema data](dataset_features.yml), which must be copied in).
+
+To facilitate initializing new configs, `agenteval publish` will automatically add this metadata if it is missing.

 # Development
```
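Copying the raw schema into the README metadata block by hand requires re-indenting it under the config's `features:` key. A minimal stdlib sketch of that re-indentation step (the helper name and 4-space indent are illustrative assumptions; `agenteval publish` handles this bookkeeping automatically):

```python
def indent_yaml_block(raw_yaml: str, indent: int = 4) -> str:
    """Re-indent the contents of dataset_features.yml so its top-level
    list nests under a config's `features:` key in the README metadata
    block. (Helper name and indent width are illustrative assumptions.)"""
    pad = " " * indent
    # Indent non-blank lines only, so blank lines stay blank.
    return "\n".join(pad + line if line.strip() else line
                     for line in raw_yaml.splitlines())

snippet = "- name: split\n  dtype: string"
print(indent_yaml_block(snippet))
```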

dataset_features.yml (new file, 71 additions)

```yaml
- name: suite_config
  struct:
  - name: name
    dtype: string
  - name: version
    dtype: string
  - name: splits
    list:
    - name: name
      dtype: string
    - name: tasks
      list:
      - name: name
        dtype: string
      - name: path
        dtype: string
      - name: primary_metric
        dtype: string
      - name: tags
        sequence: string
- name: split
  dtype: string
- name: results
  list:
  - name: task_name
    dtype: string
  - name: metrics
    list:
    - name: name
      dtype: string
    - name: value
      dtype: float64
  - name: model_usages
    list:
      list:
      - name: model
        dtype: string
      - name: usage
        struct:
        - name: input_tokens
          dtype: int64
        - name: output_tokens
          dtype: int64
        - name: total_tokens
          dtype: int64
        - name: input_tokens_cache_write
          dtype: int64
        - name: input_tokens_cache_read
          dtype: int64
        - name: reasoning_tokens
          dtype: int64
  - name: model_costs
    sequence: float64
- name: submission
  struct:
  - name: submit_time
    dtype: timestamp[us, tz=UTC]
  - name: username
    dtype: string
  - name: agent_name
    dtype: string
  - name: agent_description
    dtype: string
  - name: agent_url
    dtype: string
  - name: logs_url
    dtype: string
  - name: logs_url_public
    dtype: string
  - name: summary_url
    dtype: string
```
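For orientation, this features list parses into plain Python data (a hand-transcribed excerpt, not generated output) in the shape that `datasets` expects for its features metadata:

```python
# Hand-transcribed excerpt of dataset_features.yml as Python data:
# each entry is a dict with a "name" key plus a type key
# ("dtype", "struct", "list", or "sequence").
features_excerpt = [
    {"name": "split", "dtype": "string"},
    {"name": "results", "list": [
        {"name": "task_name", "dtype": "string"},
        {"name": "metrics", "list": [
            {"name": "name", "dtype": "string"},
            {"name": "value", "dtype": "float64"},
        ]},
    ]},
]
top_level = [f["name"] for f in features_excerpt]
print(top_level)  # ['split', 'results']
```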

dataset_infos.json

Lines changed: 0 additions & 148 deletions
This file was deleted.

pyproject.toml (7 additions, 1 deletion)

```diff
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "agent-eval"
-version = "0.1.0"
+version = "0.1.1"
 description = "Agent evaluation toolkit"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -43,3 +43,9 @@ where = ["src"]
 [tool.pytest.ini_options]
 testpaths = ["tests"]
 python_files = ["test_*.py"]
+
+[tool.setuptools]
+include-package-data = true
+
+[tool.setuptools.data-files]
+"agenteval" = ["dataset_features.yml"]
```
Lines changed: 2 additions & 0 deletions

```diff
@@ -6,4 +6,6 @@ configs:
       path: 0.1-dev/validation/*.json
     - split: test
       path: 0.1-dev/test/*.json
+  features:
+    # Insert dataset_features.yml here at the proper indentation level.
 ---
```
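Filled in, the merged metadata block might look like the following sketch (the config name is a hypothetical placeholder; the `features` entries are copied from dataset_features.yml and elided here for brevity):

```yaml
---
configs:
- config_name: 0.1-dev   # hypothetical config name
  data_files:
  - split: validation
    path: 0.1-dev/validation/*.json
  - split: test
    path: 0.1-dev/test/*.json
  features:
  - name: split
    dtype: string
  # ...remaining entries copied verbatim from dataset_features.yml...
---
```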

scripts/update_schema.py (6 additions, 6 deletions)

```diff
@@ -1,22 +1,22 @@
 #!/usr/bin/env python3
 """
-Script to regenerate dataset_infos.json from the Pydantic schema.
+Script to regenerate dataset_features.yml from the Pydantic schema.
 """
 from pathlib import Path

-from agenteval.schema_generator import generate_dataset_infos
+from agenteval.schema_generator import write_dataset_features


 def update_schema():
     repo_root = Path(__file__).parent.parent
-    output_path = repo_root / "dataset_infos.json"
-    generate_dataset_infos(str(output_path))
+    output_path = repo_root / "dataset_features.yml"
+    write_dataset_features(str(output_path))


 def main():
-    """Regenerate dataset_infos.json from Pydantic schema"""
+    """Regenerate dataset_features.yml from Pydantic schema"""
     update_schema()
-    print("✅ dataset_infos.json updated at dataset_infos.json")
+    print("✅ dataset_features.yml updated")


 if __name__ == "__main__":
```

src/agenteval/schema_generator.py (23 additions, 7 deletions)

```diff
@@ -3,11 +3,12 @@
 """

 import datetime
-import json
 import types
+from importlib import resources
 from typing import Union, get_args, get_origin

 import pyarrow as pa
+import yaml
 from datasets import Features
 from pydantic import BaseModel

@@ -61,19 +62,34 @@ def _schema_from_pydantic(model: type[BaseModel]) -> list[pa.Field]:

 def features_from_pydantic(model: type[BaseModel]) -> Features:
     """
-    Build a Hugging Face Features object from a Pydantic BaseModel using PyArrow schema.
+    Build a HuggingFace Features object from a Pydantic BaseModel using PyArrow schema.
     """
     pa_fields = _schema_from_pydantic(model)
     pa_schema = pa.schema(pa_fields)
     return Features.from_arrow_schema(pa_schema)


-def generate_dataset_infos(output_path: str = "dataset_infos.json"):
+def write_dataset_features(output_path: str) -> None:
     """
-    Generate a dataset_infos.json file from the EvalResult schema.
+    Write the HuggingFace Features data inferred from the EvalResult schema.
     """
     features = features_from_pydantic(EvalResult)
-    infos = {"default": {"features": features.to_dict()}}
     with open(output_path, "w", encoding="utf-8") as f:
-        json.dump(infos, f, indent=2)
-    print(f"Generated dataset_infos.json at {output_path}")
+        yaml_values = features._to_yaml_list()
+        yaml.safe_dump(yaml_values, f, indent=2, sort_keys=False)
+
+
+def load_dataset_features(input_path: str | None = None) -> Features:
+    """
+    Load the HuggingFace Features data from a YAML file.
+    """
+    if input_path is None:
+        # load the shipped dataset_features.yml from the package
+        with resources.open_text(
+            "agenteval", "dataset_features.yml", encoding="utf-8"
+        ) as f:
+            yaml_values = yaml.safe_load(f)
+    else:
+        with open(input_path, "r", encoding="utf-8") as f:
+            yaml_values = yaml.safe_load(f)
+    return Features._from_yaml_list(yaml_values)
```
