Package schema by jbragg · Pull Request #4 · allenai/agent-eval

jbragg · 2025-05-07T23:34:01Z

Fixes a problem we hit where a scored field held its default value (number of tokens = 0) and was therefore dropped during serialization (because of exclude_defaults=True in model_dump_json). The HuggingFace dataset then inferred a null value for the missing field, and de-serialization in the leaderboard app failed because the number-of-tokens field expected an int.
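The failure chain described above can be reproduced in miniature. The sketch below uses only the standard library and mimics the effect of exclude_defaults=True rather than calling pydantic itself. The `Usage` record, `dump_excluding_defaults`, and `load_strict` are hypothetical names made up for illustration, and the "missing column becomes null" step is a stand-in for schema inference, not the actual HuggingFace behavior.

```python
import json
from dataclasses import MISSING, dataclass, fields


@dataclass
class Usage:
    """Hypothetical scored record; not the real agent-eval model."""
    model: str
    num_tokens: int = 0


def dump_excluding_defaults(obj) -> str:
    """Mimic exclude_defaults=True: omit fields equal to their declared default."""
    data = {
        f.name: getattr(obj, f.name)
        for f in fields(obj)
        if f.default is MISSING or getattr(obj, f.name) != f.default
    }
    return json.dumps(data)


def load_strict(raw: str) -> Usage:
    """Stand-in for the reader: a missing column comes back as None, then types are checked."""
    row = json.loads(raw)
    filled = {f.name: row.get(f.name) for f in fields(Usage)}
    if not isinstance(filled["num_tokens"], int):
        raise TypeError("num_tokens expected an int, got None")
    return Usage(**filled)


# A record whose token count equals the default loses that field on the way out.
raw = dump_excluding_defaults(Usage(model="m"))
try:
    load_strict(raw)
except TypeError as err:
    print("round trip failed:", err)
```

A record with a non-default token count round-trips fine, which is why the problem only shows up for some rows.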

This PR:

- Adds functionality to produce a fixed schema file (dataset_infos.json) which, once uploaded to the root of the results HuggingFace repo, should obviate the need for HuggingFace schema inference.
- Removes the exclusion of default and None values during serialization. These exclusions were originally meant to avoid automatic schema inference problems, which the fixed schema makes irrelevant (and the exclusions themselves caused problems like the one described above).
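As a rough illustration of the second change, the sketch below serializes every field, including defaults and None values, so that each row carries the same set of keys and nothing is left for a reader to infer. It uses standard-library dataclasses instead of the project's real models; `ScoredResult` and `dump_full` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ScoredResult:
    """Hypothetical record with a default-valued field and an optional field."""
    task: str
    num_tokens: int = 0
    cost: Optional[float] = None


def dump_full(obj) -> str:
    """Serialize every field; defaults and None are written out explicitly."""
    return json.dumps(asdict(obj))


results = [ScoredResult("a"), ScoredResult("b", num_tokens=12, cost=0.5)]
rows = [json.loads(dump_full(r)) for r in results]

# Every row exposes the same columns, whether or not a value equals its default.
same_keys = all(set(row) == set(rows[0]) for row in rows)
print(same_keys, rows[0])
```

The trade-off is slightly larger files in exchange for rows that always match the declared schema.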

rodneykinney

LGTM

AmberRose2

Looks good, thank you! I think this also makes it a lot clearer for users!

jbragg · 2025-05-08T23:21:01Z

It turned out that HF wasn't reading from dataset_infos.json, so I switched to adding the schema info to the README, which seems to work.
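For readers unfamiliar with dataset card metadata, here is a minimal sketch of what putting schema info in the README could look like: a YAML front-matter block with a `dataset_info.features` list of column names and dtypes. This is not the code from this PR. The `ScoredResult` record, the `DTYPES` mapping, and `card_front_matter` are assumptions for illustration, and the exact layout HuggingFace expects should be checked against its documentation.

```python
from dataclasses import dataclass, fields

# Assumed mapping from Python type names to HuggingFace dtype names.
DTYPES = {"int": "int64", "float": "float64", "str": "string", "bool": "bool"}


@dataclass
class ScoredResult:
    """Hypothetical record; the real schema lives in agent-eval."""
    task: str
    num_tokens: int = 0
    score: float = 0.0


def card_front_matter(model) -> str:
    """Render a YAML front-matter block listing each field's name and dtype."""
    lines = ["---", "dataset_info:", "  features:"]
    for f in fields(model):
        # f.type is a type object normally, or a string under postponed annotations.
        type_name = getattr(f.type, "__name__", f.type)
        lines.append(f"  - name: {f.name}")
        lines.append(f"    dtype: {DTYPES[type_name]}")
    lines.append("---")
    return "\n".join(lines) + "\n"


print(card_front_matter(ScoredResult))
```

Generating the block from the model definition keeps the README schema and the serialized fields from drifting apart.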

jbragg force-pushed the jbragg/package-schema branch 17 times, most recently from bacb5bc to c5ebed5 May 8, 2025 07:06

Package schema

9502370

jbragg force-pushed the jbragg/package-schema branch from c5ebed5 to 9502370 May 8, 2025 07:12

jbragg marked this pull request as ready for review May 8, 2025 07:25

jbragg requested review from AmberRose2 and rodneykinney May 8, 2025 07:26

Documentation about HF dataset administration

4ce39fd

jbragg marked this pull request as draft May 8, 2025 16:26

rodneykinney approved these changes May 8, 2025

AmberRose2 approved these changes May 8, 2025

jbragg force-pushed the jbragg/package-schema branch from f672ad7 to c45c522 May 8, 2025 23:08

Populate dataset card metadata with schema information

53a56b9

jbragg force-pushed the jbragg/package-schema branch from c45c522 to 53a56b9 May 8, 2025 23:17

jbragg marked this pull request as ready for review May 8, 2025 23:18

jbragg merged commit fde83e3 into main May 8, 2025
3 checks passed

jbragg deleted the jbragg/package-schema branch May 8, 2025 23:32

jbragg merged 3 commits into main from jbragg/package-schema

3 participants
