feat: improve Data Package metadata compliance with CKAN licenses and field schemas #755

Merged: dsmedia merged 11 commits into vega:main from dsmedia:fix/datapackage-toml-metadata on Feb 3, 2026

Conversation

@dsmedia (Collaborator) commented Feb 2, 2026

  • Add CKAN license identifiers (name field) to all resource licenses for machine-readable compliance
  • Add explicit field schemas for flare.json and flare-dependencies.json with required field constraints
  • Add datetime format strings (e.g., %Y/%m/%d %H:%M) to date fields so they validate consistently across tools (e.g., when validating with frictionless-ts)
  • Correct field types where inferred types didn't match actual data (e.g., Miles_per_Gallon as number)
  • Improve resource descriptions for clarity and accuracy
  • Update CONTRIBUTING.md: replace `uvx` with `uv run` so tools run at lockfile-consistent versions
  • Fix duplicate source entry for flights-airport.csv

Note: Some issues with Frictionless Data v2 validation tooling (Python and/or TypeScript) will need to be resolved before validation can be incorporated into this repo's CI.

dsmedia and others added 9 commits February 2, 2026 04:26
Add machine-readable license `name` fields using Open Definition IDs
per Data Package v2 spec. Fixes barley.json where description was in `name`.

Spec: https://datapackage.org/standard/data-package/#licenses
Licenses: https://licenses.opendefinition.org/licenses/groups/ckan.json
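As a rough illustration of the `name` fix described above, the following Python sketch flags license entries whose `name` looks like free-text prose rather than a short identifier. The sample entries, the `looks_like_license_id` helper, and the no-whitespace heuristic are all assumptions made for illustration; they are not taken from this repository or from the spec.

```python
import re

# Hypothetical license entries, shaped like Data Package `licenses` items.
licenses = [
    {"name": "CC0-1.0", "title": "Creative Commons Zero 1.0"},        # identifier in `name`
    {"name": "Released under a permissive license by the authors"},    # description misplaced in `name`
]

# Heuristic (an assumption, not a rule from the spec): identifiers are short
# tokens made of letters, digits, dots, underscores, and hyphens.
ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def looks_like_license_id(entry: dict) -> bool:
    """Return True when `name` is present and looks like a short identifier."""
    return bool(ID_PATTERN.fullmatch(entry.get("name", "")))


flags = [looks_like_license_id(entry) for entry in licenses]
print(flags)  # [True, False]
```

A stricter check would compare `name` against the identifiers published in the CKAN license list linked above; this sketch avoids fetching that list so it stays self-contained.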
Add strptime format patterns to date/datetime fields for cross-framework
validation compatibility. The spec's `format: "any"` is not implemented
by frictionless-ts.

Spec: https://datapackage.org/standard/table-schema/#date

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
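To show what an explicit strptime pattern buys a validator, here is a minimal Python sketch that checks values against the `%Y/%m/%d %H:%M` pattern quoted in the PR description. The sample values and the `matches_format` helper are hypothetical, and Python's `datetime.strptime` stands in for whatever parser a given validation tool actually uses.

```python
from datetime import datetime

# The example pattern from the PR description.
FORMAT = "%Y/%m/%d %H:%M"


def matches_format(value: str, fmt: str = FORMAT) -> bool:
    """Return True if `value` parses under the strptime pattern `fmt`."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Hypothetical sample values: one in the declared format, one in ISO 8601.
samples = ["2001/01/10 18:20", "2001-01-10T18:20:00"]
print([matches_format(s) for s in samples])  # [True, False]
```

With an explicit pattern, every tool parses the column the same way instead of each falling back to its own guessing rules.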
- cars.json: Miles_per_Gallon as `number` (has decimals, not integer)
- flare.json: full schema with optional fields for tree structure
- monarchs.json: commonwealth as explicit `boolean` type

Spec: https://datapackage.org/standard/table-schema/#field

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
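One way an inferred type can end up too narrow is when inference only looks at a sample of rows. The Python sketch below illustrates that failure mode with made-up values; the `narrowest_numeric_type` helper and the sample column are assumptions, not the inference logic used by this repository's build.

```python
def narrowest_numeric_type(values):
    """Return "integer" if every non-null value is whole, else "number"."""
    present = [v for v in values if v is not None]
    if all(float(v).is_integer() for v in present):
        return "integer"
    return "number"


# Hypothetical column: the first rows are whole numbers, a later one is not.
column = [18, 15, 16, None, 24.5]

print(narrowest_numeric_type(column[:3]))  # integer (sample only)
print(narrowest_numeric_type(column))      # number (full column)
```

Declaring the type explicitly in the metadata sidesteps the question of how many rows a tool samples.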
Link to the actual Python script that generates this synthetic dataset.

Spec: https://datapackage.org/standard/data-package/#sources

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- flare.json: document tree structure (root/branch/leaf node fields)
- movies.json: document known data quality issues for teaching use
- flights-airport.csv: consolidate duplicate source entries

Spec: https://datapackage.org/standard/data-resource/#description

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…scriptions

- Add missing schema for flare-dependencies.json with source/target fields
- Clarify that source/target represent directed import dependencies
- Update descriptions to explicitly reference the Flare visualization library

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Using 'uvx' can run a newer version of ruff than the one locked in 'uv.lock', causing formatting discrepancies with CI. Switching to 'uv run' ensures the locked version is used.

Updated datapackage.json and datapackage.md via 'npm run build'.

Update `_data/datapackage_additions.toml` to explicitly mark key fields as
required in the schema definitions:
- `flare.json`: `id` and `name` are now required.
- `flare-dependencies.json`: `source` and `target` are now required.

This establishes a strict data contract in the generated `datapackage.json`,
improving documentation and allowing downstream tools to validate the integrity
of these hierarchical datasets.

See: https://datapackage.org/standard/table-schema/#field-constraints
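As a sketch of how a downstream tool might use these constraints, the Python below reads `constraints.required` from a Table Schema style descriptor and reports which required fields a row lacks. The schema dict mirrors the `source`/`target` fields named in this commit, but its exact shape here (field types are omitted), the sample rows, and the `missing_required` helper are assumptions for illustration.

```python
# Hypothetical schema fragment modeled on the flare-dependencies.json fields above.
schema = {
    "fields": [
        {"name": "source", "constraints": {"required": True}},
        {"name": "target", "constraints": {"required": True}},
    ]
}


def missing_required(schema: dict, row: dict) -> list:
    """Return the names of required fields that are absent or null in `row`."""
    required = [
        field["name"]
        for field in schema["fields"]
        if field.get("constraints", {}).get("required")
    ]
    return [name for name in required if row.get(name) is None]


# Hypothetical rows: the second one lacks `target`.
rows = [{"source": 2, "target": 5}, {"source": 3}]
print([missing_required(schema, row) for row in rows])  # [[], ['target']]
```

A full validator would also check types and formats; this only demonstrates the required-field part of the contract.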
@dsmedia changed the title from "Fix/datapackage toml metadata" to "feat: improve Data Package metadata compliance with SPDX licenses and field schemas" Feb 2, 2026
@dsmedia requested a review from domoritz February 2, 2026 04:37
@dsmedia changed the title from "feat: improve Data Package metadata compliance with SPDX licenses and field schemas" to "feat: improve Data Package metadata compliance with CKAN licenses and field schemas" Feb 2, 2026
@dsmedia merged commit ee7f6c6 into vega:main Feb 3, 2026
2 checks passed
@dsmedia deleted the fix/datapackage-toml-metadata branch February 3, 2026 23:07
