This repository was archived by the owner on Mar 15, 2022. It is now read-only.

Resources for Data.json Analysis

Philip Ashlock edited this page Mar 31, 2014 · 1 revision

We now have everything in place to give you full access to the test harvester, which mirrors the functionality of Data.gov’s harvester and can provide detailed error reports on any data.json files you are preparing.

There’s documentation here for all of this, including the steps for setting up access and how to use the tool.

In brief, the next step is to follow these steps, create your account, and then let us know when you’ve done that.

To report any bugs or feature requests with this tool, please add them to the data.gov issue tracker at https://github.com/GSA/data.gov/issues

Central CKAN

For those who are using inventory.data.gov, all critical issues should be addressed now and some agencies are already managing the full continuous lifecycle of updates to their data.json through this tool.

To report any bugs or feature requests please add them to the issue tracker at https://github.com/GSA/enterprise-data-inventory/issues

The validator on the Project Open Data Dashboard can be found at http://data.civicagency.org/validate and the change-set tool, which is still rough around the edges, can be found at http://data.civicagency.org/changeset

To report any bugs or feature requests please add them to the issue tracker at https://github.com/GSA/project-open-data-dashboard/issues

There is some overlap between this tool and the Data.gov Test Harvester. Both tools use the same schema file to validate data.json, but only the Test Harvester runs on the same underlying infrastructure as Data.gov, so you can be confident that its results match what you'll see when your data.json is harvested by Data.gov. However, there are also some advantages to using the validator on the Project Open Data Dashboard. These include:

  • No account or login is needed
  • You can share a link to the results
  • Faster results, e.g., no need to schedule a harvest job
  • You can copy and paste in raw JSON rather than use a URL to a data.json
  • User-friendly error reporting

There are also a few known issues with the tool. One is caused by scalability limitations that prevent it from parsing large data.json files (over about 5 MB), but this should be fixed soon. Other outstanding issues should also be listed on the issue tracker, though they may relate to functions of the dashboard other than the validator.
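Until the size limitation is fixed, it may help to check a file locally before submitting it. The following is a minimal sketch, not part of either tool: the function name `precheck_data_json` and the exact byte threshold are assumptions made for illustration (the limit is only described as "about 5 MB"), and the check covers only file size and JSON syntax, not the schema.

```python
import json

# Assumption: "about 5 MB" is approximated here as 5 * 1024 * 1024 bytes.
APPROX_SIZE_LIMIT_BYTES = 5 * 1024 * 1024


def precheck_data_json(raw: bytes, limit: int = APPROX_SIZE_LIMIT_BYTES) -> dict:
    """Report the size of a data.json payload and whether it parses as JSON.

    This is a hypothetical local helper; it does not validate against the schema.
    """
    report = {
        "size_bytes": len(raw),
        "over_limit": len(raw) > limit,
        "is_valid_json": True,
        "error": None,
    }
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Record the parser's message so the problem can be located.
        report["is_valid_json"] = False
        report["error"] = str(exc)
    return report
```

To use it, read the file in binary mode (for example `open("data.json", "rb").read()`) and pass the bytes in. A file that is flagged as over the limit is probably better suited to the Test Harvester than to the dashboard validator.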
