CKAN extension for DCAT-AP Switzerland. This extension provides two CKAN plugins:
- ogdch_cmd: provides commands to run in the background
- ogdch_admin: admin tools, mainly for handling Solr

Requirements:
- CKAN 2.10+
- ckanext-switzerland-ng
- ckanext-harvest
To see the help text for any command:
ckan ogdch [command] --help

The DataStore currently does not delete tables when the corresponding resource is deleted. This command finds these orphaned tables and deletes their rows to free space in the database. It is meant to be run regularly by a cron job.
ckan ogdch cleanup_datastore

When datasets are harvested, existing resources are reused where possible, but not all of them are. Some old resources remain with the state 'deleted'. This command deletes these orphaned resources. It is meant to be run regularly by a cron job.
It also deletes all files from the FileStore that are associated with the orphaned resources, and it comes with a dryrun option.
ckan ogdch cleanup_resources

When a resource is deleted, it is only marked as deleted in the database, and its associated file in the CKAN FileStore is not deleted. This command finds these orphaned files by checking whether their corresponding resource still exists.
It is meant to be run regularly by a cron job. It also comes with a dryrun option.
ckan ogdch cleanup_filestore

When a key in the package_extra table is no longer needed because it is no longer part of the dataset, it can be removed once the data have been migrated. This command removes the old key from the package_extra table and from the dependent table package_extra_revision. It comes with a dryrun option.
ckan ogdch cleanup_extras publishers --dryrun

This command deletes harvest jobs and objects, per source and overall, leaving only the latest n, where n and the source are optional arguments. It is intended for use in a cron job, so that harvest jobs are cleaned up regularly and the database is not overloaded with unneeded data from past job runs.
It has a dryrun option, so you can check what would be deleted before any changes are made to the database.
ckan ogdch cleanup_harvestjobs [{source_id}] [--keep={n}] [--dryrun]

This command looks for private datasets that have the scheduled field set and publishes each one that is due.
ckan ogdch publish_scheduled_datasets [--dryrun]

This command clears all datasets, jobs and objects related to a harvest source that has not been active for a given number of days (default: 30 days).
It is intended for use in a cron job and checks all harvest sources.
ckan ogdch clear_stale_harvestsources [--keep_harvestsource_days={n}]

The following API calls can be used if this plugin is installed:
- /api/3/action/ogdch_check_indexing
  Checks whether there are any unindexed packages in CKAN.
- /api/3/action/ogdch_reindex
  Reindexes Solr. You can use it with these arguments: package_id=<name of the dataset> and only_missing=true. In the latter case, only datasets missing from the index are reindexed.
- /api/3/action/ogdch_check_field?field=<name of the field>
  Checks the database for the given field: the field values are reported back together with the dataset name.
- /api/3/action/ogdch_latest_dataset_activities
  Shows the latest activities on datasets with username, dataset name and a message if available. (During harvesting, the user harvest-notification adds a message about the change that occurred on a dataset to the activity it creates for that change.)
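
As a rough illustration, the calls above could be made with curl. This is a minimal sketch under assumptions: the base URL http://localhost:5000, the CKAN_API_TOKEN variable and the helper names ogdch_action_url and ogdch_call are hypothetical and not part of the extension. Whether a given action needs a sysadmin token, or accepts its arguments as a query string, depends on your installation.

```shell
# Base URL of the CKAN instance (assumption: a local development instance).
CKAN_URL="${CKAN_URL:-http://localhost:5000}"

# Build the URL for an action; an optional second argument is a query string.
ogdch_action_url() {
    if [ -n "${2:-}" ]; then
        printf '%s/api/3/action/%s?%s\n' "$CKAN_URL" "$1" "$2"
    else
        printf '%s/api/3/action/%s\n' "$CKAN_URL" "$1"
    fi
}

# Call an action with curl. The Authorization header is only sent when
# CKAN_API_TOKEN is set (hypothetical variable name).
ogdch_call() {
    url=$(ogdch_action_url "$@")
    if [ -n "${CKAN_API_TOKEN:-}" ]; then
        curl -sS -H "Authorization: $CKAN_API_TOKEN" "$url"
    else
        curl -sS "$url"
    fi
}
```

With these helpers, `ogdch_call ogdch_check_indexing` would query the indexing check, and `ogdch_call ogdch_reindex only_missing=true` would request a reindex of only the missing datasets.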
To install ckanext-ogdchcommands:
- Activate your CKAN virtual environment, for example:

  . /usr/lib/ckan/default/bin/activate

- Install the ckanext-ogdchcommands Python package into your virtual environment:

  git clone https://github.com/opendata-swiss/ckanext-ogdchcommands.git
  cd ckanext-ogdchcommands
  pip install .

- Add ogdch_cmd to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/production.ini) if you want to use the ckan commands.
- Add ogdch_admin to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/production.ini) if you want to use the Solr admin tools.
- Restart CKAN. For example, if you've deployed CKAN with Apache on Ubuntu:

  sudo service apache2 reload
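
To double-check the configuration steps above, a small shell helper can confirm that a plugin name appears in the ckan.plugins setting. This is only a sketch: the function name plugin_enabled is hypothetical, and it assumes that ckan.plugins is written on a single line of the config file.

```shell
# Succeed if plugin $1 is listed on the ckan.plugins line of config file $2.
plugin_enabled() {
    awk -v plugin="$1" '
        /^[[:space:]]*ckan\.plugins[[:space:]]*=/ {
            sub(/^[^=]*=/, "")                        # drop "ckan.plugins ="
            n = split($0, names, /[[:space:]]+/)      # plugin names are space separated
            for (i = 1; i <= n; i++) if (names[i] == plugin) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$2"
}
```

For example, `plugin_enabled ogdch_cmd /etc/ckan/default/production.ini || echo "ogdch_cmd is not enabled"` prints a warning when the plugin is missing from the setting.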
To install ckanext-ogdchcommands for development, activate your CKAN virtualenv and do:
git clone https://github.com/opendata-swiss/ckanext-ogdchcommands.git
cd ckanext-ogdchcommands
pip install .[dev]
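
Several of the commands described in this README are meant to be run regularly by a cron job. One way to do that is a small wrapper script that cron calls. The following is a sketch, not a tested deployment: the function name ogdch_nightly_cleanup, the selection and order of commands, and the --keep=10 value are assumptions to adapt, and the config path is the default mentioned in the install steps.

```shell
# Path to the CKAN config file (default taken from the install steps).
CKAN_INI="${CKAN_INI:-/etc/ckan/default/production.ini}"

# Run the regular cleanup commands one after another,
# stopping at the first command that fails.
ogdch_nightly_cleanup() {
    ckan -c "$CKAN_INI" ogdch cleanup_datastore &&
    ckan -c "$CKAN_INI" ogdch cleanup_resources &&
    ckan -c "$CKAN_INI" ogdch cleanup_filestore &&
    ckan -c "$CKAN_INI" ogdch cleanup_harvestjobs --keep=10 &&
    ckan -c "$CKAN_INI" ogdch clear_stale_harvestsources
}
```

A crontab entry could then run a script that defines and calls ogdch_nightly_cleanup, for example once per night. Running the commands that support it with --dryrun first shows what would be deleted.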