-
Notifications
You must be signed in to change notification settings - Fork 2
Database
For readability purpose, only the columns that define the entities can be found below in the model.
Description
An asset kind is the extension of the file (aka. asset). For example, it can be txt, csv among other values.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the asset kind. | ✅ | |
| name | Name of the file (aka. asset) extension like pdf or doc. |
||
| created_at | Date and time when the asset kind was created in the table. | ||
| updated_at | Date and time when the asset kind was updated for the last time in the table. |
Description
This is an intermediary table that represents the many-to-many relationship between assets and resources. Use this table to join them.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier asset-resource relationship. | ✅ | |
| asset_id | Identifier of the asset in the relationship. Use this column to join with the assets on assets.id. |
||
| resource_id | Identifier of the resource in the relationship. Use this column to join with the resources on resources.id. |
||
| created_at | Date and time when the asset-resource relationship was created in the table. | ||
| updated_at | Date and time when the asset-resource relationship was updated for the last time in the table. |
Description
An asset is a file we want to rescue, it is included in a dataset and reachable through one or several resources. It is located at an URL, it can be either core data, sample or documentation and must be downloadable.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the asset. | ✅ | |
| url | URL where the asset can be found and downloaded. | ||
| kind_id | ID of the asset kind. Use this column to join with the asset_kinds on asset_kinds.id. |
||
| created_at | Date and time when the asset was created in the table. | ||
| updated_at | Date and time when the asset was updated for the last time in the table. | ||
| size | Size of the asset in octets. | ✅ | |
| mtime | Date and time when the asset was updated for the last time at the URL. | ✅ |
Description
Rankings of datasets. The detail of theses rankings can be found in dataset_ranks.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the dataset's ranking. | ✅ | |
| name | Name of the dataset's ranking. | ||
| comment | Comment about the dataset's ranking. | ||
| ranking_date | Date and time when the dataset's ranking was made. | ||
| type | Type of the dataset's ranking. | ||
| created_at | Date and time when the dataset's ranking was created in the table. | ||
| updated_at | Date and time when the dataset's ranking was updated for the last time in the table. |
Description
Dataset ranking details. Name of associated rankings can be found in dataset_rankings table.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the dataset's rank. | ✅ | |
| dataset_id | ID of the dataset that was ranked. Use this column to join with the datasets on datasets.id. |
||
| ranking_id | ID of the dataset's ranking that was made. Use this column to join with the dataset_rankings on dataset_rankings.id. |
||
| rank | Rank of the dataset. | ||
| event_count | ADD DOCS FOR COLUMN. | ||
| created_at | Date and time when the dataset's rank was created in the table. | ||
| updated_at | Date and time when the dataset's rank was updated for the last time in the table. |
Description
A dataset is a parcel of data - for example, it could be the crime statistics for a region, the spending figures for a government department, or temperature readings from various weather stations. A dataset is owned by an organization and contains two things: information or metadata about the data like the title, the publisher or the date and a number of resources that contain the data like files or links to other web pages.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the dataset. | ✅ | |
| dg_id | ID of the dataset within data.gov. | ||
| dg_name | Name of the dataset in data.gov. | ||
| dg_title | Title of the dataset in data.gov. | ||
| dg_notes | Notes that describe the dataset in data.gov. | ||
| dg_metadata_created | Date and time when the dataset's metadata was created in data.gov. | ||
| dg_metadata_modified | Date and time when the dataset's metadata was updated for the last time in data.gov. | ||
| access_direct_dl_count | ADD DOCS FOR COLUMN. | ||
| access_total_count | ADD DOCS FOR COLUMN. | ||
| organization_id | ID of the organization. Use this column to join with the organizations on organizations.id. |
||
| created_at | Date and time when the dataset was created in the table. | ||
| updated_at | Date and time when the dataset was updated for the last time in the table. |
Description
Raw data of datasets extracted from the CKAN API. It contains all the data related to the organization that owns the dataset and related to the dataset's resources. The tables organizations, datasets and resources are built from this table.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the dataset. | ✅ | |
| dataset_id | It is a duplicate of id. This column should be removed. |
||
| content | Dataset's data structured in a JSON format, dataset's metadata, resources' and organization's data can be found here. | ||
| created_at | Date and time when the dataset was created in the table. | ||
| updated_at | Date and time when the dataset was updated for the last time in the table. |
Description
In the context of rescuing data from data.gov, an organization can be a federal agency like the NOAA (National Oceanic and Atmospheric Administration, Department of Commerce), a state like the state of California or a city like the city of Austin. An organization owns datasets.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the organization. | ✅ | |
| dg_id | ID of the organization within data.gov. | ||
| dg_name | Code of the organization in data.gov. | ||
| dg_title | Full name of the organization in data.gov. | ||
| dg_created | Date and time when the organization was created in data.gov. | ||
| created_at | Date and time when the organization was created in the table. | ||
| updated_at | Date and time when the organization was updated for the last time in the table. |
Description
A resource is contained in a dataset and holds its data itself. It can be of various formats such as CSV or Excel spreadsheet, XML file, PDF document, image file, linked data in RDF format or even a page that is elsewhere on the web. When a resource is a file to be rescued, it is also an asset.
Columns
| Name | Description | PK | Nullable |
|---|---|---|---|
| id | Auto-incremented identifier of the resource. | ✅ | |
| dg_id | ID of the resource within data.gov. | ||
| dg_name | Name of the resource in data.gov. | ||
| dg_description | Description of the resource in data.gov. | ||
| dg_resource_locator_function | ADD DOCS FOR COLUMN. | ✅ | |
| dg_resource_locator_protocol | ADD DOCS FOR COLUMN. | ✅ | |
| dg_mimetype | ADD DOCS FOR COLUMN. | ✅ | |
| dg_state | ADD DOCS FOR COLUMN. | ||
| dg_created | Date and time when the resource was created in data.gov. | ||
| dg_metadata_modified | Date and time when the resource's metadata was modified for the last time in data.gov. | ||
| dg_url | URL provided in data.gov where the resource can be found. | ||
| resource_type | Type of the resource that is the file extension if the resource is a file or a custom type if it is a web page. | ||
| dataset_id | ID of the dataset in which the resource is included. Use this column to join with the datasets on datasets.id. |
||
| created_at | Date and time when the resource was created in the table. | ||
| updated_at | Date and time when the resource was updated for the last time in the table. |