TableInfo: collection.tsv
abradyIGS edited this page Sep 3, 2021 · 28 revisions
The C2M2 collection entity is a generalization of "dataset" -- a named grouping of files, biosamples and/or subjects.
Defining collections is optional. The `collection.tsv` table will have one row for each collection you define for your program.
Please see the technical documentation for a complete treatment of how C2M2 collections are used.
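
As a rough illustration of the "one row per collection" layout, the sketch below writes a small `collection.tsv` with Python's standard `csv` module. The column order is assumed to follow the field table on this page, and every value (the `https://example.org/my_dcc/` namespace, the `COL-0001` style IDs, the names and descriptions) is a hypothetical placeholder, not data from any real program.

```python
import csv
import io

# Column order is assumed to match the field table on this page.
COLLECTION_COLUMNS = [
    "id_namespace", "local_id", "persistent_id", "creation_time",
    "abbreviation", "name", "description",
]

# Hypothetical rows for illustration only.
rows = [
    {
        "id_namespace": "https://example.org/my_dcc/",
        "local_id": "COL-0001",
        "creation_time": "2021-01-08T00:45:40+00:00",
        "abbreviation": "PILOT1",
        "name": "Pilot study collection",
        "description": "Files and biosamples from the pilot study",
    },
    # Only the two primary-key fields are filled in; the rest stay empty.
    {
        "id_namespace": "https://example.org/my_dcc/",
        "local_id": "COL-0002",
    },
]


def write_collection_tsv(rows, handle):
    """Write a header line plus one tab-separated line per collection."""
    writer = csv.DictWriter(
        handle,
        fieldnames=COLLECTION_COLUMNS,
        delimiter="\t",
        lineterminator="\n",
        restval="",  # non-required fields missing from a row are left blank
    )
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_collection_tsv(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("collection.tsv", "w", newline="")` as `handle`.
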
| Field | Field Description | Required? | Attributes | Extra Info |
|---|---|---|---|---|
| id_namespace | A CFDE-cleared identifier representing the top-level data space containing this collection [part 1 of 2-component composite primary key] | If a row has this value, local_id must also have a value | Value type is string | If your program has not implemented multiple id_namespaces, this will be exactly the same for all rows. This will be the value of id_namespace in project.tsv for the overarching project in your program and/or the value of project_id_namespace in primary_dcc_contact. |
| local_id | An identifier representing this collection, unique within this id_namespace [part 2 of 2-component composite primary key] | If a row has this value, id_namespace must also have a value | The value in each row must be different; value type is string | Each individual collection needs a unique local_id value (every row should be different). The local_id column appears in many tables, but values should not be repeated across tables; for example, a 'file' local_id is a separate concept from a 'biosample' local_id. |
| persistent_id | A persistent, resolvable (not necessarily retrievable) URI or compact ID permanently attached to this collection | Non-required: Any number of rows after the header can be filled | The value in each row must be different; value type is string | Meant to serve as a permanent address to which landing pages (which summarize metadata associated with this collection) and other relevant annotations and functions can optionally be attached, including information enabling resolution to a network location from which the file can be downloaded. Actual network locations must not be embedded directly within this identifier: one level of indirection is required in order to protect persistent_id values from changes in network location over time as files are moved around. |
| creation_time | An ISO 8601 and RFC 3339 (subset)-compliant timestamp documenting this collection's creation time: YYYY-MM-DDTHH:MM:SS±NN:NN | Non-required: Any number of rows after the header can be filled | Value must be datetime | Example valid dates: 2021-01-08, 2021-01-08T00:45:40Z, 2021-01-08T00:45:40+00:00 |
| abbreviation | A very short display label for this collection | Non-required: Any number of rows after the header can be filled | Value can be any string that is not already in use at the CFDE; value should be 10 characters or fewer; cannot contain special Unix characters | This is the display abbreviation for this collection in the portal |
| name | A short, human-readable, machine-read-friendly label for this collection | Non-required: Any number of rows after the header can be filled | Value type is string | This is the display name for this collection in the portal |
| description | A human-readable description of this collection | Non-required: Any number of rows after the header can be filled | Value type is string | This is the display description for this collection in the portal |
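
The constraints in the table can be checked before submission with a short script. The following is a minimal sketch and not the official CFDE validator: the function names `is_valid_timestamp` and `check_collection_rows` are hypothetical, the timestamp check accepts only the three shapes shown in the examples above, and the composite primary key (`id_namespace`, `local_id`) is assumed to be required in every row. The rule about special Unix characters in `abbreviation` is not checked, because this page does not list which characters are disallowed.

```python
import csv
import io
import re
from datetime import datetime

# Accepts only the shapes shown in the table's examples:
# a bare date, or a date-time with "Z" or a +/-HH:MM offset.
TIMESTAMP_SHAPE = re.compile(
    r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2}))?"
)


def is_valid_timestamp(value):
    """Check the shape first, then let datetime reject impossible dates."""
    if not TIMESTAMP_SHAPE.fullmatch(value):
        return False
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    return True


def check_collection_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen_keys = set()
    seen_persistent_ids = set()
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        namespace = row.get("id_namespace") or ""
        local_id = row.get("local_id") or ""
        if not namespace or not local_id:
            problems.append(f"line {line_no}: id_namespace and local_id must both be set")
        elif (namespace, local_id) in seen_keys:
            problems.append(f"line {line_no}: duplicate key ({namespace}, {local_id})")
        else:
            seen_keys.add((namespace, local_id))

        persistent_id = row.get("persistent_id") or ""
        if persistent_id:
            if persistent_id in seen_persistent_ids:
                problems.append(f"line {line_no}: duplicate persistent_id {persistent_id}")
            seen_persistent_ids.add(persistent_id)

        creation_time = row.get("creation_time") or ""
        if creation_time and not is_valid_timestamp(creation_time):
            problems.append(f"line {line_no}: invalid creation_time {creation_time}")

        abbreviation = row.get("abbreviation") or ""
        if len(abbreviation) > 10:
            problems.append(f"line {line_no}: abbreviation longer than 10 characters")
    return problems


# Hypothetical sample: the second data row repeats the key, has an
# impossible month, and has an abbreviation that is too long.
sample_tsv = (
    "id_namespace\tlocal_id\tpersistent_id\tcreation_time\tabbreviation\tname\tdescription\n"
    "https://example.org/my_dcc/\tCOL-0001\t\t2021-01-08T00:45:40Z\tPILOT1\tPilot\tPilot study\n"
    "https://example.org/my_dcc/\tCOL-0001\t\t2021-13-01\tMUCHTOOLONGLABEL\tDup\tDuplicate key\n"
)
rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))
for problem in check_collection_rows(rows):
    print(problem)
```

To run the same checks on a real file, replace the `io.StringIO(sample_tsv)` argument with a handle from `open("collection.tsv", newline="")`. Uniqueness of `abbreviation` across the whole CFDE cannot be checked locally and is left out of this sketch.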