The SingleCellSchemas repository houses developments related to Earlham Institute's (EI’s) CELLGEN ISP metadata mapping and schemas, designed to describe a variety of Single Cell Genomics and Spatial Transcriptomics experiment types, such as those from 10X Genomics and Vizgen. It is developed by the Collaborative OPen Omics (COPO) team based at the Earlham Institute.
Visit the SingleCellSchemas website at https://singlecellschemas.org.
The SingleCellSchemas repository contains the following directories:
- dist: contains the output files generated from the conversion process.
- schemas: contains the XLSX base versions of the schema.
- utils: contains Python helper scripts to convert the base XLSX file into formats such as HTML, JSON, XML and XLSX.
The update_and_convert_schema.py script is responsible for updating the XLSX base schema files located in the schemas directory and generating corresponding YAML and JSON files based on the XLSX file. The script is located in the utils directory.
The main script, convert.py, converts the XLSX schema into XLSX, XML, HTML and JSON files according to the namespace prefix. It is located in the project root directory.
Important note: Please do not directly modify the base YAML and JSON files in the schemas directory. To make changes, update the data worksheet in the singlecell_schema_main_vxx.xlsx spreadsheet located in the schemas directory.
After making changes to the base XLSX file, run the update_and_convert_schema.py script in the utils directory to regenerate the YAML and JSON files. To run the update script, execute the following in the terminal:

    python3 utils/update_and_convert_schema.py
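The regeneration step can also be scripted, for example from a local check before committing. The following is a minimal sketch and not part of the repository: the helper names build_update_command and regenerate_schemas are hypothetical, and the only detail taken from this README is the script path utils/update_and_convert_schema.py relative to the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Script path as given in this README, relative to the repository root.
UPDATE_SCRIPT = Path("utils") / "update_and_convert_schema.py"


def build_update_command():
    """Return the argv that runs the update script with the current interpreter."""
    return [sys.executable, str(UPDATE_SCRIPT)]


def regenerate_schemas(repo_root="."):
    """Run the update script from repo_root (hypothetical helper).

    Raises FileNotFoundError if the script is not where the README says it is,
    and subprocess.CalledProcessError if the script exits with an error.
    """
    script = Path(repo_root) / UPDATE_SCRIPT
    if not script.is_file():
        raise FileNotFoundError(f"Expected update script at {script}")
    # cwd=repo_root so the relative script path resolves from the repository root.
    subprocess.run(build_update_command(), cwd=repo_root, check=True)
```

Calling regenerate_schemas() from the repository root is then equivalent to running the terminal command above, with the added check that the script exists before it is launched.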
Abbreviations:
- SC RNASEQ: Single-Cell RNA-Sequencing
- STX or ST: Spatial Transcriptomics
Please follow the instructions below to convert the XLSX schema into XLSX, XML, HTML and JSON files:
- Download or clone this repository and navigate to its directory in the terminal

      git clone https://github.com/EarlhamInst/SingleCellSchemas.git
      cd SingleCellSchemas

- Create a new Python virtual environment called venv

      python3 -m venv venv

- Activate the virtual environment

      source venv/bin/activate

- Install dependencies

      pip3 install -r requirements.txt

- Run the convert.py script

  The script is located in the project root directory and can be run in several ways:

  - Use the launch.json file in the .vscode directory in VS Code and select the appropriate configuration

    --OR--

  - Run the script with no arguments

        python3 convert.py

    This will convert the schema into a spreadsheet file, XML and JSON files using all namespace prefixes and schemas in the schemas directory

    --OR--

  - Pass a namespace prefix

        python3 convert.py <namespace_prefix>

    where <namespace_prefix> is the namespace prefix to be used (e.g. dwc, faang, mixs, tol), e.g. python3 convert.py dwc

    --OR--

  - Pass a format type

        python3 convert.py <format_type>

    where <format_type> is the format type that the output will be returned in (e.g. xlsx, xml, html, json), e.g. python3 convert.py html

    --OR--

  - Run the tests (which also runs the converter whilst verifying the output)

        python -m unittest
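The command-line forms listed above can be gathered into a small wrapper when the converter needs to be driven from another script. This is a sketch under stated assumptions: the function name convert_command is hypothetical, and the two sets below contain only the example prefixes and format types named in this README, so they may not match everything convert.py accepts.

```python
import sys

# Example values named in this README; convert.py itself may accept others.
NAMESPACE_PREFIXES = {"dwc", "faang", "mixs", "tol"}
FORMAT_TYPES = {"xlsx", "xml", "html", "json"}


def convert_command(argument=None):
    """Build the argv for convert.py (hypothetical helper).

    With no argument the converter is run for all namespace prefixes and
    schemas; otherwise the argument must be one of the example prefixes
    or format types listed above.
    """
    command = [sys.executable, "convert.py"]
    if argument is None:
        return command
    if argument not in NAMESPACE_PREFIXES | FORMAT_TYPES:
        raise ValueError(
            f"Unrecognised namespace prefix or format type: {argument!r}"
        )
    return command + [argument]
```

The returned list can be passed to subprocess.run(..., check=True) from the project root, for example subprocess.run(convert_command("html"), check=True). Rejecting unknown values before launching the script is a design choice of this sketch, not documented behaviour of convert.py.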
- SingleCellSchemas – This repository (included for reference)
- COPO-production
- COPO-schemas
- COPO-documentation
- Single-cell website - The official website for this repository