For resources related to the EI Cellgen ISP, including metadata mappings and schemas for Single Cell Genomics and Spatial Transcriptomics experiments.

SingleCellSchemas

The SingleCellSchemas repository contains the metadata mappings and schemas for the Earlham Institute's (EI's) CELLGEN ISP. The schemas describe a variety of Single Cell Genomics and Spatial Transcriptomics experiment types, such as those from 10X Genomics and Vizgen. The repository is developed by the Collaborative OPen Omics (COPO) team based at the Earlham Institute.

Visit the SingleCellSchemas website at https://singlecellschemas.org.


The SingleCellSchemas repository contains the following directories:

  • dist: contains the output files generated from the conversion process.

  • schemas: contains the XLSX base versions of the schema.

  • utils: contains Python helper scripts to convert the base XLSX file into formats such as HTML, JSON, XML and XLSX.

The update_and_convert_schema.py script is responsible for updating the XLSX base schema files located in the schemas directory and generating corresponding YAML and JSON files based on the XLSX file. The script is located in the utils directory.

The main script, convert.py, converts the XLSX schema into XLSX, XML, HTML and JSON files according to the namespace prefix. It is located in the project root directory.

Important note: Please do not directly modify the base YAML and JSON files in the schemas directory. To make changes, update the data worksheet in the singlecell_schema_main_vxx.xlsx spreadsheet located in the schemas directory.

After making changes to the base XLSX file, run the update_and_convert_schema.py script in the utils directory to regenerate the YAML and JSON files. To run the update script, execute python3 utils/update_and_convert_schema.py in the terminal from the project root directory.
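
One way to notice that the update script still needs to be run is to compare file modification times. The sketch below is illustrative only and is not part of the repository: find_stale_outputs is a hypothetical helper, and the paths in the usage comment are placeholders rather than real file names. Modification times are not meaningful straight after a fresh git clone, so treat the result as a hint, not a guarantee.

```python
from pathlib import Path
from typing import Iterable, List


def find_stale_outputs(base_xlsx: Path, generated: Iterable[Path]) -> List[Path]:
    """Return generated files that are missing or older than the base XLSX.

    Hypothetical helper: it only compares modification times and knows
    nothing about the contents of the schema files.
    """
    base_mtime = base_xlsx.stat().st_mtime
    stale = []
    for path in generated:
        # A missing file, or one last written before the XLSX was edited,
        # suggests the update script has not been run since the change.
        if not path.exists() or path.stat().st_mtime < base_mtime:
            stale.append(path)
    return stale


# Example usage (placeholder paths; substitute the real files in schemas/):
# stale = find_stale_outputs(
#     Path("schemas/base.xlsx"),
#     [Path("schemas/base.yaml"), Path("schemas/base.json")],
# )
# if stale:
#     print("Run python3 utils/update_and_convert_schema.py to refresh:", stale)
```

If the returned list is non-empty, running the update script as described above should bring the generated files back in step with the spreadsheet.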

Abbreviations:

  • SC RNASEQ: Single-Cell RNA-Sequencing
  • STX or ST: Spatial Transcriptomics

Please follow the instructions below to convert the XLSX schema into XLSX, XML, HTML and JSON files:

  1. Download or clone this repository and navigate to its directory in the terminal

    git clone https://github.com/EarlhamInst/SingleCellSchemas.git

    cd SingleCellSchemas

  2. Create a new Python virtual environment called venv

    python3 -m venv venv

  3. Activate the virtual environment

    source venv/bin/activate

  4. Install dependencies

    pip3 install -r requirements.txt

  5. Run the convert.py script

    The script is located in the project root directory and can be run in several ways:

    • Use the launch.json file in the .vscode directory in VS Code and select the appropriate configuration

      --OR--

    • python3 convert.py

      This converts the schema into a spreadsheet (XLSX) file and XML and JSON files, using all namespace prefixes and all schemas in the schemas directory

      --OR--

    • python3 convert.py <namespace_prefix>

      where <namespace_prefix> is the namespace prefix to use (e.g. dwc, faang, mixs, tol). For example: python3 convert.py dwc

      --OR--

    • python3 convert.py <format_type>

      where <format_type> is the format of the output (e.g. xlsx, xml, html, json). For example: python3 convert.py html

      --OR--

    • Run the tests (which also runs the converter whilst verifying the output)

      python -m unittest
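
The command-line invocations in step 5 can also be driven from Python, for example in a small automation script. The sketch below is a minimal illustration under stated assumptions: build_convert_command and run_convert are hypothetical helper names that do not exist in the repository, and the only thing assumed about convert.py is what is described above, namely that it accepts no argument, a namespace prefix, or a format type.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def build_convert_command(argument: Optional[str] = None) -> List[str]:
    """Build the command line for convert.py.

    `argument` is either a namespace prefix (e.g. "dwc") or a format type
    (e.g. "html"); None runs the conversion with no argument.
    """
    # sys.executable is the interpreter running this script, so an
    # activated virtual environment is used automatically.
    command = [sys.executable, "convert.py"]
    if argument is not None:
        command.append(argument)
    return command


def run_convert(repo_root: Path, argument: Optional[str] = None) -> int:
    """Run convert.py from the repository root and return its exit code."""
    completed = subprocess.run(build_convert_command(argument), cwd=repo_root)
    return completed.returncode


# Example usage (assumes the current directory is the repository root):
# exit_code = run_convert(Path.cwd(), "html")
```

Using sys.executable rather than a hard-coded python3 keeps the call inside whichever environment the dependencies from requirements.txt were installed into.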

