CELLxGENE Cell Guide API Endpoints

This document describes the API endpoints discovered on the Cell Guide website that can be used to access cell type information programmatically.

Configuration

The endpoints are configured in the config.json file in the root directory of the project. The current version ID for marker endpoints is 1743611056.

{
  "base_url": "https://cellxgene.cziscience.com/cellguide/",
  "api_validated_url": "https://cellguide.cellxgene.cziscience.com/validated_descriptions/",
  "api_gpt_url": "https://cellguide.cellxgene.cziscience.com/gpt_descriptions/",
  "api_markers_base": "https://cellguide.cellxgene.cziscience.com/",
  "marker_id_version": "1743611056"
}
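As a minimal sketch of how these settings map onto request URLs, the snippet below assembles the four endpoint URLs documented under Endpoint Reference from a config dictionary. The helper name `build_urls` is hypothetical and not part of the project; the config is inlined here so the example runs on its own, whereas in practice it would presumably be read from config.json with `json.load`.

```python
import json

# Same keys and values as config.json above, inlined for a self-contained example.
CONFIG = json.loads("""
{
  "api_validated_url": "https://cellguide.cellxgene.cziscience.com/validated_descriptions/",
  "api_gpt_url": "https://cellguide.cellxgene.cziscience.com/gpt_descriptions/",
  "api_markers_base": "https://cellguide.cellxgene.cziscience.com/",
  "marker_id_version": "1743611056"
}
""")


def build_urls(config: dict, ontology_id: str) -> dict:
    """Return the four endpoint URLs for one cell type (hypothetical helper)."""
    # Marker endpoints insert the version ID between the base URL and the path.
    markers = f"{config['api_markers_base']}{config['marker_id_version']}"
    return {
        "validated": f"{config['api_validated_url']}{ontology_id}.json",
        "gpt": f"{config['api_gpt_url']}{ontology_id}.json",
        "computational": f"{markers}/computational_marker_genes/{ontology_id}.json",
        "canonical": f"{markers}/canonical_marker_genes/{ontology_id}.json",
    }


urls = build_urls(CONFIG, "CL_0000084")
print(urls["computational"])
```

Keeping the version ID in one config value means a version change only requires editing `marker_id_version`, not every URL.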

Endpoint Reference

Cell Description Endpoints

These endpoints provide text descriptions of cell types.

1. Validated Descriptions

https://cellguide.cellxgene.cziscience.com/validated_descriptions/{ONTOLOGY_ID}.json

Example:

https://cellguide.cellxgene.cziscience.com/validated_descriptions/CL_0000084.json

Response format:

{
  "description": "T cells also known as T lymphocytes are a critical component of the adaptive immune system...",
  "references": ["https://www.doi.org/10.1016/j.jaci.2022.10.011", "..."]
}

2. GPT-Generated Descriptions

https://cellguide.cellxgene.cziscience.com/gpt_descriptions/{ONTOLOGY_ID}.json

Example:

https://cellguide.cellxgene.cziscience.com/gpt_descriptions/CL_0000084.json

Response format: A single string containing the description, not wrapped in an object.
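The two description endpoints return different shapes (an object with a `description` key versus a plain string), so a caller has to handle each separately. The sketch below shows one way to do that using only the standard library. The function names `fetch_json` and `get_description` are hypothetical, and trying the validated description before the GPT-generated one is an assumption about the preferred order, not documented behavior.

```python
import json
import urllib.request

BASE = "https://cellguide.cellxgene.cziscience.com"


def fetch_json(url: str, timeout: float = 10.0):
    """Download and decode one JSON document; return None on network, HTTP, or decoding errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # URLError/HTTPError and timeouts are OSError subclasses;
        # malformed JSON raises a ValueError subclass.
        return None


def get_description(ontology_id: str, fetch=fetch_json):
    """Return (text, source), trying the validated endpoint first (assumed order)."""
    validated = fetch(f"{BASE}/validated_descriptions/{ontology_id}.json")
    if isinstance(validated, dict) and validated.get("description"):
        return validated["description"], "validated"

    generated = fetch(f"{BASE}/gpt_descriptions/{ontology_id}.json")
    if isinstance(generated, str) and generated:
        return generated, "gpt"

    return None, None
```

Calling `get_description("CL_0000084")` performs the live requests. The `fetch` parameter exists so the fallback logic can be exercised with a stub instead of the network.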

Marker Gene Endpoints

These endpoints provide marker gene information for cell types. Note that their URLs include a version ID, which may change over time.

1. Computational Marker Genes

https://cellguide.cellxgene.cziscience.com/{VERSION_ID}/computational_marker_genes/{ONTOLOGY_ID}.json

Example:

https://cellguide.cellxgene.cziscience.com/1743611056/computational_marker_genes/CL_0000084.json

Response format: Array of marker gene objects containing fields like symbol, name, specificity, etc.

2. Canonical Marker Genes

https://cellguide.cellxgene.cziscience.com/{VERSION_ID}/canonical_marker_genes/{ONTOLOGY_ID}.json

Example:

https://cellguide.cellxgene.cziscience.com/1743611056/canonical_marker_genes/CL_0000084.json

Response format: Array of marker gene objects containing fields like symbol, name, and tissue information.
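Because both marker endpoints return an array of objects, a common first step is to reduce the array to a list of gene symbols. The following is a minimal sketch of that step. The helper `marker_symbols` is hypothetical, the sample records are placeholders rather than real API output, and the deduplication assumes the same gene may appear in more than one record (for instance once per tissue).

```python
def marker_symbols(records):
    """Return unique gene symbols from a marker gene array, in first-seen order."""
    symbols = [record["symbol"] for record in records if record.get("symbol")]
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(symbols))


# Placeholder records, NOT real API output: only the "symbol" and "name"
# keys are taken from the field lists described above.
sample = [
    {"symbol": "GENE_A", "name": "placeholder gene A"},
    {"symbol": "GENE_B", "name": "placeholder gene B"},
    {"symbol": "GENE_A", "name": "placeholder gene A"},
]
print(marker_symbols(sample))  # ['GENE_A', 'GENE_B']
```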

Usage Notes

Versioning

The marker gene endpoints include a version ID in the URL path, which appears to be a timestamp or other versioning identifier. If these endpoints start failing, the version ID has probably changed, and marker_id_version in config.json needs to be updated. Use the debug_api.py script to investigate current endpoint behavior.
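If the version ID is in fact a Unix timestamp (an assumption; the source of the ID is not documented), it can be decoded to see roughly when that snapshot was produced. A short sketch using only the standard library:

```python
from datetime import datetime, timezone

version_id = "1743611056"

# Interpret the ID as seconds since the Unix epoch (an assumption, not confirmed).
published = datetime.fromtimestamp(int(version_id), tz=timezone.utc)
print(published.isoformat())  # 2025-04-02T16:24:16+00:00
```

A plausible recent date supports the timestamp reading, but it does not guarantee that newer IDs follow the same scheme.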

Example Cell Types

Here are some example cell ontology IDs that can be used with these endpoints:

  • CL_0000084: T cell
  • CL_0000236: B cell
  • CL_0000094: Granulocyte
  • CL_0000928: Activated CD4-negative, CD8-negative type I NK T cell
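Cell Ontology terms are usually written in CURIE form with a colon (for example `CL:0000084`), while every URL in this document uses an underscore. The sketch below converts either form to the underscore form before building a URL. The helper `to_endpoint_id` is hypothetical, and it assumes the endpoints accept only the underscore form, which is what the examples here suggest.

```python
import re

# "CL", then a colon or underscore, then exactly seven digits.
_CL_PATTERN = re.compile(r"^CL[:_](\d{7})$")


def to_endpoint_id(term: str) -> str:
    """Convert 'CL:0000084' or 'CL_0000084' to the underscore form used in the URLs."""
    match = _CL_PATTERN.match(term.strip())
    if match is None:
        raise ValueError(f"not a Cell Ontology ID: {term!r}")
    return f"CL_{match.group(1)}"


print(to_endpoint_id("CL:0000084"))  # CL_0000084
```

Rejecting anything that does not match the pattern avoids sending a malformed ID to the server and misreading the resulting error as a version change.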

Using with the Python Client

The cell_guide_scraper.py script provides a convenient way to access these endpoints programmatically. It handles fallbacks between the API endpoints and resorts to HTML scraping when necessary.

from cell_guide_scraper import scrape_cell_data

# Get data for a T cell
cell_data = scrape_cell_data("CL_0000084")

# Access description
description = cell_data["description"]
description_source = cell_data["description_source"]  # "validated", "gpt", or "html"

# Access markers
computational_markers = cell_data["markers"]["computational"]
canonical_markers = cell_data["markers"]["canonical"]
markers_source = cell_data["markers"]["markers_source"]  # "api" or "html"