
A community-driven, genome-scale metabolic reconstruction and analysis toolkit for Chinese Hamster Ovary (CHO) cells.
Highlights
- Scope: 11,004 reactions • 7,377 metabolites • 3,597 genes • 3,489 mapped protein structures
- Use cases: context-specific modeling (WT vs ZeLa), pFBA/ecFBA, flux sampling, subsystem coverage, structure-guided hypotheses (e.g., putative PEP allosteric inhibition of PFK)
- Artifacts: curated datasets, notebooks (Python & MATLAB), figures, and utilities to reproduce key analyses
If you use this repository or the iCHO3K model, please see Citing below.
- Repository layout
- Installation & setup
- Quickstart (Python)
- Quickstart (MATLAB)
- Reproducing key analyses
- Data & provenance
- Model formats & I/O
- Solvers & performance tips
- Contributing
- Citing
- License
- Maintainers & contact
- Acknowledgments
- FAQ / Troubleshooting
Analyses/
├── conf_score_distribution.png # Confidence Score distribution across all reactions from iCHO3K
├── data_preprocessing # ZeLa vs WT growth rate and spent media data analysis
├── growth_rate_pred/ # pFBA simulations from ZeLa and WT context-specific models
├── recons_comparisons/ # Plot comparisons between iCHO3K and previous CHO reconstructions
├── Relevant_mets/ # Analysis of subsystems related to metabolites relevant to biomass synthesis
├── sampled_results/
├── subsystem_overview/ # Subsystem/System classification sunburst plot
└── tSNE/ # t-SNE embedding analysis
Data/
├── Context_specific_models/ # Context-specific ZeLa and WT models (MAT, JSON)
│ ├── ecModels/ # Context-specific ec models
│ └── unblocked_ecModel_generic/ # Generic iCHO3K ec model
│
├── GPR_Curation/ # Supplementary data for GPR Mapping from Recon3D to iCHO3K
├── Gene_Essentiality/ # Set of experimentally validated CHO essential genes
├── Metabolites/ # Supplementary data for metabolites information
├── Orthologs/ # Ortholog mapping information from Human to Chinese Hamster
├── Reconciliation/ # Source reconstructions & derived models and datasets
│ ├── datasets/
│ └── models/
│
├── Uptake_Secretion_Rates/ # Pre-processed uptake and secretion rates from ZeLa fed-batch data
└── ZeLa Data/ # ZeLa 14 fed-batch raw transcriptomics and spent media data
iCHO3K/
├── Dataset/ # iCHO3K source dataset for model generation
└── Model/ # iCHO3K generic model variants
Matlab/
├── ecFBA/ # ecFBA scripts
├── Model_Extraction/ # Model Extraction with mCADRE scripts
├── Standep/ # ubiquityScore calculation with Standep
├── main_Model_Extraction.m # Main code for mCADRE model extraction
└── main_standep.m # Main code for Standep data preprocessing
Python/
├── Network Reconstruction/ # Notebooks related to the reconciliation of previous reconstructions and building of iCHO3K
│ ├── Genes.ipynb # Retrieval of Gene information from databases
│ ├── Metabolites.ipynb # Integration of metabolite information, de-duplication and analysis
│ ├── Reactions.ipynb # Reconciliation of previous CHO and Recon3D reconstructions, de-duplication, subsystem re-organization
│ └── retrieveTurnoverNumber.ipynb # Fetch turnover numbers and molecular weights from BRENDA
│
├── Supplementary Notebooks/ # Supplementary Notebooks with extra information of previous reconstructions
├── Comparison..Reconstructions.ipynb # Comparison of iCHO3K with previous CHO reconstructions
├── Computational_Tests.ipynb
├── Final CHO Model.ipynb
├── Calculate_specific_rates.ipynb # Preprocessing of spent media data into GEM fluxes
└── ZeLa_fluxomics.ipynb # ZeLa fluxomics data
Large files: Some assets may use Git LFS. If you see pointer files, run:
git lfs install && git lfs pull
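To spot stale pointers programmatically, a small check like the following can help. This is a hypothetical helper, not part of the repository: LFS pointer stubs are short text files that begin with the LFS spec header.

```python
from pathlib import Path

def is_lfs_pointer(path) -> bool:
    """Return True if `path` looks like a Git LFS pointer stub rather than real data."""
    head = Path(path).read_bytes()[:100]
    return head.startswith(b"version https://git-lfs")

# Example: is_lfs_pointer("Analyses/conf_score_distribution.png")
```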
conda create -n icho3k python=3.11 -y
conda activate icho3k
pip install cobra pandas numpy scipy matplotlib optlang networkx jupyterlab escher seaborn
Optional (for graph utilities & network export):
pip install ndex2 pygraphviz
If environment.yml or requirements.txt is provided in the repo or a release, prefer installing from those for exact reproducibility.
- MATLAB R2022b+ recommended (earlier versions are likely workable).
- Add Notebooks/Matlab/ to your MATLAB path.
import cobra
from cobra.flux_analysis import pfba
model = cobra.io.load_json_model("Data/Context_specific_models/ZeLa_model.json")
solution = pfba(model)
print(f"Objective ({model.objective.direction}): {solution.objective_value:.4f}")
# Top 10 absolute fluxes
top = sorted(solution.fluxes.items(), key=lambda x: abs(x[1]), reverse=True)[:10]
for rxn, v in top:
    print(f"{rxn:25s} {v:10.3f}")
import cobra, pandas as pd
wt = cobra.io.load_json_model("Data/Context_specific_models/WT_model.json")
zela = cobra.io.load_json_model("Data/Context_specific_models/ZeLa_model.json")
# Example: harmonize key exchange bounds
for ex in ["EX_glc__D_e", "EX_gln__L_e", "EX_o2_e"]:
    for m in (wt, zela):
        if ex in m.reactions:
            m.reactions.get_by_id(ex).lower_bound = -10.0
res = []
for name, m in [("WT", wt), ("ZeLa", zela)]:
    sol = m.optimize()
    res.append({"model": name, "mu": sol.objective_value})
print(pd.DataFrame(res))
import cobra
from cobra.sampling import sample
model = cobra.io.load_json_model("Data/Context_specific_models/WT_model.json")
samples = sample(model, n=1000) # DataFrame
samples.to_csv("Analyses/sampled_results/wt_samples.csv", index=False)
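Downstream, the sampled CSV can be summarized per reaction with pandas. The toy DataFrame below stands in for the file written above; the reaction IDs and values are illustrative:

```python
import pandas as pd

# Stand-in for pd.read_csv("Analyses/sampled_results/wt_samples.csv");
# columns are reaction IDs, rows are flux samples (values illustrative).
samples = pd.DataFrame({
    "PFK":   [1.2, 1.5, 1.1, 1.4],
    "LDH_L": [0.2, -0.1, 0.0, 0.3],
})
summary = samples.agg(["mean", "std", "min", "max"]).T
print(summary)  # one row per reaction: mean, std, min, max
```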
% Ensure COBRA Toolbox is installed & initialized
initCobraToolbox(false) % without updates
changeCobraSolver('glpk', 'LP');
% Load a JSON context-specific model
model = readCbModel('Data/Context_specific_models/WT_model.json');
% Optimize & print objective
solution = optimizeCbModel(model);
fprintf('Growth objective: %.4f\n', solution.f);
MATLAB scripts for extraction, flux sampling, and context-specific modeling are under Notebooks/Matlab/.
Many figures in Analyses/ are generated from notebooks in Notebooks/:
- Reconstruction comparisons → Notebooks/Comparison of Metabolic Reconstructions.ipynb → Analyses/recons_comparisons/
- Subsystem coverage & sunbursts → Analyses/subsystem_overview/
- Flux enrichment & sampling → Analyses/flux_enrichment_analysis/, Analyses/sampled_results/
- Growth rate prediction (WT vs ZeLa) → Analyses/growth_rate_pred/
- Topology & t-SNE → Analyses/tSNE/
Most notebooks begin with a “Paths & Environment” cell—update paths as needed. For strict reproducibility, pin exact package versions via environment.yml and use releases/DOI snapshots.
Curated inputs and derived artifacts are organized under Data/. Key elements:
- Source reconstructions → Reconciliation/datasets/ (inputs) and Reconciliation/models/ (intermediate models).
- Annotations & mappings → Metabolites/, Subsystem/, Orthologs/.
- Evidence & curation → GPR_Curation/, Gene_Essentiality/, kcat_values/.
- Experimental constraints → Uptake_Secretion_Rates/, ZeLa Data/.
- Secretory overlap → Sec_Recon_shared_genes/.
- Final model → iCHO3K_final/ (Excel format; conversion notebooks provided).
During manual curation, compartment and subsystem information was inherited from source reconstructions; discrepancies were resolved using authoritative resources (see notes within the notebooks).
- Excel: The final iCHO3K lives in Data/iCHO3K_final/ for inspection and conversion.
- SBML / JSON: Preferred for simulation. Use the conversion notebooks (e.g., Notebooks/Final CHO Model.ipynb) or COBRApy I/O:

import cobra
m = cobra.io.load_json_model("path/to/model.json")
cobra.io.save_json_model(m, "out.json")
cobra.io.write_sbml_model(m, "out.xml")
Some scripts expect standardized BiGG-style IDs. See Notebooks/metabolite_identifiers.py for mapping helpers.
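As a sketch of the kind of normalization such helpers perform — the function and patterns below are illustrative, not the repository's actual code:

```python
import re

def to_bigg_style(met_id: str) -> str:
    """Illustrative normalizer: 'glc_D[e]' -> 'glc__D_e' (BiGG-style compartment suffix)."""
    m = re.match(r"^(.+?)\[(\w+)\]$", met_id)
    if m:  # bracketed compartment -> underscore suffix
        base, comp = m.groups()
        met_id = f"{base}_{comp}"
    # single underscore before a stereo-descriptor -> BiGG's double underscore
    return re.sub(r"(?<!_)_([DLRS])(?=_|$)", r"__\1", met_id)

print(to_bigg_style("glc_D[e]"))  # glc__D_e
```

The negative lookbehind keeps the function idempotent, so IDs that are already BiGG-style pass through unchanged.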
- LP/QP solvers: GLPK (free), CPLEX/Gurobi (commercial; academic licenses available). Set via COBRApy:

import cobra
cobra.Configuration().solver = "glpk"  # or "gurobi", "cplex"
- Speed: Prefer commercial solvers for large sampling tasks; reduce model size using context-specific models; cache solutions where possible.
- Numerics: Tighten feasibility/optimality tolerances for sensitive analyses.
Contributions are welcome!
- Issues: Report bugs, request features, or flag data discrepancies.
- PRs: Use feature branches; include a clear description, minimal reproducible example or notebook, and updated docs.
- Style: PEP 8 for Python; strip heavy notebook outputs before committing.
- Data: Avoid committing large binaries; use Git LFS or attach to Releases/Zenodo.
If contributing new datasets or model variants, please include:
- Data dictionary (column descriptions, units)
- Provenance (source links/versions)
- Minimal script/notebook to regenerate derived artifacts
If you use iCHO3K or materials from this repository, please cite the bioRxiv preprint:
Di Giusto, P., Choi, D.-H., et al. (2025). A community-consensus reconstruction of Chinese Hamster metabolism enables structural systems biology analyses to decipher metabolic rewiring in lactate-free CHO cells. bioRxiv. https://doi.org/10.1101/2025.04.10.647063 (v1 posted April 17, 2025).
See the LICENSE file in this repository for terms of use. If no license is present, usage defaults to “all rights reserved” until one is added.
- Pablo Di Giusto — [email protected] · [email protected]
Systems Biology & Cell Engineering Lab (Lewis Lab), UC San Diego & University of Georgia
We thank the iCHO3K community contributors and collaborators (including secRecon curators). This work builds upon public resources: Recon3D, BiGG, MetaNetX, Rhea, UniProt, BRENDA, and others referenced throughout the notebooks.
I see .gitattributes LFS pointers instead of files.
Run:
git lfs install
git lfs pull
Solver not found / poor performance.
Install an LP solver (GLPK works; Gurobi/CPLEX is recommended for speed). Once installed, set cobra.Configuration().solver = "gurobi".
Model won’t optimize (infeasible).
- Harmonize exchange bounds across conditions.
- Check blocked reactions / dead-end metabolites.
- For comparative runs (WT vs ZeLa), ensure identical media constraints.
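A minimal, duck-typed sketch of media harmonization — exchange IDs and bounds are illustrative, and with COBRApy models the helper works as-is since reaction lists support membership tests by ID:

```python
def apply_media(model, media: dict) -> None:
    """Set lower bounds for whichever exchange reactions exist in `model`."""
    for ex_id, lb in media.items():
        if ex_id in model.reactions:  # COBRApy DictList supports `in` by ID
            model.reactions.get_by_id(ex_id).lower_bound = lb

# Illustrative shared media definition for WT and ZeLa runs
media = {"EX_glc__D_e": -10.0, "EX_gln__L_e": -2.0, "EX_o2_e": -1000.0}
# apply_media(wt, media); apply_media(zela, media)
```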
Notebook paths are wrong.
Edit the first “Paths & Environment” cell—most notebooks expose a single place to set root paths.