A Streamlit app for analyzing codon usage patterns across echinoderm species, featuring machine learning and statistical methods to identify species-specific codon usage biases.

AI-Ecology-Lab/EchinoML

EchinoML - Echinoderm Codon Usage Analysis

A comprehensive web application for analyzing codon usage patterns across echinoderm species using machine learning and statistical methods.

Features

  • Upload and process CDS FASTA files from multiple echinoderm species
  • Calculate codon usage statistics and GC content
  • Perform dimensionality reduction (PCA, t-SNE)
  • Cluster analysis (K-means, Hierarchical)
  • Machine learning classification
  • Statistical analysis (ANOVA)
  • Interactive visualizations
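
As a rough illustration of the codon usage and GC content calculations listed above, the following minimal sketch counts in-frame codons in a single coding sequence and computes its GC fraction. The function names (`codon_counts`, `codon_frequencies`, `gc_content`) and the example sequence are hypothetical and are not taken from the app's source; the app may compute these statistics differently.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons, skipping a trailing partial codon and ambiguous bases."""
    seq = cds.upper()
    counts = Counter()
    usable_length = len(seq) - len(seq) % 3  # drop an incomplete final codon
    for i in range(0, usable_length, 3):
        codon = seq[i:i + 3]
        if set(codon) <= set("ACGT"):  # ignore codons containing N or other symbols
            counts[codon] += 1
    return counts


def codon_frequencies(counts: Counter) -> dict:
    """Convert raw codon counts to relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def gc_content(cds: str) -> float:
    """Fraction of unambiguous bases that are G or C."""
    seq = cds.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


# Hypothetical toy sequence: ATG GCC GCC TAA
example = "ATGGCCGCCTAA"
counts = codon_counts(example)
freqs = codon_frequencies(counts)
gc = gc_content(example)
print(counts, freqs, round(gc, 3))
```

Per-species codon frequency vectors of this kind are the sort of feature table that dimensionality reduction, clustering, and classification would typically take as input.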

Setup

  1. Clone the repository:
     git clone https://github.com/AI-Ecology-Lab/EchinoML.git
     cd EchinoML
  2. Create and activate a virtual environment:
     python -m venv venv
     source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages:
     pip install -r requirements.txt

Usage

  1. Start the Streamlit app:
     streamlit run app.py
  2. Upload your CDS FASTA files (.fa.gz format) through the web interface.
  3. Explore the various analysis sections:
    • Basic Statistics
    • GC Content Analysis
    • Codon Usage Analysis
    • Dimensionality Reduction
    • Clustering Analysis
    • Machine Learning Analysis
    • Statistical Analysis
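
The Statistical Analysis section is listed as ANOVA under Features. As a rough illustration of what a one-way ANOVA computes, the sketch below derives the F statistic for GC content grouped by species using only the standard library. The function name `one_way_anova_f` and the GC values are made up for illustration; they are not results from the app. In practice a library routine such as `scipy.stats.f_oneway` would likely be used instead, which also returns a p-value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of numbers."""
    k = len(groups)                      # number of groups (species)
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n - k
    # Note: raises ZeroDivisionError if every group has zero internal variance
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented per-gene GC fractions for three hypothetical species
gc_by_species = [
    [0.41, 0.42, 0.43],
    [0.42, 0.43, 0.44],
    [0.45, 0.46, 0.47],
]
f_stat, df_b, df_w = one_way_anova_f(gc_by_species)
print(round(f_stat, 2), df_b, df_w)
```

A large F relative to the F distribution with `df_b` and `df_w` degrees of freedom suggests that mean GC content differs between at least two of the groups.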

Data Requirements

The app expects gzipped CDS (coding sequence) FASTA files (.fa.gz) from echinoderm species. Each file should contain protein-coding nucleotide sequences.
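
To make the expected input concrete, here is a minimal sketch of reading a gzipped FASTA file with the Python standard library. The function name `read_fasta_gz`, the file name `demo_species.fa.gz`, and the record contents are assumptions for illustration; the app itself may parse files with a different library or approach.

```python
import gzip
import os
import tempfile
from typing import Iterator, Tuple


def read_fasta_gz(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a gzipped FASTA file."""
    header, chunks = None, []
    with gzip.open(path, "rt") as handle:  # "rt" decodes the gzip stream as text
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Write a tiny hypothetical .fa.gz file, then read it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_species.fa.gz")
    with gzip.open(path, "wt") as out:
        out.write(">gene1 hypothetical\nATGGCC\nGCCTAA\n>gene2\nATGTAA\n")
    records = list(read_fasta_gz(path))

print(records)
```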

Output

The app provides:

  • Interactive visualizations
  • Statistical summaries
  • Machine learning model performance metrics
  • Downloadable results in CSV format
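
As a sketch of what a CSV export of results could look like, the example below serializes a per-species codon frequency table to CSV text with the standard library. The function name `codon_table_to_csv`, the column layout, and the demo values are hypothetical; the app's actual CSV schema is not documented here.

```python
import csv
import io


def codon_table_to_csv(table: dict) -> str:
    """Serialize {species: {codon: frequency}} to CSV text, one row per species."""
    # Use the union of all codons so every row has the same columns
    codons = sorted({codon for freqs in table.values() for codon in freqs})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["species"] + codons)
    for species in sorted(table):
        row = [table[species].get(codon, 0.0) for codon in codons]
        writer.writerow([species] + row)
    return buffer.getvalue()


# Invented frequencies for two hypothetical species
demo = {
    "species_a": {"GCC": 0.5, "ATG": 0.25},
    "species_b": {"GCC": 0.25, "TAA": 0.5},
}
csv_text = codon_table_to_csv(demo)
print(csv_text)
```

In a Streamlit app, a string like this could be offered to the user through `st.download_button`.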

License

[Add your license here]

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
