Drug Discovery Informatics Platform

A comprehensive MVP platform for managing compound libraries, running ML predictions, tracking experiments, and generating reports for drug discovery applications.

Demo video: drugovery.mov

Data Usage Notice

This video uses a portion of the PubChemLite Compound Collection, for demo purposes only. All rights belong to the original creators.

Features

Compound Library Management: Upload, search, and manage chemical compounds with SMILES notation
ML Predictions: QSAR models for molecular properties (solubility, toxicity) and drug-target interactions
Experiment Tracking: MLflow integration for tracking ML runs and comparing models
Report Generation: PDF/CSV exports with visualizations
User Authentication: JWT-based auth with role-based access control
Data Versioning: Track changes to compound data over time
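
To make the data versioning idea concrete, here is a minimal sketch of an append-only change history for a compound record. It is not taken from the platform's versioning_service.py; the class names `CompoundVersion` and `VersionedCompound` are hypothetical and the storage is in-memory only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class CompoundVersion:
    """One snapshot of a compound record (hypothetical structure)."""
    version: int
    data: Dict[str, Any]
    changed_fields: List[str]
    created_at: datetime


@dataclass
class VersionedCompound:
    """Append-only history: an update adds a snapshot instead of overwriting."""
    history: List[CompoundVersion] = field(default_factory=list)

    def update(self, **changes: Any) -> CompoundVersion:
        current = dict(self.history[-1].data) if self.history else {}
        changed = sorted(k for k, v in changes.items() if current.get(k) != v)
        if self.history and not changed:
            return self.history[-1]  # nothing changed, so no new version
        current.update(changes)
        snapshot = CompoundVersion(
            version=len(self.history) + 1,
            data=current,
            changed_fields=changed,
            created_at=datetime.now(timezone.utc),
        )
        self.history.append(snapshot)
        return snapshot

    def at(self, version: int) -> Dict[str, Any]:
        """Return a copy of the record as it looked at a 1-based version."""
        return dict(self.history[version - 1].data)


record = VersionedCompound()
record.update(name="Aspirin", smiles="CC(=O)OC1=CC=CC=C1C(=O)O")
record.update(molecular_formula="C9H8O4")
record.update(name="Aspirin")  # identical value, so no new version is stored
```

Storing full snapshots keeps reads simple at the cost of space; a real implementation might store only the changed fields per version.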

Tech Stack

Backend

FastAPI (Python 3.9+)
PostgreSQL
Redis
MLflow
Celery
RDKit
SQLAlchemy

Frontend

React.js 18+
Material-UI (MUI)
Axios
React Router
Chart.js

Infrastructure

Docker & Docker Compose
Nginx
Kubernetes-ready

Quick Start

Prerequisites

Docker and Docker Compose
Python 3.9+ (for local development)
Node.js 18+ (for local development)

Using Docker Compose (Recommended)

Clone the repository and navigate to the project directory.
Set up environment variables:

cp .env.example .env
# Edit .env with your configuration

Start all services:

docker-compose up -d

Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- MLflow UI: http://localhost:5001

Create initial admin user:

docker-compose exec backend python scripts/create_admin.py

Local Development

Backend Setup

cd backend
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

# Apply database migrations
alembic upgrade head

# Initialize the database
python scripts/init_db.py

# Start server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Frontend Setup

cd frontend
npm install
npm start

Start Supporting Services

# PostgreSQL
docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:14

# Redis
docker run -d -p 6379:6379 redis:7

# MLflow (the backend store URI assumes a database named "mlflow" already exists in PostgreSQL)
mlflow server --backend-store-uri postgresql://postgres:postgres@localhost/mlflow --default-artifact-root ./mlruns --host 0.0.0.0 --port 5001

Project Structure

drugovery/
├── backend/
│   ├── app/
│   │   ├── api/
│   │   │   ├── v1/
│   │   │   │   ├── auth.py
│   │   │   │   ├── compounds.py
│   │   │   │   ├── predictions.py
│   │   │   │   ├── experiments.py
│   │   │   │   └── reports.py
│   │   ├── core/
│   │   │   ├── config.py
│   │   │   ├── security.py
│   │   │   └── database.py
│   │   ├── models/
│   │   │   ├── user.py
│   │   │   ├── compound.py
│   │   │   ├── experiment.py
│   │   │   └── prediction.py
│   │   ├── schemas/
│   │   │   └── ...
│   │   ├── services/
│   │   │   ├── ml_service.py
│   │   │   ├── chembl_service.py
│   │   │   └── versioning_service.py
│   │   ├── tasks/
│   │   │   └── prediction_tasks.py
│   │   └── main.py
│   ├── ml_models/
│   │   ├── qsar_models/
│   │   └── dti_models/
│   ├── alembic/
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── services/
│   │   ├── utils/
│   │   └── App.js
│   ├── package.json
│   └── Dockerfile
├── docker-compose.yml
├── .env.example
└── README.md

API Documentation

Once the backend is running, visit:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc
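
Besides the interactive pages, FastAPI serves the raw OpenAPI schema at /openapi.json by default, unless the application overrides `openapi_url`. The sketch below lists the operations found in such a schema. The helper names `list_operations` and `fetch_schema` are hypothetical, and the sample schema is a hand-written fragment containing only the endpoints shown in this README.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema: Dict[str, Any]) -> List[str]:
    """Return 'METHOD /path' strings for every operation in an OpenAPI schema."""
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for method in sorted(item):
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                operations.append(f"{method.upper()} {path}")
    return operations


def fetch_schema(base_url: str = "http://localhost:8000") -> Dict[str, Any]:
    """Download the schema from a running backend (assumes the default URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


# Offline example using a hand-written fragment instead of a live server.
example_schema = {
    "paths": {
        "/api/v1/compounds": {"post": {}},
        "/api/v1/predictions": {"post": {}},
        "/api/v1/predictions/batch": {"post": {}},
    }
}
operations = list_operations(example_schema)
```

With the backend running, `list_operations(fetch_schema())` would give the same kind of listing for the live API.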

Testing

Backend Tests

cd backend
pytest tests/

Frontend Tests

cd frontend
npm test

Deployment

Kubernetes Deployment

See the k8s/ directory for Kubernetes manifests.

kubectl apply -f k8s/

Usage Examples

Creating a Compound

import requests

token = "your-jwt-token"
headers = {"Authorization": f"Bearer {token}"}

compound_data = {
    "name": "Aspirin",
    "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "molecular_formula": "C9H8O4"
}

response = requests.post(
    "http://localhost:8000/api/v1/compounds",
    json=compound_data,
    headers=headers
)
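
The payload above carries a molecular formula alongside the SMILES string. As an optional client-side sanity check before sending, a formula such as C9H8O4 can be parsed into element counts and an approximate molecular weight. This is a minimal sketch using only the standard library: it handles flat formulas without parentheses or charges, the atomic weight table is deliberately partial, and nothing here is claimed to match the validation the backend performs.

```python
import re
from collections import Counter

# Rounded standard atomic weights for a few common elements; extend as needed.
ATOMIC_WEIGHTS = {
    "C": 12.011, "H": 1.008, "N": 14.007,
    "O": 15.999, "S": 32.06, "Cl": 35.45,
}


def parse_formula(formula: str) -> Counter:
    """Parse a flat formula like 'C9H8O4' into element counts."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"Unsupported formula: {formula!r}")
    counts: Counter = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def molecular_weight(formula: str) -> float:
    """Sum atomic weights; raises if an element is missing from the table."""
    counts = parse_formula(formula)
    unknown = sorted(set(counts) - set(ATOMIC_WEIGHTS))
    if unknown:
        raise ValueError(f"No atomic weight for: {', '.join(unknown)}")
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in counts.items())


aspirin_counts = parse_formula("C9H8O4")
aspirin_weight = molecular_weight("C9H8O4")  # about 180.16 g/mol
```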

Running a Prediction

# Reuses `requests` and `headers` from the "Creating a Compound" example
prediction_data = {
    "compound_id": 1,
    "model_type": "solubility",
    "model_name": "solubility_model"
}

response = requests.post(
    "http://localhost:8000/api/v1/predictions",
    json=prediction_data,
    headers=headers
)

Batch Predictions

# Reuses `requests` and `headers` from the "Creating a Compound" example
batch_data = {
    "compound_ids": [1, 2, 3, 4, 5],
    "model_type": "toxicity",
    "model_name": "toxicity_model"
}

response = requests.post(
    "http://localhost:8000/api/v1/predictions/batch",
    json=batch_data,
    headers=headers
)
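
For large compound sets it may be useful to split the ID list into several batch requests. The sketch below builds one payload per chunk in the same shape as the example above. The batch size is an assumption, since this README does not state a limit for the batch endpoint, and `chunked` and `build_batch_payloads` are hypothetical helper names.

```python
from typing import Any, Dict, Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_batch_payloads(
    compound_ids: Sequence[int],
    model_type: str,
    model_name: str,
    batch_size: int = 100,  # assumed value, not a documented limit
) -> List[Dict[str, Any]]:
    """Build one request body per chunk of compound IDs."""
    return [
        {"compound_ids": chunk, "model_type": model_type, "model_name": model_name}
        for chunk in chunked(compound_ids, batch_size)
    ]


payloads = build_batch_payloads(
    list(range(1, 8)), "toxicity", "toxicity_model", batch_size=3
)
```

Each payload can then be sent with the same `requests.post` call shown above.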

Troubleshooting

Database Connection Issues

Ensure PostgreSQL is running: docker-compose ps
Check database credentials in .env
Verify network connectivity between services

MLflow Connection Issues

Check that the MLflow service is running: docker-compose logs mlflow
Verify MLFLOW_TRACKING_URI in environment variables
Check PostgreSQL connection for MLflow backend

Frontend Not Loading

Check that the backend API is accessible: curl http://localhost:8000/health
Verify CORS settings in backend config
Check browser console for errors
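
The checks above can be partly automated. The following is a minimal sketch that probes the URLs listed in this README using only the standard library. The function names are hypothetical, and the `opener` parameter exists only so the logic can be exercised without a running stack.

```python
from typing import Callable, Dict, Optional
from urllib.error import URLError
from urllib.request import urlopen

# URLs taken from the Quick Start and Troubleshooting sections.
SERVICES = {
    "frontend": "http://localhost:3000",
    "backend health": "http://localhost:8000/health",
    "api docs": "http://localhost:8000/docs",
    "mlflow": "http://localhost:5001",
}


def is_reachable(url: str, opener: Callable = urlopen, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


def check_all(
    services: Optional[Dict[str, str]] = None, opener: Callable = urlopen
) -> Dict[str, bool]:
    """Probe every service and return a name-to-status mapping."""
    targets = SERVICES if services is None else services
    return {name: is_reachable(url, opener) for name, url in targets.items()}
```

Calling `check_all()` while the stack is up returns a dictionary mapping each service name to `True` or `False`, which narrows down which container to inspect first.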

Performance Tips

Database Indexing: Add indexes on frequently queried fields
Caching: Use Redis for caching compound lookups
Batch Operations: Use batch endpoints for multiple operations
Pagination: Always use pagination for large datasets
Async Tasks: Use Celery for long-running predictions
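
As an illustration of the caching tip, here is a minimal cache-aside sketch. To stay self-contained it uses an in-memory `TTLCache` class as a stand-in for Redis; the class, the `compound:<id>` key format, the five-minute expiry, and `get_compound` are all assumptions for this example and not the platform's code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis-like store with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the expired entry
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_compound(
    compound_id: int,
    cache: TTLCache,
    load_from_db: Callable[[int], Any],
    ttl_seconds: float = 300.0,
) -> Any:
    """Cache-aside: try the cache first, fall back to the database, then store."""
    key = f"compound:{compound_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(compound_id)
    cache.set(key, value, ttl_seconds)
    return value


db_calls = []


def fake_db(compound_id: int) -> Dict[str, Any]:
    """Placeholder for a real database query; records each call."""
    db_calls.append(compound_id)
    return {"id": compound_id, "name": "Aspirin"}


cache = TTLCache()
first = get_compound(1, cache, fake_db)
second = get_compound(1, cache, fake_db)  # served from the cache
```

A short expiry limits how long stale data can be served after a compound is edited; invalidating the key on update is the other common option.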

Security Best Practices

Change default SECRET_KEY in production
Use strong database passwords
Enable HTTPS in production
Implement rate limiting
Regularly update dependencies
Use environment variables for secrets
Implement input validation
Enable audit logging
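
The first and sixth points can be enforced at startup. Below is a minimal sketch that reads SECRET_KEY from the environment and refuses placeholder or short values in production. The placeholder list, the 32-character minimum, and the function name `load_secret_key` are assumptions for illustration, not the checks performed in the platform's config.py.

```python
import os
from typing import Mapping, Optional

# Hypothetical placeholder values that should never reach production.
INSECURE_DEFAULTS = {"", "changeme", "secret", "your-secret-key"}
MIN_LENGTH = 32  # assumed minimum, adjust to your own policy


def load_secret_key(
    env: Optional[Mapping[str, str]] = None, production: bool = True
) -> str:
    """Read SECRET_KEY from the environment and reject weak values."""
    source = os.environ if env is None else env
    key = source.get("SECRET_KEY", "")
    if production:
        if key.lower() in INSECURE_DEFAULTS:
            raise RuntimeError("SECRET_KEY is unset or uses a placeholder value")
        if len(key) < MIN_LENGTH:
            raise RuntimeError(f"SECRET_KEY must be at least {MIN_LENGTH} characters")
    return key
```

A suitable value can be generated with the standard library, for example `python -c "import secrets; print(secrets.token_urlsafe(48))"`, and placed in .env.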

License

MIT

Contributing

Contributions are welcome! Please see the implementation guide (IMPLEMENTATION.md) for details on extending the platform.

Support

For issues and questions, please open an issue on the repository.
