tienmng/Crystal_Math

 
 

Crystal Structure Analysis (CSA)

Documentation Status · Python 3.9 · PyTorch · License: MIT

A comprehensive Python framework for extracting, processing, and analyzing molecular crystal structures from the Cambridge Structural Database (CSD).

🚀 Key Features

  • High-Performance Pipeline: GPU-accelerated batch processing with PyTorch
  • CSD Integration: Direct interface to Cambridge Structural Database
  • Advanced Analytics: Fragment analysis, intermolecular contacts, and geometric descriptors
  • Efficient Storage: HDF5-based data management with variable-length datasets
  • Scalable Architecture: Parallel processing for large datasets
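As an illustration of the intermolecular-contact analysis listed above, here is a minimal sketch in plain Python. It assumes a common convention (a contact is a pair of atoms from different molecules that sit closer than the sum of their van der Waals radii) and uses Bondi radii; CSA's own criterion, data layout, and function names may differ, so `find_contacts` and the atom dictionaries are hypothetical.

```python
import math
from itertools import combinations

# Bondi van der Waals radii in angstroms (a common convention; CSA may use other values).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def find_contacts(atoms, tolerance=0.0):
    """Return (i, j, distance) for atom pairs in different molecules that lie
    closer than the sum of their van der Waals radii plus `tolerance`."""
    contacts = []
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        if a["mol"] == b["mol"]:
            continue  # skip intramolecular pairs
        cutoff = VDW_RADII[a["element"]] + VDW_RADII[b["element"]] + tolerance
        d = math.dist(a["xyz"], b["xyz"])
        if d < cutoff:
            contacts.append((i, j, d))
    return contacts


# Toy input: two molecules, coordinates in angstroms (made-up values).
atoms = [
    {"mol": 0, "element": "O", "xyz": (0.0, 0.0, 0.0)},
    {"mol": 0, "element": "H", "xyz": (0.96, 0.0, 0.0)},
    {"mol": 1, "element": "O", "xyz": (2.8, 0.0, 0.0)},
    {"mol": 1, "element": "C", "xyz": (9.0, 0.0, 0.0)},
]
contacts = find_contacts(atoms)
for i, j, d in contacts:
    print(f"atom {i} - atom {j}: {d:.2f} A")
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a toy example; a batched or GPU implementation would typically replace it with a vectorized distance matrix.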

📖 Documentation

📚 Complete Documentation

⚡ Quick Start

# Install CSA
pip install -e .

# Run analysis
python src/csa_main.py --config your_config.json

For detailed installation instructions and requirements, see the Installation Guide.
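The run command above expects a JSON configuration file. The sketch below shows one way to generate such a file and assemble the command from Python. The keys in `config` are placeholders chosen for illustration, not CSA's actual schema, and `write_config` and `build_command` are hypothetical helpers; consult the documentation for the real options.

```python
import json
import sys
import tempfile
from pathlib import Path

# Hypothetical settings: these key names are placeholders, not CSA's actual schema.
config = {
    "output_dir": "csa_output",
    "batch_size": 32,
    "device": "cuda",
}


def write_config(settings, path):
    """Serialize `settings` to JSON at `path` and return the path."""
    path = Path(path)
    path.write_text(json.dumps(settings, indent=2))
    return path


def build_command(config_path):
    """Build the argument list for the entry point shown in the Quick Start."""
    return [sys.executable, "src/csa_main.py", "--config", str(config_path)]


config_path = write_config(config, Path(tempfile.mkdtemp()) / "your_config.json")
command = build_command(config_path)
print(" ".join(command))
# To launch the run from the repository root: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when the config path contains spaces.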

🔬 What CSA Does

CSA transforms raw crystallographic data into analysis-ready datasets through a five-stage pipeline:

  1. Family Extraction - Query and organize CSD structures by chemical families
  2. Similarity Clustering - Group structures by 3D packing similarity
  3. Representative Selection - Choose optimal structures using statistical metrics
  4. Data Extraction - Extract atomic coordinates, bonds, and intermolecular contacts
  5. Feature Engineering - Compute advanced geometric and topological descriptors
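To make stages 2 and 3 concrete, here is a minimal sketch that groups structures by a pairwise similarity score and then picks one representative per group. It assumes a similarity function returning values in [0, 1], single-linkage grouping with a fixed threshold, and a medoid-style selection rule; these are illustrative stand-ins, not CSA's actual packing-similarity measure or selection metrics, and all names and scores are hypothetical.

```python
def cluster_by_similarity(ids, similarity, threshold):
    """Single-linkage grouping: two structures share a cluster when a chain
    of pairwise similarities >= threshold connects them."""
    parent = {s: s for s in ids}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if similarity(a, b) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for s in ids:
        clusters.setdefault(find(s), []).append(s)
    return list(clusters.values())


def pick_representative(cluster, similarity):
    """Return the member with the highest total similarity to the others (a medoid)."""
    return max(cluster, key=lambda a: sum(similarity(a, b) for b in cluster if b != a))


# Toy similarity scores between five made-up structure identifiers.
scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.6,
    frozenset(("D", "E")): 0.95,
}


def similarity(a, b):
    return scores.get(frozenset((a, b)), 0.1)  # unlisted pairs are dissimilar


clusters = cluster_by_similarity(["A", "B", "C", "D", "E"], similarity, threshold=0.7)
representatives = [pick_representative(c, similarity) for c in clusters]
print(clusters, representatives)
```

Note that B and C land in the same cluster even though their direct similarity is below the threshold, because both are similar to A; that chaining is characteristic of single linkage and is one reason a real pipeline might prefer a stricter criterion.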

📋 Requirements

  • Python 3.9 (required by the CSD Python API)
  • PyTorch (GPU recommended)
  • Valid CCDC license for CSD access
  • HDF5 and related dependencies

See the full requirements in the documentation.
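Before a first run, the requirements above can be checked with a short script. This sketch assumes the import names `torch`, `h5py`, and `ccdc` (the CSD Python API package) for the listed dependencies; `check_environment` is a hypothetical helper, and it only confirms that the packages are installed, not that a CCDC license is active.

```python
import importlib.util
import sys

# Import names assumed for each requirement; `ccdc` is the CSD Python API package.
REQUIRED_MODULES = ("torch", "h5py", "ccdc")


def check_environment(modules=REQUIRED_MODULES, python=(3, 9)):
    """Report whether the interpreter version matches and which modules are installed."""
    report = {"python_ok": sys.version_info[:2] == python}
    for name in modules:
        # find_spec locates a top-level module without importing it.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'not satisfied'}")
```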

🤝 Contributing

Contributions are welcome! Please see our contributing guidelines for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Note: CSA requires a valid Cambridge Crystallographic Data Centre (CCDC) license for full functionality.

About

Algorithms to analyze and predict molecular structures
