sorprano/SmartCIF

SmartCIF

An intelligent workflow system for processing Crystallographic Information Files (CIFs) of Metal-Organic Frameworks (MOFs), featuring automated structure analysis, solvent identification, and literature retrieval.

Overview

This project provides an automated pipeline for processing MOF crystal structures, including:

  • CIF File Processing: Convert structures to the P1 space group and reduce them to primitive cells
  • CSD Database Integration: Retrieve structure information from the Cambridge Structural Database (CSD)
  • Cluster Analysis: Identify and analyze atomic clusters in crystal structures
  • Solvent Identification & Removal: Automatically detect and remove solvent molecules
  • Literature Retrieval: Download and parse related research papers
  • Batch Processing: Parallel processing of multiple CIF files
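To make the solvent identification step more concrete, here is a minimal sketch of one common geometric heuristic: group atoms into bonded fragments and treat everything outside the largest fragment as a candidate guest molecule. This is not taken from the SmartCIF source, which describes an LLM-driven approach. The function names, the radii table, and the tolerance value are assumptions for illustration, and periodic boundary conditions are ignored, so a real MOF cell would need extra handling.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstroms (illustrative subset, not exhaustive).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "Zn": 1.22, "Cu": 1.32}


def find_fragments(atoms, tolerance=0.4):
    """Group atoms into bonded fragments, largest first.

    atoms: list of (element, (x, y, z)) with Cartesian coordinates in angstroms.
    Two atoms count as bonded when their distance is below the sum of their
    covalent radii plus `tolerance`. Periodic images are NOT considered.
    """
    parent = list(range(len(atoms)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(atoms)), 2):
        (el_i, pos_i), (el_j, pos_j) = atoms[i], atoms[j]
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
        if dist(pos_i, pos_j) < cutoff:
            parent[find(i)] = find(j)

    fragments = {}
    for i in range(len(atoms)):
        fragments.setdefault(find(i), []).append(i)
    return sorted(fragments.values(), key=len, reverse=True)


def split_framework_and_guests(atoms, tolerance=0.4):
    """Return (framework_indices, list_of_guest_fragments).

    Assumption: the largest bonded fragment is the framework and every
    smaller fragment is a candidate solvent or guest molecule.
    """
    fragments = find_fragments(atoms, tolerance)
    if not fragments:
        return [], []
    return fragments[0], fragments[1:]
```

Calling split_framework_and_guests on a list of (element, position) tuples returns atom indices, so the caller can decide which guest fragments to delete. A size-only heuristic cannot distinguish coordinated solvent from free solvent or counter-ions, which is presumably where a chemistry-aware decision step is useful.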

Key Features

  • Multi-Agent Architecture: Specialized agents for different tasks (CSD info, cluster analysis, solvent identification, etc.)
  • LLM-Powered Analysis: Uses DeepSeek LLM for intelligent structure analysis and decision-making
  • Automated Workflow: End-to-end processing from raw CIF files to cleaned structures
  • Parallel Processing: Efficient batch processing with multiprocessing support
  • PDF Integration: Automatic paper download and content extraction
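The parallel batch processing feature can be pictured with the following sketch, which uses only the Python standard library. It is an illustration under assumptions, not the contents of src/batch_process_parallel.py: find_cif_files, process_one, and run_batch are hypothetical names, and process_one is a placeholder for the real per-file workflow.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path


def find_cif_files(folder):
    """Return the .cif files directly inside `folder`, sorted by name."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".cif")


def process_one(path):
    """Hypothetical per-file worker; stands in for the real workflow."""
    path = Path(path)
    return path.name, path.stat().st_size


def run_batch(paths, worker=process_one, max_workers=None,
              executor_cls=ProcessPoolExecutor):
    """Run `worker` on every path and collect results and errors separately."""
    results, errors = {}, {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[str(path)] = future.result()
            except Exception as exc:
                # One failing file should not abort the whole batch.
                errors[str(path)] = repr(exc)
    return results, errors
```

A typical call would be run_batch(find_cif_files(INPUT_FOLDER)). With the default process pool, the worker must be a module-level function so it can be pickled, and on platforms that spawn new interpreters the call should sit under an if __name__ == "__main__": guard. Keeping errors in a separate dictionary lets a long batch finish and report failed files at the end.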

Requirements

  • Python 3.8+
  • DeepSeek API key (configured in src/config.py)
  • CSD Python API (optional, for CSD database access)
  • Required Python packages (see environment.yml)

Quick Start

  1. Configure API keys in src/config.py:

    DEEPSEEK_API_KEY = "your_api_key"
    UNPAYWALL_EMAIL = "your_email"

  2. Set input/output paths in src/config.py:

    INPUT_FOLDER = "path/to/cif/files"
    P1_OUTPUT_FOLDER = "path/to/p1/output"
    FINAL_OUTPUT_FOLDER = "path/to/final/output"

  3. Run single-file processing:

    python src/run_workflow.py

  4. Run batch processing:

    python src/batch_process_parallel.py
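Before launching a run, it can help to confirm that the settings from steps 1 and 2 were actually filled in. The sketch below is a hypothetical pre-flight check and not part of SmartCIF: check_config and PLACEHOLDERS are illustrative names, and the only things taken from the project are the five setting names shown above.

```python
from pathlib import Path

# Placeholder values from the Quick Start that indicate an unedited config.
PLACEHOLDERS = {"", "your_api_key", "your_email"}


def check_config(settings):
    """Return a list of human-readable problems; an empty list means OK.

    settings: a mapping holding the names used in src/config.py.
    """
    problems = []

    for name in ("DEEPSEEK_API_KEY", "UNPAYWALL_EMAIL"):
        if str(settings.get(name, "")).strip() in PLACEHOLDERS:
            problems.append(f"{name} is not set")

    raw_input = str(settings.get("INPUT_FOLDER", "")).strip()
    if not raw_input:
        problems.append("INPUT_FOLDER is not set")
    elif not Path(raw_input).is_dir():
        problems.append(f"INPUT_FOLDER does not exist: {raw_input}")
    elif not any(p.suffix.lower() == ".cif" for p in Path(raw_input).iterdir()):
        problems.append(f"INPUT_FOLDER contains no .cif files: {raw_input}")

    for name in ("P1_OUTPUT_FOLDER", "FINAL_OUTPUT_FOLDER"):
        if not str(settings.get(name, "")).strip():
            problems.append(f"{name} is not set")

    return problems
```

Assuming the config file can be imported as a module, something like check_config(vars(config)) would pass its settings in as a dictionary. Returning a list of problems rather than raising on the first one lets all configuration mistakes be reported in a single pass.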

License

Contact

sopranos@sjtu.edu.cn
