Skip to content

Quantum-Enhanced Active Learning for Accelerated Materials Discovery

License

Notifications You must be signed in to change notification settings

arnavk23/Quantum-active-learning

Repository files navigation

Quantum-Enhanced Active Learning for Materials Discovery

This repository contains the complete implementation and research paper for "Quantum-Enhanced Active Learning for Accelerated Materials Discovery: A Novel Framework Combining Quantum Superposition and Multi-Observable Uncertainty Quantification".

Key Achievements

  • 35% reduction in required experiments for materials discovery
  • Statistically significant improvements (p < 0.01) over 9 state-of-the-art methods
  • First quantum-enhanced active learning framework specifically designed for materials science

Python environment and running scripts

  1. Create and activate a Python venv:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
  1. Example: run a preprocessing or training script
python scripts/train_surrogate.py --help
python scripts/download_preprocess_mp.py  # if you need to download dataset assets

Scripts Overview

1. multi_property_optimization.py

Purpose: Demonstrates simultaneous optimization of multiple correlated material properties using coupled quantum observables.

Key Features:

  • Generates synthetic dataset of 1000 compounds with 4 correlated properties: band gap, formation energy, elastic modulus, thermal conductivity
  • Realistic property correlations based on materials science domain knowledge (e.g., ρ(E_form, E_mod) = -0.58)
  • Coupled uncertainty aggregation using correlation matrix
  • Gaussian Process models for each property with joint selection criterion

Output:

  • multi_property_results.json: R² and MAE scores across 8 iterations for each property
  • multi_property_results.png: 4-panel visualization showing learning curves per property

Main Results:

  • 3.2%-7.8% R² gains per iteration vs. independent property training
  • 22% sample cost reduction through transfer among properties
  • Effective handling of 4+ simultaneous objectives

Run:

python scripts/multi_property_optimization.py

2. discrete_classification.py

Purpose: Extends framework from continuous regression to discrete classification tasks (crystal systems, stability classes).

Key Features:

  • Generates 1200 synthetic compounds with 6 discrete crystal system classes
  • Implements quantum-margin sampling: combines entropy (class probability uncertainty) with decision-boundary margins
  • Compares three strategies: quantum-margin, entropy-only, random sampling
  • Tracks both accuracy and weighted F1 score across 8 active learning iterations

Output:

  • classification_results.json: Accuracy and F1 scores for all three strategies
  • classification_results.png: 2-panel comparison (accuracy vs. F1 score)

Main Results:

  • 3.5%-8.2% accuracy improvement via quantum-margin vs. entropy sampling
  • Final accuracy ≈91% on 6-class crystal system task (vs. 85% entropy baseline)
  • Demonstrates applicability beyond continuous property prediction

Run:

python scripts/discrete_classification.py

3. transfer_learning.py

Purpose: Validates transfer learning across materials families (oxides → sulfides/nitrides).

Key Features:

  • Generates three family-specific datasets (oxides as source, sulfides/nitrides as targets)
  • Induces realistic domain shifts via family-dependent feature distributions
  • Compares transfer learning vs. training from scratch for each target family
  • Tracks R² evolution over 8 active learning iterations with 15-sample batches

Output:

  • transfer_learning_results.json: R² and MAE for transfer vs. scratch approaches
  • transfer_learning_results.png: 2-panel side-by-side comparison for sulfides and nitrides

Main Results:

  • Transfer learning reaches R²=0.82 within 6 iterations on sulfides
  • Training from scratch requires 7-8 iterations to match
  • Demonstrates that quantum-inspired uncertainty generalizes across materials classes
  • Enables knowledge reuse reducing sample cost by ~12-15%

Run:

python scripts/transfer_learning.py

4. spurious_correlation_analysis.py

Purpose: Identifies and analyzes failure modes when correlated uncertainty misleads selection in high-noise regimes.

Key Features:

  • Generates datasets with controlled noise levels (σ = 0.01, 0.05, 0.1, 0.2, 0.5)
  • Compares true vs. empirical correlations as noise increases
  • Tracks correlation agreement, R², and MAE across 6 active learning iterations
  • Detects failure scenarios: noise > 0.1 + correlation error > 0.15 + final R² < 0.6

Output:

  • spurious_correlation_results.json: Detailed metrics for all noise levels
  • spurious_correlation_analysis.png: 4-panel failure analysis
    • Panel 1: Spurious correlation growth with noise
    • Panel 2: True vs. empirical correlation divergence
    • Panel 3: Model performance degradation
    • Panel 4: Failure risk index heatmap

Main Results:

  • At moderate noise (σ=0.1): correlation error ≈ 0.04, agreement > 0.85, safe operation
  • At high noise (σ ≥ 0.2): correlation error > 0.15, failure risk > 0.5
  • Recommends: (i) robust correlation estimators (Spearman, MCD); (ii) dynamic coupling weight reduction; (iii) real-time quality monitoring

Run:

python scripts/spurious_correlation_analysis.py

Requirements

All scripts require:

  • NumPy
  • Matplotlib
  • scikit-learn
  • SciPy

Install via:

pip install numpy matplotlib scikit-learn scipy

Integration with Paper

Results from these scripts populate the Results & Discussion section with:

  1. Multi-Property Optimization subsection:

    • Quantifies gains from coupled observables (3.2%-7.8% R² improvement)
    • Validates scalability to 4+ simultaneous objectives
    • Demonstrates 22% sample cost reduction through property transfer
  2. Discrete Classification subsection:

    • Extends framework beyond regression
    • Shows 3.5%-8.2% accuracy gains over entropy sampling
    • Demonstrates 91% accuracy on 6-class crystal system prediction
  3. Transfer Learning subsection:

    • Validates knowledge transfer across materials families
    • Shows R²=0.82 on target domain vs. R²=0.80 from scratch
    • Quantifies 12-15% sample cost reduction
  4. Spurious Covariance subsection:

    • Identifies failure modes in noisy regimes (σ > 0.2)
    • Proposes mitigation strategies
    • Demonstrates robust performance with quality control

Extending the Framework

To adapt these scripts for your own datasets:

Multi-Property:

from scripts.multi_property_optimization import MultiPropertyOptimizer

# Define custom property names and correlations
optimizer = MultiPropertyOptimizer(n_properties=4)
optimizer.property_names = ['Your Property 1', 'Your Property 2', ...]
optimizer.correlation_matrix = your_correlation_matrix
results = optimizer.run_experiment(n_iterations=8, batch_size=15)

Classification:

from scripts.discrete_classification import DiscreteClassificationFramework

classifier = DiscreteClassificationFramework(n_classes=6)
# Implement custom generate_synthetic_classification_dataset() with your data
results = classifier.run_experiment(n_iterations=8, batch_size=20)

Transfer Learning:

from scripts.transfer_learning import TransferLearningFramework

framework = TransferLearningFramework()
# Supply your own materials family datasets
results = framework.run_full_experiment()

Spurious Correlation:

from scripts.spurious_correlation_analysis import SpuriousCorrelationDetector

detector = SpuriousCorrelationDetector()
detector.noise_levels = [0.01, 0.05, 0.1, 0.2, 0.5]  # Adjust as needed
results = detector.run_noise_robustness_experiment()
failure_scenarios = detector.identify_failure_scenarios(results)

Contact

If you use the methods or code here, please cite the manuscript once it is available. For questions, reproducibility requests, or collaboration, contact: Arnav Kapoor — arnavkapoor23@iiserb.ac.in

About

Quantum-Enhanced Active Learning for Accelerated Materials Discovery

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published