Advanced AI system for analyzing the Collatz Conjecture using Deep Learning and parallel brute-force search
This project combines AI-guided pattern recognition with parallel brute-force search to investigate the Collatz Conjecture, one of mathematics' most famous unsolved problems.
- 🧠 Transformer-based Neural Network for sequence prediction
- 🔍 Multi-threaded C++ Loop Searcher (Floyd's Cycle Detection)
- ⚡ Native C++ Data Engine for maximum performance
- 📡 Real-time Discord Integration for monitoring
- 📈 Curriculum Learning with "Hard Mode" candidates (n > 2^68)
- 🔬 Advanced Optimizations: Mixed Precision (AMP), Cosine Annealing LR, Gradient Clipping
- GPU: NVIDIA GPU with 6GB+ VRAM (tested on RTX 3070 Ti)
- CPU: Multi-core processor (tested on Ryzen 5900X)
- RAM: 16GB+ recommended
- OS: Linux (tested on Arch Linux)
- Python: 3.13+
- CUDA: 12.8+
```bash
# Clone the repository
git clone https://github.com/yourusername/collatz-ai.git
cd collatz-ai

# Run setup (creates venv, installs dependencies, compiles C++ modules)
chmod +x run.sh
./run.sh
```

```bash
# Start training
./run.sh

# Interactive commands during training:
# - Type 'stop' to save and exit
# - Type 'status' for current progress
# - Ctrl+C for graceful shutdown
```
Architecture:

```
Input: Parity Vector [0, 1, 0, 1, 1, ...]
          ↓
Embedding Layer (3 → 128d)
          ↓
Positional Encoding
          ↓
Transformer Encoder (4 layers, 4 heads)
          ↓
Dual Heads:
  ├─ Stopping Time Prediction (Regression, Log-Space)
  └─ Next Step Prediction (Classification)
```
Specifications:
- Model Size: 128d, 4 layers, 4 attention heads
- Batch Size: 512
- Optimizer: AdamW with Cosine Annealing
- Loss: Huber Loss (stopping time) + CrossEntropy (sequence)
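A minimal PyTorch sketch of the architecture described above. The class name, the sinusoidal positional encoding, and the mean-pooling for the regression head are assumptions for illustration; the real implementation is in src/model.py.

```python
# Sketch of the dual-head model described above; details are illustrative.
import math
import torch
import torch.nn as nn

class CollatzTransformer(nn.Module):
    def __init__(self, d_model=128, nhead=4, num_layers=4, max_len=512):
        super().__init__()
        # Tokens: 0 = even step, 1 = odd step, 2 = padding (the "3" in 3 -> 128d)
        self.embed = nn.Embedding(3, d_model)
        # Standard sinusoidal positional encoding (assumed here)
        pe = torch.zeros(max_len, d_model)
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.stop_head = nn.Linear(d_model, 1)  # log stopping time (regression)
        self.step_head = nn.Linear(d_model, 2)  # next parity (classification)

    def forward(self, parity: torch.Tensor):
        x = self.embed(parity) + self.pe[: parity.size(1)]
        h = self.encoder(x)
        pooled = h.mean(dim=1)
        return self.stop_head(pooled).squeeze(-1), self.step_head(h)
```

Per the specifications, training pairs Huber loss on the log stopping time with cross-entropy on the per-step parity logits.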
```cpp
// Parallel brute-force search using Floyd's algorithm
// 22 threads × 1M numbers = 22M candidates per run
// Target: n > 2^68, n ≡ 3 (mod 4)
```

Features:
- Multi-threaded C++ implementation
- 128-bit integer support (`__int128`)
- Detects non-trivial cycles
- Runs in background during training
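For illustration, here is a single-threaded Python sketch of the same Floyd's-cycle-detection idea; Python's arbitrary-precision integers stand in for the C++ engine's `__int128`, and the range below is kept tiny so it runs quickly (the real searcher starts above 2^68).

```python
def collatz_step(n: int) -> int:
    # One application of the Collatz map.
    return n // 2 if n % 2 == 0 else 3 * n + 1

def hits_nontrivial_cycle(n: int) -> bool:
    # Floyd's tortoise-and-hare on the orbit of n. Every orbit that reaches 1
    # falls into the trivial 4 -> 2 -> 1 cycle, so the two pointers meeting
    # anywhere else would signal a counterexample to the conjecture.
    slow = fast = n
    while True:
        slow = collatz_step(slow)
        fast = collatz_step(collatz_step(fast))
        if slow == fast:
            return slow not in (1, 2, 4)

# Scan candidates with n ≡ 3 (mod 4), mirroring the C++ searcher's filter:
start = (1 << 30) + 3
cycles = [n for n in range(start, start + 4_000, 4) if hits_nontrivial_cycle(n)]
print(f"non-trivial cycles found: {len(cycles)}")  # expected: 0
```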
| Metric | Value |
|---|---|
| Final Loss | 0.3698 |
| Stopping Time Error | 0.0003 (log-space) |
| Sequence Accuracy | ~70% |
| Training Speed | ~27s / 100 steps |
| GPU Utilization | ~90% (7.2GB / 8GB) |
| CPU Utilization | ~85% (20 workers) |
- Numbers Checked: 22,000,000 per run
- Range: [2^68, 2^68 + 22M]
- Non-trivial Cycles Found: 0 (as expected)
- Mixed Precision Training (AMP)
  - Reduces VRAM usage by ~40%
  - Increases training speed by ~30%
- Native C++ Engine
  - 20-30% faster data generation
  - Supports numbers > 2^64 (128-bit)
- Curriculum Learning
  - 50% "Normal" data (sequential numbers)
  - 50% "Hard" data (n > 2^68, special patterns)
- Learning Rate Scheduling
  - Cosine Annealing: 1e-4 → 1e-6
  - Smooth convergence, prevents oscillation
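These pieces combine in a training step roughly like the hedged PyTorch sketch below. It reuses the hypothetical `CollatzTransformer` from the architecture sketch; the clipping threshold of 1.0 is an assumed value, and the project's real loop lives in src/train.py.

```python
# Illustrative training step: AMP + cosine annealing + gradient clipping.
import torch
import torch.nn.functional as F

model = CollatzTransformer().cuda()  # hypothetical model from the sketch above
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Cosine annealing sweeps the LR from 1e-4 down to 1e-6 over training.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=1_000_000, eta_min=1e-6)
scaler = torch.cuda.amp.GradScaler()  # keeps FP16 gradients from underflowing

def train_step(parity, log_stop_time, next_parity):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():  # mixed-precision forward pass
        pred_stop, step_logits = model(parity)
        loss = (F.huber_loss(pred_stop, log_stop_time)
                + F.cross_entropy(step_logits.transpose(1, 2), next_parity))
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)  # unscale so clipping sees true gradient norms
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)
    scaler.update()
    scheduler.step()
    return loss.item()
```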
```
collatz_ai/
├── src/
│   ├── train.py          # Main training script
│   ├── model.py          # Transformer architecture
│   ├── engine.py         # Numba-optimized data generation
│   ├── dataset.py        # PyTorch Dataset/DataLoader
│   ├── analyze.py        # Model analysis & visualization
│   ├── discord_bot.py    # Discord webhook integration
│   ├── collatz_core.cpp  # C++ data engine
│   ├── loop_searcher.cpp # C++ parallel loop searcher
│   ├── native_engine.py  # Python bindings (ctypes)
│   └── loop_search.py    # Loop searcher wrapper
├── checkpoints/          # Model checkpoints (auto-saved)
├── requirements.txt      # Python dependencies
├── run.sh                # Setup & run script
└── README.md             # This file
```
- Stopping Time Prediction: Near-perfect accuracy (99.97%)
- Parity Patterns: Strong recognition of even/odd sequences
- Anomaly Detection: Identifies numbers with unusual stopping times
The model struggles with these numbers (unusually short stopping times):
```
Number: 1249, Actual: 176, Predicted: 233, Error: 57
Number: 1695, Actual: 179, Predicted: 236, Error: 57
Number: 1742, Actual: 179, Predicted: 235, Error: 56
```
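The "Actual" column can be sanity-checked with a few lines, assuming it counts total steps of the full Collatz map until the orbit first reaches 1 (the project's exact counting convention is defined in its engine):

```python
def total_stopping_time(n: int) -> int:
    # Steps of n -> n/2 (even) or n -> 3n+1 (odd) until the orbit hits 1.
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

for n in (1249, 1695, 1742):
    print(n, total_stopping_time(n))
```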
- Analyze stopping time distributions
- Identify exceptional numbers
- Visualize sequence embeddings with PCA (see the sketch below)
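A hedged sketch of the PCA visualization, reusing the hypothetical `CollatzTransformer` from the architecture sketch above; function and attribute names are assumptions, and the shipped tool is src/analyze.py.

```python
# Illustrative PCA projection of pooled encoder states, colored by
# stopping time; not the analyze.py API.
import matplotlib.pyplot as plt
import torch
from sklearn.decomposition import PCA

@torch.no_grad()
def plot_sequence_embeddings(model, parity_batch, stopping_times):
    # Pool the encoder's hidden states into one vector per sequence,
    # then project to 2-D with PCA.
    x = model.embed(parity_batch) + model.pe[: parity_batch.size(1)]
    pooled = model.encoder(x).mean(dim=1)
    coords = PCA(n_components=2).fit_transform(pooled.cpu().numpy())
    plt.scatter(coords[:, 0], coords[:, 1], c=stopping_times, cmap="viridis")
    plt.colorbar(label="stopping time")
    plt.title("Sequence embeddings (PCA)")
    plt.show()
```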
- Benchmark for sequence prediction
- Study curriculum learning effects
- Explore transformer behavior on mathematical sequences
- Automated large-scale verification
- Pattern discovery in high ranges (> 2^68)
- Real-time progress monitoring via Discord
- Distributed training across multiple GPUs
- Larger model (256d, 6 layers) from scratch
- GPU-accelerated loop detection
- Extended search range (2^100 - 2^120)
- Hybrid LSTM+Transformer architecture
Edit src/train.py to customize:
```python
BATCH_SIZE = 512      # Adjust for your VRAM
NUM_WORKERS = 20      # CPU threads for data loading
STEPS = 1000000       # Training duration
D_MODEL = 128         # Model dimension
NUM_LAYERS = 4        # Transformer layers
NHEAD = 4             # Attention heads
```

Contributions welcome! Areas of interest:
- Model architecture improvements
- Faster loop detection algorithms
- Better anomaly detection
- Visualization enhancements
GNU General Public License v3.0
This project is licensed under GPL v3, which means:
✅ You CAN:
- Use for any purpose (personal, commercial, research)
- Modify and improve the code
- Distribute original or modified versions
- Sell modified versions

⚠️ You MUST:
- Share source code of any modifications
- Use the same GPL v3 license
- State significant changes
- Include copyright and license notices
🎯 Mission: Help humanity solve the Collatz Conjecture through open collaboration!
See LICENSE file for full details.
- Collatz Conjecture: Lothar Collatz (1937)
- PyTorch Team: For the amazing framework
- Numba Team: For JIT compilation magic
- Community: For mathematical insights
- Issues: GitHub Issues
- Discussions: GitHub Discussions
🎯 Goal: Advance our understanding of the Collatz Conjecture through AI-guided analysis and exhaustive verification.
Made with ❤️ for mathematics and machine learning