🚀 A small language model implementation with state-of-the-art features, built from the ground up with modern transformer architectures and training techniques.
A.D.A.M. SLM is a completely self-contained language model that runs locally on your machine. You don't need:
- ❌ OpenAI API keys
- ❌ Anthropic API keys
- ❌ Google API keys
- ❌ Any external LLM service subscriptions
- ❌ Internet connection for inference
Demo video: `Adam-SLM.Demo.mp4`
✅ Everything runs locally with your own trained models!
Disclaimer: The development of this GitHub repository was inspired by "LLMs from Scratch" by Sebastian Raschka. The codebase was forked from the original repository at https://github.com/rasbt/LLMs-from-scratch and has been significantly modified and expanded. The author of this repository, Darrell Mesa, is not affiliated with Sebastian Raschka or the original repository and can be contacted at [email protected].
- Rotary Position Embeddings (RoPE) - Better positional understanding than learned embeddings
- Grouped Query Attention (GQA) - Efficient attention mechanism reducing memory usage
- SwiGLU Activation - Superior activation function used in modern LLMs
- RMSNorm - More stable normalization than LayerNorm
- KV-Cache Optimization - Fast inference with key-value caching
- Mixed Precision Training - FP16/BF16 support for faster training
- Gradient Accumulation - Train with larger effective batch sizes
- Learning Rate Scheduling - Cosine annealing with warmup
- Gradient Clipping - Stable training with gradient norm clipping
- Advanced Optimizers - AdamW with proper weight decay handling
- Checkpointing - Automatic model saving and resuming
- Modular Design - Clean, extensible codebase
- Type Hints - Full type annotation for better development
- Configuration Management - Flexible config system
- Comprehensive Logging - Weights & Biases integration
- Batch Inference - Efficient batch text generation
- Model Compilation - PyTorch 2.0 compile support
# Clone the repository
git clone https://github.com/ai-in-pm/adam-slm.git
cd adam-slm
# Install in development mode
pip install -e .
# Or install from PyPI (when available)
pip install adam-slm
# Start chatting immediately - completely offline!
python main.py
🔒 Privacy First: All conversations happen locally on your machine. No data is sent to external services.
python main.py # Start interactive chat (default)
python main.py --info # Show system information
python main.py --demo # Run demonstration
python main.py --test # Run integration test
python main.py --check # Check system status
python main.py --tokenizer # Test tokenizer system
python main.py --help # Show all options
🌐 Offline Operation: All commands work without internet connection or API keys.
import torch
from adam_slm.models import AdamSLM, get_config
from adam_slm.tokenization import AdamTokenizer
from adam_slm.inference import AdamInference, GenerationConfig
# Load model configuration
config = get_config("adam-slm-small")
# Create model and tokenizer
model = AdamSLM(config)
tokenizer = AdamTokenizer("gpt2")
# Setup inference
inference = AdamInference(
model=model,
tokenizer=tokenizer,
generation_config=GenerationConfig(
max_new_tokens=100,
temperature=0.8,
top_k=50,
)
)
# Generate text
text = inference.generate("The future of AI is")
print(text)
from adam_slm.training import AdamTrainer, get_training_config, create_dataloader
# Load training configuration
training_config = get_training_config("small")
# Prepare data
train_loader, eval_loader = create_dataloader(
texts=["Your training text here..."],
tokenizer=tokenizer,
max_length=1024,
batch_size=32,
train_test_split=0.1,
)
# Create trainer
trainer = AdamTrainer(
model=model,
config=training_config,
train_dataloader=train_loader,
eval_dataloader=eval_loader,
tokenizer=tokenizer,
)
# Start training
trainer.train()
| Model | Parameters | Layers | Heads | Embedding Dim | Context Length |
|---|---|---|---|---|---|
| adam-slm-small | ~50M | 8 | 8 | 512 | 1024 |
| adam-slm-base | ~124M | 12 | 12 | 768 | 2048 |
| adam-slm-large | ~350M | 24 | 16 | 1024 | 4096 |
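The named configurations in the table are the same ones accepted by the `get_config` helper from the quick start above. As a rough sketch (reusing those imports, not additional documented API), you could sanity-check the parameter counts locally like this:

```python
from adam_slm.models import AdamSLM, get_config

for name in ("adam-slm-small", "adam-slm-base", "adam-slm-large"):
    model = AdamSLM(get_config(name))
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")
```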
Train on Shakespeare dataset:
python examples/train_adam_slm.py \
--model_size adam-slm-small \
--training_config small \
--batch_size 16 \
--max_steps 10000 \
--use_wandb
- Grouped Query Attention: Reduces memory usage by sharing key-value heads across groups of query heads (sketched below)
- Rotary Position Embeddings: Better handling of positional information
- Causal Masking: Proper autoregressive generation
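To make the key-value head sharing concrete, here is a minimal PyTorch sketch of grouped-query attention. It is illustrative only; shapes, head counts, and tensor names are not taken from A.D.A.M. SLM's implementation:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: 8 query heads share 2 key/value heads (4 queries per KV head)
batch, seq_len, n_q_heads, n_kv_heads, head_dim = 2, 16, 8, 2, 64

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)  # only 2 KV heads are stored
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Expand the KV heads so each group of query heads attends to the same keys/values
group_size = n_q_heads // n_kv_heads
k = k.repeat_interleave(group_size, dim=1)  # -> (batch, n_q_heads, seq_len, head_dim)
v = v.repeat_interleave(group_size, dim=1)

# Standard scaled dot-product attention with a causal mask
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```

The memory savings come from storing and caching only `n_kv_heads` key/value tensors per layer instead of one per query head.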
- SwiGLU Activation: `SwiGLU(x) = W2(swish(W1·x) * W3·x)` (see the sketch after this list)
- Gated Linear Units: Improved information flow
- Proper Initialization: Scaled initialization for stable training
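A minimal PyTorch sketch of the three-projection SwiGLU block described above. This is a generic illustration, not A.D.A.M. SLM's own module, and the dimensions are arbitrary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFeedForward(nn.Module):
    """Feed-forward block computing W2(silu(W1 x) * W3 x)."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_hidden, bias=False)  # gate projection
        self.w3 = nn.Linear(d_model, d_hidden, bias=False)  # value projection
        self.w2 = nn.Linear(d_hidden, d_model, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

ffn = SwiGLUFeedForward(d_model=512, d_hidden=1376)
print(ffn(torch.randn(2, 16, 512)).shape)  # torch.Size([2, 16, 512])
```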
- RMSNorm: `y = x / sqrt(mean(x²) + ε) * scale` (sketched after this list)
- Pre-normalization: Applied before attention and FFN layers
- Stable Training: Better gradient flow than LayerNorm
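The formula above translates directly into a few lines of PyTorch. The sketch below is a generic RMSNorm module for illustration, not the project's own class:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: x / sqrt(mean(x^2) + eps) * scale."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.scale = nn.Parameter(torch.ones(dim))  # learnable per-feature gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.scale

norm = RMSNorm(512)
print(norm(torch.randn(2, 16, 512)).shape)  # torch.Size([2, 16, 512])
```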
- Mixed Precision: Up to 2x faster training
- Gradient Checkpointing: Reduced memory usage
- Efficient Data Loading: Optimized data pipeline
- KV-Cache: Faster autoregressive generation (see the decoding sketch after this list)
- Model Compilation: PyTorch 2.0 compile support
- Batch Processing: Efficient batch inference
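To make the KV-cache idea concrete, here is a minimal single-head decoding sketch (purely illustrative, not A.D.A.M. SLM's inference code): each step appends the new token's key/value to a cache and attends over the cached history instead of re-encoding the whole prefix.

```python
import torch
import torch.nn.functional as F

head_dim = 64
cache_k = torch.empty(1, 0, head_dim)  # cached keys, grows by one position per step
cache_v = torch.empty(1, 0, head_dim)  # cached values

for step in range(4):
    # The new token's query/key/value (random stand-ins for real projections)
    q, k_new, v_new = (torch.randn(1, 1, head_dim) for _ in range(3))

    # Append the new key/value instead of recomputing the whole prefix
    cache_k = torch.cat([cache_k, k_new], dim=1)
    cache_v = torch.cat([cache_v, v_new], dim=1)

    # The single newest query attends over the full cached history
    out = F.scaled_dot_product_attention(q, cache_k, cache_v)
    print(step, out.shape, cache_k.shape)
```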
- No External Dependencies: Train and run models entirely on your hardware
- Privacy Guaranteed: All data processing happens locally
- Offline Capable: Works without internet connection
- No Subscription Fees: No ongoing costs for API usage
- Full Control: You own your models and data
- Local Storage: All models, training data, and conversations stored locally
- No Telemetry: No data sent to external servers
- Air-Gap Compatible: Can run in completely isolated environments
- GDPR Compliant: No external data processing
from adam_slm.tokenization import BPETokenizer
# Train custom BPE tokenizer locally
tokenizer = BPETokenizer()
tokenizer.train(texts, vocab_size=32000)
# Optimize for local inference (no API calls)
inference.optimize_for_inference()
# Get local performance stats
stats = inference.get_stats()
print(f"Local tokens/sec: {stats['tokens_per_second']}")
# Save configuration
config.to_json("model_config.json")
# Load configuration
config = AdamSLMConfig.from_json("model_config.json")
Q: Do I need OpenAI, Anthropic, or other API keys?
A: No! A.D.A.M. SLM is a completely self-contained system. You train and run your own models locally.

Q: Does it need an internet connection?
A: No! Once installed, A.D.A.M. SLM works completely offline. All inference and training happens on your local machine.

Q: Does it cost anything to use?
A: No! This is open-source software under AGPL-3.0. No ongoing costs, no API usage fees, no subscriptions.

Q: Can I use it commercially?
A: Commercial use is allowed, but if you modify the code or use it in a web service, you must share your source code under the same AGPL-3.0 license. This ensures improvements benefit the entire community.

Q: Where is my data stored?
A: Locally! All models, training data, conversations, and configurations are stored on your machine.

Q: Can it run in an air-gapped environment?
A: Yes! A.D.A.M. SLM is designed to work in completely isolated environments without external dependencies.
We welcome contributions! Please see our Contributing Guide for details.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.
Key points of the AGPL-3.0 license:
- 🔒 Source code disclosure: Any modifications must be shared under the same license
- 🌐 Network copyleft: Even web services using this code must provide source code to users
- ⚖️ Commercial use allowed: But derivative works must also be AGPL-3.0 licensed
- 📤 Share improvements: All enhancements must be contributed back to the community
- Built upon concepts from "LLMs from Scratch" by Sebastian Raschka
- Inspired by modern LLM architectures (LLaMA, GPT, etc.)
- Uses tiktoken for tokenization
- Powered by PyTorch
- 🐛 Report Issues
- 💬 Discussions
- 📧 Email: [email protected]
A.D.A.M. - Where sophistication meets simplicity in language modeling! 🎯
🔒 Completely Self-Contained • 🌐 No API Keys Required • 🏠 Runs Locally • 🔐 Privacy First