Build the most comprehensive open-source library of AI research skills, enabling AI agents to autonomously conduct experiments from hypothesis to deployment.
Target: 86 comprehensive skills — achieved ✅
| Metric | Current | Target |
|---|---|---|
| Skills | 86 (high-quality, standardized YAML) | 86 ✅ |
| Avg Lines/Skill | 420 lines (focused + progressive disclosure) | 200-500 lines |
| Documentation | ~130,000 lines total (SKILL.md + references) | 100,000+ lines |
| Gold Standard Skills | 65 with comprehensive references | 50+ ✅ |
| Coverage | Autoresearch, Ideation, Paper Writing, Architecture, Tokenization, Fine-Tuning, Data Processing, Post-Training, Safety, Distributed, Infrastructure, Optimization, Evaluation, Inference, Agents, RAG, Multimodal, MLOps, Observability, Prompt Engineering, Emerging Techniques | Full Lifecycle ✅ |
Status: Core model architectures covered
Completed Skills:
- ✅ Megatron-Core - NVIDIA's framework for training models from 2B to 462B parameters
- ✅ LitGPT - Lightning AI's 20+ clean LLM implementations
- ✅ Mamba - State-space models with O(n) complexity
- ✅ RWKV - RNN+Transformer hybrid, infinite context
- ✅ NanoGPT - Educational GPT in ~300 lines by Karpathy
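The common thread across these architectures is the attention mechanism (which Mamba and RWKV replace with linear-time recurrent forms). A dependency-free sketch of scaled dot-product attention on toy hand-picked matrices (the values are illustrative, not from any real model):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention on plain nested lists.

    Q, K, V: lists of vectors (seq_len x d). Returns seq_len x d outputs,
    each a softmax-weighted mixture of the value vectors.
    """
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy example: 2 tokens, head dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
result = attention(Q, K, V)
```

Each output row is a convex combination of the value rows; the quadratic cost of computing all pairwise scores is exactly what Mamba's and RWKV's O(n) formulations avoid.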
Status: Essential tokenization frameworks covered
Completed Skills:
- ✅ HuggingFace Tokenizers - Rust-based, BPE/WordPiece/Unigram
- ✅ SentencePiece - Language-independent tokenization
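Both frameworks implement subword algorithms such as BPE. A minimal sketch of the BPE training loop on a tiny made-up corpus (real tokenizers add byte fallback, pre-tokenization, and tie-breaking rules on top of this core idea):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus (symbols, frequency)."""
    pairs = Counter()
    for symbols, freq in words:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = []
    for symbols, freq in words:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append((out, freq))
    return merged

# Toy corpus: characters as the initial vocabulary.
corpus = [(list("lower"), 5), (list("lowest"), 2), (list("newer"), 6)]
merges = []
for _ in range(3):
    pair = most_frequent_pair(corpus)
    merges.append(pair)
    corpus = merge_pair(corpus, pair)
```

The learned merge list ("we", then "wer", then "lo" on this corpus) is the entire trained artifact: encoding a new word just replays the merges in order.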
Status: Core fine-tuning frameworks covered
Completed Skills:
- ✅ Axolotl - YAML-based fine-tuning with 100+ models
- ✅ LLaMA-Factory - No-code fine-tuning through a WebUI
- ✅ Unsloth - 2x faster QLoRA fine-tuning
- ✅ PEFT - Parameter-efficient fine-tuning with LoRA, QLoRA, DoRA, 25+ methods
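The LoRA idea shared by these frameworks is to freeze the base weight W and train a low-rank update W + (α/r)·BA. A sketch of the arithmetic on plain lists, showing why rank-16 adapters on a 4096x4096 projection train under 1% of the full matrix's parameters (the dimensions are illustrative, matching common 7B-class hidden sizes):

```python
def lora_param_counts(d_in, d_out, r):
    """Parameters for a full update vs a rank-r LoRA update of a d_out x d_in weight."""
    full = d_out * d_in
    lora = r * (d_in + d_out)   # A is r x d_in, B is d_out x r
    return full, lora

def apply_lora(W, A, B, alpha, r):
    """Effective weight W + (alpha / r) * B @ A, on plain nested lists."""
    scale = alpha / r
    d_out, d_in = len(W), len(W[0])
    delta = [[scale * sum(B[i][k] * A[k][j] for k in range(r)) for j in range(d_in)]
             for i in range(d_out)]
    return [[W[i][j] + delta[i][j] for j in range(d_in)] for i in range(d_out)]

full, lora = lora_param_counts(4096, 4096, 16)

# Tiny rank-1 example: the update touches only the rows B selects.
W = [[0.0, 0.0], [0.0, 0.0]]
merged = apply_lora(W, A=[[1.0, 2.0]], B=[[1.0], [0.0]], alpha=1, r=1)
```

QLoRA keeps the same adapter math but stores the frozen W in 4-bit, which is why it combines naturally with the quantization skills in Phase 8.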
Status: Distributed data processing covered
Completed Skills:
- ✅ Ray Data - Distributed ML data processing
- ✅ NeMo Curator - GPU-accelerated data curation
Status: RLHF and alignment techniques covered
Completed Skills:
- ✅ TRL Fine-Tuning - Transformer Reinforcement Learning
- ✅ GRPO-RL-Training - Group Relative Policy Optimization (gold standard)
- ✅ OpenRLHF - Full RLHF pipeline with Ray + vLLM
- ✅ SimPO - Simple Preference Optimization
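GRPO's key simplification over PPO-style RLHF is dropping the learned value network: it samples a group of completions per prompt and normalizes each reward against the group. A sketch of that advantage computation (one common formulation; the reward values are made up):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: z-score each reward within its sampled group.

    The group mean serves as the baseline, so no critic model is trained.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Four completions sampled for one prompt, each scored by a reward model.
adv = group_relative_advantages([1.0, 0.5, 0.5, 0.0])
```

Advantages sum to zero within the group, so above-average completions are reinforced at the direct expense of below-average ones.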
Status: Core safety frameworks covered
Completed Skills:
- ✅ Constitutional AI - AI-driven self-improvement via principles
- ✅ LlamaGuard - Safety classifier for LLM inputs/outputs
- ✅ NeMo Guardrails - Programmable guardrails with Colang
- ✅ Prompt Guard - Meta's 86M-parameter prompt-injection and jailbreak detector
Status: Major distributed training frameworks covered
Completed Skills:
- ✅ DeepSpeed - Microsoft's ZeRO optimization
- ✅ PyTorch FSDP - Fully Sharded Data Parallel
- ✅ Accelerate - HuggingFace's distributed training API
- ✅ PyTorch Lightning - High-level training framework
- ✅ Ray Train - Multi-node orchestration
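A rough sketch of the memory arithmetic behind ZeRO's three stages, using the common mixed-precision accounting of 16 bytes per parameter (2 for fp16 weights, 2 for fp16 grads, 12 for fp32 Adam states). The numbers are approximations that ignore activations and fragmentation:

```python
def zero_memory_gb(params_b, n_gpus, stage):
    """Approximate per-GPU memory (GB) for model states under ZeRO.

    Assumes mixed precision: 2 bytes each for fp16 params and grads,
    12 bytes/param for fp32 Adam state (master weights + two moments).
    """
    p = params_b * 1e9
    params, grads, optim = 2 * p, 2 * p, 12 * p
    if stage >= 1:
        optim /= n_gpus      # ZeRO-1 shards optimizer states
    if stage >= 2:
        grads /= n_gpus      # ZeRO-2 also shards gradients
    if stage >= 3:
        params /= n_gpus     # ZeRO-3 also shards the parameters themselves
    return (params + grads + optim) / 1e9

# A 7B model on 8 GPUs: unsharded model states vs ZeRO-3.
baseline = zero_memory_gb(7, 8, stage=0)   # ~112 GB/GPU: does not fit on one card
zero3 = zero_memory_gb(7, 8, stage=3)      # ~14 GB/GPU
```

PyTorch FSDP implements essentially the ZeRO-3 column of this table; the stage choice is the main configuration decision these skills walk through.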
Status: Core optimization techniques covered
Completed Skills:
- ✅ Flash Attention - 2-4x faster attention with memory efficiency
- ✅ bitsandbytes - 8-bit/4-bit quantization
- ✅ GPTQ - 4-bit post-training quantization
- ✅ AWQ - Activation-aware weight quantization
- ✅ HQQ - Half-Quadratic Quantization without calibration data
- ✅ GGUF - llama.cpp quantization format for CPU/Metal inference
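The quantization schemes above differ in calibration and grouping, but the core round-trip is the same. A sketch of symmetric absmax int8 quantization, the simplest form of what 8-bit schemes like bitsandbytes build on (the weight values are made up):

```python
def quantize_absmax(xs):
    """Symmetric absmax int8 quantization: scale so the largest magnitude maps to 127."""
    scale = max(abs(x) for x in xs) / 127.0
    q = [round(x / scale) for x in xs]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.0, -0.07]
q, scale = quantize_absmax(weights)
restored = dequantize(q, scale)
```

The round-trip error is bounded by half a quantization step (scale / 2); GPTQ, AWQ, and HQQ all exist to shrink that error further at 4 bits, where the steps are 16x coarser.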
Status: Standard benchmarking framework available
Completed Skills:
- ✅ lm-evaluation-harness - EleutherAI's standard for benchmarking LLMs
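For multiple-choice tasks like HellaSwag, the harness scores each answer by the log-probability the model assigns it, typically length-normalized (its `acc_norm` metric). A toy sketch of that scoring loop with invented log-probabilities, just to show why normalization matters for choices of different lengths:

```python
def score_multiple_choice(items):
    """Loglikelihood-based multiple-choice scoring: pick the answer with the
    highest length-normalized log-probability, then report accuracy."""
    correct = 0
    for choice_logprobs, choice_lengths, gold in items:
        normed = [lp / n for lp, n in zip(choice_logprobs, choice_lengths)]
        pred = max(range(len(normed)), key=normed.__getitem__)
        correct += pred == gold
    return correct / len(items)

# Two toy items: (per-choice total logprob, per-choice token count, gold index).
acc = score_multiple_choice([
    ([-12.0, -3.0, -9.0], [6, 2, 3], 1),   # normalization rescues the short answer
    ([-3.0, -8.0, -6.0], [3, 4, 2], 2),    # model still prefers the wrong choice
])
```

No generation is needed for this task type, which is why these benchmarks run quickly even on base models.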
Status: Production inference frameworks covered
Completed Skills:
- ✅ vLLM - High-throughput LLM serving with PagedAttention
- ✅ TensorRT-LLM - NVIDIA's fastest inference
- ✅ llama.cpp - CPU/Apple Silicon inference
- ✅ SGLang - Structured generation with RadixAttention
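vLLM's PagedAttention manages the KV cache like an OS manages virtual memory: each sequence gets fixed-size blocks from a shared pool on demand instead of a reservation for its maximum length. A toy sketch of just the bookkeeping (block size, pool, and class names are illustrative, not vLLM's internals):

```python
BLOCK_SIZE = 4  # tokens per KV-cache block, kept small for illustration

class BlockTable:
    """Maps a sequence's logical token positions onto physical cache blocks."""
    def __init__(self, free_blocks):
        self.free = list(free_blocks)   # shared pool of physical block ids
        self.blocks = []                # logical order -> physical block id
        self.length = 0                 # tokens cached so far

    def append_token(self):
        if self.length % BLOCK_SIZE == 0:        # current block full (or none yet)
            self.blocks.append(self.free.pop())  # grab one block from the pool
        self.length += 1

    def physical_slot(self, pos):
        """Map a logical token position to (physical block id, offset within block)."""
        return self.blocks[pos // BLOCK_SIZE], pos % BLOCK_SIZE

pool = list(range(100))
seq = BlockTable(pool)
for _ in range(9):       # caching 9 tokens needs ceil(9/4) = 3 blocks
    seq.append_token()
```

Because blocks are allocated lazily and can be shared between sequences with a common prefix, memory waste stays near zero, which is where the throughput gains come from.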
Status: Cloud infrastructure and orchestration covered
Completed Skills:
- ✅ Modal - Serverless GPU cloud with Python-native API, T4-H200 on-demand
- ✅ SkyPilot - Multi-cloud orchestration across 20+ providers with spot recovery
- ✅ Lambda Labs - Reserved/on-demand GPU cloud with H100/A100, persistent filesystems
Status: Major agent frameworks covered
Completed Skills:
- ✅ LangChain - Most popular agent framework, 500+ integrations
- ✅ LlamaIndex - Data framework for LLM apps, 300+ connectors
- ✅ CrewAI - Multi-agent orchestration with role-based collaboration
- ✅ AutoGPT - Autonomous AI agent platform with visual workflow builder
Status: Core RAG and vector database skills covered
Completed Skills:
- ✅ Chroma - Open-source embedding database
- ✅ FAISS - Facebook's similarity search, billion-scale
- ✅ Sentence Transformers - 5000+ embedding models
- ✅ Pinecone - Managed vector database
- ✅ Qdrant - High-performance Rust vector search with hybrid filtering
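Stripped of indexing, every vector database answers the same query: rank stored embeddings by similarity to the query embedding. A brute-force sketch with toy 3-dimensional vectors (real systems use hundreds of dimensions and ANN indexes like HNSW or IVF to avoid the linear scan):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, docs, k=2):
    """Exhaustive nearest-neighbor search by cosine similarity."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy corpus of pre-embedded documents (ids and vectors are made up).
docs = {
    "intro_rag": [0.9, 0.1, 0.0],
    "gpu_setup": [0.0, 0.2, 0.9],
    "retrieval": [0.8, 0.3, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], docs, k=2)
```

In a RAG pipeline, Sentence Transformers produces the embeddings, the store ranks them like this, and the top hits are stuffed into the LLM prompt as context.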
Status: Comprehensive multimodal frameworks covered
Completed Skills:
- ✅ CLIP - OpenAI's vision-language model
- ✅ Whisper - Robust speech recognition, 99 languages
- ✅ LLaVA - Vision-language assistant, GPT-4V level
- ✅ Stable Diffusion - Text-to-image generation via HuggingFace Diffusers
- ✅ Segment Anything (SAM) - Meta's zero-shot image segmentation with points/boxes/masks
- ✅ BLIP-2 - Vision-language pretraining with Q-Former, image captioning, VQA
- ✅ AudioCraft - Meta's MusicGen/AudioGen for text-to-music and text-to-sound
Status: Advanced optimization techniques covered (merged into Phase 8)
Note: HQQ and GGUF skills have been completed and merged into Phase 8: Optimization.
Status: Core MLOps and LLM observability covered
Completed Skills:
- ✅ MLflow - Open-source MLOps platform for tracking experiments
- ✅ TensorBoard - Visualization and experiment tracking
- ✅ Weights & Biases - Experiment tracking and collaboration
- ✅ LangSmith - LLM observability, tracing, evaluation
- ✅ Phoenix - Open-source AI observability with OpenTelemetry tracing
Status: Core prompt engineering and multi-agent tools covered
Completed Skills:
- ✅ DSPy - Declarative prompt optimization and LM programming
- ✅ Guidance - Constrained generation and structured prompting
- ✅ Instructor - Structured output with Pydantic models
- ✅ Outlines - Structured text generation with regex and grammars
- ✅ CrewAI - Multi-agent orchestration (completed in Phase 11)
- ✅ AutoGPT - Autonomous agents (completed in Phase 11)
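Outlines and Guidance share one core trick: at every decoding step, mask out tokens that would make the output invalid, so the final string is well-formed by construction. A toy character-level sketch of that idea restricted to a fixed set of labels (the greedy "model" and label set are invented for illustration):

```python
def constrained_choice(allowed, sample_token):
    """Constrained generation over a finite label set: at each step the model may
    only emit characters that keep the output a prefix of some allowed string."""
    out = ""
    while out not in allowed:
        valid_next = {s[len(out)] for s in allowed
                      if s.startswith(out) and len(s) > len(out)}
        out += sample_token(valid_next)
    return out

# A "model" that always prefers 'z' but is forced onto valid continuations.
greedy = lambda valid: "z" if "z" in valid else min(valid)
label = constrained_choice({"positive", "negative", "neutral"}, greedy)
```

Real implementations compile a regex or grammar into a finite-state machine over the tokenizer's vocabulary, but the guarantee is the same: the model cannot produce an invalid output, no matter what it "wants" to say.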
Status: All extended multimodal skills complete, merged into Phase 13
Note: BLIP-2, SAM, and AudioCraft have been completed and merged into Phase 13: Multimodal.
Status: Core emerging techniques covered
Completed Skills:
- ✅ MoE Training - Mixture of Experts with DeepSpeed/HuggingFace
- ✅ Model Merging - mergekit, SLERP, and model composition
- ✅ Long Context - RoPE extensions, ALiBi, and context scaling
- ✅ Speculative Decoding - Medusa, Lookahead, and draft models for faster inference
- ✅ Knowledge Distillation - MiniLLM, reverse KLD, teacher-student training
- ✅ Model Pruning - Wanda, SparseGPT, and structured pruning
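As a taste of the merging skill: mergekit's SLERP option interpolates along the hypersphere between two weight tensors rather than averaging straight through it, preserving the parents' norm structure. A sketch of the formula on small vectors (the fallback threshold is an illustrative choice):

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between two weight vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    dot = sum(x * y for x, y in zip(a, b)) / (na * nb)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    if theta < 1e-6:                      # near-parallel: fall back to plain lerp
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * theta) / math.sin(theta)
    wb = math.sin(t * theta) / math.sin(theta)
    return [wa * x + wb * y for x, y in zip(a, b)]

# Orthogonal unit vectors: SLERP at t=0.5 keeps unit norm; plain averaging
# would shrink the result to norm 1/sqrt(2).
merged = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

In practice this is applied tensor-by-tensor across two checkpoints, often with a per-layer schedule for t.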
Want to help us achieve these goals?
- Pick a skill from the roadmap - Comment on GitHub Discussions to claim it
- Follow the contribution guide - Use our template and quality standards
- Submit your PR - We review within 48 hours
Every skill on the roadmap has been completed! The library now covers the full AI research lifecycle:
- ✅ Phase 1-10: Core ML infrastructure (Architecture, Tokenization, Fine-Tuning, Data Processing, Post-Training, Safety, Distributed Training, Optimization, Evaluation, Inference)
- ✅ Phase 10.5: Infrastructure (Modal, SkyPilot, Lambda Labs)
- ✅ Phase 11-12: Applications (Agents, RAG)
- ✅ Phase 13: Multimodal (CLIP, Whisper, LLaVA, Stable Diffusion, SAM, BLIP-2, AudioCraft)
- ✅ Phase 14-16: Advanced (Optimization, MLOps & Observability, Prompt Engineering)
- ✅ Phase 17-18: Extended (Extended Multimodal, Emerging Techniques)
While the roadmap is complete, the library will continue to evolve with:
- Updates: Keeping existing skills current with latest versions
- Community contributions: Additional skills from contributors
- Emerging tools: New frameworks and techniques as they mature
Quality over Quantity: Each skill must provide real value with comprehensive guidance, not just links to docs. We aim for 300+ lines of expert-level content per skill, with real code examples, troubleshooting guides, and production-ready workflows.