A structured learning path and project portfolio for software engineers to master Large Language Models. This repository moves beyond theory, focusing on the practical application of LLMs through prompt engineering, API integration, and building scalable applications.
💡 For Developers, By a Developer: This isn't just a list of concepts. It's a hands-on curriculum designed to take you from foundational understanding to building production-ready LLM applications.
The field of Large Language Models is moving fast. This repository provides a structured path to not just keep up, but to become proficient. It's organized into a 28-step curriculum that balances deep theoretical understanding with immediate, practical application.
Whether you're building AI-powered features into your product, automating workflows, or launching a new AI-based service, this guide will help you develop the necessary skills.
The curriculum is divided into six logical parts:
Part | Focus Area | What You'll Achieve |
---|---|---|
1. Theory Foundations | Core Concepts | Understand how LLMs work under the hood |
2. Prompt Engineering | Communication | Master the art of guiding LLMs to desired outputs |
3. Practical Applications | API Integration | Build functional applications using various LLM APIs |
4. Advanced Topics | Production Systems | Implement RAG, work with vector DBs, and build agents |
5. Build Projects | Portfolio Development | Create showcase projects for your portfolio |
6. Next Steps | Career Planning | Define your specialization and next learning goals |
llm-foundations/
├── 01-theory-foundations/ # Days 1-8: How LLMs work
├── 02-prompt-engineering/ # Days 9-16: Effective prompting
├── 03-practical-applications/ # Days 17-18: API integration & simple apps
├── 04-advanced-topics/ # Days 19-27: RAG, vector DBs, agents
├── 05-build-projects/ # Portfolio project development
├── 06-reflection-next-steps/ # Day 28: Planning your path forward
├── resources/ # Cheatsheets, tools, reading lists
└── quizzes/ # Self-assessment tools
To make the most of this curriculum, you'll want to be familiar with these core technologies and have these tools ready.
- Python Programming: Intermediate proficiency (functions, classes, decorators, async/await)
- API Concepts: REST APIs, HTTP requests, authentication (API keys)
- Basic Command Line: Navigating directories, running scripts, managing environments
- Git & GitHub: Cloning repositories, making commits, creating pull requests
Category | Tools & Services | Description |
---|---|---|
Development | Python 3.10+, VS Code, Jupyter Notebook | Core coding environment |
API Access | OpenAI, Anthropic, Cohere | Accounts for LLM API access (some offer free credits) |
Open Source LLMs | Ollama, LM Studio | Run models locally on your machine |
Vector Databases | Pinecone, Chroma, Weaviate | For RAG implementations (free tiers available) |
UI Frameworks | Streamlit, Gradio | For building web interfaces for your LLM apps |
Prompt Tools | PromptHero, FlowGPT | For prompt inspiration and testing |
- Set Up Your Environment: Install Python, create a virtual environment, and install key packages (`openai`, `langchain`, `streamlit`)
- Get API Access: Sign up for OpenAI/Anthropic and get your API keys
- Install Ollama: Follow the Ollama installation guide to run models locally
- Clone This Repo: `git clone https://github.com/JawherKl/llm-foundations.git`
- Explore the Structure: Review the repository organization and learning path
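Once the setup steps above are done, a quick self-check can confirm the environment is usable. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, the package list mirrors the one in the setup step, and the environment variable names `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are assumed because they are the names those providers' SDKs conventionally read.

```python
import importlib.util
import os
import sys

# Assumptions for illustration: package names from the setup step above and
# the environment variable names conventionally read by the provider SDKs.
REQUIRED_PACKAGES = ("openai", "langchain", "streamlit")
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")
MIN_PYTHON = (3, 10)


def check_environment(env=None, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []

    # The prerequisites table asks for Python 3.10 or newer.
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )

    # find_spec returns None when a top-level package is not installed.
    for name in REQUIRED_PACKAGES:
        if importlib.util.find_spec(name) is None:
            problems.append(f"package not installed: {name}")

    # One provider key is enough to start making API calls.
    if not any(env.get(var) for var in API_KEY_VARS):
        problems.append("no API key found; set one of: " + ", ".join(API_KEY_VARS))

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("MISSING:", issue)
    print("Environment ready." if not issues else f"{len(issues)} issue(s) to fix.")
```

Run it inside the activated virtual environment; otherwise packages installed there will be reported as missing.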
- Start Small: Begin with simple API calls before tackling complex frameworks
- Use Free Tiers: Some LLM API providers offer free credits or a free tier for new accounts; check each provider's current terms before relying on them
- Experiment Locally: Use Ollama with smaller models (like Llama 3) for experimentation without API costs
- Document Your Learning: Keep notes on what works and what doesn't - this becomes valuable reference material
- Join Communities: Participate in Discord servers and subreddits like r/LocalLLaMA, r/LangChain, and AI developer communities
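To illustrate the "start small" and "experiment locally" tips, here is a minimal sketch of a single request to a locally running Ollama server using only the Python standard library. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that a model tagged `llama3` has already been pulled; the helper names `build_request` and `generate` are hypothetical and not part of this repository.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3"):
    """Build the HTTP request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3", timeout=120):
    """Send the prompt to the local server and return the generated text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With streaming disabled, the generated text is in the "response" field.
    return body["response"]


if __name__ == "__main__":
    try:
        print(generate("Explain what a token is in one sentence."))
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
```

Keeping the request construction separate from the network call makes the payload easy to inspect before adding a framework such as LangChain on top.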
- DeepLearning.AI Short Courses - Free courses on LLMs, ChatGPT, and LangChain
- Andrew Ng's YouTube Channel - Excellent explanations of AI concepts
- Full Stack LLM Bootcamp - Comprehensive video series on building LLM applications
- Hugging Face Course - Great for understanding transformers and open-source models
- Start with Theory: Begin with `01-theory-foundations/README.md`
- Follow the Path: Progress through each section in order
- Build as You Learn: Implement projects in `05-build-projects/` as you acquire relevant skills
- Assess Your Knowledge: Use the quizzes to validate your understanding
- Skim the Theory: Review `01-theory-foundations/04-key-terminologies.md`
- Master Prompting: Study `02-prompt-engineering/` thoroughly
- Pick a Project: Choose a project from `05-build-projects/project-ideas.md`
- Learn as You Build: Reference specific sections as needed for your project
- Assessment First: Take the `quizzes/final-assessment.md` to identify knowledge gaps
- Targeted Learning: Focus on sections where you need reinforcement
- Contribute: Share your expertise by improving content or adding new examples
This curriculum prepares you to work with:
- LLM APIs: OpenAI GPT, Anthropic Claude, Cohere, OpenRouter
- Frameworks: LangChain, LlamaIndex, Haystack
- Vector Databases: Pinecone, Chroma, Weaviate, Qdrant
- UI Tools: Streamlit, Gradio, Chainlit
- Open Source Models: Llama 2/3, Mistral, Phi via Ollama
- Development: Python, Jupyter Notebooks, Docker
We welcome contributions! Here's how you can help:
- Fix Errors: Found a mistake? Submit a PR with corrections
- Add Examples: Share your prompt engineering examples or code samples
- Improve Explanations: Help make complex concepts more accessible
- Share Projects: Add your LLM projects to the build-projects section
- Suggest Resources: Recommend great learning materials
Please read our Contributing Guidelines before submitting a pull request.
This repository synthesizes knowledge from a wide array of exceptional resources. The following books, articles, papers, and documentation were instrumental in its creation and serve as recommended reading for those who wish to dive deeper.
- [1706.03762] Attention Is All You Need - Vaswani et al. (2017) - The seminal paper introducing the Transformer architecture.
- [1810.04805] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - Devlin et al. (2018) - Introduced the encoder-only Transformer and masked language modeling.
- [2005.14165] Language Models are Few-Shot Learners (GPT-3 Paper) - Brown et al. (2020) - Demonstrated the remarkable scaling and few-shot abilities of large autoregressive models.
- [1910.10683] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5 Paper) - Raffel et al. (2019) - Reframed all NLP tasks into a text-to-text format.
- "Natural Language Processing with Transformers" by Tunstall, von Werra, & Wolf - The definitive practical guide to using the Hugging Face ecosystem.
- "Transformers for Natural Language Processing" by Denis Rothman - A comprehensive guide to Transformer models.
- "Hands-On Large Language Models" by Jay Alammar & Maarten Grootendorst - A very practical, project-based approach.
- "The OpenAI API Book" by Michael King - A great resource focused on practical API usage.
- Jay Alammar's Blog (The Illustrated Transformer) - Legendary visual explanations of complex ML concepts.
- Lil'Log (Prompt Engineering) by Lilian Weng - In-depth and technical overview of prompt engineering techniques.
- Andrej Karpathy's Blog (AI for Full Self-Driving) - While focused on AI for cars, his writing on Software 2.0 and neural network training is foundational.
- Simon Willison's Blog (LLM tag) - A prolific writer on practical LLM applications and emerging patterns.
- EMAXX.IO (RTF and CRISPE Frameworks) - Excellent breakdown of prompt engineering frameworks.
- OpenAI API Documentation - The source for all things GPT, embeddings, and fine-tuning on OpenAI's platform.
- Anthropic API Documentation - Comprehensive guide to using Claude models.
- LangChain Documentation - Essential for building complex, multi-step LLM applications.
- LlamaIndex Documentation - The best resource for learning about Retrieval-Augmented Generation (RAG).
- Hugging Face Transformers Documentation - The go-to resource for working with open-source models.
- Full Stack LLM Bootcamp - A free, excellent video series on building production LLM apps.
- DeepLearning.AI Short Courses - Specifically "ChatGPT Prompt Engineering for Developers" and "LangChain for LLM Application Development".
- CS324 - Large Language Models - Stanford's course on LLMs, covering fundamentals and advanced topics.
- r/LocalLLaMA - The central Reddit community for open-source LLMs.
- Hugging Face Discord - A vibrant community for discussion and help with open-source models.
- LangChain Discord - Great for getting help with the LangChain framework.
- AI Engineer Summit Talks (YouTube) - Talks from practitioners building the cutting edge of LLM applications.
This bibliography represents a living list. If you have a resource that was foundational to your understanding, please consider contributing to this section.
This project is licensed under the MIT License - see the LICENSE file for details.
- Inspired by various learning paths and roadmaps
- Built upon the work of researchers and developers in the LLM space
- Thanks to all contributors who help improve this resource
⭐ If you find this repository helpful, please give it a star! This helps others discover it and encourages further development.
Ready to begin your LLM journey? Start here: Theory Foundations
This repository is maintained by JawherKl. For questions or suggestions, please open an issue or discussion.