Open Source LLMs

A comprehensive directory of open source large language models, comparing architectures, benchmark performance, licensing, and deployment options

📑 Table of Contents

  • Foundation Models
  • Code Generation Models
  • Small Language Models
  • Reasoning Models
  • Multimodal Models
  • Novel Architectures
  • Specialized Models
  • Research Models
  • Deployment Platforms
  • Instruction-Tuned Models
  • Legal
  • License

Foundation Models

  • Aya - Cohere's state-of-the-art massively multilingual LLM covering 101 languages including 50+ previously underserved languages. Result of year-long collaboration with 3,000 researchers across 119 countries. (Read more) Multilingual Collaborative Open Source
  • Aya 23 - Cohere's open-weight multilingual language model supporting 23 languages in 8B and 35B parameter variants, outperforming Gemma, Mistral, and Mixtral on extensive discriminative and generative tasks through depth-focused training. (Read more) Multilingual Open Weights Instruction Tuned
  • Command R+ - Cohere's 104B parameter open-weights model optimized for RAG and multi-step tool use, supporting 128K context window across 10 languages with 50% higher throughput and 25% lower latency than previous versions. (Read more) Rag Tool Use Multilingual
  • DBRX - Databricks' 132B parameter open-source LLM using fine-grained mixture-of-experts architecture with 36B active parameters, trained on 12T tokens and outperforming GPT-3.5 while achieving 2x faster inference than LLaMA2-70B. (Read more) Mixture Of Experts Open Source Enterprise
  • DeepSeek V3 - A cutting-edge open-source LLM with 685 billion parameters released in December 2024 by Hangzhou-based DeepSeek AI. Features mixture-of-experts architecture with up to 128K token context window, excelling at reasoning, coding, and complex problem-solving tasks. (Read more) Mixture Of Experts Reasoning Coding
  • EXAONE 3.0 - Korea's first open-source bilingual AI language model from LG AI Research, featuring a 7.8B parameter model trained on 8 trillion tokens with exceptional performance in both Korean and English real-world applications. (Read more) Bilingual Korean Multilingual
  • Falcon 3 - TII's latest open-source LLM family with models in 1B, 3B, 7B, and 10B sizes trained on 14 trillion tokens, featuring the world's most powerful small AI models under 13B parameters and designed to run efficiently on laptops. (Read more) Small Models Efficient Multilingual
  • Gemma 2 - Google DeepMind's next-generation open language model family in 2B, 9B, and 27B sizes trained on up to 13T tokens, with the 27B variant ranking as the highest open model on Chatbot Arena and outperforming Llama 3 70B. (Read more) Apache Licensed High Performance Efficient
  • GLM-4 (Zhipu AI) - Zhipu AI's fourth-generation bilingual (Chinese-English) open-source language model, with GLM-4.7 achieving top rankings in early 2026 with exceptional coding (94.2 HumanEval) and mathematical reasoning (95.7 AIME 2025). Features 200K context window. (Read more) Reasoning Coding Bilingual
  • GPT-OSS-120B (OpenAI) - OpenAI's first fully open-weight LLM since GPT-2, with 120 billion parameters matching or surpassing o4-mini on core benchmarks including AIME, MMLU, TauBench, and HealthBench. Marks OpenAI's return to open-source model development. (Read more) Open Weights Reasoning Latest
  • InternLM 3 - Shanghai AI Lab's latest open-source model series providing advanced capabilities for reasoning and complex tasks. Part of the InternLM family with focus on multi-turn conversations. (Read more) Foundation Multilingual Open Source Latest
  • Kimi K2.5 - Advanced language model from Moonshot AI with strong performance on Chinese and English tasks. Part of the Kimi series achieving competitive Chatbot Arena Elo ratings in the 1445-1451 range alongside GLM-5 in 2026. (Read more) Chinese Bilingual High Performance
  • LLaMA 2 (Meta) - Meta's second-generation open-source foundation model family with 7B, 13B, and 70B variants, featuring improved training, safety alignment, and commercial licensing. Includes both base and Chat variants optimized for dialogue applications. (Read more) Foundation Commercial Safety Aligned
  • LLaMA 3 (Meta) - Meta's third-generation open source large language model family, featuring improved efficiency, better reasoning abilities, and enhanced multi-turn dialogue understanding. Widely adopted by startups, enterprises, and research teams for its transparency, customization options, and strong performance across diverse tasks. (Read more) Transformer Open Weights Multilingual
  • Llama 3.3 70B - Meta's latest 70B parameter instruction-tuned model offering similar performance to the Llama 3.1 405B. Features 128K context window with improved reasoning, coding, math, and instruction-following capabilities compared to Llama 3.1 70B. (Read more) Foundation Instruction Tuned Long Context Multilingual Latest Meta
  • LLaMA 4 (Meta) - Meta's latest fourth-generation open source language model family released in April 2025, featuring Llama 4 Scout and Llama 4 Maverick variants. First iteration to use mixture-of-experts architecture, representing cutting-edge advancement in open source AI capabilities. (Read more) Mixture Of Experts Open Weights Latest
  • Mistral 7B - Mistral AI's flagship 7-billion-parameter model with exceptional performance-to-size ratio, outperforming LLaMA 2 13B on most benchmarks. Features sliding window attention for efficient long-context handling and Apache 2.0 license. (Read more) Efficient High Performance apache-2.0
  • Mistral Large 3 - State-of-the-art open-weight model with 675B total parameters using MoE architecture, delivering 92% of GPT-5.2's performance at roughly 15% of the price. Released under Apache 2.0 license by Mistral AI. (Read more) Mixture Of Experts Apache 2.0 High Performance
  • Mistral NeMo - 12B parameter state-of-the-art model with 128K context window developed by Mistral AI and NVIDIA, released under Apache 2.0 license with leading performance on instruction following, reasoning, and code generation. (Read more) Long Context Apache Licensed Enterprise
  • Mistral Small 3 - A 24-billion-parameter open-source LLM from French startup Mistral AI, achieving performance comparable to 70B models while being 3x faster. Released under Apache 2.0 license, excelling in efficiency and multilingual tasks. (Read more) apache-2.0 Efficient Multilingual
  • Mixtral 8x22B - Mistral AI's larger sparse mixture-of-experts model with 141B total parameters, using only 39B active parameters. Achieves 77.8% on MMLU and 78.6% on GSM8K, excelling in coding and mathematics with cost-efficient performance. (Read more) Mixture Of Experts Coding Mathematics
  • Mixtral 8x7B - Mistral AI's sparse mixture-of-experts model using 8 expert networks with 7B parameters each, activating only 12.9B parameters per token. Outperforms LLaMA 2 70B on most benchmarks with 6x faster inference, featuring 32K context window. (Read more) Mixture Of Experts Efficient apache-2.0
  • NVIDIA Nemotron 3 - A family of open-source models from NVIDIA with open weights, training data, and recipes, featuring Nano (30B/3B active), Super (100B/10B active), and Ultra (500B/50B active) variants optimized for agentic AI applications. (Read more) Mixture Of Experts Agentic Ai Hybrid Architecture
  • OLMo 3 (AI2) - Allen Institute for AI's latest fully open language model family with 7B and 32B variants, featuring OLMo 3-Think (32B) as the best fully open 32B-scale thinking model. Represents America's truly open reasoning models with complete transparency in training data and process. (Read more) Fully Open Reasoning Transparent
  • Qwen 3 (Alibaba Cloud) - Alibaba Cloud's next-generation open-source LLM family with models up to 235B total parameters, excelling in multilingual tasks (supporting 119 languages), coding, and extended context understanding. Built for flexibility and scale with strong performance across diverse applications. (Read more) Multilingual apache-2.0 Coding
  • Qwen2.5-Max - Alibaba's flagship open-weights model exceeding 1 trillion parameters via MoE architecture, supporting 119 languages and achieving 92.3% accuracy on AIME25 and 74.1% on LiveCodeBench v6. Released under Apache 2.0 license. (Read more) Mixture Of Experts Multilingual Apache 2.0
  • Qwen3.5-122B-A10B - Alibaba's largest Qwen3.5 model with 122B total and 10B activated parameters, designed for server-grade GPUs with 80GB VRAM. Supports over 1 million context tokens with enterprise capabilities. (Read more) Mixture Of Experts Long Context Enterprise High Performance Latest
  • Qwen3.5-397B-A17B - Alibaba's frontier-scale Qwen3.5 model with 397B total and 17B activated parameters. Frontier-level performance with ultra-efficient sparse activation and multimodal native support. (Read more) Mixture Of Experts Frontier High Performance Multimodal Latest
  • SeaLLM - Family of multilingual large language models tailored for Southeast Asian languages including Thai, Vietnamese, Indonesian, and 9 others, built on Llama-2 with extended vocabulary and outperforming ChatGPT-3.5 in non-Latin languages. (Read more) Multilingual Southeast Asian Regional
  • Snowflake Arctic - Enterprise-focused open-source LLM with 480B total parameters using Dense-MoE architecture, activating only 17B parameters. Excels at SQL generation, coding, and instruction following. Trained for under $2M in three months under Apache 2.0 license. (Read more) Enterprise Mixture Of Experts Sql
  • Yi 2 - 01.AI's latest Yi model series providing advanced capabilities with strong performance on reasoning and coding. Continues the legacy of the popular Yi models. (Read more) Foundation Reasoning Coding Open Source Latest
  • Aquila - Open-source LLM series from Beijing Academy of Artificial Intelligence (BAAI), including Aquila-7B for code generation and AquilaChat-7B for conversational applications. Part of the WuDao project for Chinese AI advancement. (Read more) Chinese Research Open Source
  • Baichuan 2 - Baichuan Intelligence's 13-billion-parameter open-source language model based on Transformer architecture, trained on 2.6 trillion tokens. Supports both Chinese and English with 4096 context window, commercially usable for bilingual applications. (Read more) Bilingual Chinese apache-2.0
  • BLOOM - BigScience's multilingual open-source language model with 176 billion parameters, supporting 46 natural languages and 13 programming languages. Created through international collaboration of over 1,000 researchers, representing one of the largest open multilingual LLMs. (Read more) Multilingual Collaborative Large Scale
  • ChatGLM - Open-source bilingual language model family from Zhipu AI and Tsinghua University, designed for Chinese and English applications. Certain versions support extremely large context windows up to 1 million tokens. (Read more) Bilingual Chinese Long Context
  • ChatGLM3 - Zhipu AI's third-generation bilingual (Chinese-English) conversational model with enhanced capabilities, longer context support, and improved reasoning. Part of the GLM series with strong performance in Chinese language tasks and general dialogue. (Read more) Bilingual Chinese Conversational
  • Command R (Cohere) - Cohere's retrieval-augmented generation model optimized for enterprise RAG applications, conversational interaction, and long-context tasks. Designed for production deployments with strong instruction following and tool use capabilities. (Read more) Rag Enterprise Conversational
  • DBRX (Databricks) - Databricks' open-source mixture-of-experts language model with 132 billion total parameters, setting new standards for open LLMs when released in 2024. Outperformed existing open models including LLaMA 2 and Mixtral on key benchmarks. (Read more) Mixture Of Experts Commercial High Performance
  • Falcon 2 - Technology Innovation Institute's transformer-based open-source model with 11 billion parameters, providing multimodal capabilities for both text and vision. Known for multilingual support and being fully open without heavy-handed safety filters. (Read more) Multimodal Multilingual apache-2.0
  • GPT-J - EleutherAI's 6B parameter autoregressive language model trained on The Pile dataset. An early open-source alternative to GPT-3 that helped democratize access to large language models. (Read more) Open Source Community Historic
  • GPT-J-6B (EleutherAI) - EleutherAI's 6-billion-parameter open-source autoregressive language model trained on The Pile dataset. At release, it was the largest publicly available GPT-3-style language model, pioneering accessible open-source alternatives to proprietary models. (Read more) Open Source Community Historic
  • GPT-NeoX-20B (EleutherAI) - EleutherAI's 20-billion-parameter autoregressive language model trained using the GPT-NeoX library. Open-source model with Apache 2.0 license, supporting research and production use with strong general-purpose capabilities across diverse tasks. (Read more) Open Source Research apache-2.0
  • InternLM - Large language model from SenseTime and Shanghai AI Lab with 104 billion parameters trained on 1.6 trillion tokens. InternLM2 available in 7B and 20B sizes with comprehensive capabilities in mathematics, code, dialogue, and creative writing. (Read more) Chinese Multilingual Research
  • InternLM 2 - Multilingual foundational language model developed by SenseTime and Shanghai AI Lab, available in 7B and 20B variants pre-trained on roughly 2T tokens. Offers strong Chinese and multilingual capabilities for research and commercial applications. (Read more) Multilingual Chinese Research
  • MiniMax (ABAB) - Chinese AI company's multimodal foundation models with strong commercial presence. Part of the 'AI tigers' group, offering text, voice, and vision capabilities for enterprise and consumer applications across the Chinese market. (Read more) Multimodal Chinese Commercial
  • Moonshot AI (Kimi) - Chinese AI startup's long-context language model with exceptional memory capabilities, supporting extremely long conversations and document processing. Part of China's 'AI tigers' group with strong commercial presence and advanced reasoning abilities. (Read more) Long Context Chinese Commercial
  • MPT (MosaicML Pretrained Transformer) - Family of open-source commercially usable LLMs from MosaicML/Databricks. MPT-7B trained on 1T tokens matches LLaMA-7B quality, with MPT-30B designed to deploy on a single NVIDIA A100 GPU. (Read more) Open Source Commercial Efficient
  • MPT-7B (MosaicML) - MosaicML's 7-billion-parameter transformer trained from scratch on 1T tokens of text and code. Open-source with commercial use license, matching LLaMA-7B quality. Includes variants like MPT-7B-Chat, MPT-7B-Instruct, and MPT-7B-StoryWriter with 65K context. (Read more) Commercial Long Context Code
  • OPT (Open Pre-trained Transformer) - Meta's family of open-source models designed to replicate GPT-3 with similar decoder-only architecture. Released to promote reproducibility and research in large language models. (Read more) Meta Open Source Research
  • OPT-175B (Meta) - Meta's 175-billion-parameter open-source model released for research, matching GPT-3 scale. Trained with fully documented process and released with training logs, providing transparency into large-scale language model development. (Read more) Large Scale Research Transparent
  • RedPajama - An open reproduction of the LLaMA training dataset with 1.2 trillion tokens, created by following the LLaMA recipe to make fully open-source models under Apache license. Includes the RedPajama-INCITE model family. (Read more) Open Data Apache 2.0 Community
  • RedPajama-INCITE - Open-source model family from Together AI trained on the 1.2 trillion token RedPajama dataset. Includes 3B and 7B variants (base, chat, instruct) with Apache 2.0 license. RedPajama-3B outperforms GPT-Neo and Pythia-2.8B on benchmarks. (Read more) Open Data apache-2.0 Commercial
  • Stable LM 2 - Stability AI's multilingual language model family in 1.6B and 12B sizes trained on 2T tokens across seven languages (English, Spanish, German, Italian, French, Portuguese, Dutch), featuring strong tool usage and function calling capabilities. (Read more) Multilingual Tool Use Efficient
  • TigerBot - Open multilingual multitask LLM family ranging from 7B to 180B parameters. Developed from Llama-2 and BLOOM, achieving 6% performance gain in English and 20% in Chinese, with specialized domain data for finance, law, and encyclopedic knowledge. (Read more) Chinese Multilingual Specialized
  • XVERSE - Multilingual LLM from Shenzhen Yuanxiang Technology supporting 40+ languages with 8K context. XVERSE-65B trained on 2.6 trillion tokens with 16K context, optimized for Chinese, English, Russian, and Spanish. (Read more) Multilingual Chinese Large Scale
  • XVERSE 2 - Shenzhen Yuanxiang Technology's latest open-source model series with strong performance on benchmarks. Features advanced reasoning and multilingual capabilities. (Read more) Foundation Multilingual Reasoning Open Source
  • Yi - Bilingual language model series from 01.AI with strong performance on both Chinese and English tasks. Available in multiple sizes from 6B to 34B parameters with variants optimized for different applications. (Read more) Bilingual Chinese High Performance
  • Yi 1.5 (01.AI) - 01.AI's open-source language model excelling at handling long sequences and maintaining contextual coherence. Known as a dark horse model with strong coding capabilities and performance in long-context understanding tasks. (Read more) Long Context Coding apache-2.0
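
Most open-weight models above publish checkpoints on Hugging Face and load through the same transformers interface. A minimal sketch, assuming the Mistral 7B Instruct Hub ID and enough memory for the checkpoint; substitute any open model from this list:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed Hub ID; any open checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Tokenize a prompt, generate a short continuation, and decode it
inputs = tokenizer("Explain mixture-of-experts in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```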

Code Generation Models

  • Code Llama - Meta's code-specialized variant of Llama 2, available in 7B, 13B, 34B, and 70B sizes. Includes specialized Python variant and long-context 100K token version for processing entire codebases. (Read more) Coding Meta Specialized
  • Code Llama (Meta) - Meta's specialized code generation model built on LLaMA 2, available in 7B, 13B, 34B, and 70B sizes. Features up to 100K token context window, trained on 500B tokens of code, with variants for Python specialization and instruction following. (Read more) Coding Specialized Long Context
  • CodeT5+ - Salesforce's open code large language model family with encoder-decoder architecture, achieving 35.0% pass@1 on HumanEval and surpassing OpenAI's code-cushman-001, with models from 220M to 16B parameters optimized for code understanding and generation. (Read more) Encoder Decoder Code Understanding Open Source
  • DeepSeek-Coder - Code-specialized LLM series from DeepSeek with models up to 33B parameters. DeepSeek-Coder-V2 achieves 90% on HumanEval, and the series serves as the foundation for WizardCoder-33B's record-breaking 79.9% on the same benchmark. (Read more) Coding Specialized High Performance
  • Devstral Coding Models - Mistral's specialized coding-focused models designed for software engineering tasks. Optimized for code generation, review, and understanding with strong performance on programming benchmarks. (Read more) Coding Software Engineering Specialized Latest
  • GLM-4.7 - Zhipu AI's frontier-level open-source model with 200K context window and advanced reasoning. Achieves state-of-the-art performance on coding (73.8% on SWE-bench) with MoE architecture optimizing computational efficiency without sacrificing depth. (Read more) Coding Reasoning Mixture Of Experts Long Context Software Engineering Latest
  • MiMo-V2-Flash - Efficient software engineering model that outperforms open-source LLMs like DeepSeek-V3.2 and Kimi-K2 on software engineering benchmarks with one-half to one-third the total parameters, demonstrating superior parameter efficiency. (Read more) Software Engineering Efficient Latest
  • MiniMax M2.5 - Top-performing model on SWE-bench Verified achieving 80.2% score, the highest among all models tested in 2026. Demonstrates exceptional software engineering capabilities on real GitHub issue resolution. (Read more) Software Engineering Latest High Performance
  • NousCoder-14B - Nous Research's competitive programming model fine-tuned from Qwen3-14B via reinforcement learning, achieving 67.87% on LiveCodeBench v6. Trained in 4 days using 48 B200 GPUs with full open-source RL stack. (Read more) Coding Software Engineering Competitive Programming Open Weights Fully Open Latest
  • Qwen2.5-Coder - Alibaba's state-of-the-art open-source code model series (0.5B-32B parameters) trained on 5.5T tokens, achieving SOTA on HumanEval-Infilling and matching GPT-4o coding capabilities with the 32B variant while supporting 92 programming languages. (Read more) Coding Multilingual Code Instruct Tuned
  • StarCoder - 15.5B parameter code model from BigCode trained on 80+ programming languages from The Stack. Features 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention. (Read more) Coding Multilingual Open Source
  • StarCoder (BigCode) - BigCode's 15.5-billion-parameter open-source code generation model trained on 1 trillion tokens from The Stack dataset. Supports 80+ programming languages with 8K context window, developed through collaboration of Hugging Face and ServiceNow. (Read more) Coding Open Source Multilingual
  • StarCoder2 - Next-generation code model from BigCode with improved architecture and enhanced performance over StarCoder. Continues support for 80+ programming languages with better code generation, understanding, and multilingual capabilities. (Read more) Coding Next Gen Multilingual
  • StarCoder2 15B - BigCode's next-generation open-source code model with 15B parameters trained on 4+ trillion tokens from The Stack v2, supporting 600+ programming languages with 16K context window and outperforming CodeLlama-34B on code reasoning. (Read more) Open Source Multilingual Code Long Context
  • WizardCoder - Code-specialized LLM using Evol-Instruct methodology for code generation. WizardCoder-33B-V1.1 achieves 79.9 pass@1 on HumanEval, surpassing Anthropic's Claude and Google's Bard. Built from deepseek-coder with exceptional programming capabilities. (Read more) Coding Evol Instruct High Performance
  • CodeGemma - Google's lightweight open model collection for coding tasks including code completion, generation, and understanding. Built on Gemma architecture with instruction-following capabilities for programming. (Read more) Coding Google Lightweight Open Source
  • CodeGen - Salesforce Research's family of code generation models with CodeGen 2.5 as the latest release. Trained on diverse programming data to support code completion, generation, and understanding tasks. (Read more) Coding Salesforce Open Source
  • SantaCoder - Efficient 1.1B parameter code model from BigCode trained on 236B tokens using Multi-Query Attention. Supports Python, Java, and JavaScript with a 2K context window and achieves competitive scores for its size. (Read more) Lightweight Coding Efficient
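
Several code models above (StarCoder, SantaCoder, Code Llama) support fill-in-the-middle (FIM), where the model completes a gap between a prefix and a suffix. A sketch of FIM prompting using StarCoder's documented FIM markers; the Hub ID is illustrative, and other model families use different markers, so check the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigcode/starcoder"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The model fills the gap between prefix and suffix (the loop body here)
prefix = "def average(xs):\n    "
suffix = "\n    return total / len(xs)\n"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```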

Small Language Models

  • Gemma 3 (Google) - Google's third-generation family of lightweight open models built from Gemini technology, available in 1B, 4B, 12B, and 27B sizes. Gemma-3-27B beats the original Gemini 1.5 Pro across benchmarks, offering state-of-the-art performance for small language models. (Read more) Efficient Google Multilingual
  • H2O-Danube - H2O.ai's family of small language models (500M-4B parameters) trained on up to 6T tokens, achieving top rankings on Hugging Face Open LLM Leaderboard for the 2B range and designed for efficient edge deployment on mobile devices. (Read more) Edge Deployment Mobile Efficient
  • Hermes 4 14B - Nous Research's lightweight 14B variant of the Hermes 4 family with hybrid reasoning. Delivers frontier-level performance in a small package for edge and resource-constrained deployments. (Read more) Reasoning Efficient Hybrid Reasoning Open Weights Latest
  • Ministral 3 14B - Mistral's 14B dense model with native reasoning, vision, and multilingual support. High performance for its size with best cost-to-performance ratio among open-source models. (Read more) Reasoning Multimodal Apache 2.0 Efficient Latest
  • Ministral 3 8B - Mistral's efficient 8B model with native vision and multilingual capabilities. Balanced performance and resource consumption for local deployment. (Read more) Efficient Multimodal Apache 2.0 Latest
  • Mistral 3 14B - Mistral's powerful 14B dense model with native multimodal and reasoning capabilities. Achieves 85% on AIME 2025 in reasoning variant and offers the best cost-to-performance ratio among open-source models. (Read more) Efficient Multimodal Reasoning Apache 2.0 Mathematics Latest
  • Mistral 3 8B - Mistral's state-of-the-art dense 8B model with native vision capabilities and multilingual support. Part of the Ministral 3 series, available in base, instruct, and reasoning variants under Apache 2.0 license. (Read more) Efficient Multimodal Multilingual Apache 2.0 Reasoning Latest
  • OpenELM - Apple's family of efficient open-source language models (270M to 3B parameters) using layer-wise scaling strategy, achieving 2.36% better accuracy than OLMo with 2x fewer pre-training tokens and optimized for on-device deployment. (Read more) On Device Efficient Layer Wise Scaling
  • Phi-3 - Microsoft's family of small language models (mini 3.8B, small 7B, medium 14B) released under MIT license, with Phi-3-mini achieving performance comparable to models 10x its size and featuring variants with up to 128K context window. (Read more) Mit License Efficient High Quality
  • Phi-4 (Microsoft) - Microsoft's 14-billion-parameter small language model released January 2025 under MIT license. Phi-4-reasoning-plus approaches full DeepSeek R1 performance on AIME 2025, with a 14B model rivaling a 671B model, making it the poster child for the small language model revolution. (Read more) Efficient Reasoning Mit License
  • Qwen3.5-27B - Alibaba's efficient 27B dense model from the Qwen3.5 series, supporting 800K+ context tokens. Offers performance comparable to Claude Sonnet 4.5 while maintaining exceptional efficiency for local deployment. (Read more) Efficient Multilingual Long Context Latest
  • Qwen3.5-35B-A3B - Alibaba's efficient MoE model with 35B total parameters and 3B activated per token. First in Qwen3.5 series supporting tool calling and multimodal agentic capabilities under Apache 2.0 license. (Read more) Mixture Of Experts Efficient Function Calling Latest
  • SmolLM2 - A family of compact, state-of-the-art small language models from Hugging Face available in 135M, 360M, and 1.7B parameters, overtrained on 11 trillion tokens for exceptional on-device AI performance. (Read more) On Device Lightweight Open Source
  • DeepHermes-3 3B - Nous Research's smallest 3B toggle-on reasoning model based on Llama 3.1, enabling hybrid reasoning on edge devices and mobile. Delivers unified intuitive and chain-of-thought capabilities in a tiny package. (Read more) Reasoning Efficient Hybrid Reasoning Lightweight Latest
  • Ministral 3B - Mistral's smallest dense model with native vision, multimodal, and multilingual support. Part of Ministral 3 series for edge deployment under Apache 2.0 license. (Read more) Efficient Lightweight Multimodal Apache 2.0 Latest
  • Mistral 3 3B - Mistral's smallest and most efficient model in the 3B parameter range, featuring native multimodal capabilities and multilingual support. Optimized for resource-constrained edge deployment. (Read more) Efficient Lightweight Multimodal Apache 2.0 CPU Friendly Latest
  • Phi-2 - Microsoft's efficient Transformer-based LLM with 2.7 billion parameters, trained on 1.4 trillion tokens of synthetic and filtered web data. Despite its size, competes with models up to 25 times larger. (Read more) Microsoft Efficient Synthetic Data
  • StableLM - Stability AI's efficient language model family; the 1.6B variant, trained on 2 trillion tokens, beats other sub-2B options. A practical choice for developers who need capable small models on modest hardware. (Read more) Lightweight Efficient Open Source
  • StableLM 3B - Stability AI's open-source language model with 3 billion parameters, trained on 1.5 trillion text tokens using an updated version of The Pile dataset. Released under CC BY-SA-4.0 for commercial use, achieving state-of-the-art performance at the 3B scale. (Read more) Efficient Open Source Lightweight
  • TinyLlama - A compact 1.1B parameter language model pretrained on 1 trillion tokens with LLaMA architecture, achieving superior training efficiency at 24,000 tokens/sec per A100 GPU and outperforming OPT-1.3B and Pythia-1.4B. (Read more) Efficient Lightweight Llama Compatible
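
Models in this size class are small enough to run entirely on-device. A minimal sketch using llama-cpp-python with a quantized GGUF build; the file path is a placeholder for whichever small model you download:

```python
from llama_cpp import Llama

# Placeholder path: download a quantized GGUF build of any small model first
llm = Llama(model_path="./tinyllama-1.1b-chat.Q4_K_M.gguf", n_ctx=2048)

result = llm("Q: Name one open-source small language model. A:", max_tokens=24, stop=["\n"])
print(result["choices"][0]["text"])
```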

Reasoning Models

  • DeepSeek V3.2 - An advanced open-source LLM with 685B parameters featuring DeepSeek Sparse Attention mechanism, supporting 128K context length. Achieves frontier-level reasoning performance comparable to GPT-5 and surpasses it on mathematical reasoning benchmarks. (Read more) Reasoning Mixture Of Experts Long Context Latest Open Weights
  • DeepSeek-R1 - DeepSeek's reasoning-focused model released in early 2025, designed to excel at complex reasoning and problem-solving tasks. Part of DeepSeek's push into specialized reasoning capabilities alongside V3. (Read more) Reasoning Problem Solving Latest
  • Grok 2.5 - xAI's 270B open-source model released in August 2025, featuring advanced reasoning and real-time data integration. Now available on Hugging Face with 42 files totaling ~500GB, requiring 8 GPUs with 40GB memory each. (Read more) Reasoning Mixture Of Experts High Performance Open Weights Latest
  • Grok 3 - xAI's latest flagship model trained on 200,000 GPUs, described as 'an order of magnitude more capable' than Grok 2.5. Confirmed for open-source release around February 2026 by Elon Musk. (Read more) Reasoning High Performance Latest
  • Hermes 4 405B - Nous Research's flagship 405B model with hybrid reasoning capabilities. Represents the maximum capability variant of Hermes 4 for frontier-level performance and complex reasoning tasks. (Read more) Reasoning High Performance Hybrid Reasoning Open Weights Latest
  • Hermes 4 70B - Nous Research's 70B frontier-level open-weight model featuring hybrid reasoning via toggleable thinking. Trained on massive dataset using DataForge and Atropos RL framework, designed to rival ChatGPT and Claude. (Read more) Reasoning Instruction Tuned Hybrid Reasoning Open Weights Latest
  • Orca 2 (Microsoft) - Microsoft Research's reasoning-focused LLM fine-tuned from LLaMA 2, available in 7B and 13B sizes. Teaches various reasoning strategies and achieves performance levels similar to models 5-10x larger on complex reasoning tasks through innovative training methodology. (Read more) Reasoning Microsoft Efficient
  • Qwen3-235B-A22B - Alibaba's flagship mixture-of-experts model with 235B total parameters and 22B activated parameters. Features hybrid thinking mode for complex reasoning, math, and coding, plus non-thinking mode for efficient dialogue. Supports 100+ languages with native MCP. (Read more) Reasoning Mixture Of Experts Multilingual Thinking Mode Function Calling Latest
  • QwQ-32B - Alibaba's specialized 32B reasoning model delivering high performance on complex problem-solving with significantly reduced compute requirements. Achieves comparable performance to DeepSeek-R1 (671B) while using roughly one-twentieth the parameters. (Read more) Reasoning Mathematics Efficient Open Weights Problem Solving Latest
  • DeepHermes-3 24B - Nous Research's 24B variant based on Mistral, combining toggle-on reasoning with a balanced parameter count. Ideal for mid-range inference with hybrid reasoning capabilities. (Read more) Reasoning Efficient Hybrid Reasoning Latest
  • DeepHermes-3 8B - Nous Research's first toggle-on reasoning model based on Llama 3.1 8B, unifying intuitive responses with long chain-of-thought reasoning. Switchable via system prompt for flexible reasoning depth. (Read more) Reasoning Hybrid Reasoning Efficient Latest
  • Orca 2 - Microsoft's reasoning-focused language model based on Llama 2, trained on a tailored synthetic dataset that uses prompt-erasure techniques to teach smaller models to select effective reasoning strategies. Designed to excel in reasoning tasks and to show that smaller models can learn to reason effectively. (Read more) Microsoft Reasoning Synthetic Data
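
Several entries above (DeepHermes-3, Hermes 4) describe reasoning that is toggled on through the system prompt. A sketch of how such a toggle is wired with transformers chat templates; the Hub ID and the exact activation prompt are assumptions here, since each model documents its own phrase on its card:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/DeepHermes-3-Llama-3-8B-Preview")  # assumed Hub ID

messages = [
    # Placeholder toggle prompt -- the real activation phrase is model-specific
    {"role": "system", "content": "You are a deep-thinking AI. Reason step by step inside <think> tags before answering."},
    {"role": "user", "content": "A train leaves at 3pm averaging 80 km/h. When has it covered 200 km?"},
]

# Render the conversation into the model's expected prompt string
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # feed this string to any generation backend
```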

Multimodal Models

  • GLM-4 - Open multilingual multimodal chat model from Zhipu AI with exceptional coding and reasoning capabilities. GLM-4.7 achieves 94.2 on HumanEval and 95.7 on AIME 2025, making it one of the most well-rounded open-source models available. (Read more) Multilingual Multimodal Coding
  • Llama 3.2 Vision - Meta's multimodal large language models in 11B and 90B sizes supporting text and image inputs, pretrained on 6B image-text pairs and competitive with Claude 3 Haiku and GPT-4o-mini on visual reasoning tasks. (Read more) Vision Language Multimodal Image Reasoning
  • Llama 4 Maverick - Meta's 400B multimodal model representing the Llama 4 flagship. First open-weight natively multimodal MoE model delivering high-performance text and image understanding with massive parameter scale. (Read more) Multimodal Mixture Of Experts High Performance Latest Meta
  • Llama 4 Scout - Meta's 17B active parameter model with 16 MoE experts (109B total). First open-weight natively multimodal model with 10M token context, trained on 40 trillion tokens. Exceeds Llama 3.1 405B on many benchmarks. (Read more) Multimodal Mixture Of Experts Long Context Latest Meta
  • LLaVA (Large Language-and-Vision Assistant) - End-to-end trained large multimodal model connecting vision encoder and LLM for visual and language understanding. LLaVA-OneVision-1.5 achieves state-of-the-art performance on native-resolution images with comparatively lower training costs. (Read more) Multimodal Vision Language Open Source
  • Molmo - State-of-the-art open-source multimodal vision-language model from Allen AI, with the 72B variant outperforming Gemini 1.5 Pro and Claude 3.5 Sonnet on academic benchmarks while featuring unique pixel-pointing capabilities. (Read more) Vision Language Multimodal Pointing
  • PaliGemma - Google's open vision-language model combining SigLIP (image encoder) and Gemma-2B (text decoder). Excels at image captioning, visual QA, text understanding in images, object detection, and segmentation. (Read more) Multimodal Vision Language Google Open Source
  • Pixtral 12B - Mistral AI's first multimodal model with 12B parameters, combining vision and language understanding. Significantly outperforms other open-source multimodal models like Qwen2-VL 7B and LLaVA-OneVision 7B in instruction following and visual understanding tasks. (Read more) Vision Language Multimodal Mistral
  • Fuyu - Adept's multimodal model with novel architecture eliminating image encoding stage. Fuyu-8B processes images directly alongside text, enabling simpler architecture and better fine-tuning capabilities for vision-language tasks. (Read more) Multimodal Vision Language Novel Architecture
  • StepFun (Step-1) - Chinese AI startup specializing in multimodal models with strong vision-language capabilities. Part of the 'AI tigers' group, focusing on advanced visual understanding and reasoning for enterprise and consumer applications in the Chinese market. (Read more) Multimodal Chinese Vision Language
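
Open vision-language models like LLaVA can be queried through the transformers image-to-text pipeline. A minimal sketch, assuming the llava-hf/llava-1.5-7b-hf Hub ID and LLaVA 1.5's prompt format; both should be checked against the model card:

```python
from transformers import pipeline

pipe = pipeline("image-to-text", model="llava-hf/llava-1.5-7b-hf")  # assumed Hub ID

# Accepts a local path or URL; the prompt format is LLaVA 1.5's chat style
out = pipe(
    "photo.jpg",  # placeholder image
    prompt="USER: <image>\nWhat is in this picture? ASSISTANT:",
    generate_kwargs={"max_new_tokens": 64},
)
print(out[0]["generated_text"])
```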

Novel Architectures

  • Falcon Mamba 7B - TII's first open-source State Space Language Model (SSLM) with 7B parameters, using Mamba architecture for linear-time sequence processing, achieving 62.03% on ARC and outperforming Llama 3.1 8B while fitting on a single A10 24GB GPU. (Read more) State Space Model Mamba Efficient
  • Jamba 1.5 - AI21 Labs' hybrid SSM-Transformer architecture with 256K context window, 52B parameters with only 12B activated. Delivers 3x throughput on long contexts while maintaining quality in a hyper-efficient package. (Read more) Novel Architecture Long Context Mixture Of Experts Efficient Latest
  • Jamba 2 - AI21 Labs' open-source family of language models built for maximum reliability and steerability in enterprise. Continuing the hybrid SSM-Transformer architecture with enhanced capabilities. (Read more) Novel Architecture Enterprise Open Weights Latest
  • Mamba - Selective state space model achieving linear-time sequence modeling with selective state spaces. Mamba-3 combines expressive recurrence with multi-input, multi-output formulation, competing with Transformers while maintaining O(n) complexity. (Read more) State Space Efficient Research
  • MoE-Mamba Hybrid Models - Hybrid architecture models combining Mamba SSM (state space models) with mixture-of-experts for efficient long-context processing. An emerging frontier in alternative architectures. (Read more) Novel Architecture State Space Long Context Efficient
  • OLMoE - Fully open-source mixture-of-experts language model jointly developed by Allen Institute for AI and Contextual AI, featuring 1B active and 7B total parameters with 2x faster training than equivalent dense models. (Read more) Mixture Of Experts Efficient Fully Open
  • RecurrentGemma-2B - Google DeepMind's 2B parameter model using the novel Griffin architecture combining linear recurrences with local attention. Achieves comparable performance to Gemma-2B while using less memory and enabling faster long-sequence inference. (Read more) Novel Architecture Efficient Long Context Google Lightweight
  • RWKV - Receptance Weighted Key Value model with RNN-like architecture enabling infinite sequence length while maintaining transformer-level performance. Available in sizes from 169M to 14B parameters, trained on The Pile with linear computational complexity. (Read more) Rnn Transformer Hybrid Efficient Long Context
  • SOLAR - Upstage's 10.7B parameter language model using Depth Up-Scaling (DUS) technique to efficiently create larger models from smaller ones. SOLAR 10.7B achieves performance competitive with much larger models through innovative architectural scaling. (Read more) Efficient Innovative Korean
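
The state-space entries above (Mamba, Falcon Mamba, RWKV) get their linear-time claim from recurrence: each token updates a fixed-size state instead of attending to all previous tokens. A toy scalar sketch of that O(n) scan, not a real Mamba implementation (Mamba additionally makes the A/B/C terms input-dependent, hence "selective"):

```python
def ssm_scan(xs, A=0.9, B=0.5, C=1.0):
    """Toy linear state-space recurrence: one state update per token."""
    h, ys = 0.0, []
    for x in xs:           # single pass over the sequence: O(n)
        h = A * h + B * x  # fixed-size state carries all history
        ys.append(C * h)   # readout at each step
    return ys

print(ssm_scan([1.0, 0.0, 0.0, 1.0]))
```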

Specialized Models

  • BioGPT - Microsoft's domain-specific generative transformer language model pretrained on large-scale biomedical literature, achieving human-expert-level performance with 78.2% accuracy on PubMedQA and new records on relation extraction tasks. (Read more) Biomedical Scientific Healthcare
  • Chronos - Amazon's family of pretrained time series forecasting foundation models based on language model architectures, featuring Chronos-2 (120M parameters) with state-of-the-art zero-shot accuracy and 600M+ downloads from Hugging Face. (Read more) Time Series Forecasting Zero Shot
  • IBM Granite - Enterprise-focused language model family from IBM with specialized variants for language, code, time series, and geospatial data. Granite 4.0 features hybrid Mamba/Transformer architecture reducing memory by 70% for long-context inference. (Read more) Enterprise Apache 2.0 Hybrid Architecture
  • Llemma-34B - EleutherAI's larger mathematical reasoning model variant, fine-tuned from Code Llama on Proof-Pile-2. Enhanced performance for complex mathematical problems, proofs, and computational tasks. (Read more) Mathematics Reasoning Specialized Open Source
  • Llemma-7B - EleutherAI's open mathematical reasoning model, fine-tuned from Code Llama on Proof-Pile-2 dataset. Outperforms Code Llama-34B on math benchmarks and supports computational tools for problem-solving. (Read more) Mathematics Reasoning Specialized Open Source
  • NuminaMath-7B - Project Numina's open-source 7B model fine-tuned for competition-level mathematics using tool-integrated reasoning (TIR). Won AIMO Progress Prize with score of 29/50, capable of AMC 12 level problems. (Read more) Mathematics Specialized Reasoning Open Weights Competitive
  • SQLCoder - Specialized SQL generation model from Defog achieving state-of-the-art text-to-SQL performance. SQLCoder-70B outperforms GPT-4 on SQL generation benchmarks, making it the leading open-source model for database queries. (Read more) Sql Specialized High Performance
  • Baichuan - Open-source LLM series from Baichuan Intelligence with strong focus on domain-specific applications in law, finance, medicine, and classical Chinese literature. Baichuan 4 is the premier Chinese open-source model for specialized domains. (Read more) Chinese Specialized Enterprise
  • MedGemma - Google's specialized Gemma model for medical image and text comprehension. Part of Gemma family, optimized for healthcare and biomedical applications. (Read more) Specialized Medical Google Multimodal
  • ShieldGemma - Google's instruction-tuned safety and content moderation model built on Gemma 2, targeting four harm categories. Can evaluate both user inputs and model outputs for safety compliance. (Read more) Safety Aligned Google Content Moderation Open Source
  • TxGemma - Google's specialized Gemma model for therapeutics development. Optimized for drug discovery, protein analysis, and biomedical research applications. (Read more) Specialized Research Google Biomedical
  • WizardMath - Mathematics-specialized LLM using Evol-Instruct methodology adapted for mathematical problems. Achieves exceptional performance on math benchmarks through progressive problem complexity evolution and high-quality mathematical reasoning training. (Read more) Mathematics Reasoning Specialized
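
Text-to-SQL models like SQLCoder are typically prompted with the database schema plus a natural-language question. An illustrative prompt scaffold, not Defog's official template; the exact format a checkpoint was trained on is listed on its model card:

```python
# Hypothetical schema and question, for illustration only
schema = "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL, created_at DATE);"
question = "What was the total order value per customer in 2024?"

prompt = (
    "### Task\nGenerate a SQL query to answer the question.\n\n"
    f"### Database Schema\n{schema}\n\n"
    f"### Question\n{question}\n\n"
    "### SQL\n"
)
# Feed `prompt` to the model through any backend (transformers, vLLM, ...)
```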

Research Models

  • Amber (LLM360) - The first fully transparent 7B English language model from the LLM360 initiative, released with all 360 intermediate training checkpoints, complete training data, metrics, and source code under Apache 2.0 license. (Read more) Fully Transparent Reproducible Research
  • OLMo 3 - Allen Institute for AI's latest open language model focusing on interpretability and transparency. Provides complete training transparency including data, code, and detailed documentation. (Read more) Research Interpretability Transparent Open Science Latest
  • Cerebras-GPT - Family of seven GPT models (111M-13B parameters) trained using the Chinchilla formula on Cerebras CS-2 wafer-scale systems, released under Apache 2.0 license as the first Chinchilla-optimal models available open-source. (Read more) Chinchilla Optimal Apache Licensed Compute Efficient
  • Open Reasoning Tasks - Nous Research's comprehensive repository of reasoning tasks for LLMs. Provides benchmark suite and methodology for developing reasoning-focused language models. (Read more) Reasoning Research Benchmark Open Source
  • Pythia - A suite of LLMs ranging from 70M to 12B parameters trained on The Pile, designed to facilitate research on how language models learn and evolve during training. Fully open with all checkpoints released. (Read more) Research Fully Open Interpretability
  • Pythia (EleutherAI) - EleutherAI's first LLM suite designed specifically for scientific research on learning dynamics. Features 154 checkpoints saved throughout training across multiple model sizes, enabling detailed study of how knowledge develops during training. (Read more) Research Interpretability Open Science
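
Pythia's value for interpretability research comes from its intermediate checkpoints, which are published as Hub revision branches named stepN. A sketch of loading one, following the pattern documented on the Pythia model cards:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "step3000" selects the checkpoint after 3,000 training steps
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m-deduped", revision="step3000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m-deduped", revision="step3000")
```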

Deployment Platforms

  • vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs, originally developed at UC Berkeley, offering up to 24x performance improvements through innovative PagedAttention technology. (Read more) Inference Serving Optimization
  • GPT4All - Nomic AI's ecosystem for running LLMs locally on consumer hardware, featuring optimized inference engine and collection of open-source models. Enables private, offline AI with CPU-friendly execution and easy-to-use desktop application. (Read more) Local Inference Privacy Cpu Friendly
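
A minimal offline-inference sketch with vLLM's Python API; the model ID is illustrative and can be any open-weight checkpoint from this directory:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # assumed Hub ID
params = SamplingParams(temperature=0.7, max_tokens=64)

# Batched generation; PagedAttention manages the KV cache under the hood
for output in llm.generate(["What is PagedAttention?"], params):
    print(output.outputs[0].text)
```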

Instruction-Tuned Models

  • Athene-V2 72B - Nexusflow's open-source 72B model fine-tuned from Qwen 2.5 72B, ranked best on Chatbot Arena. Athene-V2-Chat excels in chat/math/coding, while Athene-V2-Agent surpasses GPT-4o in function calling and agentic applications. (Read more) Instruction Tuned Reasoning Function Calling Latest Open Weights
  • Airoboros - Instruction-tuned model series using self-generated synthetic training data via LLM bootstrapping. Features context obedient question answering, creative writing capabilities, and strong function calling support. (Read more) Instruction Tuned Synthetic Data Function Calling
  • Alpaca - Stanford's instruction-following model fine-tuned from LLaMA 7B using 52K instruction-following demonstrations generated by GPT-3.5. Pioneered the use of synthetic data from stronger models for instruction-tuning smaller models. (Read more) Instruction Tuned Synthetic Data Historic
  • BLOOMZ - Instruction-tuned variant of BLOOM with 176B parameters, fine-tuned on multilingual tasks using xP3 dataset. Supports instruction-following across 46 languages, making it ideal for global multilingual applications. (Read more) Multilingual Instruction Tuned Large Scale
  • Dolly - Databricks' instruction-following LLM demonstrating that high-quality instruction-tuning can be achieved with relatively small, curated datasets. Dolly 2.0 trained on 15K human-generated instruction-response pairs. (Read more) Instruction Tuned Human Annotated Databricks
  • FLAN-T5 - Google's instruction-tuned T5 encoder-decoder model family fine-tuned on 1,800+ tasks from the FLAN 2022 Collection, achieving strong few-shot performance comparable to PaLM 62B while being fully open-source and commercially usable. (Read more) Instruction Tuned Encoder Decoder Zero Shot
  • Guanaco - Efficient instruction-tuned model using QLoRA (Quantized Low-Rank Adaptation) for parameter-efficient fine-tuning. Demonstrates that 4-bit quantized models can be fine-tuned effectively, reducing memory requirements significantly. (Read more) Efficient Instruction Tuned Qlora
  • Manticore - Instruction-tuned model series from OpenAccess AI Collective focused on roleplay, creative writing, and multi-turn conversations. Features strong character consistency and engaging narrative capabilities. (Read more) Roleplaying Creative Conversational
  • Nous Capybara - Multi-turn conversational model from Nous Research with extended context support. Trained on high-quality multi-turn datasets for improved dialogue coherence and context retention across long conversations. (Read more) Conversational Long Context Multi Turn
  • Nous Hermes - High-performing instruction-tuned model series from Nous Research, featuring strong reasoning and long-form output capabilities. Nous Hermes 2 variants available in multiple sizes based on Mixtral, Llama, and Yi foundations. (Read more) Instruction Tuned Reasoning Long Context
  • OpenChat - Open-source language model series optimized for conversational AI using innovative C-RLFT (Conditioned Reinforcement Learning Fine-Tuning) methodology. OpenChat 3.5 serves as the foundation for Starling and achieves strong performance on dialogue benchmarks. (Read more) Conversational Open Source High Performance
  • OpenHermes - Mistral-based instruction-tuned model trained on 1 million entries of primarily GPT-4 generated data. OpenHermes 2.5 Mistral 7B achieves strong performance on instruction-following and code generation tasks. (Read more) Instruction Tuned Mistral Based Synthetic Data
  • OpenOrca - Large-scale instruction dataset and model series replicating Microsoft's Orca paper, featuring progressive learning from GPT-4 and GPT-3.5 explanations. OpenOrca dataset contains 4.2M+ augmented instructions for training instruction-following models. (Read more) Instruction Tuned Synthetic Data Reasoning
  • Platypus - Efficient fine-tuned model family using curated Open-Platypus dataset, achieving strong performance with just 25K carefully selected examples. Demonstrates that dataset quality and curation matter more than quantity for fine-tuning. (Read more) Efficient Curated Data High Performance
  • Samantha - Companion-focused instruction-tuned model designed for engaging, empathetic, and helpful interactions. Features strong conversational abilities, emotional intelligence, and personality-driven responses for assistant and companion applications. (Read more) Conversational Empathetic Assistant
  • Starling - Starling-7B-alpha is fine-tuned from OpenChat 3.5 (itself a refinement of Mistral 7B trained with Conditioned Reinforcement Learning Fine-Tuning, C-RLFT) using Reinforcement Learning from AI Feedback (RLAIF). Leads 7B instruction-tuned models in summary consistency and overall performance. (Read more) Instruction Tuned Reasoning Conversational
  • Zephyr - Fine-tuned series of Mistral and Mixtral models trained to act as helpful assistants using Direct Preference Optimization (DPO). Zephyr-7B-α achieved state-of-the-art results for 7B-class models at release. (Read more) Instruction Tuned Dpo Conversational

  • Hermes 3 (Nous Research) - Nous Research's latest general-use model with advanced long-term context retention, multi-turn conversation capability, complex roleplaying abilities, and enhanced agentic function-calling. Available in multiple sizes based on LLaMA architecture. (Read more) Conversational Function Calling Roleplaying
  • Dolly 2.0 (Databricks) - Databricks' fully open-source instruction-following LLM built on EleutherAI's 12B model, fine-tuned with 15,000 human-generated instruction-response pairs. First truly open instruction-tuned model with commercially viable licensing and open training data. (Read more) Commercial Human Annotated Fully Open
  • OpenAssistant - Community-driven open-source chat-based assistant from LAION-AI with 161,443 messages in 35 languages, created by 13,500+ volunteers. Provides conversation trees with quality ratings, enabling research on assistant-style dialogue and model alignment. (Read more) Community Driven Multilingual Conversational
  • Stanford Alpaca - Stanford's instruction-tuned model built on LLaMA-7B, fine-tuned on 52,000 instruction-following examples generated using Self-Instruct with GPT-3.5. Demonstrated that high-quality instruction-tuned models could be created cost-effectively using synthetic data. (Read more) Instruction Tuned Synthetic Data Fine Tuned
  • Vicuna - Open-source chatbot built on LLaMA (7B and 13B) fine-tuned on 70,000 user-shared ChatGPT conversations from ShareGPT. Achieves 90%+ ChatGPT quality according to GPT-4 evaluations, representing early success in instruction-tuned open models. (Read more) Chatbot Instruction Tuned Fine Tuned
  • WizardLM - Microsoft and Peking University's open-source LLM trained using the Evol-Instruct methodology to automatically generate complex instructions. WizardLM-2 variants based on Mixtral 8x22B, available in 8x22B, 70B, and 7B sizes with advanced instruction-following capabilities. (Read more) Instruction Tuned Evol Instruct Microsoft
  • Zephyr 7B - HuggingFace H4's instruction-tuned chatbot model based on Mistral-7B, trained using Direct Preference Optimization (DPO) on public synthetic datasets. Available in Alpha and Beta versions, demonstrating effective alignment through DPO methodology. (Read more) Chatbot Dpo Mistral Based
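
Many models in this section descend from Alpaca-style instruction data. A sketch of that three-field record format (Alpaca's 52K demonstrations use this schema; chat-oriented models typically convert each record into role-tagged messages):

```python
# One Alpaca-style training record; "input" may be empty for contextless tasks
record = {
    "instruction": "Summarize the text in one sentence.",
    "input": "Open-weight LLMs let anyone inspect, fine-tune, and self-host models.",
    "output": "Open-weight LLMs give users full control over inspection, tuning, and hosting.",
}
```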

™️ Legal

All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship.

This directory may include content generated by artificial intelligence (AI). While efforts have been made to ensure the accuracy and reliability of the information, we make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information contained herein. Users are advised to independently verify the information before making decisions based on it.

We disclaim any responsibility for errors, omissions, or inaccuracies in the content, whether generated by humans, AI, or any other means. By using this directory, you agree to use it at your own risk and acknowledge that the information provided may not always be current or accurate.

If you believe that your intellectual property rights or other legal rights have been infringed, please contact us immediately at legal@ever.co and we will take appropriate action.

🛡️ License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).
