A comprehensive directory of open source large language models, comparing architectures, benchmark performance, licensing, and deployment options
- Foundation Models (58)
- Code Generation Models (18)
- Small Language Models (20)
- Reasoning Models (12)
- Multimodal Models (10)
- Novel Architectures (9)
- Specialized Models (12)
- Research Models (6)
- Deployment Platforms (2)
- Instruction-Tuned Models (17)
- Instruction-Tuned Models (7)
- Aya - Cohere's state-of-the-art massively multilingual LLM covering 101 languages, including 50+ previously underserved languages. The result of a year-long collaboration with 3,000 researchers across 119 countries.
  Tags: Multilingual, Collaborative, Open Source
- Aya 23 - Cohere's open-weight multilingual language model supporting 23 languages in 8B and 35B parameter variants, outperforming Gemma, Mistral, and Mixtral on extensive discriminative and generative tasks through depth-focused training.
  Tags: Multilingual, Open Weights, Instruction Tuned
- Command R+ - Cohere's 104B parameter open-weights model optimized for RAG and multi-step tool use, supporting a 128K context window across 10 languages with 50% higher throughput and 25% lower latency than previous versions.
  Tags: RAG, Tool Use, Multilingual
- DBRX - Databricks' 132B parameter open-source LLM using a fine-grained mixture-of-experts architecture with 36B active parameters, trained on 12T tokens and outperforming GPT-3.5 while achieving 2x faster inference than LLaMA2-70B.
  Tags: Mixture of Experts, Open Source, Enterprise
- DeepSeek V3 - A cutting-edge open-source LLM with 685 billion parameters released in December 2025 by Hangzhou-based DeepSeek AI. Features a mixture-of-experts architecture with up to a 128K token context window, excelling at reasoning, coding, and complex problem-solving tasks.
  Tags: Mixture of Experts, Reasoning, Coding
- EXAONE 3.0 - Korea's first open-source bilingual AI language model from LG AI Research, featuring a 7.8B parameter model trained on 8 trillion tokens with exceptional performance in both Korean and English real-world applications.
  Tags: Bilingual, Korean, Multilingual
- Falcon 3 - TII's latest open-source LLM family with models in 1B, 3B, 7B, and 10B sizes trained on 14 trillion tokens, positioned as the world's most powerful small AI models under 13B parameters and designed to run efficiently on laptops.
  Tags: Small Models, Efficient, Multilingual
- Gemma 2 - Google DeepMind's next-generation open language model family in 2B, 9B, and 27B sizes trained on up to 13T tokens, with the 27B variant ranking as the highest open model on Chatbot Arena and outperforming Llama 3 70B.
  Tags: Apache Licensed, High Performance, Efficient
- GLM-4 (Zhipu AI) - Zhipu AI's fourth-generation bilingual (Chinese-English) open-source language model, with GLM-4.7 achieving top rankings in early 2026 with exceptional coding (94.2 HumanEval) and mathematical reasoning (95.7 AIME 2025). Features a 200K context window.
  Tags: Reasoning, Coding, Bilingual
- GPT-OSS-120B (OpenAI) - OpenAI's first fully open-weight LLM since GPT-2, with 120 billion parameters matching or surpassing o4-mini on core benchmarks including AIME, MMLU, TauBench, and HealthBench. Marks OpenAI's return to open-source model development.
  Tags: Open Weights, Reasoning, Latest
- InternLM 3 - Shanghai AI Lab's latest open-source model series providing advanced capabilities for reasoning and complex tasks. Part of the InternLM family, with a focus on multi-turn conversations.
  Tags: Foundation, Multilingual, Open Source, Latest
- Kimi K2.5 - Advanced language model from Moonshot AI with strong performance on Chinese and English tasks. Part of the Kimi series, achieving competitive Chatbot Arena ELO ratings in the 1445-1451 range alongside GLM-5 in 2026.
  Tags: Chinese, Bilingual, High Performance
- LLaMA 2 (Meta) - Meta's second-generation open-source foundation model family with 7B, 13B, and 70B variants, featuring improved training, safety alignment, and commercial licensing. Includes both base and Chat variants optimized for dialogue applications.
  Tags: Foundation, Commercial, Safety Aligned
- LLaMA 3 (Meta) - Meta's third-generation open-source large language model family, featuring improved efficiency, better reasoning abilities, and enhanced multi-turn dialogue understanding. Widely adopted by startups, enterprises, and research teams for its transparency, customization options, and strong performance across diverse tasks.
  Tags: Transformer, Open Weights, Multilingual
- Llama 3.3 70B - Meta's latest 70B parameter instruction-tuned model offering performance similar to Llama 3.1 405B. Features a 128K context window with improved reasoning, coding, math, and instruction-following capabilities compared to Llama 3.1 70B.
  Tags: Foundation, Instruction Tuned, Long Context, Multilingual, Latest, Meta
- LLaMA 4 (Meta) - Meta's fourth-generation open-source language model family released in April 2025, featuring Llama 4 Scout and Llama 4 Maverick variants. The first iteration to use a mixture-of-experts architecture, representing cutting-edge advancement in open-source AI capabilities.
  Tags: Mixture of Experts, Open Weights, Latest
- Mistral 7B - Mistral AI's flagship 7-billion-parameter model with an exceptional performance-to-size ratio, outperforming LLaMA 2 13B on most benchmarks. Features sliding window attention for efficient long-context handling and an Apache 2.0 license.
  Tags: Efficient, High Performance, Apache 2.0
- Mistral Large 3 - State-of-the-art open-weight model with 675B total parameters using a MoE architecture, delivering 92% of GPT-5.2's performance at roughly 15% of the price. Released under the Apache 2.0 license by Mistral AI.
  Tags: Mixture of Experts, Apache 2.0, High Performance
- Mistral NeMo - 12B parameter state-of-the-art model with a 128K context window, developed by Mistral AI and NVIDIA and released under the Apache 2.0 license with leading performance on instruction following, reasoning, and code generation.
  Tags: Long Context, Apache Licensed, Enterprise
- Mistral Small 3 - A 24-billion-parameter open-source LLM from French startup Mistral AI, achieving performance comparable to 70B models while being 3x faster. Released under the Apache 2.0 license, excelling in efficiency and multilingual tasks.
  Tags: Apache 2.0, Efficient, Multilingual
- Mixtral 8x22B - Mistral AI's larger sparse mixture-of-experts model with 141B total parameters, using only 39B active parameters. Achieves 77.8% on MMLU and 78.6% on GSM8K, excelling in coding and mathematics with cost-efficient performance.
  Tags: Mixture of Experts, Coding, Mathematics
- Mixtral 8x7B - Mistral AI's sparse mixture-of-experts model using 8 expert networks with 7B parameters each, activating only 12.9B parameters per token. Outperforms LLaMA 2 70B on most benchmarks with 6x faster inference, featuring a 32K context window.
  Tags: Mixture of Experts, Efficient, Apache 2.0
- NVIDIA Nemotron 3 - A family of open-source models from NVIDIA with open weights, training data, and recipes, featuring Nano (30B/3B active), Super (100B/10B active), and Ultra (500B/50B active) variants optimized for agentic AI applications.
  Tags: Mixture of Experts, Agentic AI, Hybrid Architecture
- OLMo 3 (AI2) - Allen Institute for AI's latest fully open language model family with 7B and 32B variants, featuring OLMo 3-Think (32B) as the best fully open 32B-scale thinking model. A US-developed, truly open reasoning model family with complete transparency in training data and process.
  Tags: Fully Open, Reasoning, Transparent
- Qwen 3 (Alibaba Cloud) - Alibaba Cloud's next-generation open-source LLM family with models up to 110B parameters, excelling in multilingual tasks (supporting 119 languages), coding, and extended context understanding. Built for flexibility and scale, with strong performance across diverse applications.
  Tags: Multilingual, Apache 2.0, Coding
- Qwen2.5-Max - Alibaba's flagship open-weights model exceeding 1 trillion parameters via a MoE architecture, supporting 119 languages and achieving 92.3% accuracy on AIME25 and 74.1% on LiveCodeBench v6. Released under the Apache 2.0 license.
  Tags: Mixture of Experts, Multilingual, Apache 2.0
- Qwen3.5-122B-A10B - Alibaba's largest Qwen3.5 model with 122B total and 10B activated parameters, designed for server-grade GPUs with 80GB VRAM. Supports over 1 million context tokens with enterprise capabilities.
  Tags: Mixture of Experts, Long Context, Enterprise, High Performance, Latest
- Qwen3.5-397B-A17B - Alibaba's frontier-scale Qwen3.5 model with 397B total and 17B activated parameters. Frontier-level performance with ultra-efficient sparse activation and native multimodal support.
  Tags: Mixture of Experts, Frontier, High Performance, Multimodal, Latest
- SeaLLM - Family of multilingual large language models tailored for Southeast Asian languages including Thai, Vietnamese, Indonesian, and 9 others, built on Llama-2 with an extended vocabulary and outperforming ChatGPT-3.5 in non-Latin languages.
  Tags: Multilingual, Southeast Asian, Regional
- Snowflake Arctic - Enterprise-focused open-source LLM with 480B total parameters using a Dense-MoE architecture, activating only 17B parameters. Excels at SQL generation, coding, and instruction following. Trained for under $2M in three months, released under the Apache 2.0 license.
  Tags: Enterprise, Mixture of Experts, SQL
- Yi 2 - 01.AI's latest Yi model series providing advanced capabilities with strong performance on reasoning and coding. Continues the legacy of the popular Yi models.
  Tags: Foundation, Reasoning, Coding, Open Source, Latest
- Aquila - Open-source LLM series from the Beijing Academy of Artificial Intelligence (BAAI), including Aquila-7B for code generation and AquilaChat-7B for conversational applications. Part of the WuDao project for Chinese AI advancement.
  Tags: Chinese, Research, Open Source
- Baichuan 2 - Baichuan Intelligence's 13-billion-parameter open-source language model based on the Transformer architecture, trained on approximately 1.2 trillion tokens. Supports both Chinese and English with a 4096-token context window, commercially usable for bilingual applications.
  Tags: Bilingual, Chinese, Apache 2.0
- BLOOM - BigScience's multilingual open-source language model with 176 billion parameters, supporting 46 natural languages and 13 programming languages. Created through an international collaboration of over 1,000 researchers, representing one of the largest open multilingual LLMs.
  Tags: Multilingual, Collaborative, Large Scale
- ChatGLM - Open-source bilingual language model family from Zhipu AI and Tsinghua University, designed for Chinese and English applications. Certain versions support extremely large context windows up to 1 million tokens.
  Tags: Bilingual, Chinese, Long Context
- ChatGLM3 - Zhipu AI's third-generation bilingual (Chinese-English) conversational model with enhanced capabilities, longer context support, and improved reasoning. Part of the GLM series, with strong performance in Chinese language tasks and general dialogue.
  Tags: Bilingual, Chinese, Conversational
- Command R (Cohere) - Cohere's retrieval-augmented generation model optimized for enterprise RAG applications, conversational interaction, and long-context tasks. Designed for production deployments with strong instruction following and tool use capabilities.
  Tags: RAG, Enterprise, Conversational
- DBRX (Databricks) - Databricks' open-source mixture-of-experts language model with 132 billion total parameters, setting new standards for open LLMs when released in 2024. Outperformed existing open models including LLaMA 2 and Mixtral on key benchmarks.
  Tags: Mixture of Experts, Commercial, High Performance
- Falcon 2 - Technology Innovation Institute's transformer-based open-source model with 11 billion parameters, providing multimodal capabilities for both text and vision. Known for multilingual support and for being fully open without heavy-handed safety filters.
  Tags: Multimodal, Multilingual, Apache 2.0
- GPT-J - EleutherAI's 6B parameter autoregressive language model trained on The Pile dataset. An early open-source alternative to GPT-3 that helped democratize access to large language models.
  Tags: Open Source, Community, Historic
- GPT-J-6B (EleutherAI) - EleutherAI's 6-billion-parameter open-source autoregressive language model trained on The Pile dataset. At release, it was the largest publicly available GPT-3-style language model, pioneering accessible open-source alternatives to proprietary models.
  Tags: Open Source, Community, Historic
- GPT-NeoX-20B (EleutherAI) - EleutherAI's 20-billion-parameter autoregressive language model trained using the GPT-NeoX library. Open-source model under the Apache 2.0 license, supporting research and production use with strong general-purpose capabilities across diverse tasks.
  Tags: Open Source, Research, Apache 2.0
- InternLM - Large language model from SenseTime and Shanghai AI Lab with 104 billion parameters trained on 1.6 trillion tokens. InternLM2 is available in 7B and 20B sizes with comprehensive capabilities in mathematics, code, dialogue, and creative writing.
  Tags: Chinese, Multilingual, Research
- InternLM 2 - Multilingual foundational language model developed by SenseTime and Shanghai AI Lab with 7B and 20B variants. Pre-trained on 1.6T tokens, offering strong Chinese and multilingual capabilities for research and commercial applications.
  Tags: Multilingual, Chinese, Research
- MiniMax (ABAB) - Chinese AI company's multimodal foundation models with a strong commercial presence. Part of the 'AI tigers' group, offering text, voice, and vision capabilities for enterprise and consumer applications across the Chinese market.
  Tags: Multimodal, Chinese, Commercial
- Moonshot AI (Kimi) - Chinese AI startup's long-context language model with exceptional memory capabilities, supporting extremely long conversations and document processing. Part of China's 'AI tigers' group with a strong commercial presence and advanced reasoning abilities.
  Tags: Long Context, Chinese, Commercial
- MPT (MosaicML Pretrained Transformer) - Family of open-source, commercially usable LLMs from MosaicML/Databricks. MPT-7B, trained on 1T tokens, matches LLaMA-7B quality, while MPT-30B is designed to deploy on a single NVIDIA A100 GPU.
  Tags: Open Source, Commercial, Efficient
- MPT-7B (MosaicML) - MosaicML's 7-billion-parameter transformer trained from scratch on 1T tokens of text and code. Open source with a commercial-use license, matching LLaMA-7B quality. Includes variants like MPT-7B-Chat, MPT-7B-Instruct, and MPT-7B-StoryWriter with 65K context.
  Tags: Commercial, Long Context, Code
- OPT (Open Pre-trained Transformer) - Meta's family of open-source models designed to replicate GPT-3 with a similar decoder-only architecture. Released to promote reproducibility and research in large language models.
  Tags: Meta, Open Source, Research
- OPT-175B (Meta) - Meta's 175-billion-parameter open-source model released for research, matching GPT-3 scale. Trained with a fully documented process and released with training logs, providing transparency into large-scale language model development.
  Tags: Large Scale, Research, Transparent
- RedPajama - An open reproduction of the LLaMA training dataset with 1.2 trillion tokens, created by following the LLaMA recipe to enable fully open-source models under an Apache license. Includes the RedPajama-INCITE model family.
  Tags: Open Data, Apache 2.0, Community
- RedPajama-INCITE - Open-source model family from Together AI trained on the 1.2 trillion token RedPajama dataset. Includes 3B and 7B variants (base, chat, instruct) under the Apache 2.0 license. RedPajama-3B outperforms GPT-Neo and Pythia-2.8B on benchmarks.
  Tags: Open Data, Apache 2.0, Commercial
- Stable LM 2 - Stability AI's multilingual language model family in 1.6B and 12B sizes trained on 2T tokens across seven languages (English, Spanish, German, Italian, French, Portuguese, Dutch), featuring strong tool usage and function calling capabilities.
  Tags: Multilingual, Tool Use, Efficient
- TigerBot - Open multilingual, multitask LLM family ranging from 7B to 180B parameters. Developed from Llama-2 and BLOOM, achieving a 6% performance gain in English and 20% in Chinese, with specialized domain data for finance, law, and encyclopedic knowledge.
  Tags: Chinese, Multilingual, Specialized
- XVERSE - Multilingual LLM from Shenzhen Yuanxiang Technology supporting 40+ languages with 8K context. XVERSE-65B is trained on 2.6 trillion tokens with 16K context, optimized for Chinese, English, Russian, and Spanish.
  Tags: Multilingual, Chinese, Large Scale
- XVERSE 2 - Shenzhen Yuanxiang Technology's latest open-source model series with strong performance on benchmarks. Features advanced reasoning and multilingual capabilities.
  Tags: Foundation, Multilingual, Reasoning, Open Source
- Yi - Bilingual language model series from 01.AI with strong performance on both Chinese and English tasks. Available in multiple sizes from 6B to 34B parameters, with variants optimized for different applications.
  Tags: Bilingual, Chinese, High Performance
- Yi 1.5 (01.AI) - 01.AI's open-source language model excelling at handling long sequences and maintaining contextual coherence. Known as a dark-horse model with strong coding capabilities and performance in long-context understanding tasks.
  Tags: Long Context, Coding, Apache 2.0
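Several of the entries above quote both a total and an active parameter count for sparse mixture-of-experts models. A minimal sketch, using only the figures quoted in those entries (in billions), shows how small the per-token active fraction is. Note that sparse activation reduces per-token compute, not memory: all expert weights must still be resident for inference.

```python
# Total vs. active parameters (in billions) as quoted in the entries above.
moe_models = {
    "DBRX": (132, 36),
    "Mixtral 8x22B": (141, 39),
    "Snowflake Arctic": (480, 17),
}

for name, (total_b, active_b) in moe_models.items():
    # Fraction of the weights actually consulted for any single token.
    ratio = active_b / total_b
    print(f"{name}: {active_b}B of {total_b}B active per token ({ratio:.0%})")
```

Snowflake Arctic's 17B-of-480B split is the most aggressive of the three, activating under 4% of its weights per token.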
- Code Llama - Meta's code-specialized variant of Llama 2, available in 7B, 13B, 34B, and 70B sizes. Includes a specialized Python variant and a long-context 100K token version for processing entire codebases.
  Tags: Coding, Meta, Specialized
- Code Llama (Meta) - Meta's specialized code generation model built on LLaMA 2, available in 7B, 13B, 34B, and 70B sizes. Features up to a 100K token context window, trained on 500B tokens of code, with variants for Python specialization and instruction following.
  Tags: Coding, Specialized, Long Context
- CodeT5+ - Salesforce's open code large language model family with an encoder-decoder architecture, achieving 35.0% pass@1 on HumanEval and surpassing OpenAI's code-cushman-001, with models from 220M to 16B parameters optimized for code understanding and generation.
  Tags: Encoder-Decoder, Code Understanding, Open Source
- DeepSeek-Coder - Code-specialized LLM series from DeepSeek with models up to 33B parameters. DeepSeek-Coder-V2 achieves 90% on LiveCodeBench and serves as the foundation for WizardCoder-33B's record-breaking 79.9% on HumanEval.
  Tags: Coding, Specialized, High Performance
- Devstral Coding Models - Mistral's specialized coding-focused models designed for software engineering tasks. Optimized for code generation, review, and understanding, with strong performance on programming benchmarks.
  Tags: Coding, Software Engineering, Specialized, Latest
- GLM-4.7 - Zhipu AI's frontier-level open-source model with a 200K context window and advanced reasoning. Achieves state-of-the-art performance on coding (73.8% on SWE-bench) with a MoE architecture optimizing computational efficiency without sacrificing depth.
  Tags: Coding, Reasoning, Mixture of Experts, Long Context, Software Engineering, Latest
- MiMo-V2-Flash - Efficient software engineering model that outperforms open-source LLMs like DeepSeek-V3.2 and Kimi-K2 on software engineering benchmarks with one-half to one-third the total parameters, demonstrating superior parameter efficiency.
  Tags: Software Engineering, Efficient, Latest
- MiniMax M2.5 - Top-performing model on SWE-bench Verified, achieving an 80.2% score, the highest among all models tested in 2026. Demonstrates exceptional software engineering capabilities on real GitHub issue resolution.
  Tags: Software Engineering, Latest, High Performance
- NousCoder-14B - Nous Research's competitive programming model fine-tuned from Qwen3-14B via reinforcement learning, achieving 67.87% on LiveCodeBench v6. Trained in 4 days using 48 B200 GPUs with a fully open-source RL stack.
  Tags: Coding, Software Engineering, Competitive Programming, Open Weights, Fully Open, Latest
- Qwen2.5-Coder - Alibaba's state-of-the-art open-source code model series (0.5B-32B parameters) trained on 5.5T tokens, achieving SOTA on HumanEval-Infilling and matching GPT-4o coding capabilities with the 32B variant, while supporting 92 programming languages.
  Tags: Coding, Multilingual Code, Instruct Tuned
- StarCoder - 15.5B parameter code model from BigCode trained on 80+ programming languages from The Stack. Features 8K context length, infilling capabilities, and fast large-batch inference enabled by multi-query attention.
  Tags: Coding, Multilingual, Open Source
- StarCoder (BigCode) - BigCode's 15.5-billion-parameter open-source code generation model trained on 1 trillion tokens from The Stack dataset. Supports 80+ programming languages with an 8K context window, developed through a collaboration of Hugging Face and ServiceNow.
  Tags: Coding, Open Source, Multilingual
- StarCoder2 - Next-generation code model from BigCode with improved architecture and enhanced performance over StarCoder. Continues support for 80+ programming languages with better code generation, understanding, and multilingual capabilities.
  Tags: Coding, Next-Gen, Multilingual
- StarCoder2 15B - BigCode's next-generation open-source code model with 15B parameters trained on 4+ trillion tokens from The Stack v2, supporting 600+ programming languages with a 16K context window and outperforming CodeLlama-34B on code reasoning.
  Tags: Open Source, Multilingual Code, Long Context
- WizardCoder - Code-specialized LLM using the Evol-Instruct methodology for code generation. WizardCoder-33B-V1.1 achieves 79.9 pass@1 on HumanEval, surpassing Anthropic's Claude and Google's Bard. Built from DeepSeek-Coder with exceptional programming capabilities.
  Tags: Coding, Evol-Instruct, High Performance
- CodeGemma - Google's lightweight open model collection for coding tasks including code completion, generation, and understanding. Built on the Gemma architecture with instruction-following capabilities for programming.
  Tags: Coding, Google, Lightweight, Open Source
- CodeGen - Salesforce Research's family of code generation models, with CodeGen 2.5 as the latest release. Trained on diverse programming data to support code completion, generation, and understanding tasks.
  Tags: Coding, Salesforce, Open Source
- SantaCoder - Efficient 1.1B parameter code model from BigCode trained on 236B tokens using multi-query attention. Supports Python, Java, and JavaScript with a 2K context window and achieves competitive scores for its size.
  Tags: Lightweight, Coding, Efficient
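Many of the rankings in this section are HumanEval or LiveCodeBench pass@1/pass@k scores. Pass@k is conventionally computed with the unbiased estimator introduced alongside HumanEval: generate n samples per problem, count the c that pass the unit tests, and estimate 1 - C(n-c, k) / C(n, k). A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: n samples generated, c of them correct."""
    if n - c < k:
        # Every possible size-k draw contains at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 200 samples per problem, 50 passing -> pass@1 is just the pass rate:
print(pass_at_k(200, 50, 1))  # 0.25
```

Per-problem estimates are then averaged over the benchmark's problems to give the headline score.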
- Gemma 3 (Google) - Google's third-generation family of lightweight open models built from Gemini technology, available in 2B, 7B, 9B, 12B, and 27B sizes. Gemma-3-27B beats the original Gemini 1.5-Pro across benchmarks, offering state-of-the-art performance for small language models.
  Tags: Efficient, Google, Multilingual
- H2O-Danube - H2O.ai's family of small language models (500M-4B parameters) trained on up to 6T tokens, achieving top rankings on the Hugging Face Open LLM Leaderboard for the 2B range and designed for efficient edge deployment on mobile devices.
  Tags: Edge Deployment, Mobile, Efficient
- Hermes 4 14B - Nous Research's lightweight 14B variant of the Hermes 4 family with hybrid reasoning. Delivers frontier-level performance in a small package for edge and resource-constrained deployments.
  Tags: Reasoning, Efficient, Hybrid Reasoning, Open Weights, Latest
- Ministral 3 14B - Mistral's 14B dense model with native reasoning, vision, and multilingual support. High performance for its size, with the best cost-to-performance ratio among open-source models.
  Tags: Reasoning, Multimodal, Apache 2.0, Efficient, Latest
- Ministral 3 8B - Mistral's efficient 8B model with native vision and multilingual capabilities. Balanced performance and resource consumption for local deployment.
  Tags: Efficient, Multimodal, Apache 2.0, Latest
- Mistral 3 14B - Mistral's powerful 14B dense model with native multimodal and reasoning capabilities. Achieves 85% on AIME 2025 in its reasoning variant and offers the best cost-to-performance ratio among open-source models.
  Tags: Efficient, Multimodal, Reasoning, Apache 2.0, Mathematics, Latest
- Mistral 3 8B - Mistral's state-of-the-art dense 8B model with native vision capabilities and multilingual support. Part of the Ministral 3 series, available in base, instruct, and reasoning variants under the Apache 2.0 license.
  Tags: Efficient, Multimodal, Multilingual, Apache 2.0, Reasoning, Latest
- OpenELM - Apple's family of efficient open-source language models (270M to 3B parameters) using a layer-wise scaling strategy, achieving 2.36% better accuracy than OLMo with 2x fewer pre-training tokens, optimized for on-device deployment.
  Tags: On-Device, Efficient, Layer-Wise Scaling
- Phi-3 - Microsoft's family of small language models (mini 3.8B, small 7B, medium 14B) released under the MIT license, with Phi-3-mini achieving performance comparable to models 10x its size and featuring variants with up to a 128K context window.
  Tags: MIT License, Efficient, High Quality
- Phi-4 (Microsoft) - Microsoft's 14-billion-parameter small language model released in January 2025 under the MIT license. Phi-4-reasoning-plus approaches full DeepSeek R1 performance on AIME 2025, with a 14B model rivaling a 671B model, making it the poster child for the small language model revolution.
  Tags: Efficient, Reasoning, MIT License
- Qwen3.5-27B - Alibaba's efficient 27B dense model from the Qwen3.5 series, supporting 800K+ context tokens. Offers performance comparable to Claude Sonnet 4.5 while maintaining exceptional efficiency for local deployment.
  Tags: Efficient, Multilingual, Long Context, Latest
- Qwen3.5-35B-A3B - Alibaba's efficient MoE model with 35B total parameters and 3B activated per token. The first in the Qwen3.5 series supporting tool calling and multimodal agentic capabilities under the Apache 2.0 license.
  Tags: Mixture of Experts, Efficient, Function Calling, Latest
- SmolLM2 - A family of compact, state-of-the-art small language models from Hugging Face available in 135M, 360M, and 1.7B parameter sizes, overtrained on 11 trillion tokens for exceptional on-device AI performance.
  Tags: On-Device, Lightweight, Open Source
- DeepHermes-3 3B - Nous Research's smallest 3B toggle-on reasoning model based on Llama 3.1, enabling hybrid reasoning on edge devices and mobile. Delivers unified intuitive and chain-of-thought capabilities in a tiny package.
  Tags: Reasoning, Efficient, Hybrid Reasoning, Lightweight, Latest
- Ministral 3B - Mistral's smallest dense model with native vision, multimodal, and multilingual support. Part of the Ministral 3 series for edge deployment under the Apache 2.0 license.
  Tags: Efficient, Lightweight, Multimodal, Apache 2.0, Latest
- Mistral 3 3B - Mistral's smallest and most efficient model in the 3B parameter range, featuring native multimodal capabilities and multilingual support. Optimized for resource-constrained edge deployment.
  Tags: Efficient, Lightweight, Multimodal, Apache 2.0, CPU Friendly, Latest
- Phi-2 - Microsoft's efficient Transformer-based LLM with 2.7 billion parameters, trained on 1.4 trillion tokens of synthetic data from GPT-3.5-turbo. Despite its size, it competes with models up to 25 times larger.
  Tags: Microsoft, Efficient, Synthetic Data
- StableLM - Stability AI's efficient language model, with the 1.6B variant trained on 2 trillion tokens and beating other sub-2B options. Designed for developers who need practical, working code quickly.
  Tags: Lightweight, Efficient, Open Source
- StableLM 3B - Stability AI's open-source language model with 3 billion parameters, trained on 1.5 trillion text tokens using an updated version of The Pile dataset. Released under CC BY-SA-4.0 for commercial use, achieving state-of-the-art performance at the 3B scale.
  Tags: Efficient, Open Source, Lightweight
- TinyLlama - A compact 1.1B parameter language model pretrained on 1 trillion tokens with the LLaMA architecture, achieving superior training efficiency at 24,000 tokens/sec per A100 GPU and outperforming OPT-1.3B and Pythia-1.4B.
  Tags: Efficient, Lightweight, Llama Compatible
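A quick sanity check for the on-device and edge claims in this section is weight memory: parameter count times bytes per weight. The sketch below is a back-of-the-envelope estimate only; the model sizes are the ones quoted above, the 2/1/0.5 bytes-per-parameter figures are the standard fp16/int8/int4 weight-only costs, and real deployments also need room for the KV cache and runtime overhead.

```python
# Bytes per parameter for common weight-only precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_gb(params_billion: float, precision: str) -> float:
    """GiB needed just to hold the weights at the given precision."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 2**30

for name, size_b in [("TinyLlama", 1.1), ("Phi-3-mini", 3.8), ("Phi-4", 14)]:
    cols = ", ".join(f"{p}: {weight_gb(size_b, p):.1f} GiB"
                     for p in BYTES_PER_PARAM)
    print(f"{name} ({size_b}B): {cols}")
```

By this measure an int4-quantized Phi-4 fits in roughly 6.5 GiB, which is why 14B is about the ceiling for typical consumer laptops and phones.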
- DeepSeek V3.2 - An advanced open-source LLM with 685B parameters featuring the DeepSeek Sparse Attention mechanism, supporting 128K context length. Achieves frontier-level reasoning performance comparable to GPT-5 and surpasses it on mathematical reasoning benchmarks.
  Tags: Reasoning, Mixture of Experts, Long Context, Latest, Open Weights
- DeepSeek-R1 - DeepSeek's reasoning-focused model released in early 2025, designed to excel at complex reasoning and problem-solving tasks. Part of DeepSeek's push into specialized reasoning capabilities alongside V3.
  Tags: Reasoning, Problem Solving, Latest
- Grok 2.5 - xAI's 270B open-source model released in August 2025, featuring advanced reasoning and real-time data integration. Now available on Hugging Face as 42 files totaling ~500GB, requiring 8 GPUs with 40GB of memory each.
  Tags: Reasoning, Mixture of Experts, High Performance, Open Weights, Latest
- Grok 3 - xAI's latest flagship model trained on 200,000 GPUs, described as 'an order of magnitude more capable' than Grok 2.5. Confirmed for open-source release around February 2026 by Elon Musk.
  Tags: Reasoning, High Performance, Latest
- Hermes 4 405B - Nous Research's flagship 405B model with hybrid reasoning capabilities. The maximum-capability variant of Hermes 4 for frontier-level performance and complex reasoning tasks.
  Tags: Reasoning, High Performance, Hybrid Reasoning, Open Weights, Latest
- Hermes 4 70B - Nous Research's 70B frontier-level open-weight model featuring hybrid reasoning via toggleable thinking. Trained on a massive dataset using DataForge and the Atropos RL framework, designed to rival ChatGPT and Claude.
  Tags: Reasoning, Instruction Tuned, Hybrid Reasoning, Open Weights, Latest
- Orca 2 (Microsoft) - Microsoft Research's reasoning-focused LLM fine-tuned from LLaMA 2, available in 7B and 13B sizes. Teaches various reasoning strategies and achieves performance similar to models 5-10x larger on complex reasoning tasks through an innovative training methodology.
  Tags: Reasoning, Microsoft, Efficient
- Qwen3-235B-A22B - Alibaba's flagship mixture-of-experts model with 235B total parameters and 22B activated parameters. Features a hybrid thinking mode for complex reasoning, math, and coding, plus a non-thinking mode for efficient dialogue. Supports 100+ languages with native MCP.
  Tags: Reasoning, Mixture of Experts, Multilingual, Thinking Mode, Function Calling, Latest
- QwQ-32B - Alibaba's specialized 32B reasoning model delivering high performance on complex problem-solving with significantly reduced compute requirements. Achieves performance comparable to DeepSeek-R1 (671B) while using roughly one-twentieth the parameters.
  Tags: Reasoning, Mathematics, Efficient, Open Weights, Problem Solving, Latest
- DeepHermes-3 24B - Nous Research's 24B variant based on Mistral, combining toggle-on reasoning with a balanced parameter count. Ideal for mid-range inference with hybrid reasoning capabilities.
  Tags: Reasoning, Efficient, Hybrid Reasoning, Latest
- DeepHermes-3 8B - Nous Research's first toggle-on reasoning model based on Llama 3.1 8B, unifying intuitive responses with long chain-of-thought reasoning. Switchable via system prompt for flexible reasoning depth.
  Tags: Reasoning, Hybrid Reasoning, Efficient, Latest
- Orca 2 - Microsoft's reasoning-focused language model based on Llama 2, trained on a censored synthetic dataset using Reinforcement Learning from AI Feedback (RLAIF). Designed to excel in reasoning tasks and teach smaller models to reason effectively.
  Tags: Microsoft, Reasoning, Synthetic Data
- GLM-4 - Open multilingual multimodal chat model from Zhipu AI with exceptional coding and reasoning capabilities. GLM-4.7 achieves 94.2 on HumanEval and 95.7 on AIME 2025, making it one of the most well-rounded open-source models available. (Read more)
MultilingualMultimodalCoding - Llama 3.2 Vision - Meta's multimodal large language models in 11B and 90B sizes supporting text and image inputs, pretrained on 6B image-text pairs and competitive with Claude 3 Haiku and GPT-4o-mini on visual reasoning tasks. (Read more)
Vision LanguageMultimodalImage Reasoning - Llama 4 Maverick - Meta's Llama 4 flagship with 17B active parameters across 128 MoE experts (roughly 400B total). An open-weight, natively multimodal mixture-of-experts model delivering high-performance text and image understanding at massive parameter scale. (Read more)
MultimodalMixture Of ExpertsHigh PerformanceLatestMeta - Llama 4 Scout - Meta's 17B active parameter model with 16 MoE experts (109B total). Natively multimodal open-weight model with a 10M token context window, trained on roughly 40 trillion tokens and competitive with much larger models on many benchmarks. (Read more)
MultimodalMixture Of ExpertsLong ContextLatestMeta - LLaVA (Large Language-and-Vision Assistant) - End-to-end trained large multimodal model connecting vision encoder and LLM for visual and language understanding. LLaVA-OneVision-1.5 achieves state-of-the-art performance on native-resolution images with comparatively lower training costs. (Read more)
MultimodalVision LanguageOpen Source - Molmo - State-of-the-art open-source multimodal vision-language model from Allen AI, with the 72B variant outperforming Gemini 1.5 Pro and Claude 3.5 Sonnet on academic benchmarks while featuring unique pixel-pointing capabilities. (Read more)
Vision LanguageMultimodalPointing - PaliGemma - Google's open vision-language model combining SigLIP (image encoder) and Gemma-2B (text decoder). Excels at image captioning, visual QA, text understanding in images, object detection, and segmentation. (Read more)
MultimodalVision LanguageGoogleOpen Source - Pixtral 12B - Mistral AI's first multimodal model with 12B parameters, combining vision and language understanding. Significantly outperforms other open-source multimodal models like Qwen2-VL 7B and LLaVA-OneVision 7B in instruction following and visual understanding tasks. (Read more)
Vision LanguageMultimodalMistral - Fuyu - Adept's multimodal model with novel architecture eliminating image encoding stage. Fuyu-8B processes images directly alongside text, enabling simpler architecture and better fine-tuning capabilities for vision-language tasks. (Read more)
MultimodalVision LanguageNovel Architecture - StepFun (Step-1) - Chinese AI startup specializing in multimodal models with strong vision-language capabilities. Part of the 'AI tigers' group, focusing on advanced visual understanding and reasoning for enterprise and consumer applications in the Chinese market. (Read more)
MultimodalChineseVision Language
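Fuyu's encoder-free design above comes down to slicing the raw image into patches and projecting each flattened patch directly into the transformer's token stream, with no separate vision encoder. A toy sketch of the patch-extraction step (the nested-list "image" and sizes are illustrative only):

```python
def patchify(image, patch):
    """Split an H x W image (nested lists) into flattened patch vectors,
    row-major, the way patch-as-token models feed images to a transformer."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0
    tokens = []
    for py in range(0, h, patch):
        for px in range(0, w, patch):
            tokens.append([image[py + dy][px + dx]
                           for dy in range(patch) for dx in range(patch)])
    return tokens

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
tokens = patchify(img, 2)  # four 2x2 patches, each flattened to 4 values
```

In the real model each flattened patch then passes through a linear projection into the embedding space, so image and text tokens share one sequence.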
- Falcon Mamba 7B - TII's first open-source State Space Language Model (SSLM) with 7B parameters, using Mamba architecture for linear-time sequence processing, achieving 62.03% on ARC and outperforming Llama 3.1 8B while fitting on a single A10 24GB GPU. (Read more)
State Space ModelMambaEfficient - Jamba 1.5 - AI21 Labs' hybrid SSM-Transformer architecture with 256K context window, 52B parameters with only 12B activated. Delivers 3x throughput on long contexts while maintaining quality in a hyper-efficient package. (Read more)
Novel ArchitectureLong ContextMixture Of ExpertsEfficientLatest - Jamba 2 - AI21 Labs' open-source family of language models built for maximum reliability and steerability in enterprise. Continuing the hybrid SSM-Transformer architecture with enhanced capabilities. (Read more)
Novel ArchitectureEnterpriseOpen WeightsLatest - Mamba - Selective state space model achieving linear-time sequence modeling with selective state spaces. Mamba-3 combines expressive recurrence with multi-input, multi-output formulation, competing with Transformers while maintaining O(n) complexity. (Read more)
State SpaceEfficientResearch - Moe-Mamba Hybrid Models - Hybrid architecture models combining Mamba SSM (State Space Models) with Mixture-of-Experts for efficient long-context processing. Emerging frontier in alternative architectures. (Read more)
Novel ArchitectureState SpaceLong ContextEfficient - OLMoE - Fully open-source mixture-of-experts language model jointly developed by Allen Institute for AI and Contextual AI, featuring 1B active and 7B total parameters with 2x faster training than equivalent dense models. (Read more)
Mixture Of ExpertsEfficientFully Open - RecurrentGemma-2B - Google DeepMind's 2B parameter model using the novel Griffin architecture combining linear recurrences with local attention. Achieves comparable performance to Gemma-2B while using less memory and enabling faster long-sequence inference. (Read more)
Novel ArchitectureEfficientLong ContextGoogleLightweight - RWKV - Receptance Weighted Key Value model with RNN-like architecture enabling infinite sequence length while maintaining transformer-level performance. Available in sizes from 169M to 14B parameters, trained on The Pile with linear computational complexity. (Read more)
Rnn Transformer HybridEfficientLong Context - SOLAR - Upstage's 10.7B parameter language model using the Depth Up-Scaling (DUS) technique to efficiently create larger models from smaller ones. SOLAR 10.7B achieves performance competitive with much larger models through innovative architectural scaling. (Read more)
EfficientInnovativeKorean
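The state-space and recurrent models above (Falcon Mamba, Mamba, RWKV, RecurrentGemma) share one core idea: replace attention with a recurrence whose per-token cost is constant, so a full sequence is processed in linear time. A toy one-dimensional SSM scan, assuming scalar parameters for clarity (real models use learned, input-dependent matrices and parallel scan algorithms):

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Linear-time scan of a 1-D state space model:
    h_t = a*h_{t-1} + b*x_t,  y_t = c*h_t."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x   # O(1) state update per token
        ys.append(c * h)
    return ys
```

Each step touches only the fixed-size state `h`, versus attention's O(t) lookback over all previous tokens, which is why these architectures handle long contexts with modest memory.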
- BioGPT - Microsoft's domain-specific generative transformer language model pretrained on large-scale biomedical literature, achieving human-expert-level performance with 78.2% accuracy on PubMedQA and new records on relation extraction tasks. (Read more)
BiomedicalScientificHealthcare - Chronos - Amazon's family of pretrained time series forecasting foundation models based on language model architectures, featuring Chronos-2 (120M parameters) with state-of-the-art zero-shot accuracy and 600M+ downloads from Hugging Face. (Read more)
Time SeriesForecastingZero Shot - IBM Granite - Enterprise-focused language model family from IBM with specialized variants for language, code, time series, and geospatial data. Granite 4.0 features hybrid Mamba/Transformer architecture reducing memory by 70% for long-context inference. (Read more)
EnterpriseApache 2.0Hybrid Architecture - Llemma-34B - EleutherAI's larger mathematical reasoning model variant, fine-tuned from Code Llama on Proof-Pile-2. Enhanced performance for complex mathematical problems, proofs, and computational tasks. (Read more)
MathematicsReasoningSpecializedOpen Source - Llemma-7B - EleutherAI's open mathematical reasoning model, fine-tuned from Code Llama on Proof-Pile-2 dataset. Outperforms Code Llama-34B on math benchmarks and supports computational tools for problem-solving. (Read more)
MathematicsReasoningSpecializedOpen Source - NuminaMath-7B - Project Numina's open-source 7B model fine-tuned for competition-level mathematics using tool-integrated reasoning (TIR). Won AIMO Progress Prize with score of 29/50, capable of AMC 12 level problems. (Read more)
MathematicsSpecializedReasoningOpen WeightsCompetitive - SQLCoder - Specialized SQL generation model from Defog achieving state-of-the-art text-to-SQL performance. SQLCoder-70B outperforms GPT-4 on SQL generation benchmarks, making it the leading open-source model for database queries. (Read more)
SqlSpecializedHigh Performance - Baichuan - Open-source LLM series from Baichuan Intelligence with strong focus on domain-specific applications in law, finance, medicine, and classical Chinese literature. Baichuan 4 is the premier Chinese open-source model for specialized domains. (Read more)
ChineseSpecializedEnterprise - MedGemma - Google's specialized Gemma model for medical image and text comprehension. Part of Gemma family, optimized for healthcare and biomedical applications. (Read more)
SpecializedMedicalGoogleMultimodal - ShieldGemma - Google's instruction-tuned safety and content moderation model built on Gemma 2, targeting four harm categories. Can evaluate both user inputs and model outputs for safety compliance. (Read more)
Safety AlignedGoogleContent ModerationOpen Source - TxGemma - Google's specialized Gemma model for therapeutics development. Optimized for drug discovery, protein analysis, and biomedical research applications. (Read more)
SpecializedResearchGoogleBiomedical - WizardMath - Mathematics-specialized LLM using Evol-Instruct methodology adapted for mathematical problems. Achieves exceptional performance on math benchmarks through progressive problem complexity evolution and high-quality mathematical reasoning training. (Read more)
MathematicsReasoningSpecialized
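NuminaMath's tool-integrated reasoning (TIR) above interleaves generation with code execution: the model emits Python, a harness runs it, and the result is fed back into the context before the model continues. A minimal sketch with a scripted stand-in for the model (the `CODE:`/`ANSWER:` protocol and `tir_solve` helper are invented for illustration; a real harness uses a proper sandbox, not `eval`):

```python
def tir_solve(question, generate, max_rounds=3):
    """Tool-integrated reasoning loop (sketch): the model alternates between
    emitting Python expressions and reading back their evaluated results.
    `generate` stands in for the LLM."""
    transcript = question
    for _ in range(max_rounds):
        step = generate(transcript)
        if step.startswith("CODE:"):
            # toy evaluator -- a real harness executes code in a sandbox
            result = eval(step[len("CODE:"):].strip(), {"__builtins__": {}})
            transcript += f"\n{step}\nRESULT: {result}"
        else:
            return step  # model produced a final answer
    return None

# scripted stand-in "model": first asks the tool, then answers
script = iter(["CODE: 12 * 34", "ANSWER: 408"])
answer = tir_solve("What is 12 * 34?", lambda transcript: next(script))
```

Offloading arithmetic and algebra to an interpreter is what lets a 7B model like NuminaMath compete on problems where pure next-token prediction tends to slip.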
- Amber (LLM360) - The first fully transparent 7B English language model from the LLM360 initiative, released with all 360 intermediate training checkpoints, complete training data, metrics, and source code under Apache 2.0 license. (Read more)
Fully TransparentReproducibleResearch - OLMo 3 - Allen Institute for AI's latest open language model focusing on interpretability and transparency. Provides complete training transparency including data, code, and detailed documentation. (Read more)
ResearchInterpretabilityTransparentOpen ScienceLatest - Cerebras-GPT - Family of seven GPT models (111M-13B parameters) trained using the Chinchilla formula on Cerebras CS-2 wafer-scale systems, released under Apache 2.0 license as the first Chinchilla-optimal models available open-source. (Read more)
Chinchilla OptimalApache LicensedCompute Efficient - Open Reasoning Tasks - Nous Research's comprehensive repository of reasoning tasks for LLMs. Provides benchmark suite and methodology for developing reasoning-focused language models. (Read more)
ReasoningResearchBenchmarkOpen Source - Pythia - A suite of LLMs ranging from 70M to 12B parameters trained on The Pile, designed to facilitate research on how language models learn and evolve during training. Fully open with all checkpoints released. (Read more)
ResearchFully OpenInterpretability - Pythia (EleutherAI) - EleutherAI's first LLM suite designed specifically for scientific research on learning dynamics. Features 154 checkpoints saved throughout training across multiple model sizes, enabling detailed study of how knowledge develops during training. (Read more)
ResearchInterpretabilityOpen Science
- vLLM - A high-throughput and memory-efficient inference and serving engine for LLMs, originally developed at UC Berkeley, offering up to 24x performance improvements through innovative PagedAttention technology. (Read more)
InferenceServingOptimization - GPT4All - Nomic AI's ecosystem for running LLMs locally on consumer hardware, featuring optimized inference engine and collection of open-source models. Enables private, offline AI with CPU-friendly execution and easy-to-use desktop application. (Read more)
Local InferencePrivacyCpu Friendly
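vLLM's PagedAttention gains come from managing the KV cache like virtual memory: storage is carved into fixed-size blocks, and a per-sequence block table maps logical token positions to physical blocks, so memory is claimed on demand instead of being preallocated for the maximum length. A toy allocator sketching that bookkeeping (the class and sizes are illustrative, not vLLM's internals):

```python
class PagedKVCache:
    """Toy block allocator in the spirit of PagedAttention: each sequence's
    KV cache lives in fixed-size blocks, claimed only as tokens arrive."""
    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))   # pool of physical block ids
        self.tables = {}    # seq_id -> list of physical block ids
        self.lengths = {}   # seq_id -> tokens stored so far

    def append_token(self, seq_id):
        table = self.tables.setdefault(seq_id, [])
        n = self.lengths.get(seq_id, 0)
        if n % self.block_size == 0:          # current block full: grab another
            table.append(self.free.pop())
        self.lengths[seq_id] = n + 1

    def blocks_used(self, seq_id):
        return len(self.tables.get(seq_id, []))

cache = PagedKVCache(num_blocks=8, block_size=4)
for _ in range(5):
    cache.append_token("req-0")   # 5 tokens need 2 blocks of size 4
```

Because waste is bounded by one partially filled block per sequence, far more concurrent requests fit in the same GPU memory, which is the main source of vLLM's throughput advantage.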
- Athene-V2 72B - Nexusflow's open-source 72B model fine-tuned from Qwen 2.5 72B, among the top-ranked open models on Chatbot Arena. Athene-V2-Chat excels in chat, math, and coding, while Athene-V2-Agent surpasses GPT-4o in function calling and agentic applications. (Read more)
Instruction TunedReasoningFunction CallingLatestOpen Weights - Airoboros - Instruction-tuned model series using self-generated synthetic training data via LLM bootstrapping. Features context-obedient question answering, creative writing capabilities, and strong function calling support. (Read more)
Instruction TunedSynthetic DataFunction Calling - Alpaca - Stanford's instruction-following model fine-tuned from LLaMA 7B using 52K instruction-following demonstrations generated by GPT-3.5. Pioneered the use of synthetic data from stronger models for instruction-tuning smaller models. (Read more)
Instruction TunedSynthetic DataHistoric - BLOOMZ - Instruction-tuned variant of BLOOM with 176B parameters, fine-tuned on multilingual tasks using xP3 dataset. Supports instruction-following across 46 languages, making it ideal for global multilingual applications. (Read more)
MultilingualInstruction TunedLarge Scale - Dolly - Databricks' instruction-following LLM demonstrating that high-quality instruction-tuning can be achieved with relatively small, curated datasets. Dolly 2.0 trained on 15K human-generated instruction-response pairs. (Read more)
Instruction TunedHuman AnnotatedDatabricks - FLAN-T5 - Google's instruction-tuned T5 encoder-decoder model family fine-tuned on 1,800+ tasks from the FLAN 2022 Collection, achieving strong few-shot performance comparable to PaLM 62B while being fully open-source and commercially usable. (Read more)
Instruction TunedEncoder DecoderZero Shot - Guanaco - Efficient instruction-tuned model using QLoRA (Quantized Low-Rank Adaptation) for parameter-efficient fine-tuning. Demonstrates that 4-bit quantized models can be fine-tuned effectively, reducing memory requirements significantly. (Read more)
EfficientInstruction TunedQlora - Manticore - Instruction-tuned model series from OpenAccess AI Collective focused on roleplay, creative writing, and multi-turn conversations. Features strong character consistency and engaging narrative capabilities. (Read more)
RoleplayingCreativeConversational - Nous Capybara - Multi-turn conversational model from Nous Research with extended context support. Trained on high-quality multi-turn datasets for improved dialogue coherence and context retention across long conversations. (Read more)
ConversationalLong ContextMulti Turn - Nous Hermes - High-performing instruction-tuned model series from Nous Research, featuring strong reasoning and long-form output capabilities. Nous Hermes 2 variants available in multiple sizes based on Mixtral, Llama, and Yi foundations. (Read more)
Instruction TunedReasoningLong Context - OpenChat - Open-source language model series optimized for conversational AI using innovative C-RLFT (Conditioned Reinforcement Learning Fine-Tuning) methodology. OpenChat 3.5 serves as the foundation for Starling and achieves strong performance on dialogue benchmarks. (Read more)
ConversationalOpen SourceHigh Performance - OpenHermes - Mistral-based instruction-tuned model trained on 1 million entries of primarily GPT-4 generated data. OpenHermes 2.5 Mistral 7B achieves strong performance on instruction-following and code generation tasks. (Read more)
Instruction TunedMistral BasedSynthetic Data - OpenOrca - Large-scale instruction dataset and model series replicating Microsoft's Orca paper, featuring progressive learning from GPT-4 and GPT-3.5 explanations. OpenOrca dataset contains 4.2M+ augmented instructions for training instruction-following models. (Read more)
Instruction TunedSynthetic DataReasoning - Platypus - Efficient fine-tuned model family using curated Open-Platypus dataset, achieving strong performance with just 25K carefully selected examples. Demonstrates that dataset quality and curation matter more than quantity for fine-tuning. (Read more)
EfficientCurated DataHigh Performance - Samantha - Companion-focused instruction-tuned model designed for engaging, empathetic, and helpful interactions. Features strong conversational abilities, emotional intelligence, and personality-driven responses for assistant and companion applications. (Read more)
ConversationalEmpatheticAssistant - Starling - Starling-7B-alpha is fine-tuned from OpenChat 3.5 (itself a C-RLFT refinement of Mistral 7B) using Reinforcement Learning from AI Feedback (RLAIF) on the Nectar preference dataset. At release it ranked first among 7B instruction-tuned models on MT-Bench. (Read more)
Instruction TunedReasoningConversational - Zephyr - Fine-tuned series of Mistral and Mixtral models trained to act as helpful assistants using Direct Preference Optimization (DPO). Zephyr-7B-α achieved state-of-the-art results for 7B-class models at release. (Read more)
Instruction TunedDpoConversational
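Zephyr's Direct Preference Optimization trains directly on preference pairs without a separate reward model: for each (chosen, rejected) pair the loss is -log σ(β[(log πθ(y_w) − log π_ref(y_w)) − (log πθ(y_l) − log π_ref(y_l))]). A pure-Python sketch of that per-pair loss, using toy scalar log-probabilities (real training sums token-level log-probs from the policy and a frozen reference model):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * (chosen margin - rejected margin)),
    where each margin is policy log-prob minus reference log-prob."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference exactly the loss is log 2; it falls as the policy shifts probability toward the chosen response relative to the rejected one, which is the whole optimization signal.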
- Hermes 3 (Nous Research) - Nous Research's latest general-use model with advanced long-term context retention, multi-turn conversation capability, complex roleplaying abilities, and enhanced agentic function-calling. Available in multiple sizes based on LLaMA architecture. (Read more)
ConversationalFunction CallingRoleplaying - Dolly 2.0 (Databricks) - Databricks' fully open-source instruction-following LLM built on EleutherAI's Pythia-12B, fine-tuned with 15,000 human-generated instruction-response pairs. First truly open instruction-tuned model with commercially viable licensing and open training data. (Read more)
CommercialHuman AnnotatedFully Open - OpenAssistant - Community-driven open-source chat-based assistant from LAION-AI with 161,443 messages in 35 languages, created by 13,500+ volunteers. Provides conversation trees with quality ratings, enabling research on assistant-style dialogue and model alignment. (Read more)
Community DrivenMultilingualConversational - Stanford Alpaca - Stanford's instruction-tuned model built on LLaMA-7B, fine-tuned on 52,000 instruction-following examples generated using Self-Instruct with GPT-3.5. Demonstrated that high-quality instruction-tuned models could be created cost-effectively using synthetic data. (Read more)
Instruction TunedSynthetic DataFine Tuned - Vicuna - Open-source chatbot built on LLaMA (7B and 13B) fine-tuned on 70,000 user-shared ChatGPT conversations from ShareGPT. Achieves 90%+ ChatGPT quality according to GPT-4 evaluations, representing early success in instruction-tuned open models. (Read more)
ChatbotInstruction TunedFine Tuned - WizardLM - Microsoft and Peking University's open-source LLM trained using the Evol-Instruct methodology to automatically generate complex instructions. WizardLM-2 is available in 8x22B (Mixtral-based), 70B, and 7B variants with advanced instruction-following capabilities. (Read more)
Instruction TunedEvol InstructMicrosoft - Zephyr 7B - HuggingFace H4's instruction-tuned chatbot model based on Mistral-7B, trained using Direct Preference Optimization (DPO) on public synthetic datasets. Available in Alpha and Beta versions, demonstrating effective alignment through DPO methodology. (Read more)
ChatbotDpoMistral Based
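WizardLM's Evol-Instruct grows a complex instruction set by repeatedly rewriting seed instructions with "evolution" prompts (add constraints, deepen reasoning, concretize, broaden the domain). A sketch of chaining such prompts; the templates below are paraphrases invented for illustration, not the paper's exact wording, and a real pipeline sends each evolved prompt to a teacher model for the actual rewrite:

```python
import random

# Illustrative evolution operations (paraphrased, not the paper's prompts)
EVOLUTIONS = {
    "add_constraints": "Rewrite the task adding one more constraint: {instruction}",
    "deepen": "Rewrite the task to require deeper multi-step reasoning: {instruction}",
    "concretize": "Replace general concepts with specific ones in: {instruction}",
    "breadth": "Create a rarer task in the same domain, inspired by: {instruction}",
}

def evolve(instruction, rounds=3, rng=random):
    """Chain evolution prompts; each round's output would normally be the
    teacher model's rewrite (here we keep the prompt itself as a stand-in)."""
    lineage = [instruction]
    for _ in range(rounds):
        op = rng.choice(list(EVOLUTIONS))
        lineage.append(EVOLUTIONS[op].format(instruction=lineage[-1]))
    return lineage

lineage = evolve("Sort a list of numbers.", rounds=2, rng=random.Random(0))
```

Each round compounds difficulty, which is how the method turns a small seed set into a curriculum of progressively harder instructions.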
All product names, logos, and brands are the property of their respective owners. All company, product, and service names used in this repository, related repositories, and associated websites are for identification purposes only. The use of these names, logos, and brands does not imply endorsement, affiliation, or sponsorship.
This directory may include content generated by artificial intelligence (AI). While efforts have been made to ensure the accuracy and reliability of the information, we make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information contained herein. Users are advised to independently verify the information before making decisions based on it.
We disclaim any responsibility for errors, omissions, or inaccuracies in the content, whether generated by humans, AI, or any other means. By using this directory, you agree to use it at your own risk and acknowledge that the information provided may not always be current or accurate.
If you believe that your intellectual property rights or other legal rights have been infringed, please contact us immediately at legal@ever.co and we will take appropriate action.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
