Vicuna is an open-source chatbot model created by fine-tuning LLaMA on user-shared conversations from ShareGPT. Developed by UC Berkeley, CMU, Stanford, and UC San Diego researchers (LMSYS Org), Vicuna demonstrated that high-quality instruction-following models could be created through fine-tuning open base models.
- Base Models: LLaMA-7B and LLaMA-13B
- Fine-tuning Data: 70,000 user-shared ChatGPT conversations
- Data Source: ShareGPT repository
- Model Type: Instruction-tuned conversational model
- Achieves 90%+ of ChatGPT quality (GPT-4 evaluation)
- Strong conversational abilities
- Effective instruction following
- Cost-effective to create and deploy
- Open-source alternative to proprietary chatbots
- Multiple size options (7B, 13B)
- 90%+ of ChatGPT quality, with GPT-4 as the judge
- Outperforms LLaMA and Stanford Alpaca in 90%+ of cases
- Performance competitive with GPT-3.5
- Strong multi-turn dialogue capabilities
- Superior to base LLaMA on conversational tasks
- Better than Alpaca on complex instructions
- Effective across diverse conversation types
- 70,000 conversations from ShareGPT
- Real user interactions with ChatGPT
- Diverse topics and conversation styles
- High-quality instruction-response pairs
- Supervised fine-tuning on conversational data
- Optimization for multi-turn dialogue
- Training to follow instructions effectively
- Relatively low-cost fine-tuning process
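The supervised fine-tuning step flattens each ShareGPT conversation into a single training sequence using a fixed chat template. The sketch below illustrates that formatting; the system prompt and the `USER:`/`ASSISTANT:` role tags mirror the style of Vicuna's later templates but are illustrative, since the exact template varied across versions:

```python
# Sketch of ShareGPT-style conversation formatting for supervised
# fine-tuning. The system prompt and role tags are illustrative only;
# the exact template differs between Vicuna versions.

SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed answers.")

def format_conversation(turns):
    """turns: list of {"from": "human"|"gpt", "value": str},
    the shape used in ShareGPT-style conversation dumps."""
    parts = [SYSTEM]
    for t in turns:
        role = "USER" if t["from"] == "human" else "ASSISTANT"
        parts.append(f"{role}: {t['value']}")
    # During training, the loss is typically masked so that only the
    # assistant's tokens contribute (an implementation detail not shown).
    return "\n".join(parts) + "\n"

example = [
    {"from": "human", "value": "What is Vicuna?"},
    {"from": "gpt", "value": "An open-source chatbot fine-tuned from LLaMA."},
]
text = format_conversation(example)
```

Multi-turn conversations are handled the same way: every human/assistant exchange is appended to the sequence, which is how the model learns to condition on full dialogue history.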
- Based on LLaMA-7B
- More efficient, suitable for resource-constrained scenarios
- Good performance for size
- Based on LLaMA-13B
- Better quality and reasoning
- More capable in complex conversations
- Flagship variant
- Demonstrated viability of instruction tuning
- Showed that high-quality data matters more than quantity
- Inspired wave of open-source chat models
- Proved open models could approach proprietary quality
- Popularized ShareGPT as training data source
- Validated fine-tuning approach for chat models
- Contributed to development of evaluation methodologies
- Inspired subsequent models (many used similar approaches)
- Used GPT-4 to evaluate response quality
- Comparison with ChatGPT and other models
- Assessment of instruction following
- Analysis of multi-turn conversations
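The GPT-4-as-judge pattern amounts to prompting a judge model to score two answers to the same question, then parsing its reply. A minimal sketch of that loop follows; the prompt wording and the 1-10 scale are illustrative (Vicuna's published evaluation used its own prompt), and the actual judge API call is left out:

```python
# Sketch of the GPT-4-as-judge evaluation pattern: present a question and
# two candidate answers, ask for numeric scores, and parse them from the
# reply. Prompt wording and scale are illustrative assumptions.
import re

JUDGE_PROMPT = """\
[Question]
{question}

[Assistant 1's Answer]
{answer_a}

[Assistant 2's Answer]
{answer_b}

Rate each assistant's helpfulness, relevance, accuracy, and level of
detail on a scale of 1 to 10. First output the scores as "Scores: X, Y",
then explain your reasoning."""

def parse_scores(judge_reply: str):
    """Extract the two numeric scores from the judge model's reply."""
    m = re.search(r"Scores:\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)",
                  judge_reply)
    if not m:
        raise ValueError("judge reply did not contain parseable scores")
    return float(m.group(1)), float(m.group(2))

# Example: parsing a hypothetical judge reply.
a, b = parse_scores("Scores: 8, 9\nAssistant 2's answer is more detailed.")
```

Position bias is a known concern with this setup, so pairwise judgments are often run twice with the answer order swapped.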
- Self-hosting on consumer or enterprise GPUs
- Relatively low resource requirements (7B variant)
- Compatible with standard frameworks
- Integration with chatbot interfaces
- Available through Hugging Face
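Self-hosting through Hugging Face can be sketched as below. The model ID `lmsys/vicuna-7b-v1.5` is one published checkpoint; the weights download on first use, and a GPU with roughly 14 GB of memory is assumed for fp16 inference:

```python
# Sketch of self-hosting Vicuna with Hugging Face transformers.
# "lmsys/vicuna-7b-v1.5" is one published checkpoint; adjust the model
# ID for other variants. The heavy import is done lazily so the sketch
# can be read without the library installed.

def generate(prompt: str, model_id: str = "lmsys/vicuna-7b-v1.5") -> str:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

For production serving, the LMSYS FastChat project provides a ready-made server and the matching conversation templates, which is usually preferable to hand-rolling the prompt format.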
- Conversational AI applications
- Research on instruction-tuned models
- Educational chatbot applications
- Cost-effective alternatives to proprietary APIs
- Foundation for further fine-tuning
- Academic research and development
Developed by researchers from:
- UC Berkeley
- Carnegie Mellon University (CMU)
- Stanford University
- UC San Diego
LMSYS Org continues to advance open-source AI through projects like Chatbot Arena.
While newer models have surpassed Vicuna, it:
- Remains historically significant
- Demonstrated an approach still used today
- Showed a path forward for open-source chatbots
- Proved quality could be achieved with modest resources
- Inspired an ecosystem of instruction-tuned models
- Leveraged Meta's strong base model
- Demonstrated value of base model + fine-tuning
- Both built on LLaMA
- Vicuna used conversational data (ShareGPT)
- Alpaca used synthetic data (Self-Instruct)
- Vicuna showed superior conversational abilities
- Many models adopted ShareGPT-style training
- Validation of GPT-4-as-judge methodology
- Template for open-source chat model development
Inherits license restrictions from the base LLaMA model:
- Research and educational use
- License terms evolved with LLaMA versions
- Check current licensing for specific variants