Skip to content

Latest commit

 

History

History
72 lines (52 loc) · 1.6 KB

File metadata and controls

72 lines (52 loc) · 1.6 KB

Overview

DeepHermes-3 3B is the most compact variant of Nous Research's toggle-on reasoning model, delivering remarkable reasoning capabilities in a 3-billion-parameter package suitable for edge and mobile deployment.

Model Specifications

  • Parameters: 3 billion
  • Base: Llama 3.1 3B
  • Architecture: Transformer with reasoning tuning
  • Status: Preview release

Dual-Mode Reasoning

Fast Mode

  • Intuitive responses
  • Immediate answers
  • Minimal latency
  • Conversational style

Reasoning Mode

  • Extended chain of thought
  • Deep analysis
  • Step-by-step solving
  • Improved accuracy

Deployment Scenarios

  • Smartphone and tablet
  • IoT devices
  • Single-device local inference
  • Privacy-preserving applications
  • Offline-first systems
  • Battery-limited devices

Performance for Size

  • Remarkable capability for 3B parameters
  • Effective reasoning despite small size
  • Competitive with larger models on some tasks
  • Efficient inference

Use Cases

  • Mobile AI applications
  • Edge device deployment
  • Privacy-focused local inference
  • Offline applications
  • Resource-constrained systems
  • Educational demonstrations

Quantization Support

  • 4-bit quantization
  • 8-bit quantization
  • Further memory reduction
  • On-device deployment

Licensing Considerations

Based on Llama 3.1, organizations with 700M+ monthly active users require Meta approval for commercial use.

Inference Requirements

  • Minimal memory footprint
  • Sub-second latency possible
  • Single GPU or CPU deployment
  • Mobile GPU support
  • Battery-efficient

Community

Part of DeepHermes-3 family with various size variants.