models: Local documentation
This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B for local inference. Optimized models are published here in ONNX format to run on CPU, GPU, and NPU across devices, including server platforms, Windows, Linux, and Mac desktops, and mobile CPUs, with the precision best suited to e...
deepseek-r1-distill-qwen-1.5b-cuda-gpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on CUDA GPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-...

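The entries on this page all state that they use RTN quantization. As a rough illustration of what that means, the sketch below implements round-to-nearest (RTN) quantization of a weight tensor to 4-bit integer levels with a single per-tensor scale. This is a simplified, hypothetical example, not the exact recipe used for these models, which may use different bit widths, per-block scales, or asymmetric grids:

```python
import numpy as np

def rtn_quantize(w, bits=4):
    """Round-to-nearest (RTN) symmetric quantization of a weight tensor.

    Each value is snapped to the nearest level on a uniform integer grid.
    No calibration data is needed, which is what makes RTN cheap.
    """
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit
    scale = np.max(np.abs(w)) / qmax      # per-tensor scale (illustrative;
                                          # real recipes often use per-block scales)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def rtn_dequantize(q, scale):
    # Recover approximate float weights for inference.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = rtn_quantize(w)
w_hat = rtn_dequantize(q, scale)
```

Because every value lands on the nearest grid point, the per-element reconstruction error is bounded by half the scale; the trade-off versus calibration-based methods is that outlier weights inflate the scale and coarsen the grid for everything else.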
deepseek-r1-distill-qwen-1.5b-generic-cpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on CPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-R1-Di...

deepseek-r1-distill-qwen-1.5b-generic-gpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-1.5B to enable local inference on GPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-R1-Di...

This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference. Optimized models are published here in ONNX format to run on CPU, GPU, and NPU across devices, including server platforms, Windows, Linux, and Mac desktops, and mobile CPUs, with the precision best suited ...
deepseek-r1-distill-qwen-7b-cuda-gpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference on CUDA GPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-R1...

deepseek-r1-distill-qwen-7b-generic-cpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference on CPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-R1-Dist...

deepseek-r1-distill-qwen-7b-generic-gpu
This model is an optimized version of DeepSeek-R1-Distill-Qwen-7B to enable local inference on GPUs. This model uses RTN quantization.
- Developed by: Microsoft
- Model type: ONNX
- License: MIT
- Model Description: This is a conversion of the DeepSeek-R1-Dist...