Skip to content

Latest commit

 

History

History

README.md

Usage Cookbook

Examples on how to get started with Nemotron models


What's Inside

This directory contains cookbook-style guides showing how to deploy and use the models directly:

  • TensorRT-LLM Launch Guide - Running Nemotron models efficiently with TensorRT-LLM
  • vLLM Integration - Steps for fast inference and scalable serving of Nemotron models with vLLM.
  • SGLang Deployment - Tutorials on serving and interacting with Nemotron via SGLang
  • NIM Microservice - Guide to deploying Nemotron as scalable, production-ready endpoints using NVIDIA Inference Microservices (NIM).
  • Hugging Face Transformers - Direct loading and inference of Nemotron models with Hugging Face Transformers