    Repositories list

    • dynamo

      Public
      A Datacenter Scale Distributed Inference Serving Framework
      Rust
      Other
      9276.3k193307 Updated Mar 18, 2026
    • nixl

      Public
      NVIDIA Inference Xfer Library (NIXL)
      C++
      Other
      2649393880 Updated Mar 18, 2026
    • Offline optimization of your disaggregated Dynamo graph
      Python
      Apache License 2.0
      702201531 Updated Mar 18, 2026
    • Model Express is a Rust-based component meant to be placed next to existing model inference systems to speed up their startup times and improve overall performa…
      Rust
      Apache License 2.0
      123638 Updated Mar 18, 2026
    • aiperf

      Public
      AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solution.
      Python
      Apache License 2.0
      481781541 Updated Mar 18, 2026
    • grove

      Public
      Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
      Go
      Apache License 2.0
      421686026 Updated Mar 17, 2026
    • FlexTensor is a tensor offloading and management library for PyTorch that enables running large models on limited GPU memory by intelligently offloading tensors…
      Python
      Apache License 2.0
      1900 Updated Mar 15, 2026
    • aitune

      Public
      NVIDIA AITune is an inference toolkit designed for tuning and deploying Deep Learning models with a focus on NVIDIA GPUs.
      Python
      Apache License 2.0
      0600 Updated Mar 13, 2026
    • enhancements

      Public
      Enhancement Proposals and Architecture Decisions
      Apache License 2.0
      108148 Updated Mar 9, 2026
    • .github

      Public
      3101 Updated Aug 21, 2025