
Q1 2026 Roadmap #2262

@dsikka

Description


Overview

The near-term focus for LLM Compressor will center on performance improvements across core workflows, targeted enhancements to NVFP4, stabilizing and hardening MXFP4 support, and broad improvements to modifier functionality. These efforts aim to make model compression workflows more efficient, robust, and reliably scalable.

In addition, we will continue to expand quantization support for the latest model releases to ensure timely compatibility with newly introduced architectures and checkpoints, including adopting transformers v5.0.

We will also focus on improving the quality of our documentation, examples, and CI/CD to make the project easier to access and understand.

Q1 Roadmap

Performance Refactor - Enable Distributed Quantization Support

Status: In Progress

RFC: #2180

Issues:

Enable Modifier-Specific Support:

MXFP4 vLLM Integration / Validation

Status: In Progress

MXFP8 Support

Status: In Progress
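
The MX formats above follow the OCP Microscaling spec: elements are stored in a narrow float format (FP4 E2M1 for MXFP4) and each block of 32 values shares a power-of-two (E8M0-style) scale. A minimal sketch of that encode/decode path, as a toy illustration only (the function names and the scale-selection heuristic here are ours, not llm-compressor's implementation):

```python
import math

# FP4 E2M1 representable magnitudes (per the OCP MX element format)
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_mxfp4_block(block):
    """Quantize one block of up to 32 floats to MXFP4-style values:
    a shared power-of-two scale plus FP4 E2M1 elements.
    Returns (scale, codes)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared exponent: align the block max with E2M1's max magnitude (6 = 1.5 * 2^2)
    shared_exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** shared_exp
    codes = []
    for x in block:
        mag = min(abs(x) / scale, 6.0)                 # clamp into FP4 range
        nearest = min(FP4_GRID, key=lambda g: abs(g - mag))
        codes.append(math.copysign(nearest, x))
    return scale, codes

def dequantize(scale, codes):
    """Reconstruct approximate values from the shared scale and FP4 codes."""
    return [scale * c for c in codes]
```

Because the scale is a pure power of two, dequantization is an exact exponent shift; the quantization error is bounded by half the widest gap in the E2M1 grid times the block scale.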

AWQ, GPTQ Improvements and Benchmarking

Status: In Progress
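
For context on what GPTQ-style quantization does: it quantizes weight columns one at a time and folds each column's rounding error into the not-yet-quantized columns using the inverse Hessian of the calibration inputs. A toy sketch of that idea (a simplified, single-scale version of the GPTQ paper's scheme, not llm-compressor's actual implementation; the function name and `damp` heuristic are ours):

```python
import numpy as np

def gptq_like_quantize(W, X, bits=4, damp=0.01):
    """Toy GPTQ-style quantization of a weight matrix W (out_dim x in_dim)
    using calibration inputs X (samples x in_dim). Columns are quantized
    left to right; each column's rounding error is propagated into the
    remaining columns via the inverse Hessian H = X^T X."""
    W = W.astype(np.float64).copy()
    H = X.T @ X
    H += damp * np.mean(np.diag(H)) * np.eye(H.shape[0])  # damping for stability
    Hinv = np.linalg.inv(H)
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max() / qmax        # one symmetric scale, for simplicity
    Q = np.zeros_like(W)
    for j in range(W.shape[1]):
        # Round-to-nearest for this column
        q = np.clip(np.round(W[:, j] / scale), -qmax - 1, qmax) * scale
        Q[:, j] = q
        # Compensate: push the rounding error into the unquantized columns
        err = (W[:, j] - q) / Hinv[j, j]
        W[:, j+1:] -= np.outer(err, Hinv[j, j+1:])
    return Q, scale
```

AWQ takes a different route to the same goal, rescaling salient weight channels based on activation magnitudes before rounding; both are candidates for the benchmarking work tracked here.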

NVFP4 Improvements

Status: Not Yet Started
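
NVFP4 differs from MXFP4 mainly in its block size (16 instead of 32) and its scale encoding: each block carries an FP8 E4M3 scale, combined with a global FP32 scale, rather than a pure power-of-two exponent. A toy sketch under those assumptions (function names are ours; the global FP32 scale is omitted for brevity):

```python
import math

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes

def round_to_e4m3(x):
    """Round a non-negative float to the nearest FP8 E4M3 value (max 448)."""
    if x <= 0.0:
        return 0.0
    e = min(max(math.floor(math.log2(x)), -6), 8)
    step = 2.0 ** e / 8.0      # spacing of the 3-bit mantissa at this exponent
    return min(round(x / step) * step, 448.0)

def quantize_nvfp4_block(block):
    """Toy NVFP4: a 16-element block with an FP8 E4M3 block scale and
    FP4 E2M1 elements. Returns (scale, codes)."""
    amax = max(abs(v) for v in block)
    scale = round_to_e4m3(amax / 6.0) or 1.0   # map the block max onto E2M1's max
    codes = [math.copysign(min(FP4_GRID, key=lambda g: abs(g - abs(v) / scale)), v)
             for v in block]
    return scale, codes
```

The finer-grained E4M3 scale lets the block maximum land closer to the top of the FP4 grid than a power-of-two scale can, which is the main accuracy lever NVFP4 has over MXFP4.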

Transformers v5 Support

Status: Not Yet Started

Quantized Model Support

Status: In Progress

CI/CD Buildkite Migration

Status: In Progress

  • Migrate CI/CD to Buildkite
