Commit 0482373

update CHANGELOG
Signed-off-by: Kai Xu <[email protected]>
1 parent 1f77518 commit 0482373

2 files changed: +32 -0 lines changed

CHANGELOG.rst

Lines changed: 8 additions & 0 deletions
@@ -1,6 +1,14 @@
 Model Optimizer Changelog (Linux)
 =================================
 
+0.41 (2025-12-xx)
+^^^^^^^^^^^^^^^^^
+
+**Deprecations**
+
+**New Features**
+- Add FP8/NVFP4 KV cache quantization support for Megatron Core models.
+
 0.39 (2025-11-xx)
 ^^^^^^^^^^^^^^^^^

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Support quantization for Megatron Core specific layers.
+
+This plugin provides additional support for Megatron Core models beyond what's
+available in the main megatron plugin. Currently this is a placeholder as the
+TEDotProductAttention support is implemented in the megatron.py plugin.
+"""
+
+# The TEDotProductAttention quantization support is implemented in megatron.py
+# This file exists to satisfy the import in __init__.py