2 files changed: +32 −0 lines

modelopt/torch/quantization/plugins

Model Optimizer Changelog (Linux)
=================================

+ 0.41 (2025-12-xx)
+ ^^^^^^^^^^^^^^^^^
+
+ **Deprecations**
+
+ **New Features**
+ - Add FP8/NVFP4 KV cache quantization support for Megatron Core models.
+
0.39 (2025-11-xx)
^^^^^^^^^^^^^^^^^

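The new changelog entry above mentions FP8/NVFP4 KV cache quantization. As a rough, self-contained illustration of the underlying idea (not ModelOpt's implementation; the function name and rounding scheme here are made up), simulated ("fake") quantization scales a KV-cache tensor into the FP8 E4M3 dynamic range, clips, rounds coarsely, and rescales:

```python
# Illustrative sketch only: simulated FP8-E4M3-style quantization of a
# KV-cache tensor. This is NOT the ModelOpt API; `fake_quant_kv` is a
# hypothetical helper used to show the scale/clip/round/dequantize steps.

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def fake_quant_kv(values, amax=None):
    """Scale into the E4M3 range, clip, round, and rescale.

    `values` is a flat list of floats standing in for one KV-cache tensor.
    Real FP8 rounds onto the E4M3 grid; here we round the scaled value to
    the nearest integer purely to make the precision loss visible.
    """
    if amax is None:
        amax = max(abs(v) for v in values) or 1.0
    scale = E4M3_MAX / amax
    out = []
    for v in values:
        q = max(-E4M3_MAX, min(E4M3_MAX, v * scale))  # clip to FP8 range
        q = round(q)                                  # coarse rounding step
        out.append(q / scale)                         # dequantize back
    return out

kv = [0.12, -3.5, 7.25, -0.01]
print(fake_quant_kv(kv))  # values close to the originals, with small rounding error
```

The per-tensor `amax` plays the role of the calibrated dynamic range: the largest-magnitude entry maps exactly onto the format's maximum, and everything else loses a little precision.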
+ # SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ # SPDX-License-Identifier: Apache-2.0
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ # http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ """Support quantization for Megatron Core specific layers.
+
+ This plugin provides additional support for Megatron Core models beyond what's
+ available in the main megatron plugin. Currently this module is a placeholder,
+ as the TEDotProductAttention support is implemented in the megatron.py plugin.
+ """
+
+ # The TEDotProductAttention quantization support is implemented in megatron.py.
+ # This file exists to satisfy the import in __init__.py.
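The comment above says the file exists only to satisfy an import in `__init__.py`. A common pattern for such plugin packages (sketched here with stdlib modules; this is not ModelOpt's actual `__init__.py`, and `try_import_plugin` is a hypothetical name) is to guard each plugin import so an empty placeholder or a missing optional dependency never breaks package import:

```python
# Hypothetical sketch of guarded plugin imports: each plugin module is
# imported by name, and a missing one degrades gracefully to None instead
# of raising at package-import time.

import importlib


def try_import_plugin(name, package=None):
    """Import a plugin module by name, returning None if unavailable."""
    try:
        return importlib.import_module(name, package)
    except ImportError:
        return None


# A present module loads; an absent one simply yields None.
json_plugin = try_import_plugin("json")            # stdlib stand-in: loads
missing = try_import_plugin("no_such_plugin_xyz")  # absent: returns None
print(json_plugin is not None, missing is None)    # True True
```

With this pattern, adding a placeholder module like the one above keeps the package's import list stable while the real implementation lives elsewhere.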