Version 4.7.0 (OpenNMT#2003)

jordimas · web-flow · commit 3d05d20c7ba5 · 2026-02-03T02:02:32.000+01:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,7 +4,35 @@
 
 ### Fixes and improvements
 
-## [v4.6.3](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.6.3) (2026-01-XX)
+## [v4.7.0](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.7.0) (2026-02-03)
+
+### New features
+
+* Introduce AMD GPU support with ROCm HIP (#1989) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Compatibility with Transformers v5 (#1999) by [@jordimas](https://github.com/jordimas)
+
+### Fixes and improvements
+
+* Assume less about whisper vocab (#2000) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Use LLVM ThreadSanitizer instead of Google (#1993) by [@3manifold](https://github.com/3manifold)
+* Optimize all builds with parallel execution (#1992) by [@3manifold](https://github.com/3manifold)
+* Remove unnecessary zero init from conv1d (#1990) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Integrate Clang AddressSanitizer in tests (#1903) by [@3manifold](https://github.com/3manifold)
+* Enable multiple of 16 padding for INT8 Tensor Cores (#1982) by [@Purfview](https://github.com/Purfview)
+* Add activation and dilation to conv1d (#1979) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Minor refactor to CMakeLists.txt (#1980) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Remove unnecessary check from wav2vec2 (#1977) by [@plan9better](https://github.com/plan9better)
+* Add optional residual add to gemm op (#1975) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Implement cuda layernorm axis (#1971) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fix Eole conversion (#1998) by [@vince62s](https://github.com/vince62s)
+* Gemma 3 conversion improvements (#1991) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Add causal flag to fa2 (#1976) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fixes cross attention tests and refactors code (#1974) by [@jordimas](https://github.com/jordimas)
+* Fix CUDA bf16 median filter (#1972) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fix various compiler warnings (#1970) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+
+
+## [v4.6.3](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.6.3) (2026-01-06)
 
 ### New features
 
diff --git a/CMakeLists.txt b/CMakeLists.txt
@@ -14,7 +14,7 @@ option(WITH_OPENBLAS "Compile with OpenBLAS backend" OFF)
 option(WITH_RUY "Compile with Ruy backend" OFF)
 option(WITH_CUDA "Compile with CUDA backend" OFF)
 option(WITH_CUDNN "Compile with cuDNN backend" OFF)
-option(WITH_HIP "Compile with HIP backend" OFF)
+option(WITH_HIP "Compile with AMD HIP GPU backend" OFF)
 option(CUDA_DYNAMIC_LOADING "Dynamically load CUDA libraries at runtime" OFF)
 option(ENABLE_CPU_DISPATCH "Compile CPU kernels for multiple ISA and dispatch at runtime" ON)
 option(ENABLE_PROFILING "Compile with profiling support" OFF)
diff --git a/README.md b/README.md
@@ -58,6 +58,8 @@ generator.generate_batch(start_tokens)
 
 See the [documentation](https://opennmt.net/CTranslate2) for more information and examples.
 
+If you have an AMD ROCm GPU, we provide specific Python wheels on the [releases page](https://github.com/OpenNMT/CTranslate2/releases/).
+
 ## Benchmarks
 
 We translate the En->De test set *newstest2014* with multiple models:
diff --git a/docs/faq.md b/docs/faq.md
@@ -16,7 +16,7 @@ CTranslate2 addresses these issues in several ways:
 
 Here are some scenarios where this project could be used:
 
-* You want to accelarate Transformer models for production usage, especially on CPUs.
+* You want to accelerate Transformer models for production usage, especially on CPUs.
 * You need to embed models in an existing C++ application without adding large dependencies.
 * Your application requires custom threading and memory usage control.
 * You want to reduce the model size on disk and/or memory.
diff --git a/docs/installation.md b/docs/installation.md
@@ -11,7 +11,7 @@ pip install ctranslate2
 The Python wheels have the following requirements:
 
 * OS: Linux (x86-64, AArch64), macOS (x86-64, ARM64), Windows (x86-64)
-* Python version: >= 3.7
+* Python version: >= 3.9
 * pip version: >= 19.3 to support `manylinux2014` wheels
 
 ```{admonition} GPU support
@@ -29,7 +29,7 @@ On Windows [the Visual C++ runtime](https://www.microsoft.com/en-US/download/det
 Docker images can be downloaded from the [GitHub Container registry](https://github.com/OpenNMT/CTranslate2/pkgs/container/ctranslate2):
 
 ```bash
-docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu22.04-cuda11.2
+docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu22.04-cuda12.8
 ```
 
 The images include:
@@ -114,6 +114,7 @@ The following options can be set with `-DOPTION=VALUE` during the CMake configur
 | WITH_ACCELERATE | **OFF**, ON | Compiles with the Apple Accelerate backend |
 | WITH_OPENBLAS | **OFF**, ON | Compiles with the OpenBLAS backend |
 | WITH_RUY | **OFF**, ON | Compiles with the Ruy backend |
+| WITH_HIP | **OFF**, ON | Compiles with the AMD HIP GPU backend |
 
 Some build options require additional dependencies. See their respective documentation for installation instructions.
 
@@ -123,6 +124,7 @@ Some build options require additional dependencies. See their respective documen
 * `-DWITH_DNNL=ON` requires [oneDNN](https://github.com/oneapi-src/oneDNN) >= 3.0
 * `-DWITH_ACCELERATE=ON` requires [Accelerate](https://developer.apple.com/documentation/accelerate)
 * `-DWITH_OPENBLAS=ON` requires [OpenBLAS](https://github.com/xianyi/OpenBLAS)
+* `-DWITH_HIP=ON` requires [ROCm libraries](https://rocm.docs.amd.com/en/latest/reference/api-libraries.html)
 
 Multiple backends can be enabled for a single build, for example:
 
diff --git a/docs/translation.md b/docs/translation.md
@@ -81,7 +81,7 @@ It is a text file where each line has the following format:
 src_1 src_2 ... src_N<TAB>tgt_1 tgt_2 ... tgt_K
 ```
 
-If the source N-gram is empty (N = 0), the assiocated target tokens will always be included in the reduced vocabulary.
+If the source N-gram is empty (N = 0), the associated target tokens will always be included in the reduced vocabulary.
 
 ```{hint}
 See [here](https://github.com/OpenNMT/papers/tree/master/WNMT2018/vmap) for an example on how to generate this file.
diff --git a/python/ctranslate2/version.py b/python/ctranslate2/version.py
@@ -1,3 +1,3 @@
 """Version information."""
 
-__version__ = "4.6.3"
+__version__ = "4.7.0"
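
Downstream code that depends on features from this release may want to check the version string bumped above. The following is a minimal sketch, not part of this commit: the helper names `version_tuple` and `is_at_least` are hypothetical, and it assumes plain numeric `X.Y.Z` strings with no pre-release suffixes.

```python
def version_tuple(version: str) -> tuple:
    """Convert a dotted version string such as "4.7.0" into a tuple of ints."""
    # Assumption: every component is a plain integer (no "rc1", "dev0", etc.).
    return tuple(int(part) for part in version.split("."))


def is_at_least(installed: str, required: str) -> bool:
    """Return True when `installed` is the same as or newer than `required`."""
    # Tuples compare element by element, so (4, 10, 0) > (4, 7, 0),
    # which a plain string comparison would get wrong.
    return version_tuple(installed) >= version_tuple(required)


if __name__ == "__main__":
    # The two version strings touched by this diff.
    print(is_at_least("4.7.0", "4.6.3"))
    print(is_at_least("4.6.3", "4.7.0"))
```

In practice the first argument would likely be `ctranslate2.__version__`, the value defined in `python/ctranslate2/version.py`. For version strings with suffixes, a full PEP 440 parser is the safer choice.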
