Commit 3d05d20

Version 4.7.0 (OpenNMT#2003)
1 parent 809a36a commit 3d05d20

7 files changed

+39
-7
lines changed

CHANGELOG.md

Lines changed: 29 additions & 1 deletion
@@ -4,7 +4,35 @@

 ### Fixes and improvements

-## [v4.6.3](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.6.3) (2026-01-XX)
+## [v4.7.0](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.7.0) (2026-02-03)
+
+### New features
+
+* Introduce AMD GPU support with ROCm HIP (#1989) [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Compatibility with Transformers v5 (#1999) by [@jordimas](https://github.com/jordimas)
+
+## Fixes and improvements
+
+* Assume less about whisper vocab (#2000) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Use LLVM ThreadSanitizer instead of Google (#1993) by [@3manifold](https://github.com/3manifold)
+* Optimize all builds with parallel execution (#1992) by [@3manifold](https://github.com/3manifold)
+* Remove unecessary zero init from conv1d (#1990) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Integrate Clang AddressSanitizer in tests (#1903) by [@3manifold](https://github.com/3manifold)
+* Enable multiple of 16 padding for INT8 Tensor Cores (#1982) by [@Purfview](https://github.com/Purfview)
+* Add activation and dilation to conv1d (#1979) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Minor refactor to CMakeLists.txt (#1980) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Remove unnecessary check from wav2vec2 (#1977) by [@plan9better](https://github.com/plan9better)
+* Add optional residual add to gemm op (#1975) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Implement cuda layernorm axis (#1971) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fix Eole conversion (#1998) by [@vince62s](https://github.com/vince62s)
+* Gemma 3 conversion improvements (#1991) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Add causal flag to fa2 (#1976) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fixes cross attention tests and refactors code (#1974) by [@jordimas](https://github.com/jordimas)
+* Fix CUDA bf16 median filter (#1972) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+* Fix various compiler warnings (#1970) by [@sssshhhhhh](https://github.com/sssshhhhhh)
+
+
+## [v4.6.3](https://github.com/OpenNMT/CTranslate2/releases/tag/v4.6.3) (2026-01-06)

 ### New features


CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ option(WITH_OPENBLAS "Compile with OpenBLAS backend" OFF)
 option(WITH_RUY "Compile with Ruy backend" OFF)
 option(WITH_CUDA "Compile with CUDA backend" OFF)
 option(WITH_CUDNN "Compile with cuDNN backend" OFF)
-option(WITH_HIP "Compile with HIP backend" OFF)
+option(WITH_HIP "Compile with AMD HIP GPU backend" OFF)
 option(CUDA_DYNAMIC_LOADING "Dynamically load CUDA libraries at runtime" OFF)
 option(ENABLE_CPU_DISPATCH "Compile CPU kernels for multiple ISA and dispatch at runtime" ON)
 option(ENABLE_PROFILING "Compile with profiling support" OFF)

README.md

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ generator.generate_batch(start_tokens)

 See the [documentation](https://opennmt.net/CTranslate2) for more information and examples.

+If you have an AMD ROCm GPU, we provide specific Python wheels on the [releases page](https://github.com/OpenNMT/CTranslate2/releases/).
+
 ## Benchmarks

 We translate the En->De test set *newstest2014* with multiple models:

docs/faq.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ CTranslate2 addresses these issues in several ways:

 Here are some scenarios where this project could be used:

-* You want to accelarate Transformer models for production usage, especially on CPUs.
+* You want to accelerate Transformer models for production usage, especially on CPUs.
 * You need to embed models in an existing C++ application without adding large dependencies.
 * Your application requires custom threading and memory usage control.
 * You want to reduce the model size on disk and/or memory.

docs/installation.md

Lines changed: 4 additions & 2 deletions
@@ -11,7 +11,7 @@ pip install ctranslate2
 The Python wheels have the following requirements:

 * OS: Linux (x86-64, AArch64), macOS (x86-64, ARM64), Windows (x86-64)
-* Python version: >= 3.7
+* Python version: >= 3.9
 * pip version: >= 19.3 to support `manylinux2014` wheels

 ```{admonition} GPU support
@@ -29,7 +29,7 @@ On Windows [the Visual C++ runtime](https://www.microsoft.com/en-US/download/det
 Docker images can be downloaded from the [GitHub Container registry](https://github.com/OpenNMT/CTranslate2/pkgs/container/ctranslate2):

 ```bash
-docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu22.04-cuda11.2
+docker pull ghcr.io/opennmt/ctranslate2:latest-ubuntu22.04-cuda12.8
 ```

 The images include:
@@ -114,6 +114,7 @@ The following options can be set with `-DOPTION=VALUE` during the CMake configur
 | WITH_ACCELERATE | **OFF**, ON | Compiles with the Apple Accelerate backend |
 | WITH_OPENBLAS | **OFF**, ON | Compiles with the OpenBLAS backend |
 | WITH_RUY | **OFF**, ON | Compiles with the Ruy backend |
+| WITH_HIP | **OFF**, ON | Compiles with the AMD HIP GPU backend |

 Some build options require additional dependencies. See their respective documentation for installation instructions.

@@ -123,6 +124,7 @@ Some build options require additional dependencies. See their respective documen
 * `-DWITH_DNNL=ON` requires [oneDNN](https://github.com/oneapi-src/oneDNN) >= 3.0
 * `-DWITH_ACCELERATE=ON` requires [Accelerate](https://developer.apple.com/documentation/accelerate)
 * `-DWITH_OPENBLAS=ON` requires [OpenBLAS](https://github.com/xianyi/OpenBLAS)
+* `-DWITH_HIP=ON` requires [ROCm libraries](https://rocm.docs.amd.com/en/latest/reference/api-libraries.html)

 Multiple backends can be enabled for a single build, for example:

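Reviewer note: the first hunk above raises the minimum Python version for the wheels from 3.7 to 3.9 and leaves the pip minimum at 19.3. A minimal sketch of checking an environment against those two minimums follows. The names `MIN_PYTHON`, `MIN_PIP`, `parse_version`, and `meets_wheel_requirements` are hypothetical helpers written for this note and are not part of CTranslate2; the sketch assumes plain dotted numeric version strings and ignores the OS and architecture requirements.

```python
import sys

# Minimums copied from the requirements list in docs/installation.md.
MIN_PYTHON = (3, 9)
MIN_PIP = (19, 3)


def parse_version(text):
    """Turn a dotted version such as '19.3.1' into a tuple of ints.

    Parsing stops at the first component that is not purely numeric,
    so a suffix like 'rc1' is ignored rather than raising an error.
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_wheel_requirements(python_version, pip_version):
    """Return True if both interpreter and pip satisfy the documented minimums.

    python_version: a tuple such as sys.version_info or (3, 9, 0).
    pip_version: a dotted string such as '24.0'.
    """
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    pip_ok = parse_version(pip_version) >= MIN_PIP
    return python_ok and pip_ok


if __name__ == "__main__":
    # The pip version is passed in by hand here to keep the sketch dependency-free.
    print(meets_wheel_requirements(sys.version_info, "19.3"))
```

Tuple comparison keeps the check free of third-party packages; a real installer check would more likely rely on pip's own resolution of the wheel metadata.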
docs/translation.md

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@ It is a text file where each line has the following format:
 src_1 src_2 ... src_N<TAB>tgt_1 tgt_2 ... tgt_K
 ```

-If the source N-gram is empty (N = 0), the assiocated target tokens will always be included in the reduced vocabulary.
+If the source N-gram is empty (N = 0), the associated target tokens will always be included in the reduced vocabulary.

 ```{hint}
 See [here](https://github.com/OpenNMT/papers/tree/master/WNMT2018/vmap) for an example on how to generate this file.

python/ctranslate2/version.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 """Version information."""

-__version__ = "4.6.3"
+__version__ = "4.7.0"
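Reviewer note: downstream code that depends on something introduced in this release can gate on the bumped version string. A minimal sketch follows, assuming the string always has the plain `MAJOR.MINOR.PATCH` form shown in this diff. `MIN_VERSION`, `version_tuple`, and `has_release_features` are hypothetical names used only for illustration and are not part of CTranslate2.

```python
# Version introduced by this commit, as a tuple for ordered comparison.
MIN_VERSION = (4, 7, 0)


def version_tuple(version):
    """Convert 'MAJOR.MINOR.PATCH' into a tuple of ints.

    Comparing tuples avoids the string-comparison trap where
    '4.10.0' would sort before '4.9.0'.
    """
    return tuple(int(part) for part in version.split("."))


def has_release_features(version):
    """Return True if `version` is at least the release from this commit."""
    return version_tuple(version) >= MIN_VERSION


if __name__ == "__main__":
    for candidate in ("4.6.3", "4.7.0"):
        print(candidate, has_release_features(candidate))
```

With the package installed, the value of `__version__` from `python/ctranslate2/version.py` would be the natural argument to `has_release_features`.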
