Commit a06a0f6

Bumped version for new release.
1 parent 412fd0e commit a06a0f6

2 files changed: 28 additions & 1 deletion

CHANGELOG.md

Lines changed: 27 additions & 0 deletions
@@ -283,3 +283,30 @@ Bug fixes:
 - Removed outdated get_cuda_lib_handle calls that led to errors. #595 Thank you @ihsanturk
 - Fixed bug where read-permission was assumed for a file. #497
 - Fixed a bug where prefetchAsync led to errors on GPUs that support unified memory but not prefetching (Maxwell, SM52). #470 #451 #453 #477 Thank you @jllllll and @stoperro
+
+
+### 0.41.0
+
+Features:
+- Added precompiled CUDA 11.8 binaries to support H100 GPUs without compilation. #571
+- CUDA SETUP no longer looks for libcuda and libcudart and instead relies on the PyTorch CUDA libraries. To manually override this behavior, see how_to_use_nonpytorch_cuda.md. Thank you @rapsealk
+
+Bug fixes:
+- Fixed a bug where the default type of absmax was undefined, which led to errors if the default type was different from torch.float32. #553
+- Fixed a missing scipy dependency in requirements.txt. #544
+- Fixed a bug where a view operation could cause an error in 8-bit layers.
+- Fixed a bug where CPU bitsandbytes would fail during the import. #593 Thank you @bilelomrani
+- Fixed a bug where a non-existent LD_LIBRARY_PATH variable led to a failure in python -m bitsandbytes. #588
+- Removed outdated get_cuda_lib_handle calls that led to errors. #595 Thank you @ihsanturk
+- Fixed bug where read-permission was assumed for a file. #497
+- Fixed a bug where prefetchAsync led to errors on GPUs that support unified memory but not prefetching (Maxwell, SM52). #470 #451 #453 #477 Thank you @jllllll and @stoperro
+
+Documentation:
+- Improved documentation for GPUs that do not support 8-bit matmul. #529
+- Added description and pointers for the NF4 data type. #543
+
+User experience:
+- Improved handling of the default compute_dtype for Linear4bit layers, so that compute_dtype = input_dtype if the input data type is stable enough (float32, bfloat16, but not float16).
+
+Performance:
+- Improved 4-bit inference performance for A100 GPUs. This slightly degraded performance for A40/RTX 3090 and RTX 4090 GPUs.
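
The "User experience" entry describes a selection rule for the default compute_dtype of Linear4bit layers. The following is a minimal sketch of that rule, not the library's implementation. The function name `pick_compute_dtype`, the `explicit_compute_dtype` parameter, and the `fallback` value are hypothetical, and dtypes are written as plain strings so the sketch runs without PyTorch. The changelog does not say what float16 inputs fall back to; float32 is an assumption here.

```python
# Hypothetical sketch of the default compute_dtype rule from the 0.41.0 changelog.
# Dtypes are plain strings so that the example runs without PyTorch installed.

# Input dtypes the changelog calls "stable enough" to reuse as the compute dtype.
STABLE_INPUT_DTYPES = {"float32", "bfloat16"}


def pick_compute_dtype(input_dtype, explicit_compute_dtype=None, fallback="float32"):
    """Return the compute dtype to use for a given input dtype.

    - A dtype chosen explicitly by the user always wins (assumption).
    - Otherwise float32 and bfloat16 inputs are reused as the compute dtype.
    - Any other input dtype (for example float16) gets `fallback`,
      which is an assumption of this sketch.
    """
    if explicit_compute_dtype is not None:
        return explicit_compute_dtype
    if input_dtype in STABLE_INPUT_DTYPES:
        return input_dtype
    return fallback


if __name__ == "__main__":
    for dtype in ("float32", "bfloat16", "float16"):
        print(dtype, "->", pick_compute_dtype(dtype))
```

Keeping the stable dtypes in one set makes the exclusion of float16 explicit and easy to revisit if the rule changes in a later release.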

setup.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ def read(fname):
 
 setup(
     name=f"bitsandbytes",
-    version=f"0.40.2",
+    version=f"0.41.0",
     author="Tim Dettmers",
     author_email="[email protected]",
     description="k-bit optimizers and matrix multiplication routines.",

0 commit comments
