This repository was archived by the owner on Sep 11, 2025. It is now read-only.

Update dependency bitsandbytes to ^0.47.0 #133

Open
red-hat-konflux[bot] wants to merge 1 commit into konflux-poc from konflux/mintmaker/konflux-poc/bitsandbytes-0.x

Conversation


@red-hat-konflux red-hat-konflux bot commented May 24, 2025

This PR contains the following updates:

Package: bitsandbytes (changelog)
Change: ^0.42.0 -> ^0.47.0

Release Notes

bitsandbytes-foundation/bitsandbytes (bitsandbytes)

v0.47.0

Compare Source

Highlights:

  • FSDP2 compatibility for Params4bit (#​1719)
  • Bugfix for 4bit quantization with large block sizes (#​1721)
  • Further removal of previously deprecated code (#​1669)
  • Improved CPU coverage (#​1628)
  • Include NVIDIA Volta support in CUDA 12.8 and 12.9 builds (#​1715)

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.46.0...0.47.0

v0.46.1

Compare Source

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.46.0...0.46.1

v0.46.0: torch.compile() support; custom ops refactor; Linux aarch64 wheels

Compare Source

Highlights

  • Support for torch.compile without graph breaks for LLM.int8() (a minimal sketch follows this list).
    • Compatible with PyTorch 2.4+, but PyTorch 2.6+ is recommended.
    • Experimental CPU support is included.
  • Support torch.compile without graph breaks for 4bit.
    • Compatible with PyTorch 2.4+ for fullgraph=False.
    • Requires PyTorch 2.8 nightly for fullgraph=True.
  • We are now publishing wheels for CUDA Linux aarch64 (sbsa)!
    • Targets are Turing generation and newer: sm75, sm80, sm90, and sm100.
  • PyTorch Custom Operators refactoring and integration:
    • We have refactored most of the library code to integrate better with PyTorch via the torch.library and custom ops APIs. This helps enable our torch.compile and additional hardware compatibility efforts.
    • End-users do not need to change the way they are using bitsandbytes.
  • Unit tests have been cleaned up for increased determinism and most are now device-agnostic.
    • A new nightly CI runs unit tests for CPU (Windows x86-64, Linux x86-64/aarch64) and CUDA (Linux/Windows x86-64).
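
To make the torch.compile items above concrete, here is a minimal sketch, assuming a CUDA GPU, PyTorch 2.6+, bitsandbytes 0.46+, and purely illustrative layer sizes; it is not taken from the release itself:

```python
import torch
import bitsandbytes as bnb

# A tiny stack mixing an LLM.int8() linear and a 4-bit (NF4) linear.
model = torch.nn.Sequential(
    bnb.nn.Linear8bitLt(1024, 1024, has_fp16_weights=False),
    bnb.nn.Linear4bit(1024, 1024, quant_type="nf4", compute_dtype=torch.float16),
).cuda()

# fullgraph=False is the broadly compatible mode (PyTorch 2.4+);
# fullgraph=True for the 4-bit path needs the newer PyTorch noted above.
compiled = torch.compile(model, fullgraph=False)

x = torch.randn(8, 1024, device="cuda", dtype=torch.float16)
with torch.inference_mode():
    out = compiled(x)
print(out.shape)
```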

Compatibility Changes

  • Support for Python 3.8 is dropped.
  • Support for PyTorch < 2.2.0 is dropped.
  • CUDA 12.6 and 12.8 builds are now compatible with manylinux_2_24 (previously manylinux_2_34).
  • Many APIs that were previously marked as deprecated have now been removed.
  • New deprecations:
    • bnb.autograd.get_inverse_transform_indices()
    • bnb.autograd.undo_layout()
    • bnb.functional.create_quantile_map()
    • bnb.functional.estimate_quantiles()
    • bnb.functional.get_colrow_absmax()
    • bnb.functional.get_row_absmax()
    • bnb.functional.histogram_scatter_add_2d()

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.45.4...0.46.0

v0.45.5

Compare Source

This is a minor release that affects CPU-only usage of bitsandbytes. The CPU build of the library was inadvertently omitted from the v0.45.4 wheels.

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.45.4...0.45.5

v0.45.4

Compare Source

This is a minor release that affects CPU-only usage of bitsandbytes. There is one bugfix and improved system compatibility on Linux.

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.45.3...0.45.4

v0.45.3

Compare Source

Overview

This is a small patch release containing a few bug fixes.

Additionally, this release contains a CUDA 12.8 build which adds the sm100 and sm120 targets for NVIDIA Blackwell GPUs.

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.45.2...0.45.3

v0.45.2

Compare Source

This patch release fixes a compatibility issue with Triton 3.2 in PyTorch 2.6. When importing bitsandbytes without any GPUs visible in an environment with Triton installed, a RuntimeError may be raised:

RuntimeError: 0 active drivers ([]). There should only be one.

Full Changelog: bitsandbytes-foundation/bitsandbytes@0.45.1...0.45.2

v0.45.1

Compare Source

Improvements:
  • Compatibility for triton>=3.2.0
  • Moved package configuration to pyproject.toml
  • Build system: initial support for NVIDIA Blackwell B100 GPUs, RTX 50 Blackwell series GPUs and Jetson Thor Blackwell.
    • Note: Binaries built for these platforms are not included in this release. They will be included in future releases upon the availability of the upcoming CUDA Toolkit 12.7 and 12.8.
Bug Fixes:
  • Packaging: wheels will no longer include unit tests. (#​1478)
Dependencies:
  • Sets the minimum PyTorch version to 2.0.0.

v0.45.0

Compare Source

This is a significant release, bringing support for LLM.int8() to NVIDIA Hopper GPUs such as the H100.

As part of the compatibility enhancements, we've rebuilt much of the LLM.int8() code to simplify it for future compatibility and maintenance. We no longer use the col32 or other architecture-specific tensor layout formats, while maintaining backwards compatibility. We additionally bring performance improvements targeted at inference scenarios.
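
For orientation, this is the user-facing code path those LLM.int8() kernels sit behind. A hedged sketch via Hugging Face Transformers, not part of this release; the model id is a placeholder, and Transformers plus accelerate are assumed to be installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-1.3b"  # placeholder; any causal LM supported by Transformers

# load_in_8bit routes the linear layers through bitsandbytes' LLM.int8() kernels.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

inputs = tokenizer("LLM.int8() now runs on Hopper because", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```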

Performance Improvements

This release includes broad performance improvements for a wide variety of inference scenarios. See this X thread for a detailed explanation.

Breaking Changes

🤗PEFT users wishing to merge adapters with 8-bit weights will need to upgrade to peft>=0.14.0.
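
A hedged sketch of the workflow affected by this requirement, assuming Transformers and peft>=0.14.0 are installed; the model and adapter ids are placeholders:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Placeholders: substitute a real base model and a LoRA adapter trained on it.
base = AutoModelForCausalLM.from_pretrained(
    "base-model-id",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

model = PeftModel.from_pretrained(base, "lora-adapter-id")

# Merging LoRA deltas into 8-bit base weights is the path that now
# requires peft>=0.14.0 alongside bitsandbytes v0.45.0.
merged = model.merge_and_unload()
```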

Packaging Improvements
  • The size of our wheel has been reduced by ~43.5% from 122.4 MB to 69.1 MB! This results in an on-disk size decrease from ~396MB to ~224MB.
  • Binaries built with CUDA Toolkit 12.6.2 are now included in the PyPI distribution.
  • The CUDA 12.5.0 build has been updated to CUDA Toolkit 12.5.1.
Deprecations
  • A number of public API functions have been marked for deprecation and will emit FutureWarning when used. These functions will become unavailable in future releases. This should have minimal impact on most end-users.
  • The k-bit quantization features are deprecated in favor of blockwise quantization. For all optimizers, using block_wise=False is not recommended and support will be removed in a future release (a short sketch follows this list).
  • As part of the refactoring process, we've implemented many new 8bit operations. These operations no longer use specialized data layouts.
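
As referenced in the optimizer deprecation above, a minimal sketch of the recommended usage; the model, sizes, and learning rate are illustrative only:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4096, 4096).cuda()

# block_wise=True (the default) keeps the recommended blockwise quantization
# of optimizer state; block_wise=False is deprecated and slated for removal.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-3, block_wise=True)

loss = model(torch.randn(16, 4096, device="cuda")).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```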

v0.44.1

Compare Source


v0.44.0

Compare Source

New: AdEMAMix Optimizer

The AdEMAMix optimizer is a modification to AdamW which proposes tracking two EMAs to better leverage past gradients. This allows for faster convergence with less training data and improved resistance to forgetting.

We've implemented 8bit and paged variations: AdEMAMix, AdEMAMix8bit, PagedAdEMAMix, and PagedAdEMAMix8bit. These can be used with a similar API to existing optimizers.
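
A minimal sketch of using one of the new variants; the model, sizes, and learning rate are illustrative, and the other variants are constructed the same way:

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4096, 4096).cuda()

# The 8-bit variant; AdEMAMix, PagedAdEMAMix, and PagedAdEMAMix8bit follow
# the same pattern as the existing bitsandbytes optimizers.
optimizer = bnb.optim.AdEMAMix8bit(model.parameters(), lr=1e-4)

loss = model(torch.randn(16, 4096, device="cuda")).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```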

Improvements:
  • 8-bit Optimizers: The block size for all 8-bit optimizers has been reduced from 2048 to 256 in this release. This is a change from the original implementation proposed in the paper, and it improves accuracy.
  • CUDA Graphs support: A fix to enable CUDA Graphs capture of kernel functions was made in #​1330. This allows for performance improvements with inference frameworks like vLLM. Thanks @​jeejeelee!

v0.43.3

Compare Source

Improvements:
  • FSDP: Enable loading prequantized weights with bf16/fp16/fp32 quant_storage
    • Background: This update, linked to Transformers PR #​32276, allows loading prequantized weights with alternative storage formats. Metadata is tracked similarly to Params4bit.__new__ following PR #​970. It supports models exported with non-default quant_storage, such as this NF4 model with BF16 storage (a hedged sketch follows this list).
    • Special thanks to @​winglian and @​matthewdouglas for enabling FSDP+QLoRA finetuning of Llama 3.1 405B on a single 8xH100 or 8xA100 node with as little as 256GB system RAM.
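
As referenced above, a hedged sketch of loading a prequantized model with a non-default quant_storage through Transformers; the model id is a placeholder, and a Transformers release that exposes bnb_4bit_quant_storage is assumed:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    # Store the packed 4-bit weights in bf16 containers so FSDP can shard
    # them alongside the rest of the bf16 parameters.
    bnb_4bit_quant_storage=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "org/prequantized-nf4-bf16-model",  # placeholder for an NF4 export with BF16 storage
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
```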

v0.43.2

Compare Source

This release is quite significant, as the QLoRA bug fix has big implications for higher seqlen and batch sizes.

For each sequence (i.e. batch size increase of one) we expect memory savings of:

  • 405B: 39GB for seqlen=1024, and 4888GB for seqlen=128,000
  • 70B: 10.1GB for seqlen=1024, and 1258GB for seqlen=128,000

This was because activations are unnecessary for frozen parameters, yet memory for them was still erroneously allocated due to the now-fixed bug. The savings scale roughly linearly with sequence length, which is why the seqlen=128,000 figures are about 125x the seqlen=1024 figures.


v0.43.1

Compare Source

Improvements:
  • Improved the serialization format for 8-bit weights; this change is fully backwards compatible. (#​1164, thanks to @​younesbelkada for the contributions and @​akx for the review).
  • Added CUDA 12.4 support to the Linux x86-64 build workflow, expanding the library's compatibility with the latest CUDA versions. (#​1171, kudos to @​matthewdouglas for this addition).
  • Docs enhancement: Improved the instructions for installing the library from source. (#​1149, special thanks to @​stevhliu for the enhancements).
Bug Fixes
  • Fix 4bit quantization with blocksize = 4096, where an illegal memory access was encountered (#​1160, thanks @​matthewdouglas for fixing and @​YLGH for reporting). A sketch of the affected code path follows this list.
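
As referenced in the bug-fix item, a small sketch of the affected code path; the tensor shape is illustrative and a CUDA GPU is assumed:

```python
import torch
import bitsandbytes.functional as F

weights = torch.randn(4096, 4096, dtype=torch.float16, device="cuda")

# Large-blocksize 4-bit quantization; blocksize=4096 previously triggered
# an illegal memory access before this fix.
packed, quant_state = F.quantize_4bit(weights, blocksize=4096, quant_type="nf4")

# Round-trip back to fp16 for inspection.
restored = F.dequantize_4bit(packed, quant_state=quant_state)
print(restored.shape, restored.dtype)
```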

v0.43.0

Compare Source

Bug Fixes:
  • Addressed a race condition in kEstimateQuantiles, enhancing the reliability of quantile estimation in concurrent environments (@​pnunna93, #​1061).
  • Fixed various minor issues, including typos in code comments and documentation, to improve code clarity and prevent potential confusion (@​Brian Vaughan, #​1063).
Backwards Compatibility
  • After upgrading from v0.42 to v0.43, when using 4bit quantization, models may generate slightly different outputs (approximately up to the 2nd decimal place) due to a fix in the code. For anyone interested in the details, see this comment. An illustrative tolerance check follows this list.
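
As referenced above, an illustrative (not official) way to check your own outputs against that tolerance; the helper name is hypothetical:

```python
import torch

def outputs_match(v042_logits: torch.Tensor, v043_logits: torch.Tensor) -> bool:
    # "Up to the 2nd decimal place" corresponds roughly to an absolute
    # tolerance of 1e-2 when comparing 4-bit model outputs across versions.
    return torch.allclose(v042_logits, v043_logits, atol=1e-2, rtol=0.0)
```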
Internal and Build System Enhancements:
  • Implemented several enhancements to the internal and build systems, including adjustments to the CI workflows, portability improvements, and build artifact management. These changes contribute to a more robust and flexible development process, ensuring the library's ongoing quality and maintainability (@​rickardp, @​akx, @​wkpark, @​matthewdouglas; #​949, #​1053, #​1045, #​1037).
Contributors:

This release is made possible thanks to the many active contributors that submitted PRs and many others who contributed to discussions, reviews, and testing. Your efforts greatly enhance the library's quality and user experience. It's truly inspiring to work with such a dedicated and competent group of volunteers and professionals!

We give a special thanks to @​TimDettmers for managing to find a little bit of time for valuable consultations on critical topics, despite preparing for and touring the states applying for professor positions. We wish him the utmost success!

We also extend our gratitude to the broader community for your continued support, feedback, and engagement, which play a crucial role in driving the library's development forward.


Configuration

📅 Schedule: Branch creation - "after 5am on saturday" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever the PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

To execute skipped test pipelines, write the comment /ok-to-test.

This PR has been generated by MintMaker (powered by Renovate Bot).

@openshift-ci openshift-ci bot requested review from dtrifiro and tarukumar May 24, 2025 18:31

openshift-ci bot commented May 24, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: red-hat-konflux[bot]
Once this PR has been reviewed and has the lgtm label, please assign danielezonca for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented May 24, 2025

Hi @red-hat-konflux[bot]. Thanks for your PR.

I'm waiting for an opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@red-hat-konflux red-hat-konflux bot force-pushed the konflux/mintmaker/konflux-poc/bitsandbytes-0.x branch from 3686a0a to 69440dc on May 31, 2025 15:53
@red-hat-konflux red-hat-konflux bot changed the title from "fix(deps): update dependency bitsandbytes to ^0.45.0" to "fix(deps): update dependency bitsandbytes to ^0.46.0" on May 31, 2025
@red-hat-konflux red-hat-konflux bot changed the title from "fix(deps): update dependency bitsandbytes to ^0.46.0" to "Update dependency bitsandbytes to ^0.46.0" on Jun 7, 2025
Signed-off-by: red-hat-konflux <126015336+red-hat-konflux[bot]@users.noreply.github.com>
@red-hat-konflux red-hat-konflux bot force-pushed the konflux/mintmaker/konflux-poc/bitsandbytes-0.x branch from 69440dc to e74dca9 on August 16, 2025 08:36
@red-hat-konflux red-hat-konflux bot changed the title from "Update dependency bitsandbytes to ^0.46.0" to "Update dependency bitsandbytes to ^0.47.0" on Aug 16, 2025

coderabbitai bot commented Aug 16, 2025

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting reviews.review_status to false in the CodeRabbit configuration file.


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Join our Discord community for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.
