CHANGELOG.md (26 additions, 0 deletions)
@@ -221,3 +221,29 @@ Improvements:
Deprecated:
- Devices with compute capability 3.0 (GTX 700s, K10) and 3.2 (Tegra K1, Jetson TK1) are now deprecated and support will be removed in 0.39.0.
- Support for CUDA 10.0 and 10.2 will be removed in bitsandbytes 0.39.0
### 0.38.1
Features:
- Added Int8 SwitchBack layers
- Added Fake FP8 layers for research purposes (available under `bnb.research.nn. ...`)
### 0.39.0
Features:
- 4-bit matrix multiplication for Float4 and NormalFloat4 data types.
- Added 4-bit quantization routines
- Added double quantization routines for 4-bit quantization
- Paged optimizers for Adam and Lion.
- bfloat16 gradient / weight support for Adam and Lion with 8 or 32-bit states.
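
To illustrate what 4-bit quantization means in general, here is a minimal sketch of blockwise absmax quantization in plain Python. It is not the bitsandbytes implementation: the function names, the block size, and the uniform integer grid are assumptions made for illustration, and the Float4 and NormalFloat4 data types use their own code values rather than this uniform grid.

```python
def quantize_4bit_absmax(values, block_size=4):
    """Quantize floats to signed 4-bit codes with one absmax scale per block.

    Hypothetical helper for illustration only, not a bitsandbytes API.
    """
    codes, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Use 1.0 for an all-zero block to avoid dividing by zero.
        absmax = max(abs(v) for v in block) or 1.0
        scales.append(absmax)
        # Map [-absmax, absmax] onto the integer range [-7, 7].
        codes.extend(round(v / absmax * 7) for v in block)
    return codes, scales


def dequantize_4bit_absmax(codes, scales, block_size=4):
    """Approximately restore the floats from the codes and per-block scales."""
    return [code / 7 * scales[i // block_size] for i, code in enumerate(codes)]


weights = [0.1, -0.5, 0.25, 1.0, 8.0, -2.0, 0.0, 4.0]
codes, scales = quantize_4bit_absmax(weights)
restored = dequantize_4bit_absmax(codes, scales)
```

Each block stores one scale next to its 4-bit codes, so an outlier in one block does not reduce the precision of the other blocks.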
Bug fixes:
- Fixed a bug where 8-bit models consumed twice as much memory as expected after serialization
Deprecated:
- Kepler binaries (GTX 700s and Tesla K40/K80) are no longer provided via pip and need to be compiled from source. Kepler support might be fully removed in the future.
1. Run `python speed_benchmark/speed_benchmark.py`, which times operations and writes the timings to `speed_benchmark/info_a100_py2.jsonl` (change the JSONL file name for your own profiling run).
2. Run `python speed_benchmark/make_plot_with_jsonl.py`, which produces `speed_benchmark/plot_with_info.pdf`. Again, make sure to change the JSONL file that is being processed.
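
The two steps above follow a common pattern: time an operation, append one JSON record per measurement to a JSONL file, then read the records back for plotting. A minimal sketch of that pattern follows. The helper names and the `op` / `time_ms` fields are hypothetical and are not taken from `speed_benchmark.py`, whose record format is not shown here.

```python
import json
import time


def time_operation(fn, repeats=10):
    """Return the mean wall-clock time of fn() in milliseconds.

    Hypothetical helper. GPU work would additionally need to be
    synchronized before the clock is read.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats * 1000


def append_record(path, record):
    """Append one measurement as a single JSON line."""
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")


def read_records(path):
    """Load every non-empty line of the JSONL file as a dict."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because records are appended, pointing each profiling run at its own file name keeps results from different runs separate.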