Commit f445e61

Resorted changes
1 parent 9eefba6 commit f445e61

1 file changed: +4 -4 lines changed

CHANGELOG.md

@@ -2,17 +2,17 @@
 All notable changes to `semchunk` will be documented here. This project adheres to [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## [3.0.0] - 2024-12-31
-### Changed
-- Began removing chunks comprised entirely of whitespace characters from the output of `chunk()`.
-- Updated `semchunk`'s description from 'A fast and lightweight Python library for splitting text into semantically meaningful chunks.' and 'A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.'.
-
 ### Added
 - Added an `offsets` argument to `chunk()` and `Chunker.__call__()` that specifies whether to return the start and end offsets of each chunk ([#9](https://github.com/umarbutler/semchunk/issues/9)). The argument defaults to `False`.
 - Added an `overlap` argument to `chunk()` and `Chunker.__call__()` that specifies the proportion of the chunk size, or, if >=1, the number of tokens, by which chunks should overlap ([#1](https://github.com/umarbutler/semchunk/issues/1)). The argument defaults to `None`, in which case no overlapping occurs.
 - Began raising a `ValueError` where the `chunk_size` is smaller than the number of tokens in an empty string (i.e, where the token counter adds special tokens to every input).
 - Added an undocumented, private `_make_chunk_function()` method to the `Chunker` class that constructs chunking functions with call-level arguments passed.
 - Added more unit tests for new features as well as for multiple token counters and for ensuring there are no chunks comprised entirely of whitespace characters.

+### Changed
+- Began removing chunks comprised entirely of whitespace characters from the output of `chunk()`.
+- Updated `semchunk`'s description from 'A fast and lightweight Python library for splitting text into semantically meaningful chunks.' and 'A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.'.
+
 ### Fixed
 - Fixed a typo in the docstring for the `__call__()` method of the `Chunker` class returned by `chunkerify()` where most of the documentation for the arguments were listed under the section for the method's returns.
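
The changelog entry for `overlap` describes one argument with two readings: a proportion of the chunk size when it is below 1, and a token count when it is 1 or more. The following is a minimal Python sketch of that rule only, not code taken from `semchunk`. The helper name `resolve_overlap` is hypothetical, and both the rounding down of fractional results and the cap at `chunk_size - 1` are assumptions made for illustration.

```python
import math


def resolve_overlap(overlap, chunk_size):
    """Convert an `overlap` setting into a number of overlapping tokens.

    Hypothetical helper: the name, the floor rounding and the cap at
    `chunk_size - 1` are illustrative assumptions, not semchunk's API.
    """
    # `None` means no overlapping occurs, as stated in the changelog entry.
    if overlap is None:
        return 0

    # Assumption: a negative overlap is treated as invalid input.
    if overlap < 0:
        raise ValueError("overlap must not be negative")

    # Below 1, the value is read as a proportion of the chunk size.
    if overlap < 1:
        return math.floor(chunk_size * overlap)

    # At 1 or above, the value is read as a token count. Assumption: it is
    # capped so that each chunk still advances by at least one token.
    return min(int(overlap), chunk_size - 1)
```

Under these assumptions, `resolve_overlap(0.25, 512)` is read as a quarter of the chunk size, while `resolve_overlap(64, 512)` is read as 64 tokens.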