Commit aa6142f

Ensured that memoization does not overwrite chunk()'s function signature.
1 parent da0b25a commit aa6142f

3 files changed: 10 additions and 4 deletions

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,10 @@
 ## Changelog 🔄
 All notable changes to `semchunk` will be documented here. This project adheres to [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.2.3] - 2024-03-11
+### Fixed
+- Ensured that memoization does not overwrite `chunk()`'s function signature.
+
 ## [0.2.2] - 2024-02-05
 ### Fixed
 - Ensured that the `memoize` argument is passed back to `chunk()` in recursive calls.
@@ -36,6 +40,7 @@ All notable changes to `semchunk` will be documented here. This project adheres
 ### Added
 - Added the `chunk()` function, which splits text into semantically meaningful chunks of a specified size as determined by a provided token counter.
 
+[0.2.3]: https://github.com/umarbutler/semchunk/compare/v0.2.2...v0.2.3
 [0.2.2]: https://github.com/umarbutler/semchunk/compare/v0.2.1...v0.2.2
 [0.2.1]: https://github.com/umarbutler/semchunk/compare/v0.2.0...v0.2.1
 [0.2.0]: https://github.com/umarbutler/semchunk/compare/v0.1.2...v0.2.0

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "semchunk"
-version = "0.2.2"
+version = "0.2.3"
 authors = [
     {name="Umar Butler", email="[email protected]"},
 ]

src/semchunk/semchunk.py

Lines changed: 4 additions & 3 deletions
@@ -1,5 +1,5 @@
 import re
-from functools import cache
+from functools import cache, wraps
 
 _memoised_token_counters = {}
 """A map of token counters to their memoised versions."""
@@ -45,7 +45,6 @@ def _split_text(text: str) -> tuple[str, bool, list[str]]:
     # Return the splitter and the split text.
     return splitter, splitter_is_whitespace, text.split(splitter)
 
-@cache
 def chunk(text: str, chunk_size: int, token_counter: callable, memoize: bool=True, _recursion_depth: int = 0) -> list[str]:
     """Split text into semantically meaningful chunks of a specified size as determined by the provided token counter.
 
@@ -113,4 +112,6 @@ def chunk(text: str, chunk_size: int, token_counter: callable, memoize: bool=Tru
     if not _recursion_depth:
         chunks = list(filter(None, chunks))
 
-    return chunks
+    return chunks
+
+chunk = wraps(chunk)(cache(chunk))
