"""Test that overlap is applied only once, even when text goes through multiple separator levels."""
settings = Settings("test-model")
settings.chunk_size = 30  # Small chunk size to force splitting
settings.chunk_overlap = 8  # Significant overlap

chunker = Chunker(mock_conn, settings)

# Create text that will be split by multiple separators:
# 1. First by paragraphs (\n\n)
# 2. Then by sentences (.)
# 3. Finally by words ( )
text = "This is the first paragraph with multiple sentences. This should be split across separators.\n\nThis is the second paragraph with more content. This will also be split by multiple separators and should trigger the overlap bug."

chunks = chunker.chunk(text)

# Verify that no chunk exceeds the chunk_size limit
# If overlap is applied multiple times, chunks will be longer than chunk_size
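
The check described in the comments above can be illustrated with a small standalone sketch. This is not the project's `Chunker`: the functions `split_recursive` and `chunk_with_single_overlap` are hypothetical names introduced here, and the strategy (reserve room for the overlap, split recursively without overlap, then prepend the overlap in one final pass) is only one possible way to guarantee that overlap is applied once.

```python
def split_recursive(text, separators, limit):
    """Split text into pieces of at most `limit` characters.

    Separators are tried in order; no overlap is added at this stage.
    """
    if len(text) <= limit:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut by characters.
        return [text[i:i + limit] for i in range(0, len(text), limit)]
    sep, rest = separators[0], separators[1:]
    pieces = []
    for part in text.split(sep):
        pieces.extend(split_recursive(part, rest, limit))
    return pieces


def chunk_with_single_overlap(text, chunk_size, chunk_overlap,
                              separators=("\n\n", ".", " ")):
    """Hypothetical chunker: overlap is added once, after all splitting."""
    # Reserve room so that overlap prefix + piece never exceeds chunk_size.
    pieces = split_recursive(text, list(separators), chunk_size - chunk_overlap)
    chunks = []
    for i, piece in enumerate(pieces):
        # Prefix each chunk (except the first) with the tail of the previous piece.
        prefix = pieces[i - 1][-chunk_overlap:] if i > 0 else ""
        chunks.append(prefix + piece)
    return chunks


text = (
    "This is the first paragraph with multiple sentences. "
    "This should be split across separators.\n\n"
    "This is the second paragraph with more content."
)
chunks = chunk_with_single_overlap(text, chunk_size=30, chunk_overlap=8)

# The property the test is after: no chunk is longer than chunk_size.
assert all(len(chunk) <= 30 for chunk in chunks)
```

Because the overlap is prepended only in the final loop, a piece that passed through the paragraph, sentence, and word levels still receives at most `chunk_overlap` extra characters. If the overlap were instead added inside each recursive call, the same length assertion would be expected to fail.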