Commit b93c1fd
fix: implement chunking for large markdown sections and fix Qdrant deduplication issue (#4660)
- Modified parseMarkdownContent to chunk large sections (>1150 chars)
- Added support for chunking header-less markdown files
- Fixed _chunkTextByLines to handle oversized lines properly
- Added defensive check for parseMarkdown returning undefined
- Fixed Qdrant ID generation to use segmentHash instead of file:line
  - This was the root cause: chunks that shared a file:line ID were being deduplicated
  - Each chunk now gets a unique ID, even when several chunks start on the same line
- Added comprehensive tests for all edge cases
- Ensures all markdown content is properly indexed in Qdrant

1 parent d3d1d75 commit b93c1fd
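
The two fixes described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit. The names `chunkTextByLines`, `hashSegment`, `legacyId` and `hashId`, the `Chunk` shape, and the FNV-1a-style hash are all assumptions made for illustration. Only the 1150-character limit and the switch from a file:line ID to a segment hash come from the message itself. The sketch splits text into chunks of at most 1150 characters, slices oversized lines, and shows why an ID built from file:line collides when several chunks start on the same line.

```typescript
// Hypothetical limit, taken from the ">1150 chars" figure in the commit message.
const MAX_CHUNK_CHARS = 1150;

interface Chunk {
  filePath: string;
  startLine: number; // 1-based line on which the chunk begins
  content: string;
  segmentHash: string;
}

// Stand-in hash (FNV-1a style over UTF-16 code units). A real indexer would
// more likely use a cryptographic hash; this keeps the sketch dependency-free.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, modulo 2^32.
    h = (h + ((h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24))) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical helper: the hash covers path, start line, offset within the
// line, and content, so pieces cut from one long line stay distinguishable.
function hashSegment(filePath: string, startLine: number, offset: number, content: string): string {
  return fnv1a(`${filePath}\n${startLine}\n${offset}\n${content}`);
}

function chunkTextByLines(filePath: string, text: string, maxChars: number = MAX_CHUNK_CHARS): Chunk[] {
  const chunks: Chunk[] = [];
  let buffer: string[] = [];
  let bufferLen = 0;
  let bufferStart = 1;

  const emit = (content: string, startLine: number, offset: number): void => {
    if (content.trim().length === 0) return; // skip whitespace-only chunks
    chunks.push({ filePath, startLine, content, segmentHash: hashSegment(filePath, startLine, offset, content) });
  };
  const flush = (): void => {
    if (buffer.length > 0) emit(buffer.join("\n"), bufferStart, 0);
    buffer = [];
    bufferLen = 0;
  };

  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const lineNo = i + 1;
    if (line.length > maxChars) {
      // Oversized line: flush pending lines, then slice into fixed-size pieces.
      flush();
      for (let off = 0; off < line.length; off += maxChars) {
        emit(line.slice(off, off + maxChars), lineNo, off);
      }
      continue;
    }
    // Start a new chunk if adding this line (plus a newline) would exceed the limit.
    if (buffer.length > 0 && bufferLen + 1 + line.length > maxChars) flush();
    if (buffer.length === 0) {
      bufferStart = lineNo;
      bufferLen = line.length;
    } else {
      bufferLen += 1 + line.length;
    }
    buffer.push(line);
  }
  flush();
  return chunks;
}

// Old-style ID: identical for every chunk that starts on the same line.
const legacyId = (c: Chunk): string => `${c.filePath}:${c.startLine}`;
// New-style ID: derived from the segment hash.
const hashId = (c: Chunk): string => c.segmentHash;
```

With a single 3000-character line, the sketch produces three chunks that all start on line 1. Under `legacyId` they share one ID, so a store that upserts by ID would keep only one of them, while `hashId` is intended to keep them apart.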
File tree
3 files changed, +861 -25 lines changed
- src/services/code-index/processors
- __tests__