fix(elixir): resolve DOCX keyword extraction FunctionClauseError #313
Merged
Fixed crash when extracting DOCX files with keywords metadata by implementing proper keyword parsing at both Rust and Elixir layers.

Rust:
- Parse comma-separated keywords from DOCX core properties into Vec<String>
- Store in typed Metadata.keywords field instead of metadata.additional
- Ensures consistent data structure across all language bindings

Elixir:
- Add string handling clause to normalize_keywords/1 function
- Parse comma-separated keyword strings into expected keyword map format
- Provides defensive handling for keywords from any source

Tests:
- Added test_docx_keywords_extraction in docx_metadata_extraction_test.rs
  - Creates minimal DOCX with keywords metadata
  - Verifies parsing into Vec<String> in Metadata.keywords
- Added 8 keyword parsing tests in extraction_result_test.exs
  - Tests comma-separated strings, whitespace handling, edge cases
  - Regression tests for GitHub issue #309

DOCX extractor stored keywords as comma-separated strings in metadata.additional["keywords"], but Elixir's normalize_keywords/1 only handled nil, [], and lists, causing FunctionClauseError.
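The Rust-side parsing described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `parse_keywords` is hypothetical, and dropping empty entries is an assumption about how edge cases such as doubled or trailing commas are handled.

```rust
/// Hypothetical helper (not taken from the PR): split a comma-separated
/// keywords string, as found in DOCX core properties, into a list.
/// Assumption: surrounding whitespace is trimmed and empty entries are dropped.
fn parse_keywords(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim) // remove whitespace around each keyword
        .filter(|keyword| !keyword.is_empty()) // skip blanks from ",," or a trailing comma
        .map(str::to_string)
        .collect()
}
```

Under these assumptions, `parse_keywords("rust, elixir ,, docx ")` yields `["rust", "elixir", "docx"]`, and an empty string yields an empty list.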
Goldziher added a commit that referenced this pull request on Feb 13, 2026
fix(elixir): resolve DOCX keyword extraction FunctionClauseError
Fixes #309
Changes
- Rust (crates/kreuzberg/src/extractors/docx.rs)
- Elixir (packages/elixir/lib/kreuzberg/result.ex)
- Tests
Root Cause
DOCX extractor stored keywords as comma-separated strings, but normalize_keywords/1 only handled nil, [], and lists, so a string argument matched no clause and raised FunctionClauseError.
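To illustrate the shape of the fix, here is a rough Rust analogue of a normalizer that covers every input shape named above (missing, list, and the previously unhandled string). It is a sketch under stated assumptions, not the Elixir implementation: the `RawKeywords` enum and this `normalize_keywords` function are hypothetical, and the output is simplified to a plain list of strings rather than the keyword map format the Elixir binding expects.

```rust
/// Hypothetical model of the shapes a keywords value may arrive in.
/// These variant names are illustrative and do not appear in the PR.
#[derive(Debug, Clone, PartialEq)]
enum RawKeywords {
    Missing,           // corresponds to nil
    List(Vec<String>), // corresponds to [] or a populated list
    Text(String),      // comma-separated string, the case that used to crash
}

/// Normalize any supported shape into a list of keywords.
/// Simplification: returns Vec<String>, not the Elixir keyword map format.
fn normalize_keywords(raw: RawKeywords) -> Vec<String> {
    match raw {
        RawKeywords::Missing => Vec::new(),
        RawKeywords::List(items) => items,
        // The added case: split the string instead of failing on it.
        RawKeywords::Text(text) => text
            .split(',')
            .map(str::trim)
            .filter(|keyword| !keyword.is_empty())
            .map(str::to_string)
            .collect(),
    }
}
```

In Rust a missing `match` arm is rejected at compile time, whereas in Elixir a missing function clause only surfaces at runtime as FunctionClauseError, which is why the regression tests for string inputs matter on the Elixir side.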