Fix PDF document reader's grouping logic to respect pagesPerDocument #4627
Hi Team:
The PDF parsing logic has an issue: it doesn't always respect the pagesPerDocument setting, so it fails the newly added unit test:
https://github.com/spring-projects/spring-ai/pull/4627/files#diff-14539564bf2af8df87bbbc6cf120abd9e706d120ec581d95af53e502e1a9ed64R76
So I refactored the code to clean it up and fix this issue at the same time.
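To illustrate the behavior I would expect from pagesPerDocument, here is a minimal standalone sketch. It is not the code from this PR or from the reader itself: the class PageGrouping and the method groupPages are hypothetical names used only for illustration, and the sketch assumes pagesPerDocument is at least 1. Pages are numbered from 1, every group holds exactly pagesPerDocument pages, and only the last group may be shorter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Spring AI implementation.
public class PageGrouping {

    // Splits page numbers 1..totalPages into consecutive groups of
    // pagesPerDocument pages. Only the final group may be smaller.
    static List<List<Integer>> groupPages(int totalPages, int pagesPerDocument) {
        if (pagesPerDocument < 1) {
            // Assumption for this sketch: values below 1 are rejected.
            throw new IllegalArgumentException("pagesPerDocument must be >= 1");
        }
        List<List<Integer>> groups = new ArrayList<>();
        for (int start = 1; start <= totalPages; start += pagesPerDocument) {
            // Clamp the end of the group to the last page of the PDF.
            int end = Math.min(start + pagesPerDocument - 1, totalPages);
            List<Integer> group = new ArrayList<>();
            for (int page = start; page <= end; page++) {
                group.add(page);
            }
            groups.add(group);
        }
        return groups;
    }

    public static void main(String[] args) {
        // 5 pages with pagesPerDocument = 2
        System.out.println(groupPages(5, 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```

With this shape, a 5-page PDF and pagesPerDocument = 2 should yield three documents, which is the kind of case a grouping test can check.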
Thanks for reviewing!