fix(vlm): handle content_filter finish reason in API responses #3051
Merged
cau-git merged 4 commits into docling-project:main on Mar 23, 2026
✅ DCO Check Passed. Thanks @Br1an67, all your commits are properly signed off. 🎉
Merge Protections: Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit: this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
dolfim-ibm (Member) reviewed on Mar 1, 2026 and left a comment:
Overall the idea of the PR looks ok, but I'm not sure about partial success (see prev comment)
Codecov Report: ✅ All modified and coverable lines are covered by tests.
f314776 to 40e7ff0
Contributor (Author):
Hi, just a gentle bump on this. Happy to make any changes if needed!
Member:
@Br1an67 only missing thing is your DCO remediation commit. It is mandated by the Linux Foundation that all contributors must sign off their commits. Could you please complete this? Then we can merge.
Add `CONTENT_FILTERED` to `VlmStopReason` enum and detect `finish_reason == "content_filter"` in `api_image_request()`. This prevents silent drops when an API provider (e.g. Azure OpenAI) filters content, and logs a warning for downstream consumers. Signed-off-by: Br1an67 <932039080@qq.com>
Add unit tests for the api_image_request function covering all finish reason types: content_filter, length, and stop/end_of_sequence. Signed-off-by: Br1an67 <932039080@qq.com>
- Convert `Optional[X]` to `X | None` for type annotations (UP045)
- Replace percent format with f-strings in tests (UP031)
- Prefix unused unpacked variables with underscore (RUF059)

This fixes the code-checks / lint (3.12) CI failure. Signed-off-by: Br1an67 <932039080@qq.com>
Signed-off-by: Br1an67 <932039080@qq.com>
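For readers unfamiliar with the three lint rules named in the commit messages above (UP045, UP031, RUF059), here is a small illustrative sketch of the "after" style each rule asks for. The function names and data are hypothetical and are not taken from the docling codebase.

```python
from __future__ import annotations


def first_finish_reason(choices: list[dict]) -> str | None:
    # UP045: annotate with `str | None` instead of `Optional[str]`.
    if not choices:
        return None
    return choices[0].get("finish_reason")


def describe(reason: str | None) -> str:
    # UP031: use an f-string instead of "finish_reason=%s" % reason.
    return f"finish_reason={reason}"


def split_result() -> tuple[str, str, int]:
    # Hypothetical helper returning (text, stop reason, token count).
    return "text", "content_filter", 12


# RUF059: unpacked values that are never used get an underscore prefix.
text, _reason, _num_tokens = split_result()
```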
bc90e86 to aa85b9a
cau-git approved these changes on Mar 23, 2026
Issue resolved by this Pull Request:
Resolves #2988

Add `CONTENT_FILTERED = "content_filter"` to the `VlmStopReason` enum and detect `finish_reason == "content_filter"` in `api_image_request()`. This prevents silent drops when an API provider (e.g. Azure OpenAI) filters content due to safety policies.

Changes:
- `docling/datamodel/base_models.py`: Add `CONTENT_FILTERED` enum value to `VlmStopReason`
- `docling/utils/api_image_request.py`: Detect `content_filter` finish reason, log a warning, and return `VlmStopReason.CONTENT_FILTERED`
- `docling/pipeline/extraction_vlm_pipeline.py`: Treat `CONTENT_FILTERED` as partial success (same as `LENGTH` and `STOP_SEQUENCE`)

Checklist:
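The enum addition and finish-reason check summarized in the description above could look roughly like the following. This is a minimal sketch, not the merged diff: `map_finish_reason`, `is_partial_success`, the `UNSPECIFIED` member, and every enum string value other than `"content_filter"` are assumptions made for illustration.

```python
from __future__ import annotations

import logging
from enum import Enum

_log = logging.getLogger(__name__)


class VlmStopReason(str, Enum):
    # Only CONTENT_FILTERED = "content_filter" is stated in the PR text;
    # the other members' string values (and UNSPECIFIED) are assumed.
    LENGTH = "length"
    STOP_SEQUENCE = "stop_sequence"
    END_OF_SEQUENCE = "end_of_sequence"
    CONTENT_FILTERED = "content_filter"
    UNSPECIFIED = "unspecified"


def map_finish_reason(finish_reason: str | None) -> VlmStopReason:
    """Hypothetical helper: map an OpenAI-style finish_reason to the enum."""
    if finish_reason == "content_filter":
        # Warn so the filtered response is not dropped silently.
        _log.warning("API provider filtered the response (content_filter).")
        return VlmStopReason.CONTENT_FILTERED
    if finish_reason == "length":
        return VlmStopReason.LENGTH
    if finish_reason == "stop":
        return VlmStopReason.END_OF_SEQUENCE
    return VlmStopReason.UNSPECIFIED


# Stop reasons treated as partial success, per the description above.
PARTIAL_SUCCESS_REASONS = {
    VlmStopReason.LENGTH,
    VlmStopReason.STOP_SEQUENCE,
    VlmStopReason.CONTENT_FILTERED,
}


def is_partial_success(reason: VlmStopReason) -> bool:
    """Hypothetical helper mirroring the pipeline's partial-success check."""
    return reason in PARTIAL_SUCCESS_REASONS
```

Keeping the partial-success reasons in one set means a caller only has one place to change if the reviewers decide that filtered content should count as a failure instead.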