Bump unstructured from 0.18.27 to 0.18.31 in /backend-agent #197

Merged
marcorosa merged 1 commit into develop from dependabot/uv/backend-agent/develop/unstructured-0.18.31
Feb 9, 2026
@dependabot dependabot bot commented on behalf of github Feb 1, 2026

Bumps unstructured from 0.18.27 to 0.18.31.

Release notes

Sourced from unstructured's releases.

0.18.31

What's Changed

New Contributors

Full Changelog: Unstructured-IO/unstructured@0.18.28...0.18.31

0.18.28

Enhancement

  • Optimize clean_extra_whitespace_with_index_run (codeflash)
  • Optimize recursive_xy_cut_swapped (codeflash)
  • Optimize _DocxPartitioner._parse_category_depth_by_style_name (codeflash)
  • Optimize VertexAIEmbeddingEncoder._add_embeddings_to_elements (codeflash)
  • Optimize ngrams (codeflash)
  • Optimize stage_for_datasaur (codeflash)
Changelog

Sourced from unstructured's changelog.

0.18.31

Enhancements

  • Changed default DPI to 350
  • Add token-based chunking support: Added max_tokens, new_after_n_tokens, and tokenizer parameters to chunk_by_title() and chunk_elements() for chunking by token count instead of character count. Uses tiktoken for token counting. Install with pip install "unstructured[chunking-tokens]". (fixes #4127)
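
To make the difference between a character budget and a token budget concrete, here is a minimal sketch of greedy chunking against a token limit. It is not unstructured's implementation: `chunk_by_token_budget` and `whitespace_token_count` are hypothetical names, and the whitespace word count merely stands in for the tiktoken-based counting the changelog mentions.

```python
from typing import Callable, Iterable, List


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated words, not tiktoken tokens."""
    return len(text.split())


def chunk_by_token_budget(
    pieces: Iterable[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Greedily pack text pieces into chunks of at most max_tokens tokens.

    A single piece larger than the budget is emitted as its own chunk
    rather than split, to keep the sketch short.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for piece in pieces:
        n = count_tokens(piece)
        # Close the running chunk if adding this piece would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(piece)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks


pieces = ["one two three", "four five", "six seven eight nine", "ten"]
chunks = chunk_by_token_budget(pieces, max_tokens=5)
print(chunks)
```

Swapping `count_tokens` for a real tokenizer's counting function would change where the chunk boundaries fall without changing the packing logic.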

Fixes

0.18.30

Enhancements

  • Updated the Dockerfile to build from the chainguard base. The package updates and base-packages installation that were previously done in the base-images repo are now all done here.
  • is_text_embedded now considers rotated text as low fidelity, and elements with a non-trivial amount of it are considered not embedded
  • Replace pdf2image with PyPDFium2 for PDF rendering
  • Optimize _get_optimal_value_for_bbox (codeflash)
  • Optimize _DocxPartitioner._style_based_element_type (codeflash)

Fixes

  • Fix EN DASH not cleaned by clean_bullets: Added EN DASH (\u2013) to UNICODE_BULLETS pattern so clean_bullets properly removes EN DASH bullet points without requiring clean_dashes (fixes #4105)
  • Change languages parameter default from ["auto"] to None: Updated default value in detect_languages() and partition_epub() functions. Behavior unchanged as None is converted to ["auto"] internally. (fixes #2471)
  • Resolve GHSA-58pv-8j8x-9vj2
  • Use render mode data to determine whether a character extracted by pdfminer is invisible
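
The EN DASH fix in the list above can be illustrated with a small, self-contained sketch. The names `BULLET_CHARS`, `BULLET_RE`, and `strip_leading_bullet` are hypothetical, and the character list is an illustrative subset; it is not the actual UNICODE_BULLETS pattern or the clean_bullets code from unstructured.

```python
import re

# Illustrative subset (assumption); the library's real bullet list is longer.
BULLET_CHARS = [
    "\u2022",  # BULLET
    "\u2023",  # TRIANGULAR BULLET
    "\u25E6",  # WHITE BULLET
    "\u2043",  # HYPHEN BULLET
    "\u2013",  # EN DASH, the character the changelog entry adds
]

# Match one bullet character at the start of the text, followed by whitespace.
BULLET_RE = re.compile(
    r"^\s*(?:" + "|".join(re.escape(c) for c in BULLET_CHARS) + r")\s+"
)


def strip_leading_bullet(text: str) -> str:
    """Remove a single leading bullet marker; leave all other text untouched."""
    return BULLET_RE.sub("", text, count=1)


print(strip_leading_bullet("\u2013 First item"))
print(strip_leading_bullet("2010\u20132020 report"))
```

Anchoring the pattern at the start of the string and requiring trailing whitespace keeps an EN DASH inside a range such as 2010–2020 intact.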

0.18.28

Enhancement

  • Optimize clean_extra_whitespace_with_index_run (codeflash)
  • Optimize recursive_xy_cut_swapped (codeflash)
  • Optimize _DocxPartitioner._parse_category_depth_by_style_name (codeflash)
  • Optimize VertexAIEmbeddingEncoder._add_embeddings_to_elements (codeflash)
  • Optimize ngrams (codeflash)
  • Optimize stage_for_datasaur (codeflash)
Commits
  • d1f1bdf chorse sep bump to resolve open CVEs (#4205)
  • d4caedf fix: Preserve Line Breaks in Code Blocks During Chunking (#4196)
  • 8f32550 fix(deps): Update semitechnologies/weaviate Docker tag to v1.35.3 (#4135)
  • dbe96e2 fix(deps): Update opensearchproject/opensearch Docker tag to v2.19.4 (#4134)
  • 7b366c5 fix(deps): Update docker.elastic.co/elasticsearch/elasticsearch Docker tag to...
  • f0b0e7c fix: filter coordinates kwargs to prevent TypeError in hi_res PDF processing ...
  • 01c3f7c Token-Based Chunking Support (#4203)
  • c0323a6 fix: remove sandbox=True from pypandoc to fix ODT conversion (#4193)
  • 95fea7e fix(deps): switch from pip-compile to uv pip compile (#4202)
  • 8cb6278 fix: reduce default dpi to 350 (#4199)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [unstructured](https://github.com/Unstructured-IO/unstructured) from 0.18.27 to 0.18.31.
- [Release notes](https://github.com/Unstructured-IO/unstructured/releases)
- [Changelog](https://github.com/Unstructured-IO/unstructured/blob/main/CHANGELOG.md)
- [Commits](Unstructured-IO/unstructured@0.18.27...0.18.31)

---
updated-dependencies:
- dependency-name: unstructured
  dependency-version: 0.18.31
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the backend (Related to the flask backend and general Python stuff) and dependencies (Pull requests that update a dependency file) labels Feb 1, 2026
@dependabot dependabot bot requested a review from a team as a code owner February 1, 2026 19:38
github-actions bot commented Feb 1, 2026

The changes in the pyproject.toml file involve an upgrade of a specific package dependency version. This kind of modification typically aims to incorporate improvements, bug fixes, or new functionality from the updated version.

Walkthrough

  • Chore: Updated unstructured library from version 0.18.27 to 0.18.31 to potentially benefit from the latest bug fixes, performance improvements, or additional features offered by the updated version.

Model: gpt-4o-2024-08-06 | Prompt Tokens: 306 | Completion Tokens: 94

@github-actions github-actions bot left a comment

Here's a supportive code review enhanced with AI assistance. Please note that some suggestions are AI-generated insights rather than definitive requirements, so trust your expertise and apply what feels most valuable. You remain in full control of your code decisions—this AI support is simply here to offer additional perspectives and help streamline your development process.


Always critique what AI says. Do not let AI replace YOUR I.
Model: claude-sonnet-4-20250514 | Prompt Tokens: 854 | Completion Tokens: 260

  'PyYAML==6.0.3',
  'requests==2.32.5',
- 'unstructured==0.18.27',
+ 'unstructured==0.18.31',
Good practice upgrading the unstructured dependency from 0.18.27 to 0.18.31. However, consider using version ranges instead of pinning exact versions for better dependency management and security updates:

'unstructured>=0.18.31,<0.19.0',

This approach allows for patch updates while preventing breaking changes from minor version bumps. The same consideration applies to other pinned dependencies in this file like PyYAML, requests, pandas, and ollama.
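
As a rough illustration of what the suggested range admits, the sketch below checks a few versions against `>=0.18.31,<0.19.0`. The helpers `parse_version` and `in_range` are hypothetical and handle only plain dotted numeric versions; they are a simplification, not a full PEP 440 implementation or the resolver's actual logic.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.18.31' into (0, 18, 31); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing the numeric tuples."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Versions checked against the range suggested in the review comment.
for candidate in ["0.18.27", "0.18.31", "0.18.35", "0.19.0"]:
    print(candidate, in_range(candidate, "0.18.31", "0.19.0"))
```

Under this range, later 0.18.x patch releases would be accepted and 0.19.0 would not, whereas the exact pin in the diff accepts only 0.18.31.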

@marcorosa marcorosa merged commit cd7e399 into develop Feb 9, 2026
5 checks passed
@marcorosa marcorosa deleted the dependabot/uv/backend-agent/develop/unstructured-0.18.31 branch February 9, 2026 13:25
Labels

backend: Related to the flask backend and general Python stuff
dependencies: Pull requests that update a dependency file
