chore: update branch with changes from master #32277

mdrxy · 2025-07-28T14:35:43Z

No description provided.

## Summary

- Fixed redundant word "done" in SECURITY.md line 69
- Fixed grammar errors in Fireworks README.md line 77: "how it fares compares" → "how it compares" and "in terms just" → "in terms of"

## Test plan

- [x] Verified changes improve readability and correct grammar
- [x] No functional changes, documentation only

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <[email protected]>

…generation (#32203)

The `_dereference_refs_helper` in `langchain_core.utils.json_schema` incorrectly handled objects with a reference and other fields.

**Issue**: #32170

# Description

We change the check so that it accepts other keys in the object.
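To illustrate the behavior described here, the following is a minimal sketch of dereferencing a JSON Schema node that carries a `$ref` together with sibling keys. It is not the `langchain_core` implementation: the names `resolve_pointer` and `dereference` are hypothetical, only local `#/...` pointers are handled, and circular references are not guarded against.

```python
from typing import Any


def resolve_pointer(root: dict, ref: str) -> Any:
    """Follow a local pointer such as '#/$defs/Address' (hypothetical helper)."""
    node: Any = root
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


def dereference(node: Any, root: dict) -> Any:
    """Inline every local $ref while keeping any sibling keys next to it."""
    if isinstance(node, dict):
        if "$ref" in node:
            target = dereference(resolve_pointer(root, node["$ref"]), root)
            # Keys that sit beside "$ref" are preserved instead of discarded.
            siblings = {
                key: dereference(value, root)
                for key, value in node.items()
                if key != "$ref"
            }
            return {**target, **siblings}
        return {key: dereference(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [dereference(item, root) for item in node]
    return node


schema = {
    "$defs": {
        "Address": {"type": "object", "properties": {"city": {"type": "string"}}}
    },
    "properties": {
        "home": {"$ref": "#/$defs/Address", "description": "Home address"}
    },
}
resolved = dereference(schema, schema)
print(resolved["properties"]["home"])
```

In this sketch the sibling `description` survives next to the inlined `Address` definition, which is the kind of mixed object the fix is described as accepting.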

fixes #32170

This PR adds scaffolding for the langchain 1.0 entry package.

Most contents have been removed.

Currently remaining entrypoints for:

* chat models
* embedding models
* memory -> trimming messages, filtering messages and counting tokens [we may remove this]
* prompts -> we may remove some prompts
* storage: primarily to support cache backed embeddings, may remove the kv store
* tools -> report tool primitives

Things to be added:

* Selected agent implementations
* Selected workflows
* Common primitives: messages, Document
* Primitives for type hinting: BaseChatModel, BaseEmbeddings
* Selected retrievers
* Selected text splitters

Things to be removed:

* Globals need to be removed (needs an update in langchain core)

Todos:

* TBD indexing api (requires sqlalchemy which we don't want as a dependency)
* Be explicit about public/private interfaces (e.g., likely rename chat_models.base.py to something more internal)
* Remove dockerfiles
* Update module doc-strings and README.md

… inside the div tag (#32213) **Description:** We collect the text from the "html", "body", "div", and "main" nodes, if they have any. **Issue:** Fixes #32206.

Further clean up of namespace:

- Removed prompts (we'll re-add in a separate commit)
- Remove LocalFileStore until we can review whether all the implementation details are necessary
- Remove message processing logic from memory (we'll figure out where to expose it)
- Remove `Tool` primitive (should be sufficient to use `BaseTool` for typing purposes)
- Remove utilities to create kv stores. Unclear if they've had much usage outside MultiparentRetriever

#32230)

See #32098 (comment)

…_body` + others (#32149)

This PR addresses the common issue where users struggle to pass custom parameters to OpenAI-compatible APIs like LM Studio, vLLM, and others. The problem occurs when users try to use `model_kwargs` for custom parameters, which causes API errors.

## Problem

Users attempting to pass custom parameters (like LM Studio's `ttl` parameter) were getting errors:

```python
# ❌ This approach fails
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    model_kwargs={"ttl": 5}  # Causes TypeError: unexpected keyword argument 'ttl'
)
```

## Solution

The `extra_body` parameter is the correct way to pass custom parameters to OpenAI-compatible APIs:

```python
# ✅ This approach works correctly
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 5}  # Custom parameters go in extra_body
)
```

## Changes Made

1. **Enhanced Documentation**: Updated the `extra_body` parameter docstring with comprehensive examples for LM Studio, vLLM, and other providers
2. **Added Documentation Section**: Created a new "OpenAI-compatible APIs" section in the main class docstring with practical examples
3. **Unit Tests**: Added tests to verify `extra_body` functionality works correctly:
   - `test_extra_body_parameter()`: Verifies custom parameters are included in request payload
   - `test_extra_body_with_model_kwargs()`: Ensures `extra_body` and `model_kwargs` work together
4. **Clear Guidance**: Documented when to use `extra_body` vs `model_kwargs`

## Examples Added

**LM Studio with TTL (auto-eviction):**

```python
ChatOpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 300}  # Auto-evict after 5 minutes
)
```

**vLLM with custom sampling:**

```python
ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="meta-llama/Llama-2-7b-chat-hf",
    extra_body={
        "use_beam_search": True,
        "best_of": 4
    }
)
```

## Why This Works

- `model_kwargs` parameters are passed directly to the OpenAI client's `create()` method, causing errors for non-standard parameters
- `extra_body` parameters are included in the HTTP request body, which is exactly what OpenAI-compatible APIs expect for custom parameters

Fixes #32115.

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: mdrxy <[email protected]>
Co-authored-by: Mason Daugherty <[email protected]>

… blocks (#32235) widespread cleanup attempt

… of non-ASCII characters. (#32222)

fix: Fix LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic Unicode responses due to forced Unicode conversion of non-ASCII characters in tool calls. The fix involves disabling the `ensure_ascii` flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes ↓↓↓

input:

```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```

output:

```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```

The LLM then mimics the escaped Unicode in its own output. The escape sequences can lengthen LLM responses, leading to slower performance.

<img width="686" height="277" alt="image" src="https://github.com/user-attachments/assets/28f3b007-3964-4455-bee2-68f86ac1906d" />

---------

Co-authored-by: Mason Daugherty <[email protected]>
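The effect of the `ensure_ascii` flag can be reproduced with the standard library alone. The snippet below is a small sketch using the customer name from the example above; the variable names are illustrative and not taken from the LangChain code.

```python
import json

arguments = {"customer_name": "你好啊集团"}

# Default behavior: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(arguments)

# With ensure_ascii=False the original characters are written as-is.
preserved = json.dumps(arguments, ensure_ascii=False)

print(escaped)
print(preserved)
print(len(escaped), len(preserved))
```

Both strings decode to the same object with `json.loads`, so the change only affects how the arguments look when they are sent back to the model.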

…es (#32200)

**Description:** Fixes a bug in the file callback test where ANSI escape codes were causing test failures. The improved test now properly handles ANSI escape sequences by:

- Using exact string comparison instead of substring checking
- Applying the `strip_ansi` function consistently to all file contents
- Adding descriptive assertion messages
- Maintaining test coverage and backward compatibility

The changes ensure tests pass reliably even when terminal control sequences are present in the output.

**Issue:** Fixes #32150

**Dependencies:** None required - uses existing dependencies only.

---------

Co-authored-by: Eugene Yurtsev <[email protected]>
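As a rough sketch of the testing approach described above, the snippet below strips ANSI color sequences and then compares the full string exactly, with a descriptive assertion message. The regular expression and this `strip_ansi` body are assumptions written for illustration (they cover common CSI sequences such as colors only); the helper used in the actual test may differ.

```python
import re

# Matches common CSI sequences such as "\x1b[1m" or "\x1b[32;1m" (assumed pattern).
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def strip_ansi(text: str) -> str:
    """Remove ANSI escape sequences so file contents can be compared exactly."""
    return ANSI_ESCAPE.sub("", text)


raw = "\x1b[1m> Entering new chain...\x1b[0m\n\x1b[32mdone\x1b[0m\n"
cleaned = strip_ansi(raw)

expected = "> Entering new chain...\ndone\n"
# Exact comparison rather than a substring check, with a descriptive message.
assert cleaned == expected, f"Unexpected file contents: {cleaned!r}"
```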

See https://docs.astral.sh/ruff/rules/private-member-access/

Thank you for contributing to LangChain!

- **Adding documentation for PGVectorStore**: docs: Adding documentation for the new PGVectorStore as a part of langchain-postgres
- **Add docs**: The notebook for PGVectorStore is now added to the directory `docs/docs/integrations`. As a part of this change, we've also updated the VectorStore features table and VectorStoreTabs

---------

Co-authored-by: Chester Curme <[email protected]>
Co-authored-by: Eugene Yurtsev <[email protected]>

Harden permissions for api docs build workflow

**TL;DR many of the provided `Makefile` targets were broken, and any time I wanted to preview changes locally I either had to refer to a command Chester gave me or try waiting on a Vercel preview deployment. With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing on improving usability, adding clear messaging, and fixing/enhancing documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:

- Added informative messages (e.g., "📚 Building LangChain documentation...") to makefile targets like `docs_build`, `docs_clean`, and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear cached dependencies and ensure clean builds.

#### Improved dependency handling:

- Modified `install-py-deps` to create a `.venv/deps_installed` marker, preventing redundant/duplicate dependency installations and improving efficiency.

#### Streamlined file generation and infrastructure setup:

- Added caching for the LangServe README download and parallelized feature table generation
- Added user-friendly completion messages for targets like `copy-infra` and `render`.

#### Documentation server updates:

- Enhanced the `start` target with messages indicating server start and URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:

- Standardized section titles for consistency across documentation files.
- Refined phrasing and formatting in sections like "Dependency management" and "Formatting and linting" for better readability.

#### Enhanced workflows:

- Updated instructions for building and viewing documentation locally, including tips for specifying server ports and handling API reference previews.
- Expanded guidance on cleaning documentation artifacts and using linting tools effectively.

#### API reference documentation:

- Improved instructions for generating and formatting in-code documentation, highlighting best practices for docstring writing.

---

### Minor Changes:

- Added support for a new package name (`langchain_v1`) in the API documentation generation script.
- Fixed minor capitalization and formatting issues in documentation files.

---------

Co-authored-by: Copilot <[email protected]>

No need for allowing `.`

…build workflow (#32246) for build

> × No solution found when resolving dependencies:
> ╰─▶ Because only langchain-neo4j==0.5.0 is available and langchain-neo4j==0.5.0 depends on neo4j-graphrag>=1.9.0, we can conclude that all versions of langchain-neo4j depend on neo4j-graphrag>=1.9.0.
> And because only neo4j-graphrag<=1.9.0 is available and neo4j-graphrag==1.9.0 depends on pypdf>=5.1.0,<6.0.0, we can conclude that all versions of langchain-neo4j depend on pypdf>=5.1.0,<6.0.0.
> And because langchain-upstage==0.6.0 depends on pypdf>=4.2.0,<5.0.0 and only langchain-upstage==0.6.0 is available, we can conclude that all versions of langchain-neo4j and all versions of langchain-upstage are incompatible.
> And because you require langchain-neo4j and langchain-upstage, we can conclude that your requirements are unsatisfiable.

---------

Co-authored-by: Mason Daugherty <[email protected]>

…ides.txt (#32247)

…s in the AI21 package ecosystem are resolved. (#32248)

See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc

See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg Co-authored-by: Mason Daugherty <[email protected]>

- **Description:** This PR updates the internal documentation link for the RAG tutorials to reflect the updated path. Previously, the link pointed to the root `/docs/tutorials/`, which was generic. It now correctly routes to the RAG-specific tutorial page.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** N/A

…32261) Following existing codebase conventions

Ensures proper reStructuredText formatting by adding the required blank line before closing docstring quotes, which resolves the "Block quote ends without a blank line; unexpected unindent" warning.

should resolve the file sharing issue for users on macOS.

ensure all relevant packages are correctly processed - cli wasn't included, also fix ValueError

…ount (#32273)

**Description:** Fixes incorrect `num_skipped` count in the LangChain indexing API. The current implementation only counts documents that already exist in RecordManager (cross-batch duplicates) but fails to count documents removed during within-batch deduplication via `_deduplicate_in_order()`. This PR adds tracking of the original batch size before deduplication and includes the difference in `num_skipped`, ensuring that `num_added + num_skipped` equals the total number of input documents.

**Issue:** Fixes incorrect document count reporting in indexing statistics

**Dependencies:** None

Fixes #32272

---------

Co-authored-by: Alex Feel <[email protected]>
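The counting rule described above can be sketched in a few lines. This is a simplified model, not the indexing API itself: documents are plain strings, the record manager is a `set`, and `deduplicate_in_order` and `index_batch` are hypothetical stand-ins for the real helpers.

```python
def deduplicate_in_order(items: list[str]) -> list[str]:
    """Drop repeated items while keeping first-seen order."""
    seen: set[str] = set()
    unique: list[str] = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def index_batch(batch: list[str], existing: set[str]) -> dict[str, int]:
    """Index one batch and report how many documents were added or skipped."""
    original_size = len(batch)  # tracked before deduplication
    unique = deduplicate_in_order(batch)
    # Within-batch duplicates count as skipped.
    num_skipped = original_size - len(unique)
    num_added = 0
    for doc in unique:
        if doc in existing:  # cross-batch duplicate, already recorded
            num_skipped += 1
        else:
            existing.add(doc)
            num_added += 1
    return {"num_added": num_added, "num_skipped": num_skipped}


stats = index_batch(["a", "b", "b", "c"], existing={"a"})
print(stats)
```

With four input documents, one already recorded and one repeated inside the batch, the two counters sum to the input size, which is the invariant the fix is meant to restore.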

vercel · 2025-07-28T14:35:51Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| langchain | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jul 28, 2025 2:35pm |

codspeed-hq · 2025-07-28T14:37:15Z

CodSpeed WallTime Performance Report

Merging #32277 will not alter performance

Comparing master (f0b6baa) with master (12c0e9b)¹

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 13 untouched benchmarks

¹ No successful run was found on wip-v0.4 (3496e17) during the generation of this report, so master (12c0e9b) was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

codspeed-hq · 2025-07-28T14:42:58Z

CodSpeed Instrumentation Performance Report

Merging #32277 will not alter performance

Comparing master (f0b6baa) with master (12c0e9b)¹

Summary

✅ 14 untouched benchmarks

¹ No successful run was found on wip-v0.4 (3496e17) during the generation of this report, so master (12c0e9b) was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

cluster2600 and others added 30 commits July 23, 2025 10:43

release(core): 0.3.72 (#32214)

bd3d649

fixes #32170

fix(langchain): class HTMLSemanticPreservingSplitter ignores the text…

622bb05

… inside the div tag (#32213) **Description:** We collect the text from the "html", "body", "div", and "main" nodes, if they have any. **Issue:** Fixes #32206.

release(langchain): 0.3.27 (#32227)

0e139fb

fix(langchain): update langchain-core version to 0.3.72

2c42893

Merge branch 'master' of github.com:langchain-ai/langchain

71ad451

fix(text-splitters): update lock for release

7f015b6

fix(text-splitters): update langchain-core version to 0.3.72

77c9819

fix(langchain): update deps

bdf1cd3

chore: update copilot development guidelines for clarity and structure (

6f3169e

#32230)

refactor(langchain): refactor unit test stub classes (#32209)

0b34be4

See #32098 (comment)

fix: various typos (#32231)

7d2a13f

fix(docs): capitalization, codeblock formatting, and hyperlinks, note…

d53ebf3

… blocks (#32235) widespread cleanup attempt

chore(langchain): add ruff rules SLF (#32112)

e1238b8

See https://docs.astral.sh/ruff/rules/private-member-access/

chore(langchain): add ruff rules D1 (except D100 and D104) (#32123)

12ae42c

chore(infra): harden api docs build workflow (#32243)

549ecd3

Harden permissions for api docs build workflow

ci(infra): no need for . in the regexp (#32245)

db22311

No need for allowing `.`

fix(docs): add validation for repository format and name in API docs …

df20f11

…build workflow (#32246) for build

fix(docs): update protobuf version constraint to <5.0 in vercel_overr…

c102817

…ides.txt (#32247)

fix(docs): temporary workaround until the underlying dependency issue…

5ecbb5f

…s in the AI21 package ecosystem are resolved. (#32248)

chore(langchain): add ruff rules TC (#31921)

a2ad5ac

See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc

cbornet and others added 19 commits July 26, 2025 18:32

chore(langchain): add ruff rules ARG (#32110)

efdfa00

See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg Co-authored-by: Mason Daugherty <[email protected]>

refactor: markdownlint SECURITY.md (#32258)

eafab52

refactor: markdownlint (#32259)

53d0bfe

fix: devcontainer (#32260)

c6cb1fa

refactor: enhance workflow names and descriptions for clarity (#32262)

9d38f17

fix: update links in SECURITY.md to use markdown format

62212c7

feat: add markdownlint configuration file (#32264)

e0ef98d

fix: update service name in devcontainer configuration

5f5b87e

fix: update dev container name to match service name

5295f2a

chore: add .editorconfig for consistent coding styles across files (#…

d1679ce

…32261) Following existing codebase conventions

fix: update workspace folder path in devcontainer configuration

f4ff451

Merge branch 'master' of github.com:langchain-ai/langchain

a8a2cff

fix: formatting issues in docstrings (#32265)

96cbd90

Ensures proper reStructuredText formatting by adding the required blank line before closing docstring quotes, which resolves the "Block quote ends without a blank line; unexpected unindent" warning.

feat: add VSCode configuration files for Python development (#32263)

904066f

fix: devcontainer to use volume to store the workspace (#32266)

caf1919

should resolve the file sharing issue for users on macOS.

fix: explicitly tell uv to copy when using devcontainer (#32267)

ed682ae

fix(docs): local API reference documentation build (#32271)

12c0e9b

ensure all relevant packages are correctly processed - cli wasn't included, also fix ValueError

mdrxy requested review from eyurtsev, baskaryan and ccurme as code owners July 28, 2025 14:35

mdrxy added the `ignore-lint-pr-title` label (⚠️ Shouldn't be regularly used! Bypasses the Validate PR Title linting action) Jul 28, 2025

mdrxy changed the title ~~update v0.4 branch~~ chore: update branch with changes from master Jul 28, 2025

mdrxy merged commit 5e9eb19 into wip-v0.4 Jul 28, 2025
317 of 318 checks passed
