Draft
Conversation
We now allow requesting the LLM directly through the OpenAI-compatible endpoints. This is useful when we want to interact with the LLM without relying on documents, e.g. for translation, rephrasing, etc. To use it, the client must simply omit the `model` field. Note that in this approach, no system prompt is used; it is entirely up to the client.
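A minimal sketch of what such a document-free request body might look like. This assumes the endpoint follows the standard OpenAI chat-completions shape; the field names below come from that convention, not from OpenRAG internals.

```python
import json

def build_chat_payload(messages: list[dict]) -> dict:
    # Deliberately no "model" key: per the feature above, omitting it routes
    # the request straight to the configured LLM, and no system prompt is
    # injected server-side -- the client controls the full conversation.
    return {"messages": messages}

body = json.dumps(build_chat_payload(
    [{"role": "user", "content": "Rephrase: the cat sat on the mat."}]
))
```

The client is then responsible for supplying its own system message if one is needed.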
docs: Documenting openrag's env variables
Removed the release job from the GitHub Actions workflow.
We improve disk file saving as follows:
- Do not load the whole file into memory; instead, stream the HTTP buffer and write in chunks
- Avoid blocking I/O, to allow parallel writes
- Add a random prefix to saved files, to avoid name collisions
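A minimal sketch of the chunked-write-plus-random-prefix idea (function and variable names are illustrative, not OpenRAG's actual code; the real implementation also offloads the blocking writes so the event loop stays free):

```python
import os
import uuid

CHUNK_SIZE = 64 * 1024  # stream in 64 KiB chunks instead of loading the whole body

def save_upload(stream, filename: str, dest_dir: str) -> str:
    # Random prefix avoids collisions when two clients upload the same filename.
    unique_name = f"{uuid.uuid4().hex}_{filename}"
    path = os.path.join(dest_dir, unique_name)
    with open(path, "wb") as out:
        # Read/write chunk by chunk: memory stays bounded regardless of file size.
        while chunk := stream.read(CHUNK_SIZE):
            out.write(chunk)
    return path
```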
Add a new `v1/tools` endpoint to allow custom tool execution. This is useful for running specific openRAG features, such as text extraction, without the semantic search functionalities.
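One common way to back such an endpoint is a name-to-callable registry. This is only a sketch of that pattern; the `extractText` body below is a stand-in, not OpenRAG's extractor.

```python
from typing import Any, Callable

# Registry a `v1/tools` endpoint could dispatch to.
TOOLS: dict[str, Callable[..., Any]] = {}

def register_tool(name: str):
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register_tool("extractText")
def extract_text(raw: bytes) -> str:
    # Real extraction would parse PDFs, DOCX, etc.; this just decodes bytes.
    return raw.decode("utf-8", errors="replace")

def execute_tool(name: str, **kwargs) -> Any:
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Unknown tool names raise `KeyError`, which the endpoint can map to an HTTP 404.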
Merge for release v1.1.5
Remove `max_tokens` default value to avoid cutting mid-generation.
new release 1.1.7
docs: add documentation for the spoken-style answer prompt
Expose read-only MCP search tools over streamable HTTP while reusing OpenRAG's high-level indexer search flow and enforcing existing token-based partition access.
Introduce components/app/ with an abstract interface (OpenRAGApiInterface) and a concrete implementation (OpenRAGApplicationService) that wraps both SearchToolService and IndexationService. The service exposes every operation needed by the API routers and the MCP server under a single, testable surface:
- search (documents / partition / file)
- indexation catalog (list/get/delete files, chunks, partitions, users)
- file ingestion (add, replace, copy, update metadata, index URL)
- task management (status, error, logs, cancel)
- Ray actor management (list, restart)
- OpenAI-compatible endpoints (models, chat/completion)
- tools (extractText)
RagPipeline is instantiated lazily and cached on the service instance to avoid recreating it on every request. Includes full unit-test coverage (29 tests) in test_service.py.
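The lazy-instantiation-with-caching described above can be sketched as follows. This is illustrative only: the class and property names mirror the description, the bodies are stand-ins.

```python
class ApplicationServiceSketch:
    def __init__(self, pipeline_factory):
        # The factory stands in for something like `RagPipeline`; construction
        # is deferred until the first request that actually needs it.
        self._pipeline_factory = pipeline_factory
        self._pipeline = None

    @property
    def pipeline(self):
        # Created on first access, then cached on the instance so subsequent
        # requests reuse the same object instead of rebuilding it.
        if self._pipeline is None:
            self._pipeline = self._pipeline_factory()
        return self._pipeline
```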
Replace the separate search_service + indexation_service construction in mcp_server.py with a single OpenRAGApplicationService instance. All tool handlers now call app_service directly. Backward-compatible module-level aliases (search_service, indexation_service) are kept so existing tests that monkeypatch those names continue to work without modification.
…vice
Replace direct Ray actor calls and ad-hoc business logic in every API
router with delegated calls to a module-level OpenRAGApplicationService
instance. No endpoint behaviour changes.
Per-router highlights:
- actors.py – list_ray_actors / restart_actor delegate to service;
removes inline ray.kill / actor recreation code
- extract.py – get_chunk_by_id delegates to service; maps KeyError →
404 and PermissionError → 403
- indexer.py – add/replace/delete/patch/copy file, task status/error/
logs/cancel all delegate to service; removes unused json
and ray imports
- openai.py – list_models and chat/completion pipelines delegate to
service; removes unused consts import
- partition.py – all partition CRUD and membership operations delegate
to service
- queue.py – get_queue_info and list_my_tasks delegate to service;
removes unused Counter import and dead _format_pool_info
helper
- search.py – all three search endpoints delegate to service; removes
unused get_indexer dependency from function signatures
- tools.py – list_tools and execute_tool delegate to service
- users.py – all user CRUD operations delegate to service
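The KeyError → 404 / PermissionError → 403 mapping mentioned for extract.py can be sketched framework-free as a decorator (here returning `(status, detail)` tuples instead of raising HTTP exceptions; the real routers would use their web framework's error types):

```python
def map_service_errors(fn):
    # Translate service-layer exceptions into HTTP-style status codes.
    def wrapper(*args, **kwargs):
        try:
            return 200, fn(*args, **kwargs)
        except KeyError as exc:       # e.g. unknown chunk id
            return 404, str(exc)
        except PermissionError as exc:  # e.g. caller lacks partition access
            return 403, str(exc)
    return wrapper
```

This keeps routers thin: they delegate to the service and only decide how failures surface over HTTP.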
mcp_server.py now exposes a single app_service (OpenRAGApplicationService) instead of separate search_service and indexation_service objects. Update the patched_services fixture to build a composite MagicMock that forwards every method to the appropriate fake service, then monkeypatches mcp_mod.app_service with it. The module-level search_service and indexation_service aliases are also patched with the same composite to keep any path that still references them working.
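A composite MagicMock that forwards to two fake services, as the fixture above describes, might be built like this (the helper name and the forwarding-by-`dir()` approach are assumptions, not the actual fixture code):

```python
from unittest.mock import MagicMock

def make_composite(search_fake, indexation_fake) -> MagicMock:
    # Copy every public method from each fake onto one mock, so code that
    # expects a single app_service finds all operations in one place.
    composite = MagicMock()
    for fake in (search_fake, indexation_fake):
        for name in dir(fake):
            if not name.startswith("_"):
                setattr(composite, name, getattr(fake, name))
    return composite
```

Because the result is still a MagicMock, any method not explicitly forwarded is auto-created, which keeps unrelated call sites from blowing up mid-test.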
…dget
- Add LLM_CONTEXT_WINDOW config (default 8192) so the context budget is derived from the actual model limit rather than top_k × chunk_size
- Reserve 2048 tokens for system-prompt + chat-history overhead; the remaining tokens become max_context_tokens for retrieved documents
- Replace the ChatOpenAI/tiktoken tokenizer in format_context() with a conservative char-based estimator (4 chars/token) that works correctly across all LLM backends, including Mistral (tiktoken systematically undercounts Mistral tokens, causing context overflows)
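The budget arithmetic and the 4-chars-per-token estimator can be sketched as below. The constants come from the description above; `fit_chunks` is a hypothetical illustration of spending the remaining budget on retrieved documents, not the actual `format_context()` code.

```python
CHARS_PER_TOKEN = 4       # conservative heuristic; real tokenizers vary by model
LLM_CONTEXT_WINDOW = 8192
RESERVED_TOKENS = 2048    # system prompt + chat history overhead
MAX_CONTEXT_TOKENS = LLM_CONTEXT_WINDOW - RESERVED_TOKENS  # 6144 for documents

def estimate_tokens(text: str) -> int:
    # Ceiling division; even an empty string costs at least one token.
    return max(1, -(-len(text) // CHARS_PER_TOKEN))

def fit_chunks(chunks: list[str], budget: int = MAX_CONTEXT_TOKENS) -> list[str]:
    # Keep whole chunks in order until the next one would overflow the budget.
    kept, used = [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept
```

Overestimating per-token cost errs on the safe side: better to drop a chunk than to overflow a Mistral context window.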
…ualization
- Add VLM_CONTEXT_WINDOW config (default 8192), mirroring LLM_CONTEXT_WINDOW
- ChunkContextualizer now reads context_window from llm_config and truncates first_chunks / prev_chunks / current_chunk content before building the user message, guaranteeing the total input never exceeds the model's context limit
- Replace ChatOpenAI().get_num_tokens (tiktoken, which underestimates non-GPT models) with a char-based estimator in BaseChunker._length_function; remove the now-unused self.llm ChatOpenAI instance from BaseChunker.__init__
The get_file_chunks tool was returning all chunks of a file at once.
For large files this easily exceeds the MCP client LLM's context window
(observed: 36,877 tokens sent to an 8192-token model).
Add offset/limit pagination (default limit=10) so the model receives a
safe-sized page per call. The response now includes total_chunks,
offset, limit, and has_more so the model knows when to keep paging.
The REST route GET /{partition}/file/{file_id} only uses chunk IDs for
link generation (never returns content), so it passes limit=100_000 to
retrieve all IDs in a single call unaffected by pagination.
…etch Replace the magic large number with -1 as the canonical sentinel for 'no limit' in get_file_chunks. The REST route that fetches all chunk IDs for link generation now passes limit=-1 explicitly.
Image-captioned chunks can be ~1250 tokens each, making limit=10 exceed an 8192-token context window before the conversation history is even counted. Drop the default to 3 and document the constraint in the tool description so models know to keep limit small.
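Putting the pagination commits together, the resulting behaviour might look like this sketch (field names `total_chunks` / `offset` / `limit` / `has_more`, the default `limit=3`, and the `-1` sentinel are from the descriptions above; the function body is illustrative):

```python
def get_file_chunks(chunks: list[dict], offset: int = 0, limit: int = 3) -> dict:
    # limit=-1 is the canonical sentinel for "no limit", used by the REST
    # link-generation route that only needs chunk IDs.
    page = chunks[offset:] if limit == -1 else chunks[offset:offset + limit]
    return {
        "chunks": page,
        "total_chunks": len(chunks),
        "offset": offset,
        "limit": limit,
        # Tells the MCP client's model whether to request another page.
        "has_more": limit != -1 and offset + limit < len(chunks),
    }
```

A small default page plus an explicit `has_more` flag lets the model walk a large file incrementally instead of blowing its context window in one call.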
…exist When a user attempts to index a file into a non-existent partition via the MCP index_url tool, the request was failing with 'Access denied for partition' because _enforce_partition_access only checks membership, not existence. Add _ensure_partition_exists to IndexationService, called before the access check in index_url: if the partition is absent from both the DB and the caller's membership list, it is created with the caller as owner, mirroring the silent-allow behaviour already present in the REST API path (routers/utils.py:ensure_partition_role).
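The order of operations matters here: the existence check must run before the access check. A toy sketch of that silent-allow flow, using a plain dict in place of the DB (names and shapes are illustrative, not IndexationService's code):

```python
def ensure_partition_exists(db: dict, partition: str, user: str) -> None:
    # Silent-allow: a missing partition is created with the caller as owner,
    # mirroring the REST path's ensure_partition_role behaviour.
    if partition not in db:
        db[partition] = {"owner": user, "members": {user}}

def index_url(db: dict, partition: str, user: str, url: str) -> str:
    ensure_partition_exists(db, partition, user)  # must precede the access check
    if user not in db[partition]["members"]:
        raise PermissionError(f"Access denied for partition {partition}")
    return f"queued {url} into {partition}"
```

With the checks in the old order, a brand-new partition always failed the membership test, producing the misleading "Access denied" error described above.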
… check index_url was calling indexer.add_file.remote without the user argument, causing set_details to store user_id=None. Any subsequent call to get_indexation_task_status would then raise PermissionError because the authenticated user's id did not match the None stored in the task details.
This feature should implement #272