Skip to content

Conversation

@nerpaula
Copy link
Contributor

@nerpaula nerpaula commented Oct 17, 2025

Description

Importer & Retriever: Update startup parameters., including descriptions and parameters for instant and deep search.

TO DO:

  • Apply changes to all versions

@arangodb-docs-automation
Copy link
Contributor

Deploy Preview Available Via
https://deploy-preview-809--docs-hugo.netlify.app

@cla-bot cla-bot bot added the cla-signed label Oct 17, 2025
@nerpaula nerpaula self-assigned this Oct 17, 2025
cursor[bot]

This comment was marked as outdated.

Copy link
Member

@aMahanna aMahanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! minor comment about including embedding_api_provider

"openrouter_model": "mistralai/mistral-nemo" // Specify a model here
"chat_api_provider": "openai",
"embedding_api_provider": "openai",
"chat_api_url": "https://openrouter.ai/api/v1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine - since setting openai as the provider means that we just use the OpenAI() client to interact with the OpenRouter URL, which is OpenAI-compatible

Copy link
Contributor

@diegomendez40 diegomendez40 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple of comments.

@bluepal-pavan-kothapalli
Copy link

bluepal-pavan-kothapalli commented Oct 31, 2025

IMHO: It would be great if we could provide short documentation on how to create a project in genai-service.

Create a New Project

Endpoint: POST /v1/project

Validation: The name must be 1–63 characters long and can only contain letters, numbers, underscores, and hyphens.

Request Body:

json
{
"project_name": "my_project_1",
"project_type": "ML",
"project_description": "My project description"
}
This project can then be referenced in other services, like the importer or retriever, using the genai_project_name field:

json
{
"genai_project_name": "my_project_1"
}

WDYT?
CC @diegomendez40

@nerpaula
Copy link
Contributor Author

nerpaula commented Oct 31, 2025

IMHO: It would be great if we could provide short documentation on how to create a project in genai-service.

Create a New Project

Endpoint: POST /v1/project

Validation: The name must be 1–63 characters long and can only contain letters, numbers, underscores, and hyphens.

Request Body:

json { "project_name": "my_project_1", "project_type": "ML", "project_description": "My project description" } This project can then be referenced in other services, like the importer or retriever, using the genai_project_name field:

json { "genai_project_name": "my_project_1" }

WDYT? CC @diegomendez40

@bluepal-pavan-kothapalli There is a section on how to create a new project, in importer.md line 33. We should however reference this in the Retriever page as well.

@cursor
Copy link

cursor bot commented Oct 31, 2025

You have run out of free Bugbot PR reviews for this billing cycle. This will reset on November 28.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

@nerpaula
Copy link
Contributor Author

nerpaula commented Nov 4, 2025

@diegomendez40 @bluepal-pavan-kothapalli FYI changes from my two latest commits:

  • remove username from startup parameters
  • add project_db_name as required parameter when creating a new project
  • after some careful consideration, I have decided that the Project endpoint deserves a better description and a section of its own in the GenAI Orchestrator service. I have further improved the content and moved it over. The Importer and Retriever have references to the respective section and marked as a prerequisite.

Copy link
Contributor

@diegomendez40 diegomendez40 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work, @nerpaula. Unfortunately, I have found a number of possible enhancements and corrections.

While I added most of them to the 3.12 folder, they do also apply to 3.13.

## Using Profiles in Creation Request Body
### Creating a project

To create a new GraphRAG project, send a POST request to the project endpoint:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to mention that only 3 services require a project:

  • Importer
  • Retrievers
  • VirtualGraph (but that's not part of this release as of today, will follow very soon)

https://github.com/arangoml/GenAI-Service/blob/main/gen_ai/naming.py#L26-L29

- `triton_model`: Name of the LLM model to use for text processing.

### Using OpenAI (Public LLM)
### Using OpenAI for chat and embedding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"openai" doesn't stand for OpenAI, but rather for any OpenAI-compatible API, including essentially any large LLM provider: OpenRouter, Gemini, Anthropic, corporate LLMs, etc.

The URL can point to the relevant non-OpenAI endpoint, even if it is served via an OpenAI-compatible API.

{{< /info >}}

### Using OpenRouter (Gemini, Anthropic, etc.)
### Using OpenRouter for chat and OpenAI for embedding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just OpenRouter. It's literally any OpenAI compatible API.

Comment on lines +21 to +24
- **Instant search**: Focuses on specific entities and their relationships, ideal
for fast queries about particular concepts.
- **Deep search**: Analyzes the knowledge graph structure to identify themes and patterns,
perfect for comprehensive insights and detailed summaries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, these definitions are incorrect. These are the definitions for global and local. However, I had already provided the relevant definitions for instant vs. deep search, which can be used here.

Comment on lines +62 to +67
Deep Search is designed for highly detailed, accurate responses that require understanding
what kind of information is available in different parts of the knowledge graph and
sequentially retrieving information in an LLM-guided research process. Use whenever
detail and accuracy are required (e.g. aggregation of highly technical details) and
very short latency is not (i.e. caching responses for frequently asked questions,
or use case with agents or research use cases).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

The request parameters are the following:
- `query`: Your search query text.
- `level`: The community hierarchy level to use for the search (`1` for top-level communities).
- `level`: The community hierarchy level to use for the search (`1` for top-level communities). Defaults to `2` if not provided.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the 'level' parameter. That's for global queries.

Comment on lines -223 to -224
- `1`: Global search.
- `2`: Local search.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global and local queries still exist. They are supported.

The fact that we don't use them on the UI doesn't mean they can't be used by experts.

Comment on lines +262 to +263
- `UNIFIED`: Instant search.
- `LOCAL`: Deep search.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't exactly right.

Local with no LLM planner is the typical local query.

Local with LLM planner is Deep Search.

{{< /info >}}

### Using OpenRouter (Gemini, Anthropic, etc.)
### Using OpenRouter for chat and OpenAI for embedding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, this should mention any OpenAI compatible API, not just OpenRouter

Comment on lines +21 to +24
- **Instant search**: Focuses on specific entities and their relationships, ideal
for fast queries about particular concepts.
- **Deep search**: Analyzes the knowledge graph structure to identify themes and patterns,
perfect for comprehensive insights and detailed summaries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes above (on 3.12) should be replicated for 3.13.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants