
Commit 3c319d4

feat: Enhance n8dex with multi-LLM, local search, and UI improvements
This commit introduces several key enhancements to the n8dex project:

1. **Multi-LLM Compatibility:**
   * The backend now supports configurable LLM providers (Gemini, OpenRouter, DeepSeek) via environment variables (`LLM_PROVIDER`, `LLM_API_KEY`, etc.).
   * The frontend includes a dropdown to select the desired LLM provider.
2. **Local Network Search:**
   * Added functionality to search local network HTML content.
   * Configured via environment variables (`ENABLE_LOCAL_SEARCH`, `LOCAL_SEARCH_DOMAINS`, `SEARCH_MODE`).
   * The frontend provides a "Search Scope" dropdown to control search behavior (internet only, local only, combined modes).
3. **LangSmith Monitoring Toggle:**
   * The backend respects the `LANGSMITH_ENABLED` environment variable for global control.
   * The frontend UI includes a toggle for LangSmith tracing; the user's preference is passed to the backend.
4. **Frontend UI Enhancements:**
   * Updated the overall theme to a brighter, more enterprise-friendly light theme.
   * Added UI elements for selecting the LLM provider, LangSmith preference, and search scope.
   * Improved styling of chat messages and input forms.
5. **Backend Refinements & Testing:**
   * Refactored backend configuration and graph logic to support the new features.
   * Added a suite of unit tests for backend components (configuration, graph logic, local search tool) to ensure stability.
6. **Documentation:**
   * Updated `README.md` extensively to cover all new features, environment variables, and UI options.

Note: Integration of specific "Finance" and "HR" frontend sections is deferred pending example code.
1 parent: fddf107 · commit: 3c319d4

File tree

13 files changed: +1518 −261 lines

README.md

Lines changed: 83 additions & 13 deletions
````diff
@@ -8,12 +8,20 @@ This project demonstrates a fullstack application using a React frontend and a LangGraph backend.
 
 - 💬 Fullstack application with a React frontend and LangGraph backend.
 - 🧠 Powered by a LangGraph agent for advanced research and conversational AI.
-- 🔍 Dynamic search query generation using Google Gemini models.
+- 💡 **Multi-LLM Support:** Flexibility to use different LLM providers (Gemini, OpenRouter, DeepSeek).
+- 🔍 Dynamic search query generation using the configured LLM.
 - 🌐 Integrated web research via Google Search API.
+- 🏠 **Local Network Search:** Optional capability to search within configured local domains.
+- 🔄 **Flexible Search Modes:** Control whether to search the internet, the local network, or both, and in which order.
 - 🤔 Reflective reasoning to identify knowledge gaps and refine searches.
 - 📄 Generates answers with citations from gathered sources.
+- 🎨 **Updated UI Theme:** Modern, light theme for improved readability and a professional look.
+- 🛠️ **Configurable Tracing:** LangSmith tracing can be enabled/disabled.
 - 🔄 Hot-reloading for both frontend and backend during development.
 
+### Upcoming Features
+- Dedicated "Finance" and "HR" sections for specialized research tasks.
+
 ## Project Structure
 
 The project is divided into two main directories:
````
````diff
@@ -29,10 +37,7 @@ Follow these steps to get the application running locally for development and testing:
 
 - Node.js and npm (or yarn/pnpm)
 - Python 3.8+
-- **`GEMINI_API_KEY`**: The backend agent requires a Google Gemini API key.
-  1. Navigate to the `backend/` directory.
-  2. Create a file named `.env` by copying the `backend/.env.example` file.
-  3. Open the `.env` file and add your Gemini API key: `GEMINI_API_KEY="YOUR_ACTUAL_API_KEY"`
+- **API Keys & Configuration:** The backend agent requires API keys depending on the chosen LLM provider and other features. See the "Configuration" section below for details on setting up your `.env` file in the `backend/` directory.
 
 **2. Install Dependencies:**
 
````
````diff
@@ -42,6 +47,11 @@ Follow these steps to get the application running locally for development and testing:
 cd backend
 pip install .
 ```
+*Note: If you plan to use the Local Network Search feature, ensure you install its dependencies:*
+```bash
+pip install ".[local_search]"
+```
+*(Or `pip install requests beautifulsoup4` if you manage dependencies manually)*
 
 **Frontend:**
 
````
````diff
@@ -57,21 +67,76 @@ npm install
 ```bash
 make dev
 ```
-This will run the backend and frontend development servers. Open your browser and navigate to the frontend development server URL (e.g., `http://localhost:5173/app`).
+This will run the backend and frontend development servers. Open your browser and navigate to the frontend development server URL (e.g., `http://localhost:5173/app`).
 
 _Alternatively, you can run the backend and frontend development servers separately. For the backend, open a terminal in the `backend/` directory and run `langgraph dev`. The backend API will be available at `http://127.0.0.1:2024`. It will also open a browser window to the LangGraph UI. For the frontend, open a terminal in the `frontend/` directory and run `npm run dev`. The frontend will be available at `http://localhost:5173`._
 
+## Configuration
+
+Create a `.env` file in the `backend/` directory by copying `backend/.env.example`. Below are the available environment variables:
+
+### Core Agent & LLM Configuration
+- `GEMINI_API_KEY`: Your Google Gemini API key. Required if using "gemini" as the LLM provider for any task or for Google Search functionality.
+- `LLM_PROVIDER`: Specifies the primary LLM provider for core agent tasks (query generation, reflection, answer synthesis).
+  - Options: `"gemini"`, `"openrouter"`, `"deepseek"`.
+  - Default: `"gemini"`.
+- `LLM_API_KEY`: The API key for the selected `LLM_PROVIDER`.
+  - Example: If `LLM_PROVIDER="openrouter"`, this should be your OpenRouter API key.
+- `OPENROUTER_MODEL_NAME`: The full model string to use with OpenRouter (e.g., `"anthropic/claude-3-haiku"`). Used by the agent if task-specific models are not set.
+- `DEEPSEEK_MODEL_NAME`: The model name to use with DeepSeek (e.g., `"deepseek-chat"`). Used by the agent if task-specific models are not set.
+- `QUERY_GENERATOR_MODEL`: Model used for generating search queries. Interpreted based on `LLM_PROVIDER`.
+  - Default for Gemini: `"gemini-1.5-flash"`
+- `REFLECTION_MODEL`: Model used for reflection and knowledge gap analysis. Interpreted based on `LLM_PROVIDER`.
+  - Default for Gemini: `"gemini-1.5-flash"`
+- `ANSWER_MODEL`: Model used for synthesizing the final answer. Interpreted based on `LLM_PROVIDER`.
+  - Default for Gemini: `"gemini-1.5-pro"`
+- `NUMBER_OF_INITIAL_QUERIES`: Number of initial search queries to generate. Default: `3`.
+- `MAX_RESEARCH_LOOPS`: Maximum number of research refinement loops. Default: `2`.
+
+### LangSmith Tracing
+- `LANGSMITH_ENABLED`: Master switch to enable (`true`) or disable (`false`) LangSmith tracing for the backend. Default: `true`.
+  - If `true`, the LangSmith environment variables below should also be set.
+  - If `false`, tracing is globally disabled for the application process, and the UI toggle cannot override this.
+- `LANGCHAIN_API_KEY`: Your LangSmith API key. Required if `LANGSMITH_ENABLED` is true.
+- `LANGCHAIN_TRACING_V2`: Set to `"true"` to use the V2 tracing protocol. Usually managed by the `LANGSMITH_ENABLED` setting.
+- `LANGCHAIN_ENDPOINT`: LangSmith API endpoint. Defaults to `"https://api.smith.langchain.com"`.
+- `LANGCHAIN_PROJECT`: Name of the project in LangSmith.
+
+### Local Network Search
+- `ENABLE_LOCAL_SEARCH`: Set to `true` to enable searching within local network domains. Default: `false`.
+- `LOCAL_SEARCH_DOMAINS`: A comma-separated list of base URLs or domains for local search.
+  - Example: `"http://intranet.mycompany.com,http://docs.internal.team"`
+- `SEARCH_MODE`: Defines the search behavior when both internet and local search capabilities might be active.
+  - `"internet_only"` (default): Searches only the public internet.
+  - `"local_only"`: Searches only the configured local domains (requires `ENABLE_LOCAL_SEARCH=true` and `LOCAL_SEARCH_DOMAINS` to be set).
+  - `"internet_then_local"`: Performs the internet search first, then the local search if enabled.
+  - `"local_then_internet"`: Performs the local search first if enabled, then the internet search.
+
+## Frontend UI Settings
+
+The user interface provides several controls to customize the agent's behavior for each query:
+
+- **Effort Level:** (Low, Medium, High) - Adjusts the number of initial queries and maximum research loops.
+- **Reasoning Model:** (Flash/Fast, Pro/Advanced) - Selects a class of model for reasoning tasks (reflection, answer synthesis). The actual model used depends on the selected LLM Provider.
+- **LLM Provider:** (Gemini, OpenRouter, DeepSeek) - Chooses the primary LLM provider for the current query. Requires the corresponding API keys to be configured on the backend.
+- **LangSmith Monitoring:** (Toggle Switch) - If LangSmith is enabled globally on the backend, this allows users to toggle tracing for their specific session/query.
+- **Search Scope:** (Internet Only, Local Only, Internet then Local, Local then Internet) - Defines where the agent should search for information. The "Local" options require backend configuration for local search.
+
 ## How the Backend Agent Works (High-Level)
 
 The core of the backend is a LangGraph agent defined in `backend/src/agent/graph.py`. It follows these steps:
 
 ![Agent Flow](./agent.png)
 
-1. **Generate Initial Queries:** Based on your input, it generates a set of initial search queries using a Gemini model.
-2. **Web Research:** For each query, it uses the Gemini model with the Google Search API to find relevant web pages.
-3. **Reflection & Knowledge Gap Analysis:** The agent analyzes the search results to determine if the information is sufficient or if there are knowledge gaps. It uses a Gemini model for this reflection process.
-4. **Iterative Refinement:** If gaps are found or the information is insufficient, it generates follow-up queries and repeats the web research and reflection steps (up to a configured maximum number of loops).
-5. **Finalize Answer:** Once the research is deemed sufficient, the agent synthesizes the gathered information into a coherent answer, including citations from the web sources, using a Gemini model.
+1. **Configure:** Reads settings from environment variables and per-request UI selections.
+2. **Generate Initial Queries:** Based on your input and the configured model, it generates initial search queries.
+3. **Web/Local Research:** Depending on the `SEARCH_MODE`:
+   - Performs searches using the Google Search API (for internet results).
+   - Performs searches using the custom `LocalSearchTool` against the configured domains (for local results).
+   - Combines results if applicable.
+4. **Reflection & Knowledge Gap Analysis:** The agent analyzes the search results to determine if the information is sufficient or if there are knowledge gaps.
+5. **Iterative Refinement:** If gaps are found, it generates follow-up queries and repeats the research and reflection steps.
+6. **Finalize Answer:** Once research is sufficient, the agent synthesizes the information into a coherent answer with citations, using the configured answer model.
 
 ## Deployment
 
````
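A note on the four `SEARCH_MODE` values documented above: the actual routing lives in `backend/src/agent/graph.py`, which is not part of this diff. As a minimal sketch of the dispatch, assuming hypothetical `internet_search`/`local_search` callables that are not names from this commit, the ordering could look like this:

```python
from typing import Callable, List

SearchFn = Callable[[str], List[str]]


def run_searches(
    query: str,
    mode: str,
    internet_search: SearchFn,
    local_search: SearchFn,
    local_enabled: bool,
) -> List[str]:
    """Order internet/local searches according to SEARCH_MODE (illustrative only)."""
    results: List[str] = []
    if mode == "internet_only":
        results.extend(internet_search(query))
    elif mode == "local_only":
        if local_enabled:
            results.extend(local_search(query))
    elif mode == "internet_then_local":
        results.extend(internet_search(query))
        if local_enabled:
            results.extend(local_search(query))
    elif mode == "local_then_internet":
        if local_enabled:
            results.extend(local_search(query))
        results.extend(internet_search(query))
    else:
        raise ValueError(f"Unknown SEARCH_MODE: {mode!r}")
    return results
```

The `local_enabled` guard mirrors the documented rule that local domains are only consulted when `ENABLE_LOCAL_SEARCH=true`.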
````diff
@@ -89,8 +154,12 @@ _Note: If you are not running the docker-compose.yml example or exposing the backend
 ```
 **2. Run the Production Server:**
 
+Adjust the `docker-compose.yml` or your deployment environment to include all necessary environment variables as described in the "Configuration" section.
+Example:
 ```bash
-GEMINI_API_KEY=<your_gemini_api_key> LANGSMITH_API_KEY=<your_langsmith_api_key> docker-compose up
+# Ensure your .env file (if used by docker-compose) or environment variables are set
+# e.g., GEMINI_API_KEY, LLM_PROVIDER, LLM_API_KEY, LANGSMITH_API_KEY (if LangSmith enabled), etc.
+docker-compose up
 ```
 
 Open your browser and navigate to `http://localhost:8123/app/` to see the application. The API will be available at `http://localhost:8123`.
````
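How the backend applies `LANGSMITH_ENABLED` is not shown in this diff. One plausible sketch, assuming the standard `LANGCHAIN_TRACING_V2` switch is the underlying mechanism, is to translate the flag once at process startup:

```python
import os


def apply_langsmith_setting() -> None:
    """Globally enable/disable LangSmith tracing from LANGSMITH_ENABLED (illustrative sketch)."""
    enabled = os.environ.get("LANGSMITH_ENABLED", "true").strip().lower() == "true"
    if enabled:
        # LangChain reads this variable to decide whether to emit traces.
        os.environ["LANGCHAIN_TRACING_V2"] = "true"
        os.environ.setdefault("LANGCHAIN_ENDPOINT", "https://api.smith.langchain.com")
    else:
        # With tracing disabled globally, the frontend toggle cannot re-enable it.
        os.environ["LANGCHAIN_TRACING_V2"] = "false"
```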
````diff
@@ -101,7 +170,8 @@ Open your browser and navigate to `http://localhost:8123/app/` to see the application.
 - [Tailwind CSS](https://tailwindcss.com/) - For styling.
 - [Shadcn UI](https://ui.shadcn.com/) - For components.
 - [LangGraph](https://github.com/langchain-ai/langgraph) - For building the backend research agent.
-- [Google Gemini](https://ai.google.dev/models/gemini) - LLM for query generation, reflection, and answer synthesis.
+- LLMs: [Google Gemini](https://ai.google.dev/models/gemini), adaptable to others such as [OpenRouter](https://openrouter.ai/) and [DeepSeek](https://www.deepseek.com/).
+- Search: Google Search API, plus custom local network search (Python `requests` & `BeautifulSoup`).
 
 ## License
 
````
backend/pyproject.toml

Lines changed: 2 additions & 0 deletions
````diff
@@ -18,6 +18,8 @@ dependencies = [
     "langgraph-api",
     "fastapi",
     "google-genai",
+    "requests>=2.25.0,<3.0.0",
+    "beautifulsoup4>=4.9.0,<5.0.0",
 ]
 
 
````
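These two dependencies back the local network search. The `LocalSearchTool` itself is not part of this excerpt, but a rough sketch of the kind of page fetching it might do with `requests` and `BeautifulSoup` (function name and structure are illustrative, not from the commit):

```python
import requests
from bs4 import BeautifulSoup


def fetch_page_text(url: str, timeout: float = 10.0) -> str:
    """Download an intranet HTML page and return its visible text (illustrative sketch)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Drop script/style tags so only human-readable content remains.
    for tag in soup(["script", "style"]):
        tag.decompose()
    return " ".join(soup.get_text(separator=" ").split())
```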
backend/src/agent/configuration.py

Lines changed: 107 additions & 13 deletions
````diff
@@ -1,31 +1,59 @@
 import os
-from pydantic import BaseModel, Field
-from typing import Any, Optional
+from pydantic import BaseModel, Field, validator
+from typing import Any, Optional, List
 
 from langchain_core.runnables import RunnableConfig
 
 
 class Configuration(BaseModel):
     """The configuration for the agent."""
 
+    llm_provider: str = Field(
+        default="gemini",
+        metadata={
+            "description": "The LLM provider to use (e.g., 'gemini', 'openrouter', 'deepseek'). Environment variable: LLM_PROVIDER"
+        },
+    )
+
+    llm_api_key: Optional[str] = Field(
+        default=None,
+        metadata={
+            "description": "The API key for the selected LLM provider. Environment variable: LLM_API_KEY"
+        },
+    )
+
+    openrouter_model_name: Optional[str] = Field(
+        default=None,
+        metadata={
+            "description": "The specific OpenRouter model string (e.g., 'anthropic/claude-3-haiku'). Environment variable: OPENROUTER_MODEL_NAME"
+        },
+    )
+
+    deepseek_model_name: Optional[str] = Field(
+        default=None,
+        metadata={
+            "description": "The specific DeepSeek model (e.g., 'deepseek-chat'). Environment variable: DEEPSEEK_MODEL_NAME"
+        },
+    )
+
     query_generator_model: str = Field(
-        default="gemini-2.0-flash",
+        default="gemini-1.5-flash",
         metadata={
-            "description": "The name of the language model to use for the agent's query generation."
+            "description": "The name of the language model to use for the agent's query generation. Interpreted based on llm_provider (e.g., 'gemini-1.5-flash' for Gemini, part of the model string for OpenRouter). Environment variable: QUERY_GENERATOR_MODEL"
         },
     )
 
     reflection_model: str = Field(
-        default="gemini-2.5-flash-preview-04-17",
+        default="gemini-1.5-flash",
         metadata={
-            "description": "The name of the language model to use for the agent's reflection."
+            "description": "The name of the language model to use for the agent's reflection. Interpreted based on llm_provider. Environment variable: REFLECTION_MODEL"
         },
     )
 
     answer_model: str = Field(
-        default="gemini-2.5-pro-preview-05-06",
+        default="gemini-1.5-pro",
         metadata={
-            "description": "The name of the language model to use for the agent's answer."
+            "description": "The name of the language model to use for the agent's answer. Interpreted based on llm_provider. Environment variable: ANSWER_MODEL"
         },
     )
 
````
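The graph changes that consume these provider fields are not part of this file. A provider-dispatch factory consistent with them might look like the following sketch; it assumes the `langchain-google-genai` and `langchain-openai` packages and relies on OpenRouter and DeepSeek exposing OpenAI-compatible endpoints. None of these exact names are confirmed by the commit.

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI


def make_chat_model(cfg: "Configuration", model_name: str):
    """Build a chat model for the configured provider (illustrative sketch)."""
    if cfg.llm_provider == "gemini":
        return ChatGoogleGenerativeAI(model=model_name, google_api_key=cfg.llm_api_key)
    if cfg.llm_provider == "openrouter":
        return ChatOpenAI(
            model=cfg.openrouter_model_name or model_name,
            api_key=cfg.llm_api_key,
            base_url="https://openrouter.ai/api/v1",  # OpenAI-compatible endpoint
        )
    if cfg.llm_provider == "deepseek":
        return ChatOpenAI(
            model=cfg.deepseek_model_name or model_name,
            api_key=cfg.llm_api_key,
            base_url="https://api.deepseek.com",  # OpenAI-compatible endpoint
        )
    raise ValueError(f"Unsupported LLM provider: {cfg.llm_provider!r}")
```

For instance, the query-generation step could call `make_chat_model(cfg, cfg.query_generator_model)`.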
````diff
@@ -39,6 +67,44 @@ class Configuration(BaseModel):
         metadata={"description": "The maximum number of research loops to perform."},
     )
 
+    langsmith_enabled: bool = Field(
+        default=True,
+        metadata={
+            "description": "Controls LangSmith tracing. Set to false to disable. If true, ensure LANGCHAIN_API_KEY and other relevant LangSmith environment variables (LANGCHAIN_TRACING_V2, LANGCHAIN_ENDPOINT, LANGCHAIN_PROJECT) are set. Environment variable: LANGSMITH_ENABLED"
+        },
+    )
+
+    enable_local_search: bool = Field(
+        default=False,
+        metadata={
+            "description": "Enable or disable local network search functionality. Environment variable: ENABLE_LOCAL_SEARCH"
+        },
+    )
+
+    local_search_domains: List[str] = Field(
+        default_factory=list,  # Use default_factory for mutable types like list
+        metadata={
+            "description": "Comma-separated list of base URLs or domains for local network search (e.g., 'http://intranet.mycompany.com,http://docs.internal'). Environment variable: LOCAL_SEARCH_DOMAINS"
+        },
+    )
+
+    search_mode: str = Field(
+        default="internet_only",
+        metadata={
+            "description": "Search behavior: 'internet_only', 'local_only', 'internet_then_local', 'local_then_internet'. Environment variable: SEARCH_MODE"
+        },
+    )
+
+    @validator("local_search_domains", pre=True, always=True)
+    def parse_local_search_domains(cls, v: Any) -> List[str]:
+        if isinstance(v, str):
+            if not v:  # Handle empty string case
+                return []
+            return [domain.strip() for domain in v.split(',')]
+        if v is None:  # Handle None if default_factory is not triggered early enough by env var
+            return []
+        return v  # Already a list or handled by Pydantic
+
     @classmethod
     def from_runnable_config(
         cls, config: Optional[RunnableConfig] = None
````
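A quick illustration of what the `parse_local_search_domains` validator accepts, assuming Pydantic applies the v1-style `@validator` shim used here: a comma-separated string (as it arrives from `LOCAL_SEARCH_DOMAINS`) is split and trimmed, and an empty string yields an empty list.

```python
# Comma-separated string input is split and whitespace-trimmed into a list.
cfg = Configuration(local_search_domains="http://intranet.mycompany.com, http://docs.internal")
assert cfg.local_search_domains == ["http://intranet.mycompany.com", "http://docs.internal"]

# An empty string (unset env var passed through) becomes an empty list.
assert Configuration(local_search_domains="").local_search_domains == []
```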
````diff
@@ -48,13 +114,41 @@ def from_runnable_config(
             config["configurable"] if config and "configurable" in config else {}
         )
 
-        # Get raw values from environment or config
+        # Define a helper to fetch values preferentially from environment, then config, then default
+        def get_value(field_name: str, default_value: Any = None) -> Any:
+            env_var_name = field_name.upper()
+            # For model fields that have metadata and a description, we could try to get the env var name from there.
+            # However, it's safer to rely on convention (field_name.upper())
+            # or to explicitly map them if names differ significantly.
+            # For now, we'll stick to the convention.
+            value = os.environ.get(env_var_name, configurable.get(field_name))
+            if value is None:
+                # Fall back to the default defined in Field
+                field_info = cls.model_fields.get(field_name)
+                if field_info and field_info.default is not None:
+                    return field_info.default
+                return default_value
+            return value
+
         raw_values: dict[str, Any] = {
-            name: os.environ.get(name.upper(), configurable.get(name))
+            name: get_value(name, cls.model_fields[name].default)
             for name in cls.model_fields.keys()
         }
 
-        # Filter out None values
-        values = {k: v for k, v in raw_values.items() if v is not None}
+        # Filter out None values for fields that are not explicitly Optional
+        # and don't have a default value that is None.
+        # Pydantic handles default values automatically, so this filtering might be redundant
+        # if defaults are correctly set up in the model fields.
+        # However, ensuring that we only pass values that are actually provided (env, config, or explicit default)
+        # can prevent issues with Pydantic's validation if a field is not Optional but no value is found.
+
+        values_to_pass = {}
+        for name, field_info in cls.model_fields.items():
+            val = raw_values.get(name)
+            if val is not None:
+                values_to_pass[name] = val
+            # If val is None but the field has a default value (even if None),
+            # Pydantic will handle it. If it's Optional, None is fine.
+            # If it's required and None, Pydantic will raise an error, which is correct.
 
-        return cls(**values)
+        return cls(**values_to_pass)
````
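The net effect of `get_value` is a resolution order of environment variable first, then the `configurable` dict of the `RunnableConfig`, then the field default. A small usage example (all values hypothetical):

```python
import os

os.environ["LLM_PROVIDER"] = "openrouter"  # environment wins over the configurable dict
config = {"configurable": {"llm_provider": "gemini", "search_mode": "local_only"}}

cfg = Configuration.from_runnable_config(config)
assert cfg.llm_provider == "openrouter"      # from the environment
assert cfg.search_mode == "local_only"       # from the configurable dict
assert cfg.answer_model == "gemini-1.5-pro"  # from the field default
```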
