Commit f86ce73

tomasonjo and a-s-g93 authored
cypher - Add configurable token limit to cypher read tool (#157)
* Add configurable token limit to cypher read tool
* update changelog, change env var name, update docstrings
* Update README.md
* add unit tests

---------

Co-authored-by: alex <[email protected]>
1 parent 27dcebb commit f86ce73

9 files changed: +1390 -1015 lines changed

servers/mcp-neo4j-cypher/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@

 ### Added
 * Added Cypher result sanitation function from Neo4j GraphRAG that removes embedding values from the result
+* Add response token limit for read Cypher responses

 ## v0.3.1

servers/mcp-neo4j-cypher/README.md

Lines changed: 43 additions & 11 deletions
@@ -51,6 +51,37 @@ The server supports namespacing to allow multiple Neo4j MCP servers to be used s

 This is useful when you need to connect to multiple Neo4j databases or instances from the same session.

+### ⚙️ Query Configuration
+
+The server provides configuration options to optimize query performance and manage response sizes:
+
+#### 📏 Token Limits
+
+Control the maximum size of query responses to prevent overwhelming the AI model:
+
+**Command Line:**
+```bash
+mcp-neo4j-cypher --token-limit 4000
+```
+
+**Environment Variable:**
+```bash
+export NEO4J_RESPONSE_TOKEN_LIMIT=4000
+```
+
+**Docker:**
+```bash
+docker run -e NEO4J_RESPONSE_TOKEN_LIMIT=4000 mcp-neo4j-cypher:latest
+```
+
+When a response exceeds the token limit, it will be automatically truncated to fit within the specified limit using `tiktoken`. This ensures:
+
+- **Consistent Performance**: Responses stay within model context limits
+- **Cost Control**: Prevents excessive token usage in AI interactions
+- **Reliability**: Large datasets don't break the conversation flow
+
+**Note**: Token limits only apply to `read_neo4j_cypher` responses. Schema queries and write operations return summary information and are not affected.
+
 ## 🏗️ Local Development & Deployment

 ### 🐳 Local Docker Development
@@ -261,17 +292,18 @@ docker run --rm -p 8000:8000 \

 ### 🔧 Environment Variables

-| Variable | Default | Description |
-| ----------------------- | --------------------------------------- | ---------------------------------------------- |
-| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j connection URI |
-| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
-| `NEO4J_PASSWORD` | `password` | Neo4j password |
-| `NEO4J_DATABASE` | `neo4j` | Neo4j database name |
-| `NEO4J_TRANSPORT` | `stdio` (local), `http` (remote) | Transport protocol (`stdio`, `http`, or `sse`) |
-| `NEO4J_NAMESPACE` | _(empty)_ | Tool namespace prefix |
-| `NEO4J_MCP_SERVER_HOST` | `127.0.0.1` (local) | Host to bind to |
-| `NEO4J_MCP_SERVER_PORT` | `8000` | Port for HTTP/SSE transport |
-| `NEO4J_MCP_SERVER_PATH` | `/api/mcp/` | Path for accessing MCP server |
+| Variable | Default | Description |
+| ----------------------------- | --------------------------------------- | ---------------------------------------------- |
+| `NEO4J_URI` | `bolt://localhost:7687` | Neo4j connection URI |
+| `NEO4J_USERNAME` | `neo4j` | Neo4j username |
+| `NEO4J_PASSWORD` | `password` | Neo4j password |
+| `NEO4J_DATABASE` | `neo4j` | Neo4j database name |
+| `NEO4J_TRANSPORT` | `stdio` (local), `http` (remote) | Transport protocol (`stdio`, `http`, or `sse`) |
+| `NEO4J_NAMESPACE` | _(empty)_ | Tool namespace prefix |
+| `NEO4J_MCP_SERVER_HOST` | `127.0.0.1` (local) | Host to bind to |
+| `NEO4J_MCP_SERVER_PORT` | `8000` | Port for HTTP/SSE transport |
+| `NEO4J_MCP_SERVER_PATH` | `/api/mcp/` | Path for accessing MCP server |
+| `NEO4J_RESPONSE_TOKEN_LIMIT` | _(none)_ | Maximum tokens for read query responses |

 ### 🌐 SSE Transport for Legacy Web Access
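
Review note: the options documented in this README change resolve in a fixed order in `process_config` (see the `utils.py` diff in this commit): the `--token-limit` flag is checked first, then `NEO4J_RESPONSE_TOKEN_LIMIT`, and otherwise no limit is applied. A minimal sketch of that order follows. `resolve_token_limit` is a hypothetical helper, not a function in the repository, and unlike the committed code it converts the command-line value to `int` as well.

```python
from typing import Mapping, Optional


def resolve_token_limit(
    cli_value: Optional[str], env: Mapping[str, str]
) -> Optional[int]:
    """Hypothetical helper: CLI flag first, then the environment, else None."""
    if cli_value is not None:
        return int(cli_value)
    env_value = env.get("NEO4J_RESPONSE_TOKEN_LIMIT")
    if env_value is not None:
        return int(env_value)
    # Neither source set: no truncation is applied.
    return None


env = {"NEO4J_RESPONSE_TOKEN_LIMIT": "4000"}
print(resolve_token_limit("2000", env))  # flag takes precedence over the env var
print(resolve_token_limit(None, env))    # falls back to the env var
print(resolve_token_limit(None, {}))     # nothing configured
```

Passing the environment as a mapping keeps the sketch testable without touching `os.environ`.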

servers/mcp-neo4j-cypher/manifest.json

Lines changed: 10 additions & 1 deletion
@@ -31,7 +31,8 @@
         "NEO4J_NAMESPACE": "${user_config.neo4j_namespace}",
         "NEO4J_MCP_SERVER_HOST": "${user_config.mcp_server_host}",
         "NEO4J_MCP_SERVER_PORT": "${user_config.mcp_server_port}",
-        "NEO4J_MCP_SERVER_PATH": "${user_config.mcp_server_path}"
+        "NEO4J_MCP_SERVER_PATH": "${user_config.mcp_server_path}",
+        "NEO4J_RESPONSE_TOKEN_LIMIT": "${user_config.token_limit}"
       }
     }
   },
@@ -124,6 +125,14 @@
       "default": "/mcp/",
       "required": false,
       "sensitive": false
+    },
+    "token_limit": {
+      "type": "int",
+      "title": "Response token limit",
+      "description": "Optional response token limit for the read tool.",
+      "default": "",
+      "required": false,
+      "sensitive": false
     }
   }
 }

servers/mcp-neo4j-cypher/pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ dependencies = [
     "fastmcp>=2.10.5",
     "neo4j>=5.26.0",
     "pydantic>=2.10.1",
+    "tiktoken>=0.11.0",
 ]

 [build-system]

servers/mcp-neo4j-cypher/src/mcp_neo4j_cypher/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ def main():
     )
     parser.add_argument("--server-host", default=None, help="Server host")
     parser.add_argument("--server-port", default=None, help="Server port")
+    parser.add_argument("--token-limit", default=None, help="Response token limit")

     args = parser.parse_args()
     config = process_config(args)
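
Review note: as far as this diff shows, `--token-limit` is declared without a `type`, so `argparse` stores the value as a string, while the environment-variable path in `process_config` converts with `int()`. A string limit would fail at the `len(tokens) > token_limit` comparison in `_truncate_string_to_tokens`. One possible adjustment, shown here as a standalone sketch and not as the committed code, is to let `argparse` do the conversion:

```python
import argparse

parser = argparse.ArgumentParser(description="sketch of the --token-limit flag")
# type=int makes argparse convert "4000" to 4000 and reject non-numeric input.
parser.add_argument(
    "--token-limit", type=int, default=None, help="Response token limit"
)

args = parser.parse_args(["--token-limit", "4000"])
print(type(args.token_limit).__name__, args.token_limit)

# Without the flag the value stays None, so the "no limit" case is unchanged.
print(parser.parse_args([]).token_limit)
```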

servers/mcp-neo4j-cypher/src/mcp_neo4j_cypher/server.py

Lines changed: 12 additions & 4 deletions
@@ -1,7 +1,7 @@
 import json
 import logging
 import re
-from typing import Any, Literal
+from typing import Any, Literal, Optional

 from fastmcp.exceptions import ToolError
 from fastmcp.server import FastMCP
@@ -10,7 +10,7 @@
 from neo4j import AsyncDriver, AsyncGraphDatabase, RoutingControl
 from neo4j.exceptions import ClientError, Neo4jError
 from pydantic import Field
-from .utils import _value_sanitize
+from .utils import _value_sanitize, _truncate_string_to_tokens

 logger = logging.getLogger("mcp_neo4j_cypher")

@@ -34,7 +34,10 @@ def _is_write_query(query: str) -> bool:


 def create_mcp_server(
-    neo4j_driver: AsyncDriver, database: str = "neo4j", namespace: str = ""
+    neo4j_driver: AsyncDriver,
+    database: str = "neo4j",
+    namespace: str = "",
+    token_limit: Optional[int] = None,
 ) -> FastMCP:
     mcp: FastMCP = FastMCP(
         "mcp-neo4j-cypher", dependencies=["neo4j", "pydantic"], stateless_http=True
@@ -183,6 +186,10 @@ async def read_neo4j_cypher(
             )
             sanitized_results = [_value_sanitize(el) for el in results]
             results_json_str = json.dumps(sanitized_results, default=str)
+            if token_limit:
+                results_json_str = _truncate_string_to_tokens(
+                    results_json_str, token_limit
+                )

             logger.debug(f"Read query returned {len(results_json_str)} rows")

@@ -254,6 +261,7 @@ async def main(
     host: str = "127.0.0.1",
     port: int = 8000,
     path: str = "/mcp/",
+    token_limit: Optional[int] = None,
 ) -> None:
     logger.info("Starting MCP neo4j Server")

@@ -265,7 +273,7 @@ async def main(
         ),
     )

-    mcp = create_mcp_server(neo4j_driver, database, namespace)
+    mcp = create_mcp_server(neo4j_driver, database, namespace, token_limit)

     # Run the server with the specified transport
     match transport:
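
Review note: in the `read_neo4j_cypher` hunk, the debug message reports `len(results_json_str)`, which is the character count of the serialized JSON text and not the number of rows. The sketch below uses made-up rows and assumes the intent is to log the row count; it is an illustration, not part of the commit.

```python
import json

# Made-up rows standing in for sanitized query results.
sanitized_results = [{"name": "Alice"}, {"name": "Bob"}]
results_json_str = json.dumps(sanitized_results, default=str)

row_count = len(sanitized_results)   # number of rows returned
char_count = len(results_json_str)   # number of characters in the JSON text

print(f"Read query returned {row_count} rows ({char_count} characters)")
```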

servers/mcp-neo4j-cypher/src/mcp_neo4j_cypher/utils.py

Lines changed: 48 additions & 1 deletion
@@ -1,3 +1,5 @@
+import tiktoken
+
 import argparse
 import logging
 import os
@@ -167,9 +169,19 @@ def process_config(args: argparse.Namespace) -> dict[str, Union[str, int, None]]
                 "Info: No server path provided and transport is `stdio`. `server_path` will be None."
             )
             config["path"] = None
+    # parse token limit
+    if args.token_limit is not None:
+        config["token_limit"] = args.token_limit
+    else:
+        if os.getenv("NEO4J_RESPONSE_TOKEN_LIMIT") is not None:
+            config["token_limit"] = int(os.getenv("NEO4J_RESPONSE_TOKEN_LIMIT"))
+        else:
+            logger.info("Info: No token limit provided. No token limit will be used.")
+            config["token_limit"] = None

     return config

+
 def _value_sanitize(d: Any, list_limit: int = 128) -> Any:
     """
     Sanitize the input dictionary or list.
@@ -222,4 +234,39 @@ def _value_sanitize(d: Any, list_limit: int = 128) -> Any:
         else:
             return None
     else:
-        return d
+        return d
+
+
+def _truncate_string_to_tokens(
+    text: str, token_limit: int, model: str = "gpt-4"
+) -> str:
+    """
+    Truncates the input string to fit within the specified token limit.
+
+    Parameters
+    ----------
+    text : str
+        The input text string.
+    token_limit : int
+        Maximum number of tokens allowed.
+    model : str
+        Model name (affects tokenization). Defaults to "gpt-4".
+
+    Returns
+    -------
+    str
+        The truncated string that fits within the token limit.
+    """
+    # Load encoding for the chosen model
+    encoding = tiktoken.encoding_for_model(model)
+
+    # Encode text into tokens
+    tokens = encoding.encode(text)
+
+    # Truncate tokens if they exceed the limit
+    if len(tokens) > token_limit:
+        tokens = tokens[:token_limit]
+
+    # Decode back into text
+    truncated_text = encoding.decode(tokens)
+    return truncated_text
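
Review note: because `_truncate_string_to_tokens` cuts the serialized string at a token boundary, a truncated response will generally no longer be valid JSON. If callers need parseable output, one alternative is to drop whole rows until the result fits. The sketch below illustrates that idea under stated assumptions and is not the committed behavior: `truncate_rows_to_budget` and `count_tokens` are hypothetical names, and the whitespace-based counter is only a stand-in so the example runs without `tiktoken`.

```python
import json
from typing import Any, Callable, List


def count_tokens(text: str) -> int:
    """Stand-in token counter (whitespace split); swap in a real tokenizer."""
    return len(text.split())


def truncate_rows_to_budget(
    rows: List[Any],
    token_limit: int,
    counter: Callable[[str], int] = count_tokens,
) -> str:
    """Serialize the longest prefix of rows whose JSON stays within token_limit."""
    kept: List[Any] = []
    for row in rows:
        # Re-serialize with the candidate row and stop once the budget is exceeded.
        candidate = json.dumps(kept + [row], default=str)
        if counter(candidate) > token_limit:
            break
        kept.append(row)
    return json.dumps(kept, default=str)


rows = [{"id": i, "name": f"node {i}"} for i in range(100)]
output = truncate_rows_to_budget(rows, token_limit=20)
print(len(json.loads(output)), "rows kept")
```

The loop re-serializes on every row, which is fine for a sketch but quadratic in the number of rows; a production version would likely count tokens per row instead. The output always parses as a JSON list, at the cost of dropping rows where the committed function cuts the text mid-value.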
