| 1 | +# Functions as Tools |
| 2 | + |
| 3 | +The "Functions as Tools" feature allows the StreamNative MCP Server to dynamically discover Apache Pulsar Functions deployed in your cluster and expose them as invokable MCP tools for AI agents. This significantly enhances the capabilities of AI agents by allowing them to interact with custom business logic encapsulated in Pulsar Functions without manual tool registration for each function. |
| 4 | + |
| 5 | +## How it Works |
| 6 | + |
| 7 | +### 1. Function Discovery |
| 8 | +The MCP Server automatically discovers Pulsar Functions available in the connected Pulsar cluster. It periodically polls for functions and identifies those suitable for exposure as tools. |
| 9 | + |
| 10 | +By default, if no custom name is provided (see Customizing Tool Properties), the MCP tool name might be derived from the Function's Fully Qualified Name (FQN), such as `pulsar_function_$tenant_$namespace_$name`. |
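The default naming pattern above can be illustrated with a small shell sketch. The helper name `default_tool_name` is hypothetical and only mirrors the `pulsar_function_$tenant_$namespace_$name` pattern quoted in this section; the server's actual derivation may differ, for example in how it treats characters such as hyphens.

```bash
# Hypothetical helper (not part of snmcp): builds a tool name from the
# pattern pulsar_function_$tenant_$namespace_$name described above.
default_tool_name() {
  local tenant="$1" namespace="$2" name="$3"
  printf 'pulsar_function_%s_%s_%s\n' "$tenant" "$namespace" "$name"
}

# Example: the function deployed as public/default/my-custom-logic-function
default_tool_name public default my-custom-logic-function
```

Under this assumption, the call prints `pulsar_function_public_default_my-custom-logic-function`.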
| 11 | + |
| 12 | +### 2. Schema Conversion |
| 13 | +For each discovered function, the MCP Server attempts to extract its input and output schema definitions. Pulsar Functions can be defined with various schema types for their inputs and outputs (e.g., primitive types, AVRO, JSON). |
| 14 | + |
| 15 | +The server then converts these native Pulsar schemas into a format compatible with MCP tools. This allows the AI agent to understand the expected input parameters and the structure of the output. |
| 16 | + |
| 17 | +Supported Pulsar schema types for automatic conversion include: |
| 18 | +* Primitive types (String, Boolean, Numbers like INT8, INT16, INT32, INT64, FLOAT, DOUBLE) |
| 19 | +* AVRO |
| 20 | +* JSON |
| 21 | + |
| 22 | +If a function uses an unsupported schema type for its input or output, or if schemas are not clearly defined, it might not be exposed as an MCP tool. |
| 23 | + |
| 24 | +## Enabling the Feature |
| 25 | +To enable this functionality, specify the default `--pulsar-instance` and `--pulsar-cluster`, and include `functions-as-tools` in the `--features` flag when starting the StreamNative MCP Server. |
| 26 | + |
| 27 | +Example: |
| 28 | +```bash |
| 29 | +snmcp sse --organization my-org --key-file /path/to/key-file.json --features pulsar-admin,pulsar-client,functions-as-tools --pulsar-instance instance --pulsar-cluster cluster |
| 30 | +``` |
| 31 | +If `functions-as-tools` is included in a broader feature set such as `all` or `streamnative-cloud`, enabling that feature set also activates this feature. |
| 32 | + |
| 33 | +## Customizing Tool Properties |
| 34 | +You can customize how your Pulsar Functions appear as MCP tools (their name and description) by providing specific runtime options when deploying or updating your functions. This is done using the `--custom-runtime-options` flag with `pulsar-admin functions create` or `pulsar-admin functions update`. |
| 35 | + |
| 36 | +The MCP Server looks for the following environment variables within the custom runtime options: |
| 37 | +* `MCP_TOOL_NAME`: Specifies the desired name for the MCP tool. |
| 38 | +* `MCP_TOOL_DESCRIPTION`: Provides a description for the MCP tool, which helps the AI agent understand its purpose. |
| 39 | + |
| 40 | +**Format for `--custom-runtime-options`**: |
| 41 | +The options should be a JSON string where you define an `env` map containing `MCP_TOOL_NAME` and `MCP_TOOL_DESCRIPTION`. |
| 42 | + |
| 43 | +**Example**: |
| 44 | +When deploying a Pulsar Function, you can set these properties as follows: |
| 45 | +```bash |
| 46 | +pulsar-admin functions create \ |
| 47 | + --tenant public \ |
| 48 | + --namespace default \ |
| 49 | + --name my-custom-logic-function \ |
| 50 | + --inputs "persistent://public/default/input-topic" \ |
| 51 | + --output "persistent://public/default/output-topic" \ |
| 52 | + --py my_function.py \ |
| 53 | + --classname my_function.MyFunction \ |
| 54 | + --custom-runtime-options \ |
| 55 | + ''' |
| 56 | + { |
| 57 | + "env": { |
| 58 | + "MCP_TOOL_NAME": "CustomObjectFunction", |
| 59 | + "MCP_TOOL_DESCRIPTION": "Takes an input number and returns the value incremented by 100." |
| 60 | + } |
| 61 | + } |
| 62 | + ''' |
| 63 | +``` |
| 64 | +In this example: |
| 65 | +- The MCP tool derived from `my-custom-logic-function` will be named `CustomObjectFunction`. |
| 66 | +- Its description will be "Takes an input number and returns the value incremented by 100." |
| 67 | + |
| 68 | +If these custom options are not provided, the MCP tool name might default to a derivative of the function's FQN, and the description might be too generic to help the AI agent understand the purpose of the MCP tool. |
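For an already deployed function, the same options can be supplied through `pulsar-admin functions update`. The sketch below keeps the JSON in a shell variable (the name `RUNTIME_OPTS` is illustrative) and checks that it parses before it is passed on; the check assumes `python3` is available on the `PATH`, and the tool name and description are the ones from the example above.

```bash
# Illustrative variable holding the custom runtime options as one JSON string.
RUNTIME_OPTS='{"env": {"MCP_TOOL_NAME": "CustomObjectFunction", "MCP_TOOL_DESCRIPTION": "Takes an input number and returns the value incremented by 100."}}'

# Optional sanity check: confirm the string is valid JSON (assumes python3).
if printf '%s' "$RUNTIME_OPTS" | python3 -m json.tool > /dev/null; then
  echo "runtime options are valid JSON"
else
  echo "runtime options are not valid JSON" >&2
fi

# Then pass the variable to pulsar-admin, for example:
#   pulsar-admin functions update \
#     --tenant public --namespace default --name my-custom-logic-function \
#     --custom-runtime-options "$RUNTIME_OPTS"
```

Keeping the JSON on a single line in a variable avoids quoting mistakes when the same options are reused between `create` and `update`.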
| 69 | + |
| 70 | +## Server-Side Configuration via Environment Variables |
| 71 | + |
| 72 | +Beyond customizing individual tool properties at the function deployment level, you can also configure the overall behavior of the "Functions as Tools" feature on the StreamNative MCP Server side using the following environment variables. These variables are typically set when starting the MCP server. |
| 73 | + |
| 74 | +* `FUNCTIONS_AS_TOOLS_POLL_INTERVAL` |
| 75 | + * **Description**: Controls how frequently the MCP Server polls the Pulsar cluster to discover or update available Pulsar Functions. Setting a lower value means functions are discovered faster, but it may increase the load on the Pulsar cluster. |
| 76 | + * **Unit**: Seconds |
| 77 | + * **Default**: Defaults to the value specified in `pftools.DefaultManagerOptions()`. Refer to the `pkg/pftools` package for the precise default (e.g., if the internal default is 60 seconds, it will be `60`). |
| 78 | +* `FUNCTIONS_AS_TOOLS_TIMEOUT` |
| 79 | + * **Description**: Sets the default timeout for invoking a Pulsar Function as an MCP tool. If a function execution exceeds this duration, the call will be considered timed out. |
| 80 | + * **Unit**: Seconds |
| 81 | + * **Default**: Defaults to the value specified in `pftools.DefaultManagerOptions()` (e.g., if the internal default is 30 seconds, it will be `30`). |
| 82 | +* `FUNCTIONS_AS_TOOLS_FAILURE_THRESHOLD` |
| 83 | + * **Description**: Defines the number of consecutive failures for a specific Pulsar Function tool before it is temporarily moved to a "circuit breaker open" state. In this state, further calls to this specific function tool will be immediately rejected without attempting to execute the function, until the `FUNCTIONS_AS_TOOLS_RESET_TIMEOUT` is reached. |
| 84 | + * **Unit**: Integer (number of failures) |
| 85 | + * **Default**: Defaults to the value specified in `pftools.DefaultManagerOptions()` (e.g., if the internal default is 5, it will be `5`). |
| 86 | +* `FUNCTIONS_AS_TOOLS_RESET_TIMEOUT` |
| 87 | + * **Description**: Specifies the duration for which a Pulsar Function tool remains in the "circuit breaker open" state (due to exceeding the failure threshold) before the MCP server attempts to reset the circuit and allow calls again. |
| 88 | + * **Unit**: Seconds |
| 89 | + * **Default**: Defaults to the value specified in `pftools.DefaultManagerOptions()` (e.g., if the internal default is 60 seconds, it will be `60`). |
| 90 | +* `FUNCTIONS_AS_TOOLS_TENANT_NAMESPACES` |
| 91 | + * **Description**: A comma-separated list of Pulsar `tenant/namespace` strings that the MCP Server should scan for Pulsar Functions. This allows you to restrict function discovery to specific namespaces. If not set, the server might attempt to discover functions from all namespaces it has access to, as permitted by its Pulsar client configuration. |
| 92 | + * **Format**: `tenant1/namespace1,tenant2/namespace2` |
| 93 | + * **Example**: `public/default,my-tenant/app-functions` |
| 94 | + * **Default**: Empty, meaning functions are discovered from all accessible namespaces (only on StreamNative Cloud). |
| 95 | +* `FUNCTIONS_AS_TOOLS_STRICT_EXPORT` |
| 96 | + * **Description**: When set to `true`, only functions that define both `MCP_TOOL_NAME` and `MCP_TOOL_DESCRIPTION` are exported as MCP tools. |
| 97 | + * **Format**: `true` or `false` |
| 98 | + * **Example**: `false` |
| 99 | + * **Default**: `true` |
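As a sketch of how these variables might be set together, the snippet below exports all six in the shell that later starts the MCP server. The values are illustrative rather than recommended defaults, and it assumes the time-based settings accept plain integer seconds, as the units above suggest.

```bash
# Illustrative values only; the real defaults come from pftools.DefaultManagerOptions().
export FUNCTIONS_AS_TOOLS_POLL_INTERVAL=30        # seconds between discovery polls
export FUNCTIONS_AS_TOOLS_TIMEOUT=30              # seconds before a tool call times out
export FUNCTIONS_AS_TOOLS_FAILURE_THRESHOLD=5     # consecutive failures before the circuit opens
export FUNCTIONS_AS_TOOLS_RESET_TIMEOUT=60        # seconds the circuit stays open
export FUNCTIONS_AS_TOOLS_TENANT_NAMESPACES="public/default,my-tenant/app-functions"
export FUNCTIONS_AS_TOOLS_STRICT_EXPORT=true      # export only functions with name and description set
```

With these exported, starting the server from the same shell (using the `snmcp sse` command shown under "Enabling the Feature") should pick them up, since child processes inherit exported variables.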
| 100 | + |
| 101 | +## Considerations and Limitations |
| 102 | + |
| 103 | +* **Schema Definition**: For reliable schema conversion, ensure your Pulsar Functions have clearly defined input and output schemas using Pulsar's schema registry capabilities. Functions with ambiguous or `BYTES` schemas might not be converted effectively or might default to generic byte array inputs/outputs. |
| 104 | +* **Function State**: This feature primarily focuses on the stateless request/response invocation pattern of functions. |
| 105 | +* **Discovery Latency**: There might be a slight delay between deploying/updating a function and it appearing as an MCP tool, due to the server's polling interval for function discovery. |
| 106 | +* **Error Handling**: The MCP Server will attempt to relay errors from function executions, but the specifics might vary. |
| 107 | +* **Security**: Ensure that only intended functions are exposed by managing permissions within your Pulsar cluster. The MCP Server will operate with the permissions of its Pulsar client. |