[API Policies] Implement Semantic Tool Filtering Policy for WSO2 AI Gateway #762

@NaveenSandaruwan

Description

In current agentic applications using the WSO2 AI Gateway, all available tools are exposed to the Large Language Model (LLM) by default. This "over-exposure" introduces three critical problems: degraded application performance due to processing overhead, a higher risk of the LLM hallucinating or selecting irrelevant tools, and inflated token costs.

The proposed Semantic Tool Filtering Policy addresses these challenges by limiting the toolset to only the most contextually relevant options for each request. This is a critical feature for organizations where the number of available backend tools is larger than what the LLM's context window can hold effectively.

Problem Statement

The "all-tools-at-once" approach has become a technical liability due to:

  • Context Window Limitations: Tool metadata consumes context window space and can crowd out the meaning of the user's request.
  • Inference Precision: A larger "search space" raises the probability of the model picking the wrong tool.
  • System Latency: Large payloads require more time for the AI Gateway to process and for the LLM to parse, creating a sluggish user experience.

Proposed Solution

We propose a semantic tool filtering policy within the WSO2 AI Gateway. This policy executes a retrieval process with the following steps:

  1. Vectorization: Both the user’s query and the metadata (names/descriptions) of all available tools are converted into high-dimensional vectors using an embedding model.

  2. Semantic Ranking: The system calculates the Cosine Similarity between the query vector and the tool vectors.

  3. Filtering: Only the tools that pass a relevance cutoff (e.g., a minimum similarity score, or the top 5 most relevant) are injected into the LLM prompt.

  4. Caching Mechanism: To address the performance overhead of embedding, a cache will be kept for each API to store the list of tool embeddings.

    • Tool descriptions are hashed to check whether their embeddings are already in memory.
    • If a new tool appears, its embedding is computed once and appended to the list so it can be reused for subsequent requests.
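
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the policy's actual implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `ToolEmbeddingCache`, `filter_tools`, `top_k`, and `min_score` are names invented for this example. The cache key hashes the tool name together with its description, which is one possible reading of "tool descriptions are hashed".

```python
import hashlib
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: sparse word-count vector."""
    return dict(Counter(text.lower().split()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToolEmbeddingCache:
    """Per-API cache: embeddings keyed by a hash of the tool's metadata."""

    def __init__(self, embed):
        self._embed = embed
        self._store = {}
        self.misses = 0  # counts how many embeddings were actually computed

    def get(self, tool):
        text = f"{tool['name']} {tool['description']}"
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # New or changed tool: embed once and keep for later requests.
            self.misses += 1
            self._store[key] = self._embed(text)
        return self._store[key]


def filter_tools(query, tools, cache, embed=toy_embed, top_k=5, min_score=0.0):
    """Rank tools by similarity to the query and keep the best matches."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, cache.get(t)), t) for t in tools]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]


tools = [
    {"name": "get_weather", "description": "get current weather forecast for a city"},
    {"name": "send_email", "description": "send an email message to a recipient"},
    {"name": "book_flight", "description": "book a flight ticket between two cities"},
]
cache = ToolEmbeddingCache(toy_embed)
query = "What is the weather forecast in Colombo"
selected = filter_tools(query, tools, cache, top_k=2, min_score=0.1)
print([t["name"] for t in selected])  # ['get_weather']
```

In this sketch the first request embeds all three tools, and later requests reuse the cached vectors; only a tool whose name or description has not been seen before triggers a new embedding call. Whether the cutoff should be a top-k limit, a minimum score, or both is left open here, so the example exposes both parameters.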

Alternatives

No response

Version

0.1.0
