Merged
5 changes: 5 additions & 0 deletions .gitignore
@@ -128,6 +128,11 @@ ENV/
env.bak/
venv.bak/

# Development utilities output
remote_tools.json
remote_prompts.json
remote_prompt_details.json

# Spyder project settings
.spyderproject
.spyproject
52 changes: 52 additions & 0 deletions CONTRIBUTING.md
@@ -288,6 +288,58 @@ Brief description of changes
- [ ] Performance improvement
- [ ] Other: ___

## Development Tools

### Server Interrogation Utility

The `utils/interrogate_server.py` script is a development utility that helps developers understand what tools and prompts are available on the remote PIA MCP server. This is particularly useful when implementing new local server tools or updating existing ones to match the remote server's capabilities.

#### Usage

```bash
# Set your API key (required)
export PIA_API_KEY=your_api_key_here
# Or create a .env file with: PIA_API_KEY=your_api_key_here

# Run the interrogation script
python utils/interrogate_server.py [--output-dir OUTPUT_DIR]
```

#### What it does

1. **Discovers available tools**: Queries the remote server's `tools/list` endpoint to get all available tools with their descriptions and parameter schemas
2. **Discovers available prompts**: Queries the remote server's `prompts/list` endpoint to get all available prompts
3. **Retrieves prompt content**: Gets the actual content/text for each prompt using `prompts/get`
4. **Saves results to JSON files**:
- `remote_tools.json` - Complete tool definitions
- `remote_prompts.json` - Prompt list and metadata
- `remote_prompt_details.json` - Full prompt content
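
The three discovery steps above correspond to JSON-RPC 2.0 requests as defined by the MCP protocol. The following is a minimal sketch of those message shapes only, not the code in `utils/interrogate_server.py`; the helper names `build_request` and `interrogation_requests` are hypothetical, and transport, authentication with `PIA_API_KEY`, and file output are deliberately left out.

```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_ids = count(1)


def build_request(method, params=None):
    """Return a JSON-RPC 2.0 request dict for an MCP method (hypothetical helper)."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def interrogation_requests(prompt_names):
    """Build the requests for steps 1-3: list tools, list prompts, get each prompt."""
    requests = [build_request("tools/list"), build_request("prompts/list")]
    # In practice the prompt names would come from the prompts/list response.
    requests += [build_request("prompts/get", {"name": name}) for name in prompt_names]
    return requests


if __name__ == "__main__":
    # "example_prompt" is a placeholder name, not a real PIA prompt.
    for req in interrogation_requests(["example_prompt"]):
        print(json.dumps(req))
```

Keeping message construction separate from the network layer makes the request shapes easy to inspect or unit test without an API key.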

#### Using the results for development

When implementing new tools or updating existing ones:

1. **Compare tool definitions**: Use `remote_tools.json` to see the available tools with their exact parameter schemas and descriptions
2. **Update tool descriptions**: Copy the exact descriptions from the remote server to ensure consistency
3. **Add missing tools**: Identify tools that exist remotely but not locally
4. **Update prompts**: Use the prompt details to ensure local prompts match the remote server exactly
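
The comparison in steps 1 to 3 can be partly automated. The sketch below assumes `remote_tools.json` has a top-level `tools` array whose entries carry `name` and `description` keys (consistent with the `jq '.tools[].name'` usage in this guide), and that local descriptions are available as a plain dict. The function names and the `local_descriptions` argument are hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def load_remote_tools(path):
    """Load remote_tools.json and index the tool definitions by name."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {tool["name"]: tool for tool in data.get("tools", [])}


def compare_tools(remote, local_descriptions):
    """Return (missing, mismatched) tool names.

    remote: dict mapping tool name -> remote tool definition
    local_descriptions: dict mapping tool name -> local description (assumed shape)
    """
    # Tools that exist remotely but have no local counterpart.
    missing = sorted(set(remote) - set(local_descriptions))
    # Tools present on both sides whose descriptions differ.
    mismatched = sorted(
        name
        for name, description in local_descriptions.items()
        if name in remote and remote[name].get("description") != description
    )
    return missing, mismatched
```

For example, `compare_tools(load_remote_tools("analysis/remote_tools.json"), local)` would list the tools still to be added and those whose descriptions have drifted.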

#### Example workflow

```bash
# 1. Interrogate the remote server
python utils/interrogate_server.py --output-dir ./analysis

# 2. Review the generated JSON files
jq '.tools[].name' analysis/remote_tools.json       # List all tool names
jq '.prompts[].name' analysis/remote_prompts.json   # List all prompt names

# 3. Compare with local implementation and update as needed
# 4. Test changes to ensure compatibility
```

This utility was used to discover and implement the agency-specific search tools (`pia_search_content_gao`, `pia_search_content_oig`, etc.) and the ChatGPT Connector tools (`search`, `fetch`) that are available on the remote server.

## Testing
- [ ] Tests pass locally
- [ ] New tests added for new functionality
140 changes: 123 additions & 17 deletions README.md
@@ -114,14 +114,21 @@ Add this configuration to your MCP client config file:
"run",
"pia-mcp-server",
"--api-key", "YOUR_API_KEY"
-]
+],
+"cwd": "/path/to/your/pia-mcp-local"
}
}
}
```

For Docker:

First, build the Docker image:

`docker build -t pia-mcp-server:latest .`

Then add this to your MCP client configuration (e.g., Claude):

```json
{
"mcpServers": {
@@ -141,57 +148,156 @@ For Docker:

## 💡 Available Tools

-The server provides four main tools for searching the Program Integrity Alliance (PIA) database:
+The server provides 11 tools for searching the Program Integrity Alliance (PIA) database:

### Core Search Tools

### 1. `pia_search_content`

**Purpose:** Comprehensive search tool for querying document content and recommendations in the PIA database.

-**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution (GAO, OIG, etc.). Supports complex OData filtering with boolean logic, operators, and grouping.
+**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). Supports complex OData filtering with boolean logic, operators, and grouping.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression supporting complex boolean logic
-- `page` (optional): Page number (1-based, default: 1)
-- `page_size` (optional): Number of results per page (max 50, default: 10)
-- `search_mode` (optional): Search mode - "content" for full-text search or "titles" for title-only search (default: "content")
-- `limit` (optional): Alternative name for page_size (for compatibility)
-- `include_facets` (optional): Whether to include facets in response (default: false to reduce token usage)
+- `page` (optional): Page number (default: 1)
+- `page_size` (optional): Results per page (default: 10)
+- `search_mode` (optional): Search mode (default: content)
+- `limit` (optional): Maximum results limit
+- `include_facets` (optional): Include facets in results (default: false)
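
As an illustration of how these parameters fit together, an MCP client would typically wrap them in a JSON-RPC `tools/call` request. The sketch below only builds that payload; the helper name `build_search_call` and the argument `odata_filter` are hypothetical, the defaults mirror the ones listed above, and it is an assumption that the server accepts exactly this argument set.

```python
import json


def build_search_call(query, odata_filter=None, page=1, page_size=10,
                      include_facets=False, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for `pia_search_content`."""
    if not query:
        raise ValueError("query is required")
    arguments = {
        "query": query,
        "page": page,
        "page_size": page_size,
        "include_facets": include_facets,
    }
    # `filter` is optional, so it is only sent when the caller provides one.
    if odata_filter:
        arguments["filter"] = odata_filter
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "pia_search_content", "arguments": arguments},
    }


if __name__ == "__main__":
    # The query text is an arbitrary example.
    print(json.dumps(build_search_call("improper payments"), indent=2))
```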

### 2. `pia_search_content_facets`

**Purpose:** Get available facets (filter values) for the PIA database content search.

-**Description:** This can help understand what filter values are available before performing content searches. Supports complex OData filtering with boolean logic, operators, and grouping.
+**Description:** This can help understand what filter values are available before performing content searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).

**Parameters:**
-- `query` (optional): Optional query to get facets for (if empty, gets all facets, default: "")
+- `query` (optional): Optional query to get facets for (default: "")
- `filter` (optional): Optional OData filter expression

### 3. `pia_search_titles`

**Purpose:** Search the Program Integrity Alliance (PIA) database for document titles only.

-**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Supports complex OData filtering with boolean logic, operators, and grouping.
+**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).

**Parameters:**
- `query` (required): Search query text (searches document titles only)
- `filter` (optional): OData filter expression supporting complex boolean logic
-- `page` (optional): Page number (1-based, default: 1)
-- `page_size` (optional): Number of results per page (max 50, default: 10)
-- `limit` (optional): Alternative name for page_size (for compatibility)
-- `include_facets` (optional): Whether to include facets in response (default: false to reduce token usage)
+- `page` (optional): Page number (default: 1)
+- `page_size` (optional): Results per page (default: 10)
+- `limit` (optional): Maximum results limit
+- `include_facets` (optional): Include facets in results (default: false)

### 4. `pia_search_titles_facets`

**Purpose:** Get available facets (filter values) for the PIA database title search.

-**Description:** This can help understand what filter values are available before performing title searches. Supports complex OData filtering with boolean logic, operators, and grouping.
+**Description:** This can help understand what filter values are available before performing title searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).

**Parameters:**
-- `query` (optional): Optional query to get facets for (if empty, gets all facets, default: "")
+- `query` (optional): Optional query to get facets for (default: "")
- `filter` (optional): Optional OData filter expression

### Agency-Specific Search Tools

### 5. `pia_search_content_gao`

**Purpose:** Search for GAO document content and recommendations.

**Description:** This tool automatically filters results to only include documents from the Government Accountability Office (GAO). Returns comprehensive results with full citation information and clickable links for proper attribution.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'GAO')
- `page` (optional): Page number (default: 1)
- `page_size` (optional): Results per page (default: 10)
- `search_mode` (optional): Search mode (default: content)
- `limit` (optional): Maximum results limit
- `include_facets` (optional): Include facets in results (default: false)

### 6. `pia_search_content_oig`

**Purpose:** Search for OIG document content and recommendations.

**Description:** This tool automatically filters results to only include documents from Office of Inspector General (OIG) sources. Returns comprehensive results with full citation information and clickable links for proper attribution.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'OIG')
- `page` (optional): Page number (default: 1)
- `page_size` (optional): Results per page (default: 10)
- `search_mode` (optional): Search mode (default: content)
- `limit` (optional): Maximum results limit
- `include_facets` (optional): Include facets in results (default: false)

### 7. `pia_search_content_crs`

**Purpose:** Search for CRS document content and recommendations.

**Description:** This tool automatically filters results to only include documents from the Congressional Research Service (CRS). Returns comprehensive results with full citation information and clickable links for proper attribution.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'CRS')
- `page` (optional): Page number (default: 1)
- `page_size` (optional): Results per page (default: 10)
- `search_mode` (optional): Search mode (default: content)
- `limit` (optional): Maximum results limit
- `include_facets` (optional): Include facets in results (default: false)

### 8. `pia_search_content_doj`

**Purpose:** Search for Department of Justice document content and recommendations.

**Description:** This tool automatically filters results to only include documents from the Department of Justice. Returns comprehensive results with full citation information and clickable links for proper attribution.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Department of Justice')
- `page` (optional): Page number (default: 1)
- `page_size` (optional): Results per page (default: 10)
- `search_mode` (optional): Search mode (default: content)
- `limit` (optional): Maximum results limit
- `include_facets` (optional): Include facets in results (default: false)

### 9. `pia_search_content_congress`

**Purpose:** Search for Congress.gov document content and recommendations.

**Description:** This tool automatically filters results to only include documents from Congress.gov. Returns comprehensive results with full citation information and clickable links for proper attribution.

**Parameters:**
- `query` (required): Search query text
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Congress.gov')
- `page` (optional): Page number (default: 1)
- `page_size` (optional): Results per page (default: 10)
- `search_mode` (optional): Search mode (default: content)
- `limit` (optional): Maximum results limit
- `include_facets` (optional): Include facets in results (default: false)

### ChatGPT Connector Tools

### 10. `search`

**Purpose:** Simple search interface for ChatGPT Connectors.

**Description:** Search the Program Integrity Alliance (PIA) database and return a list of potentially relevant search results with titles, snippets, and URLs for citation. This tool is one of those supported by OpenAI's MCP spec for integrating ChatGPT Connectors.

**Parameters:**
- `query` (required): A search query string to find relevant documents in the PIA database

### 11. `fetch`

**Purpose:** Document retrieval by ID for ChatGPT Connectors.

**Description:** Retrieve the full contents of a specific document from the PIA database using its unique identifier. This tool is one of those supported by OpenAI's MCP spec for integrating ChatGPT Connectors.

**Parameters:**
- `id` (required): A unique identifier for the document to retrieve

## Search Modes

Comprehensive search with OData filtering and faceting. The `filter` parameter uses standard [OData query syntax](https://docs.oasis-open.org/odata/odata/v4.01/odata-v4.01-part2-url-conventions.html).
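
To make the `filter` syntax concrete, here is a small sketch that composes OData filter strings with boolean logic and grouping. The field name `SourceDocumentDataSource` and the values `'GAO'` and `'OIG'` come from the tool descriptions above; the helper functions are hypothetical, and which other fields and operators the PIA server accepts is not stated here.

```python
def quote(value):
    """Quote a string literal for OData; single quotes are escaped by doubling."""
    return "'" + value.replace("'", "''") + "'"


def eq(field, value):
    """Build an equality clause such as: Field eq 'value'."""
    return f"{field} eq {quote(value)}"


def any_of(*clauses):
    """Join clauses with `or`, parenthesized so they group correctly."""
    return "(" + " or ".join(clauses) + ")"


def all_of(*clauses):
    """Join clauses with `and`, parenthesized so they group correctly."""
    return "(" + " and ".join(clauses) + ")"


if __name__ == "__main__":
    # Restrict a search to documents from either GAO or OIG sources.
    print(any_of(eq("SourceDocumentDataSource", "GAO"),
                 eq("SourceDocumentDataSource", "OIG")))
```

The resulting string can be passed as the `filter` argument of any search tool that accepts one.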