Commit 3376643

Merge pull request #1 from Program-Integrity-Alliance/feat-add-source-specific-tools
feat: add new search tools and prompts from remote server
2 parents f04a752 + 494732a commit 3376643

9 files changed: +914 -323 lines changed

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -128,6 +128,11 @@ ENV/
128128
env.bak/
129129
venv.bak/
130130

131+
# Development utilities output
132+
remote_tools.json
133+
remote_prompts.json
134+
remote_prompt_details.json
135+
131136
# Spyder project settings
132137
.spyderproject
133138
.spyproject

CONTRIBUTING.md

Lines changed: 52 additions & 0 deletions
@@ -288,6 +288,58 @@ Brief description of changes
288288
- [ ] Performance improvement
289289
- [ ] Other: ___
290290

291+
## Development Tools
292+
293+
### Server Interrogation Utility
294+
295+
The `utils/interrogate_server.py` script is a development utility that reports which tools and prompts are available on the remote PIA MCP server. It is particularly useful when implementing new local server tools or updating existing ones to match the remote server's capabilities.
296+
297+
#### Usage
298+
299+
```bash
300+
# Set your API key (required)
301+
export PIA_API_KEY=your_api_key_here
302+
# Or create a .env file with: PIA_API_KEY=your_api_key_here
303+
304+
# Run the interrogation script
305+
python utils/interrogate_server.py [--output-dir OUTPUT_DIR]
306+
```
307+
308+
#### What it does
309+
310+
1. **Discovers available tools**: Queries the remote server's `tools/list` endpoint to get all available tools with their descriptions and parameter schemas
311+
2. **Discovers available prompts**: Queries the remote server's `prompts/list` endpoint to get all available prompts
312+
3. **Retrieves prompt content**: Gets the actual content/text for each prompt using `prompts/get`
313+
4. **Saves results to JSON files**:
314+
- `remote_tools.json` - Complete tool definitions
315+
- `remote_prompts.json` - Prompt list and metadata
316+
- `remote_prompt_details.json` - Full prompt content
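
The discovery steps above map onto three MCP methods. As a rough sketch (not the actual contents of `utils/interrogate_server.py`), the JSON-RPC 2.0 request bodies could be built as shown here. The helper names and the prompt name `example_prompt` are hypothetical, and transport and authentication with `PIA_API_KEY` are deliberately left out.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request body for an MCP method."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return body


def interrogation_requests(prompt_names):
    """Return request bodies in the order of the steps above.

    tools/list and prompts/list take no parameters here; prompts/get is
    issued once per prompt name (the names would come from the
    prompts/list response in a real run).
    """
    calls = [("tools/list", None), ("prompts/list", None)]
    calls += [("prompts/get", {"name": name}) for name in prompt_names]
    return [
        jsonrpc_request(request_id, method, params)
        for request_id, (method, params) in enumerate(calls, start=1)
    ]


if __name__ == "__main__":
    # "example_prompt" is a placeholder, not a real PIA prompt name.
    for request in interrogation_requests(["example_prompt"]):
        print(json.dumps(request))
```

Each response would then be written to the matching JSON file listed above.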
317+
318+
#### Using the results for development
319+
320+
When implementing new tools or updating existing ones:
321+
322+
1. **Compare tool definitions**: Use `remote_tools.json` to see the available tools with their exact parameter schemas and descriptions
323+
2. **Update tool descriptions**: Copy the exact descriptions from the remote server to ensure consistency
324+
3. **Add missing tools**: Identify tools that exist remotely but not locally
325+
4. **Update prompts**: Use the prompt details to ensure local prompts match the remote server exactly
326+
327+
#### Example workflow
328+
329+
```bash
330+
# 1. Interrogate the remote server
331+
python utils/interrogate_server.py --output-dir ./analysis
332+
333+
# 2. Review the generated JSON files
334+
jq '.tools[].name' analysis/remote_tools.json      # List all tool names
335+
jq '.prompts[].name' analysis/remote_prompts.json  # List all prompt names
336+
337+
# 3. Compare with local implementation and update as needed
338+
# 4. Test changes to ensure compatibility
339+
```
340+
341+
This utility was used to discover and implement the agency-specific search tools (`pia_search_content_gao`, `pia_search_content_oig`, etc.) and the ChatGPT Connector tools (`search`, `fetch`) that are available on the remote server.
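
Step 3 of the workflow (comparing with the local implementation) can be partly automated. The following is a minimal sketch, assuming `remote_tools.json` has a top-level `tools` array whose entries carry a `name` field, as the `jq` commands above imply. The function names and the `LOCAL_TOOL_NAMES` set are hypothetical and would need to reflect whatever the local server actually registers.

```python
import json
from pathlib import Path


def load_remote_tool_names(path):
    """Read tool names from a remote_tools.json file (assumed layout: {"tools": [{"name": ...}]})."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {tool["name"] for tool in data["tools"]}


def diff_tool_names(remote_names, local_names):
    """Report tools that exist on only one side, sorted for stable output."""
    remote, local = set(remote_names), set(local_names)
    return {
        "missing_locally": sorted(remote - local),
        "local_only": sorted(local - remote),
    }


if __name__ == "__main__":
    # Hypothetical: the names the local server currently implements.
    LOCAL_TOOL_NAMES = {"pia_search_content", "pia_search_titles"}
    remote_file = Path("analysis/remote_tools.json")
    if remote_file.exists():
        report = diff_tool_names(load_remote_tool_names(remote_file), LOCAL_TOOL_NAMES)
        print(json.dumps(report, indent=2))
    else:
        print(f"{remote_file} not found; run utils/interrogate_server.py first")
```

Names under `missing_locally` are candidates for new local tools. Entries under `local_only` may point to a renamed or removed remote tool.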
342+
291343
## Testing
292344
- [ ] Tests pass locally
293345
- [ ] New tests added for new functionality

README.md

Lines changed: 123 additions & 17 deletions
@@ -114,14 +114,21 @@ Add this configuration to your MCP client config file:
114114
"run",
115115
"pia-mcp-server",
116116
"--api-key", "YOUR_API_KEY"
117-
]
117+
],
118+
"cwd": "/path/to/your/pia-mcp-local"
118119
}
119120
}
120121
}
121122
```
122123

123124
For Docker:
124125

126+
First, build the Docker image:
127+
128+
`docker build -t pia-mcp-server:latest .`
129+
130+
Then add this to your client (e.g., Claude):
131+
125132
```json
126133
{
127134
"mcpServers": {
@@ -141,57 +148,156 @@ For Docker:
141148

142149
## 💡 Available Tools
143150

144-
The server provides four main tools for searching the Program Integrity Alliance (PIA) database:
151+
The server provides 11 tools for searching the Program Integrity Alliance (PIA) database:
152+
153+
### Core Search Tools
145154

146155
### 1. `pia_search_content`
147156

148157
**Purpose:** Comprehensive search tool for querying document content and recommendations in the PIA database.
149158

150-
**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution (GAO, OIG, etc.). Supports complex OData filtering with boolean logic, operators, and grouping.
159+
**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). Supports complex OData filtering with boolean logic, operators, and grouping.
151160

152161
**Parameters:**
153162
- `query` (required): Search query text
154163
- `filter` (optional): OData filter expression supporting complex boolean logic
155-
- `page` (optional): Page number (1-based, default: 1)
156-
- `page_size` (optional): Number of results per page (max 50, default: 10)
157-
- `search_mode` (optional): Search mode - "content" for full-text search or "titles" for title-only search (default: "content")
158-
- `limit` (optional): Alternative name for page_size (for compatibility)
159-
- `include_facets` (optional): Whether to include facets in response (default: false to reduce token usage)
164+
- `page` (optional): Page number (default: 1)
165+
- `page_size` (optional): Results per page (default: 10)
166+
- `search_mode` (optional): Search mode (default: content)
167+
- `limit` (optional): Maximum results limit
168+
- `include_facets` (optional): Include facets in results (default: false)
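
To make the parameter list concrete, here is a small sketch of how an arguments object for `pia_search_content` might be assembled on the client side. The helper name `build_search_arguments` and its validation rules are assumptions for illustration, not part of the server. The filter string reuses the `SourceDocumentDataSource` field named elsewhere in this README. The `limit` parameter is omitted because its interaction with `page_size` is not specified here.

```python
def build_search_arguments(query, odata_filter=None, page=1, page_size=10,
                           search_mode="content", include_facets=False):
    """Assemble the arguments dict for a pia_search_content call.

    Defaults mirror the documented ones. The checks below are client-side
    assumptions; the server may enforce different limits.
    """
    if not query or not query.strip():
        raise ValueError("query is required")
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    arguments = {
        "query": query,
        "page": page,
        "page_size": page_size,
        "search_mode": search_mode,
        "include_facets": include_facets,
    }
    # Only send a filter when one was supplied.
    if odata_filter:
        arguments["filter"] = odata_filter
    return arguments


if __name__ == "__main__":
    print(build_search_arguments(
        "improper payments",
        odata_filter="SourceDocumentDataSource eq 'GAO'",
    ))
```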
160169

161170
### 2. `pia_search_content_facets`
162171

163172
**Purpose:** Get available facets (filter values) for the PIA database content search.
164173

165-
**Description:** This can help understand what filter values are available before performing content searches. Supports complex OData filtering with boolean logic, operators, and grouping.
174+
**Description:** This can help understand what filter values are available before performing content searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).
166175

167176
**Parameters:**
168-
- `query` (optional): Optional query to get facets for (if empty, gets all facets, default: "")
177+
- `query` (optional): Optional query to get facets for (default: "")
169178
- `filter` (optional): Optional OData filter expression
170179

171180
### 3. `pia_search_titles`
172181

173182
**Purpose:** Search the Program Integrity Alliance (PIA) database for document titles only.
174183

175-
**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Supports complex OData filtering with boolean logic, operators, and grouping.
184+
**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).
176185

177186
**Parameters:**
178187
- `query` (required): Search query text (searches document titles only)
179188
- `filter` (optional): OData filter expression supporting complex boolean logic
180-
- `page` (optional): Page number (1-based, default: 1)
181-
- `page_size` (optional): Number of results per page (max 50, default: 10)
182-
- `limit` (optional): Alternative name for page_size (for compatibility)
183-
- `include_facets` (optional): Whether to include facets in response (default: false to reduce token usage)
189+
- `page` (optional): Page number (default: 1)
190+
- `page_size` (optional): Results per page (default: 10)
191+
- `limit` (optional): Maximum results limit
192+
- `include_facets` (optional): Include facets in results (default: false)
184193

185194
### 4. `pia_search_titles_facets`
186195

187196
**Purpose:** Get available facets (filter values) for the PIA database title search.
188197

189-
**Description:** This can help understand what filter values are available before performing title searches. Supports complex OData filtering with boolean logic, operators, and grouping.
198+
**Description:** This can help understand what filter values are available before performing title searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).
190199

191200
**Parameters:**
192-
- `query` (optional): Optional query to get facets for (if empty, gets all facets, default: "")
201+
- `query` (optional): Optional query to get facets for (default: "")
193202
- `filter` (optional): Optional OData filter expression
194203

204+
### Agency-Specific Search Tools
205+
206+
### 5. `pia_search_content_gao`
207+
208+
**Purpose:** Search for GAO document content and recommendations.
209+
210+
**Description:** This tool automatically filters results to only include documents from the Government Accountability Office (GAO). Returns comprehensive results with full citation information and clickable links for proper attribution.
211+
212+
**Parameters:**
213+
- `query` (required): Search query text
214+
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'GAO')
215+
- `page` (optional): Page number (default: 1)
216+
- `page_size` (optional): Results per page (default: 10)
217+
- `search_mode` (optional): Search mode (default: content)
218+
- `limit` (optional): Maximum results limit
219+
- `include_facets` (optional): Include facets in results (default: false)
220+
221+
### 6. `pia_search_content_oig`
222+
223+
**Purpose:** Search for OIG document content and recommendations.
224+
225+
**Description:** This tool automatically filters results to only include documents from Office of Inspector General (OIG) sources. Returns comprehensive results with full citation information and clickable links for proper attribution.
226+
227+
**Parameters:**
228+
- `query` (required): Search query text
229+
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'OIG')
230+
- `page` (optional): Page number (default: 1)
231+
- `page_size` (optional): Results per page (default: 10)
232+
- `search_mode` (optional): Search mode (default: content)
233+
- `limit` (optional): Maximum results limit
234+
- `include_facets` (optional): Include facets in results (default: false)
235+
236+
### 7. `pia_search_content_crs`
237+
238+
**Purpose:** Search for CRS document content and recommendations.
239+
240+
**Description:** This tool automatically filters results to only include documents from the Congressional Research Service (CRS). Returns comprehensive results with full citation information and clickable links for proper attribution.
241+
242+
**Parameters:**
243+
- `query` (required): Search query text
244+
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'CRS')
245+
- `page` (optional): Page number (default: 1)
246+
- `page_size` (optional): Results per page (default: 10)
247+
- `search_mode` (optional): Search mode (default: content)
248+
- `limit` (optional): Maximum results limit
249+
- `include_facets` (optional): Include facets in results (default: false)
250+
251+
### 8. `pia_search_content_doj`
252+
253+
**Purpose:** Search for Department of Justice document content and recommendations.
254+
255+
**Description:** This tool automatically filters results to only include documents from the Department of Justice. Returns comprehensive results with full citation information and clickable links for proper attribution.
256+
257+
**Parameters:**
258+
- `query` (required): Search query text
259+
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Department of Justice')
260+
- `page` (optional): Page number (default: 1)
261+
- `page_size` (optional): Results per page (default: 10)
262+
- `search_mode` (optional): Search mode (default: content)
263+
- `limit` (optional): Maximum results limit
264+
- `include_facets` (optional): Include facets in results (default: false)
265+
266+
### 9. `pia_search_content_congress`
267+
268+
**Purpose:** Search for Congress.gov document content and recommendations.
269+
270+
**Description:** This tool automatically filters results to only include documents from Congress.gov. Returns comprehensive results with full citation information and clickable links for proper attribution.
271+
272+
**Parameters:**
273+
- `query` (required): Search query text
274+
- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Congress.gov')
275+
- `page` (optional): Page number (default: 1)
276+
- `page_size` (optional): Results per page (default: 10)
277+
- `search_mode` (optional): Search mode (default: content)
278+
- `limit` (optional): Maximum results limit
279+
- `include_facets` (optional): Include facets in results (default: false)
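
The five agency-specific tools differ only in the `SourceDocumentDataSource` value they pin. How the server merges that value with a caller-supplied `filter` is not documented here. One plausible approach, shown purely as a sketch, is to parenthesize the caller's expression and `and` it with the source clause. The function names and the `PublishedYear` field in the usage line are hypothetical.

```python
# Tool name -> SourceDocumentDataSource value, taken from the parameter notes above.
AGENCY_SOURCES = {
    "pia_search_content_gao": "GAO",
    "pia_search_content_oig": "OIG",
    "pia_search_content_crs": "CRS",
    "pia_search_content_doj": "Department of Justice",
    "pia_search_content_congress": "Congress.gov",
}


def odata_string(value):
    """Quote a string literal for OData; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def agency_filter(tool_name, user_filter=None):
    """Combine an optional caller filter with the tool's fixed source clause."""
    source_clause = f"SourceDocumentDataSource eq {odata_string(AGENCY_SOURCES[tool_name])}"
    if user_filter and user_filter.strip():
        # Parentheses keep any 'or' in the caller's filter from escaping the source restriction.
        return f"({user_filter.strip()}) and {source_clause}"
    return source_clause


if __name__ == "__main__":
    # "PublishedYear" is a made-up field name used only to show the shape of the output.
    print(agency_filter("pia_search_content_doj", "PublishedYear ge 2020"))
```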
280+
281+
### ChatGPT Connector Tools
282+
283+
### 10. `search`
284+
285+
**Purpose:** Simple search interface for ChatGPT Connectors.
286+
287+
**Description:** Search the Program Integrity Alliance (PIA) database and return a list of potentially relevant search results with titles, snippets, and URLs for citation. This endpoint is one of those supported under OpenAI's MCP spec for integrating ChatGPT Connectors.
288+
289+
**Parameters:**
290+
- `query` (required): A search query string to find relevant documents in the PIA database
291+
292+
### 11. `fetch`
293+
294+
**Purpose:** Document retrieval by ID for ChatGPT Connectors.
295+
296+
**Description:** Retrieve the full contents of a specific document from the PIA database using its unique identifier. This endpoint is one of those supported under OpenAI's MCP spec for integrating ChatGPT Connectors.
297+
298+
**Parameters:**
299+
- `id` (required): A unique identifier for the document to retrieve
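
The two connector tools are typically used as a pair: `search` to find candidates, then `fetch` to pull one document by ID. As an illustrative sketch only, the corresponding MCP `tools/call` request bodies could look like the output of the code below. The helper names and the document ID `doc-123` are placeholders. In practice the ID would come from a `search` result, whose exact response shape is not described in this README.

```python
import json


def tool_call(request_id, name, arguments):
    """Build a JSON-RPC 2.0 body for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def search_then_fetch(query, document_id):
    """Return the two request bodies for a search followed by a fetch."""
    return [
        tool_call(1, "search", {"query": query}),
        tool_call(2, "fetch", {"id": document_id}),
    ]


if __name__ == "__main__":
    # "doc-123" is a placeholder ID, not a real PIA document identifier.
    for request in search_then_fetch("grant fraud oversight", "doc-123"):
        print(json.dumps(request))
```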
300+
195301
## Search Modes
196302

197303
Comprehensive search with OData filtering and faceting. The `filter` parameter uses standard [OData query syntax](https://docs.oasis-open.org/odata/odata/v4.01/odata-v4.01-part2-url-conventions.html).
