diff --git a/README.md b/README.md index 4dae6b9..0653c2a 100644 --- a/README.md +++ b/README.md @@ -150,7 +150,7 @@ Then add this to your Client, eg Claude ... ## 💡 Available Tools -The server provides 11 tools for searching the Program Integrity Alliance (PIA) database: +The server provides 12 tools for searching the Program Integrity Alliance (PIA) database: ### Core Search Tools @@ -158,7 +158,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Purpose:** Comprehensive search tool for querying document content and recommendations in the PIA database. -**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). Supports complex OData filtering with boolean logic, operators, and grouping. +**Description:** Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders. Supports complex OData filtering with boolean logic, operators, and grouping. **Parameters:** - `query` (required): Search query text @@ -173,7 +173,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Purpose:** Get available facets (filter values) for the PIA database content search. -**Description:** This can help understand what filter values are available before performing content searches. 
Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). +**Description:** This can help understand what filter values are available before performing content searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders. **Parameters:** - `query` (optional): Optional query to get facets for (default: "") @@ -183,7 +183,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Purpose:** Search the Program Integrity Alliance (PIA) database for document titles only. -**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). +**Description:** Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders. **Parameters:** - `query` (required): Search query text (searches document titles only) @@ -197,7 +197,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Purpose:** Get available facets (filter values) for the PIA database title search. -**Description:** This can help understand what filter values are available before performing title searches. 
Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). +**Description:** This can help understand what filter values are available before performing title searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders. **Parameters:** - `query` (optional): Optional query to get facets for (default: "") @@ -228,7 +228,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Parameters:** - `query` (required): Search query text -- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'OIG') +- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Oversight.gov') - `page` (optional): Page number (default: 1) - `page_size` (optional): Results per page (default: 10) - `search_mode` (optional): Search mode (default: content) @@ -280,9 +280,37 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) - `limit` (optional): Maximum results limit - `include_facets` (optional): Include facets in results (default: false) +### Executive Orders Search Tool + +### 10. `pia_search_content_executive_orders` + +**Purpose:** Search for Executive Orders document content from the Federal Register. + +**Description:** This tool automatically filters results to only include Executive Orders from the Federal Register (https://www.federalregister.gov/). Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Supports complex OData filtering with boolean logic, operators, and grouping. 
+ +**Parameters:** +- `query` (required): Search query text +- `filter` (optional): OData filter expression (SourceDocumentDataSource is automatically set to 'Federal Register' and SourceDocumentDataSet is set to 'executive orders') +- `page` (optional): Page number (default: 1) +- `page_size` (optional): Results per page (default: 10) +- `search_mode` (optional): Search mode (default: content) +- `limit` (optional): Maximum results limit +- `include_facets` (optional): Include facets in results (default: false) + +**Executive Orders Coverage:** +- **Time Period:** Last 7 presidencies +- **Source:** Federal Register (https://www.federalregister.gov/presidential-documents/executive-orders) +- **Volume:** 1k+ executive orders +- **Update Frequency:** Weekly updates + +**Example Searches:** +- Search for cybersecurity executive orders: `{"query": "cybersecurity"}` +- Search for recent executive orders: `{"query": "artificial intelligence", "filter": "SourceDocumentPublishDate ge '2023-01-01'"}` +- Search by specific topics: `{"query": "climate change OR environmental"}` + ### ChatGPT Connector Tools -### 10. `search` +### 11. `search` **Purpose:** Simple search interface for ChatGPT Connectors. @@ -291,7 +319,7 @@ The server provides 11 tools for searching the Program Integrity Alliance (PIA) **Parameters:** - `query` (required): A search query string to find relevant documents in the PIA database -### 11. `fetch` +### 12. `fetch` **Purpose:** Document retrieval by ID for ChatGPT Connectors. @@ -309,12 +337,12 @@ Comprehensive search with OData filtering and faceting. 
The `filter` parameter u **Example Filter Expressions:** - Basic filter: `"SourceDocumentDataSource eq 'GAO'"` -- Multiple conditions: `"SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG'"` +- Multiple conditions: `"SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'Oversight.gov'"` - Complex grouping: `"SourceDocumentDataSource eq 'GAO' and RecStatus ne 'Closed'"` - Negation: `"SourceDocumentDataSource ne 'Department of Justice' and not (RecStatus eq 'Closed')"` - List membership: `"IsIntegrityRelated eq 'Yes' and RecPriorityFlag in ('High', 'Critical')"` - Date ranges: `"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'"` -- Boolean grouping: `"(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG') and RecStatus eq 'Open'"` +- Boolean grouping: `"(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'Oversight.gov') and RecStatus eq 'Open'"` **OData Filter Operators:** - `eq` - equals: `field eq 'value'` @@ -364,7 +392,7 @@ Use the `pia_search_facets` tool to explore what fields are available for filter The facets response will show available fields and their possible values: ```json { - "SourceDocumentDataSource": ["OIG", "GAO", "CMS", "FBI"], + "SourceDocumentDataSource": ["Oversight.gov", "GAO", "CMS", "FBI"], "RecStatus": ["Open", "Closed", "In Progress"], "RecPriorityFlag": ["High", "Medium", "Low", "Critical"], "IsIntegrityRelated": ["Yes", "No"], @@ -384,7 +412,7 @@ Filter: "SourceDocumentDataSource eq 'GAO' and SourceDocumentPublishDate ge '202 **Complex Example:** ``` Query: "healthcare violations" -Filter: "(SourceDocumentDataSource eq 'OIG' or SourceDocumentDataSource eq 'CMS') and RecPriorityFlag in ('High', 'Critical') and SourceDocumentPublishDate ge '2023-01-01'" +Filter: "(SourceDocumentDataSource eq 'Oversight.gov' or SourceDocumentDataSource eq 'CMS') and RecPriorityFlag in ('High', 'Critical') and SourceDocumentPublishDate ge '2023-01-01'" ``` 
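The filter expressions above can also be assembled programmatically rather than written by hand. The following is a minimal sketch, not part of the server: the helper names (`quote`, `eq`, `any_of`, `in_list`, `all_of`) are hypothetical, and doubling embedded single quotes follows the general OData convention, which the PIA backend is only assumed to honor.

```python
from typing import Iterable


def quote(value: str) -> str:
    """Wrap a value in single quotes, doubling embedded quotes (OData convention; assumed supported)."""
    return "'" + value.replace("'", "''") + "'"


def eq(field: str, value: str) -> str:
    """Build a single equality clause: field eq 'value'."""
    return f"{field} eq {quote(value)}"


def any_of(field: str, values: Iterable[str]) -> str:
    """Build a parenthesised 'or' group: (field eq 'a' or field eq 'b')."""
    return "(" + " or ".join(eq(field, v) for v in values) + ")"


def in_list(field: str, values: Iterable[str]) -> str:
    """Build a list-membership clause: field in ('a', 'b')."""
    return f"{field} in ({', '.join(quote(v) for v in values)})"


def all_of(*clauses: str) -> str:
    """Join non-empty clauses with 'and'."""
    return " and ".join(c for c in clauses if c)


# Rebuild the "Complex Example" filter shown above.
complex_filter = all_of(
    any_of("SourceDocumentDataSource", ["Oversight.gov", "CMS"]),
    in_list("RecPriorityFlag", ["High", "Critical"]),
    f"SourceDocumentPublishDate ge {quote('2023-01-01')}",
)

# Arguments in the shape the README documents for pia_search_content.
arguments = {"query": "healthcare violations", "filter": complex_filter}
print(arguments["filter"])
```

Building the string from small helpers keeps quoting and parenthesisation consistent. The resulting `arguments` dictionary has the same shape as the `query`/`filter` parameters documented for the search tools.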
## 📝 AI Instruction Prompts diff --git a/src/pia_mcp_server/prompts/handlers.py b/src/pia_mcp_server/prompts/handlers.py index 3027992..3d01e12 100644 --- a/src/pia_mcp_server/prompts/handlers.py +++ b/src/pia_mcp_server/prompts/handlers.py @@ -246,7 +246,7 @@ def _generate_titles_search_guidance() -> str: - Construct the filter in **OData syntax**: - `SourceDocumentDataSource eq 'GAO'` - `SourceDocumentTitle contains 'fraud'` - - `(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG') and SourceDocumentIsRecDoc eq 'Yes'` + - `(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'Oversight.gov') and SourceDocumentIsRecDoc eq 'Yes'` - Use correct operators: `eq`, `ne`, `gt`, `ge`, `lt`, `le`, `and`, `or`, `contains` - Wrap string values in single quotes `'value'` diff --git a/src/pia_mcp_server/server.py b/src/pia_mcp_server/server.py index cb3ad36..895e8e0 100644 --- a/src/pia_mcp_server/server.py +++ b/src/pia_mcp_server/server.py @@ -23,6 +23,7 @@ handle_pia_search_content_crs, handle_pia_search_content_doj, handle_pia_search_content_congress, + handle_pia_search_content_executive_orders, handle_search, handle_fetch, ) @@ -36,6 +37,7 @@ pia_search_content_crs_tool, pia_search_content_doj_tool, pia_search_content_congress_tool, + pia_search_content_executive_orders_tool, search_tool, fetch_tool, ) @@ -75,6 +77,7 @@ async def list_tools() -> List[types.Tool]: pia_search_content_crs_tool, pia_search_content_doj_tool, pia_search_content_congress_tool, + pia_search_content_executive_orders_tool, search_tool, fetch_tool, ] @@ -103,6 +106,8 @@ async def call_tool(name: str, arguments: Dict[str, Any]) -> List[types.TextCont return await handle_pia_search_content_doj(arguments) elif name == "pia_search_content_congress": return await handle_pia_search_content_congress(arguments) + elif name == "pia_search_content_executive_orders": + return await handle_pia_search_content_executive_orders(arguments) elif name == "search": return await 
handle_search(arguments) elif name == "fetch": diff --git a/src/pia_mcp_server/tools/__init__.py b/src/pia_mcp_server/tools/__init__.py index 800c39b..a3922b0 100644 --- a/src/pia_mcp_server/tools/__init__.py +++ b/src/pia_mcp_server/tools/__init__.py @@ -24,6 +24,8 @@ pia_search_content_doj_tool, handle_pia_search_content_congress, pia_search_content_congress_tool, + handle_pia_search_content_executive_orders, + pia_search_content_executive_orders_tool, handle_search, search_tool, handle_fetch, @@ -49,6 +51,8 @@ "pia_search_content_doj_tool", "handle_pia_search_content_congress", "pia_search_content_congress_tool", + "handle_pia_search_content_executive_orders", + "pia_search_content_executive_orders_tool", "handle_search", "search_tool", "handle_fetch", diff --git a/src/pia_mcp_server/tools/search_tools.py b/src/pia_mcp_server/tools/search_tools.py index f653681..76e589e 100644 --- a/src/pia_mcp_server/tools/search_tools.py +++ b/src/pia_mcp_server/tools/search_tools.py @@ -13,14 +13,14 @@ # Tool definitions - EXACT copies from remote server pia_search_content_tool = types.Tool( name="pia_search_content", - description="Search the Program Integrity Alliance (PIA) database for document content and recommendations. Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs). Supports complex OData filtering with boolean logic, operators, and grouping.", + description="Search the Program Integrity Alliance (PIA) database for document content and recommendations. Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. 
Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders. Supports complex OData filtering with boolean logic, operators, and grouping.", inputSchema={ "type": "object", "properties": { "query": {"type": "string", "description": "Search query text"}, "filter": { "type": "string", - "description": "Optional OData filter expression supporting complex boolean logic.\n\nAVAILABLE FIELDS:\n• SourceDocumentDataSource: Data source/agency that published the document. Major sources (>1k documents): 'Department of Justice', 'Congress.gov', 'Oversight.gov', 'CRS', 'GAO', 'Federal Register'\n• SourceDocumentDataSet: Dataset or collection the document belongs to. Values: 'press-releases', 'reports', 'bills-and-laws', 'federal-reports', 'executive orders', 'state-and-local-reports', 'federal reports'\n• SourceDocumentOrg: Organization associated with the document. There are many values, use pia_search_content_facets tool to see available options\n• SourceDocumentTitle: Document title - use contains, eq for text matching\n• SourceDocumentPublishDate: Publication date - ISO 8601 format YYYY-MM-DD (e.g., '2023-01-01'). Use ge/le for ranges\n• RecStatus: Recommendation status\n• RecPriorityFlag: Priority flag for recommendations\n• IsIntegrityRelated: Whether the content is integrity-related\n• SourceDocumentIsRecDoc: Whether the document contains recommendations. 
Values: 'No', 'Yes'\n• RecFraudRiskManagementThemePIA: Fraud risk management theme classification\n• RecMatterForCongressPIA: Whether the matter is for Congressional attention\n• RecRecommendation: Recommendation text - use contains, eq for text matching\n• RecAgencyComments: Agency comments on recommendations - use contains, eq for text matching\n\nOPERATORS:\n• Text: contains, eq, ne, startswith, endswith\n• Exact: eq (equals), ne (not equals), in (in list)\n• Date: ge (greater/equal), le (less/equal), eq (equals)\n• Logic: and, or, not, parentheses for grouping\n\nEXAMPLES:\n• \"SourceDocumentDataSource eq 'GAO'\"\n• \"SourceDocumentDataSource eq 'GAO' and RecStatus ne 'Closed'\"\n• \"IsIntegrityRelated eq 'True' and RecPriorityFlag eq 'Yes'\"\n• \"(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG') and RecStatus eq 'Open'\"\n• \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n\nTIP: Use pia_search_content_facets tool to get the most current available values.", + "description": "Optional OData filter expression supporting complex boolean logic.\n\n AVAILABLE FIELDS:\n • SourceDocumentDataSource: Data source/agency that published the document. Major sources (>1k documents): 'Department of Justice', 'Congress.gov', 'Oversight.gov', 'CRS', 'GAO', 'Federal Register'\n• SourceDocumentDataSet: Dataset or collection the document belongs to. Values: 'press-releases', 'reports', 'bills-and-laws', 'federal-reports', 'executive orders', 'state-and-local-reports', 'federal reports'\n• SourceDocumentTitle: Document title - use contains, eq for text matching\n• SourceDocumentPublishDate: Publication date - ISO 8601 format YYYY-MM-DD (e.g., '2023-01-01'). Use ge/le for ranges\n• RecStatus: Recommendation status\n• RecPriorityFlag: Priority flag for recommendations\n• SourceDocumentIsRecDoc: Whether the document contains recommendations. 
Values: 'No', 'Yes'\n• RecFraudRiskManagementThemePIA: Fraud risk management theme classification\n• RecMatterForCongressPIA: Whether the matter is for Congressional attention\n• RecRecommendation: Recommendation text - use contains, eq for text matching\n• RecAgencyComments: Agency comments on recommendations - use contains, eq for text matching\n• referenced_agencies: Agencies referenced by documents (collection field). Example: (referenced_agencies/any(a: a eq 'Department of Defense (DOD)') or referenced_agencies/any(a: a eq 'Department of Justice (DOJ)')) - for single agency omit outer parentheses and 'or'. Get all values via pia_search_content_facets. Note: Many data sources such as CRS and Congress do not tag documents with agency. In these cases PIA infers agencies through AI tagging and in some cases the agency may be incorrect. This tagging only tags documents where the agency is explicitly mentioned.\n\n OPERATORS:\n • Text: contains, eq, ne, startswith, endswith\n • Exact: eq (equals), ne (not equals), in (in list)\n • Date: ge (greater/equal), le (less/equal), eq (equals)\n • Logic: and, or, not, parentheses for grouping\n\n EXAMPLES:\n • \"SourceDocumentDataSource eq 'GAO'\"\n • \"SourceDocumentDataSource eq 'GAO' and RecStatus ne 'Closed'\"\n • \"(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG') and RecStatus eq 'Open'\"\n • \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n\n TIP: Use pia_search_content_facets tool to get the most current available values.", }, "page": { "type": "integer", @@ -46,11 +46,69 @@ }, "required": ["query"], }, + outputSchema={ + "type": "object", + "properties": { + "output": { + "type": "object", + "properties": { + "total_count": {"type": "integer"}, + "query": {"type": "string"}, + "summary": {"type": "string"}, + "results": { + "type": "array", + "items": { + "type": "object", + "properties": { + "id": {"type": "string"}, + "title": {"type": "string"}, + 
"snippet": {"type": "string"}, + "score": {"type": "number"}, + "data_source": {"type": "string"}, + "url": {"type": "string", "format": "uri"}, + "publication_date": { + "type": "string", + "format": "date-time", + }, + }, + "required": ["id", "title", "data_source", "url"], + "additionalProperties": False, + }, + }, + "citations": { + "type": "array", + "items": { + "type": "object", + "properties": { + "id": {"type": "string"}, + "label": {"type": "string"}, + "url": {"type": "string", "format": "uri"}, + "title": {"type": "string"}, + "data_source": {"type": "string"}, + "publication_date": { + "type": "string", + "format": "date-time", + }, + }, + "required": ["id", "label", "url"], + "additionalProperties": False, + }, + }, + "references": {"type": "array", "items": {"type": "string"}}, + "citation_guidance": {"type": "string"}, + }, + "required": ["total_count", "results"], + "additionalProperties": False, + } + }, + "required": ["output"], + "additionalProperties": False, + }, ) pia_search_content_facets_tool = types.Tool( name="pia_search_content_facets", - description="Get available facets (filter values) for the PIA database content search. This can help understand what filter values are available before performing content searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).", + description="Get available facets (filter values) for the PIA database content search. This can help understand what filter values are available before performing content searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). 
Use pia_search_content_executive_orders to search only executive orders.", inputSchema={ "type": "object", "properties": { @@ -69,7 +127,7 @@ pia_search_titles_tool = types.Tool( name="pia_search_titles", - description="Search the Program Integrity Alliance (PIA) database for document titles only. Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).", + description="Search the Program Integrity Alliance (PIA) database for document titles only. Returns document titles and metadata without searching the full content. Useful for finding specific documents by title or discovering available documents. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). Use pia_search_content_executive_orders to search only executive orders.", inputSchema={ "type": "object", "properties": { @@ -104,7 +162,7 @@ pia_search_titles_facets_tool = types.Tool( name="pia_search_titles_facets", - description="Get available facets (filter values) for the PIA database title search. This can help understand what filter values are available before performing title searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs).", + description="Get available facets (filter values) for the PIA database title search. This can help understand what filter values are available before performing title searches. Major data sources include: Department of Justice (198k+ docs), Congress.gov (29k+ docs), Oversight.gov (22k+ docs), CRS (22k+ docs), GAO (10k+ docs), Federal Register (1k+ executive orders). 
Use pia_search_content_executive_orders to search only executive orders.", inputSchema={ "type": "object", "properties": { @@ -168,7 +226,7 @@ "query": {"type": "string", "description": "Search query text"}, "filter": { "type": "string", - "description": "Optional OData filter expression supporting complex boolean logic.\n\nAVAILABLE FIELDS:\n• Note: SourceDocumentDataSource is automatically set to 'OIG' for this tool. Major sources (>1k documents): 'Department of Justice', 'Congress.gov', 'Oversight.gov', 'CRS', 'GAO', 'Federal Register'\n• SourceDocumentDataSet: Dataset or collection the document belongs to. Values: 'press-releases', 'reports', 'bills-and-laws', 'federal-reports', 'executive orders', 'state-and-local-reports', 'federal reports'\n• SourceDocumentOrg: Organization associated with the document. There are many values, use pia_search_content_facets tool to see available options\n• SourceDocumentTitle: Document title - use contains, eq for text matching\n• SourceDocumentPublishDate: Publication date - ISO 8601 format YYYY-MM-DD (e.g., '2023-01-01'). Use ge/le for ranges\n• RecStatus: Recommendation status\n• RecPriorityFlag: Priority flag for recommendations\n• IsIntegrityRelated: Whether the content is integrity-related\n• SourceDocumentIsRecDoc: Whether the document contains recommendations. 
Values: 'No', 'Yes'\n• RecFraudRiskManagementThemePIA: Fraud risk management theme classification\n• RecMatterForCongressPIA: Whether the matter is for Congressional attention\n• RecRecommendation: Recommendation text - use contains, eq for text matching\n• RecAgencyComments: Agency comments on recommendations - use contains, eq for text matching\n\nOPERATORS:\n• Text: contains, eq, ne, startswith, endswith\n• Exact: eq (equals), ne (not equals), in (in list)\n• Date: ge (greater/equal), le (less/equal), eq (equals)\n• Logic: and, or, not, parentheses for grouping\n\nEXAMPLES:\n• \"RecStatus eq 'Open'\"\n• \"RecStatus ne 'Closed' and RecPriorityFlag eq 'Yes'\"\n• \"IsIntegrityRelated eq 'True' and RecPriorityFlag eq 'Yes'\"\n• \"(RecStatus eq 'Open' and RecPriorityFlag eq 'Yes')\"\n• \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n\nTIP: Use pia_search_content_facets tool to get the most current available values.", + "description": "Optional OData filter expression supporting complex boolean logic.\n\nAVAILABLE FIELDS:\n• Note: SourceDocumentDataSource is automatically set to 'Oversight.gov' for this tool. Major sources (>1k documents): 'Department of Justice', 'Congress.gov', 'Oversight.gov', 'CRS', 'GAO', 'Federal Register'\n• SourceDocumentDataSet: Dataset or collection the document belongs to. Values: 'press-releases', 'reports', 'bills-and-laws', 'federal-reports', 'executive orders', 'state-and-local-reports', 'federal reports'\n• SourceDocumentOrg: Organization associated with the document. There are many values, use pia_search_content_facets tool to see available options\n• SourceDocumentTitle: Document title - use contains, eq for text matching\n• SourceDocumentPublishDate: Publication date - ISO 8601 format YYYY-MM-DD (e.g., '2023-01-01'). 
Use ge/le for ranges\n• RecStatus: Recommendation status\n• RecPriorityFlag: Priority flag for recommendations\n• IsIntegrityRelated: Whether the content is integrity-related\n• SourceDocumentIsRecDoc: Whether the document contains recommendations. Values: 'No', 'Yes'\n• RecFraudRiskManagementThemePIA: Fraud risk management theme classification\n• RecMatterForCongressPIA: Whether the matter is for Congressional attention\n• RecRecommendation: Recommendation text - use contains, eq for text matching\n• RecAgencyComments: Agency comments on recommendations - use contains, eq for text matching\n\nOPERATORS:\n• Text: contains, eq, ne, startswith, endswith\n• Exact: eq (equals), ne (not equals), in (in list)\n• Date: ge (greater/equal), le (less/equal), eq (equals)\n• Logic: and, or, not, parentheses for grouping\n\nEXAMPLES:\n• \"RecStatus eq 'Open'\"\n• \"RecStatus ne 'Closed' and RecPriorityFlag eq 'Yes'\"\n• \"IsIntegrityRelated eq 'True' and RecPriorityFlag eq 'Yes'\"\n• \"(RecStatus eq 'Open' and RecPriorityFlag eq 'Yes')\"\n• \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n\nTIP: Use pia_search_content_facets tool to get the most current available values.", }, "page": { "type": "integer", @@ -307,6 +365,43 @@ }, ) +pia_search_content_executive_orders_tool = types.Tool( + name="pia_search_content_executive_orders", + description="Search the Program Integrity Alliance (PIA) database for Executive Orders document content from the Federal Register. This tool automatically filters results to only include Executive Orders from the Federal Register (https://www.federalregister.gov/). Returns comprehensive results with full citation information and clickable links for proper attribution. Each result includes corresponding citations with data source attribution. 
Supports complex OData filtering with boolean logic, operators, and grouping.", + inputSchema={ + "type": "object", + "properties": { + "query": {"type": "string", "description": "Search query text"}, + "filter": { + "type": "string", + "description": "Optional OData filter expression supporting complex boolean logic.\n\nAVAILABLE FIELDS:\n• Note: SourceDocumentDataSource is automatically set to 'Federal Register' and SourceDocumentDataSet is set to 'executive orders' for this tool. Major sources (>1k documents): 'Department of Justice', 'Congress.gov', 'Oversight.gov', 'CRS', 'GAO', 'Federal Register'\n• SourceDocumentDataSet: Dataset or collection the document belongs to. Values: 'press-releases', 'reports', 'bills-and-laws', 'federal-reports', 'executive orders', 'state-and-local-reports', 'federal reports'\n• SourceDocumentOrg: Organization associated with the document. There are many values, use pia_search_content_facets tool to see available options\n• SourceDocumentTitle: Document title - use contains, eq for text matching\n• SourceDocumentPublishDate: Publication date - ISO 8601 format YYYY-MM-DD (e.g., '2023-01-01'). Use ge/le for ranges\n• RecStatus: Recommendation status\n• RecPriorityFlag: Priority flag for recommendations\n• IsIntegrityRelated: Whether the content is integrity-related\n• SourceDocumentIsRecDoc: Whether the document contains recommendations. 
Values: 'No', 'Yes'\n• RecFraudRiskManagementThemePIA: Fraud risk management theme classification\n• RecMatterForCongressPIA: Whether the matter is for Congressional attention\n• RecRecommendation: Recommendation text - use contains, eq for text matching\n• RecAgencyComments: Agency comments on recommendations - use contains, eq for text matching\n\nOPERATORS:\n• Text: contains, eq, ne, startswith, endswith\n• Exact: eq (equals), ne (not equals), in (in list)\n• Date: ge (greater/equal), le (less/equal), eq (equals)\n• Logic: and, or, not, parentheses for grouping\n\nEXAMPLES:\n• \"SourceDocumentPublishDate ge '2020-01-01'\"\n• \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n• \"IsIntegrityRelated eq 'True' and RecPriorityFlag eq 'Yes'\"\n• \"IsIntegrityRelated eq 'True'\"\n• \"SourceDocumentPublishDate ge '2020-01-01' and SourceDocumentPublishDate le '2024-12-31'\"\n\nTIP: Use pia_search_content_facets tool to get the most current available values.",
+            },
+            "page": {
+                "type": "integer",
+                "description": "Page number (default: 1)",
+                "default": 1,
+            },
+            "page_size": {
+                "type": "integer",
+                "description": "Results per page (default: 10)",
+                "default": 10,
+            },
+            "search_mode": {
+                "type": "string",
+                "description": "Search mode (default: content)",
+                "default": "content",
+            },
+            "limit": {"type": "integer", "description": "Maximum results limit"},
+            "include_facets": {
+                "type": "boolean",
+                "description": "Include facets in results",
+                "default": False,
+            },
+        },
+        "required": ["query"],
+    },
+)
+
 search_tool = types.Tool(
     name="search",
     description="Search the Program Integrity Alliance (PIA) database and return a list of potentially relevant search results with titles, snippets, and URLs for citation. This endpoint is one of the supported for OpenAI's MCP spec when integrating ChatGPT Connectors.",
@@ -402,6 +497,13 @@ async def handle_pia_search_content_congress(
     return await _forward_to_remote("pia_search_content_congress", arguments)
 
 
+async def handle_pia_search_content_executive_orders(
+    arguments: Dict[str, Any],
+) -> List[types.TextContent]:
+    """Handle PIA Executive Orders content search requests."""
+    return await _forward_to_remote("pia_search_content_executive_orders", arguments)
+
+
 async def handle_search(
     arguments: Dict[str, Any],
 ) -> List[types.TextContent]:
diff --git a/tests/test_tools.py b/tests/test_tools.py
index bebc9eb..4cf5405 100644
--- a/tests/test_tools.py
+++ b/tests/test_tools.py
@@ -1,5 +1,6 @@
 """Tests for tools module."""
 
+import os
 import pytest
 from unittest.mock import AsyncMock, patch, Mock
 import httpx
@@ -8,6 +9,14 @@
     handle_pia_search_content_facets,
     handle_pia_search_titles,
     handle_pia_search_titles_facets,
+    handle_pia_search_content_gao,
+    handle_pia_search_content_oig,
+    handle_pia_search_content_crs,
+    handle_pia_search_content_doj,
+    handle_pia_search_content_congress,
+    handle_pia_search_content_executive_orders,
+    handle_search,
+    handle_fetch,
 )
 from pia_mcp_server.config import Settings
 
@@ -18,10 +27,11 @@
 async def test_pia_search_content_no_api_key():
     """Test PIA content search without API key."""
     with patch.object(Settings, "_get_api_key_from_args", return_value=None):
-        result = await handle_pia_search_content({"query": "test"})
+        with patch.dict(os.environ, {}, clear=True):  # Clear all environment variables
+            result = await handle_pia_search_content({"query": "test"})
 
-        assert len(result) == 1
-        assert "PIA API key is required" in result[0].text
+            assert len(result) == 1
+            assert "PIA API key is required" in result[0].text
 
 
 @pytest.mark.asyncio
@@ -129,7 +139,7 @@ async def test_pia_search_content_with_complex_odata_filter():
             mock_client.return_value.__aenter__.return_value = mock_client_instance
 
             # Test complex boolean logic filter
-            complex_filter = "(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'OIG') and RecPriorityFlag in ('High', 'Critical')"
+            complex_filter = "(SourceDocumentDataSource eq 'GAO' or SourceDocumentDataSource eq 'Oversight.gov') and RecPriorityFlag in ('High', 'Critical')"
             result = await handle_pia_search_content(
                 {"query": "integrity violations", "filter": complex_filter}
             )
@@ -196,10 +206,11 @@ async def test_pia_search_content_http_error():
 async def test_pia_search_content_facets_no_api_key():
     """Test PIA search facets without API key."""
     with patch.object(Settings, "_get_api_key_from_args", return_value=None):
-        result = await handle_pia_search_content_facets({"query": "test"})
+        with patch.dict(os.environ, {}, clear=True):  # Clear all environment variables
+            result = await handle_pia_search_content_facets({"query": "test"})
 
-        assert len(result) == 1
-        assert "PIA API key is required" in result[0].text
+            assert len(result) == 1
+            assert "PIA API key is required" in result[0].text
 
 
 @pytest.mark.asyncio
@@ -210,7 +221,7 @@ async def test_pia_search_content_facets_success():
         "id": 1,
         "result": {
             "facets": {
-                "SourceDocumentDataSource": ["OIG", "GAO", "CMS"],
+                "SourceDocumentDataSource": ["Oversight.gov", "GAO", "CMS"],
                 "RecStatus": ["Open", "Closed", "In Progress"],
                 "RecPriorityFlag": ["High", "Medium", "Low", "Critical"],
                 "IsIntegrityRelated": ["Yes", "No"],
@@ -232,7 +243,7 @@ async def test_pia_search_content_facets_success():
 
             assert len(result) == 1
             assert "SourceDocumentDataSource" in result[0].text
-            assert "OIG" in result[0].text
+            assert "Oversight.gov" in result[0].text
             assert "RecStatus" in result[0].text
 
 
@@ -430,7 +441,7 @@ async def test_pia_search_content_facets_empty_filter():
         "id": 1,
         "result": {
             "facets": {
-                "SourceDocumentDataSource": ["OIG", "GAO", "CMS"],
+                "SourceDocumentDataSource": ["Oversight.gov", "GAO", "CMS"],
                 "RecStatus": ["Open", "Closed"],
             }
         },
@@ -453,3 +464,286 @@ async def test_pia_search_content_facets_empty_filter():
 
             assert len(result) == 1
             assert "SourceDocumentDataSource" in result[0].text
+
+
+# Agency-specific search tool tests
+@pytest.mark.asyncio
+async def test_pia_search_content_gao_success():
+    """Test successful PIA GAO content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {"title": "GAO Report", "id": "gao-123", "data_source": "GAO"}
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_gao({"query": "audit"})
+
+            assert len(result) == 1
+            assert "GAO Report" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_pia_search_content_oig_success():
+    """Test successful PIA OIG content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {
+                    "title": "OIG Investigation",
+                    "id": "oig-123",
+                    "data_source": "Oversight.gov",
+                }
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_oig({"query": "oversight"})
+
+            assert len(result) == 1
+            assert "OIG Investigation" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_pia_search_content_crs_success():
+    """Test successful PIA CRS content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {"title": "CRS Report", "id": "crs-123", "data_source": "CRS"}
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_crs({"query": "research"})
+
+            assert len(result) == 1
+            assert "CRS Report" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_pia_search_content_doj_success():
+    """Test successful PIA DOJ content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {
+                    "title": "DOJ Press Release",
+                    "id": "doj-123",
+                    "data_source": "Department of Justice",
+                }
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_doj({"query": "enforcement"})
+
+            assert len(result) == 1
+            assert "DOJ Press Release" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_pia_search_content_congress_success():
+    """Test successful PIA Congress content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {
+                    "title": "Congressional Bill",
+                    "id": "congress-123",
+                    "data_source": "Congress.gov",
+                }
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_congress({"query": "legislation"})
+
+            assert len(result) == 1
+            assert "Congressional Bill" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_pia_search_content_executive_orders_success():
+    """Test successful PIA Executive Orders content search."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "documents": [
+                {
+                    "title": "Executive Order 12345",
+                    "id": "eo-123",
+                    "data_source": "Federal Register",
+                }
+            ],
+            "total": 1,
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_pia_search_content_executive_orders(
+                {"query": "cybersecurity"}
+            )
+
+            assert len(result) == 1
+            assert "Executive Order 12345" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_fetch_success():
+    """Test successful document fetch."""
+    mock_response = {
+        "jsonrpc": "2.0",
+        "id": 1,
+        "result": {
+            "id": "doc-123",
+            "title": "Test Document",
+            "content": "Full document content here",
+            "url": "https://example.com/doc-123",
+        },
+    }
+
+    with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+        with patch("httpx.AsyncClient") as mock_client:
+            mock_response_obj = Mock()
+            mock_response_obj.json.return_value = mock_response
+            mock_response_obj.raise_for_status.return_value = None
+
+            mock_client_instance = AsyncMock()
+            mock_client_instance.post.return_value = mock_response_obj
+            mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+            result = await handle_fetch({"id": "doc-123"})
+
+            assert len(result) == 1
+            assert "Test Document" in result[0].text
+            assert "Full document content here" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_agency_tools_no_api_key():
+    """Test agency-specific tools without API key."""
+    tools_to_test = [
+        (handle_pia_search_content_gao, {"query": "test"}),
+        (handle_pia_search_content_oig, {"query": "test"}),
+        (handle_pia_search_content_crs, {"query": "test"}),
+        (handle_pia_search_content_doj, {"query": "test"}),
+        (handle_pia_search_content_congress, {"query": "test"}),
+        (handle_pia_search_content_executive_orders, {"query": "test"}),
+        (handle_fetch, {"id": "test-123"}),
+    ]
+
+    for tool_handler, args in tools_to_test:
+        with patch.object(Settings, "_get_api_key_from_args", return_value=None):
+            with patch.dict(
+                os.environ, {}, clear=True
+            ):  # Clear all environment variables
+                result = await tool_handler(args)
+                assert len(result) == 1
+                assert "PIA API key is required" in result[0].text
+
+
+@pytest.mark.asyncio
+async def test_agency_tools_http_error():
+    """Test agency-specific tools with HTTP error."""
+    tools_to_test = [
+        (handle_pia_search_content_gao, {"query": "test"}),
+        (handle_pia_search_content_oig, {"query": "test"}),
+        (handle_pia_search_content_crs, {"query": "test"}),
+        (handle_pia_search_content_doj, {"query": "test"}),
+        (handle_pia_search_content_congress, {"query": "test"}),
+        (handle_pia_search_content_executive_orders, {"query": "test"}),
+        (handle_fetch, {"id": "test-123"}),
+    ]
+
+    for tool_handler, args in tools_to_test:
+        with patch.object(Settings, "_get_api_key_from_args", return_value="test_key"):
+            with patch("httpx.AsyncClient") as mock_client:
+                mock_client_instance = AsyncMock()
+                mock_client_instance.post.side_effect = httpx.HTTPStatusError(
+                    "Server Error",
+                    request=Mock(),
+                    response=Mock(status_code=500, text="Server Error"),
+                )
+                mock_client.return_value.__aenter__.return_value = mock_client_instance
+
+                result = await tool_handler(args)
+
+                assert len(result) == 1
+                assert "HTTP Error 500" in result[0].text
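
Review note: the success tests in this diff repeat the same three-step mock setup (response object, client instance, `__aenter__` wiring). The sketch below shows how that setup could be factored into one helper, using only the standard library. The names `make_mock_client_factory` and `forward`, the `https://example.invalid/mcp` URL, and the `tools/call` request shape are all hypothetical stand-ins chosen for illustration; they are not taken from `pia_mcp_server`, whose `_forward_to_remote` may behave differently.

```python
import asyncio
import json
from unittest.mock import AsyncMock, MagicMock, Mock


def make_mock_client_factory(payload):
    """Return (factory, client) mimicking `async with Client() as client`.

    Hypothetical helper: mirrors the mock wiring used in the tests above,
    but is not part of the project under review.
    """
    response = Mock()
    response.json.return_value = payload
    response.raise_for_status.return_value = None

    client = AsyncMock()
    client.post.return_value = response  # `await client.post(...)` -> response

    # MagicMock (unlike Mock) supports configuring __aenter__/__aexit__.
    factory = MagicMock()
    factory.return_value.__aenter__.return_value = client
    return factory, client


async def forward(client_factory, tool_name, arguments):
    """Hypothetical stand-in for a remote-forwarding helper (assumed shape)."""
    async with client_factory() as client:
        response = await client.post(
            "https://example.invalid/mcp",  # placeholder URL, an assumption
            json={
                "jsonrpc": "2.0",
                "id": 1,
                "method": "tools/call",  # assumed request shape
                "params": {"name": tool_name, "arguments": arguments},
            },
        )
        response.raise_for_status()
        return json.dumps(response.json()["result"])


def run_example():
    """Exercise the helper with a payload shaped like the test fixtures."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"documents": [{"title": "GAO Report"}], "total": 1},
    }
    factory, client = make_mock_client_factory(payload)
    text = asyncio.run(
        forward(factory, "pia_search_content_gao", {"query": "audit"})
    )
    return text, client
```

In the real test module, a helper like this would be combined with `patch("httpx.AsyncClient", factory)` so each success test shrinks to a payload, one handler call, and its assertions.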