feat: add sort parameter to search_datasets and expose column names in get_resource_info #19
Description
During a query session searching for EV charging infrastructure data in Paris, two gaps required stepping outside the MCP and calling the underlying APIs directly.
Sort
search_datasets has no sort parameter
The data.gouv.fr API supports ?sort=-created (and created, title, -title), but the MCP tool doesn't expose it. When keyword search returns hundreds of fragmented results, there's no way to surface the most recently updated datasets without leaving the MCP.
In our case, the national consolidated IRVE dataset (updated daily) was unreachable through the tool.
search_datasets("IRVE") returns 580 fragmented single-operator datasets with no way to sort by recency:
Found 580 dataset(s) for query: 'IRVE'
1. IRVE statique (organisation IRVE SIED70) — 1 resource
2. IRVE statique (organisation IRVE SIED70) — 1 resource
3. IRVE — Recharge Active Solutions — 1 resource
The national consolidated dataset (140MB, updated daily) does not appear.
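For illustration, a minimal sketch of how an optional sort value could be validated and passed through as a query parameter. The helper name `build_search_params` and the `q` parameter name are assumptions made for this example, not the MCP's actual code; the allowed values are only the four listed above.

```python
from urllib.parse import urlencode

# Sort values named in this issue as supported by the data.gouv.fr API.
ALLOWED_SORTS = {"created", "-created", "title", "-title"}


def build_search_params(query, sort=None):
    """Build search query parameters (hypothetical helper, not the real client)."""
    params = {"q": query}  # "q" as the keyword parameter name is an assumption
    if sort is not None:
        if sort not in ALLOWED_SORTS:
            raise ValueError(f"Unsupported sort value: {sort!r}")
        params["sort"] = sort
    return params


# Omitting sort leaves the request unchanged, so existing callers are unaffected.
print(urlencode(build_search_params("IRVE", sort="-created")))
# q=IRVE&sort=-created
```

Keeping the parameter optional and rejecting unknown values early would avoid forwarding a typo to the API and getting back an opaque HTTP error.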
Column Info
get_resource_info doesn't return column names
When using query_resource_data, the correct filter_column value isn't always obvious. The column name had to be discovered by calling the Tabular API /profile/ endpoint directly — the profile is already fetched inside get_resource_info but the column list is discarded.
get_resource_info confirms Tabular availability but gives no column information:
Tabular API availability:
✅ Available via Tabular API (large file exception)
Guessing filter_column: code_postal then produces:
❌ Tabular API error (HTTP 400) — URL: ...?code_postal__exact=75015
The actual column is consolidated_code_postal.
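A rough sketch of what surfacing the column list could look like. The payload shape used here (a top-level `profile` object with a `header` list, or a `columns` mapping as fallback) is an assumption about the profile response and should be checked against the real Tabular API output; the function names and the second sample column are hypothetical.

```python
def extract_column_names(profile_payload):
    """Return column names from a profile payload (assumed shape, see note above)."""
    profile = profile_payload.get("profile") or {}
    header = profile.get("header")
    if isinstance(header, list):
        return [str(name) for name in header]
    # Fallback: assume a mapping of column name -> column metadata.
    columns = profile.get("columns")
    if isinstance(columns, dict):
        return list(columns.keys())
    return []


def format_columns(names):
    """Render the column list as one line for the tool's text output."""
    if not names:
        return "Columns: (not available in profile)"
    return "Columns: " + ", ".join(names)


# Illustrative payload; only consolidated_code_postal comes from this issue.
sample = {"profile": {"header": ["consolidated_code_postal", "example_other_column"]}}
print(format_columns(extract_column_names(sample)))
# Columns: consolidated_code_postal, example_other_column
```

With this line in the get_resource_info output, the filter_column value for query_resource_data could be read off directly instead of guessed.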
Proposed changes
- Add optional sort param to search_datasets (passed through to datagouv_api_client and the API)
- Parse and return column names from the profile response already fetched in get_resource_info
A PR with tests is being opened alongside this issue.