- description: 'Search for datasets by topic, domain, or content in DataFair. Use simple French keywords (not full sentences). Returns a preview with essential metadata: a list of datasets containing ID, title, description, and link to the source URL that must be included in responses. Then use describe_dataset to get detailed metadata. Examples: "élus", "DPE", "entreprises"',
+ description: 'Full-text search for datasets in DataFair. Uses French keywords to search across dataset titles, descriptions, and metadata (not full sentences). Returns a preview with essential metadata: a list of datasets containing ID, title, description, and link to the source URL that must be included in responses. Then use describe_dataset to get detailed metadata.',
  inputSchema: {
- query: z.string().min(3,'Search term must be at least 3 characters long').describe('Search terms in French (simple keywords, not sentences). Examples: "élus", "DPE", "entreprises"')
+ query: z.string().min(3,'Search term must be at least 3 characters long').describe('French keywords for full-text search across dataset titles, descriptions, and metadata (simple keywords, not sentences). Examples: "élus", "DPE", "entreprises", "logement social"')
  },
  outputSchema: {
- totalCount: z.number().describe('Total number of datasets matching the search criteria'),
+ totalCount: z.number().describe('Total number of datasets matching the full-text search criteria'),
  datasets: z.array(
  z.object({
  id: z.string().describe('Unique dataset ID (required for describe_dataset and search_data tools)'),
  title: z.string().describe('Dataset title'),
  description: z.string().optional().describe('A markdown description of the dataset content'),
  source: z.string().describe('Direct URL to the dataset page (must be included in AI responses as citation source)'),
  })
- ).describe('Array of datasets matching the search criteria (top 10 results)')
+ ).describe('Array of datasets matching the full-text search criteria (top 10 results)')
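For review purposes, the search_datasets contract touched by this hunk can be sketched in plain TypeScript. This is only an illustration, not the PR's code: it swaps zod for hand-written types and a simple length check so that it runs without dependencies, and it assumes the `min(3)` rule is a check on string length. The names `DatasetPreview`, `SearchDatasetsOutput`, `checkDatasetQuery`, and `formatDatasets` are hypothetical.

```typescript
// Hypothetical plain-TypeScript mirror of the search_datasets schemas above.
// The real tool uses zod; these types only restate the documented fields.
interface DatasetPreview {
  id: string;            // required by describe_dataset and search_data
  title: string;
  description?: string;  // optional markdown description
  source: string;        // URL that must be cited in responses
}

interface SearchDatasetsOutput {
  totalCount: number;
  datasets: DatasetPreview[]; // top 10 results according to the schema
}

const MIN_QUERY_LENGTH = 3;

// Returns the schema's error message, or null when the query is long enough.
function checkDatasetQuery(query: string): string | null {
  return query.length >= MIN_QUERY_LENGTH
    ? null
    : 'Search term must be at least 3 characters long';
}

// Formats each result so that its source URL always travels with it.
function formatDatasets(output: SearchDatasetsOutput): string[] {
  return output.datasets.map(
    (d) => `${d.title} (${d.id}) - source: ${d.source}`
  );
}
```

A caller would run `checkDatasetQuery` on keywords such as "DPE" before searching, then pass the tool output through `formatDatasets` so the citation requirement stated in the `source` description is met.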
- * Tool to search for specific data rows within a dataset.
- * This tool allows users to search for data within a specific dataset using simple French keywords.
- * It returns matching rows with their relevance scores and provides a direct link to view
- * the filtered results in the dataset's table interface.
- * Use this after describe_dataset to understand the dataset structure.
+ * Tool to search for specific data rows within a dataset using either full-text search OR precise filters.
+ * This tool can search data in two ways:
+ * 1) Full-text search across all columns using keywords (quick and broad search)
+ * 2) Precise filtering on specific columns with exact matches, comparisons, or column-specific searches (ideal for structured queries)
+ *
+ * Returns matching rows with their relevance scores and provides a direct link to view the filtered results in the dataset's table interface.
+ * Use this after describe_dataset to understand the dataset structure and column keys.
  * @param {string} datasetId - The unique ID of the dataset to search in (obtained from search_datasets)
- * @param {string} query - Simple French keywords to search for within the dataset data
+ * @param {string} query - French keywords for full-text search across all dataset columns
+ * @param {string} select - Optional comma-separated list of column keys to reduce output size
+ * @param {Object} filters - Optional precise filters on specific columns (alternative to query)
  */
  server.registerTool(
  'search_data',
  {
  title: 'Search data from a dataset',
- description: 'Search for data rows within a specific dataset using simple French keywords. Returns matching rows with relevance scores and a direct link to view filtered results in the dataset table interface. Always include dataset licenseand source information when presenting results to users. Use describe_dataset first to understand the data structure.',
+ description: 'Search for data rows in a specific dataset using either : - Full-text search across all columns (query) for quick, broad matches, - Precise filtering (filters) to apply exact conditions, comparisons, or column-specific searches. Use filters whenever your question involves multiple criteria or numerical/date ranges, as they yield more relevant and targeted results. The query parameter is better suited for simple, one-keyword searches across the entire dataset. Returns matching rows with relevance scores and a direct link to view filtered results in the dataset table interface. Always include dataset license, direct link and source information when presenting results to users. Use describe_dataset first to understand the data structure and available column keys.',
  inputSchema: {
- datasetId: z.string().describe('The unique dataset ID obtained from search_datasets'),
- query: z.string().min(1,'Search query cannot be empty').describe('Simple French keywords to search within the dataset (not full sentences). Examples: "Jean Dupont", "Paris"'),
+ datasetId: z.string().describe('The unique dataset ID obtained from search_datasets tool'),
+ query: z.string().optional().describe('French keywords for full-text search across all dataset columns (simple keywords, not sentences). Do not use with filters parameter. Examples: "Jean Dupont", "Paris", "2025"'),
+ select: z.string().optional().describe('Optional comma-separated list of specific column keys to include in the results. Useful when the dataset has many columns to reduce output size. If not provided, all columns are returned. Use column keys from describe_dataset. Example: "nom,age,ville"'),
+ .describe('Precise filters on specific columns. Ideal for multi-condition queries or range searches. Each filter key must be: column_key + suffix. Available suffixes: _eq (strictly equal - exact match), _search (full-text search within that column), _gte (greater than or equal), _lte (less than or equal). Use column keys from describe_dataset. Example: { "nom_search": "Jean", "age_lte": "30", "ville_eq": "Paris" } searches for people whose names contain "Jean", who are 30 years old or younger, and who live in Paris.')
  },
  outputSchema: {
- totalCount: z.number().describe('Total number of data rows matching the search criteria'),
+ totalCount: z.number().describe('Total number of data rows matching the search criteria and filters'),
  datasetId: z.string().describe('The dataset ID that was searched'),
- searchQuery: z.string().describe('The search query that was used'),
- sourceUrl: z.string().describe('Direct URL to view the filtered dataset results in table format (for citation and direct access to filtered view)'),
+ sourceUrl: z.string().describe('Direct URL to view the filtered dataset results in table format (must be included in responses for citation and direct access to filtered view)'),
  lines: z.array(
- z.record(z.any()).describe('Data row object with column keys and values, plus _score field indicating relevance')
- ).describe('Array of matching data rows (top 10 results). Each row contains dataset columns plus _score for search relevance')
+ z.record(z.any()).describe('Data row object containing column keys as object keys with their values, plus _score field indicating search relevance (higher score = more relevant)')
+ ).describe('Array of matching data rows (top 10 results). Each row contains dataset columns (using column keys) plus _score field for search relevance ranking')
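The filter-key convention described above (column_key + one of `_eq`, `_search`, `_gte`, `_lte`) can be illustrated with a small dependency-free TypeScript sketch. It is not taken from the PR: `parseFilterKey`, `parseSearchDataInput`, and the `SearchDataInput` type are hypothetical names, filter values are assumed to be strings as in the schema's example, and rejecting a call that sets both `query` and `filters` is one possible reading of "Do not use with filters parameter", not confirmed server behaviour.

```typescript
// Suffixes listed in the filters description of this diff.
const FILTER_SUFFIXES = ['_eq', '_search', '_gte', '_lte'] as const;
type FilterSuffix = (typeof FILTER_SUFFIXES)[number];

interface ParsedFilter {
  column: string;       // column key as returned by describe_dataset
  suffix: FilterSuffix; // operator suffix
  value: string;
}

// Splits "code_postal_eq" into column "code_postal" and suffix "_eq".
// Matching on the end of the key keeps underscores inside column keys intact.
function parseFilterKey(key: string, value: string): ParsedFilter {
  for (const suffix of FILTER_SUFFIXES) {
    const endsWithSuffix = key.slice(-suffix.length) === suffix;
    if (endsWithSuffix && key.length > suffix.length) {
      return { column: key.slice(0, -suffix.length), suffix, value };
    }
  }
  throw new Error(
    `Invalid filter key "${key}": expected column_key + one of ${FILTER_SUFFIXES.join(', ')}`
  );
}

// Hypothetical input shape; the string-valued filters are an assumption.
interface SearchDataInput {
  datasetId: string;
  query?: string;
  select?: string;
  filters?: Record<string, string>;
}

// Assumption: query and filters are treated as mutually exclusive.
function parseSearchDataInput(input: SearchDataInput): ParsedFilter[] {
  const filters: Record<string, string> = input.filters ?? {};
  const keys = Object.keys(filters);
  if (input.query && keys.length > 0) {
    throw new Error('Use either query or filters, not both');
  }
  return keys.map((key) => parseFilterKey(key, filters[key]));
}
```

With the schema's own example, `{ "nom_search": "Jean", "age_lte": "30", "ville_eq": "Paris" }`, the sketch yields three parsed filters on the columns `nom`, `age`, and `ville`.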