Commit 9265e81

Merge pull request #5 from algolia/enhance/algolia-mcp-skill-improvements
feat: enhance algolia-mcp skill with evals and improved docs 🔬
2 parents e7d04a6 + 7f72647 commit 9265e81

3 files changed: 267 additions and 2 deletions

skills/algolia-mcp/SKILL.md

Lines changed: 72 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -53,14 +53,84 @@ For clients that don't support commands, see [connection-setup](references/conne
53 53
| Trending facets | `algolia_recommendations` | `trending-facets` |
54 54
| Visually similar items | `algolia_recommendations` | `looking-similar` |
55 55

56-
## Required workflow
56+
## Search Filter Syntax
57+
58+
Filters go in the `algolia_search_index` call alongside `query`:
59+
60+
**facetFilters** (array-based):
61+
```
62+
[["color:red", "color:blue"]] → OR (red OR blue)
63+
[["brand:Nike"], ["category:running"]] → AND (Nike AND running)
64+
[["size:10"], ["color:red", "color:blue"]] → mixed (size 10 AND (red OR blue))
65+
```
66+
Each inner array is OR'd; outer arrays are AND'd.
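
To make the OR/AND nesting concrete, here is a minimal Python sketch that evaluates a `facetFilters` value against one record locally. It only illustrates the semantics described above. `matches_facet_filters` is a hypothetical helper, not part of Algolia or the MCP server, and it assumes each record attribute holds a single value and each filter has the form `attribute:value`.

```python
def matches_facet_filters(record, facet_filters):
    """Return True if `record` satisfies `facet_filters`.

    Illustrative only: the outer list is AND'd, each inner list is OR'd.
    Assumes every filter string looks like "attribute:value".
    """
    def matches_one(filter_string):
        attribute, _, value = filter_string.partition(":")
        return str(record.get(attribute)) == value

    return all(
        any(matches_one(f) for f in or_group)
        for or_group in facet_filters
    )


shoe = {"brand": "Nike", "color": "blue", "size": "10"}

# OR inside one inner array: blue matches "red OR blue".
print(matches_facet_filters(shoe, [["color:red", "color:blue"]]))  # True

# AND across inner arrays: size matches, but the color group does not.
print(matches_facet_filters(shoe, [["size:10"], ["color:red", "color:green"]]))  # False
```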
67+
68+
**numericFilters** (string-based):
69+
```
70+
["price < 100"] → single condition
71+
["price >= 50", "price <= 200"] → range (AND'd)
72+
```
73+
74+
**Date filtering**: Dates must be stored as Unix timestamps. Use `numericFilters: ["timestamp >= 1704067200"]`.
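
A short Python sketch of the date conversion, assuming dates are interpreted as midnight UTC and stored as whole seconds. The helper names are hypothetical, and `timestamp` is just the example attribute name used above.

```python
from datetime import datetime, timezone


def to_unix_timestamp(year, month, day):
    """Midnight UTC on the given date, as whole seconds since the epoch."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def date_filter(attribute, operator, year, month, day):
    """Build one numericFilters string such as 'timestamp >= 1704067200'."""
    return f"{attribute} {operator} {to_unix_timestamp(year, month, day)}"


print(date_filter("timestamp", ">=", 2024, 1, 1))  # timestamp >= 1704067200
```

If the index stores local-time or millisecond timestamps instead, the conversion has to change to match.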
75+
76+
**Attribute selection**: Use `attributesToRetrieve: ["name", "price"]` to limit response size.
77+
78+
## Analytics Key Details
79+
80+
- **`clickAnalytics: true`**: Set this on `algolia_analytics_top_searches` or `algolia_analytics_top_search_results` to include CTR, conversion rate, and click count. Only these two tools support it.
81+
- **`revenueAnalytics: true`**: Set on the same tools to also include add-to-cart rate, purchase rate, and revenue.
82+
- **Data delay**: Recent data has a 1–4 hour processing delay. Use date ranges ending at least 4 hours ago for complete data.
83+
84+
### Interpreting Results
85+
86+
| No-results rate | Assessment |
87+
|----------------|------------|
88+
| < 5% | Excellent |
89+
| 5–10% | Good |
90+
| 10–20% | Needs improvement |
91+
| > 20% | Poor |
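
The table can be read as a simple lookup. The Python sketch below is one way to encode it. The function name is hypothetical, the input is assumed to be a percentage (0 to 100) rather than a 0 to 1 fraction, and the handling of the exact boundaries 10% and 20% is an assumption, since the table leaves them ambiguous.

```python
def assess_no_results_rate(rate_percent):
    """Map a no-results rate (in percent) to the assessment in the table.

    Boundary assumption: exactly 10% and exactly 20% both count as
    "Needs improvement"; the table does not say which bucket owns them.
    """
    if rate_percent < 5:
        return "Excellent"
    if rate_percent < 10:
        return "Good"
    if rate_percent <= 20:
        return "Needs improvement"
    return "Poor"


print(assess_no_results_rate(3.2))  # Excellent
print(assess_no_results_rate(18))   # Needs improvement
```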
92+
93+
**Click positions**: Healthy = 30–40% of clicks at position 1, decreasing through 10. Even distribution = poor relevance. Concentrated at positions 5–10 = ranking issues.
94+
95+
**Low CTR + high search volume** = poor result relevance. Common causes: missing synonyms, content gaps, mismatched query intent.
96+
97+
## Recommendation Thresholds
98+
99+
| Threshold | Behavior |
100+
|-----------|----------|
101+
| 50 | More results, lower relevance |
102+
| **60** | **Balanced (good default)** |
103+
| 75 | Fewer results, higher relevance |
104+
105+
**Model parameter requirements**:
106+
- `bought-together`, `related-products`, `looking-similar` → require `objectID`
107+
- `trending-items` → does NOT require `objectID`. Use `facetName` + `facetValue` to filter by category
108+
- `trending-facets` → requires `facetName`
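
As a rough illustration of these requirements, the Python sketch below checks a set of `algolia_recommendations` arguments before a call is made. `check_recommendation_args` is a hypothetical helper and the argument dictionary shape is an assumption. Only the model names and the `objectID`, `facetName`, and `facetValue` parameters come from the list above.

```python
MODELS_REQUIRING_OBJECT_ID = {"bought-together", "related-products", "looking-similar"}


def check_recommendation_args(model, args):
    """Return a list of problems found in `args` for the given model."""
    problems = []
    if model in MODELS_REQUIRING_OBJECT_ID and "objectID" not in args:
        problems.append(f"{model} requires objectID")
    if model == "trending-items" and "objectID" in args:
        problems.append("trending-items does not need objectID; use facetName + facetValue")
    if model == "trending-facets" and "facetName" not in args:
        problems.append("trending-facets requires facetName")
    return problems


print(check_recommendation_args("bought-together", {"threshold": 60}))
# ['bought-together requires objectID']
print(check_recommendation_args(
    "trending-items", {"facetName": "category", "facetValue": "shoes", "threshold": 60}
))
# []
```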
109+
110+
## Required Workflow
57 111

58 112
1. **Discover first**: Always call `algolia_search_list_indices` before other tools to resolve `applicationId` and `indexName`. The `applicationId` parameter is an enum — select from the values in the tool schema, never guess.
59 113
2. **Index names are case-sensitive**: Use the exact name returned by `algolia_search_list_indices`.
60 114
3. **Date parameters**: Analytics tools accept `startDate` and `endDate` in `YYYY-MM-DD` format. Default period is the last 8 days.
61 115
4. **Permissions**: Not all tools are available to every user. Analytics tools require the Analytics permission; recommendations require the Recommend feature.
62 116

63-
## Reference docs
117+
## Common Workflows
118+
119+
### Search Quality Audit
120+
1. `algolia_search_list_indices` → get applicationId and index name
121+
2. `algolia_analytics_no_results_rate` → check overall health (< 5% is excellent)
122+
3. `algolia_analytics_searches_no_results` → find the specific failing queries
123+
4. `algolia_analytics_top_searches` with `clickAnalytics: true` → find high-volume queries with low CTR
124+
5. `algolia_analytics_click_positions` → check if clicks are concentrated at position 1 (good) or spread evenly (poor relevance)
125+
6. For each problematic query: `algolia_search_index` with that query to see what results look like
126+
127+
### Recommendation Setup Check
128+
1. `algolia_search_list_indices` → resolve applicationId
129+
2. Start with `trending-items` (requires least data) to verify Recommend is working
130+
3. Then try `bought-together` or `related-products` with a known product objectID
131+
4. If results are empty, check event volume requirements in [recommendations reference](references/recommendations.md)
132+
133+
## Reference Docs
64 134

65 135
- [connection-setup](references/connection-setup.md) — MCP server configuration and authentication
66 136
- [search](references/search.md) — Search parameters, filter syntax (`facetFilters`, `numericFilters`), pagination
Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
1+
# Algolia MCP Skill — Evaluation Results
2+
3+
Evaluation performed on 2026-03-18 using Claude Opus 4.6 (1M context).
4+
5+
## Summary
6+
7+
The skill was evaluated across 5 realistic user scenarios, comparing **with-skill** (Claude reads the skill before responding) vs **without-skill** (Claude relies on general knowledge).
8+
9+
| Eval | With Skill | Without Skill | Delta (percentage points) |
10+
|------|:----------:|:-------------:|:-----:|
11+
| **Eval 1** — Search with filters | 100% (6/6) | 17% (1/6) | **+83%** |
12+
| **Eval 2** — Analytics report | 100% (6/6) | 33% (2/6) | **+67%** |
13+
| **Eval 3** — Recommendations | 100% (6/6) | 33% (2/6) | **+67%** |
14+
| **Eval 4** — Multi-step investigation | 100% (6/6) | 17% (1/6) | **+83%** |
15+
| **Eval 5** — Date filtering + pagination | 100% (6/6) | 33% (2/6) | **+67%** |
16+
| **Average** | **100%** | **27%** | **+73%** |
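
The summary row can be reproduced from the per-eval pass counts. The Python sketch below pools all 30 assertions per configuration. The variable and function names are illustrative, and the counts are copied from the table above.

```python
# (with-skill passes, without-skill passes) per eval, each out of 6 assertions.
PASSES = {
    "Eval 1": (6, 1),
    "Eval 2": (6, 2),
    "Eval 3": (6, 2),
    "Eval 4": (6, 1),
    "Eval 5": (6, 2),
}
ASSERTIONS_PER_EVAL = 6


def percent(passed, total):
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)


def summarize(passes, per_eval):
    """Return (with-skill %, without-skill %, delta in percentage points)."""
    total = per_eval * len(passes)
    with_skill = percent(sum(w for w, _ in passes.values()), total)
    without_skill = percent(sum(wo for _, wo in passes.values()), total)
    return with_skill, without_skill, with_skill - without_skill


print(summarize(PASSES, ASSERTIONS_PER_EVAL))  # (100, 27, 73)
```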
17+
18+
## Eval Details
19+
20+
### Eval 1: Search with Filters
21+
22+
**Prompt:** *"I want to search my 'products' index for shoes under $100 in either red or blue. Show me only the name, price, and color fields. Also, what are the available facet values for the 'brand' attribute?"*
23+
24+
| Assertion | Without Skill | With Skill |
25+
|-----------|:---:|:---:|
26+
| Calls `algolia_search_list_indices` first | FAIL | PASS |
27+
| Uses `facetFilters` with OR syntax `[["color:red", "color:blue"]]` | FAIL | PASS |
28+
| Uses `numericFilters` with string syntax `["price < 100"]` | FAIL | PASS |
29+
| Combines facetFilters AND numericFilters in same call | FAIL | PASS |
30+
| Sets `attributesToRetrieve` to `["name", "price", "color"]` | PASS | PASS |
31+
| Uses `algolia_search_for_facet_values` for brand | FAIL | PASS |
32+
33+
**Key finding:** Without the skill, Claude used a generic `algolia_search` tool with a combined `filters` string instead of the MCP-specific `facetFilters`/`numericFilters` array parameters. It also used the `facets` parameter instead of the dedicated `algolia_search_for_facet_values` tool.
34+
35+
### Eval 2: Analytics Report
36+
37+
**Prompt:** *"Give me a search quality report for my 'ecommerce' index over the last 30 days — I want to know the no-results rate, top searches that have no clicks, and the click position distribution. Include click-through rates where possible."*
38+
39+
| Assertion | Without Skill | With Skill |
40+
|-----------|:---:|:---:|
41+
| Calls `algolia_search_list_indices` first | FAIL | PASS |
42+
| Uses `algolia_analytics_no_results_rate` with correct dates | FAIL | PASS |
43+
| Uses `algolia_analytics_top_searches_without_clicks` | FAIL | PASS |
44+
| Uses `algolia_analytics_click_positions` | FAIL | PASS |
45+
| Sets `clickAnalytics: true` on a supported tool | PASS | PASS |
46+
| Does NOT use algolia-cli commands | PASS | PASS |
47+
48+
**Key finding:** The baseline fabricated all tool names using camelCase (`algolia_getNoResultsRate`, `algolia_getClickThroughRate`) instead of the actual snake_case MCP tool names (`algolia_analytics_no_results_rate`). It also skipped the discovery step entirely.
49+
50+
### Eval 3: Recommendations
51+
52+
**Prompt:** *"For product ID 'SKU-1234' in my 'catalog' index, show me frequently bought together items and related products. Also show me what's trending in the 'shoes' category. Use a balanced relevance threshold."*
53+
54+
| Assertion | Without Skill | With Skill |
55+
|-----------|:---:|:---:|
56+
| Calls `algolia_search_list_indices` first | FAIL | PASS |
57+
| Uses `algolia_recommendations` with `bought-together` + objectID | FAIL | PASS |
58+
| Uses `algolia_recommendations` with `related-products` + objectID | FAIL | PASS |
59+
| Uses `trending-items` with `facetName`/`facetValue` | FAIL | PASS |
60+
| Sets threshold to 60 (balanced default) | PASS | PASS |
61+
| Does NOT pass objectID for trending-items | PASS | PASS |
62+
63+
**Key finding:** The baseline guessed the tool name as `algolia_get_recommendations` (wrong) and used threshold 50 instead of the documented balanced default of 60, which the table still records as a PASS (the eval definition accepts 60 "or close"). It also used `facetFilters` instead of the dedicated `facetName`/`facetValue` parameters for trending-items.
64+
65+
### Eval 4: Multi-Step Investigation (harder)
66+
67+
**Prompt:** *"Our 'ecommerce' index has a no-results rate of 18%. I need to find the specific queries that are failing, then for the top 3 failing queries, actually run those searches to see what results come back. Also check if our click-through rates have been improving — compare the last 7 days vs the previous 7 days."*
68+
69+
| Assertion | Without Skill | With Skill |
70+
|-----------|:---:|:---:|
71+
| Calls `algolia_search_list_indices` first | FAIL | PASS |
72+
| Uses `algolia_analytics_searches_no_results` | FAIL | PASS |
73+
| Uses `algolia_search_index` to test failing queries | FAIL | PASS |
74+
| Uses `algolia_analytics_top_searches` with `clickAnalytics: true` for BOTH date ranges | FAIL | PASS |
75+
| Sets `clickAnalytics: true` on `algolia_analytics_top_searches` specifically | FAIL | PASS |
76+
| Uses correct YYYY-MM-DD date format | PASS | PASS |
77+
78+
**Key finding:** The baseline invented a non-existent `algolia_getClickThroughRate` endpoint instead of using `algolia_analytics_top_searches` with `clickAnalytics: true`. The skill's Search Quality Audit workflow guided the correct multi-step approach. The with-skill run also accounted for the 1–4 hour data processing delay by ending its date ranges on the previous day.
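
For the two-period comparison, the date arithmetic might look like the Python sketch below. It assumes both ranges are inclusive, that the most recent range ends yesterday (to stay clear of the processing delay), and that the result is passed as `startDate`/`endDate` in `YYYY-MM-DD` format. `comparison_ranges` is a hypothetical helper.

```python
from datetime import date, timedelta


def comparison_ranges(today, days=7):
    """Two back-to-back inclusive ranges of `days` days, newest ending yesterday."""
    end_current = today - timedelta(days=1)
    start_current = end_current - timedelta(days=days - 1)
    end_previous = start_current - timedelta(days=1)
    start_previous = end_previous - timedelta(days=days - 1)
    return {
        "current": {"startDate": start_current.isoformat(), "endDate": end_current.isoformat()},
        "previous": {"startDate": start_previous.isoformat(), "endDate": end_previous.isoformat()},
    }


ranges = comparison_ranges(date(2026, 3, 18))
print(ranges["current"])   # {'startDate': '2026-03-11', 'endDate': '2026-03-17'}
print(ranges["previous"])  # {'startDate': '2026-03-04', 'endDate': '2026-03-10'}
```

Each range would then go into its own `algolia_analytics_top_searches` call with `clickAnalytics: true`.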
79+
80+
### Eval 5: Date Filtering + Pagination (harder)
81+
82+
**Prompt:** *"Search my 'events' index for all conferences happening after January 1st 2025. The date is stored as a Unix timestamp field called 'event_date'. Filter to only events in 'technology' or 'science' categories with a ticket price between $50 and $500. Show me page 3 with 20 results per page."*
83+
84+
| Assertion | Without Skill | With Skill |
85+
|-----------|:---:|:---:|
86+
| Calls `algolia_search_list_indices` first | FAIL | PASS |
87+
| Uses numericFilters with Unix timestamp (1735689600) | FAIL | PASS |
88+
| Uses facetFilters with OR syntax for categories | PASS | PASS |
89+
| Uses numericFilters for price range | FAIL | PASS |
90+
| Combines all filters in a single `algolia_search_index` call | FAIL | PASS |
91+
| Sets page to 2 (0-indexed) and hitsPerPage to 20 | PASS | PASS |
92+
93+
**Key finding:** The baseline used the wrong tool name (`algolia_search`), guessed the field name as `ticket_price` instead of `price`, and used `>` instead of `>=` for the date filter. The skill's explicit Unix timestamp guidance and filter syntax examples prevented all these mistakes.
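
Put together, the arguments for the single `algolia_search_index` call could be assembled as in the Python sketch below. The filter values and the `page`/`hitsPerPage` conversion follow the eval's expectations. The helper name, the `"conference"` query string, and the exact argument dictionary shape are assumptions, and `applicationId` must come from `algolia_search_list_indices` rather than being guessed.

```python
def build_event_search_args(application_id, index_name, human_page, hits_per_page):
    """Assemble arguments for one algolia_search_index call (illustrative shape)."""
    return {
        "applicationId": application_id,  # resolved via algolia_search_list_indices
        "indexName": index_name,
        "query": "conference",            # assumed query text for "conferences"
        # One inner array: technology OR science.
        "facetFilters": [["category:technology", "category:science"]],
        # All numeric conditions are AND'd: date lower bound plus price range.
        "numericFilters": ["event_date >= 1735689600", "price >= 50", "price <= 500"],
        # Pages are 0-indexed, so "page 3" for a person is page 2 here.
        "page": human_page - 1,
        "hitsPerPage": hits_per_page,
    }


args = build_event_search_args("YOUR_APP_ID", "events", human_page=3, hits_per_page=20)
print(args["page"], args["hitsPerPage"])  # 2 20
```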
94+
95+
## What the Skill Adds
96+
97+
The biggest areas where the skill outperforms general knowledge:
98+
99+
1. **Correct MCP tool names** — Every baseline fabricated plausible but wrong tool names (camelCase vs snake_case, missing `analytics_` prefix, wrong base names)
100+
2. **Discovery workflow**: `algolia_search_list_indices` as mandatory first step (every baseline skipped it)
101+
3. **`clickAnalytics: true`** — Knowing this flag exists and which tools support it (`top_searches`, `top_search_results` only)
102+
4. **Filter syntax**: `facetFilters` array-based OR/AND vs `numericFilters` string-based format
103+
5. **Recommendation parameters** — Which models need `objectID` vs `facetName`/`facetValue`, and threshold guidance
104+
6. **Multi-step workflows** — Search Quality Audit pattern: analytics → identify problems → search to diagnose
105+
106+
## Improvements Made
107+
108+
1. **Surfaced filter syntax** (facetFilters OR/AND, numericFilters strings) from reference into main SKILL.md
109+
2. **Surfaced `clickAnalytics: true`** guidance with which tools support it
110+
3. **Surfaced recommendation thresholds** (50/60/75) and model parameter requirements table
111+
4. **Added analytics interpretation benchmarks** (no-results rate thresholds, click position patterns)
112+
5. **Added Common Workflows** section (Search Quality Audit, Recommendation Setup Check)
113+
6. **Added algolia-cli cross-reference** for write operations
114+
115+
## Reproducibility
116+
117+
- Model: Claude Opus 4.6 (1M context)
118+
- Eval definitions: `evals/evals.json`
119+
- Date: 2026-03-18
120+
- Each eval was run once per configuration
Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
1+
{
2+
"skill_name": "algolia-mcp",
3+
"evals": [
4+
{
5+
"id": 1,
6+
"prompt": "I want to search my 'products' index for shoes under $100 in either red or blue. Show me only the name, price, and color fields. Also, what are the available facet values for the 'brand' attribute?",
7+
"expected_output": "Correct MCP tool calls with proper filter syntax, attribute selection, and facet exploration",
8+
"files": [],
9+
"expectations": [
10+
"Calls algolia_search_list_indices first to discover applicationId and verify index name",
11+
"Uses algolia_search_index with facetFilters using OR syntax for colors: [[\"color:red\", \"color:blue\"]]",
12+
"Uses numericFilters with string syntax: [\"price < 100\"]",
13+
"Combines facetFilters AND numericFilters in the same search call",
14+
"Sets attributesToRetrieve to [\"name\", \"price\", \"color\"] to limit response",
15+
"Uses algolia_search_for_facet_values to explore the 'brand' attribute"
16+
]
17+
},
18+
{
19+
"id": 2,
20+
"prompt": "Give me a search quality report for my 'ecommerce' index over the last 30 days — I want to know the no-results rate, top searches that have no clicks, and the click position distribution. Include click-through rates where possible.",
21+
"expected_output": "Multiple analytics tool calls with correct date params and clickAnalytics enabled",
22+
"files": [],
23+
"expectations": [
24+
"Calls algolia_search_list_indices first to resolve applicationId",
25+
"Uses algolia_analytics_no_results_rate with startDate and endDate in YYYY-MM-DD format spanning 30 days",
26+
"Uses algolia_analytics_top_searches_without_clicks for searches with no clicks",
27+
"Uses algolia_analytics_click_positions for click position distribution",
28+
"Sets clickAnalytics: true on at least one analytics call to include CTR data",
29+
"Does NOT use algolia-cli commands — this is a read-only analytics task"
30+
]
31+
},
32+
{
33+
"id": 3,
34+
"prompt": "For product ID 'SKU-1234' in my 'catalog' index, show me frequently bought together items and related products. Also show me what's trending in the 'shoes' category. Use a balanced relevance threshold.",
35+
"expected_output": "Three recommendation calls with correct model params and threshold",
36+
"files": [],
37+
"expectations": [
38+
"Calls algolia_search_list_indices first to resolve applicationId",
39+
"Uses algolia_recommendations with model 'bought-together' and objectID 'SKU-1234'",
40+
"Uses algolia_recommendations with model 'related-products' and objectID 'SKU-1234'",
41+
"Uses algolia_recommendations with model 'trending-items' with facetName and facetValue for shoes category",
42+
"Sets threshold to 60 (or close) as the balanced default",
43+
"Does NOT pass objectID for the trending-items call (trending-items does not require it)"
44+
]
45+
},
46+
{
47+
"id": 4,
48+
"prompt": "Our 'ecommerce' index has a no-results rate of 18%. I need to find the specific queries that are failing, then for the top 3 failing queries, actually run those searches to see what results come back (or don't). Also check if our click-through rates have been improving — compare the last 7 days vs the previous 7 days.",
49+
"expected_output": "Multi-step investigation workflow: analytics to find failing queries, then search to diagnose, plus two date-range comparisons",
50+
"files": [],
51+
"expectations": [
52+
"Calls algolia_search_list_indices first to resolve applicationId",
53+
"Uses algolia_analytics_searches_no_results to find specific failing queries",
54+
"Uses algolia_search_index to test the failing queries and see actual results",
55+
"Uses algolia_analytics_top_searches with clickAnalytics: true for BOTH date ranges (last 7 days AND previous 7 days) to compare CTR",
56+
"Sets clickAnalytics: true specifically on algolia_analytics_top_searches (not on a tool that doesn't support it)",
57+
"Uses correct YYYY-MM-DD date format for all date parameters"
58+
]
59+
},
60+
{
61+
"id": 5,
62+
"prompt": "Search my 'events' index for all conferences happening after January 1st 2025. The date is stored as a Unix timestamp field called 'event_date'. Filter to only events in 'technology' or 'science' categories with a ticket price between $50 and $500. Show me page 3 with 20 results per page.",
63+
"expected_output": "Search with Unix timestamp numericFilter, facetFilters, pagination, and correct date conversion",
64+
"files": [],
65+
"expectations": [
66+
"Calls algolia_search_list_indices first to resolve applicationId",
67+
"Uses numericFilters with Unix timestamp for date: event_date >= 1735689600 (or equivalent for Jan 1 2025)",
68+
"Uses facetFilters with OR syntax for categories: [[\"category:technology\", \"category:science\"]]",
69+
"Uses numericFilters for price range: [\"price >= 50\", \"price <= 500\"]",
70+
"Combines all three filter types (date, category, price) in a single algolia_search_index call",
71+
"Sets page to 2 (0-indexed, so page 3 = page parameter 2) and hitsPerPage to 20"
72+
]
73+
}
74+
]
75+
}
