This document provides comprehensive examples and guidance for using the Pinecone Assistant MCP effectively, with token optimization strategies and best practices.
Note: This guide uses USPTO patent examination as the reference implementation, but the patterns apply to any document corpus.
CRITICAL: The Pinecone Assistant API is stateless: it has no memory between requests, so any prior turns it should see must be resent in messages and count as input tokens again.
Only include history when the new question explicitly references prior context:
- Pronouns: "that", "this", "it", "those", "these"
- References: "as mentioned", "the above", "previously discussed"
- Continuation: "follow up", "expand on", "additionally"
Example requiring history:
{
"tool": "assistant_chat",
"arguments": {
"messages": [
{"role": "user", "content": "What is the Alice/Mayo framework?"},
{"role": "assistant", "content": "[Previous AI response]"},
{"role": "user", "content": "How does THAT framework apply to ML patents?"}
]
}
}
Do NOT include history when questions are independent:
Bad (wasteful):
{
"messages": [
{"role": "user", "content": "What is Section 101?"},
{"role": "assistant", "content": "[1000 token response]"},
{"role": "user", "content": "What is Section 103?"} // Independent question!
]
}
Good (efficient):
{
"messages": [
{"role": "user", "content": "What is Section 103?"} // Stateless
]
}
4-turn conversation comparison:
- Wasteful (full history): 130K input tokens total
- Smart (stateless when possible): 124K input tokens total
- Savings: 6K tokens = 3-5 additional chat queries from your lifetime allocation
Default pattern: Send single-message queries. Only add history if the new question literally cannot be understood without prior context.
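The default pattern above can be approximated in client code. The following is a minimal sketch, not part of the MCP: needs_history and build_messages are hypothetical helper names, and the keyword match is a crude heuristic built from the cue phrases listed earlier, so it will produce false positives (for example, "that" used as a conjunction).

```python
import re

# Cue phrases copied from the list above. The matching heuristic itself is an
# assumption for illustration, not behavior of the Pinecone Assistant MCP.
HISTORY_CUES = [
    "that", "this", "it", "those", "these",
    "as mentioned", "the above", "previously discussed",
    "follow up", "expand on", "additionally",
]
_CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(cue) for cue in HISTORY_CUES) + r")\b",
    re.IGNORECASE,
)


def needs_history(question: str) -> bool:
    """Return True if the question contains a cue that points at prior context."""
    return _CUE_PATTERN.search(question) is not None


def build_messages(question: str, history=None) -> list:
    """Build the messages array, attaching history only when a cue is present."""
    new_turn = {"role": "user", "content": question}
    if history and needs_history(question):
        return list(history) + [new_turn]
    return [new_turn]  # stateless single-message query (the default)
```

An independent question such as "What is Section 103?" produces a single-message array even when history is available, which is the cheaper path.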
Default: 0.2 (Low Temperature for Legal/USPTO Precision)
The MCP is configured with a low default temperature (0.2) optimized for patent law and legal analysis:
Why Low Temperature for USPTO/Legal Work:
- Accurate Citations: Prevents AI from inventing MPEP sections or case names
- Consistent Analysis: Same query yields consistent legal interpretations
- Factual Precision: Critical for exact statute/regulation references
- Risk Mitigation: Reduces hallucinations in legal advice
- IRAC Reliability: Structured analysis benefits from repeatable, near-deterministic responses
Temperature Guide:
| Temperature | Behavior | Best For | USPTO Suitability |
|---|---|---|---|
| 0.0 - 0.3 | Deterministic, precise | Legal citations, factual analysis | ✅ Recommended |
| 0.4 - 0.7 | Balanced | General research | |
| 0.8 - 2.0 | Creative, varied | Brainstorming, creative writing | ❌ Not recommended for legal work |
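The bands in the table can be expressed as a small lookup. This is an illustrative sketch only: temperature_band is a hypothetical helper, and assigning the gaps between the table's ranges (for example 0.35) to the next band up is an assumption.

```python
def temperature_band(temperature: float) -> str:
    """Map a temperature to the behavior band from the Temperature Guide table."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if temperature <= 0.3:
        return "deterministic"  # recommended for legal citations
    if temperature <= 0.7:      # assumption: values in (0.3, 0.4) count as balanced
        return "balanced"
    return "creative"           # not recommended for legal work


def recommended_for_legal_work(temperature: float) -> bool:
    """True only for the band the table marks as recommended."""
    return temperature_band(temperature) == "deterministic"
```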
Override Temperature:
{
"tool": "assistant_chat",
"arguments": {
"messages": [...],
"temperature": 0.3 // Override default for this specific query
}
}
Custom Domains: Set higher temperature for creative domains via environment variable:
export DEFAULT_TEMPERATURE=0.7 # For creative/brainstorming domains
Lifetime Token Allocation:
- Input tokens: 1.5M
- Context tokens: 500K
- Output tokens: 200K
Usage estimates (total lifetime, not per month — these do NOT reset):
- assistant_context: ~50+ total queries (context tokens)
- assistant_strategic_multi_search_context: ~20-30 total queries (context tokens)
- assistant_strategic_multi_search_chat: ~50-75 total queries (input tokens)
- assistant_chat (with smart history): ~50 total queries (input tokens)
- assistant_chat (with wasteful history): ~35 total queries (input tokens)
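The estimates above are simple division of a lifetime pool by an average per-query cost. The sketch below makes that arithmetic explicit. The pool sizes come from this guide; estimated_queries is a hypothetical helper, and the per-query costs used in the comments (about 10K context tokens, about 30K input tokens) are back-derived from the estimates rather than measured.

```python
# Lifetime pools as listed above (these do not reset).
LIFETIME_BUDGET = {"input": 1_500_000, "context": 500_000, "output": 200_000}


def estimated_queries(pool: str, tokens_per_query: int, already_used: int = 0) -> int:
    """Estimate how many more queries fit in a pool at a given average cost."""
    if tokens_per_query <= 0:
        raise ValueError("tokens_per_query must be positive")
    remaining = max(LIFETIME_BUDGET[pool] - already_used, 0)
    return remaining // tokens_per_query


# Assumed averages, for illustration only:
# estimated_queries("context", 10_000) gives 50 assistant_context queries
# estimated_queries("input", 30_000) gives 50 assistant_chat queries
```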
| Domain | Focus Area | Typical Use Cases | Search Patterns |
|---|---|---|---|
| section_101_eligibility | Patent eligibility analysis | Alice/Mayo rejections, abstract idea analysis | 4 searches |
| section_103_obviousness | Obviousness analysis | KSR factors, prior art combination | 4 searches |
| section_112_requirements | Specification compliance | Written description, enablement, definiteness | 4 searches |
| section_102_novelty | Prior art and anticipation | Novelty rejections, prior art dates | 3 searches |
| claim_construction | Claim interpretation | Phillips standard, prosecution history | 3 searches |
| ptab_procedures | Post-grant proceedings | IPR/PGR, PTAB claim construction | 3 searches |
| mechanical_patents | Mechanical inventions | Mechanical obviousness, manufacturing | 2 searches |
| software_patents | Software/AI inventions | Software eligibility, AI/ML patents | 2 searches |
| general_patent_law | General guidance | Broad patent law questions | 5 searches |
Domains are organized by legal issue rather than workflow stage, enabling:
- Direct issue targeting: Select domain matching your specific legal question
- Cross-reference capability: Combine multiple domains for complex analysis
- Technology-specific searches: software_patents and mechanical_patents for tech-focused guidance
Selection Strategy:
- Specific rejection type → Use matching section domain (101, 102, 103, 112)
- Technology-specific → Use software_patents or mechanical_patents
- General research → Start with general_patent_law
- Post-grant → Use ptab_procedures or claim_construction
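The selection strategy can be written as a lookup with a fallback. This is a sketch under stated assumptions: select_domain is a hypothetical helper, the domain names are the ones from the table above, and the priority order (rejection type first, then technology, then post-grant) is an illustrative choice, not something the MCP enforces.

```python
SECTION_DOMAINS = {
    "101": "section_101_eligibility",
    "102": "section_102_novelty",
    "103": "section_103_obviousness",
    "112": "section_112_requirements",
}
TECHNOLOGY_DOMAINS = {
    "software": "software_patents",
    "mechanical": "mechanical_patents",
}


def select_domain(rejection_section=None, technology=None, post_grant=False) -> str:
    """Pick a search domain; falls back to general_patent_law."""
    if rejection_section in SECTION_DOMAINS:
        return SECTION_DOMAINS[rejection_section]
    if technology in TECHNOLOGY_DOMAINS:
        return TECHNOLOGY_DOMAINS[technology]
    if post_grant:
        return "ptab_procedures"  # claim_construction is the other post-grant option
    return "general_patent_law"
```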
Situation: Received Alice/Mayo rejection on machine learning algorithm patent
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "machine learning algorithm patent eligibility",
"domain": "section_101_eligibility",
"max_searches": 3
}
}
Searches Executed:
- Alice/Mayo framework analysis
- Technological improvement vs abstract idea
- Inventive concept analysis
Workflow:
- Strategic search → Get Alice/Mayo framework and practical application guidance
- PFW MCP → Pull office action with 101 rejection
- Draft response using MPEP examples and case law
Expected Results:
- Alice/Mayo two-step framework application
- Technological improvement arguments
- Inventive concept analysis for Step 2
Situation: Need to overcome obviousness rejection for pharmaceutical formulation
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "pharmaceutical formulation prior art combination",
"domain": "section_103_obviousness",
"max_searches": 4
}
}
Searches Executed:
- Graham factors analysis
- KSR motivation to combine rationales
- Secondary considerations of non-obviousness
- Prior art combination analysis
Workflow:
- PFW MCP → Get office action and prior art references
- Strategic search → Find KSR rationales and secondary considerations guidance
- Draft response with Graham factors and secondary considerations arguments
Expected Results:
- Graham v. John Deere four-factor framework
- KSR motivation to combine analysis
- Secondary considerations arguments (e.g., commercial success, unexpected results), plus teaching-away arguments against the combination
Situation: Analyzing likelihood of IPR institution for challenged patent
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "IPR institution standards claim construction",
"domain": "ptab_procedures",
"max_searches": 3
}
}
Searches Executed:
- IPR petition and institution standards
- PTAB claim construction standards
- PTAB estoppel rules
Workflow:
- Strategic search → Get institution standards and claim construction guidance
- PTAB MCP → Search similar IPR proceedings and institution decisions
- Analyze likelihood of institution based on precedent
Expected Results:
- IPR institution standard (reasonable likelihood of prevailing) and the preponderance-of-evidence burden for proving unpatentability
- PTAB claim construction methodology (BRI/Phillips)
- Estoppel implications for non-instituted grounds
Situation: Responding to Section 112(a) written description rejection
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "written description possession requirement biotechnology",
"domain": "section_112_requirements",
"max_searches": 2
}
}
Searches Executed:
- Written description requirement analysis
- Enablement requirement and Wands factors
Workflow:
- Strategic search → Get written description and enablement standards
- PFW MCP → Pull specification and office action
- Draft response with possession and enablement arguments
Expected Results:
- Written description possession requirement (§112(a))
- Enablement standards and Wands factors
- Specification support analysis
Situation: Analyzing claim scope for litigation or licensing
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "claim construction means-plus-function limitations",
"domain": "claim_construction",
"max_searches": 3
}
}
Searches Executed:
- Phillips v. AWH claim construction standard
- Prosecution history estoppel analysis
- Functional claim language interpretation
Workflow:
- Strategic search → Get claim construction standards and Phillips analysis
- PFW MCP → Review prosecution history for estoppel issues
- Analyze claim scope using intrinsic and extrinsic evidence
Expected Results:
- Phillips standard for claim construction
- Prosecution history estoppel principles
- Means-plus-function interpretation under §112(f)
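The five scenarios above share one call shape. The sketch below builds that payload and checks the domain against the search counts in the domain table. strategic_search_call is a hypothetical helper, and capping max_searches at the domain's pattern count on the client side is an assumption; this guide does not say how the server treats a larger value.

```python
import json

# Search pattern counts per domain, copied from the domain table.
DOMAIN_SEARCHES = {
    "section_101_eligibility": 4,
    "section_103_obviousness": 4,
    "section_112_requirements": 4,
    "section_102_novelty": 3,
    "claim_construction": 3,
    "ptab_procedures": 3,
    "mechanical_patents": 2,
    "software_patents": 2,
    "general_patent_law": 5,
}


def strategic_search_call(query: str, domain: str, max_searches=None) -> dict:
    """Build an assistant_strategic_multi_search_chat tool call."""
    if domain not in DOMAIN_SEARCHES:
        raise ValueError(f"unknown domain: {domain}")
    available = DOMAIN_SEARCHES[domain]
    if max_searches is None:
        max_searches = available
    if max_searches < 1:
        raise ValueError("max_searches must be at least 1")
    return {
        "tool": "assistant_strategic_multi_search_chat",
        "arguments": {
            "query": query,
            "domain": domain,
            # Assumption: never request more searches than the domain defines.
            "max_searches": min(max_searches, available),
        },
    }


call = strategic_search_call(
    "written description possession requirement biotechnology",
    "section_112_requirements",
    max_searches=2,
)
print(json.dumps(call, indent=2))
```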
Four corpus-neutral prompt templates are accessible from the Claude prompt menu. Each generates a pre-filled workflow with structured tool calls — select from the prompt menu and fill in the arguments.
| Prompt | Parameters | Token Tier | Use When |
|---|---|---|---|
| deep_research | topic, domain? | Context only | Thorough multi-angle coverage |
| quick_lookup | topic | Context only | Fast single-fact retrieval |
| comparative_research | topic_a, topic_b | Context only | Side-by-side topic comparison |
| delegated_research | research_question, model?, prior_context? | Context + LLM | Paid plan / agentic synthesis |
deep_research: Executes assistant_strategic_multi_search_context with max_searches=4, then fills gaps with targeted assistant_context calls. Context tokens only; no AI synthesis cost.
Example: topic="Alice/Mayo two-step framework software patents", domain="section_101_eligibility"
quick_lookup: Single assistant_context call with top_k=3, snippet_size=1024. Under 5K context tokens. One retry with broader terminology if initial results are irrelevant.
Example: topic="MPEP 2106.05(a) practical application"
comparative_research: Two sequential assistant_context calls, one per topic, then structured comparison output covering similarities, differences, and when to apply each.
Example: topic_a="written description requirement", topic_b="enablement requirement"
delegated_research: Delegates synthesis to the Pinecone Assistant AI via assistant_chat. Pinecone handles internal retrieval and synthesis; Claude receives only the compact result. Supports stateless single queries and follow-up queries with prior context.
Example: research_question="What are the key differences between IPR and PGR institution standards?", model="gpt-4o"
Follow-up with prior context: Paste the prior assistant answer into prior_context when the new question references it. Omit prior_context for independent questions (stateless = cheapest).
assistant_chat functions as a sub-agent delegation mechanism — Pinecone internally retrieves context from the knowledge base, feeds it to the configured LLM, and returns a synthesized, citation-backed answer. Claude receives only the compact result (~500–2000 tokens) rather than raw document chunks, preserving Claude's context window for orchestration.
{
"tool": "assistant_chat",
"arguments": {
"messages": [{"role": "user", "content": "What are the seven KSR rationales for combining prior art references?"}],
"model": "gpt-4o",
"temperature": 0.2,
"include_highlights": true,
"context_options": {"top_k": 5, "snippet_size": 2048}
}
}
Chain independent calls without history. Each is stateless, with no accumulated cost:
// Call 1: First research topic
{"tool": "assistant_chat", "arguments": {"messages": [{"role": "user", "content": "IPR institution standards"}]}}
// Call 2: Second independent topic — no history
{"tool": "assistant_chat", "arguments": {"messages": [{"role": "user", "content": "PGR institution standards"}]}}
// Synthesize both results in Claude's context
Include history only when the new question explicitly references the prior answer:
{
"tool": "assistant_chat",
"arguments": {
"messages": [
{"role": "user", "content": "What are the Graham v. John Deere four factors?"},
{"role": "assistant", "content": "[prior answer text]"},
{"role": "user", "content": "How do those factors apply to pharmaceutical formulations?"}
],
"context_options": {"top_k": 5, "snippet_size": 2048}
}
}
Control how much context Pinecone sends to its LLM per call:
| Setting | Tokens to LLM | Use When |
|---|---|---|
| {"top_k": 3, "snippet_size": 1024} | ~3K | Focused single-concept question |
| {"top_k": 5, "snippet_size": 2048} | ~10K | Standard research (recommended) |
| {"top_k": 10, "snippet_size": 2048} | ~20K | Broad topic needing wide coverage |
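The token figures in the table match a simple product of top_k and snippet_size. The sketch below reproduces them and picks the largest preset that fits a budget. Treating snippet_size as a per-snippet token ceiling is an assumption that happens to agree with the table; estimate_context_tokens and choose_context_options are hypothetical helpers.

```python
# Presets copied from the table above, ordered smallest to largest.
PRESETS = [
    {"top_k": 3, "snippet_size": 1024},
    {"top_k": 5, "snippet_size": 2048},
    {"top_k": 10, "snippet_size": 2048},
]


def estimate_context_tokens(top_k: int, snippet_size: int) -> int:
    """Upper-bound estimate of tokens sent to the LLM (assumed: top_k * snippet_size)."""
    return top_k * snippet_size


def choose_context_options(max_tokens: int):
    """Return the largest preset whose estimate fits max_tokens, or None."""
    best = None
    for preset in PRESETS:
        if estimate_context_tokens(**preset) <= max_tokens:
            best = preset
    return best


for preset in PRESETS:
    print(preset, estimate_context_tokens(**preset))
```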
Use assistant_chat for broad synthesis, then assistant_context for targeted follow-up gaps:
// Step 1: Broad synthesis via delegation
{"tool": "assistant_chat", "arguments": {"messages": [{"role": "user", "content": "Overview of §101 eligibility framework"}], "context_options": {"top_k": 5, "snippet_size": 2048}}}
// Step 2: Targeted gap fill via context retrieval (cheaper)
{"tool": "assistant_context", "arguments": {"query": "MPEP 2106.05(a) practical application technological improvement", "top_k": 3}}
Multimodal and messages support expand what assistant_context can retrieve:
// With multimodal disabled (text-only retrieval, smaller response)
{
"tool": "assistant_context",
"arguments": {
"query": "Section 101 eligibility",
"top_k": 5,
"multimodal": false,
"include_binary_content": false
}
}
// With messages input (multi-turn context retrieval)
{
"tool": "assistant_context",
"arguments": {
"messages": [
{"role": "user", "content": "What is the Alice/Mayo framework?"},
{"role": "assistant", "content": "[prior response]"},
{"role": "user", "content": "How does that apply to software patents?"}
],
"top_k": 5
}
}
Parameter notes:
- query is now optional: either query OR messages is required
- multimodal (optional bool): Enable image retrieval from PDFs. Only works with query input
- include_binary_content (optional bool): Include base64 image data in response. Set false to reduce response size. Only works with query input
- When using messages, the multimodal and include_binary_content parameters are ignored by the API
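The parameter notes translate into a small client-side check. This is a sketch, not the MCP's own validation: normalize_context_args is a hypothetical helper, and dropping the ignored parameters before sending is a design choice made here to keep payloads honest about what the API will use.

```python
def normalize_context_args(args: dict) -> dict:
    """Validate assistant_context arguments against the parameter notes."""
    has_query = bool(args.get("query"))
    has_messages = bool(args.get("messages"))
    if not (has_query or has_messages):
        raise ValueError("either query or messages is required")
    cleaned = dict(args)  # do not mutate the caller's dict
    if has_messages:
        # Per the notes, these two are ignored by the API with messages input.
        cleaned.pop("multimodal", None)
        cleaned.pop("include_binary_content", None)
    return cleaned
```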
assistant_chat returns clean text, not JSON. The response contains:
- The answer content
- A Sources section listing filename and a ~150-character excerpt per source
- A token summary line
Example response format:
[Answer text here...]
**Sources:**
- MPEP_Part2.md: "...the Alice/Mayo framework requires a two-step analysis. Step 2A asks whether..."
- Combined_Training_Materials.md: "...software claims directed to abstract ideas must provide significantly more..."
*Tokens: 35,205 prompt + 823 completion = 36,028 total*
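Because the response is plain text, a client that wants the parts separately has to parse them. The sketch below assumes the layout matches the example exactly (a Sources marker in bold, one dash-prefixed source per line, and a single italic token line). parse_chat_response is a hypothetical helper, and real responses may vary.

```python
import re

TOKEN_LINE = re.compile(
    r"\*Tokens:\s*([\d,]+)\s*prompt\s*\+\s*([\d,]+)\s*completion\s*=\s*([\d,]+)\s*total\*"
)
SOURCE_LINE = re.compile(r'^-\s*(?P<file>[^:]+):\s*"(?P<excerpt>.*)"\s*$')


def parse_chat_response(text: str) -> dict:
    """Split an assistant_chat text response into answer, sources, and token counts."""
    answer_part, _, rest = text.partition("**Sources:**")

    sources = []
    for line in rest.splitlines():
        match = SOURCE_LINE.match(line.strip())
        if match:
            sources.append((match.group("file").strip(), match.group("excerpt")))

    tokens = None
    match = TOKEN_LINE.search(text)
    if match:
        prompt, completion, total = (int(g.replace(",", "")) for g in match.groups())
        tokens = {"prompt": prompt, "completion": completion, "total": total}

    # Remove the token line in case there was no Sources section before it.
    answer = TOKEN_LINE.sub("", answer_part).strip()
    return {"answer": answer, "sources": sources, "tokens": tokens}
```

The parsed token counts can then be subtracted from a running lifetime budget.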
For complex patent prosecution tasks, combine the Pinecone Assistant with other USPTO MCPs:
# 1. Get Alice/Mayo framework guidance
assistant_strategic_multi_search_chat → domain: "section_101_eligibility"
# 2. Pull office action
PFW MCP → Download office action and specifications
# 3. Draft response with practical application arguments

# 1. Research KSR and Graham factors
assistant_strategic_multi_search_chat → domain: "section_103_obviousness"
# 2. Analyze prior art
PFW MCP → Get prior art references and examiner analysis
# 3. Build secondary considerations case

# 1. Research institution standards
assistant_strategic_multi_search_chat → domain: "ptab_procedures"
# 2. Find similar proceedings
PTAB MCP → Search institution decisions
# 3. Analyze claim construction approach

Use assistant_chat for follow-up questions after strategic searches:
{
"tool": "assistant_chat",
"arguments": {
"messages": [
{
"role": "user",
"content": "Based on the strategic search results, what specific KSR factors should I emphasize for a pharmaceutical formulation obviousness argument?"
}
],
"model": "gpt-4o",
"include_highlights": true
}
}
Use assistant_strategic_multi_search_chat when:
- Starting research on a broad topic
- Need comprehensive coverage of related issues
- Want systematic analysis across multiple guidance documents
- Preparing for complex prosecution or post-grant procedures
Use assistant_chat when:
- Following up on specific findings
- Need clarification on particular guidance
- Have targeted questions about specific procedures
- Want to explore nuances of a particular topic
- Match rejection type: Section 101 → section_101_eligibility, 103 → section_103_obviousness, etc.
- Consider technology: Software/AI → software_patents, Mechanical → mechanical_patents
- Legal issue focus: Use specific section domains (101, 102, 103, 112) for targeted analysis
- General research: Start with general_patent_law for broad questions
- Post-grant matters: Use ptab_procedures or claim_construction
- Strategic searches: Use for initial comprehensive research (20-30K tokens)
- Follow-up chats: Use for specific questions (5-15K tokens)
- Document specific: Switch to PFW/PTAB/FPD MCPs for specific documents
- Monitor usage: Claude Desktop shows token consumption per interaction
"Strategic search returns too broad results"
- Use more specific query terms
- Reduce max_searches parameter to 2
- Switch to assistant_chat for targeted questions
"Need more specific technology guidance"
- Use appropriate technology domain: software_patents or mechanical_patents
- Follow up with specific section domains (101, 102, 103) for rejection-specific analysis
- Cross-reference with PFW MCP for technology center guidance
"Context limit reached"
- Switch to direct assistant_chat for remaining questions
- Use PFW/PTAB/FPD MCPs for document-specific research
- Save strategic search results and start new conversation
Leverage up to 5 Pinecone Assistants (free tier) for specialized knowledge bases using the update_configuration tool.
Create specialized assistants:
1. mpep-assistant → MPEP 9th Edition core guidance (Combined_MPEP_9th_Edition_Part1-4.md)
2. updates-assistant → Recent Federal Register notices & updates (Combined_MPEP_Updates.md)
3. case-law-assistant → Federal Circuit & PTAB decisions (Not included)
4. training-assistant → USPTO training materials & examples (Combined_Training_Materials.md)
5. international-assistant → PCT, Paris Convention, treaties (Not included)
Mid-Conversation Switching:
User: "What does MPEP § 2138.04 say about AI inventorship?"
[Using mpep-assistant - returns MPEP guidance]
User: use update_configuration(assistant_name: "updates-assistant")
✅ Switched to updates-assistant
User: "What recent Federal Register guidance addresses AI inventorship?"
[Now using updates-assistant - returns 2024 guidance]
User: use update_configuration(assistant_name: "case-law-assistant")
✅ Switched to case-law-assistant
User: "How have courts applied these principles in recent cases?"
[Now using case-law-assistant - returns Thaler v. Vidal analysis]
Benefits:
- Single conversation spans multiple knowledge bases
- 10 file limit per assistant (free tier) → 50 files total across 5 assistants
- Specialized metadata and search patterns per assistant
- Maintain conversation context while switching sources
Medical Research:
- clinical-trials → Clinical trial databases
- literature → PubMed and research papers
- guidelines → Medical society practice guidelines
- pharmacology → Drug interactions and dosing
- imaging → Radiology and diagnostic criteria
Financial Analysis:
- sec-filings → 10-K, 10-Q, 8-K reports
- earnings → Earnings call transcripts
- analyst → Analyst reports and ratings
- economic → Economic indicators and Fed minutes
- regulations → GAAP, IFRS, regulatory guidance
Software Development:
- documentation → API docs, technical specifications
- architecture → System design patterns, best practices
- security → Security guidelines, vulnerability reports
- performance → Optimization guides, benchmarking data
- frameworks → Framework-specific guides and examples
Setup Strategy:
- Domain Separation: Keep related but distinct content in separate assistants
- Size Management: Stay within 10 file limit per assistant by splitting logically
- Naming Convention: Use descriptive names that indicate content type
- Search Patterns: Customize strategic-searches.yaml per assistant's content
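The size-management rule can be checked before uploading anything. The sketch below validates a planned split against the free-tier limits stated in this guide (5 assistants, 10 files each). validate_assistant_plan is a hypothetical helper; it only inspects a local plan and does not call Pinecone.

```python
MAX_ASSISTANTS = 5             # free-tier assistant limit stated in this guide
MAX_FILES_PER_ASSISTANT = 10   # free-tier file limit stated in this guide


def validate_assistant_plan(plan: dict) -> int:
    """Check a {assistant_name: [filenames]} plan; return the total file count."""
    if len(plan) > MAX_ASSISTANTS:
        raise ValueError(f"{len(plan)} assistants exceeds the limit of {MAX_ASSISTANTS}")
    for name, files in plan.items():
        if len(files) > MAX_FILES_PER_ASSISTANT:
            raise ValueError(
                f"{name} has {len(files)} files; limit is {MAX_FILES_PER_ASSISTANT}"
            )
    return sum(len(files) for files in plan.values())


plan = {
    "updates-assistant": ["Combined_MPEP_Updates.md"],
    "training-assistant": ["Combined_Training_Materials.md"],
}
print(validate_assistant_plan(plan))
```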
Workflow Optimization:
- Start Broad: Begin with general knowledge base (e.g., mpep-assistant)
- Narrow Down: Switch to specialized assistants for specific analysis
- Cross-Reference: Use multiple assistants to verify information
- Context Preservation: Claude Desktop maintains conversation context across switches
Token Management:
- Use context tools first in each assistant to minimize costs
- Switch assistants when you need different knowledge, not for token limits
- Each assistant switch resets internal state but preserves Claude conversation
Example Multi-Assistant Research Flow:
// Step 1: General guidance
{
"tool": "assistant_context",
"arguments": {
"query": "Section 101 software patent eligibility requirements"
}
}
// Step 2: Switch to recent updates
{
"tool": "update_configuration",
"arguments": {
"assistant_name": "updates-assistant"
}
}
// Step 3: Check recent guidance
{
"tool": "assistant_context",
"arguments": {
"query": "AI patent eligibility guidance 2024 2025"
}
}
// Step 4: Switch to case law
{
"tool": "update_configuration",
"arguments": {
"assistant_name": "case-law-assistant"
}
}
// Step 5: Analyze precedent
{
"tool": "assistant_strategic_multi_search_chat",
"arguments": {
"query": "software patent eligibility Federal Circuit decisions",
"domain": "section_101_eligibility"
}
}
This approach gives you comprehensive coverage while staying within free tier limits and optimizing token usage across multiple specialized knowledge bases.
Evaluate AI answer quality against a ground truth.
Requires Pinecone Standard (paid) plan. Free tier users will receive an error. Rate limited to 20 requests/minute.
Use case: After getting an answer from assistant_chat, evaluate its accuracy against a known-correct answer.
{
"tool": "evaluate_answer",
"arguments": {
"question": "What is the Alice/Mayo two-step test?",
"answer": "The Alice/Mayo framework has two steps. Step 2A asks whether the claim is directed to a judicial exception (abstract idea, law of nature, or natural phenomenon). Step 2B determines whether additional elements amount to significantly more than the judicial exception.",
"ground_truth_answer": "The Alice/Mayo test is a two-step framework established by Alice Corp. v. CLS Bank (2014) and Mayo Collaborative Services v. Prometheus (2012). Step 1 asks whether the claim is directed to a judicial exception. Step 2 asks whether the claim includes additional elements that amount to significantly more than the exception alone, providing an inventive concept."
}
}
Response format:
**Alignment Scores:**
- Alignment (overall): 0.847
- Correctness (precision): 0.900
- Completeness (recall): 0.800
**Evaluated Facts:**
[entailed] Two-step framework
[entailed] Step 2A asks whether claim is directed to judicial exception
[entailed] Judicial exceptions include abstract ideas, laws of nature, natural phenomena
[neutral] Established by Alice Corp. v. CLS Bank (2014)
[contradicted] Step 2 is called Step 2B (answer uses "Step 2B" for what ground truth calls "Step 2")
*Tokens: 1,204 prompt + 87 completion = 1,291 total*
Score interpretation:
- alignment >= 0.8: High quality answer
- alignment 0.5–0.8: Acceptable but incomplete
- alignment < 0.5: Significant gaps or errors
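The sample scores above are consistent with alignment being the harmonic mean of correctness and completeness. That relationship is inferred from the example numbers and should be treated as an assumption here. The sketch below reproduces the arithmetic and applies the interpretation bands; both function names are hypothetical, and a score of exactly 0.8 is placed in the top band.

```python
def alignment_score(correctness: float, completeness: float) -> float:
    """Harmonic mean of correctness (precision) and completeness (recall)."""
    if correctness + completeness == 0:
        return 0.0
    return 2 * correctness * completeness / (correctness + completeness)


def interpret_alignment(alignment: float) -> str:
    """Map an alignment score to the interpretation bands listed above."""
    if alignment >= 0.8:
        return "High quality answer"
    if alignment >= 0.5:
        return "Acceptable but incomplete"
    return "Significant gaps or errors"


score = alignment_score(0.900, 0.800)
print(round(score, 3), interpret_alignment(score))
```

With the example's correctness of 0.900 and completeness of 0.800, the result rounds to 0.847, matching the overall alignment shown in the response format.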
Workflow example — evaluate after delegation:
// Step 1: Get answer via delegation
{
"tool": "assistant_chat",
"arguments": {
"messages": [{"role": "user", "content": "What is the Alice/Mayo two-step test?"}],
"model": "gpt-4o"
}
}
// Step 2: Evaluate the answer against a known-correct reference
{
"tool": "evaluate_answer",
"arguments": {
"question": "What is the Alice/Mayo two-step test?",
"answer": "[paste the answer from Step 1]",
"ground_truth_answer": "[your verified reference answer]"
}
}
This document is maintained alongside the Pinecone Assistant MCP. For the latest version and updates, see the main repository.