add mcp-for-research blog post #3021

Open. Wants to merge 4 commits into base: main
10 changes: 10 additions & 0 deletions _blog.yml
@@ -6542,3 +6542,13 @@
    - llm
    - evaluation
    - agents

- local: mcp-for-research
  title: "MCP for Research: How to Connect AI to Research Tools"
  author: dylanebert
  thumbnail: /blog/assets/mcp-for-research/thumbnail.png
  date: Aug 18, 2025
  tags:
    - mcp
    - research
    - guide
Binary file added assets/mcp-for-research/demo.gif
Binary file added assets/mcp-for-research/thumbnail.png
121 changes: 121 additions & 0 deletions mcp-for-research.md
@@ -0,0 +1,121 @@
---
title: "MCP for Research: How to Connect AI to Research Tools"
thumbnail: /blog/assets/mcp-for-research/thumbnail.png
authors:
- user: dylanebert
---

# MCP for Research: How to Connect AI to Research Tools

Academic research involves frequent **research discovery**: finding papers, code, and related models and datasets. This typically means switching between platforms like [arXiv](https://arxiv.org/), [GitHub](https://github.com/), and [Hugging Face](https://huggingface.co/) and manually piecing together the connections between them.

The [Model Context Protocol (MCP)](https://huggingface.co/learn/mcp-course/unit0/introduction) is a standard that allows agentic models to communicate with external tools and data sources. For research discovery, this means AI can use research tools through natural language requests, automating platform switching and cross-referencing.

Review comment (Contributor): we could add a bit of research tracker MCP Space here, it's too abstract otherwise

Review comment (Member): Yes, I agree! We could even show a gif with an example from Cursor.


![Research Tracker MCP in action](./assets/mcp-for-research/demo.gif)

## Research Discovery: Three Layers of Abstraction

Review comment (Member), suggested change:
- ## Research Discovery: Three Layers of Abstraction
+ This post shows how we built a set of MCP tools for research, how you can use them for your own research projects, or as inspiration for new tools adapted to your needs.
+ ## Research Discovery: Three Layers of Abstraction


Much like software development, research discovery can be framed in terms of layers of abstraction.

### 1. Manual Research

At the lowest level of abstraction, researchers search manually and cross-reference by hand.

```bash
# Typical workflow:
1. Find paper on arXiv
2. Search GitHub for implementations
3. Check Hugging Face for models/datasets
4. Cross-reference authors and citations
5. Organize findings manually
```

Review comment (Member, on the code fence above): Maybe enclose in html <xmp> / </xmp> tags to avoid weird syntax highlighting. Or use > to blockquote.

This manual approach becomes inefficient when tracking multiple research threads or conducting systematic literature reviews. The repetitive nature of searching across platforms, extracting metadata, and cross-referencing information naturally leads to automation through scripting.

### 2. Scripted Tools

Python scripts automate research discovery by handling web requests, parsing responses, and organizing results.

```python
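# research_tracker.py
# scrape_arxiv, search_github, search_huggingface, and consolidate_results
# are placeholders for helper functions that are not shown here.
def gather_research_info(paper_url):
    """Collect the paper, code, and model information linked to one paper URL."""
    paper_data = scrape_arxiv(paper_url)
    github_repos = search_github(paper_data['title'])
    hf_models = search_huggingface(paper_data['authors'])
    return consolidate_results(paper_data, github_repos, hf_models)

# Run for each paper you want to investigate
results = gather_research_info("https://arxiv.org/abs/2103.00020")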
```
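
As a rough illustration of what the placeholder helpers above might involve, the sketch below extracts an arXiv identifier from a URL, builds query URLs for the public arXiv, GitHub, and Hugging Face Hub search endpoints, and parses an Atom response of the kind the arXiv API returns. The function names (`arxiv_id_from_url`, `build_queries`, `parse_arxiv_entry`) are hypothetical and are not taken from the research tracker. `SAMPLE_FEED` is a made-up, trimmed stand-in for a real response, and no network request is made.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up stand-in for an arXiv API response (assumption: real feeds carry more fields).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>An Example
      Paper Title</title>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>"""


def arxiv_id_from_url(paper_url):
    """Extract a new-style arXiv ID such as '2103.00020' from an abs or pdf URL."""
    match = re.search(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", paper_url)
    if match is None:
        raise ValueError(f"Not a recognizable arXiv URL: {paper_url}")
    return match.group(1)


def build_queries(arxiv_id, title):
    """Build the URLs a script would request for one paper (nothing is fetched here)."""
    return {
        "arxiv": "http://export.arxiv.org/api/query?" + urlencode({"id_list": arxiv_id}),
        "github": "https://api.github.com/search/repositories?" + urlencode({"q": title}),
        "huggingface": "https://huggingface.co/api/models?" + urlencode({"search": title}),
    }


def parse_arxiv_entry(atom_xml):
    """Pull the title and author names out of the first entry of an Atom feed."""
    entry = ET.fromstring(atom_xml).find("atom:entry", ATOM)
    raw_title = entry.findtext("atom:title", default="", namespaces=ATOM)
    authors = [a.findtext("atom:name", namespaces=ATOM) for a in entry.findall("atom:author", ATOM)]
    # Titles in Atom feeds may wrap across lines, so collapse the whitespace.
    return {"title": " ".join(raw_title.split()), "authors": authors}


paper = parse_arxiv_entry(SAMPLE_FEED)
queries = build_queries(arxiv_id_from_url("https://arxiv.org/abs/2103.00020"), paper["title"])
for source, url in queries.items():
    print(source, url)
```

Keeping URL construction and response parsing separate from the actual HTTP calls makes each piece easy to check in isolation, which helps when a platform changes its response format.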

The [research tracker](https://huggingface.co/spaces/dylanebert/research-tracker) demonstrates systematic research discovery built from these types of scripts.

Review comment (Member): Perhaps you could briefly explain somewhere that you've been using this approach to keep track of research in the 3D space, just to make it more relatable.


While scripts are faster than manual research, they often fail to collect data because of changing APIs, rate limits, or parsing errors. Without human oversight, scripts may miss relevant results or return incomplete information.
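
One way to soften these failure modes is to retry each source with a growing delay and to record which sources failed instead of aborting the whole run. The following is a minimal sketch under that assumption. `with_retries`, `gather_with_report`, and `make_flaky` are hypothetical names used only for illustration, and the fake sources stand in for real network calls.

```python
import time


def with_retries(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the delay after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # a real script would catch narrower error types
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


def gather_with_report(sources, **retry_options):
    """Query every source, keep partial results, and list the sources that failed."""
    results, failures = {}, {}
    for name, fetch in sources.items():
        try:
            results[name] = with_retries(fetch, **retry_options)
        except Exception as error:
            failures[name] = repr(error)
    return results, failures


def make_flaky(failures_before_success, value):
    """Build a fake source that fails a fixed number of times (demonstration only)."""
    remaining = {"n": failures_before_success}

    def fetch():
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TimeoutError("rate limited")
        return value

    return fetch


# The first source recovers on its third attempt; the second never recovers.
demo_sources = {"github": make_flaky(2, ["repo-a"]), "arxiv": make_flaky(5, [])}
results, failures = gather_with_report(demo_sources, sleep=lambda seconds: None)
print("results:", results)
print("failed sources:", failures)
```

Returning the failures alongside the results gives a human reviewer an explicit list of what to check by hand.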

### 3. MCP Integration

MCP makes these same Python tools accessible to AI systems through natural language.

```markdown
# Example research directive
Find recent transformer architecture papers published in the last 6 months:
- Must have available implementation code
- Focus on papers with pretrained models
- Include performance benchmarks when available
```

The AI orchestrates multiple tools, fills information gaps, and reasons about results:

Review comment (Member), suggested change:
- The AI orchestrates multiple tools, fills information gaps, and reasons about results:
+ Because the LLM knows the details of what each tool can do, it's able to orchestrate calls to multiple tools, fill information gaps, reason about results and collate them for the user:


```python
# AI workflow:
# 1. Use research tracker tools
# 2. Search for missing information
# 3. Cross-reference with other MCP servers
# 4. Evaluate relevance to research goals

user: "Find all relevant information (code, models, etc.) on this paper: https://huggingface.co/papers/2010.11929"
ai: # Combines multiple tools to gather complete information
```
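
To make the tool layer more concrete: MCP is built on JSON-RPC 2.0, and a server advertises each tool with a name, a description, and a JSON Schema for its inputs. Clients list tools with `tools/list` and invoke them with `tools/call`. The sketch below imitates those two messages for one stand-in function using only the standard library. It is a simplified illustration rather than a working MCP server: `find_research_links`, `describe_tool`, and `handle_request` are hypothetical, transport and initialization are omitted, and a real server would normally be built with an MCP SDK or Gradio.

```python
import inspect
import json


def find_research_links(paper_url: str) -> dict:
    """Return code and model links related to a paper URL (stand-in data)."""
    return {"paper": paper_url, "code": [], "models": []}


TOOLS = {"find_research_links": find_research_links}


def describe_tool(func):
    """Build a tool descriptor from the function's signature and docstring."""
    params = inspect.signature(func).parameters
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "inputSchema": {
            "type": "object",
            # Simplification: every parameter is treated as a required string.
            "properties": {name: {"type": "string"} for name in params},
            "required": list(params),
        },
    }


def handle_request(request):
    """Answer a tools/list or tools/call request with a JSON-RPC response."""
    if request["method"] == "tools/list":
        result = {"tools": [describe_tool(func) for func in TOOLS.values()]}
    elif request["method"] == "tools/call":
        func = TOOLS[request["params"]["name"]]
        output = func(**request["params"]["arguments"])
        result = {"content": [{"type": "text", "text": json.dumps(output)}]}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


example_call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "find_research_links",
        "arguments": {"paper_url": "https://huggingface.co/papers/2010.11929"},
    },
}
print(json.dumps(handle_request(example_call), indent=2))
```

The descriptor is what lets a model decide which tool fits a request: the docstring becomes the description it reads, and the schema tells it which arguments to supply.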

Review comment (Member): It could be cool to show a summary of an actual conversation, or a link to it, or a video.


This can be viewed as an additional layer of abstraction above scripting, where the "programming language" is natural language. It follows the [Software 3.0 analogy](https://youtu.be/LCEmiRjPEtQ?si=J7elM86eW9XCkMFj), in which the natural language research direction is the software implementation.

This comes with the same caveats as scripting:

- Faster than manual research, but error-prone without human guidance
- Quality depends on the implementation
- Understanding the lower layers (both manual and scripted) leads to better implementations

## Setup and Usage

Review comment (Member): Up to this point it's been a more or less theoretical discussion. Perhaps we could segue into this by reminding the reader that we used a set of custom scripts that evolved into a set of MCP tools that anyone can use, build upon or use for inspiration.


### Quick Setup

The easiest way to add the Research Tracker MCP is through [Hugging Face MCP Settings](https://huggingface.co/settings/mcp):

1. Visit [huggingface.co/settings/mcp](https://huggingface.co/settings/mcp)
2. Search for "research-tracker-mcp" in the available tools
3. Click to add it to your tools
4. Follow the provided setup instructions for your specific client (Claude Desktop, Cursor, Claude Code, VS Code, etc.)

This workflow uses the Hugging Face MCP server, which is the standard way to use Hugging Face Spaces as MCP tools. The settings page provides client-specific configuration that is generated automatically and always up to date.

<script
type="module"
src="https://gradio.s3-us-west-2.amazonaws.com/4.36.1/gradio.js"
></script>

<gradio-app theme_mode="light" space="dylanebert/research-tracker-mcp"></gradio-app>

## Learn More

Review comment (Contributor): one sentence call for action would be great

Review comment (Member): Agreed!

**Get Started:**
- [Hugging Face MCP Course](https://huggingface.co/learn/mcp-course/en/unit1/introduction) - Complete guide from basics to building your own tools
- [MCP Official Documentation](https://modelcontextprotocol.io) - Protocol specifications and architecture

**Build Your Own:**
- [Gradio MCP Guide](https://www.gradio.app/guides/building-mcp-server-with-gradio) - Turn Python functions into MCP tools
- [Building the Hugging Face MCP Server](https://huggingface.co/blog/building-hf-mcp) - Production implementation case study

**Community:**
- [Hugging Face Discord](https://hf.co/join/discord) - MCP development discussions

Ready to automate your research discovery? Try the [Research Tracker MCP](https://huggingface.co/settings/mcp) or build your own research tools with the resources above.