Add SerpApi engine resources #24
Test summary
Full test log (engine-test-results.md):
Hi @vladm-serpapi, I just wanted to get your eyes on this contribution of mine. I wanted to make sure that this awesome MCP can use not just the few listed engines but all the engines in SerpApi, and that it exposes proper documentation of the required parameters as MCP resources. I am happy to rework it or answer any questions you may have! Thanks!
Hey @pranavkafle, thanks for the work! I will take a look at it this week or early next week to see the best way to integrate that.
Just following up here, I am still working through the review.
Thanks for the follow-up @vladm-serpapi, please feel free to ask any questions or flag anything. Happy to rework/revise if needed.
vladm-serpapi
left a comment
The PR looks good overall and actually will add a lot of value to the MCP users. The integration approach is also sensible.
There were a few questions I wanted to cover before merging:
- Where were the engine JSON schemas sourced from, and what was the workflow for that? I think SerpApi has some functionality to expose the parameter information, but I'm not sure whether it's publicly accessible, so I'm curious where the current JSON files came from.
- Given the ongoing API updates, it would be ideal to generate the engine schemas on the fly or at build time. For instance, we could pull schemas from the publicly running API and include them in the Docker image (serialized into the engines/... directory). That approach would keep the schemas in sync with the existing API endpoint parameters and engines.
@pranavkafle Thanks for all the work on contributing the PR! Let me know your considerations on the above and I'll be happy to push this forward faster and get it merged asap.
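The build-time idea in the review above (serialize each engine's schema into an engines directory that gets baked into the image) could look roughly like the sketch below. The function name `write_engine_schemas` and the schema shape are hypothetical, and how the schemas are fetched is deliberately left out; this only illustrates the serialization step.

```python
import json
from pathlib import Path
from typing import Dict, List


def write_engine_schemas(schemas: Dict[str, dict], out_dir: Path) -> List[Path]:
    """Write one <engine>.json file per schema and return the written paths.

    `schemas` maps an engine name to its (hypothetical) parameter schema.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sorted names and sorted keys keep the output deterministic, so
    # rebuilding with unchanged schemas produces byte-identical files.
    for name in sorted(schemas):
        path = out_dir / f"{name}.json"
        path.write_text(
            json.dumps(schemas[name], indent=2, sort_keys=True) + "\n",
            encoding="utf-8",
        )
        written.append(path)
    return written
```

Deterministic output matters mostly if the generated files are ever committed or diffed; for a pure Docker build step it is a convenience rather than a requirement.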
26afa3d to 40e5e0a
Hi @vladm-serpapi, thanks for the review and great questions!

1. Engine Schema Source: The engine JSON schemas are sourced from the SerpApi Playground ( I've added a

    def fetch_props(url: str) -> dict[str, object]:
        """Fetch playground HTML and extract React props."""
        req = Request(url, headers={"User-Agent": USER_AGENT})
        with urlopen(req, timeout=TIMEOUT_SECONDS) as resp:
            page_html = resp.read().decode("utf-8", errors="ignore")
        soup = BeautifulSoup(page_html, "html.parser")
        node = soup.find(attrs={"data-react-props": True})
        if not node:
            raise RuntimeError("Failed to locate data-react-props in playground HTML.")
        return json.loads(html.unescape(node["data-react-props"]))

The script normalizes the extracted data (converts HTML descriptions to markdown, filters relevant fields like

2. Build-Time Generation: Per your suggestion, I've committed this in the latest changes. The Dockerfile now generates fresh engine schemas at build time:

    RUN uv sync
    ENV PATH="/app/.venv/bin:$PATH"
    RUN python /app/build-engines.py

This means:

We could probably use a GitHub Action to generate the engine schemas on a schedule in the future. Let me know if you'd like any adjustments to this approach or if anything else is needed to get this merged!
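For readers who want to try the extraction step without a network call or third-party packages, here is a minimal offline sketch. It swaps BeautifulSoup for the standard library's `html.parser.HTMLParser` and works on an HTML string rather than a URL; `ReactPropsParser`, `extract_props`, and the sample payload are hypothetical and do not reflect the real Playground markup or the PR's script.

```python
import json
from html.parser import HTMLParser
from typing import Optional


class ReactPropsParser(HTMLParser):
    """Capture the first data-react-props attribute value found in the HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.raw_props: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.raw_props is not None:
            return  # keep only the first match
        for name, value in attrs:
            if name == "data-react-props" and value:
                # HTMLParser has already decoded entities such as &quot;
                # in attribute values, so no extra unescape is applied here.
                self.raw_props = value


def extract_props(page_html: str) -> dict:
    """Return the JSON object stored in the data-react-props attribute."""
    parser = ReactPropsParser()
    parser.feed(page_html)
    if parser.raw_props is None:
        raise RuntimeError("Failed to locate data-react-props in HTML.")
    return json.loads(parser.raw_props)


# Hypothetical sample; the real props structure is not shown in this thread.
SAMPLE_HTML = (
    '<div data-react-props="{&quot;engine&quot;: &quot;google&quot;, '
    '&quot;params&quot;: [&quot;q&quot;]}"></div>'
)
props = extract_props(SAMPLE_HTML)
```

Parsing from a string keeps the example testable offline; fetching the page would be a separate step, as in the `fetch_props` snippet quoted above.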
Interesting approach. Let me circle back with the team on that. I think we'll likely enable some direct generation based on what's available. Thanks for the expanded info!
Thanks for the update, @vladm-serpapi. While the Playground scraping works for now, I agree that a direct, structured source would be much more reliable. If SerpApi has a canonical JSON schema or an internal metadata endpoint you'd prefer I use, let me know. I’m happy to update the script to point to a more stable 'source of truth' to keep the MCP robust. Looking forward to the team's feedback!
The code looks good, just minor adjustments. I've left a few comments related to formatting and some versions / sanitization logic.
I'll ask one of the team members to take a final look and I think we're good to merge. Thanks for the contribution!
Thanks for the feedback! I’ve applied the requested changes:
Summary
serpapi://engines, serpapi://engines/<engine>)
Why
Data provenance
data-react-props attribute (HTML-encoded JSON).
Testing
engines/ can be called via MCP.
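As an illustration of how the two resource URIs named in the summary might map onto files in an engines directory, here is a small sketch. It does not use any MCP library; `read_engine_resource`, the allowed engine-name pattern, and the index format are assumptions made for this example, not the PR's actual code.

```python
import json
import re
from pathlib import Path

ENGINES_URI = "serpapi://engines"
# Assumed naming rule: lowercase letters, digits, underscores. Rejecting
# anything else also blocks path traversal such as "../secret".
_ENGINE_NAME = re.compile(r"[a-z0-9_]+")


def read_engine_resource(uri: str, engines_dir: Path) -> str:
    """Return the JSON text for an engine index or a single engine schema."""
    if uri == ENGINES_URI:
        names = sorted(p.stem for p in engines_dir.glob("*.json"))
        return json.dumps({"engines": names})
    prefix = ENGINES_URI + "/"
    if not uri.startswith(prefix):
        raise ValueError(f"Unsupported resource URI: {uri}")
    engine = uri[len(prefix):]
    if not _ENGINE_NAME.fullmatch(engine):
        raise ValueError(f"Invalid engine name: {engine!r}")
    path = engines_dir / f"{engine}.json"
    if not path.is_file():
        raise FileNotFoundError(f"No schema for engine: {engine}")
    return path.read_text(encoding="utf-8")
```

A server would register a handler like this behind its resource endpoints; validating the engine name before touching the filesystem is the part worth keeping regardless of framework.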