-
Notifications
You must be signed in to change notification settings - Fork 87
[generate_manifest] improve registry workflow #244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughRenamed a GitHub Actions workflow. Enhanced scripts/get_manifest.py by adding a validator for installation entries, integrating it into installation filtering, and broadening error handling in generate_manifest for API responses. Minor formatting adjustments were applied without changing overall execution semantics. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as Caller
participant GM as generate_manifest()
participant API as Registry API
participant VI as validate_installations()
participant VE as validate_installation_entry()
U->>GM: generate_manifest(repo_url)
GM->>API: POST fetch manifest metadata
API-->>GM: Response (installations, etc.)
alt Response OK
GM->>VI: validate_installations(manifest, repo_url)
loop For each installation entry
VI->>VE: validate_installation_entry(type, entry)
VE-->>VI: valid? (true/false)
end
VI-->>GM: cleaned installations (or original if none valid)
GM-->>U: Manifest (validated)
else KeyError/IndexError/HTTP errors
GM-->>U: Handle error (fallback/log)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (4)
.github/workflows/generate-manifest.yml (2)
38-41: Install jsonschema to enable schema validation in CIAdd jsonschema to dependencies so we can validate generated manifests against the registry schema in the same job.
Apply this diff:
- name: Install dependencies run: | python -m pip install --upgrade pip - pip install requests + pip install requests jsonschema
43-49: Add a schema validation step after manifest generationCatching schema issues early in CI avoids opening PRs with invalid manifests.
Apply this diff to add a validation step:
- name: Generate manifest env: ANYON_API_KEY: ${{ secrets.ANYON_API_KEY }} run: | REPO_URL="${{ github.event.inputs.repo_url || github.event.client_payload.repo_url }}" python scripts/get_manifest.py "$REPO_URL" + - name: Validate generated manifests against schema + run: | + python scripts/validate_manifest.py +``` </blockquote></details> <details> <summary>scripts/get_manifest.py (2)</summary><blockquote> `21-28`: **Make JSON code-block extraction more robust (CRLF, optional language, trailing whitespace)** The current regex requires exact newlines before and after the block. Loosen it to handle CRLF and optional language tags; improves resilience to typical LLM outputs. Apply this diff: ```diff - json_match = re.search(r"```json\n(.*?)\n```", content, re.DOTALL) + # Accept ```json fences (case-insensitive), optional whitespace, and both LF/CRLF line endings + codeblock_pattern = r"```(?:json)?\s*\r?\n(.*?)\r?\n```" + json_match = re.search(codeblock_pattern, content, re.DOTALL | re.IGNORECASE)
231-243: Log why entries were rejected to aid debuggingWhen dropping entries, include minimal reason (missing field, wrong command, etc.). This will reduce churn when API outputs are slightly off.
For example:
- else: - print(f"⚠ Removing invalid {install_type} installation entry") + else: + print(f"⚠ Removing invalid {install_type} installation entry (failed schema validation)")Optionally extend validate_installation_entry to return (bool, reason) for richer messages.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
.github/workflows/generate-manifest.yml(1 hunks)scripts/get_manifest.py(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (CLAUDE.md)
Always format Python code with
ruff.
Files:
scripts/get_manifest.py
🧬 Code Graph Analysis (1)
scripts/get_manifest.py (3)
scripts/utils.py (1)
validate_arguments_in_installation(61-153)scripts/validate_manifest.py (2)
main(64-97)validate_manifest(32-48)src/mcpm/commands/install.py (1)
install(123-448)
🔇 Additional comments (5)
.github/workflows/generate-manifest.yml (1)
1-1: Workflow rename to production name looks goodClearer name. No functional changes introduced.
scripts/get_manifest.py (4)
40-50: Repo name parsing handles HTTPS/SSH forms — LGTMCovers https://github.com/owner/repo(.git) and [email protected]:owner/repo(.git) correctly.
67-70: Verify the Chat Completions endpoint accepts “content” as an array of partsYou’re sending content as [{"type":"text","text": "..."}] to a chat.completions-compatible endpoint, but later you expect a plain string in choices[0].message.content. Some OpenAI-compatible endpoints only accept content as a string in chat.completions. If the API rejects this, revert to a plain string.
If needed, switch to string content with this change:
- "content": [{"type": "text", "text": f"help me generate manifest json for this repo: {repo_url}"}], + "content": f"help me generate manifest json for this repo: {repo_url}",
265-267: Explicit UTF-8 encoding on write — LGTMPrevents mojibake for non-ASCII manifests.
276-304: Remember to run ruff format per repo guidelinesPer coding guidelines, Python code should be formatted with ruff. No blockers spotted, but please run ruff format to keep CI happy.
| response = requests.post(url, headers=headers, json=payload) | ||
| response.raise_for_status() | ||
|
|
||
| data = response.json() | ||
| content = data["choices"][0]["message"]["content"] | ||
|
|
||
| return extract_json_from_content(content) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add request timeout and handle JSON decoding errors
Without a timeout, the action can hang indefinitely. Also guard against invalid JSON bodies.
Apply this diff:
- response = requests.post(url, headers=headers, json=payload)
+ response = requests.post(url, headers=headers, json=payload, timeout=30)
response.raise_for_status()
- data = response.json()
+ data = response.json()
content = data["choices"][0]["message"]["content"]And expand error handling:
- except requests.RequestException as e:
+ except requests.RequestException as e:
print(f"API request failed: {e}")
return None
- except (KeyError, IndexError) as e:
+ except (KeyError, IndexError, ValueError, json.JSONDecodeError) as e:
print(f"Unexpected API response format: {e}")
return None📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| response = requests.post(url, headers=headers, json=payload) | |
| response.raise_for_status() | |
| data = response.json() | |
| content = data["choices"][0]["message"]["content"] | |
| return extract_json_from_content(content) | |
| try: | |
| response = requests.post(url, headers=headers, json=payload, timeout=30) | |
| response.raise_for_status() | |
| data = response.json() | |
| content = data["choices"][0]["message"]["content"] | |
| return extract_json_from_content(content) | |
| except requests.RequestException as e: | |
| print(f"API request failed: {e}") | |
| return None | |
| except (KeyError, IndexError, ValueError, json.JSONDecodeError) as e: | |
| print(f"Unexpected API response format: {e}") | |
| return None |
| def validate_installation_entry(install_type: str, entry: dict) -> bool: | ||
| """Validate a single installation entry against MCP registry schema.""" | ||
| # Required fields for all installation types | ||
| required_fields = {"type", "command", "args"} | ||
|
|
||
| # Check all required fields exist | ||
| if not all(field in entry for field in required_fields): | ||
| return False | ||
|
|
||
| # Type must match install_type | ||
| if entry.get("type") != install_type: | ||
| return False | ||
|
|
||
| # Type-specific validation based on MCP registry patterns | ||
| if install_type == "npm": | ||
| # npm type should use npx command with -y flag | ||
| if entry.get("command") != "npx": | ||
| return False | ||
| args = entry.get("args", []) | ||
| if not args or args[0] != "-y": | ||
| return False | ||
| elif install_type == "uvx": | ||
| # uvx type should use uvx command | ||
| if entry.get("command") != "uvx": | ||
| return False | ||
| elif install_type == "docker": | ||
| # docker type should use docker command | ||
| if entry.get("command") != "docker": | ||
| return False | ||
| args = entry.get("args", []) | ||
| if not args or args[0] != "run": | ||
| return False | ||
|
|
||
| return True | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validator incorrectly rejects HTTP installs and over-constrains types; add type-specific checks
Currently requires command/args for all types and only recognizes npm/uvx/docker. This will drop valid HTTP entries (which use url/headers) and can admit incomplete entries (e.g., npm with only ["-y"]). Make checks per type and ensure minimum args are present.
Apply this diff:
-def validate_installation_entry(install_type: str, entry: dict) -> bool:
- """Validate a single installation entry against MCP registry schema."""
- # Required fields for all installation types
- required_fields = {"type", "command", "args"}
-
- # Check all required fields exist
- if not all(field in entry for field in required_fields):
- return False
-
- # Type must match install_type
- if entry.get("type") != install_type:
- return False
-
- # Type-specific validation based on MCP registry patterns
- if install_type == "npm":
- # npm type should use npx command with -y flag
- if entry.get("command") != "npx":
- return False
- args = entry.get("args", [])
- if not args or args[0] != "-y":
- return False
- elif install_type == "uvx":
- # uvx type should use uvx command
- if entry.get("command") != "uvx":
- return False
- elif install_type == "docker":
- # docker type should use docker command
- if entry.get("command") != "docker":
- return False
- args = entry.get("args", [])
- if not args or args[0] != "run":
- return False
-
- return True
+def validate_installation_entry(install_type: str, entry: dict) -> bool:
+ """Validate a single installation entry against MCP registry schema."""
+ if not isinstance(entry, dict):
+ return False
+
+ # Entry type must match key
+ if entry.get("type") != install_type:
+ return False
+
+ # Type-specific validation
+ if install_type == "npm":
+ # Must be: command=npx, args=["-y", "<package>", ...]
+ if entry.get("command") != "npx":
+ return False
+ args = entry.get("args") or []
+ if len(args) < 2 or args[0] != "-y" or not isinstance(args[1], str) or not args[1]:
+ return False
+ return True
+
+ if install_type == "uvx":
+ # Must be: command=uvx, args=[...at least one token...]
+ if entry.get("command") != "uvx":
+ return False
+ args = entry.get("args") or []
+ if len(args) < 1:
+ return False
+ return True
+
+ if install_type == "docker":
+ # Must be: command=docker, args=["run", ...]
+ if entry.get("command") != "docker":
+ return False
+ args = entry.get("args") or []
+ if not args or args[0] != "run" or len(args) < 2:
+ # require at least an image or a flag after "run"
+ return False
+ return True
+
+ if install_type == "http":
+ # HTTP uses URL/headers instead of command/args
+ url = entry.get("url")
+ if not isinstance(url, str) or not url.strip():
+ return False
+ headers = entry.get("headers")
+ if headers is not None and not isinstance(headers, dict):
+ return False
+ return True
+
+ # Unknown types: reject
+ return False📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def validate_installation_entry(install_type: str, entry: dict) -> bool: | |
| """Validate a single installation entry against MCP registry schema.""" | |
| # Required fields for all installation types | |
| required_fields = {"type", "command", "args"} | |
| # Check all required fields exist | |
| if not all(field in entry for field in required_fields): | |
| return False | |
| # Type must match install_type | |
| if entry.get("type") != install_type: | |
| return False | |
| # Type-specific validation based on MCP registry patterns | |
| if install_type == "npm": | |
| # npm type should use npx command with -y flag | |
| if entry.get("command") != "npx": | |
| return False | |
| args = entry.get("args", []) | |
| if not args or args[0] != "-y": | |
| return False | |
| elif install_type == "uvx": | |
| # uvx type should use uvx command | |
| if entry.get("command") != "uvx": | |
| return False | |
| elif install_type == "docker": | |
| # docker type should use docker command | |
| if entry.get("command") != "docker": | |
| return False | |
| args = entry.get("args", []) | |
| if not args or args[0] != "run": | |
| return False | |
| return True | |
| def validate_installation_entry(install_type: str, entry: dict) -> bool: | |
| """Validate a single installation entry against MCP registry schema.""" | |
| if not isinstance(entry, dict): | |
| return False | |
| # Entry type must match key | |
| if entry.get("type") != install_type: | |
| return False | |
| # Type-specific validation | |
| if install_type == "npm": | |
| # Must be: command=npx, args=["-y", "<package>", ...] | |
| if entry.get("command") != "npx": | |
| return False | |
| args = entry.get("args") or [] | |
| if len(args) < 2 or args[0] != "-y" or not isinstance(args[1], str) or not args[1]: | |
| return False | |
| return True | |
| if install_type == "uvx": | |
| # Must be: command=uvx, args=[...at least one token...] | |
| if entry.get("command") != "uvx": | |
| return False | |
| args = entry.get("args") or [] | |
| if len(args) < 1: | |
| return False | |
| return True | |
| if install_type == "docker": | |
| # Must be: command=docker, args=["run", ...] | |
| if entry.get("command") != "docker": | |
| return False | |
| args = entry.get("args") or [] | |
| if not args or args[0] != "run" or len(args) < 2: | |
| # require at least an image or a flag after "run" | |
| return False | |
| return True | |
| if install_type == "http": | |
| # HTTP uses URL/headers instead of command/args | |
| url = entry.get("url") | |
| if not isinstance(url, str) or not url.strip(): | |
| return False | |
| headers = entry.get("headers") | |
| if headers is not None and not isinstance(headers, dict): | |
| return False | |
| return True | |
| # Unknown types: reject | |
| return False |
🤖 Prompt for AI Agents
In scripts/get_manifest.py around lines 90 to 124, the validator currently
enforces "command" and "args" for every install entry and only recognizes
npm/uvx/docker, which wrongly rejects HTTP installs and under/over-constrains
other types; change validation to branch by install_type and enforce
type-specific required fields: for "http" require "url" (and optional "headers")
and do not demand command/args; for "npm" require command == "npx" and args
length >= 2 with first arg "-y" and at least one package arg after the flags;
for "uvx" require command == "uvx" and no args constraint beyond presence if
needed; for "docker" require command == "docker" and args length >= 1 with first
arg "run"; also return False for unknown install_type instead of attempting
generic checks. Ensure presence checks use explicit required sets per type
rather than global required_fields.
| print("Validating installations against README...") | ||
| response = requests.post(url, headers=headers, json=payload) | ||
| response.raise_for_status() | ||
|
|
||
| data = response.json() | ||
| content = data["choices"][0]["message"]["content"] | ||
|
|
||
| validated_data = extract_json_from_content(content) | ||
| if validated_data and "installations" in validated_data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add timeout and JSON error handling to validation request
Same rationale as generate_manifest — avoid indefinite hangs and handle invalid JSON.
Apply this diff:
- response = requests.post(url, headers=headers, json=payload)
+ response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
- data = response.json()
+ data = response.json()
content = data["choices"][0]["message"]["content"]And broaden exception handling here too:
- except Exception as e:
- print(f"Error validating installations: {e}")
+ except (requests.RequestException, ValueError, json.JSONDecodeError, KeyError, IndexError) as e:
+ print(f"Error validating installations: {e}")
return manifest📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print("Validating installations against README...") | |
| response = requests.post(url, headers=headers, json=payload) | |
| response.raise_for_status() | |
| data = response.json() | |
| content = data["choices"][0]["message"]["content"] | |
| validated_data = extract_json_from_content(content) | |
| if validated_data and "installations" in validated_data: | |
| print("Validating installations against README...") | |
| response = requests.post(url, headers=headers, json=payload, timeout=60) | |
| response.raise_for_status() | |
| data = response.json() | |
| content = data["choices"][0]["message"]["content"] | |
| validated_data = extract_json_from_content(content) | |
| if validated_data and "installations" in validated_data: | |
| ... | |
| except (requests.RequestException, ValueError, json.JSONDecodeError, KeyError, IndexError) as e: | |
| print(f"Error validating installations: {e}") | |
| return manifest |
🤖 Prompt for AI Agents
In scripts/get_manifest.py around lines 221 to 229, the POST request that
validates installations should include a timeout and robust error handling: add
a timeout argument to requests.post to avoid indefinite hangs, wrap the response
parsing in a try/except that catches requests.RequestException and JSON decoding
errors (ValueError/JSONDecodeError) and handle them gracefully (log/raise a
clear error), and broaden the existing exception handling around
extract_json_from_content so invalid or missing JSON in content is caught and
handled rather than letting the script crash.
|
🎉 This PR is included in version 2.7.1 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
PR Type
Enhancement
Description
Enhanced manifest generation script with improved validation
Added schema-compliant installation entry validation
Updated workflow name from testing to production
Improved code formatting and error handling
Diagram Walkthrough
File Walkthrough
get_manifest.py
Enhanced manifest validation with schema compliancescripts/get_manifest.py
validate_installation_entry()function for schema validationvalidate_installations()with detailed schema examplesgenerate-manifest.yml
Updated workflow name for production.github/workflows/generate-manifest.yml
"Generate MCP Manifest"
Summary by CodeRabbit
Bug Fixes
Chores