# Critical Issues in Zscaler MCP AgentCore Official Image

## Summary
We have identified critical bugs and missing production features in the official Zscaler MCP AgentCore Docker image (zscaler/zscaler-mcp-server:0.4.0-bedrock). These issues prevent the server from working correctly with standard MCP clients and fail to meet basic AWS security best practices.
**Image Reference:** `709825985650.dkr.ecr.us-east-1.amazonaws.com/zscaler/zscaler-mcp-server:0.4.0-bedrock`
## 🐛 Critical Bug: `tools/list` Response Format

### Issue

The `handle_tools_list()` function has multiple critical bugs that break MCP protocol compliance and prevent standard MCP clients from discovering available tools.

### Current Buggy Implementation

```python
async def handle_tools_list() -> Dict[str, Any]:
    tools = mcp_server.server.list_tools()  # ❌ Missing await
    return {
        "status": "success",
        "tool": "tools/list",
        "result": [json.dumps(tools, indent=2)]  # ❌ Double serialization
    }
```

### Problems
- **Missing `await` keyword**: the async call is not awaited, returning a coroutine object instead of the tools
- **Double JSON serialization**: tools are serialized to a JSON string, then wrapped in an array
- **Wrong response format**: returns `{"status": "success", "result": [...]}` instead of the MCP-compliant `{"tools": [...]}`
- **Object serialization failure**: attempts to serialize Python `Tool` objects without converting them to dictionaries
Actual Output (Broken)
{
"status": "success",
"tool": "tools/list",
"result": [
"[{\"name\": \"zpa_list_app_segments\", ...}]" // β String, not object
]
}Expected Output (MCP Protocol)
```json
{
  "tools": [
    {
      "name": "zpa_list_app_segments",
      "description": "List all application segments in ZPA",
      "inputSchema": {
        "type": "object",
        "properties": {...}
      }
    }
  ]
}
```

### Impact
- ❌ Breaks all standard MCP clients (Claude Desktop, QuickSuite, etc.)
- ❌ Violates the MCP protocol specification
- ❌ Tools cannot be discovered or invoked
- ⚠️ May work with Genesis (which wraps everything), masking the bug
### Proposed Fix

```python
async def handle_tools_list() -> Dict[str, Any]:
    # Get the list of tools from the MCP server
    tools = await mcp_server.server.list_tools()  # ✅ Added await

    # Convert Tool objects to dictionaries for JSON serialization
    tools_list = []
    for tool in tools:
        tool_dict = {
            "name": tool.name,
            "description": tool.description,
        }
        # MCP spec uses inputSchema (camelCase)
        if hasattr(tool, 'inputSchema'):
            tool_dict["inputSchema"] = tool.inputSchema
        tools_list.append(tool_dict)

    # Return MCP protocol format: {"tools": [...]}
    return {"tools": tools_list}  # ✅ Correct format
```

## 🔒 Critical Security Issue: No AWS Secrets Manager Support
### Issue

The official image requires Zscaler API credentials to be passed as plain-text environment variables, which violates AWS security best practices and fails compliance requirements.

### Current Implementation

```dockerfile
# Credentials must be passed as plain-text environment variables
ENV ZSCALER_CLIENT_ID=iq7u4xxxxxk6
ENV ZSCALER_CLIENT_SECRET=supersecretvalue123  # ❌ Plain text!
ENV ZSCALER_CUSTOMER_ID=2xxxxxxxxxxxx8
```

### Security Risks
| Risk | Impact |
|---|---|
| ECS Task Definition Exposure | Anyone with ecs:DescribeTaskDefinition can read secrets |
| CloudFormation Exposure | Secrets visible in stack parameters and outputs |
| Container Inspection | docker inspect reveals all environment variables |
| No Encryption at Rest | Credentials stored in plain text in AWS APIs |
| No Audit Trail | No CloudTrail logs for credential access |
| No Rotation Support | Requires redeployment to update credentials |
| Compliance Failures | Fails SOC2, PCI-DSS, HIPAA, ISO 27001 audits |
Example Exposure
# Anyone with ECS read permissions can extract secrets
aws ecs describe-task-definition --task-definition zscaler-mcp
# Output exposes credentials in plain text:
{
"environment": [
{"name": "ZSCALER_CLIENT_SECRET", "value": "supersecretvalue123"}
]
}Proposed Solution
Add AWS Secrets Manager integration:

```python
import json
import logging
import os

import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)

# Fetch credentials from Secrets Manager if configured
secret_arn = os.environ.get('ZSCALER_SECRET_ARN')
if secret_arn:
    try:
        region = secret_arn.split(':')[3]
        client = boto3.client('secretsmanager', region_name=region)
        response = client.get_secret_value(SecretId=secret_arn)
        secret = json.loads(response['SecretString'])
        # Set all secret keys as environment variables
        for key, value in secret.items():
            os.environ[key] = str(value)
        logger.info("Loaded credentials from Secrets Manager")
    except ClientError as e:
        logger.error(f"Failed to fetch credentials: {e}")
        raise
```

**Benefits:**
- ✅ Credentials encrypted at rest with AWS KMS
- ✅ IAM-based access control
- ✅ CloudTrail audit logging
- ✅ Automatic rotation support
- ✅ Compliance with SOC2, PCI-DSS, HIPAA
- ✅ Zero plain-text credential exposure
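A note on the region lookup in the proposed loader: the region is the fourth colon-separated field of the ARN, so no extra configuration is needed. A minimal sketch of that parsing step (the helper name and example ARN are illustrative, not from the image):

```python
def region_from_secret_arn(arn: str) -> str:
    # ARN layout: arn:partition:service:region:account-id:resource...
    parts = arn.split(":")
    if len(parts) < 6:
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[3]

# Illustrative ARN; Secrets Manager appends a random suffix to the secret name.
arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:zscaler/mcp-AbCdEf"
print(region_from_secret_arn(arn))  # us-east-1
```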
## ⚠️ Missing Feature: MCP Protocol Negotiation

### Issue

The official image does not handle the MCP `initialize` and `ping` methods, preventing proper protocol negotiation with MCP clients.

### Missing Implementation

```python
# No handling for these required MCP methods:
# - initialize (protocol version negotiation)
# - ping (health check)
```

### Impact
- ❌ Cannot negotiate protocol versions with clients
- ❌ No support for the MCP 2024-11-05 or 2025-03-26 protocols
- ❌ Breaks the standard MCP client handshake
- ❌ No health check mechanism for MCP clients
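For reference, this is the handshake shape a compliant server would need to answer. The field names follow the MCP specification; the concrete values below are illustrative:

```python
import json

# Illustrative initialize exchange (JSON-RPC 2.0 framing per the MCP spec).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

# The server answers with a protocol version and its capabilities; a later
# ping request is answered with an empty result object.
response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": "zscaler-mcp", "version": "1.0.0"},
    },
}
print(json.dumps(response))
```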
Proposed Solution
if method == "ping":
logger.info("Handling MCP ping request")
result = {} # MCP spec: ping returns empty object
elif method == "initialize":
logger.info("Handling MCP initialize request")
# Support both 2024-11-05 and 2025-03-26 protocol versions
client_protocol = payload.get("params", {}).get("protocolVersion", "2024-11-05")
logger.info(f"Client requested protocol version: {client_protocol}")
result = {
"protocolVersion": client_protocol, # Echo back client's version
"capabilities": {"tools": {}},
"serverInfo": {"name": "zscaler-mcp", "version": "1.0.0"}
}β οΈ Missing Feature: Standard MCP Client Support
### Issue

The official image only supports the AWS Genesis NDJSON format and does not support standard MCP clients that use JSON-RPC or SSE (Server-Sent Events).

### Current Limitation

```python
# Only returns Genesis NDJSON format
return StreamingResponse(
    generate_streaming_response(response_data, session_id),
    media_type="application/x-ndjson",  # Genesis only
)
```
### Impact

- ❌ Cannot be used with Claude Desktop
- ❌ Cannot be used with QuickSuite
- ❌ Cannot be used with standard MCP testing tools
- ❌ Limited to the AWS Genesis runtime only
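The negotiation logic can be sketched independently of the web framework; the helper below is illustrative, not part of the official image:

```python
def choose_wire_format(payload: dict, accept_header: str) -> str:
    """Decide how to answer: standard MCP clients send JSON-RPC 2.0 and may
    request SSE via the Accept header; anything else is treated as Genesis
    NDJSON traffic."""
    if payload.get("jsonrpc") == "2.0":
        if "text/event-stream" in accept_header:
            return "sse"
        return "json"
    return "ndjson"

print(choose_wire_format({"jsonrpc": "2.0", "method": "tools/list"}, "text/event-stream"))  # sse
print(choose_wire_format({"jsonrpc": "2.0", "method": "tools/list"}, "application/json"))   # json
print(choose_wire_format({"input": "hello"}, ""))                                           # ndjson
```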
### Proposed Solution

Add content negotiation based on the request format:

```python
# Check if this is a standard MCP client or Genesis
is_jsonrpc = payload.get("jsonrpc") == "2.0"
accept_header = request.headers.get("accept", "")
prefers_sse = "text/event-stream" in accept_header

if is_jsonrpc:
    # Standard JSON-RPC response for MCP clients
    response_content = {
        "jsonrpc": "2.0",
        "id": payload.get("id"),
        "result": result
    }
    if prefers_sse:
        # SSE format for streaming clients
        async def sse_generator():
            yield f"data: {json.dumps(response_content)}\n\n"
        return StreamingResponse(
            sse_generator(),
            media_type="text/event-stream",
        )
    else:
        # Standard JSON response
        return JSONResponse(content=response_content)
else:
    # Genesis streaming NDJSON response
    return StreamingResponse(
        generate_streaming_response(response_data, session_id),
        media_type="application/x-ndjson",
    )
```

## ⚠️ Missing Feature: Service Filtering
### Issue

The official image loads all Zscaler services (ZPA, ZIA, ZDX, ZCC, ZIdentity) with no ability to filter, which can exceed MCP client tool limits.

### Current Implementation

```python
# Always loads ALL services
mcp_server = ZscalerMCPServer()
```

### Tool Count Problem
The Zscaler MCP server exposes 100+ tools across all services:
- ZPA: ~30 tools
- ZIA: ~40 tools
- ZDX: ~15 tools
- ZCC: ~10 tools
- ZIdentity: ~10 tools
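Summing these estimates shows why filtering matters; a quick sketch of the arithmetic (the counts are the approximations above):

```python
# Approximate per-service tool counts from the estimates above.
TOOL_COUNTS = {"zpa": 30, "zia": 40, "zdx": 15, "zcc": 10, "zidentity": 10}

def total_tools(services):
    return sum(TOOL_COUNTS[s] for s in services)

print(total_tools(TOOL_COUNTS))     # all services: 105 tools
print(total_tools(["zpa"]))         # 30 tools, fits under a ~50-tool client limit
print(total_tools(["zpa", "zia"]))  # 70 tools, exceeds it
```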
Many MCP clients have hard limits on the number of tools they can handle:

| MCP Client | Tool Limit | Result with All Services |
|---|---|---|
| Claude Desktop | ~50 tools | ❌ Fails to load or truncates |
| Some Genesis Agents | ~100 tools | |
| QuickSuite | 100 tools | ❌ Silently fails |
| Custom Clients | Varies | ❌ May fail silently |
### Real-World Impact

When testing with Claude Desktop:

```text
# Without filtering (100+ tools)
❌ Error: "Too many tools provided. Maximum 50 tools supported."

# With filtering to only ZPA (30 tools)
✅ Success: All tools loaded and functional
```

### Impact
- 🚫 **Client Compatibility**: exceeds tool limits in Claude Desktop and other clients
- 💰 **Higher AWS costs**: Bedrock charges per tool invocation
- ⏱️ **Slower startup**: initializes all services even if unused
- 🔧 **No flexibility**: cannot disable unused services
- 🔍 **Harder debugging**: more tools to troubleshoot
- ⚡ **Performance degradation**: large tool lists slow down client UX
### Proposed Solution

```python
# Read the ZSCALER_MCP_SERVICES environment variable to filter services
services_env = os.environ.get('ZSCALER_MCP_SERVICES', '')
if services_env:
    enabled_services = set(s.strip() for s in services_env.split(',') if s.strip())
    logger.info(f"Filtering to services: {enabled_services}")
    mcp_server = ZscalerMCPServer(enabled_services=enabled_services)
else:
    logger.info("Loading all services")
    mcp_server = ZscalerMCPServer()
```

**Usage:**

```shell
# Only enable ZPA and ZIA
ZSCALER_MCP_SERVICES="zpa,zia"
```

## ⚠️ Missing Feature: Configurable Logging
### Issue

The official image has fixed INFO-level logging with no ability to increase verbosity for debugging or decrease it for production.

### Current Implementation

```python
logging.basicConfig(
    level=logging.INFO,  # ❌ Fixed, cannot change
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
```

### Impact
- 🔍 **Harder debugging**: cannot enable DEBUG logs
- 🔍 **No traffic inspection**: cannot log HTTP headers/bodies
- 🔍 **Limited troubleshooting**: missing critical diagnostic information
### Proposed Solution

```python
# Configure logging with an environment variable
log_level = os.environ.get('LOG_LEVEL', 'INFO').upper()
logging.basicConfig(
    level=getattr(logging, log_level, logging.INFO),
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger.info(f"Logging level set to: {log_level}")

# Optional HTTP traffic logging middleware
@app.middleware("http")
async def log_request_response(request: Request, call_next):
    if os.environ.get('LOG_HEADERS', 'false').lower() == 'true':
        logger.info(f"Request: {request.method} {request.url.path}")
        logger.info(f"Headers: {dict(request.headers)}")
    response = await call_next(request)
    return response
```

**Usage:**

```shell
# Enable debug logging
LOG_LEVEL=DEBUG

# Enable HTTP traffic logging
LOG_HEADERS=true
```

## 📊 Summary of Issues
| Issue | Severity | Impact | Status |
|---|---|---|---|
| `tools/list` bug | 🔴 Critical | Breaks MCP clients | Not fixed |
| No Secrets Manager | 🔴 Critical | Security vulnerability | Not implemented |
| No protocol negotiation | 🟡 High | Breaks handshake | Not implemented |
| Genesis-only support | 🟡 High | Limited compatibility | Not implemented |
| No service filtering | 🟡 Medium | Higher costs | Not implemented |
| Fixed logging | 🟡 Medium | Harder debugging | Not implemented |
## 🔧 Recommended Actions

1. **Immediate (Critical):**
   - Fix the `tools/list` async/await and response format bugs
   - Add AWS Secrets Manager support for credential management
2. **High Priority:**
   - Implement the MCP `initialize` and `ping` methods
   - Add JSON-RPC and SSE support for standard MCP clients
3. **Medium Priority:**
   - Add service filtering via an environment variable
   - Implement configurable logging levels
## 📋 Testing

We have validated these issues by:

- Extracting the official Docker image filesystem from the marketplace image
- Comparing it with a working production implementation
- Testing with multiple MCP clients (Genesis, QuickSuite, Claude Desktop, and standard MCP tooling)
- Reviewing MCP protocol specification compliance

**Test Environment:**

- Image: `zscaler/zscaler-mcp-server:0.4.0-bedrock`
- Platform: `linux/arm64`
- Extracted: `/tmp/zscaler-official/app/web_server.py`
## 🤝 Contributing

We have working implementations of all of these fixes and are happy to contribute them back to the project. Please let us know the preferred contribution process.
## 💡 Recommendation: Make the AgentCore Build Public

### Current Situation

The AgentCore/Bedrock-specific build is currently only available as a pre-built Docker image in AWS Marketplace ECR:

- Image: `709825985650.dkr.ecr.us-east-1.amazonaws.com/zscaler/zscaler-mcp-server:0.4.0-bedrock`
- Source code: not available in the public repository
- Build process: undocumented and opaque
### The Inconsistency

This approach is particularly puzzling given that the rest of the Zscaler MCP project is fully open source:

| Component | Status | Repository |
|---|---|---|
| Core MCP Server | ✅ Open Source | zscaler/zscaler-sdk-python-mcp |
| All Tool Implementations | ✅ Open Source | Public GitHub |
| ZPA Tools | ✅ Open Source | Public GitHub |
| ZIA Tools | ✅ Open Source | Public GitHub |
| ZDX Tools | ✅ Open Source | Public GitHub |
| ZCC Tools | ✅ Open Source | Public GitHub |
| ZIdentity Tools | ✅ Open Source | Public GitHub |
| AgentCore Wrapper | ❌ Closed | Only pre-built image |

Why hide only the AgentCore wrapper when everything else is public? The wrapper is just a thin HTTP adapter (~300 lines) that translates Genesis NDJSON to MCP protocol calls. It contains no proprietary logic, algorithms, or competitive advantages; it is purely infrastructure glue code.

This selective opacity creates an inconsistent and confusing developer experience: users can see and modify 95% of the codebase but are blocked from understanding or improving the final 5% needed for AWS deployment.
### Why This Is Problematic

1. **Security Concerns**
   - Users cannot audit the build process
   - No way to verify what is actually in the container
   - Cannot validate security practices
   - Difficult to assess supply chain risks
2. **Lack of Transparency**
   - Build process is hidden from users
   - Cannot understand how the Genesis integration works
   - No visibility into dependencies or configurations
   - Makes troubleshooting nearly impossible
3. **Easy to Reverse Engineer Anyway**
   - Container images can be easily extracted (as we demonstrated)
   - `docker export` reveals all files and code
   - Obscurity provides no real protection
   - Only creates friction for legitimate users
4. **Hinders Adoption**
   - Enterprise customers require source code review
   - Security teams cannot approve "black box" containers
   - Developers cannot learn from or improve the implementation
   - Community contributions are blocked
5. **Prevents Bug Fixes**
   - Users discover bugs but cannot submit fixes
   - No way to validate proposed solutions
   - Slows down issue resolution
   - Forces users to maintain private forks
### Recommended Approach

Make the AgentCore/Genesis wrapper code publicly available in the repository:

```text
zscaler-mcp/
├── src/
│   └── zscaler_mcp/
│       ├── server.py              # Core MCP server (already public)
│       ├── tools/                 # Tool implementations (already public)
│       └── web_server.py          # Genesis wrapper (currently hidden)
├── docker/
│   ├── Dockerfile                 # Build instructions (currently hidden)
│   └── requirements.txt           # Dependencies (currently hidden)
└── docs/
    └── agentcore-deployment.md    # Deployment guide (currently missing)
```
### Benefits of Making It Public

- ✅ **Increased Trust**: users can audit the code and build process
- ✅ **Better Security**: the community can identify and report vulnerabilities
- ✅ **Faster Bug Fixes**: users can submit PRs for issues they discover
- ✅ **Improved Quality**: more eyes on the code lead to better implementations
- ✅ **Easier Adoption**: enterprise security teams can approve the solution
- ✅ **Community Growth**: developers can learn from and contribute to the project
- ✅ **Better Documentation**: the build process becomes self-documenting
- ✅ **Reduced Support Burden**: users can troubleshoot and fix issues themselves
### Critical for Enterprise Adoption

The current closed-source approach creates significant barriers for enterprise customers.

#### Enterprise Security Requirements

Most enterprise organizations have mandatory security policies that require:
1. **Source Code Review**
   - Security teams must review all code before deployment
   - Cannot approve "black box" containers from unknown sources
   - Need to verify no malicious code or backdoors exist
   - Must validate compliance with internal security standards
2. **Custom Container Builds**
   - Enterprises build containers in their own CI/CD pipelines
   - Use internal base images with approved security patches
   - Apply company-specific hardening and configurations
   - Sign containers with internal certificate authorities
3. **Vulnerability Scanning**
   - Must scan all dependencies for known CVEs
   - Cannot use pre-built images without scanning the source
   - Need to rebuild with patched dependencies when vulnerabilities are discovered
   - Require SBOM (Software Bill of Materials) generation
4. **Supply Chain Security**
   - Must verify the provenance of all code and dependencies
   - Cannot trust external container registries
   - Need reproducible builds from source
   - Require signed commits and verified contributors
#### Real-World Enterprise Blockers

Without access to the source code and Dockerfile, enterprises cannot:

```shell
# ❌ Cannot build from source with internal base images
docker build -t internal-registry/zscaler-mcp:1.0.0 \
  --build-arg BASE_IMAGE=internal-registry/python:3.12-hardened \
  .

# ❌ Cannot scan dependencies before deployment
trivy image zscaler-mcp:latest
snyk container test zscaler-mcp:latest

# ❌ Cannot generate an SBOM for compliance
syft zscaler-mcp:latest -o spdx-json > sbom.json

# ❌ Cannot rebuild with patched dependencies
pip install --upgrade cryptography==46.0.2  # CVE fix
docker build -t zscaler-mcp:patched .

# ❌ Cannot apply internal security policies
# - Remove unnecessary packages
# - Add internal CA certificates
# - Configure internal logging/monitoring
# - Apply network security policies
```

#### Enterprise Approval Process
A typical enterprise security approval workflow:

```text
1. Developer requests to use Zscaler MCP
   ↓
2. Security team reviews source code         → ❌ BLOCKED: no source available
   ↓
3. Security team scans for vulnerabilities   → ❌ BLOCKED: cannot scan pre-built image
   ↓
4. Security team builds in internal pipeline → ❌ BLOCKED: no Dockerfile
   ↓
5. Security team signs and approves          → ❌ BLOCKED: cannot proceed
   ↓
6. Deployment to production                  → ❌ REJECTED
```

**Result:** Enterprise customers cannot adopt the solution, regardless of technical merit.
#### Competitive Disadvantage

By keeping the AgentCore build closed, Zscaler is:

- ❌ Losing enterprise customers to competitors with open-source solutions
- ❌ Limiting market reach to only small companies without strict security policies
- ❌ Creating a support burden from enterprises trying to reverse-engineer the container
- ❌ Reducing trust in Zscaler's commitment to transparency and security
- ❌ Blocking partnerships with security-conscious organizations
#### The Solution Is Simple

Making the source code and Dockerfile public enables enterprises to:

```shell
# ✅ Clone the repository
git clone https://github.com/zscaler/zscaler-mcp.git
cd zscaler-mcp

# ✅ Review the code
security-team-review src/

# ✅ Build with an internal base image
docker build -t internal-registry/zscaler-mcp:1.0.0 \
  --build-arg BASE_IMAGE=internal-registry/python:3.12-hardened \
  -f docker/Dockerfile .

# ✅ Scan for vulnerabilities
trivy image internal-registry/zscaler-mcp:1.0.0

# ✅ Generate an SBOM
syft internal-registry/zscaler-mcp:1.0.0 -o spdx-json > sbom.json

# ✅ Sign and deploy
docker trust sign internal-registry/zscaler-mcp:1.0.0
kubectl apply -f deployment.yaml
```

This is standard practice for enterprise software and a requirement for serious adoption.
### Precedent

Most successful MCP server implementations are fully open source:

- **Anthropic's official MCP servers**: fully open source (GitHub, Filesystem, etc.)
- **AWS's own MCP implementations**: fully open source
- **Community MCP servers**: fully open source
- **Zscaler's own MCP core**: fully open source (except the AgentCore wrapper)

There is no competitive advantage to hiding the Genesis wrapper code; it is a thin adapter layer that follows standard MCP protocol patterns. If Zscaler was comfortable open-sourcing the entire MCP server implementation, including all of the Zscaler API integrations, why hide the trivial HTTP wrapper?
### Conclusion

We strongly urge Zscaler to make the AgentCore build publicly available in the repository. The current model of distributing only pre-built images creates unnecessary friction, reduces trust, and hinders adoption. Since the container can easily be reverse-engineered anyway (as we have demonstrated), obscurity provides no real security; it only makes it harder for legitimate users to understand, validate, and improve the implementation.

Making the code public would align with industry best practices, increase community trust, and accelerate adoption of the Zscaler MCP server in AWS environments.