Pull request overview
Migrates the LegacyX/Agents-at-Scale KYC onboarding demo into ARK as a deployable “bundle”, adding new MCP servers (web research, Perplexity ask, PDF extraction), demo data + seeding, and Helm/Argo workflow templates for the full KYC workflow set.
Changes:
- Added a new `demos/kyc-onboarding-bundle` Helm chart with agents, teams, RBAC, and Argo WorkflowTemplates.
- Introduced lightweight MCP servers for web research, Perplexity “ask”, and PDF extraction (plus Dockerfiles and Kubernetes manifests).
- Updated file-gateway and filesystem-mcp-server charts to support shared storage and configurable MCPServer naming; improved data seeder behavior in demo bundles.
Reviewed changes
Copilot reviewed 95 out of 99 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| services/file-gateway/chart/values.yaml | Adds runAsGroup and configurable filesystemMcp.mcpserverName. |
| services/file-gateway/chart/templates/versitygw-deployment.yaml | Hard-codes pod securityContext and adds initContainer to fix permissions. |
| services/file-gateway/chart/templates/filesystem-mcp-mcpserver.yaml | Allows overriding MCPServer metadata.name via values. |
| mcps/web-research-mcp/requirements.txt | Adds Python deps for a lightweight web-research MCP. |
| mcps/web-research-mcp/main.py | Implements research_ubo_web + compatibility stub tools. |
| mcps/web-research-mcp/k8s-deployment.yaml | Standalone k8s Service/Deployment + MCPServer for web-research MCP. |
| mcps/web-research-mcp/README.md | Usage and deployment notes for web-research MCP. |
| mcps/web-research-mcp/Dockerfile | Container build for web-research MCP. |
| mcps/perplexity-ask-mcp/requirements.txt | Adds Python deps for Perplexity ask MCP. |
| mcps/perplexity-ask-mcp/main.py | Implements LegacyX-compatible ask tool wrapper. |
| mcps/perplexity-ask-mcp/k8s-deployment.yaml | Standalone k8s Service/Deployment + MCPServer for Perplexity MCP. |
| mcps/perplexity-ask-mcp/README.md | Usage and deployment notes for Perplexity ask MCP. |
| mcps/perplexity-ask-mcp/Dockerfile | Container build for Perplexity ask MCP. |
| mcps/pdf-extraction-mcp/requirements.txt | Adds Python deps for PDF extraction MCP (PyMuPDF, httpx, etc.). |
| mcps/pdf-extraction-mcp/main.py | Implements PDF text extraction + LLM-based ownership analysis tools. |
| mcps/pdf-extraction-mcp/k8s-deployment.yaml | Standalone k8s Service/Deployment + MCPServer for PDF extraction MCP. |
| mcps/pdf-extraction-mcp/README.md | Usage and deployment notes for PDF extraction MCP. |
| mcps/pdf-extraction-mcp/Dockerfile | Container build for PDF extraction MCP. |
| mcps/filesystem-mcp-server/chart/values.yaml | Adds podSecurityContext knobs + support for using an existing PVC. |
| mcps/filesystem-mcp-server/chart/templates/pvc.yaml | Skips PVC creation when existingClaim is set. |
| mcps/filesystem-mcp-server/chart/templates/deployment.yaml | Wires optional podSecurityContext and supports mounting an existing PVC. |
| demos/namespaces/kyc-onboarding-demo.yaml | Adds a dedicated demo namespace manifest with ARK discovery labels/annotations. |
| demos/kyc-onboarding-bundle/scripts/run-workflow.sh | Adds a workflow launcher script for running templates by short name. |
| demos/kyc-onboarding-bundle/examples/lx-retrieve-ownership-structure-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-retrieve-key-controllers-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-retrieve-entities-vessels-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-requirements-and-standards-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-profile-initialization-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-profile-finalization-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-profile-enrichment-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-kyc-memo-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-initial-risk-assessment-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-blacklist-sanction-screening-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-assess-purpose-of-relationship-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/lx-adverse-media-screening-from-template.yaml | Adds example Workflow submission manifest. |
| demos/kyc-onboarding-bundle/examples/data/README.md | Documents which demo data is committed vs generated at runtime. |
| demos/kyc-onboarding-bundle/examples/data/Dockerfile | Adds container for demo data seeding image build. |
| demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_ownership_structure.yml | Adds prompt templates for ownership structure extraction. |
| demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_key_controllers.yml | Adds prompt templates for key controllers extraction. |
| demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_entities_vessels.yml | Adds prompt templates for entities/vessels extraction. |
| demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_adverse_media_screening.yml | Adds prompt templates for adverse media screening. |
| demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/mock-up-blacklist.json | Adds mock blacklist/sanctions inputs. |
| demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/prompts_web_data.yml | Adds web enrichment prompt templates. |
| demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/prompts_uk_gov.yml | Adds UK government/Companies House prompt templates. |
| demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/mock-up-email-abf.txt | Adds mock email thread input for profile initialization. |
| demos/kyc-onboarding-bundle/examples/data/.gitignore | Ignores generated runtime outputs under examples/data. |
| demos/kyc-onboarding-bundle/devspace.yaml | Adds DevSpace config for deploying the bundle chart. |
| demos/kyc-onboarding-bundle/chart/values.yaml | Adds main bundle configuration (agents, teams, workflow toggles, data seeder, file-gateway dependency config). |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-requirements-and-standards.yaml | Adds WorkflowTemplate for requirements/standards generation. |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-initialization.yaml | Adds WorkflowTemplate for extracting inquiry info from email. |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-finalization.yaml | Adds WorkflowTemplate to merge profile sections and render Markdown. |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-enrichment.yaml | Adds WorkflowTemplate for web + government profile enrichment. |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-kyc-memo.yaml | Adds WorkflowTemplate for generating the final KYC memo. |
| demos/kyc-onboarding-bundle/chart/templates/workflows/lx-assess-purpose-of-relationship.yaml | Adds WorkflowTemplate for purpose-of-relationship assessment. |
| demos/kyc-onboarding-bundle/chart/templates/teams/web-research-team.yaml | Defines the Web Research Team CR. |
| demos/kyc-onboarding-bundle/chart/templates/teams/scout-rag-team.yaml | Defines the Scout→RAG Team CR. |
| demos/kyc-onboarding-bundle/chart/templates/teams/doc-extraction-team.yaml | Defines the Document Extraction Team CR. |
| demos/kyc-onboarding-bundle/chart/templates/teams/consolidation-team.yaml | Defines the Consolidation Team CR. |
| demos/kyc-onboarding-bundle/chart/templates/teams/companies-house-team.yaml | Defines the Companies House Team CR. |
| demos/kyc-onboarding-bundle/chart/templates/teams/beneficial-owners-team.yaml | Defines Beneficial Owners team + supporting agent CRs. |
| demos/kyc-onboarding-bundle/chart/templates/rbac.yaml | Adds Role/RoleBinding for Argo workflows interacting with ARK CRDs and pods/exec. |
| demos/kyc-onboarding-bundle/chart/templates/namespace-labels.yaml | Adds optional Namespace resource for discovery labels/annotations (non-default namespaces). |
| demos/kyc-onboarding-bundle/chart/templates/models/default-model.yaml | Adds optional Model+Secret creation for a default modelRef. |
| demos/kyc-onboarding-bundle/chart/templates/data-seeder-job.yaml | Adds a post-install Job to upload sample data via file-gateway API. |
| demos/kyc-onboarding-bundle/chart/templates/agents/web-researcher-agent.yaml | Adds Web Researcher agent definition referencing web-research tools. |
| demos/kyc-onboarding-bundle/chart/templates/agents/web-planner-agent.yaml | Adds Web Planner agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/web-analyst-agent.yaml | Adds Web Analyst agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/scout-agent.yaml | Adds Scout agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/relevance-classification-agent.yaml | Adds relevance classification agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/rag-agent.yaml | Adds RAG agent definition referencing pdf-extraction tools. |
| demos/kyc-onboarding-bundle/chart/templates/agents/file-manager-agent.yaml | Adds file manager agent definition referencing filesystem tools. |
| demos/kyc-onboarding-bundle/chart/templates/agents/doc-planner-agent.yaml | Adds document planner agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/doc-analyst-agent.yaml | Adds document analyst agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/critic-agent.yaml | Adds critic agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/consolidation-planner-agent.yaml | Adds consolidation planner agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/consolidation-analyst-agent.yaml | Adds consolidation analyst agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/ch-planner-agent.yaml | Adds Companies House planner agent definition. |
| demos/kyc-onboarding-bundle/chart/templates/agents/beneficial-owner-tree-agent.yaml | Adds beneficial owner tree agent definition. |
| demos/kyc-onboarding-bundle/chart/Chart.yaml | Adds Helm chart metadata and file-gateway dependency. |
| demos/kyc-onboarding-bundle/chart/Chart.lock | Locks the file-gateway chart dependency. |
| demos/kyc-onboarding-bundle/README.md | Documents install/run steps, outputs, and prerequisites for the KYC bundle. |
| demos/cobol-modernization-bundle/chart/templates/data-seeder-job.yaml | Improves COBOL data seeder reliability with gateway wait + retries. |
| demos/cobol-modernization-bundle/README.md | Updates documentation to reflect new seeding/upload options and dependency behavior. |
| demos/cobol-modernization-bundle/Makefile | Adds options for reusing an existing file-gateway, skipping dep build, and improved upload/uninstall flows. |
| .gitignore | Adds ignores for env files, local artifacts, helm charts, and Claude files. |
Comment on lines +47 to +53

    GATEWAY="http://{{ .Values.dataSeeder.fileGateway }}"
    BASE="source_code_files"
    find /data -type f ! -name 'Dockerfile' ! -name 'README*' ! -path '*/.git*' | while read -r file; do
      rel="$${file#/data/}"
      dir="$$(dirname "$$rel")"
      prefix="$$BASE/$$dir/"
      echo "Uploading: $$rel -> $$prefix"
Comment on lines +15 to +19

    # Execution modes:
    # - fake: Use fake data (fast testing)
    # - seq: Sequential extraction (slower, more reliable)
    # - parallel: Parallel extraction (faster, default)
    # ============================================================
Comment on lines +61 to +64

    MCP_DATA="/data/aas-files"
    OUTPUT_DIR=$(dirname "$MCP_DATA/{{ "{{inputs.parameters.output-memo}}" }}")
    kubectl exec deployment/mcp-filesystem -- mkdir -p "$OUTPUT_DIR"
    echo "Created directory: $OUTPUT_DIR"

    USE_EXISTING_FILE_GATEWAY ?= false
    # Set to 1 to skip 'helm dependency build' (use if it hangs at "Deleting outdated charts"; run it once manually first)
    SKIP_DEP_BUILD ?= 0
    FILE_GATEWAY_SKIP_FLAGS := --set file-gateway.fileApi.enabled=false --set file-gateway.filesystemMcp.enabled=false --set file-gateway.versitygw.enabled=false --set file-gateway.versitygw.createSecret=false --set file-gateway.storage.enabled=false --set file-gateway.httpRoute.enabled=false --set file-gateway.fileApi.enabled=false --set file-gateway.filesystemMcp.enabled=false --set file-gateway.versitygw.enabled=false --set file-gateway.versitygw.createSecret=false --set file-gateway.storage.enabled=false --set file-gateway.httpRoute.enabled=false
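
The `FILE_GATEWAY_SKIP_FLAGS` value quoted above lists each of its six `--set` overrides twice. As a rough illustration only, the same flag string could be generated from a single list of keys so that each override appears once. The helper name `skip_flags` and the `SKIP_KEYS` list are hypothetical; only the key names and the `file-gateway` prefix come from the quoted Makefile line.

```python
# Hypothetical helper, not part of the PR: build the skip flags from one
# list of keys so each --set override is emitted exactly once.
SKIP_KEYS = [
    "fileApi.enabled",
    "filesystemMcp.enabled",
    "versitygw.enabled",
    "versitygw.createSecret",
    "storage.enabled",
    "httpRoute.enabled",
]


def skip_flags(keys, chart="file-gateway"):
    """Return a space-separated string of '--set <chart>.<key>=false' flags."""
    unique_keys = list(dict.fromkeys(keys))  # drop repeats, keep first-seen order
    return " ".join(f"--set {chart}.{key}=false" for key in unique_keys)


if __name__ == "__main__":
    print(skip_flags(SKIP_KEYS))
```

Passing the list twice (`skip_flags(SKIP_KEYS + SKIP_KEYS)`) yields the same string as passing it once, which is the property the duplicated Makefile value lacks.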
Comment on lines +31 to +35

    if pdf_path.startswith("http://") or pdf_path.startswith("https://"):
        # Download PDF from URL
        response = httpx.get(pdf_path, timeout=30.0)
        pdf_content = response.content
        doc = fitz.open(stream=pdf_content, filetype="pdf")
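
The excerpt above hands `response.content` to the PDF parser without first checking the HTTP status or the shape of the body. A minimal sketch of a pre-parse check follows. The function `check_pdf_download`, its size limit, and the decision to reject bodies that do not begin with the `%PDF-` header are assumptions made for illustration, not code from the PR.

```python
# Hypothetical pre-parse check, not part of the PR. It uses only the
# standard library so it can sit in front of any HTTP client.
DEFAULT_MAX_BYTES = 20 * 1024 * 1024  # assumed limit, chosen arbitrarily


def check_pdf_download(status_code: int, body: bytes,
                       max_bytes: int = DEFAULT_MAX_BYTES) -> bytes:
    """Return body if it looks like a successfully downloaded PDF.

    Raises ValueError for non-2xx responses, oversized bodies, and bodies
    that do not start with the PDF header bytes.
    """
    if not 200 <= status_code < 300:
        raise ValueError(f"download failed with HTTP {status_code}")
    if len(body) > max_bytes:
        raise ValueError(f"download is {len(body)} bytes, limit is {max_bytes}")
    if not body.startswith(b"%PDF-"):
        raise ValueError("response body does not start with a PDF header")
    return body
```

In the quoted code this would be called as `check_pdf_download(response.status_code, response.content)` before `fitz.open`. With httpx, calling `response.raise_for_status()` is another way to cover the status check alone.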
Comment on lines 20 to +24

    securityContext:
      {{- toYaml .Values.versitygw.podSecurityContext | nindent 8 }}
      runAsGroup: 1000
      fsGroup: 1000
      # runAsNonRoot omitted so fix-data-permissions init can run as root once
      runAsUser: 1000
Comment on lines +26 to +33

    - name: fix-data-permissions
      image: busybox:latest
      command: ['sh', '-c', 'chown -R 1000:1000 {{ .Values.versitygw.config.dataPath }} && chmod -R 775 {{ .Values.versitygw.config.dataPath }} && mkdir -p {{ .Values.versitygw.config.dataPath }}/{{ .Values.filesystemMcp.config.bucketName }} && chown 1000:1000 {{ .Values.versitygw.config.dataPath }}/{{ .Values.filesystemMcp.config.bucketName }}']
      volumeMounts:
        - name: data
          mountPath: {{ .Values.versitygw.config.dataPath }}
      securityContext:
        runAsUser: 0
Comment on lines +31 to +55

            echo "Waiting for file-gateway-api to be ready..."
            for i in $(seq 1 90); do
              if curl -sf "${GATEWAY}/health"; then echo "file-gateway-api is ready."; exit 0; fi
              echo " attempt $$i/90, retrying in 5s..."
              sleep 5
            done
            echo "Timeout waiting for file-gateway-api"; exit 1
    containers:
      - name: data-seeder
        image: "{{ .Values.dataSeeder.image.repository }}:{{ .Values.dataSeeder.image.tag }}"
        imagePullPolicy: {{ .Values.dataSeeder.image.pullPolicy }}
        command: ["/bin/sh", "-c"]
        args:
          - |
            set -e
            echo "Seeding KYC sample data via file-gateway-api..."
            GATEWAY="http://{{ .Values.dataSeeder.fileGateway }}"
            BASE="source_code_files"
            find /data -type f ! -name 'Dockerfile' ! -name 'README*' ! -path '*/.git*' | while read -r file; do
              rel="$${file#/data/}"
              dir="$$(dirname "$$rel")"
              prefix="$$BASE/$$dir/"
              echo "Uploading: $$rel -> $$prefix"
              curl --fail -sS -X POST "${GATEWAY}/files" -F "file=@$$file" -F "prefix=$$prefix"
            done
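
To make the upload loop above easier to reason about, here is a small Python model of its two pieces of logic: the `find` filter and the prefix computation. It is a sketch for checking expectations, not code from the PR. The function names `is_seedable` and `upload_prefix` are hypothetical, and `BASE` is simply the value that appears in the quoted script.

```python
import posixpath

BASE = "source_code_files"  # value copied from the quoted script


def is_seedable(path: str) -> bool:
    """Model the find filter: skip Dockerfile, README*, and any path containing '/.git'."""
    name = posixpath.basename(path)
    if name == "Dockerfile" or name.startswith("README"):
        return False
    return "/.git" not in path  # mirrors ! -path '*/.git*'


def upload_prefix(path: str, root: str = "/data") -> str:
    """Model rel/dir/prefix from the loop body for a file found under root."""
    rel = path[len(root) + 1:]                 # like ${file#/data/}
    directory = posixpath.dirname(rel) or "."  # dirname prints "." for a bare filename
    return f"{BASE}/{directory}/"


if __name__ == "__main__":
    sample = "/data/2-customer-due-diligence/input/mock-up-blacklist.json"
    print(is_seedable(sample), upload_prefix(sample))
```

Two details fall out of the model: a file directly under `/data` is uploaded with the prefix `source_code_files/./`, and the `*/.git*` pattern also excludes files such as `.gitignore`.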
Comment on lines +147 to +155

    global LLM_PROVIDER, LLM_MODEL
    original_provider = LLM_PROVIDER
    original_model = LLM_MODEL

    if extraction_provider:
        LLM_PROVIDER = extraction_provider
    if extraction_model:
        LLM_MODEL = extraction_model
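
The excerpt above temporarily overwrites the module-level `LLM_PROVIDER` and `LLM_MODEL` for one call. If the server ever handles overlapping calls (an assumption; the excerpt does not show how it is served), one call could observe another call's override. One alternative is to resolve a per-call configuration object and pass it down, sketched here. `LLMConfig`, `DEFAULT_CONFIG`, and `resolve_config` are hypothetical names, and the default values are placeholders.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LLMConfig:
    """Immutable provider/model pair passed explicitly to each call."""
    provider: str
    model: str


# Placeholder defaults; the real values would come from the server's settings.
DEFAULT_CONFIG = LLMConfig(provider="example-provider", model="example-model")


def resolve_config(extraction_provider: Optional[str] = None,
                   extraction_model: Optional[str] = None,
                   base: LLMConfig = DEFAULT_CONFIG) -> LLMConfig:
    """Return a per-call config; empty or missing overrides fall back to base."""
    return replace(
        base,
        provider=extraction_provider or base.provider,
        model=extraction_model or base.model,
    )
```

Because the config is frozen and built fresh per call, there is nothing to restore afterwards, so no cleanup step is needed if the call fails partway through.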
Comment on lines +23 to +55

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: pdf-extraction-mcp
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: pdf-extraction-mcp
      template:
        metadata:
          labels:
            app: pdf-extraction-mcp
        spec:
          serviceAccountName: pdf-extraction-mcp
          volumes:
            - name: mcp-filesystem-volume
              hostPath:
                path: /mnt/output
                type: DirectoryOrCreate
          containers:
            - name: pdf-extraction-mcp
              image: pdf-extraction-mcp:latest
              imagePullPolicy: Never
              ports:
                - containerPort: 8000
                  name: http
              volumeMounts:
                - name: mcp-filesystem-volume
                  mountPath: /mnt/output
                  readOnly: true
              env:
fix: update file-gateway chart version
Migrate KYC onboarding demo from Agents-at-Scale to ARK, converting all 12 workflows along with associated agents, teams, and MCP servers.