feat: kyc onboarding bundle #123

Open
antoatta85 wants to merge 7 commits into main from feat/kyc-onboard

@antoatta85

Migrate KYC onboarding demo from Agents-at-Scale to ARK, converting all 12 workflows along with associated agents, teams, and MCP servers.

Copilot AI left a comment

Pull request overview

Migrates the LegacyX/Agents-at-Scale KYC onboarding demo into ARK as a deployable “bundle”, adding new MCP servers (web research, Perplexity ask, PDF extraction), demo data + seeding, and Helm/Argo workflow templates for the full KYC workflow set.

Changes:

  • Added a new demos/kyc-onboarding-bundle Helm chart with agents, teams, RBAC, and Argo WorkflowTemplates.
  • Introduced lightweight MCP servers for web research, Perplexity “ask”, and PDF extraction (plus Dockerfiles and Kubernetes manifests).
  • Updated file-gateway and filesystem-mcp-server charts to support shared storage and configurable MCPServer naming; improved data seeder behavior in demo bundles.

Reviewed changes

Copilot reviewed 95 out of 99 changed files in this pull request and generated 13 comments.

File Description
services/file-gateway/chart/values.yaml Adds runAsGroup and configurable filesystemMcp.mcpserverName.
services/file-gateway/chart/templates/versitygw-deployment.yaml Hard-codes pod securityContext and adds initContainer to fix permissions.
services/file-gateway/chart/templates/filesystem-mcp-mcpserver.yaml Allows overriding MCPServer metadata.name via values.
mcps/web-research-mcp/requirements.txt Adds Python deps for a lightweight web-research MCP.
mcps/web-research-mcp/main.py Implements research_ubo_web + compatibility stub tools.
mcps/web-research-mcp/k8s-deployment.yaml Standalone k8s Service/Deployment + MCPServer for web-research MCP.
mcps/web-research-mcp/README.md Usage and deployment notes for web-research MCP.
mcps/web-research-mcp/Dockerfile Container build for web-research MCP.
mcps/perplexity-ask-mcp/requirements.txt Adds Python deps for Perplexity ask MCP.
mcps/perplexity-ask-mcp/main.py Implements LegacyX-compatible ask tool wrapper.
mcps/perplexity-ask-mcp/k8s-deployment.yaml Standalone k8s Service/Deployment + MCPServer for Perplexity MCP.
mcps/perplexity-ask-mcp/README.md Usage and deployment notes for Perplexity ask MCP.
mcps/perplexity-ask-mcp/Dockerfile Container build for Perplexity ask MCP.
mcps/pdf-extraction-mcp/requirements.txt Adds Python deps for PDF extraction MCP (PyMuPDF, httpx, etc.).
mcps/pdf-extraction-mcp/main.py Implements PDF text extraction + LLM-based ownership analysis tools.
mcps/pdf-extraction-mcp/k8s-deployment.yaml Standalone k8s Service/Deployment + MCPServer for PDF extraction MCP.
mcps/pdf-extraction-mcp/README.md Usage and deployment notes for PDF extraction MCP.
mcps/pdf-extraction-mcp/Dockerfile Container build for PDF extraction MCP.
mcps/filesystem-mcp-server/chart/values.yaml Adds podSecurityContext knobs + support for using an existing PVC.
mcps/filesystem-mcp-server/chart/templates/pvc.yaml Skips PVC creation when existingClaim is set.
mcps/filesystem-mcp-server/chart/templates/deployment.yaml Wires optional podSecurityContext and supports mounting an existing PVC.
demos/namespaces/kyc-onboarding-demo.yaml Adds a dedicated demo namespace manifest with ARK discovery labels/annotations.
demos/kyc-onboarding-bundle/scripts/run-workflow.sh Adds a workflow launcher script for running templates by short name.
demos/kyc-onboarding-bundle/examples/lx-retrieve-ownership-structure-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-retrieve-key-controllers-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-retrieve-entities-vessels-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-requirements-and-standards-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-profile-initialization-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-profile-finalization-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-profile-enrichment-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-kyc-memo-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-initial-risk-assessment-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-blacklist-sanction-screening-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-assess-purpose-of-relationship-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/lx-adverse-media-screening-from-template.yaml Adds example Workflow submission manifest.
demos/kyc-onboarding-bundle/examples/data/README.md Documents which demo data is committed vs generated at runtime.
demos/kyc-onboarding-bundle/examples/data/Dockerfile Adds container for demo data seeding image build.
demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_ownership_structure.yml Adds prompt templates for ownership structure extraction.
demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_key_controllers.yml Adds prompt templates for key controllers extraction.
demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_entities_vessels.yml Adds prompt templates for entities/vessels extraction.
demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/prompts_adverse_media_screening.yml Adds prompt templates for adverse media screening.
demos/kyc-onboarding-bundle/examples/data/2-customer-due-diligence/input/mock-up-blacklist.json Adds mock blacklist/sanctions inputs.
demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/prompts_web_data.yml Adds web enrichment prompt templates.
demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/prompts_uk_gov.yml Adds UK government/Companies House prompt templates.
demos/kyc-onboarding-bundle/examples/data/1-customer-profile-initialization/input/mock-up-email-abf.txt Adds mock email thread input for profile initialization.
demos/kyc-onboarding-bundle/examples/data/.gitignore Ignores generated runtime outputs under examples/data.
demos/kyc-onboarding-bundle/devspace.yaml Adds DevSpace config for deploying the bundle chart.
demos/kyc-onboarding-bundle/chart/values.yaml Adds main bundle configuration (agents, teams, workflow toggles, data seeder, file-gateway dependency config).
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-requirements-and-standards.yaml Adds WorkflowTemplate for requirements/standards generation.
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-initialization.yaml Adds WorkflowTemplate for extracting inquiry info from email.
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-finalization.yaml Adds WorkflowTemplate to merge profile sections and render Markdown.
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-profile-enrichment.yaml Adds WorkflowTemplate for web + government profile enrichment.
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-kyc-memo.yaml Adds WorkflowTemplate for generating the final KYC memo.
demos/kyc-onboarding-bundle/chart/templates/workflows/lx-assess-purpose-of-relationship.yaml Adds WorkflowTemplate for purpose-of-relationship assessment.
demos/kyc-onboarding-bundle/chart/templates/teams/web-research-team.yaml Defines the Web Research Team CR.
demos/kyc-onboarding-bundle/chart/templates/teams/scout-rag-team.yaml Defines the Scout→RAG Team CR.
demos/kyc-onboarding-bundle/chart/templates/teams/doc-extraction-team.yaml Defines the Document Extraction Team CR.
demos/kyc-onboarding-bundle/chart/templates/teams/consolidation-team.yaml Defines the Consolidation Team CR.
demos/kyc-onboarding-bundle/chart/templates/teams/companies-house-team.yaml Defines the Companies House Team CR.
demos/kyc-onboarding-bundle/chart/templates/teams/beneficial-owners-team.yaml Defines Beneficial Owners team + supporting agent CRs.
demos/kyc-onboarding-bundle/chart/templates/rbac.yaml Adds Role/RoleBinding for Argo workflows interacting with ARK CRDs and pods/exec.
demos/kyc-onboarding-bundle/chart/templates/namespace-labels.yaml Adds optional Namespace resource for discovery labels/annotations (non-default namespaces).
demos/kyc-onboarding-bundle/chart/templates/models/default-model.yaml Adds optional Model+Secret creation for a default modelRef.
demos/kyc-onboarding-bundle/chart/templates/data-seeder-job.yaml Adds a post-install Job to upload sample data via file-gateway API.
demos/kyc-onboarding-bundle/chart/templates/agents/web-researcher-agent.yaml Adds Web Researcher agent definition referencing web-research tools.
demos/kyc-onboarding-bundle/chart/templates/agents/web-planner-agent.yaml Adds Web Planner agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/web-analyst-agent.yaml Adds Web Analyst agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/scout-agent.yaml Adds Scout agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/relevance-classification-agent.yaml Adds relevance classification agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/rag-agent.yaml Adds RAG agent definition referencing pdf-extraction tools.
demos/kyc-onboarding-bundle/chart/templates/agents/file-manager-agent.yaml Adds file manager agent definition referencing filesystem tools.
demos/kyc-onboarding-bundle/chart/templates/agents/doc-planner-agent.yaml Adds document planner agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/doc-analyst-agent.yaml Adds document analyst agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/critic-agent.yaml Adds critic agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/consolidation-planner-agent.yaml Adds consolidation planner agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/consolidation-analyst-agent.yaml Adds consolidation analyst agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/ch-planner-agent.yaml Adds Companies House planner agent definition.
demos/kyc-onboarding-bundle/chart/templates/agents/beneficial-owner-tree-agent.yaml Adds beneficial owner tree agent definition.
demos/kyc-onboarding-bundle/chart/Chart.yaml Adds Helm chart metadata and file-gateway dependency.
demos/kyc-onboarding-bundle/chart/Chart.lock Locks the file-gateway chart dependency.
demos/kyc-onboarding-bundle/README.md Documents install/run steps, outputs, and prerequisites for the KYC bundle.
demos/cobol-modernization-bundle/chart/templates/data-seeder-job.yaml Improves COBOL data seeder reliability with gateway wait + retries.
demos/cobol-modernization-bundle/README.md Updates documentation to reflect new seeding/upload options and dependency behavior.
demos/cobol-modernization-bundle/Makefile Adds options for reusing an existing file-gateway, skipping dep build, and improved upload/uninstall flows.
.gitignore Adds ignores for env files, local artifacts, helm charts, and Claude files.

Comment on lines +47 to +53
GATEWAY="http://{{ .Values.dataSeeder.fileGateway }}"
BASE="source_code_files"
find /data -type f ! -name 'Dockerfile' ! -name 'README*' ! -path '*/.git*' | while read -r file; do
  rel="$${file#/data/}"
  dir="$$(dirname "$$rel")"
  prefix="$$BASE/$$dir/"
  echo "Uploading: $$rel -> $$prefix"
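The loop in this hunk derives an upload prefix from each file's path relative to /data. The path arithmetic can be sketched in Python as follows. This is a minimal illustration, assuming plain POSIX paths; the function name `upload_prefix` is hypothetical and does not appear in the PR.

```python
import posixpath


def upload_prefix(file_path: str, data_root: str = "/data",
                  base: str = "source_code_files") -> str:
    """Mirror the shell loop: strip data_root, take the directory, prepend base."""
    rel = posixpath.relpath(file_path, data_root)  # plays the role of ${file#/data/}
    directory = posixpath.dirname(rel)             # "" for files directly under data_root
    if not directory:
        # The shell's dirname prints "." here, which would give "base/./".
        return base + "/"
    return posixpath.join(base, directory, "")     # trailing "" adds the final slash


if __name__ == "__main__":
    print(upload_prefix("/data/2-customer-due-diligence/input/mock-up-blacklist.json"))
```

Two things may be worth checking against the real script. A file placed directly under /data makes `dirname` return `.`, so the shell version would send a prefix containing `/./`. Also, if the rendered script reaches `sh` with no intermediate layer that turns `$$` into `$`, the shell expands `$$` to its own process ID instead of starting a variable reference; whether such a layer exists is not visible in this hunk.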
Comment on lines +15 to +19
# Execution modes:
# - fake: Use fake data (fast testing)
# - seq: Sequential extraction (slower, more reliable)
# - parallel: Parallel extraction (faster, default)
# ============================================================
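The three execution modes listed in this header comment can be modeled as a small enum with `parallel` as the fallback, matching the documented default. This is a sketch only: `ExecutionMode` and `parse_mode` are hypothetical names, and the script under review may validate its mode argument differently.

```python
from enum import Enum
from typing import Optional


class ExecutionMode(Enum):
    FAKE = "fake"          # canned data, fast testing
    SEQ = "seq"            # sequential extraction, slower but more reliable
    PARALLEL = "parallel"  # parallel extraction, the documented default


def parse_mode(value: Optional[str]) -> ExecutionMode:
    """Return the requested mode, falling back to PARALLEL when none is given."""
    if not value:
        return ExecutionMode.PARALLEL
    try:
        return ExecutionMode(value.strip().lower())
    except ValueError:
        allowed = ", ".join(mode.value for mode in ExecutionMode)
        raise ValueError(
            f"unknown execution mode {value!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown values up front avoids silently running the default mode when the caller mistypes the mode name.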
Comment on lines +61 to +64
MCP_DATA="/data/aas-files"
OUTPUT_DIR=$(dirname "$MCP_DATA/{{ "{{inputs.parameters.output-memo}}" }}")
kubectl exec deployment/mcp-filesystem -- mkdir -p "$OUTPUT_DIR"
echo "Created directory: $OUTPUT_DIR"
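The hunk above computes the parent directory of a workflow parameter appended to MCP_DATA and creates it with `mkdir -p`. Because the parameter is caller-supplied, one possible hardening is to confirm that the resolved path stays under the data root before creating anything. The following is a minimal sketch under that assumption; `safe_output_dir` and the example file name are hypothetical.

```python
import posixpath


def safe_output_dir(mcp_data: str, output_memo: str) -> str:
    """Return the directory to create, refusing paths that escape mcp_data."""
    root = posixpath.normpath(mcp_data)
    candidate = posixpath.normpath(posixpath.join(root, output_memo))
    # An absolute output_memo or one with ".." segments can land outside root.
    inside = candidate != root and posixpath.commonpath([root, candidate]) == root
    if not inside:
        raise ValueError(f"output path {output_memo!r} is outside {root}")
    return posixpath.dirname(candidate)


if __name__ == "__main__":
    # Hypothetical parameter value, used for illustration only.
    print(safe_output_dir("/data/aas-files", "3-memo/output/kyc-memo.md"))
```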
USE_EXISTING_FILE_GATEWAY ?= false
# Set to 1 to skip 'helm dependency build' (use if it hangs at "Deleting outdated charts"; run it once manually first)
SKIP_DEP_BUILD ?= 0
FILE_GATEWAY_SKIP_FLAGS := --set file-gateway.fileApi.enabled=false --set file-gateway.filesystemMcp.enabled=false --set file-gateway.versitygw.enabled=false --set file-gateway.versitygw.createSecret=false --set file-gateway.storage.enabled=false --set file-gateway.httpRoute.enabled=false --set file-gateway.fileApi.enabled=false --set file-gateway.filesystemMcp.enabled=false --set file-gateway.versitygw.enabled=false --set file-gateway.versitygw.createSecret=false --set file-gateway.storage.enabled=false --set file-gateway.httpRoute.enabled=false
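The FILE_GATEWAY_SKIP_FLAGS value above lists the same six `--set` overrides twice. Helm lets the right-most value win when a key is repeated, so the duplication should not change behavior, but it makes the line harder to audit. A small check like the one below can flag repeated keys. It is a sketch with a hypothetical helper name, and it assumes the flags are space-separated `--set key=value` pairs as in the Makefile line.

```python
from collections import Counter
from typing import Dict


def repeated_set_keys(flags: str) -> Dict[str, int]:
    """Count how often each --set key occurs and return only the repeated ones."""
    tokens = flags.split()
    keys = [
        tokens[i + 1].split("=", 1)[0]
        for i, token in enumerate(tokens[:-1])
        if token == "--set"
    ]
    return {key: count for key, count in Counter(keys).items() if count > 1}


if __name__ == "__main__":
    sample = (
        "--set file-gateway.fileApi.enabled=false "
        "--set file-gateway.storage.enabled=false "
        "--set file-gateway.fileApi.enabled=false"
    )
    print(repeated_set_keys(sample))
```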
Comment on lines +31 to +35
if pdf_path.startswith("http://") or pdf_path.startswith("https://"):
    # Download PDF from URL
    response = httpx.get(pdf_path, timeout=30.0)
    pdf_content = response.content
    doc = fitz.open(stream=pdf_content, filetype="pdf")
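The hunk above downloads whatever http(s) URL the caller passes and hands the body to PyMuPDF without checking the HTTP status. Two hedged suggestions follow from that. Calling `response.raise_for_status()` before reading `response.content` would surface 4xx/5xx responses instead of parsing an error page as a PDF. A host allowlist checked before the request would limit which servers the tool can reach from inside the cluster. The sketch below shows only the allowlist check, using the standard library; `is_allowed_pdf_url` and the allowlist contents are hypothetical, not part of the PR.

```python
from typing import FrozenSet
from urllib.parse import urlsplit


def is_allowed_pdf_url(url: str, allowed_hosts: FrozenSet[str]) -> bool:
    """Accept only http(s) URLs whose host is explicitly allowlisted."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host in allowed_hosts


if __name__ == "__main__":
    allowed = frozenset({"example.com"})  # placeholder host for illustration
    print(is_allowed_pdf_url("https://example.com/report.pdf", allowed))
    print(is_allowed_pdf_url("http://169.254.169.254/latest/meta-data", allowed))
```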
Comment on lines 20 to +24
securityContext:
{{- toYaml .Values.versitygw.podSecurityContext | nindent 8 }}
runAsGroup: 1000
fsGroup: 1000
# runAsNonRoot omitted so fix-data-permissions init can run as root once
runAsUser: 1000
Comment on lines +26 to +33
- name: fix-data-permissions
  image: busybox:latest
  command: ['sh', '-c', 'chown -R 1000:1000 {{ .Values.versitygw.config.dataPath }} && chmod -R 775 {{ .Values.versitygw.config.dataPath }} && mkdir -p {{ .Values.versitygw.config.dataPath }}/{{ .Values.filesystemMcp.config.bucketName }} && chown 1000:1000 {{ .Values.versitygw.config.dataPath }}/{{ .Values.filesystemMcp.config.bucketName }}']
  volumeMounts:
    - name: data
      mountPath: {{ .Values.versitygw.config.dataPath }}
  securityContext:
    runAsUser: 0
Comment on lines +31 to +55
        echo "Waiting for file-gateway-api to be ready..."
        for i in $(seq 1 90); do
          if curl -sf "${GATEWAY}/health"; then echo "file-gateway-api is ready."; exit 0; fi
          echo " attempt $$i/90, retrying in 5s..."
          sleep 5
        done
        echo "Timeout waiting for file-gateway-api"; exit 1
containers:
  - name: data-seeder
    image: "{{ .Values.dataSeeder.image.repository }}:{{ .Values.dataSeeder.image.tag }}"
    imagePullPolicy: {{ .Values.dataSeeder.image.pullPolicy }}
    command: ["/bin/sh", "-c"]
    args:
      - |
        set -e
        echo "Seeding KYC sample data via file-gateway-api..."
        GATEWAY="http://{{ .Values.dataSeeder.fileGateway }}"
        BASE="source_code_files"
        find /data -type f ! -name 'Dockerfile' ! -name 'README*' ! -path '*/.git*' | while read -r file; do
          rel="$${file#/data/}"
          dir="$$(dirname "$$rel")"
          prefix="$$BASE/$$dir/"
          echo "Uploading: $$rel -> $$prefix"
          curl --fail -sS -X POST "${GATEWAY}/files" -F "file=@$$file" -F "prefix=$$prefix"
        done
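The first part of this hunk polls the gateway's /health endpoint up to 90 times, five seconds apart, and fails if it never responds. The same bounded-retry pattern can be sketched in Python with the health probe injected as a callable, which keeps the loop testable without a network. The helper name and signature are hypothetical.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool], attempts: int = 90,
                     delay_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() up to `attempts` times; return True as soon as it succeeds."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            # Unlike the shell loop, skip the pointless sleep after the last try.
            sleep(delay_s)
    return False
```

With the defaults this waits at most 89 × 5 s between probes, slightly less than the shell version, which also sleeps after the final failed attempt.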
Comment on lines +147 to +155
global LLM_PROVIDER, LLM_MODEL
original_provider = LLM_PROVIDER
original_model = LLM_MODEL

if extraction_provider:
    LLM_PROVIDER = extraction_provider
if extraction_model:
    LLM_MODEL = extraction_model
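The hunk above applies per-call overrides by rewriting module-level globals and remembering the originals for later restoration. That pattern can leak the override if an exception skips the restore step, and concurrent requests would see each other's settings. One alternative is to build an immutable per-call configuration and pass it down explicitly. This is a sketch only: `LLMConfig`, `with_overrides`, and the placeholder provider and model names are hypothetical and not taken from the PR.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LLMConfig:
    provider: str
    model: str


def with_overrides(base: LLMConfig,
                   extraction_provider: Optional[str] = None,
                   extraction_model: Optional[str] = None) -> LLMConfig:
    """Return a copy of base with any non-empty overrides applied."""
    changes = {}
    if extraction_provider:
        changes["provider"] = extraction_provider
    if extraction_model:
        changes["model"] = extraction_model
    return replace(base, **changes)


if __name__ == "__main__":
    default = LLMConfig(provider="provider-a", model="model-a")  # placeholder names
    print(with_overrides(default, extraction_model="model-b"))
    print(default)  # the shared default is left untouched
```

Because the base object is frozen and never mutated, nothing needs to be restored afterwards.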

Comment on lines +23 to +55
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pdf-extraction-mcp
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pdf-extraction-mcp
  template:
    metadata:
      labels:
        app: pdf-extraction-mcp
    spec:
      serviceAccountName: pdf-extraction-mcp
      volumes:
        - name: mcp-filesystem-volume
          hostPath:
            path: /mnt/output
            type: DirectoryOrCreate
      containers:
        - name: pdf-extraction-mcp
          image: pdf-extraction-mcp:latest
          imagePullPolicy: Never
          ports:
            - containerPort: 8000
              name: http
          volumeMounts:
            - name: mcp-filesystem-volume
              mountPath: /mnt/output
              readOnly: true
          env:
@poornimanagQB marked this pull request as ready for review March 16, 2026 13:43
4 participants