- Specific implementation demo of GitLab Center of Excellence repository
An autonomous AI agent system for 5G Radio Access Network troubleshooting and remediation. Demonstrates ReAct framework implementation with live network simulation, Retrieval-Augmented Generation, and vLLM inference for operational intelligence.
┌─────────────────────────────────────────────────────────────┐
│ Next.js Application (Port 3001) │
│ ├─ Agentic Workflows UI │
│ ├─ Context-Aware AI Assistant (Zustand state) │
│ └─ Real-time Network Visualization │
└────────────────────┬────────────────────────────────────────┘
│
┌────────────────────┴────────────────────────────────────────┐
│ API Layer │
│ ├─ /api/ran-agent → Agent orchestration │
│ ├─ /api/chat-vllm → vLLM + RAG integration │
│ └─ /api/ran-health → Service health checks │
└────────────────────┬────────────────────────────────────────┘
│
┌───────────┴───────────┐
│ │
┌────────▼────────┐ ┌────────▼────────┐
│ RAN Services │ │ RAG Service │
│ (Flask:5000) │ │ (Flask:50052) │
│ ├─ Alarms │ │ ├─ Embeddings │
│ ├─ KPIs │ │ ├─ Vector DB │
│ ├─ Playbooks │ │ └─ Similarity │
│ └─ Live Proxy │ └─────────────────┘
└────────┬────────┘
│
┌────────▼────────┐ ┌─────────────────┐
│ RAN Simulator │───▶│ vLLM Service │
│ (Flask:5001) │ │ (External) │
│ ├─ 4 Sites │ │ ├─ Granite 3.0 │
│ ├─ 11 Cells │ │ └─ Inference │
│ └─ Live UEs │ └─────────────────┘
└─────────────────┘
- Framework: Next.js 15 with React 19
- Language: TypeScript
- State Management: Zustand context store
- UI Components: ShadCN with Tailwind CSS
- Features:
- Agentic workflow visualization with step-by-step reasoning display
- Context-aware AI assistant with page and ticket tracking
- Real-time RAN metrics and site status monitoring
- Interactive service health testing
- Technology: Python Flask
- Purpose: Simulates 5G gNodeB network operations
- Network Topology:
- 4 sites (SITE-001 to SITE-004)
- 11 cells with dynamic RF metrics
- Simulated UE connections
- SITE-002 configured with critical alarm scenario
- Endpoints:
/gnb/sites- Site information/gnb/cells- Cell status and RF metrics/gnb/ues- User equipment contexts/gnb/metrics- Network performance metrics
- Code:
gnb.py- Live 5G gNodeB simulator with 4 sites, 11 cells, dynamic UE connections, and background thread for realistic network activity | Fully-Expanded DU/CU/RRU/gNB version available !
- Technology: Python Flask
- Purpose: API gateway for RAN data and agent tools
- Implementation: Hybrid architecture combining JSON fixtures (alarms, KPIs, remediation playbooks) with live simulator proxy endpoints to provide both static reference data and real-time network state
- Agent Tools:
get_alarms- Query alarm databaseget_kpis- Retrieve KPI reportsget_cell_details- Cell-level RF metricssearch_remediation- Find applicable playbooks
- Code:
app.py- Flask service providing fixture endpoints and simulator proxy |agent.py- ReAct agent implementation with tool calling and vLLM integration
[This is a naive RAG implementation as a placeholder for more advanced approaches that would include knowledge graphs from NetworkX (see tiny-graphrag for example) and an integrated RAG-LLM and retrieval approach, such as] ZeroEntropy (ZBench with SageMath integration for exact Elo computation and document ranking)
- Technology: Python Flask
- Model: sentence-transformers (all-MiniLM-L6-v2) for embeddings
- Storage: In-memory vector store with cosine similarity search
- Data: JIRA tickets(optional), knowledge base articles (json/API/service), ticket templates (json/API/service), Remediation playbooks (json/API/service).
- Function: Retrieves contextually relevant data for LLM prompts
- Framework: ReAct (Reasoning + Acting) - The LLM reasons about what information it needs and decides which tool to call via
call_vllm(), the agent parses the response withparse_action()and executes that tool viacall_tool(), feeds the observation back to the LLM in conversation history, and iterates until the LLM synthesizes a complete answer (Build an AI agent in 175 lines, example gist) - LLM: vLLM serving IBM Granite 3.0 8B Instruct
- Workflow:
- Receive user query with page context
- Reason about required information
- Execute tool calls (get_alarms, get_kpis, etc.)
- Observe results and iterate
- Synthesize final response with remediation steps
- Modes:
- Demo Mode: Pre-programmed responses for reliable demonstrations
- Live Agent Mode: vLLM-powered autonomous reasoning
Status: DEGRADED
Alarms:
- ALM-983451 (CRITICAL): Transport Link Failure
Description: N2 interface to AMF-01 is down
- ALM-983452 (MAJOR): Cell Down
Affected: CELL-2B
Cells:
- CELL-2A: ACTIVE (85% load, overloaded)
- CELL-2B: DOWN (0 users, N2 link failure)
- CELL-2C: ACTIVE (95% load, overloaded)
KPIs:
- Call drop rate: 45.7% (threshold: 2%)
- E-RAB setup success: 88.4% (threshold: 98%)
- Handover success: 91.2% (threshold: 95%)
Remediation: RMD-001 - N2 Interface Link Failure
- Check physical layer (fiber cables)
- Verify router/switch port status
- Test gNodeB to AMF connectivity
- Verify routing to 5G core
- Check NG-C/N2 SCTP association
- Review gNodeB logs for errors
- OpenShift cluster with admin access
- OpenShift CLI (
oc) installed - vLLM service endpoint (Red Hat OpenShift AI or external)
- 2GB+ memory per pod recommended
- Clone repository
git clone <repository-url>
cd RAN-remediation-using-Agentic-AI/ai-assistant-based-application-main- Login to OpenShift
oc login --token=<your-token> --server=<your-server>
oc project ai-assistant- Update cluster configuration
cd openshift
# Edit params.env with your cluster domain and vLLM endpoint
vim params.env- Deploy services
# Apply Kustomize configuration
kubectl apply -k .
# Build main application
cd ..
oc start-build ai-assistant-app-build --from-dir=. --follow
# Build RAG service
oc start-build rag-service --from-dir=. --follow
# Build RAN services
cd ran-services
oc start-build ran-services --from-dir=. --follow
# Build RAN simulator
cd ../ran-simulator
oc start-build ran-simulator --from-dir=. --follow- Verify deployment
oc get pods -n ai-assistant
oc get route ai-assistant-app-route -n ai-assistantEnvironment Variables (Deployment):
VLLM_API_URL- vLLM inference endpointRAG_SERVICE_URL- RAG service address (default:rag-service:50052)RAN_SERVICES_URL- RAN services address (default:ran-services:5000)RAN_SIMULATOR_URL- RAN simulator address (default:ran-simulator:5001)
Build Configuration:
- All BuildConfigs use Binary source strategy
- Images pushed to internal OpenShift registry
- Resource limits defined in deployment YAML
Once deployed, access the application at:
https://<route-host>/admin/agentic-workflows
Navigate to Settings to toggle between Demo Mode and Live Agent Mode.
- Node.js 20+ and npm
- Python 3.10+
Frontend:
cd ai-assistant-based-application-main
npm install
npm run dev
# Application at http://localhost:3001RAN Services:
cd ai-assistant-based-application-main/ran-services
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
# Service at http://localhost:5000RAN Simulator:
cd ai-assistant-based-application-main/ran-simulator
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python gnb.py
# Simulator at http://localhost:5001RAG Service:
cd ai-assistant-based-application-main/server/nest/src/ipex-inference/python
python -m venv venv
source venv/bin/activate
pip install -r rag_requirements.txt
python rag_service.py
# HTTP server at localhost:50052- Navigate to Agentic Workflows page
- Click example questions or type custom query:
- "Show me all critical alarms"
- "What's wrong with SITE-002?"
- "What are the recommended actions?"
- Observe agent reasoning in Agent Activity tab
- Review retrieved data in Retrieval tab
- Test live services in RAN Services tab
Navigate to Settings → RAN Agentic Workflow:
- Demo Mode (Default): Fast, consistent responses for demonstrations
- Live Agent Mode: vLLM-powered autonomous reasoning (requires vLLM service)
| Component | Technology | Version |
|---|---|---|
| Frontend Framework | Next.js | 15.1.3 |
| UI Library | React | 19 |
| Language | TypeScript | 5.x |
| State Management | Zustand | 5.0.3 |
| Backend API | NestJS | 11.x |
| RAN Services | Flask | 3.0.0 |
| RAG Service | Flask | - |
| Embeddings | sentence-transformers | - |
| LLM | IBM Granite 3.0 8B Instruct | - |
| Inference | vLLM | - |
| Container Platform | OpenShift | 4.x |
- Deployment Quickstart
- RAN Integration Details
- Session Notes - Implementation history
This application uses 5G NR terminology:
- gNodeB
- N2 interface
- AMF
- NG-RAN architecture
- NGAP protocol
.
├── ai-assistant-based-application-main/
│ ├── app/ # Next.js application
│ │ ├── admin/ # Admin pages (Kanban, Tickets, Workflows)
│ │ ├── api/ # API routes
│ │ │ ├── chat-vllm/ # vLLM + RAG integration
│ │ │ ├── ran-agent/ # Agent orchestration
│ │ │ └── ran-health/ # Health checks
│ │ └── services/ # Frontend API client
│ ├── components/ # React components
│ │ └── right-sidebar.tsx # Context-aware AI assistant
│ ├── lib/
│ │ └── stores/ # Zustand state management
│ ├── ran-services/ # RAN services API
│ │ ├── app.py # Flask application
│ │ ├── agent.py # ReAct agent implementation
│ │ └── data/ # JSON fixtures (alarms, KPIs, playbooks)
│ ├── ran-simulator/ # 5G gNodeB simulator
│ │ └── gnb.py # Simulator implementation
│ ├── server/
│ │ └── nest/ # NestJS backend
│ │ └── src/ipex-inference/python/ # RAG service
│ └── openshift/ # Deployment configurations
│ ├── kustomization.yaml
│ ├── deployment.yaml
│ ├── buildconfig.yaml
│ ├── service.yaml
│ ├── ran-services/
│ └── ran-simulator/
└── AI_Session_Notes/ # Implementation documentation
Apache License 2.0