
Local LangGraph AI Cluster 🚀

A complete guide to setting up a distributed LangGraph infrastructure using your local hardware - zero external API costs!

πŸ—οΈ Architecture Overview

This setup assigns each machine the role its hardware handles best:

  • jetson-node (Orin Nano 8GB): Primary LLM server (Ollama + small/fast models)
  • cpu-node (32GB Intel): Coordinator + heavy LLM tasks (Ollama + large models) + HAProxy + Redis
  • rp-node (8GB ARM): Embeddings server (efficient ARM processing)
  • worker-node3 (6GB VM): Tools execution server (web search, scraping, commands)
  • worker-node4 (6GB VM): Monitoring and health checks (optional)
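For reference, the layout above can be captured in a small Python map. The node roles and the two known IPs come from this guide; the remaining IPs are placeholders, and the repo's real configuration lives in config.py, not in this sketch:

```python
# Illustrative cluster map; roles and the two known IPs come from this guide.
# The repo's actual configuration lives in config.py, not here.
CLUSTER_NODES = {
    "jetson-node":  {"ip": "192.168.1.177", "role": "primary LLM (Ollama, small/fast models)"},
    "cpu-node":     {"ip": "192.168.1.81",  "role": "coordinator, heavy LLM, HAProxy, Redis"},
    "rp-node":      {"ip": None,            "role": "embeddings (ARM)"},
    "worker-node3": {"ip": None,            "role": "tools (search, scraping, commands)"},
    "worker-node4": {"ip": None,            "role": "monitoring and health checks"},
}

def nodes_with_role(keyword: str) -> list[str]:
    """Return the node names whose role description mentions keyword."""
    return [name for name, info in CLUSTER_NODES.items() if keyword in info["role"]]
```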

🚀 Quick Start

1. Set Up Your Machines (Modular Approach)

Follow the setup guides in order; each guide is self-contained and kept in sync with the comprehensive single source of truth (SOT):

```bash
# 1. Setup Jetson Orin Nano (Primary LLM Server)
# Follow: setup_guides/01_jetson_setup.md
# Sets up Ollama + TensorRT optimizations on jetson-node (192.168.1.177)

# 2. Setup CPU Coordinator (Heavy LLM + Load Balancer + Cache)
# Follow: setup_guides/02_cpu_setup.md
# Sets up Ollama + HAProxy + Redis on cpu-node (192.168.1.81)

# 3. Setup LangGraph Integration (Workflows + Routing)
# Follow: setup_guides/03_langgraph_integration.md
# Creates intelligent routing and tool integration

# 4. Setup Worker Nodes (Embeddings + Tools + Monitoring)
# Follow: setup_guides/04_distributed_coordination.md
# Sets up rp-node, worker-node3, worker-node4 + orchestration
```

2. Start Your Cluster

```bash
# All IPs are pre-configured for your actual nodes!
cd ~/ai-infrastructure/langgraph-config
source ~/langgraph-env/bin/activate

# Start entire cluster
python3 cluster_orchestrator.py start

# Check cluster status
python3 cluster_orchestrator.py status

# Test all services
python3 cluster_orchestrator.py test
```
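Under the hood, a status command like the one above typically just polls each service's health URL. The sketch below is an illustrative stand-in, not the repo's actual cluster_orchestrator.py, and the endpoints in SERVICES are assumptions:

```python
import urllib.error
import urllib.request

def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if url answers with an HTTP 2xx within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False

def cluster_status(services: dict[str, str]) -> dict[str, str]:
    """Map each service name to 'up' or 'down' based on a health probe."""
    return {name: ("up" if probe(url) else "down") for name, url in services.items()}

# Hypothetical service endpoints; substitute your real IPs, ports, and paths.
SERVICES = {
    "ollama-jetson": "http://192.168.1.177:11434/api/tags",
    "haproxy": "http://192.168.1.81:9000/haproxy_stats",
}
```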

3. Test Your Setup

```bash
# Test LangGraph workflows
cd ~/ai-infrastructure/langgraph-config
python3 main_app.py

# Run example workflows
cd /home/sanzad/git/langgraph/examples/
python3 example_workflows.py
```

🎯 Single Source of Truth

  • Complete Guide: 00_complete_deployment_guide.md - Full walkthrough with all commands
  • Modular Guides: 01-04 are extracted and synchronized from the complete guide

📊 Performance Expectations

| Machine | Model | RAM Usage | Speed | Use Case |
|---------|-------|-----------|-------|----------|
| jetson-node | Llama 3.2 3B | ~3GB | 15-25 tok/s | General chat |
| jetson-node | Llama 3.2 1B | ~1.5GB | 30-50 tok/s | Quick responses |
| cpu-node | Mistral 7B | ~4.4GB | 8-15 tok/s | Complex analysis |
| rp-node | all-MiniLM-L6-v2 | ~200MB | 1000+ emb/s | Semantic intelligence (ARM) |
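These throughput numbers translate directly into response latency (tokens divided by tok/s), which is handy when picking a model for a workflow:

```python
def generation_time_s(tokens: int, toks_per_s: float) -> float:
    """Seconds to stream a full response at a given decode speed."""
    return tokens / toks_per_s

# A 300-token answer at 20 tok/s (Jetson, Llama 3.2 3B) takes ~15 s;
# the same answer at 10 tok/s (cpu-node, Mistral 7B) takes ~30 s.
```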

📚 Documentation

🎯 Example Workflows

Research Assistant

```python
# Automatically searches web, synthesizes findings
result = await research_workflow.invoke("What are the latest AI trends?")
```

Coding Assistant

```python
# Routes simple/complex coding questions appropriately
result = await coding_workflow.invoke("Build a FastAPI app with auth")
```

Data Analysis

```python
# Scrapes data and provides analysis
result = await data_workflow.invoke("Analyze this dataset: https://...")
```

🔧 Key Features

  • 🆓 Zero Cost: No LLM API costs ever (local inference only)
  • ⚡ Smart Routing: Auto-routes tasks to optimal hardware
  • 🔄 Load Balancing: HAProxy distributes load automatically
  • 📊 Monitoring: Real-time health checks + optional Langfuse/Helicone
  • 🛡️ Fault Tolerance: Automatic failover and restart
  • 📈 Auto-scaling: Dynamic model loading based on load
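The smart-routing idea can be sketched as a scorer plus a dispatcher. The keywords, thresholds, and model tags below are illustrative assumptions, not the project's actual routing code:

```python
def estimate_complexity(prompt: str) -> int:
    """Very rough score: longer prompts and analysis keywords score higher."""
    score = len(prompt) // 100
    for kw in ("analyze", "architecture", "compare", "design"):
        if kw in prompt.lower():
            score += 2
    return score

def route(prompt: str) -> tuple[str, str]:
    """Pick (node, model): fast Jetson models for simple prompts, cpu-node for complex ones."""
    score = estimate_complexity(prompt)
    if score >= 2:
        return ("cpu-node", "mistral:7b")       # complex analysis
    if score >= 1:
        return ("jetson-node", "llama3.2:3b")   # general chat
    return ("jetson-node", "llama3.2:1b")       # quick responses
```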

🌟 Why This Setup?

For your hardware specifically:

  • Jetson advantages: ARM efficiency, unified memory, low power
  • Skip RTX cards: They're overkill for learning and consume 175W+ each
  • Distributed approach: Maximizes utilization of all machines
  • Local-first: Complete privacy and control

πŸ” Monitoring

```bash
# Check cluster health
curl http://192.168.1.191:8083/cluster_health

# View load balancer stats
open http://192.168.1.81:9000/haproxy_stats

# Monitor real-time performance
htop  # On each machine
```
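HAProxy's stats page can also be fetched in machine-readable form by appending `;csv` to the stats URL, which makes it scriptable. A small parser sketch (the backend and server names in the sample are made up):

```python
import csv
import io

def parse_haproxy_csv(text: str) -> list[tuple[str, str, str]]:
    """Extract (proxy, server, status) rows from HAProxy's ';csv' stats export."""
    # HAProxy prefixes the header row with '# '; strip it so DictReader sees
    # plain column names (pxname, svname, status, ...).
    rows = csv.DictReader(io.StringIO(text.lstrip("# ")))
    return [(r["pxname"], r["svname"], r["status"]) for r in rows]

# Made-up sample with only a subset of HAProxy's many columns.
SAMPLE = """\
# pxname,svname,qcur,status,
ollama_back,jetson,0,UP,
ollama_back,cpu,0,UP,
"""
```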

πŸ› οΈ Customization

Add New Models

```bash
# On Jetson
ollama pull codellama:7b

# On CPU
wget https://huggingface.co/.../model.bin
```

Scale Up/Down

```bash
# Add more workers by modifying config.py
# Adjust model sizes based on available RAM
```

Custom Workflows

```bash
# Create new workflows in examples/
# Follow the existing patterns for routing and tools
```

Add Advanced Monitoring (Optional)

```bash
# Option 1: Langfuse (LangSmith alternative) - Advanced tracing & analytics
# Follow: setup_guides/05_langfuse_setup.md

# Option 2: Helicone (Alternative monitoring) - Real-time monitoring & debugging
# Follow: setup_guides/06_helicone_setup.md

# Both are completely free and self-hosted!
```

πŸ“ Setup Guide Structure

| Guide | Purpose | Machine(s) | Status |
|-------|---------|------------|--------|
| 00_complete_deployment_guide.md | 🎯 Master SOT - Complete walkthrough | All machines | ✅ Production ready |
| 01_jetson_setup.md | Jetson Orin Nano setup | jetson-node | ✅ Synced from SOT |
| 02_cpu_setup.md | CPU coordinator setup | cpu-node | ✅ Synced from SOT |
| 03_langgraph_integration.md | LangGraph workflows | cpu-node | ✅ Synced from SOT |
| 04_distributed_coordination.md | Worker nodes + orchestration | All workers | ✅ Synced from SOT |
| 05_langfuse_setup.md | Optional: Advanced monitoring | cpu-node | ✅ Optional feature |
| 06_helicone_setup.md | Optional: Alternative monitoring | cpu-node | ✅ Optional feature |

Benefits of this structure:

  • ✅ Modular: Focus on one machine/service at a time
  • ✅ Updated: All guides synced from the comprehensive SOT
  • ✅ Flexible: Use individual guides or the complete guide
  • ✅ Maintained: Single source of truth prevents sync issues

🚨 Troubleshooting

Common Issues

Out of Memory

```bash
# Switch to smaller models
ollama pull tinyllama:1.1b
```

Service Not Starting

```bash
# Check logs
sudo journalctl -u ollama -f
sudo systemctl restart ollama
```

Network Issues

```bash
# Test connectivity
curl http://MACHINE_IP:PORT/health
```
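When curl fails, classifying the failure narrows the cause quickly (name resolution vs. service down vs. firewall). A small helper, as a sketch:

```python
import socket

def diagnose(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connectivity failure to speed up troubleshooting."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.gaierror:
        return "dns: hostname does not resolve"
    except ConnectionRefusedError:
        return "refused: host reachable but service not listening"
    except socket.timeout:
        return "timeout: host or port likely firewalled/unreachable"
    except OSError as exc:
        return f"error: {exc}"
```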

📚 Learning Resources

This setup is perfect for learning:

  • LangGraph workflow patterns
  • Distributed AI systems
  • Local model deployment
  • Resource optimization
  • MLOps practices

🎉 What You've Built

  • Production-ready local AI infrastructure
  • Cost-effective learning environment
  • Scalable architecture that grows with you
  • Privacy-focused - your data never leaves your network

🔗 Next Steps

  1. Experiment with the example workflows
  2. Create your own domain-specific flows
  3. Scale up by adding more models or machines
  4. Optimize based on your specific use cases
  5. Share your workflows with the community!

Happy Learning! 🎓 You now have a professional-grade local AI infrastructure that rivals cloud solutions - for free!

About

Small Homelab LangGraph Cluster Setup with a Workflow Orchestrator UI
