This project provides a fully local, containerized AI development assistant that can:
- Index Flutter and React codebases
- Answer natural language queries about your code
- Generate or refactor code based on your task prompts
- Operate completely offline — no external API usage
It is built with:
- LlamaIndex v0.12.32
- Ollama (serving `llama3`)
- FastAPI (backend API)
- Docker + Docker Compose
- CLI interface (via Python + Rich)
Make sure the following are installed on your local machine:
- Docker
- Docker Compose

Recommended:
- At least 8 GB of RAM for LLaMA 3 (use the 70B variant only if you have a powerful machine)
- A `./repos` directory containing:
  - `flutter-app/` codebase
  - `react-app/` codebase
Project structure:

```
.
├── docker-compose.yml
├── ollama/
│   └── Dockerfile
├── llama-index/
│   ├── Dockerfile
│   ├── requirements.txt
│   └── app/
│       ├── config.py
│       ├── ingest.py
│       └── main.py
├── cli/
│   ├── Dockerfile
│   └── cli.py
└── repos/
    ├── flutter-app/
    └── react-app/
```
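The indexing step lives in `llama-index/app/ingest.py`. Below is a minimal sketch of what it might look like, assuming the repos are mounted at `/repos` inside the container, the index is persisted to `/data/storage`, and only a handful of source extensions are indexed; all of these values are assumptions, and the real `config.py` may differ:

```python
# ingest.py - minimal sketch (paths and file extensions are assumptions)
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

REPOS_DIR = "/repos"           # assumption: where docker-compose mounts ./repos
PERSIST_DIR = "/data/storage"  # assumption: where the vector index is persisted

# Local embedding model - no external API calls
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/paraphrase-MiniLM-L6-v2"
)

def build_index() -> VectorStoreIndex:
    # Read Flutter and React source files from the mounted repos
    documents = SimpleDirectoryReader(
        REPOS_DIR,
        recursive=True,
        required_exts=[".dart", ".js", ".jsx", ".ts", ".tsx"],
    ).load_data()

    # Embed and index the documents, then persist so later runs can reload the index
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
    return index

if __name__ == "__main__":
    build_index()
```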
- Place your codebases in the `./repos/` folder:

  ```bash
  mkdir -p repos/flutter-app repos/react-app
  ```

- Build and start the assistant:

  ```bash
  docker compose up --build
  ```
  This will:
  - Start Ollama
  - Pull the `llama3` model
  - Start the indexing and API services
- Use the assistant via the CLI (a minimal CLI sketch follows these steps):

  ```bash
  docker compose run --rm cli
  ```

  Example prompts:

  ```
  > What does the AuthService class do?
  > Where is the login widget defined?
  > Refactor the main entry point
  ```
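As referenced in the CLI step above, here is a minimal sketch of what `cli/cli.py` could look like, assuming the FastAPI backend exposes a `POST /query` endpoint on the `llama-index` service at port 8000 that accepts `{"question": ...}` and returns `{"answer": ...}`; the route, port, and payload shape are assumptions, not confirmed by the project:

```python
# cli.py - minimal sketch (endpoint URL and payload shape are assumptions)
import requests
from rich.console import Console
from rich.prompt import Prompt

API_URL = "http://llama-index:8000/query"  # assumption: FastAPI service name and port

console = Console()

def main() -> None:
    console.print("[bold]Local code assistant[/bold] (Ctrl+C to exit)")
    while True:
        question = Prompt.ask(">")
        resp = requests.post(API_URL, json={"question": question}, timeout=300)
        resp.raise_for_status()
        console.print(resp.json().get("answer", "(no answer)"))

if __name__ == "__main__":
    main()
```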
Notes:
- The assistant builds a vector index from your code on first run (a query-side sketch follows this list).
- The embedding model is `sentence-transformers/paraphrase-MiniLM-L6-v2` (runs locally).
- Ollama will automatically pull `llama3` on first boot if it is not found.
- Everything runs offline: no OpenAI or external APIs.
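To show how the persisted index, the local embedding model, and `llama3` fit together at query time, here is a hedged sketch of what the FastAPI backend (`app/main.py`) might look like; the persist directory, route name, and Ollama URL are assumptions that would need to match the real `config.py` and `docker-compose.yml`:

```python
# main.py - query-side sketch (paths, route, and Ollama URL are assumptions)
from fastapi import FastAPI
from pydantic import BaseModel
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

PERSIST_DIR = "/data/storage"  # assumption: matches the ingest step

# Same local embedding model used at index time; llama3 is served by the ollama container
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/paraphrase-MiniLM-L6-v2"
)
Settings.llm = Ollama(
    model="llama3",
    base_url="http://ollama:11434",  # assumption: service name from docker-compose
    request_timeout=120.0,
)

app = FastAPI()

class Query(BaseModel):
    question: str

# Reload the persisted vector index once at startup
storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
index = load_index_from_storage(storage_context)
query_engine = index.as_query_engine()

@app.post("/query")
def query(q: Query) -> dict:
    # Retrieve relevant code chunks and let llama3 compose the answer
    response = query_engine.query(q.question)
    return {"answer": str(response)}
```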
Troubleshooting:
- Model not found: make sure Ollama is allowed to pull `llama3` (requires internet access on the first run); a quick check script follows this list.
- Slow first response: first-time indexing and embedding can take 1–2 minutes depending on repo size.
- No answers: make sure your code contains enough comments or descriptive function names for semantic matching.
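For the model-not-found case, one way to verify that Ollama actually has `llama3` available locally is to ask its REST API for the installed models; this assumes Ollama's default port 11434 is published to the host:

```python
# check_model.py - quick check that Ollama has the llama3 model locally
# (assumes Ollama's default port 11434 is reachable from the host)
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=10)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("llama3 present:", any(name.startswith("llama3") for name in models))
```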
If you want to extend this with:
- Git PR integration
- VSCode plugin
- Web UI
Let’s connect!