CodeGlia is an AI-augmented static analysis tool that integrates Semgrep, Bandit, and LLM-based scanning (Google Gemini, OpenAI GPT-4, or Ollama) to detect and explain vulnerabilities across multiple programming languages.
CodeGlia provides real-time vulnerability scanning through an interactive Flask-based dashboard.
It lets developers upload entire projects (as a ZIP or folder), scans the source code for potential security issues, and generates AI-enriched reports with performance comparison metrics.
- Multi-language static analysis (Python, Java, C, C#, PHP, JavaScript)
- AI-powered vulnerability explanation and severity labeling
- NEW: LLM-based vulnerability detection with multiple backend support
- Performance comparison between static and LLM analysis
- Trust Score computation and visual trend tracking
- Flask-based web dashboard with live scan progress
- Dockerized for fast deployment and consistency
- Supports local Semgrep rule packs for offline scanning
- Google Gemini (default) - Fast and efficient with gemini-2.0-flash-exp
- OpenAI GPT-4 - High-quality analysis (requires API key)
- Ollama - Local models (CodeLlama, Mistral) for offline scanning
- Scans multiple file types: `.py`, `.js`, `.java`, `.php`, `.c`, `.cpp`, `.rb`, etc.
- Detects: SQL injection, command injection, XSS, path traversal, hardcoded secrets, weak crypto, and more
- Maps findings to CWE/CVE using existing infrastructure
- Implements caching to avoid re-analyzing identical code
- Provides detailed explanations and secure fix recommendations
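The caching step above can be sketched as follows. This is a minimal illustration, not the project's actual `llm_scanner.py`; the function names and cache shape are assumptions.

```python
import hashlib

# Hypothetical in-memory cache: content hash -> prior LLM findings.
_cache = {}

def content_hash(source: str) -> str:
    """Key the cache on a hash of the code, not the code itself."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()

def analyze_with_cache(source: str, analyze) -> list:
    """Call `analyze` only if this exact code has not been seen before."""
    key = content_hash(source)
    if key not in _cache:
        _cache[key] = analyze(source)
    return _cache[key]

# Identical code is analyzed once; the second call is a cache hit.
calls = []
findings = analyze_with_cache(
    "eval(input())", lambda s: calls.append(s) or ["dangerous eval"])
findings2 = analyze_with_cache(
    "eval(input())", lambda s: calls.append(s) or ["dangerous eval"])
```

Keying on a content hash also means the cache files never need to contain source code, which matches the security note later in this README.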
The scanner tracks and compares:
- Individual timing for each scanner (Bandit, Semgrep, LLM)
- Throughput metrics (lines/second)
- Total static vs LLM analysis time
- Time difference and speedup/slowdown factors
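The comparison metrics listed above reduce to simple arithmetic. The sketch below uses made-up timings and a made-up line count (in practice each scanner may process a different number of lines):

```python
# Hypothetical timings (seconds) and lines scanned for one run.
timings = {"bandit": 2.5, "semgrep": 8.3, "llm": 45.2}
lines_scanned = 1235

# Total static vs LLM analysis time.
static_total = timings["bandit"] + timings["semgrep"]
llm_total = timings["llm"]

# Throughput per scanner (lines/second).
throughput = {name: lines_scanned / t for name, t in timings.items()}

# Time difference and speedup/slowdown factor.
diff = llm_total - static_total
factor = llm_total / static_total
print(f"LLM is {factor:.1f}x slower (+{diff:.1f}s)")
```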
| Layer | Technology |
|---|---|
| Frontend | HTML, CSS (custom dark theme), JS |
| Backend | Python (Flask) |
| AI Integration | Google Gemini API, OpenAI (optional), Ollama (optional) |
| Security Analysis | Semgrep, Bandit, LLM Scanner |
| Packaging & Deployment | Docker |
| Data Handling | JSON, Pandas (for parsing reports) |
```
git clone https://github.com/Shreyas-135/CODEGLIA.git
cd CODEGLIA
pip install -r requirements.txt
flask run
```

Or run with Docker:

```
docker build -t codeglia .
docker run -p 5050:5050 --env-file .env codeglia
```

Then open your browser at:

http://localhost:5050
Create a file named .env in the project root with the following content:
```
# Flask configuration
FLASK_APP=app.py
FLASK_RUN_HOST=0.0.0.0

# AI Configuration
GEMINI_API_KEY=your_api_key_here

# LLM Scanner Configuration (optional)
LLM_ENABLED=false               # Set to true to enable LLM scanning
LLM_BACKEND=gemini              # Options: gemini, openai, ollama
LLM_MODEL=gemini-2.0-flash-exp  # Model name (optional, uses defaults)

# Optional: OpenAI Configuration
# OPENAI_API_KEY=your_openai_key_here

# Optional: Ollama Configuration (for local models)
# OLLAMA_BASE_URL=http://localhost:11434
```
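A minimal sketch of how these environment variables might be resolved into a scanner configuration. The loader function and the per-backend default models shown here are assumptions, not the project's actual code:

```python
import os

def load_llm_config(env=os.environ):
    """Read the LLM_* variables, falling back to assumed per-backend defaults."""
    defaults = {
        "gemini": "gemini-2.0-flash-exp",  # assumed default models
        "openai": "gpt-4",
        "ollama": "codellama",
    }
    enabled = env.get("LLM_ENABLED", "false").lower() == "true"
    backend = env.get("LLM_BACKEND", "gemini")
    model = env.get("LLM_MODEL") or defaults.get(backend)
    return {"enabled": enabled, "backend": backend, "model": model}

# Example: enable scanning with the Ollama backend, no explicit model.
config = load_llm_config({"LLM_ENABLED": "true", "LLM_BACKEND": "ollama"})
```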
```
# Static analysis only (default)
python run_scan.py datasets/

# Enable LLM scanning
python run_scan.py datasets/ --enable-llm

# Choose a specific backend and model
python run_scan.py datasets/ --enable-llm --llm-backend gemini --llm-model gemini-2.0-flash-exp

# Run all scanners and generate a comparison report
python run_scan.py datasets/ --compare

# Run only the static analyzers
python run_scan.py datasets/ --static-only

# Run only the LLM scanner (for testing)
python run_scan.py datasets/ --llm-only --llm-backend gemini
```

Available flags:

- `--enable-llm` - Enable LLM scanning
- `--llm-backend [gemini|openai|ollama]` - Choose LLM backend
- `--llm-model <model-name>` - Specify model
- `--static-only` - Run only static analyzers (default for backward compatibility)
- `--llm-only` - Run only LLM scanner (for testing)
- `--compare` - Run all scanners and generate comparison report
After scanning, reports are generated in the `output/` folder:

- `scan_report.json` - Raw vulnerability data
- `scan_report.html` - Human-readable HTML report with performance comparison
- `performance.json` - Detailed metrics on scan time, throughput, and comparisons
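Since the tech stack lists Pandas for report parsing, a `scan_report.json` of findings could be summarized like this. The record fields shown are hypothetical; the real schema may differ:

```python
import json
import pandas as pd

# Hypothetical report contents; the actual scan_report.json schema may differ.
report = json.loads("""
[
  {"scanner": "bandit",  "severity": "HIGH",   "cwe": "CWE-89", "file": "app.py"},
  {"scanner": "semgrep", "severity": "MEDIUM", "cwe": "CWE-79", "file": "views.py"},
  {"scanner": "llm",     "severity": "HIGH",   "cwe": "CWE-78", "file": "utils.py"}
]
""")

df = pd.DataFrame(report)
# Findings per scanner, as a comparison table might tally them.
counts = df.groupby("scanner").size().to_dict()
```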
```
===== Starting CodeGlia Workflow =====
Using target dataset directory: datasets
LLM scanning enabled (backend: gemini, model: gemini-2.0-flash-exp)

Running Bandit Scan (Python)...
Bandit completed in 2.5s (5 vulnerabilities found)

Running Semgrep Scan (Multi-language)...
Semgrep completed in 8.3s (12 vulnerabilities found)

Running LLM Scan (gemini/gemini-2.0-flash-exp)...
Processing files in datasets...
LLM completed in 45.2s (8 vulnerabilities found)

Performance Comparison:
  Static Analysis Total: 10.8s
  LLM Analysis Total:    45.2s
  LLM is 4.2x slower (+34.4s)

  Throughput:
  - Bandit: 494 lines/sec
  - Semgrep: 149 lines/sec
  - LLM: 27 lines/sec

Total scan time: 56.0s
```
```
CodeGlia/
│
├── app.py             # Flask backend
├── run_scan.py        # Handles Semgrep + Bandit + LLM scanning
├── parse_results.py   # AI-driven report parsing
├── llm_scanner.py     # LLM-based vulnerability scanner module
├── requirements.txt   # Dependencies
├── Dockerfile         # Docker configuration
│
├── static/            # CSS, JS, icons
├── templates/         # HTML templates (Flask views)
└── output/            # Generated reports
```
- Gemini 2.0 Flash - Fast, cost-effective, good accuracy
- Ollama with CodeLlama 7B - Local, fast, no API costs
- GPT-4 - Best accuracy, slower, higher cost
- Gemini Pro - Good balance of speed and accuracy
- Ollama with Mistral - Run locally, no internet required
- Ollama with CodeLlama - Optimized for code analysis
- LLM scanner is disabled by default to maintain backward compatibility
- API keys are never logged or stored in scan results
- Cache files use content hashes, not source code
- All scanners can run independently if one fails
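The last point, scanners running independently when one fails, can be sketched as a runner that isolates each scanner behind its own error boundary. This is a hypothetical illustration, not the project's actual code:

```python
def run_all(scanners):
    """Run each scanner; a failure in one never aborts the others."""
    results, errors = {}, {}
    for name, scan in scanners.items():
        try:
            results[name] = scan()
        except Exception as exc:
            errors[name] = str(exc)  # record the failure and keep going
    return results, errors

def broken():
    raise RuntimeError("semgrep not installed")

# Bandit and the LLM scanner still produce results despite Semgrep failing.
results, errors = run_all({
    "bandit": lambda: ["B602: subprocess with shell=True"],
    "semgrep": broken,
    "llm": lambda: [],
})
```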
CodeGlia can be deployed easily on:
- Render
- Railway
- Oracle Cloud (OCI)
- Azure App Service
- Amazon Web Services
Each deployment can use a `.env` file for secure API key handling.