Environment Configuration Guide

This document explains how to configure Flexible GraphRAG using environment variables and configuration files.

📁 Configuration Files

Primary Configuration

.env - Your main configuration file (copy from flexible-graphrag/env-sample.txt)
flexible-graphrag/env-sample.txt - Template with all options and examples

Frontend Configuration

Angular: Uses PROCESS_FOLDER_PATH
React/Vue: Uses VITE_PROCESS_FOLDER_PATH
See docs/SOURCE-PATH-EXAMPLES.md for details

🏗️ 5-Section Configuration Structure

The env-sample.txt template is organized into five sections:

Section 1: Graph Database Configuration

Graph database selection (GRAPH_DB)
Knowledge graph extraction settings (ENABLE_KNOWLEDGE_GRAPH)
Schema configuration (SCHEMA_NAME, SCHEMAS)
Graph database connection configs (GRAPH_DB_CONFIG)

Section 2: Vector Database Configuration

Vector database selection (VECTOR_DB)
Vector database connection configs (VECTOR_DB_CONFIG)
Index/collection names for vector storage

Section 3: Search Database Configuration

Search database selection (SEARCH_DB)
Search database connection configs (SEARCH_DB_CONFIG)
Index names for fulltext search

Section 4: LLM Configuration

LLM provider selection (LLM_PROVIDER)
Provider-specific settings (OpenAI, Ollama, Azure, etc.)
API keys and model configurations
Timeout settings

Section 5: Content Sources Configuration

CMIS and Alfresco settings for document sources
Authentication credentials

🔧 Database Configuration Patterns

Selection Variables

These control which backend to use:

VECTOR_DB=neo4j          # neo4j, qdrant, elasticsearch, opensearch, none
SEARCH_DB=elasticsearch  # elasticsearch, opensearch, bm25, none
GRAPH_DB=neo4j           # neo4j, kuzu, falkordb, arcadedb, memgraph, nebula, neptune, neptune_analytics, none
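
A typo in a selection variable is easy to miss, so it can help to check the values before starting the application. The following is a minimal sketch, not part of Flexible GraphRAG: `check_selection` is a hypothetical helper, and the allowed values are copied from the comments above. Whether the application itself accepts other spellings or letter cases is not covered here.

```python
import os

# Allowed values copied from the comments in the selection block above.
ALLOWED_BACKENDS = {
    "VECTOR_DB": {"neo4j", "qdrant", "elasticsearch", "opensearch", "none"},
    "SEARCH_DB": {"elasticsearch", "opensearch", "bm25", "none"},
    "GRAPH_DB": {
        "neo4j", "kuzu", "falkordb", "arcadedb", "memgraph",
        "nebula", "neptune", "neptune_analytics", "none",
    },
}


def check_selection(env=None):
    """Return a list of problems found in the selection variables (hypothetical helper)."""
    env = os.environ if env is None else env
    problems = []
    for name, allowed in ALLOWED_BACKENDS.items():
        value = env.get(name)
        if value is None:
            problems.append(f"{name} is not set")
        elif value.strip() not in allowed:
            problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return problems


if __name__ == "__main__":
    # Prints one line per problem; prints nothing when all three values are valid.
    for problem in check_selection():
        print(problem)
```

Passing a plain dict instead of relying on `os.environ` makes the check easy to try without changing the real environment.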

Connection Configuration

Each database type has a matching *_DB_CONFIG variable whose value is a JSON object with the connection settings:

# Vector database connections
VECTOR_DB_CONFIG={"host": "localhost", "port": 6333, "collection_name": "hybrid_search_vector"}

# Search database connections  
SEARCH_DB_CONFIG={"url": "http://localhost:9200", "index_name": "hybrid_search_fulltext"}

# Graph database connections
GRAPH_DB_CONFIG={"url": "bolt://localhost:7687", "username": "neo4j", "password": "password"}
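
Because these values are JSON, they must use double quotes and unquoted numbers, as in the examples above. A small sketch of how such a value could be read and validated follows. `load_db_config` is a hypothetical helper written for illustration, and treating an unset variable as an empty config is an assumption, not documented application behavior.

```python
import json
import os


def load_db_config(name, env=None):
    """Parse a *_DB_CONFIG variable into a dict; an unset or empty value yields {}."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return {}
    try:
        config = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Typical causes: single quotes, trailing commas, unquoted strings.
        raise ValueError(f"{name} is not valid JSON: {exc}") from exc
    if not isinstance(config, dict):
        raise ValueError(f"{name} must be a JSON object, got {type(config).__name__}")
    return config


if __name__ == "__main__":
    for variable in ("VECTOR_DB_CONFIG", "SEARCH_DB_CONFIG", "GRAPH_DB_CONFIG"):
        print(variable, "->", load_db_config(variable))
```

A value written with single quotes, such as {'host': 'localhost'}, is rejected by `json.loads`, which is the most common mistake this check catches.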

Individual Database Settings

Individual environment variables are also available for specific databases:

# Neo4j
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password

# Elasticsearch
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_USERNAME=
ELASTICSEARCH_PASSWORD=

# Qdrant
QDRANT_HOST=localhost
QDRANT_PORT=6333
QDRANT_API_KEY=
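
The individual variables carry the same information as the JSON configs shown earlier. The sketch below illustrates that correspondence for Neo4j and Qdrant. It is an assumption-laden example: `FIELD_MAP` and `settings_from_env` are hypothetical, the `api_key` key name is a guess, and how the application actually merges or prioritizes the two styles is not described in this guide.

```python
import os

# Hypothetical mapping from individual variables to the keys used in the
# *_DB_CONFIG examples; the application may name or merge these differently.
FIELD_MAP = {
    "neo4j": {
        "NEO4J_URI": "url",
        "NEO4J_USER": "username",
        "NEO4J_PASSWORD": "password",
    },
    "qdrant": {
        "QDRANT_HOST": "host",
        "QDRANT_PORT": "port",
        "QDRANT_API_KEY": "api_key",  # key name is an assumption
    },
}


def settings_from_env(backend, env=None):
    """Collect one backend's individual variables into a config-style dict."""
    env = os.environ if env is None else env
    settings = {}
    for variable, key in FIELD_MAP[backend].items():
        value = env.get(variable, "").strip()
        if not value:
            continue  # blank entries such as QDRANT_API_KEY= count as unset
        settings[key] = int(value) if key == "port" else value
    return settings
```

Blank entries are skipped so that an empty QDRANT_API_KEY= line does not end up as an empty credential in the resulting dict.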

🎯 Easy Database Switching

To switch backends, change the selection variables and, where they differ, the matching connection settings:

# Current setup: OpenAI + Qdrant + Elasticsearch + Neo4j
LLM_PROVIDER=openai
VECTOR_DB=qdrant
SEARCH_DB=elasticsearch  
GRAPH_DB=neo4j

# Switch to: Ollama + Neo4j for both vector and graph storage (Elasticsearch still handles search)
LLM_PROVIDER=ollama
VECTOR_DB=neo4j
SEARCH_DB=elasticsearch
GRAPH_DB=neo4j

# Switch to: OpenAI + Kuzu + OpenSearch
LLM_PROVIDER=openai  
VECTOR_DB=opensearch
SEARCH_DB=opensearch
GRAPH_DB=kuzu
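
To see exactly which variables change between two of the setups above, the three examples can be written as plain dictionaries and compared. This is only an illustrative sketch: the profile names and the `changed_variables` helper are hypothetical and not part of the project.

```python
# The three example setups from above, keyed by made-up profile names.
PROFILES = {
    "openai_qdrant_es_neo4j": {
        "LLM_PROVIDER": "openai",
        "VECTOR_DB": "qdrant",
        "SEARCH_DB": "elasticsearch",
        "GRAPH_DB": "neo4j",
    },
    "ollama_neo4j": {
        "LLM_PROVIDER": "ollama",
        "VECTOR_DB": "neo4j",
        "SEARCH_DB": "elasticsearch",
        "GRAPH_DB": "neo4j",
    },
    "openai_opensearch_kuzu": {
        "LLM_PROVIDER": "openai",
        "VECTOR_DB": "opensearch",
        "SEARCH_DB": "opensearch",
        "GRAPH_DB": "kuzu",
    },
}


def changed_variables(current, target):
    """Map each variable that differs to its (old, new) pair of values."""
    return {
        name: (current.get(name), value)
        for name, value in target.items()
        if current.get(name) != value
    }


if __name__ == "__main__":
    diff = changed_variables(PROFILES["openai_qdrant_es_neo4j"], PROFILES["ollama_neo4j"])
    for name, (old, new) in diff.items():
        print(f"{name}: {old} -> {new}")
```

Moving from the first setup to the second changes only LLM_PROVIDER and VECTOR_DB; the search and graph selections stay the same.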

📝 Configuration Best Practices

Development Setup

Start simple: Use Neo4j for both vector and graph storage
Use defaults: Copy env-sample.txt to .env and adjust API keys
Test locally: Use localhost connections before cloud deployment

Production Setup

Separate concerns: Use specialized databases (Qdrant for vectors, Elasticsearch for search)
Secure connections: Use proper authentication and HTTPS where supported
Performance tuning: Adjust timeout values and batch sizes

Schema Configuration

Start with default: Use SCHEMA_NAME=default for comprehensive extraction
Customize gradually: Create domain-specific schemas as needed
Test thoroughly: Compare different schema approaches on your content

🔗 Related Documentation

Source paths: docs/SOURCE-PATH-EXAMPLES.md
Schema examples: docs/SCHEMA-EXAMPLES.md
Timeout configuration: docs/TIMEOUT-CONFIGURATIONS.md
Neo4j URLs: docs/GRAPH-DATABASES/Neo4j-URLs.md
Vector dimensions: docs/VECTOR-DATABASES/VECTOR-DIMENSIONS.md

🚀 Quick Start

Copy template: cp flexible-graphrag/env-sample.txt .env
Set API key: Add your OpenAI API key to OPENAI_API_KEY
Choose databases: Uncomment your preferred database options
Update connections: Modify database URLs/credentials as needed
Test configuration: Run a small test to verify everything works
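
Before running a full test, a quick sanity check of the .env file can catch missing values. The sketch below is a simplified reader written for illustration, not the loader the application uses. It assumes the four selection variables should all be set and that OPENAI_API_KEY is needed when LLM_PROVIDER=openai. The functions `parse_env_text` and `quick_check` are hypothetical names.

```python
from pathlib import Path


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Simplification: drop trailing " # comment" text. A value that
        # legitimately contains " #" would be cut short by this.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def quick_check(values):
    """Return the expected variable names that are missing or empty."""
    required = ["LLM_PROVIDER", "VECTOR_DB", "SEARCH_DB", "GRAPH_DB"]
    if values.get("LLM_PROVIDER") == "openai":
        required.append("OPENAI_API_KEY")
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = quick_check(parse_env_text(env_path.read_text(encoding="utf-8")))
        print("Missing:", ", ".join(missing) if missing else "none")
    else:
        print("No .env file found; copy the template first.")
```

Run it from the directory that contains .env. This only confirms that values are present, so a connection test against each database is still needed afterwards.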

Because the template is split into five sections, you can change one part of the configuration, such as the vector database, without editing the others.
