This document describes the environment variables available for configuring the Observatory data generation performance optimizations.
| Variable | Default | Description |
|---|---|---|
| `OBSERVATORY_CHUNK_SIZE` | 20 | Number of directories to process in each chunk. Smaller values use less memory but may take longer. |
| `ENABLE_INCREMENTAL_OBSERVATORY` | false | Enable incremental processing: only rebuild directories with changes since the last run. |
| `ENABLE_OBSERVATORY_GC` | false | Force garbage collection between chunks to manage memory usage. |
| `OBSERVATORY_MAX_MEMORY_MB` | 1024 | Maximum memory usage target in MB (for future monitoring features). |
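As a rough illustration of how these variables map onto a runtime configuration, here is a hypothetical loader sketch. The variable names and defaults come from the table above; the `ObservatoryConfig` shape and `loadObservatoryConfig` function are illustrative assumptions, not the service's actual code.

```typescript
// Illustrative sketch only: the config shape and loader are assumptions.
interface ObservatoryConfig {
  chunkSize: number;
  incremental: boolean;
  forceGc: boolean;
  maxMemoryMb: number;
}

type Env = Record<string, string | undefined>;

function loadObservatoryConfig(env: Env): ObservatoryConfig {
  return {
    // Defaults mirror the table above when a variable is unset.
    chunkSize: Number(env["OBSERVATORY_CHUNK_SIZE"] ?? "20"),
    incremental: env["ENABLE_INCREMENTAL_OBSERVATORY"] === "true",
    forceGc: env["ENABLE_OBSERVATORY_GC"] === "true",
    maxMemoryMb: Number(env["OBSERVATORY_MAX_MEMORY_MB"] ?? "1024"),
  };
}

// In a Node.js service this would typically be called as
// loadObservatoryConfig(process.env).
```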
```bash
# Use smaller chunks and enable GC for memory-constrained environments
export OBSERVATORY_CHUNK_SIZE=10
export ENABLE_OBSERVATORY_GC=true
export OBSERVATORY_MAX_MEMORY_MB=512
```

```bash
# Use larger chunks for faster processing when memory is not a concern
export OBSERVATORY_CHUNK_SIZE=50
export ENABLE_INCREMENTAL_OBSERVATORY=true
```

```bash
# Enable incremental processing to speed up daily runs
export ENABLE_INCREMENTAL_OBSERVATORY=true
export OBSERVATORY_CHUNK_SIZE=20
```

Chunk size guidelines:

- Small chunks (5-10): Best for memory-constrained environments (< 2GB RAM)
- Medium chunks (15-25): Good balance for most production environments
- Large chunks (30-50): Best for development or high-memory environments (> 8GB RAM)
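The chunking itself is straightforward: the directory list is split into `ceil(total / chunkSize)` batches that are processed one at a time. A minimal sketch of the splitting step, under the assumption that the service batches an in-memory list (this is not the service's actual code):

```typescript
// Split a list into consecutive chunks of at most chunkSize items.
function toChunks<T>(items: T[], chunkSize: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// 120 directories with OBSERVATORY_CHUNK_SIZE=20 → 6 chunks of 20,
// matching the "Processing chunk 1/6" log lines shown below.
```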
Incremental processing (`ENABLE_INCREMENTAL_OBSERVATORY`):

- Enable when: You have daily/regular automated runs
- Disable when: Manual runs or when you need guaranteed full data refresh
- Note: First run after enabling incremental will still be a full run
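The selection logic behind incremental mode can be sketched as filtering directories by their last modification time. Field names (`updatedAt`) and the `changedSince` helper are hypothetical; the real change detection lives in `getChangedDirectoryIds()`:

```typescript
// Hypothetical sketch of incremental selection; field names are assumptions.
interface DirectoryRow {
  id: number;
  updatedAt: Date;
}

function changedSince(
  dirs: DirectoryRow[],
  lastRun: Date | null
): DirectoryRow[] {
  // No recorded last run (e.g. the first run after enabling incremental
  // mode) means everything is rebuilt — a full run, as noted above.
  if (lastRun === null) return dirs;
  return dirs.filter((d) => d.updatedAt > lastRun);
}
```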
Garbage collection (`ENABLE_OBSERVATORY_GC`):

- Enable when: Running in memory-constrained environments
- Disable when: Performance is more important than memory usage
- Note: Adds slight processing overhead but prevents memory issues
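In Node.js, forcing garbage collection requires launching the process with the `--expose-gc` flag so that `global.gc` is defined. A guarded sketch of what "GC between chunks" might look like (illustrative, not the service's actual code):

```typescript
// Run a forced GC pass between chunks when enabled and available.
// Returns true only if a collection was actually triggered.
function maybeForceGc(enabled: boolean): boolean {
  // global.gc exists only when Node.js was started with --expose-gc.
  const gc = (globalThis as { gc?: () => void }).gc;
  if (enabled && typeof gc === "function") {
    gc(); // reclaim memory before the next chunk is loaded
    return true;
  }
  return false;
}
```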
Before enabling these optimizations, ensure the performance indexes are applied:

```sql
-- Apply indexes from performance-indexes.sql
source performance-indexes.sql;
```

The optimized Observatory service provides console logging for monitoring:
```
Starting chunked data generation with chunk size: 20
Processing 120 directories in chunks of 20
Processing chunk 1/6
Processing chunk 2/6
...
Forced garbage collection after chunk 3
Chunked data generation completed
```
If you still encounter memory issues:

- Reduce `OBSERVATORY_CHUNK_SIZE` to 5-10
- Enable `ENABLE_OBSERVATORY_GC=true`
- Consider running during off-peak hours

If processing is too slow:

- Increase `OBSERVATORY_CHUNK_SIZE` to 30-50
- Disable `ENABLE_OBSERVATORY_GC`
- Ensure the database indexes are applied

If incremental processing misses changes:

- Run one manual generation: `generateData(true)`
- Check the change detection queries in `getChangedDirectoryIds()`
- Temporarily disable incremental processing
Suggested rollout order:

- Phase 1: Apply database indexes
- Phase 2: Enable chunked processing with default settings
- Phase 3: Fine-tune chunk sizes based on your environment
- Phase 4: Enable incremental processing for regular automated runs
The legacy `getDataLegacy()` method is retained as a rollback path if needed.