FlexKV integrates a Prometheus-based runtime metrics monitoring framework, covering critical paths in both the Python and C++ layers. The framework is embedded in the FlexKV runtime in a zero-intrusion manner — users simply set the environment variable FLEXKV_ENABLE_METRICS=1 to automatically collect core metrics such as cache hits, memory pool status, and data transfers during application runtime, exposing them via standard HTTP endpoints for Prometheus scraping and Grafana visualization.
| Variable | Default | Description |
|---|---|---|
FLEXKV_ENABLE_METRICS |
0 |
Enable metrics collection (set to 1 to enable, disabled by default) |
FLEXKV_PY_METRICS_PORT |
8080 |
Python metrics HTTP server port |
FLEXKV_CPP_METRICS_PORT |
8081 |
C++ metrics HTTP server port |
# Enable FlexKV metrics collection
export FLEXKV_ENABLE_METRICS=1
# Custom ports (optional)
export FLEXKV_PY_METRICS_PORT=8080
export FLEXKV_CPP_METRICS_PORT=8081Python metrics are recorded by GlobalCacheEngine in cache_engine.py and collected via FlexKVMetricsCollector.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
flexkv_py_cache_hit_blocks_total |
Counter | device |
Total number of cache-hit blocks |
flexkv_py_cache_miss_blocks_total |
Counter | - | Total number of cache-miss blocks (missed at all levels) |
flexkv_py_transfer_blocks_total |
Counter | transfer_type, operation |
Total number of transferred blocks |
flexkv_py_transfer_ops_total |
Counter | transfer_type, operation |
Number of transfer operations |
flexkv_py_mempool_total_blocks |
Gauge | device |
Total blocks in memory pool |
flexkv_py_mempool_free_blocks |
Gauge | device |
Free blocks in memory pool |
flexkv_py_evicted_blocks_total |
Counter | device |
Total number of evicted blocks |
flexkv_py_allocated_blocks_total |
Counter | device |
Total number of allocated blocks |
flexkv_py_allocation_failures_total |
Counter | mode |
Number of allocation failures |
C++ metrics are managed by the MetricsManager singleton, primarily instrumented in RadixTree cache operations and data transfers.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
flexkv_cpp_transfer_ops_total |
Counter | type, direction |
C++ layer data transfer operation count |
flexkv_cpp_transfer_bytes_total |
Counter | type, direction |
C++ layer total transferred bytes |
flexkv_cpp_cache_ops_total |
Counter | operation |
RadixTree cache operation count |
flexkv_cpp_cache_blocks_total |
Counter | operation |
Blocks involved in RadixTree cache operations |
FlexKV/monitoring/
├── docker-compose.yml # Prometheus + Grafana container orchestration
├── prometheus.yml # Prometheus scrape configuration
└── grafana/
├── dashboards/
│ └── flexkv-demo.json # Grafana pre-built dashboard
└── provisioning/
├── dashboards/
│ └── dashboards.yml # Dashboard auto-load configuration
└── datasources/
└── prometheus.yml # Datasource auto-configuration
# 0. Install Python dependency
pip3 install prometheus_client
# 1. Start FlexKV application with monitoring enabled
export FLEXKV_ENABLE_METRICS=1
python your_flexkv_app.py
# 2. Start Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose up -d
# 3. Stop Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose stop
# 4. Fully clean up Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose down -v| Service | URL | Description |
|---|---|---|
| Python Metrics | http://localhost:8080/metrics |
Python runtime metrics endpoint |
| C++ Metrics | http://localhost:8081/metrics |
C++ runtime metrics endpoint |
| Prometheus | http://localhost:9090 |
Metrics query interface |
| Grafana | http://localhost:3000 |
Visualization dashboards |
Quick endpoint verification:
# Verify Python metrics endpoint
curl -s http://localhost:8080/metrics | grep flexkv_py_
# Verify C++ metrics endpoint
curl -s http://localhost:8081/metrics | grep flexkv_cpp_- Open your browser and navigate to
http://localhost:3000 - Log in with default credentials: username
admin, passwordadmin - Go to Dashboards → FlexKV Demo to view the pre-built dashboard
Pre-built dashboard panels:
| Panel | Description |
|---|---|
| Cache Hit/Miss Rate | Python layer cache hit/miss rate |
| Memory Pool Blocks | Python layer memory pool block statistics |
| C++ Cache Operations Rate | C++ layer cache operation rate |
| C++ Transfer Throughput | C++ layer data transfer throughput |
Users can create custom panels and configure PromQL queries as needed.