FlexKV Prometheus Metrics Documentation

FlexKV integrates a Prometheus-based runtime metrics monitoring framework, covering critical paths in both the Python and C++ layers. The framework is embedded in the FlexKV runtime in a zero-intrusion manner — users simply set the environment variable FLEXKV_ENABLE_METRICS=1 to automatically collect core metrics such as cache hits, memory pool status, and data transfers during application runtime, exposing them via standard HTTP endpoints for Prometheus scraping and Grafana visualization.

1. Configuration

1.1 Environment Variables

Variable	Default	Description
`FLEXKV_ENABLE_METRICS`	`0`	Enable metrics collection (set to `1` to enable, disabled by default)
`FLEXKV_PY_METRICS_PORT`	`8080`	Python metrics HTTP server port
`FLEXKV_CPP_METRICS_PORT`	`8081`	C++ metrics HTTP server port

1.2 Configuration

# Enable FlexKV metrics collection
export FLEXKV_ENABLE_METRICS=1

# Custom ports (optional)
export FLEXKV_PY_METRICS_PORT=8080
export FLEXKV_CPP_METRICS_PORT=8081

2. Metrics Reference

2.1 Python Runtime Metrics (`flexkv_py_*`)

Python metrics are recorded by GlobalCacheEngine in cache_engine.py and collected via FlexKVMetricsCollector.

Metric Name	Type	Labels	Description
`flexkv_py_cache_hit_blocks_total`	Counter	`device`	Total number of cache-hit blocks
`flexkv_py_cache_miss_blocks_total`	Counter	-	Total number of cache-miss blocks (missed at all levels)
`flexkv_py_transfer_blocks_total`	Counter	`transfer_type`, `operation`	Total number of transferred blocks
`flexkv_py_transfer_ops_total`	Counter	`transfer_type`, `operation`	Number of transfer operations
`flexkv_py_mempool_total_blocks`	Gauge	`device`	Total blocks in memory pool
`flexkv_py_mempool_free_blocks`	Gauge	`device`	Free blocks in memory pool
`flexkv_py_evicted_blocks_total`	Counter	`device`	Total number of evicted blocks
`flexkv_py_allocated_blocks_total`	Counter	`device`	Total number of allocated blocks
`flexkv_py_allocation_failures_total`	Counter	`mode`	Number of allocation failures

2.2 C++ Runtime Metrics (`flexkv_cpp_*`)

C++ metrics are managed by the MetricsManager singleton, primarily instrumented in RadixTree cache operations and data transfers.

Metric Name	Type	Labels	Description
`flexkv_cpp_transfer_ops_total`	Counter	`type`, `direction`	C++ layer data transfer operation count
`flexkv_cpp_transfer_bytes_total`	Counter	`type`, `direction`	C++ layer total transferred bytes
`flexkv_cpp_cache_ops_total`	Counter	`operation`	RadixTree cache operation count
`flexkv_cpp_cache_blocks_total`	Counter	`operation`	Blocks involved in RadixTree cache operations

3. Monitoring Stack Deployment

3.1 Directory Structure

FlexKV/monitoring/
├── docker-compose.yml         # Prometheus + Grafana container orchestration
├── prometheus.yml             # Prometheus scrape configuration
└── grafana/
    ├── dashboards/
    │   └── flexkv-demo.json   # Grafana pre-built dashboard
    └── provisioning/
        ├── dashboards/
        │   └── dashboards.yml # Dashboard auto-load configuration
        └── datasources/
            └── prometheus.yml # Datasource auto-configuration

3.2 Quick Deploy

# 0. Install Python dependency
pip3 install prometheus_client

# 1. Start FlexKV application with monitoring enabled
export FLEXKV_ENABLE_METRICS=1
python your_flexkv_app.py

# 2. Start Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose up -d

# 3. Stop Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose stop

# 4. Fully clean up Prometheus + Grafana services
cd <path-to-FlexKV>/monitoring
docker compose down -v

3.3 Service Access

Service	URL	Description
Python Metrics	`http://localhost:8080/metrics`	Python runtime metrics endpoint
C++ Metrics	`http://localhost:8081/metrics`	C++ runtime metrics endpoint
Prometheus	`http://localhost:9090`	Metrics query interface
Grafana	`http://localhost:3000`	Visualization dashboards

Quick endpoint verification:

# Verify Python metrics endpoint
curl -s http://localhost:8080/metrics | grep flexkv_py_

# Verify C++ metrics endpoint
curl -s http://localhost:8081/metrics | grep flexkv_cpp_

3.4 Accessing Grafana Dashboards

Open your browser and navigate to http://localhost:3000
Log in with default credentials: username admin, password admin
Go to Dashboards → FlexKV Demo to view the pre-built dashboard

Pre-built dashboard panels:

Panel	Description
Cache Hit/Miss Rate	Python layer cache hit/miss rate
Memory Pool Blocks	Python layer memory pool block statistics
C++ Cache Operations Rate	C++ layer cache operation rate
C++ Transfer Throughput	C++ layer data transfer throughput

Users can create custom panels and configure PromQL queries as needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FlexKV Prometheus Metrics Documentation

1. Configuration

1.1 Environment Variables

1.2 Configuration

2. Metrics Reference

2.1 Python Runtime Metrics (`flexkv_py_*`)

2.2 C++ Runtime Metrics (`flexkv_cpp_*`)

3. Monitoring Stack Deployment

3.1 Directory Structure

3.2 Quick Deploy

3.3 Service Access

3.4 Accessing Grafana Dashboards

FilesExpand file tree

README_en.md

Latest commit

History

README_en.md

File metadata and controls

FlexKV Prometheus Metrics Documentation

1. Configuration

1.1 Environment Variables

1.2 Configuration

2. Metrics Reference

2.1 Python Runtime Metrics (flexkv_py_*)

2.2 C++ Runtime Metrics (flexkv_cpp_*)

3. Monitoring Stack Deployment

3.1 Directory Structure

3.2 Quick Deploy

3.3 Service Access

3.4 Accessing Grafana Dashboards

2.1 Python Runtime Metrics (`flexkv_py_*`)

2.2 C++ Runtime Metrics (`flexkv_cpp_*`)