Conversation

@OneZero-Y (Contributor) commented Sep 5, 2025

What type of PR is this?
feature

What this PR does / why we need it:

This PR adds comprehensive Prometheus metrics for the batch classification API, providing detailed performance monitoring and error-tracking capabilities.

Key Features:

  • 6 Core Metrics: Request counters, processing duration histograms, text count counters, error counters, concurrent goroutine gauges, and batch size distribution
  • Full Configuration Support: Configurable histogram buckets, batch size range labels, sample rate control, and optional high-precision tracking
  • Enhanced Error Handling: Tracks both input-validation and classification errors, each labeled with its error type
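The sample_rate option described below amounts to probabilistic gating of metric collection. A minimal stdlib-only Go sketch of that idea (the `shouldRecord` helper is hypothetical, not the PR's actual code):

```go
package main

import (
	"fmt"
	"math/rand"
)

// shouldRecord reports whether metrics should be collected for this
// request, given a configured sample rate in [0.0, 1.0].
// 1.0 records every request; 0.5 records roughly half of them.
func shouldRecord(sampleRate float64) bool {
	if sampleRate >= 1.0 {
		return true
	}
	if sampleRate <= 0.0 {
		return false
	}
	return rand.Float64() < sampleRate
}

func main() {
	fmt.Println(shouldRecord(1.0)) // always true
	fmt.Println(shouldRecord(0.0)) // always false
}
```

With `sample_rate: 1.0` (the default shown in the configuration below), every request is recorded; lowering it trades metric completeness for reduced overhead on hot paths.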

Configuration Example:

api:
  batch_classification:
    metrics:
      enabled: true                      # Enable comprehensive metrics collection
      detailed_goroutine_tracking: true  # Track individual goroutine lifecycle 
      high_resolution_timing: false      # Use nanosecond precision timing 
      sample_rate: 1.0                   # Collect metrics for all requests (1.0 = 100%, 0.5 = 50%)
      
      # Batch size range labels for metrics (OPTIONAL - uses sensible defaults)
      # Default ranges: "1", "2-5", "6-10", "11-20", "21-50", "50+"
      # Only specify if you need custom ranges:
      # batch_size_ranges:
      #   - {min: 1, max: 1, label: "1"}
      #   - {min: 2, max: 5, label: "2-5"}
      #   - {min: 6, max: 10, label: "6-10"}
      #   - {min: 11, max: 20, label: "11-20"}
      #   - {min: 21, max: 50, label: "21-50"}
      #   - {min: 51, max: -1, label: "50+"}  # -1 means no upper limit
      
      # Histogram buckets for metrics (directly configure what you need)
      duration_buckets: [0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, 30]
      size_buckets: [1, 2, 5, 10, 20, 50, 100, 200]
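The default batch size ranges in the comment above reduce to a simple size-to-label mapping. A stdlib-only Go sketch of that logic (illustrative only; the `sizeRange` type and `batchSizeLabel` function are hypothetical names, not the PR's actual code):

```go
package main

import "fmt"

// sizeRange mirrors one {min, max, label} entry; max == -1 means no upper limit.
type sizeRange struct {
	min, max int
	label    string
}

// defaultRanges matches the documented defaults:
// "1", "2-5", "6-10", "11-20", "21-50", "50+".
var defaultRanges = []sizeRange{
	{1, 1, "1"},
	{2, 5, "2-5"},
	{6, 10, "6-10"},
	{11, 20, "11-20"},
	{21, 50, "21-50"},
	{51, -1, "50+"},
}

// batchSizeLabel returns the metric label for a given batch size.
func batchSizeLabel(size int) string {
	for _, r := range defaultRanges {
		if size >= r.min && (r.max == -1 || size <= r.max) {
			return r.label
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(batchSizeLabel(3))   // "2-5"
	fmt.Println(batchSizeLabel(100)) // "50+"
}
```

Coarse range labels like these keep the label cardinality of the batch-size metric bounded, regardless of how large individual batches get.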

Metrics Query Examples:

# 1. View total requests by processing type
curl -s http://localhost:9190/metrics | grep "batch_classification_requests_total"

# 2. View processing duration distribution
curl -s http://localhost:9190/metrics | grep "batch_classification_duration_seconds"

# 3. View total texts processed
curl -s http://localhost:9190/metrics | grep "batch_classification_texts_total"

# 4. View error statistics by type
curl -s http://localhost:9190/metrics | grep "batch_classification_errors_total"

# 5. View concurrent goroutines (real-time)
curl -s http://localhost:9190/metrics | grep "batch_classification_concurrent_goroutines"

# 6. View batch size distribution
curl -s http://localhost:9190/metrics | grep "batch_classification_batch_size_distribution"
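Beyond raw grep, the histogram and counter metrics above lend themselves to PromQL aggregation. A sketch of Prometheus recording rules, assuming the standard `_bucket` suffix on the duration histogram and an `error_type` label on the error counter (the label name and rule names are assumptions, not confirmed by this PR):

```yaml
groups:
  - name: batch_classification
    rules:
      # p95 processing latency over the last 5 minutes
      - record: batch_classification:duration_seconds:p95
        expr: histogram_quantile(0.95, sum(rate(batch_classification_duration_seconds_bucket[5m])) by (le))
      # share of requests ending in error, by error type
      - record: batch_classification:error_ratio
        expr: sum(rate(batch_classification_errors_total[5m])) by (error_type) / ignoring(error_type) group_left sum(rate(batch_classification_requests_total[5m]))
```

Adjust metric and label names to match the actual output of the `/metrics` endpoint at `:9190`.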

netlify bot commented Sep 5, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit c805974
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68bad03b3198e1000863de3f
😎 Deploy Preview https://deploy-preview-58--vllm-semantic-router.netlify.app

github-actions bot commented Sep 5, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/metrics/metrics_test.go
  • src/semantic-router/pkg/api/server.go
  • src/semantic-router/pkg/api/server_test.go
  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/metrics/metrics.go

📁 config

Owners: @rootfs
Files changed:

  • config/config.yaml

📁 website

Owners: @Xunzhuo
Files changed:

  • website/docs/getting-started/configuration.md

This comment was automatically generated based on the OWNER files in the repository.

netlify bot commented Sep 5, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 219ff84
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68baeb4f8d1f3f0008a43158
😎 Deploy Preview https://deploy-preview-58--vllm-semantic-router.netlify.app

Signed-off-by: OneZero-Y <[email protected]>
@rootfs (Collaborator) commented Sep 5, 2025

@OneZero-Y nicely done!

For UX, would it be OK to give the batch size ranges a hardcoded default instead of exposing them in the config?

batch_size_ranges:
  - {min: 1, max: 1, label: "1"}
  - {min: 2, max: 5, label: "2-5"}
  - {min: 6, max: 10, label: "6-10"}
  - {min: 11, max: 20, label: "11-20"}
  - {min: 21, max: 50, label: "21-50"}
  - {min: 51, max: -1, label: "50+"}

@OneZero-Y (Contributor, Author) commented

> @OneZero-Y nicely done!
>
> For UX, would it be OK to give the batch size ranges a hardcoded default instead of exposing them in the config?
>
> batch_size_ranges:
>   - {min: 1, max: 1, label: "1"}
>   - {min: 2, max: 5, label: "2-5"}
>   - {min: 6, max: 10, label: "6-10"}
>   - {min: 11, max: 20, label: "11-20"}
>   - {min: 21, max: 50, label: "21-50"}
>   - {min: 51, max: -1, label: "50+"}

@rootfs Thanks for the review!
Should the configuration stay optional (falling back to hardcoded defaults when batch_size_ranges is not specified), or should we keep only the hardcoded values?

@rootfs (Collaborator) commented Sep 5, 2025

Let's hardcode it first. You can add a comment there to explain this.

@OneZero-Y (Contributor, Author) commented

> Let's hardcode it first. You can add a comment there to explain this.

The instructions remain in configuration.md, and the batch_size_ranges configuration has been removed from config.yaml. Please review:

# Batch size range labels for metrics (OPTIONAL - uses sensible defaults)
# Default ranges: "1", "2-5", "6-10", "11-20", "21-50", "50+"
# Only specify if you need custom ranges:
# batch_size_ranges:
#   - {min: 1, max: 1, label: "1"}
#   - {min: 2, max: 5, label: "2-5"}
#   - {min: 6, max: 10, label: "6-10"}
#   - {min: 11, max: 20, label: "11-20"}
#   - {min: 21, max: 50, label: "21-50"}
#   - {min: 51, max: -1, label: "50+"}  # -1 means no upper limit

@rootfs rootfs merged commit 420f7bc into vllm-project:main Sep 5, 2025
9 checks passed
@OneZero-Y OneZero-Y deleted the feat/add-batch-metrics branch September 5, 2025 14:46