| /// \param metric The metric object to update. | ||
| /// \param value The amount to observe metric's value to. | ||
| /// \return a TRITONSERVER_Error indicating success or failure. | ||
| TRITONSERVER_DECLSPEC struct TRITONSERVER_Error* TRITONSERVER_MetricObserve( |
Reuse TRITONSERVER_MetricSet?
I would need to take a closer look, but my gut reaction is that Guan is probably right: we can likely just reuse MetricValue and MetricSet, which would call Collect and Observe internally when kind == KIND_HISTOGRAM, provided they are functionally equivalent.
MetricValue may not work if Collect returns multiple values (one per bucket?), but again will need to take a closer look. Let me know if you already know more details on this from your research.
But similar to the new C API for MetricV2, keep in mind how this would work if we added support for a Summary metric and wanted to get the values for each quantile, which is basically the same as getting the values for each bucket. Ideally the same API would work for both, or for all types.
MetricValue cannot be reused for histogram.
I would like to revisit this change. The consensus is to keep the C API and the python_backend API 1:1 matched. I am inclined to add a new C API TRITONSERVER_MetricObserve for histogram instead of reusing TRITONSERVER_MetricSet, for three reasons.
- Both Histogram and Summary types call Observe to record a new value. We can reuse Observe for the Summary type if we add it in the future.
- Histogram also has an ObserveMultiple API which may be added in the future. I don't like the idea of Histogram.Set and Histogram.ObserveMultiple coexisting.
- Setting a histogram to a value, aka Histogram.set(val), is semantically wrong. It is confusing to users familiar with Prometheus APIs. The description of TRITONSERVER_MetricSet would also become verbose, since it would have to describe different behaviors for counter/gauge and histogram/summary.
Well, the Triton metrics API is not supposed to mirror the "Prometheus API"; Prometheus is one of the "forms" in which we can expose the metrics. So we should design the API using generic terms for statistics. The meaning of gauge/counter/histogram (and summary?) is not affected by whether Prometheus or another metrics library is used.
Thinking from this mindset, my question is whether observe is the generic verb for recording the statistics of a histogram. If that is the case, then I am fine with adding XXXObserve; otherwise, we should use the proper verb.
I think either sample or observe. Voting for Observe for simplicity.
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
Similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
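To make the quoted definition concrete, here is a minimal sketch of what "observe" does for a histogram: it counts the value in the matching bucket and adds it to a running sum. SketchHistogram and its methods are hypothetical names for illustration only; this is not Triton's or prometheus-cpp's implementation, and it assumes bucket bounds are inclusive upper bounds with one implicit +Inf bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a histogram metric (illustration only).
class SketchHistogram {
 public:
  // `upper_bounds` are inclusive bucket upper bounds in ascending order.
  // One extra implicit bucket (+Inf) catches values above the last bound.
  explicit SketchHistogram(std::vector<double> upper_bounds)
      : bounds_(std::move(upper_bounds)), counts_(bounds_.size() + 1, 0)
  {
    assert(std::is_sorted(bounds_.begin(), bounds_.end()));
  }

  // Record one observation: increment the first bucket whose bound is
  // >= value (or the +Inf bucket), and add the value to the sum.
  void Observe(double value)
  {
    auto it = std::lower_bound(bounds_.begin(), bounds_.end(), value);
    counts_[static_cast<std::size_t>(it - bounds_.begin())] += 1;
    sum_ += value;
  }

  // Cumulative counts: entry i is the number of observations that fell
  // into bucket i or any earlier bucket; the last entry is the total.
  std::vector<std::uint64_t> CumulativeCounts() const
  {
    std::vector<std::uint64_t> out(counts_.size());
    std::uint64_t running = 0;
    for (std::size_t i = 0; i < counts_.size(); ++i) {
      running += counts_[i];
      out[i] = running;
    }
    return out;
  }

  double Sum() const { return sum_; }

 private:
  std::vector<double> bounds_;         // declared first: counts_ depends on it
  std::vector<std::uint64_t> counts_;  // per-bucket (non-cumulative) counts
  double sum_ = 0.0;
};
```

Under this reading, "setting" a histogram has no natural meaning, since each call adds a sample rather than replacing a value.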
d0fed63 to fd5c44b
79566e7 to edb0533
include/triton/core/tritonserver.h
Outdated
| /// Supports metrics of kind TRITONSERVER_METRIC_KIND_GAUGE and returns | ||
| /// Set the current value of metric to value or observe the value to metric. | ||
| /// Supports metrics of kind TRITONSERVER_METRIC_KIND_GAUGE and | ||
| /// TRITONSERVER_METRIC_KIND_HISTOGRAM. Returns |
Do we want to explain what it does when the metric kind is TRITONSERVER_METRIC_KIND_HISTOGRAM (i.e. increment the counter for the bucket that value matches)?
What does "observe" mean? Can we add more details?
That's why I still think we need a new C API TRITONSERVER_MetricObserve.
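A rough sketch of the two-function shape being argued for, where each entry point documents one behavior and rejects kinds it does not apply to. All names here (SketchKind, SketchMetric, SketchSet, SketchObserve) are hypothetical, the error handling is simplified to a string, and the histogram state is reduced to a count and a sum; none of this is the actual Triton C API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical metric kinds and state (illustration only).
enum class SketchKind { kCounter, kGauge, kHistogram };

struct SketchMetric {
  SketchKind kind;
  double value = 0.0;       // used by counter/gauge
  std::uint64_t count = 0;  // used by histogram: number of observations
  double sum = 0.0;         // used by histogram: sum of observations
};

// Set replaces the current value. Returns an empty string on success,
// otherwise an error message.
std::string SketchSet(SketchMetric& metric, double value)
{
  if (metric.kind != SketchKind::kGauge) {
    return "Set is only supported for gauge metrics";
  }
  metric.value = value;
  return "";
}

// Observe records one more sample. Returns an empty string on success,
// otherwise an error message.
std::string SketchObserve(SketchMetric& metric, double value)
{
  if (metric.kind != SketchKind::kHistogram) {
    return "Observe is only supported for histogram metrics";
  }
  metric.count += 1;
  metric.sum += value;
  return "";
}
```

With this split, the doc comment for each function only has to describe a single behavior, which is the verbosity concern raised above.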
src/metric_family.h
Outdated
| buckets_.resize(bucket_count); | ||
| std::memcpy(buckets_.data(), buckets, sizeof(double) * bucket_count); |
| buckets_.resize(bucket_count); | |
| std::memcpy(buckets_.data(), buckets, sizeof(double) * bucket_count); | |
| buckets_ = std::vector<double>(buckets, buckets + bucket_count); |
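For reference, a small standalone comparison of the two forms in the suggestion above. The function names CopyWithMemcpy and CopyWithRange are made up for this sketch; only the bodies correspond to the original and suggested lines. Both produce the same vector, but the range constructor does it in one step and needs no size arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Original form: resize, then copy raw bytes.
std::vector<double> CopyWithMemcpy(
    const double* buckets, std::size_t bucket_count)
{
  assert(buckets != nullptr || bucket_count == 0);
  std::vector<double> out;
  out.resize(bucket_count);
  // Guard so memcpy is never called with a possibly-null pointer.
  if (bucket_count > 0) {
    std::memcpy(out.data(), buckets, sizeof(double) * bucket_count);
  }
  return out;
}

// Suggested form: construct directly from the pointer range
// [buckets, buckets + bucket_count).
std::vector<double> CopyWithRange(
    const double* buckets, std::size_t bucket_count)
{
  assert(buckets != nullptr || bucket_count == 0);
  return std::vector<double>(buckets, buckets + bucket_count);
}
```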
src/test/metrics_api_test.cc
Outdated
| std::vector<std::uint64_t> cumulative_counts = {1, 1, 2, 2, 3, 3}; | ||
| ASSERT_EQ(buckets.size() + 1, cumulative_counts.size()); |
cumulative_counts depends on the buckets you chose for the histogram; you should initialize cumulative_counts from the buckets and the data rather than hard-coding it.
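One possible way to do that in a test, sketched under the assumption that bucket bounds are inclusive upper bounds and that the histogram exposes one extra +Inf bucket (which is what the buckets.size() + 1 check implies). ExpectedCumulativeCounts is a hypothetical helper name, not something from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical test helper: derive the expected cumulative bucket counts
// from the bucket bounds and the observed data.
std::vector<std::uint64_t> ExpectedCumulativeCounts(
    const std::vector<double>& buckets, std::vector<double> data)
{
  assert(std::is_sorted(buckets.begin(), buckets.end()));
  std::sort(data.begin(), data.end());

  std::vector<std::uint64_t> counts;
  counts.reserve(buckets.size() + 1);
  for (double bound : buckets) {
    // upper_bound returns the first sample > bound, so its offset is the
    // number of samples <= bound.
    auto it = std::upper_bound(data.begin(), data.end(), bound);
    counts.push_back(static_cast<std::uint64_t>(it - data.begin()));
  }
  // The implicit +Inf bucket contains every sample.
  counts.push_back(static_cast<std::uint64_t>(data.size()));
  return counts;
}
```

The result always has buckets.size() + 1 entries, so the size assertion holds by construction and the expected values stay correct if the buckets or the test data change.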
What does the PR do?
Support histogram metric type and add tests.
Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the github PR.
Related PRs:
triton-inference-server/vllm_backend#56
triton-inference-server/python_backend#374
triton-inference-server/server#7525
Where should the reviewer start?
n/a
Test plan:
n/a
17487728
Caveats:
n/a
Background
Customer requested histogram metrics from vLLM.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
n/a