Optimize metric name validation to fix 2-3x performance regression (#1662)

Copilot · zeitlinger · web-flow · commit c9cfa60744ae · 2025-11-04T14:15:39.000+01:00
Regex validation in `isValidLegacyLabelName()`, `isValidLegacyMetricName()`, and `validateUnitName()` was being called on every metric name during text format export, causing significant overhead. ## Changes - Replace regex pattern matching with character-by-character validation in `isValidLegacyLabelName()`, `isValidLegacyMetricName()`, and `validateUnitName()` - Deprecate unused `METRIC_NAME_PATTERN`, `LEGACY_LABEL_NAME_PATTERN`, and `UNIT_NAME_PATTERN` fields (kept for API compatibility) - Update JavaDoc to reflect validation approach ## Implementation Before: ```java public static boolean isValidLegacyLabelName(String name) { return LEGACY_LABEL_NAME_PATTERN.matcher(name).matches(); } ``` After: ```java public static boolean isValidLegacyLabelName(String name) { if (name.isEmpty()) return false; char first = name.charAt(0); if (!((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z') || first == '_')) { return false; } for (int i = 1; i < name.length(); i++) { char c = name.charAt(i); if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || c == '_')) { return false; } } return true; } ``` Benchmark results show recovery to near-baseline performance (532k ops/s vs 534k ops/s on main). > [!WARNING] > <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <__filter_complete__></__filter_complete__></details><issue_title>Performance regression in text-format export on 1.4.0+</issue_title> ><issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the `io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels` method due to it now calling `io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName`. > > I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms. > > I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old `simpleclient` to the new `client-java` and that drew my attention, as the text format export is now slower than in `simpleclient`. > > I have attached the flamegraph I captured on both versions. > ><a href="https://github.com/user-attachments/files/23315564/flamegraph-v141.html">flamegraph-v141.html</a> > <a href="https://github.com/user-attachments/files/23315565/flamegraph-v1310.html">flamegraph-v1310.html</a></issue_description> > > ## Comments on the Issue (you are @copilot in this section) > ><comments> ><comment_new><author>@zeitlinger</author> > Thanks for the report > > I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet). > > ``` > ## Report details > - **Date generated:** 2025-11-04 08:16:07 > > ## Hardware Information: > - **Hardware Model:** Micro-Star International Co., Ltd. MS-7D76 > - **Memory:** 96.0 GiB > - **Processor:** AMD Ryzen™ 9 7900 × 24 > > ## Software Information: > - **Firmware Version:** A.N1 > - **OS Name:** Ubuntu 24.04.3 LTS > - **OS Build:** (null) > - **OS Type:** 64-bit > - **GNOME Version:** 46 > - **Windowing System:** X11 > - **Kernel Version:** Linux 6.14.0-114034-tuxedo > > tooling > > temurin-25.0.1+8.0.LTS > > main > > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 489698.973 ± 16399.308 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 507779.365 ± 2619.768 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 534028.708 ± 5998.689 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 522323.579 ± 18123.729 ops/s > > 1.3.10 > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 934183.222 ± 10919.023 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 936023.986 ± 10402.193 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 958813.578 ± 17958.230 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 965133.616 ± 10907.457 ops/s > ``` > </comment_new> ></comments> > - Fixes #1660  <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <issue_title>Performance regression in text-format export on 1.4.0+</issue_title> > <issue_description>It seems like the adding of support for UTF-8 characters has considerably increased processing time in the `io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeNameAndLabels` method due to it now calling `io.prometheus.metrics.model.snapshots.PrometheusNaming.isValidLegacyMetricName`. > > I ran a local test with 1000 metrics exported. Average time per export on version 1.3.10 was 0.9-1.1 ms, while on version 1.4.1 I got an average of 2.7-2.9 ms. > > I'm not sure which avenues make sense to avoid this regression and I'm not sure how relevant it is. I just found this while evaluating a move from the old `simpleclient` to the new `client-java` and that drew my attention, as the text format export is now slower than in `simpleclient`. > > I have attached the flamegraph I captured on both versions. > > [flamegraph-v141.html](https://github.com/user-attachments/files/23315564/flamegraph-v141.html) > [flamegraph-v1310.html](https://github.com/user-attachments/files/23315565/flamegraph-v1310.html)</issue_description> > > ## Comments on the Issue (you are @copilot in this section) > > <comments> > <comment_new><author>@zeitlinger</author><body> > Thanks for the report > > I can confirm that the existing benchmark shows the same - we just have not integrated that into our release process (yet). > > ``` > ## Report details > - **Date generated:** 2025-11-04 08:16:07 > > ## Hardware Information: > - **Hardware Model:** Micro-Star International Co., Ltd. MS-7D76 > - **Memory:** 96.0 GiB > - **Processor:** AMD Ryzen™ 9 7900 × 24 > > ## Software Information: > - **Firmware Version:** A.N1 > - **OS Name:** Ubuntu 24.04.3 LTS > - **OS Build:** (null) > - **OS Type:** 64-bit > - **GNOME Version:** 46 > - **Windowing System:** X11 > - **Kernel Version:** Linux 6.14.0-114034-tuxedo > > tooling > > temurin-25.0.1+8.0.LTS > > main > > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 489698.973 ± 16399.308 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 507779.365 ± 2619.768 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 534028.708 ± 5998.689 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 522323.579 ± 18123.729 ops/s > > 1.3.10 > Benchmark Mode Cnt Score Error Units > TextFormatUtilBenchmark.openMetricsWriteToByteArray thrpt 25 934183.222 ± 10919.023 ops/s > TextFormatUtilBenchmark.openMetricsWriteToNull thrpt 25 936023.986 ± 10402.193 ops/s > TextFormatUtilBenchmark.prometheusWriteToByteArray thrpt 25 958813.578 ± 17958.230 ops/s > TextFormatUtilBenchmark.prometheusWriteToNull thrpt 25 965133.616 ± 10907.457 ops/s > ``` > </body></comment_new> > </comments> > </details> - Fixes #1660  --- 💬 We'd love your input! Share your thoughts on Copilot coding agent in our [2 minute survey](https://gh.io/copilot-coding-agent-survey). --------- Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: zeitlinger <2832627+zeitlinger@users.noreply.github.com> Co-authored-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
diff --git a/prometheus-metrics-model/src/main/java/io/prometheus/metrics/model/snapshots/PrometheusNaming.java b/prometheus-metrics-model/src/main/java/io/prometheus/metrics/model/snapshots/PrometheusNaming.java
@@ -6,7 +6,6 @@
 
 import io.prometheus.metrics.config.EscapingScheme;
 import java.nio.charset.StandardCharsets;
-import java.util.regex.Pattern;
 import javax.annotation.Nullable;
 
 /**
@@ -18,15 +17,6 @@
  */
 public class PrometheusNaming {
 
-  private static final Pattern METRIC_NAME_PATTERN = Pattern.compile("^[a-zA-Z_:][a-zA-Z0-9_:]*$");
-
-  /** Legal characters for label names. */
-  private static final Pattern LEGACY_LABEL_NAME_PATTERN =
-      Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");
-
-  /** Legal characters for unit names, including dot. */
-  private static final Pattern UNIT_NAME_PATTERN = Pattern.compile("^[a-zA-Z0-9_.:]+$");
-
   /**
    * According to OpenMetrics {@code _count} and {@code _sum} (and {@code _gcount}, {@code _gsum})
    * should also be reserved metric name suffixes. However, popular instrumentation libraries have
@@ -51,7 +41,9 @@ public class PrometheusNaming {
    * Test if a metric name is valid. Rules:
    *
    * <ul>
-   *   <li>The name must match {@link #METRIC_NAME_PATTERN}.
+   *   <li>The name must match <a
+   *       href="https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels">Metric
+   *       names</a>.
    *   <li>The name MUST NOT end with one of the {@link #RESERVED_METRIC_NAME_SUFFIXES}.
    * </ul>
    *
@@ -90,7 +82,29 @@ public static String validateMetricName(String name) {
   }
 
   public static boolean isValidLegacyMetricName(String name) {
-    return METRIC_NAME_PATTERN.matcher(name).matches();
+    if (name.isEmpty()) {
+      return false;
+    }
+    // First character must be [a-zA-Z_:]
+    char first = name.charAt(0);
+    if (!((first >= 'a' && first <= 'z')
+        || (first >= 'A' && first <= 'Z')
+        || first == '_'
+        || first == ':')) {
+      return false;
+    }
+    // Remaining characters must be [a-zA-Z0-9_:]
+    for (int i = 1; i < name.length(); i++) {
+      char c = name.charAt(i);
+      if (!((c >= 'a' && c <= 'z')
+          || (c >= 'A' && c <= 'Z')
+          || (c >= '0' && c <= '9')
+          || c == '_'
+          || c == ':')) {
+        return false;
+      }
+    }
+    return true;
   }
 
   public static boolean isValidLabelName(String name) {
@@ -106,7 +120,25 @@ private static boolean isValidUtf8(String name) {
   }
 
   public static boolean isValidLegacyLabelName(String name) {
-    return LEGACY_LABEL_NAME_PATTERN.matcher(name).matches();
+    if (name.isEmpty()) {
+      return false;
+    }
+    // First character must be [a-zA-Z_]
+    char first = name.charAt(0);
+    if (!((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z') || first == '_')) {
+      return false;
+    }
+    // Remaining characters must be [a-zA-Z0-9_]
+    for (int i = 1; i < name.length(); i++) {
+      char c = name.charAt(i);
+      if (!((c >= 'a' && c <= 'z')
+          || (c >= 'A' && c <= 'Z')
+          || (c >= '0' && c <= '9')
+          || c == '_')) {
+        return false;
+      }
+    }
+    return true;
   }
 
   /**
@@ -129,8 +161,17 @@ public static String validateUnitName(String name) {
         return suffixName + " is a reserved suffix in Prometheus";
       }
     }
-    if (!UNIT_NAME_PATTERN.matcher(name).matches()) {
-      return "The unit name contains unsupported characters";
+    // Check if all characters are [a-zA-Z0-9_.:]+
+    for (int i = 0; i < name.length(); i++) {
+      char c = name.charAt(i);
+      if (!((c >= 'a' && c <= 'z')
+          || (c >= 'A' && c <= 'Z')
+          || (c >= '0' && c <= '9')
+          || c == '_'
+          || c == '.'
+          || c == ':')) {
+        return "The unit name contains unsupported characters";
+      }
     }
     return null;
   }
@@ -246,7 +287,7 @@ public static String sanitizeUnitName(String unitName) {
     return sanitizedName;
   }
 
-  /** Returns a string that matches {@link #UNIT_NAME_PATTERN}. */
+  /** Returns a string with only valid unit name characters [a-zA-Z0-9_.:]. */
   private static String replaceIllegalCharsInUnitName(String name) {
     int length = name.length();
     char[] sanitized = new char[length];