
General UDAF pushdown as scripts#5064

Open
songkant-aws wants to merge 16 commits intoopensearch-project:mainfrom
songkant-aws:udaf-script-pushdown

Conversation

@songkant-aws
Collaborator

@songkant-aws songkant-aws commented Jan 22, 2026

Description

Push down any UDAF as scripts so that partial aggregation results can be evaluated in parallel per shard and then reduced into a final aggregation result. We expect this to speed up complex commands such as patterns, as well as future UDAFs. Benchmark tests are pending.

Related Issues

Resolves #4354

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coderabbitai
Contributor

coderabbitai bot commented Jan 22, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Optional pushdown of pattern aggregation to data nodes for distributed processing and improved performance
    • Option to show numbered tokens in pattern aggregation results
    • New cluster setting to enable/disable pattern aggregation pushdown
  • Documentation

    • Expanded docs explaining pushdown behavior, how to enable it, and memory/circuit-breaker considerations
  • Tests

    • Added integration and explain-plan tests covering pushdown and numbered-token scenarios


Walkthrough

Adds scripted-metric UDAF pushdown for pattern aggregation: introduces PatternAggregationHelpers, scripted-metric UDAF framework (UDAF interface, registry, per-phase factories, ScriptedMetricDataContext), Calcite/UDF integration, OpenSearch scripted-metric wiring, a runtime setting to toggle pushdown, tests, and docs/plan updates.

Changes

Cohort / File(s) Summary
Pattern Aggregation Helpers
common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java, core/src/main/java/org/opensearch/sql/calcite/udf/udaf/LogPatternAggFunction.java
New shared Map-based accumulator helpers for init/add/combine/result; LogPatternAggFunction refactored to delegate state updates and result production to helpers.
Settings & Runtime Flag
common/src/main/java/org/opensearch/sql/common/setting/Settings.java, opensearch/src/main/java/org/opensearch/sql/opensearch/setting/OpenSearchSettings.java
Added CALCITE_UDAF_PUSHDOWN_ENABLED key and a dynamic node-scoped OpenSearch setting to enable/disable UDAF pushdown.
Calcite UDFs & Function Wiring
core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java, core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java, core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java, core/src/main/java/org/opensearch/sql/calcite/utils/UserDefinedFunctionUtils.java
New builtin function names for pattern phases and adapters wrapping PatternAggregationHelpers static methods into Calcite UDF operators; helper to adapt static methods as UDFs.
Pattern Parsing & Projection
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java, core/src/main/java/org/opensearch/sql/expression/function/PatternParserFunctionImpl.java
Added flattenParsedPattern and buildEvalAggSamplesCall to project parsed pattern/tokens/sample_logs; new evalAggSamples path for aggregation-mode parsing including optional numbered tokens.
Scripted-metric UDAF Framework
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/ScriptedMetricUDAF.java, .../ScriptedMetricUDAFRegistry.java, .../ScriptedMetricDataContext.java, .../udaf/PatternScriptedMetricUDAF.java
New public interface for scripted-metric UDAFs (lifecycle methods), registry for UDAFs, DataContext implementations for phases, and PatternScriptedMetricUDAF that builds RexNode-based init/map/combine/reduce scripts.
Script Factories (per-phase)
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricInitScriptFactory.java, .../CalciteScriptedMetricMapScriptFactory.java, .../CalciteScriptedMetricCombineScriptFactory.java, .../CalciteScriptedMetricReduceScriptFactory.java
New factories that wrap compiled RexNode functions into ScriptedMetricAggContexts for init, map, combine, and reduce phases, binding DataContext/params/state/states.
Pushdown Integration & Request Analysis
opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
AggregateAnalyzer gains udafPushdownEnabled guard and delegates INTERNAL_PATTERN pushdown to ScriptedMetricUDAFRegistry; CalciteLogicalIndexScan propagates the pushdown flag to the analyzer helper.
Script Engine & Parameter Handling
opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/CalciteScriptEngine.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java
Added scripted-metric script contexts to CalciteScriptEngine; ScriptParameterHelper gains addSpecialVariable to register special-variable params for scripted-metric bindings.
Scripted-metric Result Parsing
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/ScriptedMetricParser.java
New MetricParser implementation to parse scripted-metric aggregation results into List/Map structure expected by the engine.
OpenSearch value parsing
opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Enhanced array parsing flow with unified parseArray pre-check and improved handling of empty/mixed arrays and single-element arrays.
Tests & Expectations
integ-test/src/test/java/.../CalcitePPLPatternsIT.java, integ-test/src/test/java/.../ExplainIT.java, ppl/src/test/java/.../CalcitePPLPatternsTest.java, test resources under integ-test/src/test/resources/expectedOutput/..., opensearch/src/test/java/.../AggregateAnalyzerTest.java
Added integration tests toggling pushdown, updated explain-plan expected outputs for pushdown/no-pushdown plans, and adjusted unit tests for constructor signature changes.
Documentation
docs/user/ppl/cmd/patterns.md
Documented UDAF pushdown option, how to enable it, and operational cautions (memory/circuit-breaker).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Coord as Coordinator
    participant DN1 as DataNode1
    participant DN2 as DataNode2
    participant DN3 as DataNode3
    participant Reducer as Reducer

    Client->>Coord: Submit pattern aggregation query (udaf pushdown enabled)
    Coord->>DN1: InitScript (state init)
    DN1->>DN1: MapScript (per-doc add -> buffer/partial merge)
    DN1->>DN1: CombineScript (emit shard state)
    Coord->>DN2: InitScript
    DN2->>DN2: MapScript
    DN2->>DN2: CombineScript
    Coord->>DN3: InitScript
    DN3->>DN3: MapScript
    DN3->>DN3: CombineScript
    DN1-->>Reducer: Shard state 1 (map)
    DN2-->>Reducer: Shard state 2
    DN3-->>Reducer: Shard state 3
    Reducer->>Reducer: ReduceScript (merge shard states -> PatternAggregationHelpers.produce)
    Reducer-->>Client: Final pattern results (List<Map<String,Object>>)
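The init/map/combine/reduce phases above correspond to OpenSearch's built-in scripted_metric aggregation. For readers unfamiliar with its shape, a minimal illustrative request follows (hand-written Painless, not the compiled RexNode scripts this PR generates; the field name message.keyword is an assumption):

```json
{
  "size": 0,
  "aggs": {
    "patterns": {
      "scripted_metric": {
        "init_script": "state.logs = []",
        "map_script": "state.logs.add(doc['message.keyword'].value)",
        "combine_script": "return state.logs",
        "reduce_script": "def all = []; for (s in states) { all.addAll(s) } return all"
      }
    }
  }
}
```

Each shard runs init/map/combine; the coordinating node runs reduce over the per-shard `states`, which is exactly the split the diagram shows.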

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

  • #4868: Fixes LogPatternAggFunction buffer and return behavior — closely related to the refactor delegating accumulation/result to PatternAggregationHelpers.
  • #4914: Modifies ScriptParameterHelper and script pushdown infra — overlaps with added special-variable handling and scripted-metric integration.

Suggested labels

PPL, calcite

Suggested reviewers

  • LantaoJin
  • penghuo
  • ps48
  • kavithacm
  • derek-ho
  • joshuali925
  • anirudha
  • GumpacG
  • Swiddis
  • mengweieric
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 37.30%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Title check ✅ Passed: The title 'General UDAF pushdown as scripts' clearly and concisely describes the main objective of this pull request: implementing pushdown of UDAFs as scripted metric aggregations.
  • Description check ✅ Passed: The description is directly related to the changeset, explaining the motivation for UDAF pushdown, linking to the corresponding issue, and documenting the testing and documentation status.
  • Linked Issues check ✅ Passed: The PR implementation meets the core requirements from issue #4354: enabling UDAF pushdown via scripted metric aggregations (init/map/combine/reduce pipeline), with pattern aggregation as a primary use case and supporting infrastructure.
  • Out of Scope Changes check ✅ Passed: All code changes are in scope with the linked issue's objectives: new UDAF scripting infrastructure, pattern aggregation helpers, registry/factory implementations, and integration tests directly support the scripted metric pushdown feature.



Signed-off-by: Songkan Tang <songkant@amazon.com>
yuancu previously approved these changes Jan 23, 2026
Collaborator

@yuancu yuancu left a comment


LGTM

Comment on lines +175 to +181
public static final Setting<?> CALCITE_UDAF_PUSHDOWN_ENABLED_SETTING =
Setting.boolSetting(
Key.CALCITE_UDAF_PUSHDOWN_ENABLED.getKeyValue(),
false,
Setting.Property.NodeScope,
Setting.Property.Dynamic);

Collaborator


Is this a feature flag (cluster setting) that gates UDAF pushdown on/off at runtime? Will this flag be deprecated without notice once pushdown is stable?

Collaborator Author


Yes, it's a feature flag that gates the pushdown feature at runtime.

But I'm still cautious about deprecation. There is a comment left in OpenSearch ScriptedMetricAggregator: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/aggregations/metrics/ScriptedMetricAggregator.java#L65-L73. It bypasses the circuit breaker, so it would be prudent to keep a kill switch.
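For reference, since the setting is declared node-scoped and dynamic, the kill switch can be flipped at runtime without a restart. An illustrative request (assuming the setting key quoted below):

```json
PUT /_cluster/settings
{
  "transient": {
    "plugins.calcite.udaf_pushdown.enabled": true
  }
}
```

Setting it back to false (or null to reset to the default) immediately disables pushdown cluster-wide.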

"UDAF pushdown is disabled. Enable it via cluster setting"
+ " 'plugins.calcite.udaf_pushdown.enabled'");
}
yield ScriptedMetricUDAFRegistry.INSTANCE
Collaborator


Since this PR needs to be backported to 2.x, which uses JDK 11, can we avoid the yield keyword?

Collaborator Author


I heard we will no longer backport, so could we keep it here?

Comment on lines +630 to +634
.orElseThrow(
() ->
new AggregateAnalyzerException(
String.format(
"No scripted metric UDAF registered for %s", functionName)));
Collaborator


when will it happen? is it a bug in our system?

Collaborator Author


For now, it's not possible, because the registry is hardcoded with only one supported UDAF (patterns) in this PR. It's a defensive check that provides a more meaningful error message in case it ever happens.
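A simplified, hypothetical sketch of this defensive-lookup pattern (the class and registry contents are illustrative stand-ins, not the PR's actual ScriptedMetricUDAFRegistry):

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the defensive registry lookup discussed above.
public class UdafRegistrySketch {
    // Illustrative registry: only "patterns" is registered, mirroring this PR.
    private static final Map<String, String> REGISTRY =
        Map.of("patterns", "PatternScriptedMetricUDAF");

    static String resolve(String functionName) {
        // orElseThrow turns a silent null into a descriptive planning-time failure
        return Optional.ofNullable(REGISTRY.get(functionName))
            .orElseThrow(
                () ->
                    new IllegalStateException(
                        String.format(
                            "No scripted metric UDAF registered for %s", functionName)));
    }

    public static void main(String[] args) {
        System.out.println(resolve("patterns"));
    }
}
```

The point of the check is that a future registry entry mismatch fails loudly with the function name, rather than surfacing as a NullPointerException deep in the analyzer.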

Comment on lines +170 to +177
ScriptParameterHelper initParamHelper =
new ScriptParameterHelper(fieldList, fieldTypes, rexBuilder);
ScriptParameterHelper mapParamHelper =
new ScriptParameterHelper(fieldList, fieldTypes, rexBuilder);
ScriptParameterHelper combineParamHelper =
new ScriptParameterHelper(fieldList, fieldTypes, rexBuilder);
ScriptParameterHelper reduceParamHelper =
new ScriptParameterHelper(fieldList, fieldTypes, rexBuilder);
Collaborator


These 4 parameters seem to be exactly the same. Do we really need all of them?

Collaborator


Oh, I get it. There is actually state in each param helper. We would have to clear that state before reusing a single helper, otherwise it would get corrupted.

It looks fine then.

Collaborator Author


Actually, the different phases of the UDAF script use different RexNodes when serializing the script. So yes, they will hold different state (sources, digests) in ScriptParameterHelper.

new ScriptContext(rexBuilder, combineParamHelper, cluster, rowType, fieldTypes);
ScriptContext reduceContext =
new ScriptContext(rexBuilder, reduceParamHelper, cluster, rowType, fieldTypes);

Collaborator


Ditto. Since the param helpers are the same, the ScriptContexts should be the same as well.

Comment on lines +77 to +83
mapArgs.add(getArgOrDefault(args, 1, makeIntLiteral(rexBuilder, DEFAULT_MAX_SAMPLE_COUNT)));
mapArgs.add(getArgOrDefault(args, 2, makeIntLiteral(rexBuilder, DEFAULT_BUFFER_LIMIT)));
mapArgs.add(
getArgOrDefault(args, 5, makeIntLiteral(rexBuilder, DEFAULT_VARIABLE_COUNT_THRESHOLD)));
mapArgs.add(
getArgOrDefault(args, 4, makeDoubleLiteral(rexBuilder, DEFAULT_THRESHOLD_PERCENTAGE)));

Collaborator


Shall these change to dynamic params? Otherwise, the script won't be reusable if these values differ.

Collaborator Author


In this part, we only define the RexNode. The assignment of dynamic params for literals is performed at serialization time by RexStandardizer.standardizeRexNodeExpression.

Comment on lines +102 to +106
reduceArgs.add(getArgOrDefault(args, 1, makeIntLiteral(rexBuilder, DEFAULT_MAX_SAMPLE_COUNT)));

// Determine variableCountThreshold and thresholdPercentage
RexNode variableCountThreshold = makeIntLiteral(rexBuilder, DEFAULT_VARIABLE_COUNT_THRESHOLD);
RexNode thresholdPercentage = makeDoubleLiteral(rexBuilder, DEFAULT_THRESHOLD_PERCENTAGE);
Collaborator


Ditto

Collaborator Author


Same as above response.

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@songkant-aws
Collaborator Author

songkant-aws commented Mar 3, 2026

Updated the PR with a simple performance comparison of queries with pushdown disabled versus pushdown enabled.

The test query was executed on a three-shard index containing approximately 2.7 million rows of log messages, with profiling enabled:

curl -s -X POST http://localhost:9200/_plugins/_ppl -H 'Content-Type: application/json' -d '{"query":"source=logs-181998 | patterns request method=brain mode=aggregation","profile":true}'

Baseline (pushdown disabled)

{
  "profile": {
    "summary": {
      "total_time_ms": 43977.34
    },
    "phases": {
      "analyze": {
        "time_ms": 4.17
      },
      "optimize": {
        "time_ms": 25.94
      },
      "execute": {
        "time_ms": 43947.1
      },
      "format": {
        "time_ms": 0.09
      }
    },
    "plan": {
      "node": "EnumerableLimit",
      "time_ms": 2.08,
      "rows": 29,
      "children": [
        {
          "node": "EnumerableCalc",
          "time_ms": 1.11,
          "rows": 29,
          "children": [
            {
              "node": "EnumerableCorrelate",
              "time_ms": 0.85,
              "rows": 29,
              "children": [
                {
                  "node": "EnumerableAggregate",
                  "time_ms": 0.0,
                  "rows": 1,
                  "children": [
                    {
                      "node": "EnumerableCalc",
                      "time_ms": 4304.11,
                      "rows": 2708746,
                      "children": [
                        {
                          "node": "CalciteEnumerableIndexScan",
                          "time_ms": 4150.36,
                          "rows": 2708746
                        }
                      ]
                    }
                  ]
                },
                {
                  "node": "EnumerableUncollect",
                  "time_ms": 0.22,
                  "rows": 29,
                  "children": [
                    {
                      "node": "EnumerableCalc",
                      "time_ms": 0.21,
                      "rows": 1,
                      "children": [
                        {
                          "node": "EnumerableValues",
                          "time_ms": 0.01,
                          "rows": 1
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  },
  "schema": [
    {
      "name": "patterns_field",
      "type": "string"
    },
    {
      "name": "pattern_count",
      "type": "bigint"
    },
    {
      "name": "sample_logs",
      "type": "array"
    }
  ],
  "datarows": [
    [
      "GET <*> HTTP/<*>",
      2699196,
      [
        "GET /english/images/nav_tickets_off.gif HTTP/1.1",
        "GET /english/images/comp_bg2_hm.jpg HTTP/1.0",
        "GET /english/nav_inet.html HTTP/1.1",
        "GET /robots.txt HTTP/1.0",
        "GET /french/individuals/player2913.htm HTTP/1.0",
        "GET /english/splash_inet.html HTTP/1.1",
        "GET /english/images/comp_bu_stage1n.gif HTTP/1.0",
        "GET /images/arw_lk.gif HTTP/1.0",
        "GET /english/images/space.gif HTTP/1.0",
        "GET /images/s102381.gif HTTP/1.0"
      ]
    ],
    [
      "POST <*> HTTP/<*>",
      7764,
      [
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0"
      ]
    ],
    [
      "HEAD <*> HTTP/<*>",
      1537,
      [
        "HEAD /french/tickets/ang.zip HTTP/1.0",
        "HEAD /english/venues/cities/images/paris/paris_i.gif HTTP/1.0",
        "HEAD /images/home_bg_stars.gif HTTP/1.0",
        "HEAD /images/cal_mont.gif HTTP/1.0",
        "HEAD /images/s102336.gif HTTP/1.0",
        "HEAD /english/index.html HTTP/1.0",
        "HEAD /images/s140875.gif HTTP/1.0",
        "HEAD /english/news/3004bres.htm HTTP/1.0",
        "HEAD /english/history/history_of/images/cup/515txt.gif HTTP/1.0",
        "HEAD /english/venues/images/Venue_map_top_off.gif HTTP/1.0"
      ]
    ],
    [
      "GET <*> HTTP/X.X",
      182,
      [
        "GET /english/playing/download/media/screen/win_95/dance.zip HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/intro.zip HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/dance.zip HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/intro.zip HTTP/X.X",
        "GET /cnfl/ HTTP/X.X",
        "GET /english/news/><HR><H3>Transfer/ HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/dance.zip HTTP/X.X",
        "GET /english/news/><HR><H3>Transfer/ HTTP/X.X",
        "GET /english/member/body.html HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/intro.zip HTTP/X.X"
      ]
    ]
   ...

UDAF Pushdown enabled

{
  "profile": {
    "summary": {
      "total_time_ms": 16231.11
    },
    "phases": {
      "analyze": {
        "time_ms": 3.29
      },
      "optimize": {
        "time_ms": 18.57
      },
      "execute": {
        "time_ms": 16209.09
      },
      "format": {
        "time_ms": 0.1
      }
    },
    "plan": {
      "node": "EnumerableLimit",
      "time_ms": 16207.37,
      "rows": 34,
      "children": [
        {
          "node": "EnumerableCalc",
          "time_ms": 16206.31,
          "rows": 34,
          "children": [
            {
              "node": "EnumerableCorrelate",
              "time_ms": 16206.04,
              "rows": 34,
              "children": [
                {
                  "node": "CalciteEnumerableIndexScan",
                  "time_ms": 16204.15,
                  "rows": 1
                },
                {
                  "node": "EnumerableUncollect",
                  "time_ms": 0.29,
                  "rows": 34,
                  "children": [
                    {
                      "node": "EnumerableCalc",
                      "time_ms": 0.27,
                      "rows": 1,
                      "children": [
                        {
                          "node": "EnumerableValues",
                          "time_ms": 0.01,
                          "rows": 1
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  },
  "schema": [
    {
      "name": "patterns_field",
      "type": "string"
    },
    {
      "name": "pattern_count",
      "type": "bigint"
    },
    {
      "name": "sample_logs",
      "type": "array"
    }
  ],
  "datarows": [
    [
      "GET <*> HTTP/<*>",
      2699196,
      [
        "GET /english/images/nav_tickets_off.gif HTTP/1.1",
        "GET /robots.txt HTTP/1.0",
        "GET /english/images/comp_bu_stage1n.gif HTTP/1.0",
        "GET /images/s102381.gif HTTP/1.0",
        "GET /images/s102381.gif HTTP/1.0",
        "GET /images/s102324.gif HTTP/1.1",
        "GET /images/hm_brdl.gif HTTP/1.1",
        "GET /english/tickets/images/ticket_hm_header.gif HTTP/1.0",
        "GET /images/space.gif HTTP/1.0",
        "GET /french/news/l3304.htm HTTP/1.0"
      ]
    ],
    [
      "POST <*> HTTP/<*>",
      7764,
      [
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0",
        "POST /cgi-bin/trivia/Trivia.pl HTTP/1.0"
      ]
    ],
    [
      "HEAD <*> HTTP/<*>",
      1537,
      [
        "HEAD /images/s140875.gif HTTP/1.0",
        "HEAD /english/venues/images/Venue_map_top_off.gif HTTP/1.0",
        "HEAD /english/venues/cities/images/lens/21discov.gif HTTP/1.0",
        "HEAD /english/history/images/history_hm_nav.gif HTTP/1.0",
        "HEAD /french/index.html HTTP/1.0",
        "HEAD /english/playing/images/france98b.GIF HTTP/1.0",
        "HEAD /french/images/fpnewstop.gif HTTP/1.0",
        "HEAD / HTTP/1.0",
        "HEAD /english/images/nav_sitemap_off.gif HTTP/1.0",
        "HEAD /french/news/2304tix.htm HTTP/1.0"
      ]
    ],
    [
      "GET <*> HTTP/X.X",
      176,
      [
        "GET /english/playing/download/media/screen/win_95/dance.zip HTTP/X.X",
        "GET /cnfl/ HTTP/X.X",
        "GET /english/playing/download/media/screen/win_95/dance.zip HTTP/X.X",
        "GET /english/news/><HR><H3>Transfer/ HTTP/X.X",
        "GET / HTTP/X.X",
        "GET /images/calendarvf.swf HTTP/X.X",
        "GET /english/index.html/screen/ HTTP/X.X",
        "GET /images/hobutt.gif HTTP/X.X",
        "GET /images/hobutt.gif HTTP/X.X",
        "GET /images/hobutt.gif HTTP/X.X"
      ]
    ],
    ...

Results show that the query with pushdown enabled is 2.7 times faster than without pushdown when running across three shards. The performance improvement scales with the number of shards available for parallel processing.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

PR Reviewer Guide 🔍

(Review updated until commit 16da4de)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections

🔀 Multiple PR themes

Sub-PR theme: Refactor pattern aggregation logic into shared helpers and update Calcite plan

Relevant files:

  • common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java
  • core/src/main/java/org/opensearch/sql/calcite/udf/udaf/LogPatternAggFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/PatternParserFunctionImpl.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLPatternsTest.java

Sub-PR theme: Implement scripted metric UDAF pushdown infrastructure and pattern UDAF

Relevant files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/ScriptedMetricUDAF.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/ScriptedMetricDataContext.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/CalciteScriptEngine.java
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/UserDefinedFunctionUtils.java

Sub-PR theme: Fix array parsing in ExprValueFactory and add UDAF pushdown integration tests

Relevant files:

  • opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLPatternsIT.java

⚡ Recommended focus areas for review

Thread Safety

The addLogToPattern method modifies the accumulator map in-place (e.g., logMessages.add(logMessage), logMessages.clear(), acc.put(...)) without any synchronization. If multiple threads share the same accumulator (possible in certain Calcite execution models), this could lead to race conditions and data corruption.

public static Map<String, Object> addLogToPattern(
    Map<String, Object> acc,
    String logMessage,
    int maxSampleCount,
    int bufferLimit,
    int variableCountThreshold,
    double thresholdPercentage) {

  if (logMessage == null) {
    return acc;
  }

  List<String> logMessages = (List<String>) acc.get("logMessages");
  logMessages.add(logMessage);

  // Trigger partial merge when buffer reaches limit
  if (bufferLimit > 0 && logMessages.size() >= bufferLimit) {
    Map<String, Map<String, Object>> patternGroupMap =
        (Map<String, Map<String, Object>>) acc.get("patternGroupMap");

    BrainLogParser parser =
        new BrainLogParser(variableCountThreshold, (float) thresholdPercentage);
    Map<String, Map<String, Object>> partialPatterns =
        parser.parseAllLogPatterns(logMessages, maxSampleCount);

    patternGroupMap =
        PatternUtils.mergePatternGroups(patternGroupMap, partialPatterns, maxSampleCount);

    acc.put("patternGroupMap", patternGroupMap);
    logMessages.clear();
  }

  return acc;
}
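If shared-accumulator access ever becomes possible in practice, one straightforward mitigation is to serialize mutation on the accumulator itself. A hypothetical sketch, not the PR's code (single-threaded callers would pay only an uncontended lock):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: guarding in-place accumulator mutation with a lock
// on the accumulator map, in case it were shared across threads.
public class SafeAccumulatorSketch {
    @SuppressWarnings("unchecked")
    public static Map<String, Object> addLog(Map<String, Object> acc, String logMessage) {
        if (logMessage == null) {
            return acc;
        }
        // All reads and writes of the accumulator happen under the same monitor,
        // so concurrent callers cannot interleave add/clear operations.
        synchronized (acc) {
            List<String> logMessages = (List<String>) acc.get("logMessages");
            logMessages.add(logMessage);
        }
        return acc;
    }

    public static void main(String[] args) {
        Map<String, Object> acc = new HashMap<>();
        acc.put("logMessages", new ArrayList<String>());
        addLog(acc, "GET / HTTP/1.0");
        System.out.println(((List<?>) acc.get("logMessages")).size());
    }
}
```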
Stale Doc Values

The dataContext is created once in the constructor using getDoc() and lookup.getLeafSearchLookup(leafContext).source(). The comment claims OpenSearch updates doc values internally before each execute() call, but the MapContext stores a reference to the doc map at construction time. If OpenSearch replaces the doc map object (rather than updating it in-place) between documents, the cached reference would be stale. This should be verified against OpenSearch's scripted metric internals.

  public CalciteScriptedMetricMapScript(
      Function1<DataContext, Object[]> function,
      Map<String, Object> params,
      Map<String, Object> state,
      SearchLookup lookup,
      LeafReaderContext leafContext) {
    super(params, state, lookup, leafContext);
    this.function = function;
    // Create DataContext once and reuse for all documents in this segment.
    // OpenSearch updates doc values and source lookup internally before each execute().
    this.dataContext =
        new ScriptedMetricDataContext.MapContext(
            params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
  }

  @Override
  @SuppressWarnings("unchecked")
  public void execute() {
    // Execute the compiled RexNode expression (reusing the same DataContext)
    Object[] result = function.apply(dataContext);
    ScriptedMetricDataContext.mergeResultIntoState(result, (Map<String, Object>) getState());
  }
}
Behavior Change

The new early array-check block (lines 202-209) runs before the existing field-type-undefined handling. This changes the parsing order for fields that are arrays but have no type mapping. Previously such fields would fall through to the "parse based on value" logic; now they are always parsed as arrays. This could break existing behavior for fields that happen to be arrays but were previously handled differently (e.g., GeoPoint exclusion may not cover all edge cases like nested objects stored as arrays).

if (content.isArray()
    && (fieldType.isEmpty() || supportArrays)
    && !fieldType
        .map(t -> t.equals(OpenSearchDataType.of(OpenSearchDataType.MappingType.GeoPoint)))
        .orElse(false)) {
  ExprType type = fieldType.orElse(ARRAY);
  return parseArray(content, field, type, supportArrays);
}
Fragile Arg Ordering

The resolveBrainParams method relies on alphabetical ordering of optional brain parameters (frequency_threshold_percentage before variable_count_threshold) to distinguish them when only one is present. This implicit contract is fragile — if the caller's ordering changes or a new parameter is added, the type-based disambiguation could silently produce wrong results. Consider using named parameters or a more explicit mapping.

private static RexNode[] resolveBrainParams(List<RexNode> args, RexBuilder rexBuilder) {
  RexNode variableCountThreshold = makeIntLiteral(rexBuilder, DEFAULT_VARIABLE_COUNT_THRESHOLD);
  RexNode thresholdPercentage = makeDoubleLiteral(rexBuilder, DEFAULT_THRESHOLD_PERCENTAGE);

  if (args.size() > 5) {
    // Both present: alphabetical order means args[4]=frequency_threshold_percentage,
    // args[5]=variable_count_threshold
    thresholdPercentage = args.get(4);
    variableCountThreshold = args.get(5);
  } else if (args.size() > 4) {
    // Only one present: determine by type
    RexNode arg4 = args.get(4);
    SqlTypeName arg4Type = arg4.getType().getSqlTypeName();
    if (arg4Type == SqlTypeName.DOUBLE
        || arg4Type == SqlTypeName.DECIMAL
        || arg4Type == SqlTypeName.FLOAT) {
      thresholdPercentage = arg4;
    } else {
      variableCountThreshold = arg4;
    }
  }

  return new RexNode[] {variableCountThreshold, thresholdPercentage};
}
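One way to remove the positional and type-based guessing is to carry the optional parameters by name rather than by slot, so adding a new parameter cannot silently shift meanings. A hypothetical refactor, not code from this PR (the default constants below are placeholders):

```java
import java.util.Map;

public class Main {
    static final int DEFAULT_VARIABLE_COUNT_THRESHOLD = 5;      // placeholder default
    static final double DEFAULT_THRESHOLD_PERCENTAGE = 0.3;     // placeholder default

    // Each optional argument arrives under an explicit key instead of a position
    static Object[] resolve(Map<String, Object> named) {
        int variableCountThreshold = (Integer) named.getOrDefault(
            "variable_count_threshold", DEFAULT_VARIABLE_COUNT_THRESHOLD);
        double thresholdPercentage = (Double) named.getOrDefault(
            "frequency_threshold_percentage", DEFAULT_THRESHOLD_PERCENTAGE);
        return new Object[] {variableCountThreshold, thresholdPercentage};
    }

    public static void main(String[] args) {
        // Only one optional present: no type-based disambiguation needed
        Object[] r = resolve(Map.of("variable_count_threshold", (Object) 10));
        System.out.println("variableCountThreshold=" + r[0]
            + ", thresholdPercentage=" + r[1]);
    }
}
```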
TODO Left In

Multiple `TODO: Add proper operand type checking` comments are left in production code for PATTERN_ADD_UDF, PATTERN_FLUSH_UDF, and PATTERN_RESULT_UDF. Without operand type checking, invalid arguments are not caught at planning time, leading to runtime errors that are harder to diagnose.

public static final SqlOperator PATTERN_ADD_UDF =
    UserDefinedFunctionUtils.adaptStaticMethodToUDF(
            PatternAggregationHelpers.class,
            "addLogToPattern",
            ReturnTypes.explicit(SqlTypeName.ANY), // Returns Map<String, Object>
            NullPolicy.ANY,
            null) // TODO: Add proper operand type checking
        .toUDF("PATTERN_ADD_UDF");

public static final SqlOperator PATTERN_FLUSH_UDF =
    UserDefinedFunctionUtils.adaptStaticMethodToUDF(
            PatternAggregationHelpers.class,
            "flushPatternAccumulator",
            ReturnTypes.explicit(SqlTypeName.ANY), // Returns Map<String, Object>
            NullPolicy.ANY,
            null) // TODO: Add proper operand type checking
        .toUDF("PATTERN_FLUSH_UDF");

public static final SqlOperator PATTERN_RESULT_UDF =
    UserDefinedFunctionUtils.adaptStaticMethodToUDF(
            PatternAggregationHelpers.class,
            "producePatternResultFromStates",
            opBinding -> {
              // Returns List<Map<String, Object>> - represented as ARRAY<ANY>
              RelDataTypeFactory typeFactory = opBinding.getTypeFactory();
              RelDataType anyType = typeFactory.createSqlType(SqlTypeName.ANY);
              return SqlTypeUtil.createArrayType(typeFactory, anyType, true);
            },
            NullPolicy.ANY,
            null) // TODO: Add proper operand type checking
        .toUDF("PATTERN_RESULT_UDF");
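The cost of the missing checks is the usual early-versus-late validation trade-off: with no operand type checker, a call with wrong argument types is only rejected deep inside the helper at runtime. A generic plain-Java sketch of planning-time rejection (a hypothetical checker for illustration, not Calcite's actual SqlOperandTypeChecker API):

```java
import java.util.List;

public class Main {
    // Planning-time check: validate argument classes before any execution happens
    static void checkOperands(List<Class<?>> expected, List<Object> args) {
        if (args.size() != expected.size()) {
            throw new IllegalArgumentException(
                "expected " + expected.size() + " operands, got " + args.size());
        }
        for (int i = 0; i < args.size(); i++) {
            if (!expected.get(i).isInstance(args.get(i))) {
                throw new IllegalArgumentException(
                    "operand " + i + " must be " + expected.get(i).getSimpleName());
            }
        }
    }

    public static void main(String[] args) {
        try {
            // Second operand should be an Integer; rejected before execution
            checkOperands(List.of(String.class, Integer.class),
                          List.of("log line", "oops"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at planning time: " + e.getMessage());
        }
    }
}
```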

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit a4a6e4e

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

PR Code Suggestions ✨

Latest suggestions up to 16da4de

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Prevent unintended array parsing when arrays are unsupported

The new early array-detection block uses fieldType.isEmpty() || supportArrays as the
condition, but the original code only entered parseArray when supportArrays was true
or the type was ARRAY. This change may cause arrays to be parsed when supportArrays
is false and a field type is defined, potentially breaking existing behavior for
non-aggregation paths where arrays should be treated as single values.

opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java [202-209]

 if (content.isArray()
-    && (fieldType.isEmpty() || supportArrays)
+    && supportArrays
     && !fieldType
         .map(t -> t.equals(OpenSearchDataType.of(OpenSearchDataType.MappingType.GeoPoint)))
         .orElse(false)) {
   ExprType type = fieldType.orElse(ARRAY);
   return parseArray(content, field, type, supportArrays);
 }
Suggestion importance[1-10]: 7

__

Why: The new early array-detection block uses fieldType.isEmpty() || supportArrays which could cause arrays to be parsed when supportArrays is false and a field type is defined, potentially breaking existing behavior. The suggestion to use only supportArrays as the condition is more conservative and safer, though it may not handle the case of undefined field types with arrays in aggregation results that this PR is trying to fix.

Medium
Verify doc values reference is live for reuse

The dataContext is created once in the constructor with getDoc() at construction
time, but OpenSearch updates doc values internally before each execute() call. The
getDoc() reference may be a live map that OpenSearch updates in-place, but if it
returns a snapshot, the dataContext will use stale doc values for all documents
after the first. Verify that getDoc() returns a live reference; if not, the
DataContext must be recreated per document.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [75-78]

+// Note: getDoc() must return a live reference that OpenSearch updates in-place.
+// If getDoc() returns a snapshot, dataContext must be recreated per execute() call.
 this.dataContext =
     new ScriptedMetricDataContext.MapContext(
         params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
Suggestion importance[1-10]: 6

__

Why: This is a valid concern about whether getDoc() returns a live reference. The improved code only adds a comment without changing behavior, so it's more of a documentation/verification suggestion. The risk is real if getDoc() returns a snapshot, but the improved code doesn't actually fix the issue.

Low
Avoid mutating original state during reduction

The first state (states.get(0)) is used directly as the initial combined value and
then passed to combinePatternAccumulators, which may mutate it. This could corrupt
the original state object. Create a defensive copy or start combining from index 0
explicitly to avoid unintended side effects.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [433-437]

-Map<String, Object> combined = (Map<String, Object>) states.get(0);
+Map<String, Object> combined = new HashMap<>((Map<String, Object>) states.get(0));
 for (int i = 1; i < states.size(); i++) {
   Map<String, Object> state = (Map<String, Object>) states.get(i);
   combined = combinePatternAccumulators(combined, state, maxSampleCount);
 }
Suggestion importance[1-10]: 5

__

Why: The first state is used directly as combined and passed to combinePatternAccumulators, which creates a new HashMap result rather than mutating in-place. Looking at combinePatternAccumulators, it creates a new result map, so the original state is not mutated. The suggestion is valid as a defensive practice but the actual risk is low given the current implementation.

Low
General
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" without specifying which ones
are bypassed. This could leave users without enough information to make an informed
decision. Consider clarifying which specific circuit breakers are bypassed (e.g.,
the parent/field data circuit breakers) so users understand the exact risk.

docs/user/ppl/cmd/patterns.md [102]

-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. Scripted metric aggregations bypass the parent and field data memory circuit breakers, which may cause out-of-memory errors on data nodes when processing very large datasets. Use with caution and monitor cluster memory usage closely.
Suggestion importance[1-10]: 4

__

Why: The suggestion improves documentation clarity by specifying which circuit breakers are bypassed, but it's a minor documentation enhancement that doesn't affect functionality. The current text is still informative enough for users to understand the risk.

Low
Improve type disambiguation for optional parameters

The type-based disambiguation for a single optional brain parameter at args[4] is
fragile. If the user passes variable_count_threshold as a value that happens to be
typed as DECIMAL (e.g., due to literal coercion), it will be incorrectly assigned to
thresholdPercentage. Consider using a more robust mechanism such as named parameters
or positional convention documented clearly, and add a validation check.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java [145-156]

 } else if (args.size() > 4) {
   // Only one present: determine by type
   RexNode arg4 = args.get(4);
   SqlTypeName arg4Type = arg4.getType().getSqlTypeName();
   if (arg4Type == SqlTypeName.DOUBLE
-      || arg4Type == SqlTypeName.DECIMAL
       || arg4Type == SqlTypeName.FLOAT) {
+    // Floating-point types unambiguously indicate frequency_threshold_percentage
+    thresholdPercentage = arg4;
+  } else if (arg4Type == SqlTypeName.INTEGER || arg4Type == SqlTypeName.BIGINT) {
+    // Integer types unambiguously indicate variable_count_threshold
+    variableCountThreshold = arg4;
+  } else if (arg4Type == SqlTypeName.DECIMAL) {
+    // DECIMAL could be either; treat as thresholdPercentage (fractional values expected)
     thresholdPercentage = arg4;
   } else {
     variableCountThreshold = arg4;
   }
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion adds explicit handling for INTEGER and BIGINT types but the behavior for DECIMAL remains the same as the original. The improvement is marginal since the original code already handles DOUBLE, DECIMAL, and FLOAT as thresholdPercentage, and the new code doesn't fundamentally change the disambiguation logic.

Low
Fix incorrect code block language hint

The code block uses bash ignore as the language hint, but this is a REST API call,
not a bash command. Other examples in the same file use json or similar. Using bash
ignore may prevent syntax highlighting and confuse readers about how to execute the
command.

docs/user/ppl/cmd/patterns.md [93-100]

-```bash ignore
+```json
 PUT _cluster/settings
 {
   "persistent": {
     "plugins.calcite.udaf_pushdown.enabled": true
   }
 }
Suggestion importance[1-10]: 3

__

Why: The `bash ignore` language hint is indeed inconsistent with the REST API call format, but looking at the existing file, the earlier example at line 83 also uses `bash ignore` for the same type of `PUT _cluster/settings` call, suggesting this is an intentional convention in this file. The improvement is minor and cosmetic.

Low

Previous suggestions

Suggestions up to commit bf583fa

Category | Suggestion | Impact
Possible issue
Avoid mutating original state during reduction

The first state (states.get(0)) is used directly as the initial combined value and then passed to combinePatternAccumulators, which may mutate it. This can corrupt the original state object. Create a defensive copy or start combining from index 0 explicitly to avoid unintended side effects.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [433-437]

-Map<String, Object> combined = (Map<String, Object>) states.get(0);
+Map<String, Object> combined = new HashMap<>((Map<String, Object>) states.get(0));
 for (int i = 1; i < states.size(); i++) {
   Map<String, Object> state = (Map<String, Object>) states.get(i);
   combined = combinePatternAccumulators(combined, state, maxSampleCount);
 }
Suggestion importance[1-10]: 5

__

Why: The first state is used directly as combined and then passed to combinePatternAccumulators, which creates a new HashMap result, so the original state is not mutated. The suggestion is valid as a defensive practice but the actual risk is low given the current implementation of combinePatternAccumulators.

Low
General
Ensure source lookup instance is correctly reused per document

The DataContext is created once in the constructor using getDoc(), which returns the
doc map reference at construction time. However, OpenSearch updates doc values per
document by updating the contents of the map object, so the reference should remain
valid. But lookup.getLeafSearchLookup(leafContext).source() creates a new
SourceLookup per call — verify that the source lookup returned here is the same
instance that gets updated per document, otherwise source-based field access will
always return the first document's data.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [75-78]

+var leafSearchLookup = lookup.getLeafSearchLookup(leafContext);
 this.dataContext =
     new ScriptedMetricDataContext.MapContext(
-        params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
+        params, state, getDoc(), leafSearchLookup.source());
Suggestion importance[1-10]: 4

__

Why: The improved_code is functionally equivalent to the existing code — it just stores the leafSearchLookup in a local variable before calling .source(). The actual concern about whether the SourceLookup instance is updated per document is valid but the suggested fix doesn't actually address it.

Low
Reuse DataContext instance to avoid redundant allocations

The combine script creates a new DataContext on every execute() call, which includes
re-parsing SOURCES and DIGESTS from params each time. Since params don't change
between calls, the DataContext should be created once (similar to MapScript) and
reused to avoid redundant object allocation and parsing overhead.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricCombineScriptFactory.java [45-56]

+private final DataContext dataContext;
+
+public CalciteScriptedMetricCombineScript(
+    Function1<DataContext, Object[]> function,
+    Map<String, Object> params,
+    Map<String, Object> state) {
+  super(params, state);
+  this.function = function;
+  this.dataContext = new ScriptedMetricDataContext.CombineContext(params, state);
+}
+
 @Override
 public Object execute() {
-  // Create data context for combine script
-  @SuppressWarnings("unchecked")
-  Map<String, Object> state = (Map<String, Object>) getState();
-  DataContext dataContext = new ScriptedMetricDataContext.CombineContext(getParams(), state);
-
-  // Execute the compiled RexNode expression
   Object[] result = function.apply(dataContext);
-
-  // Return the combined result
   return (result != null && result.length > 0) ? result[0] : getState();
 }
Suggestion importance[1-10]: 4

__

Why: The combine script's execute() is typically called once per shard, so the performance benefit of caching the DataContext is minimal. The suggestion is valid for consistency with MapScript but has low practical impact.

Low
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" without specifying which ones
are bypassed. This could be misleading to users who need to understand the exact
risk. Consider clarifying which specific circuit breakers are bypassed (e.g., the
request circuit breaker or the field data circuit breaker) to help users make an
informed decision.

docs/user/ppl/cmd/patterns.md [102]

-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. Scripted metric aggregations bypass the request and field data memory circuit breakers, which may cause out-of-memory errors on data nodes when processing very large datasets. Use with caution and monitor cluster memory usage closely.
Suggestion importance[1-10]: 4

__

Why: The suggestion improves documentation clarity by specifying which circuit breakers are bypassed, but it's a minor documentation enhancement. The current text is not incorrect, just less specific.

Low
Validate list element types to prevent downstream ClassCastException

When result is an empty List, the code correctly wraps it. However, if the reduce
script returns a non-List, non-null value (e.g., a Map or scalar), the exception
message says "Expected List<Map<String, Object>>" but the actual contract should
also handle the case where result is a Map (e.g., a single result). More
importantly, the unchecked cast to List<Map<String, Object>> is implicit and could
cause a ClassCastException downstream — consider validating the list element types.

opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/ScriptedMetricParser.java [43-49]

-if (result instanceof List) {
+if (result instanceof List<?> list) {
+  // Validate that list contains maps (if non-empty)
+  if (!list.isEmpty() && !(list.get(0) instanceof Map)) {
+    throw new IllegalArgumentException(
+        String.format(
+            "Expected List<Map<String, Object>> from scripted metric but list contains %s",
+            list.get(0).getClass().getSimpleName()));
+  }
   return List.of(Map.of(name, result));
 }
 throw new IllegalArgumentException(
     String.format(
         "Expected List<Map<String, Object>> from scripted metric but got %s",
         result.getClass().getSimpleName()));
Suggestion importance[1-10]: 3

__

Why: The validation adds a runtime check for list element types, which is a defensive measure. However, since the reduce script is controlled code that always returns List<Map<String, Object>>, the risk of a ClassCastException is low in practice.

Low
Fix incorrect code block language hint

The code block uses bash ignore as the language hint, but the content is a REST API
call (JSON body with HTTP method), not a bash script. This is inconsistent with the
earlier cluster settings example in the same file which uses json. Using json would
provide proper syntax highlighting for the JSON body.

docs/user/ppl/cmd/patterns.md [93-100]

-```bash ignore
+```json
 PUT _cluster/settings
 {
   "persistent": {
     "plugins.calcite.udaf_pushdown.enabled": true
   }
 }
Suggestion importance[1-10]: 3

__

Why: The `bash ignore` language hint is inconsistent with the REST API content, but this is a minor style issue. The earlier example in the same file uses `json`, so consistency would be improved by changing to `json`.

Low
Suggestions up to commit d815a87

Category | Suggestion | Impact
Possible issue
Restrict early array detection to supportArrays context

The new early array-detection block uses (fieldType.isEmpty() || supportArrays) as its condition, but the original code path only entered parseArray when supportArrays was true or the type was an OpenSearch ARRAY type. This change may cause array parsing to be triggered for fields with a known non-array type (e.g., a keyword field that happens to contain a JSON array due to multi-value), potentially breaking existing behavior for those fields.

opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java [202-209]

 if (content.isArray()
-    && (fieldType.isEmpty() || supportArrays)
+    && supportArrays
     && !fieldType
         .map(t -> t.equals(OpenSearchDataType.of(OpenSearchDataType.MappingType.GeoPoint)))
         .orElse(false)) {
   ExprType type = fieldType.orElse(ARRAY);
   return parseArray(content, field, type, supportArrays);
 }
Suggestion importance[1-10]: 7

__

Why: The new early array-detection block with (fieldType.isEmpty() || supportArrays) could trigger array parsing for fields with known non-array types, potentially breaking existing behavior. The suggestion to restrict to supportArrays only is more conservative and safer, though it may not handle the fieldType.isEmpty() case that was intentionally added for unmapped fields in aggregation results.

Medium
Add null checks for accumulator arguments

If either acc1 or acc2 is null, the method will throw a NullPointerException. Add
null checks for both accumulators at the start of the method to handle cases where a
shard returns no data.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [258-264]

 public static Map<String, Object> combinePatternAccumulators(
     Map<String, Object> acc1, Map<String, Object> acc2, int maxSampleCount) {
+  if (acc1 == null) return acc2 != null ? acc2 : new HashMap<>();
+  if (acc2 == null) return acc1;
 
   Map<String, Map<String, Object>> patterns1 =
       (Map<String, Map<String, Object>>) acc1.get("patternGroupMap");
   Map<String, Map<String, Object>> patterns2 =
       (Map<String, Map<String, Object>>) acc2.get("patternGroupMap");
Suggestion importance[1-10]: 6

__

Why: The combinePatternAccumulators method could throw a NullPointerException if either acc1 or acc2 is null, which is a valid scenario when a shard returns no data. Adding null guards is a reasonable defensive improvement.

Low
Verify cached DataContext doc reference safety

The dataContext is created once in the constructor with getDoc(), but getDoc()
returns the doc map reference at construction time. If OpenSearch replaces the doc
map reference (rather than updating it in-place) between documents, the cached
dataContext will hold a stale reference. Verify that OpenSearch always updates the
same map instance in-place; if not, the DataContext must be recreated per document.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [75-78]

+// If OpenSearch updates doc values in-place (same map reference), this is safe.
+// Otherwise, override execute() to create a new MapContext per document.
 this.dataContext =
     new ScriptedMetricDataContext.MapContext(
         params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about whether getDoc() returns the same map instance updated in-place between documents. However, the improved code only adds a comment without actually fixing the potential issue, making it a verification suggestion rather than a concrete fix.

Low
Fix fragile type-based argument disambiguation

The type-based heuristic to distinguish thresholdPercentage from
variableCountThreshold when args.size() == 5 is fragile. If a user passes an
integer-typed threshold percentage or a decimal-typed variable count threshold, the
arguments will be silently swapped, producing incorrect results. The argument
positions should be fixed and documented rather than inferred from types.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java [126-139]

+// args[4] = thresholdPercentage, args[5] = variableCountThreshold (fixed positions)
 if (args.size() > 5) {
   thresholdPercentage = args.get(4);
   variableCountThreshold = args.get(5);
 } else if (args.size() > 4) {
-  RexNode arg4 = args.get(4);
-  SqlTypeName arg4Type = arg4.getType().getSqlTypeName();
-  if (arg4Type == SqlTypeName.DOUBLE
-      || arg4Type == SqlTypeName.DECIMAL
-      || arg4Type == SqlTypeName.FLOAT) {
-    thresholdPercentage = arg4;
-  } else {
-    variableCountThreshold = arg4;
-  }
+  thresholdPercentage = args.get(4);
 }
Suggestion importance[1-10]: 5

__

Why: The type-based heuristic to distinguish thresholdPercentage from variableCountThreshold is indeed fragile and could silently produce incorrect results. However, the improved code changes the semantics when args.size() == 5 (only sets thresholdPercentage, not variableCountThreshold), which may not be the intended fix and could break the existing logic.

Low
General
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" but doesn't specify which
ones are bypassed. Users need to understand the specific risks to make an informed
decision. Consider clarifying which circuit breakers are bypassed (e.g., the
parent/request circuit breakers) and whether there are any recommended safeguards or
limits to set before enabling this feature.

docs/user/ppl/cmd/patterns.md [102]

-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. Scripted metric aggregations bypass the request and parent memory circuit breakers, which may cause out-of-memory errors on data nodes when processing very large datasets. Consider setting appropriate JVM heap sizes and monitoring node memory usage before enabling this feature in production. Use with caution.
Suggestion importance[1-10]: 4

__

Why: The suggestion improves documentation clarity by specifying which circuit breakers are bypassed, but this is a documentation enhancement rather than a critical fix. The existing warning is already informative enough for most users.

Low
Fix inconsistent code block language tag

The code block uses bash ignore as the language hint, which is inconsistent with the
existing cluster settings example earlier in the document that uses just bash or
json. Using ignore may prevent syntax highlighting in some renderers. Consider using
a consistent language tag like json since the content is a JSON body for a REST API
call.

docs/user/ppl/cmd/patterns.md [93-100]

-```bash ignore
+```json
 PUT _cluster/settings
 {
   "persistent": {
     "plugins.calcite.udaf_pushdown.enabled": true
   }
 }
Suggestion importance[1-10]: 3

__

Why: The `bash ignore` language tag is inconsistent with other code blocks in the document and may affect syntax highlighting. However, looking at the existing code block at line 83 which uses `bash ignore` as well, this appears to be an intentional pattern in this document, reducing the importance of this suggestion.

Low
Suggestions up to commit 0dece50

Category | Suggestion | Impact
Possible issue
Fix overly broad early array-handling condition

The new early array-check condition uses fieldType.isEmpty() || supportArrays, which means it will intercept arrays even when supportArrays is false and fieldType is present (non-empty). This changes existing behavior for known-typed fields when supportArrays is false, potentially bypassing the original single-value extraction path. The condition should be fieldType.isEmpty() && !supportArrays or restructured to only apply when fieldType is absent.

opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java [202-209]

+// Only apply early array handling when field type is unknown (not in mapping)
 if (content.isArray()
-    && (fieldType.isEmpty() || supportArrays)
+    && fieldType.isEmpty()
     && !fieldType
         .map(t -> t.equals(OpenSearchDataType.of(OpenSearchDataType.MappingType.GeoPoint)))
         .orElse(false)) {
-  ExprType type = fieldType.orElse(ARRAY);
+  ExprType type = ARRAY;
   return parseArray(content, field, type, supportArrays);
 }
Suggestion importance[1-10]: 7

__

Why: The condition fieldType.isEmpty() || supportArrays could intercept arrays for known-typed fields when supportArrays is false, potentially changing existing behavior. This is a legitimate concern about the correctness of the new early array-check path, though the suggested fix may be too restrictive by removing the supportArrays branch entirely.

Medium
Guard against null pattern maps before merging

If either patterns1 or patterns2 is null (e.g., when acc1 or acc2 was never
populated with a patternGroupMap), passing null to PatternUtils.mergePatternGroups
may cause a NullPointerException. Add null-checks or default to empty maps before
merging.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [262-268]

```diff
 public static Map<String, Object> combinePatternAccumulators(
     Map<String, Object> acc1, Map<String, Object> acc2, int maxSampleCount) {

   Map<String, Map<String, Object>> patterns1 =
-      (Map<String, Map<String, Object>>) acc1.get("patternGroupMap");
+      acc1.get("patternGroupMap") != null
+          ? (Map<String, Map<String, Object>>) acc1.get("patternGroupMap")
+          : new HashMap<>();
   Map<String, Map<String, Object>> patterns2 =
-      (Map<String, Map<String, Object>>) acc2.get("patternGroupMap");
+      acc2.get("patternGroupMap") != null
+          ? (Map<String, Map<String, Object>>) acc2.get("patternGroupMap")
+          : new HashMap<>();

   Map<String, Map<String, Object>> merged =
       PatternUtils.mergePatternGroups(patterns1, patterns2, maxSampleCount);
```
Suggestion importance[1-10]: 6

__

Why: If acc1 or acc2 was never populated with a patternGroupMap, passing null to PatternUtils.mergePatternGroups could cause a NullPointerException. The fix adds null-safety, though in practice the accumulators are always initialized via initPatternAccumulator() which sets an empty map.

Low
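A more compact way to express the null-defaulting recommended above is `Optional.ofNullable(...).orElseGet(...)`. This is a generic sketch, not the PR's actual helper; it matches the ternary's semantics even when the key is present but mapped to null:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullSafeGet {
    @SuppressWarnings("unchecked")
    static Map<String, Map<String, Object>> patternsOrEmpty(Map<String, Object> acc) {
        // Equivalent to: acc.get(...) != null ? (cast) acc.get(...) : new HashMap<>()
        return Optional.ofNullable(
                        (Map<String, Map<String, Object>>) acc.get("patternGroupMap"))
                .orElseGet(HashMap::new);
    }

    public static void main(String[] args) {
        Map<String, Object> acc = new HashMap<>();
        acc.put("patternGroupMap", null); // key present but mapped to null
        System.out.println(patternsOrEmpty(acc).isEmpty());
        System.out.println(patternsOrEmpty(new HashMap<>()).isEmpty());
    }
}
```

Unlike `Map.getOrDefault`, this form also yields the empty map when the key exists but holds null, which is exactly what the suggested ternary guards against.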
Verify doc values reference is live across documents

The dataContext is created once in the constructor using getDoc(), but OpenSearch
updates the doc values reference per document. If getDoc() returns a snapshot rather
than a live reference, the dataContext will read stale doc values for subsequent
documents. Verify that getDoc() returns a live/mutable reference that OpenSearch
updates in-place; if not, the DataContext must be recreated per execute() call.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [75-78]

```diff
+// If getDoc() does not return a live reference updated by OpenSearch per document,
+// create the DataContext per execute() call instead:
 this.dataContext =
     new ScriptedMetricDataContext.MapContext(
         params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
+// NOTE: Confirm getDoc() is a live reference; otherwise move DataContext creation to execute()
```
Suggestion importance[1-10]: 5

__

Why: The concern about getDoc() returning a stale snapshot is valid, but the suggestion only asks to verify behavior rather than providing a concrete fix. The improved_code is essentially the same as existing_code with a comment added.

Low
General
Use fixed argument positions instead of type-based disambiguation

The type-based disambiguation for arg4 when args.size() == 5 is fragile: if a user
passes an integer literal for thresholdPercentage or a decimal for
variableCountThreshold, the wrong parameter will be assigned. The argument positions
should be fixed and documented rather than inferred from runtime types, matching the
positional convention used in buildMapScript and buildCombineScript
(args[4]=thresholdPercentage, args[5]=variableCountThreshold).

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java [127-140]

```diff
+// Use fixed positional convention: args[4]=thresholdPercentage, args[5]=variableCountThreshold
 if (args.size() > 5) {
   thresholdPercentage = args.get(4);
   variableCountThreshold = args.get(5);
 } else if (args.size() > 4) {
-  RexNode arg4 = args.get(4);
-  SqlTypeName arg4Type = arg4.getType().getSqlTypeName();
-  if (arg4Type == SqlTypeName.DOUBLE
-      || arg4Type == SqlTypeName.DECIMAL
-      || arg4Type == SqlTypeName.FLOAT) {
-    thresholdPercentage = arg4;
-  } else {
-    variableCountThreshold = arg4;
-  }
+  thresholdPercentage = args.get(4);
 }
```
Suggestion importance[1-10]: 6

__

Why: The type-based disambiguation for arg4 is fragile and inconsistent with the positional convention used in buildMapScript and buildCombineScript. Using fixed positions is more robust and maintainable, though the improved code drops variableCountThreshold assignment when only 5 args are present.

Low
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" without specifying which ones
are bypassed. This could leave users without enough information to assess the risk.
Consider clarifying which specific circuit breakers are bypassed (e.g., the
parent/field data circuit breakers) so users can make an informed decision.

docs/user/ppl/cmd/patterns.md [102]

```diff
-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. Scripted metric aggregations bypass the parent and field data memory circuit breakers, which may cause out-of-memory errors on data nodes when processing very large datasets. Use with caution and monitor cluster memory usage closely.
```
Suggestion importance[1-10]: 4

__

Why: The suggestion improves documentation clarity by specifying which circuit breakers are bypassed, helping users make more informed decisions. However, it's a documentation improvement with moderate impact, and the specific circuit breakers mentioned (parent and field data) may not be fully accurate without verification.

Low
Suggestions up to commit a4a6e4e
Category | Suggestion | Impact
General
Handle null aggregation result gracefully

When result is null, the instanceof List check returns false and the code falls
through to throw an exception with a null-safe message. However, an empty result
(null) from a scripted metric aggregation on an empty index is a valid scenario and
should return an empty list rather than throwing an exception. Handle the null case
explicitly to avoid unexpected failures on empty indices.

opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/ScriptedMetricParser.java [40-47]

```diff
+if (result == null) {
+  return List.of();
+}
 if (result instanceof List) {
   return List.of(Map.of(name, result));
 }
 throw new IllegalArgumentException(
     String.format(
         "Expected List<Map<String, Object>> from scripted metric but got %s",
-        result == null ? "null" : result.getClass().getSimpleName()));
+        result.getClass().getSimpleName()));
```
Suggestion importance[1-10]: 6

__

Why: A null result from a scripted metric aggregation on an empty index is a realistic scenario, and throwing an exception in that case would cause unexpected failures. Returning an empty list for null results is a more robust approach and prevents runtime errors on empty indices.

Low
Replace magic number with enum constant reference

The magic number 3 is used directly instead of referencing the
Source.SPECIAL_VARIABLE.getValue() enum constant. This creates a maintenance risk —
if the enum values change, this code will silently break. Use the enum value
directly to keep the code consistent and safe.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ScriptParameterHelper.java [108-113]

```diff
 public int addSpecialVariable(String variableName) {
   int index = sources.size();
-  sources.add(3); // SPECIAL_VARIABLE = 3
+  sources.add(CalciteScriptEngine.Source.SPECIAL_VARIABLE.getValue());
   digests.add(variableName);
   return index;
 }
```
Suggestion importance[1-10]: 5

__

Why: Using the magic number 3 instead of CalciteScriptEngine.Source.SPECIAL_VARIABLE.getValue() creates a maintenance risk if enum values change. This is a valid maintainability concern, though the risk is low since the enum is defined in the same codebase.

Low
Avoid mutating input state during reduce phase

The first state (states.get(0)) is directly cast and used as the initial combined
accumulator without copying. Since combinePatternAccumulators creates a new map for
subsequent merges, the first state's logMessages and patternGroupMap are passed into
the first merge but the original state object is not mutated. However, if
states.get(0) is the same object reference used elsewhere (e.g., by OpenSearch),
mutating it could cause issues. More critically, the buffered logMessages in
states.get(0) are included in the combine but never processed through the parser
before the final producePatternResult call, which handles remaining logs correctly —
this is fine. The logic is safe as written, but consider using
initPatternAccumulator() as the starting point and iterating all states for clarity
and safety.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [460-465]

```diff
-// Combine all states into a single accumulator
-Map<String, Object> combined = (Map<String, Object>) states.get(0);
-for (int i = 1; i < states.size(); i++) {
-  Map<String, Object> state = (Map<String, Object>) states.get(i);
+// Combine all states into a single accumulator starting from an empty one
+Map<String, Object> combined = initPatternAccumulator();
+for (Object stateObj : states) {
+  Map<String, Object> state = (Map<String, Object>) stateObj;
   combined = combinePatternAccumulators(combined, state, maxSampleCount);
 }
```
Suggestion importance[1-10]: 4

__

Why: Starting from an empty accumulator and iterating all states is cleaner and avoids any risk of aliasing the first state object. However, since combinePatternAccumulators creates a new map for the result, the original states.get(0) is not mutated, making this primarily a readability/safety improvement rather than a bug fix.

Low
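The fold-from-identity pattern argued for above can be illustrated with a generic sketch; the `combine` helper here is hypothetical, standing in for `combinePatternAccumulators`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FoldDemo {
    // Hypothetical combine: merges counts from two accumulators into a fresh map
    static Map<String, Integer> combine(Map<String, Integer> a, Map<String, Integer> b) {
        Map<String, Integer> out = new HashMap<>(a);
        b.forEach((k, v) -> out.merge(k, v, Integer::sum));
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> states = List.of(
                new HashMap<>(Map.of("err", 2)),
                new HashMap<>(Map.of("err", 1, "warn", 3)));

        // Fold from an empty identity accumulator: no input state is ever aliased
        Map<String, Integer> combined = new HashMap<>();
        for (Map<String, Integer> s : states) {
            combined = combine(combined, s);
        }
        System.out.println(combined.get("err") + "," + combined.get("warn"));
        // The first input state is untouched (still holds only its original entry)
        System.out.println(states.get(0).size());
    }
}
```

Starting from an identity value means the loop treats every input uniformly and never holds a reference to a caller-owned object, which is the safety property the suggestion is after.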
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" without specifying which ones
are bypassed. This could leave users without enough information to make an informed
decision. Consider clarifying which specific circuit breakers are bypassed (e.g.,
the parent/request circuit breakers) so users understand the exact risk.

docs/user/ppl/cmd/patterns.md [102]

```diff
-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses the request and parent memory circuit breakers (which normally protect against excessive memory usage during query execution) and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
```
Suggestion importance[1-10]: 4

__

Why: The suggestion improves clarity by specifying which circuit breakers are bypassed, but this is a minor documentation enhancement. The exact circuit breakers bypassed may depend on implementation details that the PR author would need to verify.

Low
Fix inconsistent code block language hint

The code block uses bash ignore as the language hint, which is inconsistent with the
other cluster settings examples in the same file that use bash. Using ignore will
prevent syntax highlighting and may confuse readers or tooling. It should be changed
to match the existing style.

docs/user/ppl/cmd/patterns.md [93-100]

````diff
-```bash ignore
+```bash
 PUT _cluster/settings
 {
   "persistent": {
     "plugins.calcite.udaf_pushdown.enabled": true
   }
 }
````
Suggestion importance[1-10]: 3

__

Why: The `bash ignore` language hint is inconsistent with other code blocks in the file that use `bash`. This is a minor style/consistency issue that affects syntax highlighting but not functionality.

Low

Possible issue

Ensure doc values are fresh per document execution

___

The `MapContext` is created once in the constructor using `getDoc()`, which returns the doc map at construction time. However, OpenSearch updates the doc values internally before each `execute()` call by updating the same map reference. If `getDoc()` returns a snapshot rather than the live reference, doc values will be stale for all documents after the first. Verify that `getDoc()` returns the same mutable map reference that OpenSearch updates per document, otherwise the DataContext must be recreated per `execute()` call.

[opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [73-78]](https://github.com/opensearch-project/sql/pull/5064/files#diff-71e9a88d7cdaf170b68b083346df8ec904a1c21ea50643c66d8a03f405e7e0d1R73-R78)

```diff
-this.dataContext =
-    new ScriptedMetricDataContext.MapContext(
-        params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
+@Override
+@SuppressWarnings("unchecked")
+public void execute() {
+  // Recreate DataContext per document to ensure fresh doc values and source lookup
+  DataContext ctx = new ScriptedMetricDataContext.MapContext(
+      getParams(), (Map<String, Object>) getState(), getDoc(),
+      lookup.getLeafSearchLookup(leafContext).source());
+  Object[] result = function.apply(ctx);
+  ScriptedMetricDataContext.mergeResultIntoState(result, (Map<String, Object>) getState());
+}
```

Suggestion importance[1-10]: 5

__

Why: The concern about stale doc values is valid in principle, but the code comment explicitly explains that doc and sourceLookup are updated internally by OpenSearch before each execute() call, meaning the same map reference is mutated in-place. The suggestion to recreate the context per document would add overhead without necessarily fixing a real bug, and the existing approach is a deliberate optimization.

Low
Suggestions up to commit a4a6e4e
Category | Suggestion | Impact
Possible issue
Flush buffered messages before combining accumulators

The combinePatternAccumulators method does not process the buffered logMessages from
either accumulator before merging patternGroupMap. If either accumulator has
unprocessed messages in its buffer, those messages will be silently dropped during
the combine phase, leading to incorrect pattern counts and missing data in the final
result.
Before merging the pattern group maps, the buffered log messages from both
accumulators should be parsed and merged into their respective patternGroupMap
entries first.

common/src/main/java/org/opensearch/sql/common/patterns/PatternAggregationHelpers.java [260-269]

```diff
+@SuppressWarnings("unchecked")
 public static Map<String, Object> combinePatternAccumulators(
     Map<String, Object> acc1, Map<String, Object> acc2, int maxSampleCount) {

+  // Flush buffered messages from acc1 before combining
+  List<String> logMessages1 = (List<String>) acc1.get("logMessages");
   Map<String, Map<String, Object>> patterns1 =
       (Map<String, Map<String, Object>>) acc1.get("patternGroupMap");
+  if (logMessages1 != null && !logMessages1.isEmpty()) {
+    BrainLogParser parser = new BrainLogParser(
+        BrainLogParser.DEFAULT_VARIABLE_COUNT_THRESHOLD,
+        (float) BrainLogParser.DEFAULT_FREQUENCY_THRESHOLD_PERCENTAGE);
+    patterns1 = PatternUtils.mergePatternGroups(
+        patterns1, parser.parseAllLogPatterns(logMessages1, maxSampleCount), maxSampleCount);
+    logMessages1.clear();
+  }
+
+  // Flush buffered messages from acc2 before combining
+  List<String> logMessages2 = (List<String>) acc2.get("logMessages");
   Map<String, Map<String, Object>> patterns2 =
       (Map<String, Map<String, Object>>) acc2.get("patternGroupMap");
+  if (logMessages2 != null && !logMessages2.isEmpty()) {
+    BrainLogParser parser = new BrainLogParser(
+        BrainLogParser.DEFAULT_VARIABLE_COUNT_THRESHOLD,
+        (float) BrainLogParser.DEFAULT_FREQUENCY_THRESHOLD_PERCENTAGE);
+    patterns2 = PatternUtils.mergePatternGroups(
+        patterns2, parser.parseAllLogPatterns(logMessages2, maxSampleCount), maxSampleCount);
+    logMessages2.clear();
+  }

   Map<String, Map<String, Object>> merged =
       PatternUtils.mergePatternGroups(patterns1, patterns2, maxSampleCount);
```
Suggestion importance[1-10]: 7

__

Why: The combinePatternAccumulators method merges patternGroupMap entries but ignores buffered logMessages in both accumulators. However, looking at the code more carefully, the combine phase in producePatternResultFromStates calls combinePatternAccumulators which merges the logMessages lists (lines 272-280), and then producePatternResult flushes remaining logMessages. So the buffered messages are not dropped but carried forward. The suggestion has some merit but the current design intentionally defers flushing to the reduce phase.

Medium
Handle null scripted metric result gracefully

When result is null (e.g., no documents matched or all shards returned null), the
code throws an IllegalArgumentException instead of returning an empty result. A null
result from a scripted metric aggregation is a valid scenario (e.g., empty index or
no matching documents) and should be handled gracefully by returning an empty list.

opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/ScriptedMetricParser.java [40-46]

```diff
 if (result instanceof List) {
     return List.of(Map.of(name, result));
+  }
+  if (result == null) {
+    return List.of();
   }
   throw new IllegalArgumentException(
       String.format(
           "Expected List<Map<String, Object>> from scripted metric but got %s",
-          result == null ? "null" : result.getClass().getSimpleName()));
+          result.getClass().getSimpleName()));
```
Suggestion importance[1-10]: 6

__

Why: A null result from a scripted metric aggregation (e.g., empty index) would cause an IllegalArgumentException instead of returning an empty list. This is a valid edge case that should be handled gracefully to avoid unexpected errors in production.

Low
Ensure per-document source lookup is correctly advanced

The DataContext is created once in the constructor using getDoc() and
lookup.getLeafSearchLookup(leafContext).source(). However, getDoc() returns a
reference to the doc map that OpenSearch updates in-place before each execute()
call, but lookup.getLeafSearchLookup(leafContext).source() returns a SourceLookup
instance that may need to be advanced per document. If the SourceLookup is not the
same instance that OpenSearch advances, source field access will return stale data
for all documents after the first.
Verify that the SourceLookup obtained via
lookup.getLeafSearchLookup(leafContext).source() is the same instance that
OpenSearch advances per document, or obtain it lazily per execute() call to ensure
correctness.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/CalciteScriptedMetricMapScriptFactory.java [65-78]

```diff
 public CalciteScriptedMetricMapScript(
     Function1<DataContext, Object[]> function,
     Map<String, Object> params,
     Map<String, Object> state,
     SearchLookup lookup,
     LeafReaderContext leafContext) {
   super(params, state, lookup, leafContext);
   this.function = function;
-  // Create DataContext once and reuse for all documents in this segment.
-  // OpenSearch updates doc values and source lookup internally before each execute().
+  // Use the SourceLookup from the leaf search lookup - OpenSearch advances this per document.
+  // getDoc() returns the same map reference that OpenSearch updates in-place per document.
   this.dataContext =
       new ScriptedMetricDataContext.MapContext(
-          params, state, getDoc(), lookup.getLeafSearchLookup(leafContext).source());
+          params, state, getDoc(), getLeafLookup().source());
 }
```
Suggestion importance[1-10]: 5

__

Why: The concern about SourceLookup staleness is valid in principle, but the comment in the code already acknowledges that OpenSearch updates doc values and source lookup internally. The suggestion to use getLeafLookup().source() may not be a valid API call on ScriptedMetricAggContexts.MapScript. The issue needs verification but the improved code may not compile correctly.

Low
Fix inconsistent argument index mapping between script phases

The argument index mapping in buildReduceScript is inconsistent with buildMapScript.
In buildMapScript, args[4] is thresholdPercentage and args[5] is
variableCountThreshold, but in buildReduceScript when args.size() > 5, args.get(4)
is assigned to thresholdPercentage and args.get(5) to variableCountThreshold. The
type-based disambiguation for args.size() == 5 is fragile and error-prone. The
argument ordering should be consistent between map and reduce scripts.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java [108-121]

```diff
 if (args.size() > 5) {
   thresholdPercentage = args.get(4);
   variableCountThreshold = args.get(5);
 } else if (args.size() > 4) {
-  RexNode arg4 = args.get(4);
-  SqlTypeName arg4Type = arg4.getType().getSqlTypeName();
-  if (arg4Type == SqlTypeName.DOUBLE
-      || arg4Type == SqlTypeName.DECIMAL
-      || arg4Type == SqlTypeName.FLOAT) {
-    thresholdPercentage = arg4;
-  } else {
-    variableCountThreshold = arg4;
-  }
+  // args[4] = thresholdPercentage (consistent with buildMapScript ordering)
+  thresholdPercentage = args.get(4);
 }
```
Suggestion importance[1-10]: 4

__

Why: The type-based disambiguation for args.size() == 5 is fragile, but the improved code removes the variableCountThreshold assignment for the 5-argument case entirely, which changes behavior. The existing code at least attempts to handle both cases, while the suggestion silently drops variableCountThreshold when only 5 args are present.

Low
General
Clarify which circuit breakers are bypassed

The warning mentions "certain memory circuit breakers" without specifying which ones
are bypassed. This is important security/operational information that users need to
make an informed decision. Consider clarifying which specific circuit breakers are
bypassed (e.g., the parent/field data circuit breakers) so users understand the
exact risk.

docs/user/ppl/cmd/patterns.md [102]

```diff
-> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. This bypasses certain memory circuit breakers and may cause out-of-memory errors on nodes when processing very large datasets. Use with caution and monitor cluster resource usage.
+> **Warning**: Enabling UDAF pushdown executes user-defined aggregation functions as scripted metric aggregations on OpenSearch data nodes. Scripted metric aggregations bypass the parent and field data memory circuit breakers, which may cause out-of-memory errors on data nodes when processing very large datasets. Use with caution and monitor cluster heap usage closely.
```
Suggestion importance[1-10]: 4

__

Why: The suggestion improves documentation clarity by specifying which circuit breakers are bypassed, but this is a minor documentation enhancement rather than a critical fix. The existing warning is already informative enough for most users.

Low
Fix incorrect code block language tag…

…reduce

Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: Songkan Tang <songkant@amazon.com>
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 0dece50

Signed-off-by: Songkan Tang <songkant@amazon.com>
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit d815a87.

- Path: integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_brain_agg_push.yaml, line 22, severity: medium. Base64-encoded payloads with Java serialization magic bytes (rO0ABX prefix, indicating Java object serialization) are embedded in OpenSearch scripted metric scripts. These serialized Calcite RexNode expressions are sent to and deserialized on data nodes. Java deserialization of remotely-supplied payloads is a known RCE vector (CVE pattern). While the serialization appears to be from server-generated query plans, the lack of explicit deserialization safeguards in CalciteScriptEngine warrants investigation.
- Path: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/ScriptedMetricDataContext.java, line 107, severity: medium. The mergeResultIntoState() method performs an unchecked cast of arbitrary execution results into the shared state map (state.putAll((Map) result[0])). Combined with the UDAF pushdown explicitly bypassing OpenSearch memory circuit breakers (documented in the PR), a malicious or malformed payload could overwrite state entries or trigger unbounded memory growth on data nodes without circuit breaker protection. The bypass of safety mechanisms with no alternative bounds checking is a concern.
- Path: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/scriptedmetric/udaf/PatternScriptedMetricUDAF.java, line 93, severity: low. PATTERN_FLUSH_UDF is referenced in buildCombineScript() and defined in PPLBuiltinOperators.java, but has no corresponding entry in BuiltinFunctionName.java (only PATTERN_INIT_UDF, PATTERN_ADD_UDF, PATTERN_COMBINE_UDF, and PATTERN_RESULT_UDF are registered). This registry inconsistency could indicate an incomplete or hidden registration path for a script executed on remote data nodes.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 2 | Low: 1
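One standard mitigation for the deserialization finding above is a JDK `ObjectInputFilter` allowlist (JEP 290, available since Java 9). The sketch below is generic and not the PR's CalciteScriptEngine code; the filter pattern shown admits only `java.lang.String` and rejects every other class in the stream:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class DeserializationFilterDemo {
    public static void main(String[] args) throws Exception {
        // Serialize a benign String payload
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject("hello");
        }

        // Allowlist filter: accept java.lang.String, reject everything else ("!*")
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter("java.lang.String;!*");

        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            // Classes outside the allowlist cause the read to fail with InvalidClassException
            in.setObjectInputFilter(filter);
            System.out.println("deserialized: " + in.readObject());
        }
    }
}
```

For the PR's case, a production allowlist would have to enumerate the Calcite and plugin classes the serialized RexNode payloads legitimately contain; the point is that the filter bounds what a remotely supplied stream can instantiate.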


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit d815a87

Signed-off-by: Songkan Tang <songkant@amazon.com>
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit bf583fa

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@ahkcs
Collaborator

ahkcs commented Mar 17, 2026

Hi @songkant-aws, any updates on this PR?

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@songkant-aws songkant-aws requested a review from ahkcs as a code owner April 2, 2026 03:24
@github-actions
Contributor

github-actions bot commented Apr 2, 2026

Persistent review updated to latest commit 16da4de


Labels

enhancement New feature or request pushdown pushdown related issues stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Pushdown any UDAF by scripted metric aggregations

5 participants