[LIQ] Add new aggregate functions, aliases, and queryable aggregate registry #1891
Merged
zefhemel merged 28 commits into silverbulletmd:main on Mar 19, 2026
…egistry

* Extend with 13 new built-in aggregates: `product`, `string_agg`, `yaml_agg`, `json_agg`, `bit_and`, `bit_or`, `bit_xor`, `bool_and`, `bool_or`, `stddev_pop`, `stddev_samp`, `var_pop` and `var_samp`.
* Introduce `aggregate.alias` API allowing users to define custom aliases for any aggregate. Standard aliases (`every`, `std`, `stddev` and `variance`) are now defined via this API rather than hardcoded.
* Add `index.aggregates` queryable collection so users can discover all available aggregates directly from LIQ queries.

Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
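To make the alias and registry idea concrete, here is a minimal sketch of how an alias table and a flat, queryable listing could fit together. Everything in it is hypothetical (`AggregateRegistry`, `define`, `resolve`, `list` are illustration names, not the API added by this PR), and the alias targets follow the Postgres convention (`every` for `bool_and`, `stddev` for `stddev_samp`), which is an assumption here.

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type AggregateSpec = { name: string; description: string };

// One row per callable name; `target` is null for a real aggregate
// and holds the canonical name for an alias.
type RegistryEntry = { name: string; target: string | null };

class AggregateRegistry {
  private specs = new Map<string, AggregateSpec>();
  private aliases = new Map<string, string>();

  define(spec: AggregateSpec): void {
    this.specs.set(spec.name, spec);
  }

  // Register `name` as another way to call `target`.
  alias(name: string, target: string): void {
    if (!this.resolve(target)) {
      throw new Error(`Unknown aggregate: ${target}`);
    }
    this.aliases.set(name, target);
  }

  // Follow alias hops (guarding against cycles) to the underlying spec.
  resolve(name: string): AggregateSpec | undefined {
    const seen = new Set<string>();
    let current = name;
    while (this.aliases.has(current) && !seen.has(current)) {
      seen.add(current);
      current = this.aliases.get(current)!;
    }
    return this.specs.get(current);
  }

  // Flat listing that could back a queryable collection.
  list(): RegistryEntry[] {
    const entries: RegistryEntry[] = [];
    this.specs.forEach((_spec, name) => entries.push({ name, target: null }));
    this.aliases.forEach((target, name) => entries.push({ name, target }));
    return entries;
  }
}

const registry = new AggregateRegistry();
registry.define({ name: "bool_and", description: "true if all inputs are true" });
registry.define({ name: "stddev_samp", description: "sample standard deviation" });
registry.alias("every", "bool_and"); // assumed mapping (Postgres convention)
registry.alias("stddev", "stddev_samp"); // assumed mapping (Postgres convention)
```

Keeping aliases in their own map means a listing can report both the alias and what it points at, while lookups still end at a single definition.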
Contributor
Author
This also fixes custom aggregates, which in fact never worked.
…rage

`config.set` uses `LuaNativeJSFunction`, which calls `luaValueToJS` on all arguments. This converted the aggregate `LuaTable` to a plain JS object and wrapped `LuaFunction` callbacks in JS functions that also converted their return values via `luaValueToJS`. As a result, the state returned by `initialize` (a `LuaTable`) was converted to a plain JS object before being passed to `iterate`. Therefore Lua operations like `table.insert` on that state failed, because they expected a `LuaTable` and not a plain JS array.

Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
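The failure mode described in this commit can be reproduced in miniature. The sketch below uses stand-in names (`FakeLuaTable`, `tableInsert`, `toPlainJS`, `wrapConverting` are all hypothetical and are not the real runtime classes); it only shows why eagerly converting the value returned by `initialize` breaks a later table operation in `iterate`.

```typescript
// Hypothetical stand-in for a Lua table; not the real runtime class.
class FakeLuaTable {
  private items: unknown[] = [];
  insert(value: unknown): void {
    this.items.push(value);
  }
  get length(): number {
    return this.items.length;
  }
  toArray(): unknown[] {
    return [...this.items];
  }
}

// Stand-in for a table.insert that only accepts real tables.
function tableInsert(target: unknown, value: unknown): void {
  if (!(target instanceof FakeLuaTable)) {
    throw new TypeError("table expected");
  }
  target.insert(value);
}

// Eager conversion, analogous to the behaviour the commit describes.
function toPlainJS(value: unknown): unknown {
  return value instanceof FakeLuaTable ? value.toArray() : value;
}

type Aggregate = {
  initialize: () => unknown;
  iterate: (state: unknown, value: unknown) => unknown;
};

// An aggregate that collects values into a table-typed state.
const collect: Aggregate = {
  initialize: () => new FakeLuaTable(),
  iterate: (state, value) => {
    tableInsert(state, value);
    return state;
  },
};

// Wrapping each callback so its result is converted loses the table type.
function wrapConverting(agg: Aggregate): Aggregate {
  return {
    initialize: () => toPlainJS(agg.initialize()),
    iterate: (s, v) => toPlainJS(agg.iterate(s, v)),
  };
}

function run(agg: Aggregate, values: unknown[]): unknown {
  let state = agg.initialize();
  for (const v of values) state = agg.iterate(state, v);
  return state;
}
```

Running `collect` directly keeps the state a table throughout; running `wrapConverting(collect)` throws on the first `iterate`, because the state has already become a plain array.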
Contributor
Author
This also fixes table functions not working on intermediate state in aggregates.
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
…egistry Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
This preserves `null`/`undefined` as-is (both map to Lua nil) and
prevents them from falling through to the `typeof` "object" branch.
For this PR it means that null `target` in our `aggregates` entries will
correctly show as empty/`nil` in query results rather than `{}`.
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
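The `typeof` pitfall mentioned above is easy to show in isolation. This is a minimal sketch with hypothetical function names (`describeNaive`, `describeFixed`), not the converter touched by the commit: in JavaScript `typeof null` is `"object"`, so a null check has to come before the object branch.

```typescript
// Naive classification: null is reported as a table because
// typeof null === "object".
function describeNaive(value: unknown): string {
  if (typeof value === "object") return "table";
  return typeof value;
}

// Fixed classification: null and undefined are handled first and both
// map to "nil", so they never reach the object branch.
function describeFixed(value: unknown): string {
  if (value === null || value === undefined) return "nil";
  if (typeof value === "object") return "table";
  return typeof value;
}
```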
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
… leaks
* `sum()` and `product()` now return null when no rows match (matching
Postgres semantics) instead of returning 0 and 1 respectively.
* Query result columns that hold null are internally preserved using
a `LIQ_NULL` sentinel so that column keys survive in `LuaTable`
storage. This sentinel was leaking into Lua code as "userdata"
through three read paths:
  * `luaIndexValue`: `rawGet` returned the sentinel directly to Lua when
    accessing table fields,
  * `rawget` (stdlib): the builtin `rawget` function exposed the
    sentinel without converting it back to `nil`,
  * `createAugmentedEnv`: string interpolation unpacked table values via
    `rawGet` into local variables, making the sentinel visible in
    template expressions like `${var}`.

  All three now convert `LIQ_NULL` to `nil` at the read boundary,
  keeping the sentinel internal to table storage where it belongs.
* Update affected test expectations accordingly.
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
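The "convert at the read boundary" idea can be sketched as follows. The names `NULL_SENTINEL` and `SketchTable` are hypothetical and the real table storage is certainly more involved; the point is only that the sentinel is written on the way in, stripped on the way out, and the key survives in between.

```typescript
// Hypothetical names; a stand-in for the sentinel and table storage.
const NULL_SENTINEL = Symbol("null");

class SketchTable {
  private store = new Map<string, unknown>();

  // Write boundary: store the sentinel so that the key is kept
  // even though the value is null.
  set(key: string, value: unknown): void {
    const isNull = value === null || value === undefined;
    this.store.set(key, isNull ? NULL_SENTINEL : value);
  }

  // Read boundary: callers never observe the sentinel.
  get(key: string): unknown {
    const raw = this.store.get(key);
    return raw === NULL_SENTINEL ? null : raw;
  }

  // Keys with null values are still listed.
  keys(): string[] {
    return Array.from(this.store.keys());
  }
}

const row = new SketchTable();
row.set("name", "page");
row.set("total", null);
```

Every additional read path needs the same conversion, which is why a leak through any single accessor is enough to expose the sentinel.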
…regate `iterate`s Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
Contributor
Author
Switched to draft again, working on …
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
A `Symbol(...)` inside an array passed to `JSON.stringify` produces null by accident. That is a JS implementation detail we **MUST NOT** rely on. An explicit null push makes the intent clear and avoids surprises if the `Symbol` representation ever changes.

Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
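The difference between the implicit and the explicit approach can be seen in a few lines. `JSON_NULL` is a hypothetical stand-in for the sentinel and `toJsonArray` is an illustration helper, not code from this PR.

```typescript
// Hypothetical stand-in for the sentinel symbol.
const JSON_NULL = Symbol("null");

// Implicit: JSON.stringify emits null for a Symbol array element.
const implicit = JSON.stringify([1, JSON_NULL, 3]);

// Explicit: replace the sentinel with null before serializing,
// so the output does not depend on how symbols are serialized.
function toJsonArray(values: unknown[]): string {
  return JSON.stringify(values.map((v) => (v === JSON_NULL ? null : v)));
}
const explicit = toJsonArray([1, JSON_NULL, 3]);

// In an object the implicit behaviour differs: the property is dropped.
const asObject = JSON.stringify({ a: JSON_NULL });
```

Both array forms give `[1,null,3]` today, but the object case yields `{}`, which illustrates how the implicit route depends on where the sentinel happens to sit.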
`js-yaml` has no knowledge of the `LIQ_NULL` symbol. Passing null makes it emit YAML null (or `~`), which is the correct YAML representation of a missing value and matches standard `json_agg`/`yaml_agg` NULL-inclusion semantics. Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
Without this, `LIQ_NULL` sort keys would fall through to `valA < valB`, which is always false for `Symbol`s, breaking the `nulls first`/`nulls last` contract.

Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
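A null-aware comparator along these lines decides null placement before any ordinary comparison runs. This is a sketch under assumptions: `SORT_NULL`, `NullsOrder` and `compareWithNulls` are hypothetical names, and only numbers and strings are handled.

```typescript
// Hypothetical stand-in for the sentinel symbol.
const SORT_NULL = Symbol("null");

type NullsOrder = "first" | "last";

function isNullKey(v: unknown): boolean {
  return v === SORT_NULL || v === null || v === undefined;
}

// Null keys are placed according to `nulls` before values are compared.
function compareWithNulls(a: unknown, b: unknown, nulls: NullsOrder): number {
  const aNull = isNullKey(a);
  const bNull = isNullKey(b);
  if (aNull && bNull) return 0;
  if (aNull) return nulls === "first" ? -1 : 1;
  if (bNull) return nulls === "first" ? 1 : -1;
  // This sketch only orders numbers numerically and everything else as text.
  if (typeof a === "number" && typeof b === "number") return a - b;
  const sa = String(a);
  const sb = String(b);
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}

const nullsLast = [3, SORT_NULL, 1].sort((a, b) => compareWithNulls(a, b, "last"));
const nullsFirst = [3, SORT_NULL, 1].sort((a, b) => compareWithNulls(a, b, "first"));
```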
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
…NULL` sentinel Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
…visible text Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
Contributor
Author
I believe most real issues are now solved.
…egates

Extra arguments (2nd, 3rd, etc.) to aggregate functions were evaluated against the outer query environment, where the object variable is not bound. This caused multi-argument aggregates like `covar_samp(data.y, data.x)` to fail with nil reference errors. This commit fixes that by evaluating extra arguments per item inside the iterate loop, using the item environment, so that all arguments resolve correctly.

We also add a few common aggregates:

- `covar_pop`, `covar_samp`, `corr`: population/sample covariance and correlation coefficient using an online co-moment algorithm.
- `quantile(value, q, method)`: general quantile with interpolation methods: lower, higher, nearest, midpoint and default linear.
- `percentile_cont(value, q)`: continuous percentile (linear)
- `percentile_disc(value, q)`: discrete percentile (lower)

Note: `percentile_cont` and `percentile_disc` share the `quantile` implementation through `ctx.name` at initialize time.

Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
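For readers unfamiliar with the online co-moment approach, a minimal single-pass sketch looks like this. It is not the code from this PR: `CovarState`, `initCovar`, `iterateCovar`, `covarPop` and `covarSamp` are hypothetical names, and returning null for too few rows is an assumption modelled on Postgres. Note that `iterateCovar` receives both `y` and `x` for each row, which is the per-item evaluation of extra arguments that the commit describes.

```typescript
// Running state: row count, means of x and y, and the co-moment
// C = sum((x - meanX) * (y - meanY)) maintained incrementally.
type CovarState = { n: number; meanX: number; meanY: number; c: number };

function initCovar(): CovarState {
  return { n: 0, meanX: 0, meanY: 0, c: 0 };
}

// One update per row; both arguments are evaluated for that row.
function iterateCovar(s: CovarState, y: number, x: number): CovarState {
  s.n += 1;
  const dx = x - s.meanX; // deviation from the previous mean of x
  s.meanX += dx / s.n;
  s.meanY += (y - s.meanY) / s.n;
  s.c += dx * (y - s.meanY); // uses the updated mean of y
  return s;
}

// Population covariance; null when there are no rows (assumed semantics).
function covarPop(s: CovarState): number | null {
  return s.n > 0 ? s.c / s.n : null;
}

// Sample covariance; null with fewer than two rows (assumed semantics).
function covarSamp(s: CovarState): number | null {
  return s.n > 1 ? s.c / (s.n - 1) : null;
}

const rows = [
  { x: 1, y: 2 },
  { x: 2, y: 4 },
  { x: 3, y: 6 },
];
const covarState = rows.reduce((s, r) => iterateCovar(s, r.y, r.x), initCovar());
```

For the three rows above the co-moment ends at 4, giving a population covariance of 4/3 and a sample covariance of 2.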
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
Contributor
Author
OK, I was wrong; now I believe we are good to go... 🎆
Collaborator
Very cool, but could you add some tests for the new aggregates?
Signed-off-by: Matouš Jan Fialka <mjf@mjf.cz>
* Extend with 13 new built-in aggregates: `product`, `string_agg`, `yaml_agg`, `json_agg`, `bit_and`, `bit_or`, `bit_xor`, `bool_and`, `bool_or`, `stddev_pop`, `stddev_samp`, `var_pop` and `var_samp`.
* Introduce `aggregate.alias` API allowing users to define custom aliases for any aggregate. Standard aliases (`every`, `std`, `stddev` and `variance`) are now defined via this API rather than hardcoded.
* Add `index.aggregates` queryable collection so users can discover all available aggregates directly from LIQ queries.

UPDATE: `LIQ_NULL` leaking in misc. ways in many places! TL;DR in commit messages... :)