Commit 534e771

Merge pull request github#5934 from github/hmakholm/pr/monotonic-agg
QL language reference: add monotonic aggregate example
2 parents e7a349b + 70b9739 commit 534e771

1 file changed: +95 -0 lines changed

docs/codeql/ql-language-reference/expressions.rst

Lines changed: 95 additions & 0 deletions
@@ -488,6 +488,101 @@ value for each value generated by the ``<formula>``:
value generated by the ``<formula>``. Here, the aggregation function is applied to each of the
resulting combinations.

Example of monotonic aggregates
-------------------------------

Consider this query:

.. code-block:: ql

   string getPerson() {
     result = "Alice" or
     result = "Bob" or
     result = "Charles" or
     result = "Diane"
   }

   string getFruit(string p) {
     p = "Alice" and result = "Orange" or
     p = "Alice" and result = "Apple" or
     p = "Bob" and result = "Apple" or
     p = "Charles" and result = "Apple" or
     p = "Charles" and result = "Banana"
   }

   int getPrice(string f) {
     f = "Apple" and result = 100 or
     f = "Orange" and result = 100 or
     f = "Orange" and result = 1
   }

   predicate nonmono(string p, int cost) {
     p = getPerson() and cost = sum(string f | f = getFruit(p) | getPrice(f))
   }

   language[monotonicAggregates]
   predicate mono(string p, int cost) {
     p = getPerson() and cost = sum(string f | f = getFruit(p) | getPrice(f))
   }

   from string variant, string person, int cost
   where variant = "default" and nonmono(person, cost) or
         variant = "monotonic" and mono(person, cost)
   select variant, person, cost
   order by variant, person

The query produces these results:

+-----------+---------+------+
| variant   | person  | cost |
+-----------+---------+------+
| default   | Alice   | 201  |
| default   | Bob     | 100  |
| default   | Charles | 100  |
| default   | Diane   | 0    |
| monotonic | Alice   | 101  |
| monotonic | Alice   | 200  |
| monotonic | Bob     | 100  |
| monotonic | Diane   | 0    |
+-----------+---------+------+

The two variants of the aggregate semantics differ in what happens
when ``getPrice(f)`` has either multiple results or no results
for a given ``f``.

In this query, oranges are available at two different prices, and the
default ``sum`` aggregate returns a single line where Alice buys an
orange at a price of 100, another orange at a price of 1, and an apple
at a price of 100, totalling 201. On the other hand, in the
*monotonic* semantics for ``sum``, Alice always buys one orange and
one apple, and a line of output is produced for each *way* she can
complete her shopping list.

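One way to see the difference is to spell out by hand what each variant
computes for Alice. The following is a minimal sketch, not part of the
original query: the predicate names ``pairwiseCost`` and ``perChoiceCost``
are made up for illustration, and ``getFruit`` is cut down to Alice's
shopping list. The first predicate makes the price an explicit aggregation
variable, so every (fruit, price) pair contributes to one total, which
matches the default behavior. The second adds up one price per fruit, which
matches the monotonic behavior.

.. code-block:: ql

   // Cut-down copies of the predicates from the example query.
   string getFruit(string p) {
     p = "Alice" and result = "Orange" or
     p = "Alice" and result = "Apple"
   }

   int getPrice(string f) {
     f = "Apple" and result = 100 or
     f = "Orange" and result = 100 or
     f = "Orange" and result = 1
   }

   // Hypothetical helper: a single total over every (fruit, price) pair.
   int pairwiseCost(string p) {
     p = "Alice" and
     result = sum(string f, int price | f = getFruit(p) and price = getPrice(f) | price)
   }

   // Hypothetical helper: one total per choice of a single price for each fruit.
   int perChoiceCost(string p) {
     p = "Alice" and
     result = getPrice("Orange") + getPrice("Apple")
   }

   from string form, int cost
   where form = "pairs" and cost = pairwiseCost("Alice") or
         form = "choices" and cost = perChoiceCost("Alice")
   select form, cost

Under these assumptions, the ``pairs`` form should give 201 and the
``choices`` form should give 101 and 200, the same values as the
``default`` and ``monotonic`` rows for Alice in the results table.
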
If there had been two different prices for apples too, the monotonic
``sum`` would have produced *four* output lines for Alice.

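As a sketch of that scenario, the variation below adds a second apple
price. The price of 120 is an assumption chosen for illustration, and the
predicates are reduced to Alice's shopping list to keep the query short.

.. code-block:: ql

   string getFruit(string p) {
     p = "Alice" and result = "Orange" or
     p = "Alice" and result = "Apple"
   }

   int getPrice(string f) {
     f = "Apple" and result = 100 or
     f = "Apple" and result = 120 or  // hypothetical second apple price
     f = "Orange" and result = 100 or
     f = "Orange" and result = 1
   }

   language[monotonicAggregates]
   predicate mono(string p, int cost) {
     p = "Alice" and cost = sum(string f | f = getFruit(p) | getPrice(f))
   }

   from int cost
   where mono("Alice", cost)
   select cost
   order by cost

With these prices there are four ways to pick one orange price and one
apple price, so the query should report 101, 121, 200 and 220. The count
of four relies on the totals being distinct: query results are sets, so
two combinations that add up to the same total would appear as one line.
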
Charles wants to buy a banana, which is not for sale at all. In the
default case, the sum produced for Charles includes the cost of the
apple he *can* buy, but there's no line for Charles in the monotonic
``sum`` output, because there *is no way* for Charles to buy one apple
plus one banana.

(Diane buys no fruit at all, and in both variants her total cost
is 0. The ``strictsum`` aggregate would have excluded her from the
results in both cases.)

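The effect of ``strictsum`` can be sketched with a reduced version of the
example that keeps only Bob and Diane. The predicate names ``withSum`` and
``withStrictSum`` are hypothetical, and only the default semantics is shown.

.. code-block:: ql

   string getPerson() { result = "Bob" or result = "Diane" }

   string getFruit(string p) { p = "Bob" and result = "Apple" }

   int getPrice(string f) { f = "Apple" and result = 100 }

   // sum yields 0 when there is nothing to aggregate over.
   predicate withSum(string p, int cost) {
     p = getPerson() and cost = sum(string f | f = getFruit(p) | getPrice(f))
   }

   // strictsum has no result when there is nothing to aggregate over.
   predicate withStrictSum(string p, int cost) {
     p = getPerson() and cost = strictsum(string f | f = getFruit(p) | getPrice(f))
   }

   from string variant, string person, int cost
   where variant = "sum" and withSum(person, cost) or
         variant = "strictsum" and withStrictSum(person, cost)
   select variant, person, cost

Here the ``sum`` variant should list Bob with 100 and Diane with 0, while
the ``strictsum`` variant should list only Bob.
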
In actual QL practice, it is quite rare to use monotonic aggregates
with the *goal* of having multiple output lines, as in the "Alice"
case of this example. The more significant point is the "Charles"
case: as long as there's no price for bananas, no output is produced
for him. This means that if we later do learn of a banana price, we
don't need to *remove* any output tuple already produced. The
importance of this is that the monotonic aggregate behavior works well
with a fixpoint-based semantics for recursion, so it will be meaningful
to let the ``getPrice`` predicate be mutually recursive with the ``sum``
aggregate itself. (On the other hand, ``getFruit`` still cannot be
allowed to be recursive, because adding another fruit to someone's
shopping list would invalidate the total costs we already knew for
them.)

This opportunity to use recursion is the main practical reason for
requesting monotonic semantics of aggregates.

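As a rough sketch of such a recursion, suppose the price of a composite
item is the sum of the prices of its ingredients. The ``"FruitSalad"``
item and the ``getIngredient`` predicate are hypothetical and do not
appear in the example query. The point is only that ``getPrice`` is
called inside the ``sum`` aggregate that defines it, while the range of
the aggregate (``getIngredient``) is not recursive.

.. code-block:: ql

   // Hypothetical: a fruit salad is made from one apple and one orange.
   string getIngredient(string f) {
     f = "FruitSalad" and result = "Apple" or
     f = "FruitSalad" and result = "Orange"
   }

   language[monotonicAggregates]
   int getPrice(string f) {
     f = "Apple" and result = 100
     or
     f = "Orange" and result = 1
     or
     // Recursive case: the price of a composite item is the sum of the
     // prices of its ingredients.
     exists(getIngredient(f)) and
     result = sum(string i | i = getIngredient(f) | getPrice(i))
   }

   from string f, int price
   where price = getPrice(f)
   select f, price

Under these assumptions, the fruit salad should be priced at 101. If an
ingredient had no price, the monotonic semantics would produce no price
for the salad either, and a price that is learned later only adds
results without retracting any.
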
Recursive monotonic aggregates
------------------------------
