@mbauman (Member) commented May 14, 2025

This is the grand culmination of (and core motivation for) all the reductions work I've slowly been chipping away at over the past year. It implements the following major changes:

  • All reduce-based reductions use a pairwise re-association by default (a minimal sketch of the strategy follows this list). This includes seven different implementations:
    • Whole array reductions
      • IndexLinear (these were often pairwise already)
      • IndexCartesian (these were previously foldls without SIMD)
      • Iterators with Shape or Length (previously foldls without SIMD)
      • Iterators with unknown length (previously foldls without SIMD)
    • Dimensional array reductions
      • Contiguous column major dimensions (only IndexLinear were pairwise before)
      • Discontiguous column major dimensions
      • Row major dimensions
  • All of these implementations do their inner work through a single `mapreduce_kernel` function (with specific signatures for arrays and iterators) after performing the pairwise splits and/or dimensional selection. This could eventually become a public API, which should help alleviate some of the disruption to those who had been extending the internals before. See, for example, how LinearAlgebra replaced its internal and tricky _sum overloads with this method: JuliaLang/LinearAlgebra.jl@e70953e.
  • The dimensional array reductions no longer guess at initialization. This fully replaces my previous attempt in #55318 (WIP: a more principled take on dimensional reduction inits) and is at the core of what I want to address here — the guesswork was responsible for many bugs.
  • All of those implementations consistently use `init` to start every reduction chain, as we decided the documentation should direct implementations to do in #53945 (Document that reduce-like functions use init one or more times). They also consistently use `mapreduce_first`.
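
For illustration, here's a minimal sketch of the pairwise strategy (not the PR's actual code, which funnels everything through `mapreduce_kernel`; assumes a nonempty `A`):

```julia
# Minimal sketch of pairwise re-association: split the index range in half
# recursively until it falls below a block size, then fold that block
# sequentially. For float sums this bounds error growth to O(log n), and the
# independent halves expose instruction-level parallelism.
function pairwise_sum(f, A, lo, hi, blocksize=1024)
    if hi - lo < blocksize
        v = f(A[lo])
        for i in lo+1:hi
            v += f(A[i])
        end
        return v
    else
        mid = (lo + hi) >>> 1
        return pairwise_sum(f, A, lo, mid, blocksize) +
               pairwise_sum(f, A, mid + 1, hi, blocksize)
    end
end
pairwise_sum(f, A) = pairwise_sum(f, A, firstindex(A), lastindex(A))
```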

This currently stands atop the yet-to-be-merged #58374 and #55628.

As the successor to #55318, this carries all its tests and addresses all those issues in the dimensional reductions. Closes: #45566, closes #47231, closes #55213, closes #31427, closes #41054, closes #54920, closes #42259, closes #38660, closes #21097, closes #32366, closes #54875, closes #43731, closes #45562.

In addition, this further brings pairwise reassociations (and SIMD!) to the other class of reduction. Closes #29883, closes #52365, closes #30421, closes #52457, closes #39385.

This requires some changes to the stdlibs: It requires JuliaStats/Statistics.jl#172 (which fixes another four issues: JuliaStats/Statistics.jl#7, JuliaStats/Statistics.jl#126, JuliaStats/Statistics.jl#160, JuliaStats/Statistics.jl#183). This also requires changes to SparseArrays and LinearAlgebra.

There are other pull requests that this supersedes and that were inspirational for my work here. In particular, this supersedes and would close #45651 (pairwise whole-array IndexCartesian) and closes #52397 (pairwise whole-array Iterators.SizeUnknown). It also supersedes a few others: it would close #45822 (findmin/max for offset arrays; I've added those tests here) and close #58241 (my PR for the whole-array reductions init refactor that snowballed into this one).

Yet to do:

mbauman and others added 20 commits May 12, 2025 11:07
(cherry picked from commit 4bee191)
for accumulate inference

(cherry picked from commit 4fadd8f)
Previously, Julia tried to guess at initial values for empty dimensional reductions in a slightly different way than whole-array reductions. This change unifies those behaviors, such that `mapreduce_empty` is called for empty dimensional reductions, just like it is called for empty whole-array reductions.

Beyond the unification (which is valuable in its own right), this change enables some downstream simplifications to dimensional reductions and is likely to be the "most breaking" public behavior in a refactoring targeting more correctness improvements.

(cherry picked from commit 94f24b8)
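
As an illustration of the unified behavior described in the commit message above (a hedged sketch; exact values and error text may vary), both paths now consult the same `mapreduce_empty`/`reduce_empty` machinery:

```julia
A = Matrix{Int}(undef, 0, 3)

sum(A)          # 0, from reduce_empty
sum(A; dims=1)  # 1×3 Matrix of zeros, now initialized the same way

maximum(A)          # throws: reducing over an empty collection is not allowed
maximum(A; dims=1)  # now routes through mapreduce_empty too, erroring consistently
```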
they already supported axes through the default fallback that uses size to construct OneTos
and this was causing ambiguities
(cherry picked from commit 32fc16b)
@mbauman marked this pull request as draft May 14, 2025 23:03
@giordano added the `needs nanosoldier run`, `fold`, and `needs tests` labels May 14, 2025
@mbauman added the `needs news` and `needs docs` labels May 14, 2025
@tecosaur (Member) commented:
Fantastic stuff, this is really exciting to see! 🥳

```diff
-length(t::ImmutableDict) = count(Returns(true), t)
+# length is defined in terms of iteration; using a higher-order function to do the iteration
+# is likely to call `length` again, so this is a manual for loop:
+function length(t::ImmutableDict)
```
Member:

won't this be horribly slow?

@mbauman (Author) replied:

I mean, wasn't it already doing this? This shouldn't be any worse. It's definitely faster than infinite recursion :)

The trouble is that ImmutableDict has an IteratorSize of HasLength. And so the new reduction machinery happily calls length to determine the pairwise splits. So if length itself uses reductions (like count) to calculate length, we're in trouble.
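
A minimal sketch of the trap (a hypothetical wrapper, not the actual ImmutableDict code):

```julia
struct Items{T}
    itr::T
end
Base.iterate(x::Items, s...) = iterate(x.itr, s...)
Base.IteratorSize(::Type{<:Items}) = Base.HasLength()
# count() is reduce-based; with HasLength, the pairwise machinery calls
# length() to choose split points, which calls count() again: stack overflow.
Base.length(x::Items) = count(Returns(true), x)
```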

On nightly:

```julia
julia> using BenchmarkTools

julia> d = Base.ImmutableDict((i => i+1 for i in 1:200)...);

julia> @benchmark length($d)
BenchmarkTools.Trial: 10000 samples with 641 evaluations per sample.
 Range (min … max):  193.512 ns … 392.225 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     193.772 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   196.869 ns ±  11.655 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▁  ▁▃▂▄▂                                                     ▁
  ███████████▇██▇▇▆▇▆▆▆▅▅▇▆▆▅▆▅▅▆▅▅▅▅▅▅▅▄▄▅▅▅▅▂▄▃▃▄▄▄▄▃▂▂▂▃▄▂▅▄ █
  194 ns        Histogram: log(frequency) by time        241 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @eval Base function length(t::ImmutableDict)
           r = 0
           for _ in t
               r+=1
           end
           return r
       end
length (generic function with 95 methods)

julia> @benchmark length($d)
BenchmarkTools.Trial: 10000 samples with 631 evaluations per sample.
 Range (min … max):  193.542 ns … 366.482 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     193.740 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   196.829 ns ±  11.533 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █   ▁ ▄ ▂                                                     ▁
  ███████▇███▇██▇▇▇▇▆▇▆▆▅▆▆▆▇▆▆▆▅▅▅▅▅▅▄▅▆▅▄▅▅▅▅▄▄▅▄▄▄▅▅▄▄▃▄▄▃▃▄ █
  194 ns        Histogram: log(frequency) by time        238 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```

Member:
ah that's unfortunate...

@jakobnissen (Member) commented:

Worth seeing if it also closes #43501

@vtjnash (Member) commented May 27, 2025:

StaticArrays explicitly requests the fastmath version of fma here: https://github.com/JuliaArrays/StaticArrays.jl/blob/563adebc264a23d38da97bf5d9e5d3c6333bdf26/src/matrix_multiply_add.jl#L428

@mbauman (Author) commented May 27, 2025:

I have benchmarks! They're... ok. A bit of a mixed bag.

The code:

```julia
using Chairmarks: Chairmarks, @be
using CSV: CSV
using ProgressMeter: Progress, next!

struct LengthUnknown{T}
    itr::T
end
@inline Base.iterate(x::LengthUnknown) = iterate(x.itr)
@inline Base.iterate(x::LengthUnknown, s) = iterate(x.itr, s)
Base.IteratorSize(::LengthUnknown) = Base.SizeUnknown()
Base.IteratorEltype(::LengthUnknown) = Base.HasEltype()
Base.eltype(u::LengthUnknown) = Base.eltype(u.itr)

using Random: Random
const rng = Random.MersenneTwister(1155)
rand(sz...) = Random.rand(rng, sz...)
rand(::Type{T}, sz...) where {T} = Random.rand(rng, T, sz...)

haslength(x) = Iterators.take(x,length(x))
function cartesianvec(x)
    n = length(x)
    n1 = nextpow(2, sqrt(n))
    return @view reshape(x, n1, :)[begin:end, begin:end]
end

function bench_wholearray(Ns, Ts)
    results = (; style=String[], eltype=String[], init=Bool[], dims=Union{Missing,String}[], n=Int[], time=Float64[])
    P = Progress(4*4*2*length(Ts))
    for (xf,type) in [(identity, "IndexLinear"), (cartesianvec, "IndexCartesian"), (haslength, "HasLength"), (LengthUnknown, "SizeUnknown")],
        (xf,type) in [(xf, type),
                      (x->skipmissing(xf(Array{Union{Missing, eltype(x)}}(x))), "SkipMissing $type 0%"),
                      (x->skipmissing(xf(Array{Union{Missing, eltype(x)}}([rand() < .5 ? missing : x for x in x]))), "SkipMissing $type 50%"),
                      (x->skipmissing(xf(Array{Union{Missing, eltype(x)}}(fill(missing, size(x))))), "SkipMissing $type 100%"),],
        T in Ts,
        init in (nothing, zero(T))
            ts = [begin
                A = xf(rand(T, N÷sizeof(T)))
                if isempty(A) && isnothing(init) && try mapreduce(identity, +, A); false; catch; true; end
                    # Empty SkipMissings don't define reduce_empty; skip it.
                    [(;time = NaN,)]
                elseif isnothing(init)
                    @be(mapreduce(identity, +, $A))
                else
                    @be(mapreduce(identity, +, $A; init=$init))
                end
            end for N in Ns]
            append!(results.time, getproperty.(minimum.(ts), :time))
            append!(results.n, Ns)
            append!(results.dims, fill(missing, length(ts)))
            append!(results.eltype, fill(string(T), length(ts)))
            append!(results.style, fill(type, length(ts)))
            append!(results.init, fill(!isnothing(init), length(ts)))
            next!(P)
    end
    results
end

function bench_dims(Ns, Ts)
    results = (; style=String[], eltype=String[], init=Bool[], dims=Union{Missing,String}[], n=Int[], time=Float64[])
    P = Progress(2*8*2*length(Ts))
    for (xf,type) in [(identity, "IndexLinear"), (x->@view(x[1:end,1:end,1:end,1:end]), "IndexCartesian")],
        (szf, dims) in [((n,_,_)->(n÷4,4), 1), ((n,_,_)->(4,n÷4), 2),
                        ((n,n2,_)->(n2,n÷n2÷4,4), (1,2)),
                        ((n,n2,_)->(n2,4,n÷n2÷4), (1,3)),
                        ((n,n2,_)->(4,n2,n÷n2÷4), (2,3)),
                        ((n,_,n3)->(n3,n3,n÷n3÷n3÷4,4), (1,2,3)),
                        ((n,_,n3)->(n3,n3,4,n÷n3÷n3÷4), (1,2,4)),
                        ((n,_,n3)->(n3,4,n3,n÷n3÷n3÷4), (1,3,4)),
                        ((n,_,n3)->(4,n3,n3,n÷n3÷n3÷4), (2,3,4))],
        T in Ts,
        init in (nothing, zero(T))
                ts = [begin
                    n = N÷sizeof(T)
                    A = xf(rand(T, szf(n, nextpow(2, sqrt(n)), nextpow(2, cbrt(n)))))
                    if isnothing(init)
                        @be(mapreduce(identity, +, $A; dims=$dims))
                    else
                        @be(mapreduce(identity, +, $A; dims=$dims, init=$init))
                    end
                end for N in Ns]
                append!(results.time, getproperty.(minimum.(ts), :time))
                append!(results.n, Ns)
                append!(results.eltype, fill(string(T), length(ts)))
                append!(results.style, fill(type, length(ts)))
                append!(results.init, fill(!isnothing(init), length(ts)))
                append!(results.dims, fill(string(dims), length(ts)))
                next!(P)
    end
    return results
end

whole = bench_wholearray(2 .^ (3:22), (Int32,Int64,Float32,Float64))
dims = bench_dims(2 .^ (5:22), (Int32,Int64,Float32,Float64))
foreach(Base.splat(append!), zip(whole, dims))
CSV.write(Base.GIT_VERSION_INFO.commit_short * ".csv", whole)
```

This gives us 744e3fd976.csv vs c5df01807e3.csv (nightly from a few days ago).

So we can break this down over a wide variety of cases.

[figures: overall benchmark timing comparison, this PR vs. nightly]

Most of the slow cases here are the "small n":

[figures: benchmark timing comparison restricted to small n]

The variance is pretty high on SkipMissing — Julia's inference really doesn't like iteration's tuple of Union{Missing, ...}. The few cases with 10x+ slowdown are all instabilities of that sort.

@mbauman (Author) commented May 28, 2025:

OK, I've implemented specialized kernels for SkipMissing{<:AbstractArray} that mirror the old implementation. That's resolved the most egregious perf issues; none of the tested implementations are worse than a 5x regression (and quite a few are 5x+ faster). Still haven't tested Sparse perf, though; there's definitely more work to be done there.
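
The gist of the specialization, as a hedged sketch (the real kernels are more involved and slot in behind `mapreduce_kernel`):

```julia
# Index the parent array directly and branch on `missing`, rather than paying
# for iterate()'s Union-typed (value, state) tuples on every element.
function sum_skipmissing(parent::AbstractArray{Union{Missing,T}}) where {T<:Number}
    acc = zero(T)
    @inbounds for i in eachindex(parent)
        x = parent[i]
        x === missing || (acc += x)
    end
    return acc
end

sum_skipmissing(Union{Missing,Int}[1, missing, 3])  # 4
```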

Here are the latest benchmarks: 144c253358.csv. The remaining big regressions are in mid-size dimensional reductions, in particular dims=(1,2,4).

@mbauman (Author) commented May 28, 2025:

OK, awesome, I see what's happening here. When we're doing a dimensional reduction over an array with a singleton dimension, we have a choice whether that dimension should be included in the reduction — it doesn't matter whether it was included in the dims or not! It's advantageous to include it on leading dimensions (which I had already optimized) and advantageous to remove it on trailing dimensions. The biggest gaps to master were all in cases where we have a trailing singleton dimension that was flagged as being part of the reduction, e.g., sum(rand(3,2,1), dims=(1,3)). That's the same as the simpler and more efficient dims=1! Easy enough fix.

That's why it was the "mid-size" reductions like (1,2,4) that were appearing as big regressions — those are the ones that have a trailing singleton dimension! Now all the tested benchmarks are within 3x; getting better...
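
The fix amounts to normalizing dims before doing any work; a hedged sketch of the idea (hypothetical helper, not the PR's actual code):

```julia
# Reducing over a trailing singleton dimension is a no-op, so drop such dims
# from the reduction and let the cheaper, lower-dimensional method run:
function drop_trailing_singletons(A, dims)
    last_nonsingleton = something(findlast(d -> size(A, d) > 1, 1:ndims(A)), 0)
    return filter(d -> d <= last_nonsingleton, Tuple(dims))
end

drop_trailing_singletons(rand(3, 2, 1), (1, 3))  # (1,): same result, faster path
```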

30c8afbc5e.csv

[figure: benchmark timing comparison after the dims fix]

There are 260 benchmarks with a 2x or worse regression compared to c5df01807e3.csv. Most (249/260) are dimensional reductions. The whole array ones are either small (n<=64) SizeUnknown iterables or very large (n>million) entirely missing (that is, empty) SkipMissing{<:Array}. The dimensional reductions are mostly IndexCartesian (190/249), and mostly discontiguous column major like (1,3) or (1,2,4) (150/249).

@mbauman force-pushed the mb/ReduceReuse♻ branch from 00c4f51 to 4807a12 on July 3, 2025
@mbauman (Author) commented Jul 3, 2025:

OK, I must confess I'd never really processed how @simd doesn't merely re-associate things; it fundamentally changes the entire ordering, going far above and beyond what we promise reduce will do. Of course, all the SIMD-able operations are themselves (putatively) commutative, so this isn't observable outside of bit-exact floating point operations. I think it'd be very good to completely remove @simd from the implementations here — or at least get rid of it from the generic ones. Instead, we can introduce specific SIMD-like reorderings for commutative operations. This can actually lead to improved performance over @simd in cases where the data are not stored in a data structure that directly supports a SIMD gather. Explicitly encoding the re-ordering here would fully fix #29883 and #52457; mean(X, dims=1) would be exactly equal to mean(X', dims=2). And if we hard-code the re-ordering, it should completely remove architecture dependencies from the results of these fundamental operations.
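
To make that concrete, here's a minimal sketch of such a hard-coded reordering (assuming a commutative +; not the PR's actual kernel):

```julia
# Four independent accumulator chains, combined in a fixed tree at the end.
# The association order is baked into the source, so the result no longer
# depends on what the autovectorizer chooses for a given CPU.
function sum_reordered(A::AbstractVector{Float64})
    v1 = v2 = v3 = v4 = 0.0
    i = firstindex(A)
    @inbounds while i + 3 <= lastindex(A)
        v1 += A[i]; v2 += A[i+1]; v3 += A[i+2]; v4 += A[i+3]
        i += 4
    end
    @inbounds while i <= lastindex(A)  # remainder
        v1 += A[i]
        i += 1
    end
    return (v1 + v2) + (v3 + v4)
end
```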

It's not uniformly a win, however. Hard-coding such a re-ordering means that we no longer let @simd pick the best one for the current CPU and element type; this PR now just picks one that works fairly well across the board. I've definitely taken a step backwards from the previous benchmarks. The worst offenders are the containers of Any and small Unions, mostly where n (and the time itself) is very small. In terms of pure counts, there are 5124 benchmarks in my current battery (going over various sizes, array types, iterator types, dimensions, etc.). With 4807a12 (4807a12823.csv), I see:

  • 0.9x runtime (or less): 1830 benchmarks (36%)
  • 1.1x runtime (or less): 3196 benchmarks (62%)
  • 1.5x runtime (or less): 4031 benchmarks (79%)
  • 2x runtime (or less): 4579 benchmarks (89%)

A bunch of the 2x regressions are the 32-bit ones, because we're no longer saturating my M1's SIMD width. And those that are much worse than that are mostly small n and/or non-concrete. I think I can add some threshold tuning to improve some of that.

[figure: benchmarks with concrete eltypes]

[figure: benchmarks with non-concrete eltypes]

@adienes (Member) commented Jul 30, 2025:

fixes JuliaStats/Statistics.jl#183

```julia
some_exception(op) = try return (Some(op()), nothing); catch ex; return (nothing, ex); end
reduced_shape(sz, dims) = ntuple(d -> d in dims ? 1 : sz[d], length(sz))

@testset "Ensure that calling, e.g., sum(empty; dims) has the same behavior as sum(empty)" begin
```
Member:
fyi this testset is repeated from L579

```julia
@nexprs 8 N->begin
    it = iterate(itr, s)
    if it === nothing
        N > 7 && (v_7 = op(v_7, f(a_7)))
```
Member:
Suggested change:

```diff
-N > 7 && (v_7 = op(v_7, f(a_7)))
+@nexprs 7 M->((N > M) && (v_{8-M} = op(v_{8-M}, f(a_{8-M}))))
```
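
(For reference, Base.Cartesian's `@nexprs 7 M -> expr` stamps out `expr` for M = 1:7, interpolating `{…}` sub-expressions, so the suggestion expands to roughly:)

```julia
N > 1 && (v_7 = op(v_7, f(a_7)))
N > 2 && (v_6 = op(v_6, f(a_6)))
N > 3 && (v_5 = op(v_5, f(a_5)))
N > 4 && (v_4 = op(v_4, f(a_4)))
N > 5 && (v_3 = op(v_3, f(a_3)))
N > 6 && (v_2 = op(v_2, f(a_2)))
N > 7 && (v_1 = op(v_1, f(a_1)))
```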


```julia
# This special internal method must have at least 8 indices and allows passing
# optional scalar leading and trailing dimensions
function _mapreduce_kernel_commutative(f, op, A, init, inds, leading=(), trailing=())
```
Member:
potentially profitable to unroll out to `@nexprs 16` on a 64-bit platform when sizeof(T) <= 4


```julia
@deprecate (reducedim_initarray(A::Union{Base.AbstractBroadcasted, AbstractArray}, region, init, ::Type{T}) where {T}) fill!(mapreduce_similar(A,T,reduced_indices(A,region)), init) false
@deprecate reducedim_initarray(A::Union{Base.AbstractBroadcasted, AbstractArray}, region, init) fill!(mapreduce_similar(A,typeof(init),reduced_indices(A,region)), init) false
const _dep_message_reducedim_init = ", these internals have been removed. To customize the array returned by dimensional reductions, implement mapreduce_similar instead"
```
Member:
Suggested change:

```diff
-const _dep_message_reducedim_init = ", these internals have been removed. To customize the array returned by dimensional reductions, implement mapreduce_similar instead"
```

dup. on L567

```julia
mapreduce(f::F, op::G, A::AbstractArrayOrBroadcasted; init=_InitialValue(), dims=(:)) where {F,G} = mapreducedim(f, op, A, init, dims)
mapreduce(f::F, op::G, A::AbstractArrayOrBroadcasted, As::AbstractArrayOrBroadcasted...; init=_InitialValue(), dims=(:)) where {F,G} =
    reduce(op, map(f, A, As...); init, dims)
mapreducedim(f::F, op::G, A, init, ::Colon) where {F,G} = mapreduce_pairwise(f, op, A, init)
```
Member:
I wonder (suspect?) whether a smaller `pairwise_blocksize` is better when the cost of f exceeds the cost of op? E.g., to start with:

```julia
pairwise_blocksize(::typeof(sqrt), op) = 64
pairwise_blocksize(::typeof(sin), op) = 64
pairwise_blocksize(::typeof(cos), op) = 64
pairwise_blocksize(::typeof(exp), op) = 64
pairwise_blocksize(::typeof(log), op) = 64
```


```julia
# Reduction kernels that explicitly encode simplistic SIMD-ish reorderings
const CommutativeOps = Union{typeof(+),typeof(Base.add_sum),typeof(min),typeof(max),typeof(Base._extrema_rf),typeof(|),typeof(&)}
is_commutative_op(::CommutativeOps) = true
```
Member:
maybe two-arg is better, with a default?

```julia
is_commutative_op(op, ::Type) = is_commutative_op(op)
```

so we can say, e.g., that * on Float64 commutes (for the purposes of reduce) but not on Matrix or quaternions or whatever
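
(Indeed: a quick illustration of why the element type matters.)

```julia
# `*` commutes on Float64, but not on matrices:
A = [0 1; 0 0]; B = [0 0; 1 0]
A * B == B * A  # false: swapping operands in a matrix-product reduction changes the result
```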
