Skip to content

quantile is inefficient in common case of CI calculation #84

@bkamins

Description

@bkamins

Example:

julia> x=rand(10^4);

julia> @btime [quantile($x, 0.05), quantile($x, 0.95)]
  160.000 μs (5 allocations: 156.50 KiB)
2-element Vector{Float64}:
 0.04742732947951543
 0.9507927513232511

julia> @btime quantile($x, [0.05, 0.95])
  597.700 μs (4 allocations: 78.39 KiB)
2-element Vector{Float64}:
 0.04742732947951543
 0.9507927513232511

julia> x = rand(10^6);

julia> @btime quantile($x, [0.05, 0.95])
  92.449 ms (4 allocations: 7.63 MiB)
2-element Vector{Float64}:
 0.049906389147197056
 0.9498833775516955

julia> @btime [quantile($x, 0.05), quantile($x, 0.95)]
  18.751 ms (5 allocations: 15.26 MiB)
2-element Vector{Float64}:
 0.049906389147197056
 0.9498833775516955

julia> @btime quantile($x, [0.005, 0.995])
  98.948 ms (4 allocations: 7.63 MiB)
2-element Vector{Float64}:
 0.005030018622636852
 0.9949264399317231

julia> @btime [quantile($x, 0.005), quantile($x, 0.995)]
  17.697 ms (5 allocations: 15.26 MiB)
2-element Vector{Float64}:
 0.005030018622636852
 0.9949264399317231

but

julia> @btime quantile($x, [0.49, 0.51])
  12.591 ms (4 allocations: 7.63 MiB)
2-element Vector{Float64}:
 0.48882655343720904
 0.5090073220557704

julia> @btime [quantile($x, 0.49), quantile($x, 0.51)]
  22.728 ms (5 allocations: 15.26 MiB)
2-element Vector{Float64}:
 0.48882655343720904
 0.5090073220557704

The reason is that doing partial sort twice on tails if we are near the end is faster than doing one partial sort. Unfortunately this is a common case I would assume. Maybe we should introduce some optimization here?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions