DiscreteNonParametric and Categorical Construction Issue #1832

@btmit

Constructing a Categorical distribution appears to copy the p vector. I see this through profiling and @btime, and from the fact that changes are not shared between the original vector and the Categorical once it is created. There are three issues I see:

  1. The Categorical docstring includes the following: "Note: The input vector p is directly used as a field of the constructed distribution, without being copied." This seems incorrect.
  2. Performance issues in critical sections of code, where this allocation can really add up.
  3. Bugs such as the following:
using Distributions
x = rand(3,5)
x = x ./ sum(x, dims=1)  # each column is a valid probability vector
c = Categorical.(eachcol(x))

julia> c = Categorical.(eachcol(x))
ERROR: MethodError: Cannot convert an object of type Vector{Float64} to an object of type SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}

I believe the underlying issue is that the DiscreteNonParametric inner constructor sorts and reorders its inputs, which creates a copy, but the constructor does not update the type parameter to match the type of that copy.
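
To make the suspected mechanism concrete, here is a minimal sketch that does not use Distributions at all. `SortedWeights` is a hypothetical stand-in type, not the actual DiscreteNonParametric source; it only assumes the pattern described above, where the inner constructor reorders its input by indexing and then stores the result under the type parameter of the original argument.

```julia
# Hypothetical stand-in for the suspected pattern; NOT the Distributions.jl source.
struct SortedWeights{P<:AbstractVector{<:Real}}
    p::P
    function SortedWeights{P}(p::P) where {P<:AbstractVector{<:Real}}
        order = sortperm(p)
        # p[order] allocates a new Vector, even when P is a SubArray,
        # but the field is still declared with the input type P.
        new{P}(p[order])
    end
end
SortedWeights(p::P) where {P<:AbstractVector{<:Real}} = SortedWeights{P}(p)

x = rand(3, 5)

# Vector input: construction succeeds, but the stored field is a copy.
p = x[:, 1]
d = SortedWeights(p)
println(d.p === p)            # false

# View input: the reordered copy is a Vector and cannot be converted to SubArray.
v = view(x, :, 1)
err = try
    SortedWeights(v)
    nothing
catch e
    e
end
println(err isa MethodError)  # true
```

If the real constructor follows this pattern, it would account for both the extra allocation and the MethodError with eachcol, since each column is passed in as a SubArray.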
