clustering high-dimensional data? #18

@tdhock

Hi @dm13450, I am trying to get dirichletprocess working for clustering a high-dimensional data set.
For example, https://web.stanford.edu/~hastie/ElemStatLearn/datasets/zip.train.gz has 256 features.
Using dirichletprocess::DirichletProcessMvnormal would mean estimating a full 256 x 256 covariance matrix per cluster, right?
With that many parameters, inference is very slow on my computer.
One way to speed it up would be to use a constrained covariance matrix, say spherical (a single variance shared across all features).
Is that something I should implement myself,
or is there an existing/recommended way to accomplish this?
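
To make the size concern concrete, here is a rough sketch in plain Python. It is not part of dirichletprocess, and the function names are hypothetical, chosen only for illustration. It counts the free covariance parameters per cluster under each constraint and shows that a spherical Gaussian log-density needs only a squared distance, with no matrix inverse or determinant.

```python
import math


def covariance_free_parameters(d):
    """Free covariance parameters per cluster for d features (illustrative helper)."""
    return {
        "full": d * (d + 1) // 2,  # symmetric d x d matrix
        "diagonal": d,             # one variance per feature
        "spherical": 1,            # one variance shared by all features
    }


def spherical_mvn_logpdf(x, mean, sigma2):
    """Log-density of N(mean, sigma2 * I) at x, computed in O(d) time."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (d * math.log(2 * math.pi * sigma2) + sq_dist / sigma2)


counts = covariance_free_parameters(256)
for name, n in counts.items():
    print(f"{name:>9}: {n} free parameters per cluster")
```

With 256 features the full covariance has 32896 free parameters per cluster, against 256 for diagonal and 1 for spherical.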
Thanks
