Alternative faster t-digest implementation #66

@Jackmrzhou

Hi, thanks for the t-digest implementation for Python!
I use it at work and found that computing and merging t-digests eventually became the bottleneck. So I read the original paper and implemented another version using the algorithm described there, and its performance is better (around 50-100 times faster). I think the improvement comes from buffering incoming values and merging hundreds of them into the t-digest at once.
Could I open a PR to this repo that adds it as an alternative implementation? That way I could use it in my day-to-day work. Thanks.
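
To make the buffering idea concrete, here is a minimal sketch of a merging-style t-digest. It is not the implementation proposed in this issue and not this repo's API: the class name `MergingDigest`, the parameters `delta` and `buffer_size`, and the simple interpolation in `quantile` are all hypothetical choices for illustration. It assumes the arcsine scale function k1 from the t-digest paper. Values are only appended to a list on `update`, and one sort-and-sweep pass folds the whole buffer into the centroids.

```python
import math
import random


class MergingDigest:
    """Hypothetical buffered t-digest sketch (illustration only)."""

    def __init__(self, delta=100.0, buffer_size=500):
        self.delta = delta              # compression parameter
        self.buffer_size = buffer_size  # values held before one merge pass
        self._centroids = []            # (mean, weight) pairs sorted by mean
        self._buffer = []               # raw values not yet merged
        self._min = math.inf
        self._max = -math.inf

    def update(self, x):
        # Cheap path: just buffer the value; merge only when the buffer fills.
        self._buffer.append(float(x))
        self._min = min(self._min, x)
        self._max = max(self._max, x)
        if len(self._buffer) >= self.buffer_size:
            self._compress()

    def merge(self, other):
        # Treat the other digest's centroids as weighted input points.
        other._compress()
        self._centroids.extend(other._centroids)
        self._min = min(self._min, other._min)
        self._max = max(self._max, other._max)
        self._compress()

    def _q_limit(self, q0):
        # k1(q) = delta / (2*pi) * asin(2q - 1). A centroid that starts at
        # quantile q0 may grow until k1 has increased by 1.
        k = self.delta / (2 * math.pi) * math.asin(2 * q0 - 1) + 1
        if k >= self.delta / 4:
            return 1.0
        return (math.sin(2 * math.pi * k / self.delta) + 1) / 2

    def _compress(self):
        # One pass: sort centroids plus buffered values, then sweep and merge.
        items = sorted(self._centroids + [(x, 1.0) for x in self._buffer])
        self._buffer = []
        if not items:
            return
        total = sum(w for _, w in items)
        merged, done = [], 0.0
        mean, weight = items[0]
        limit = self._q_limit(0.0)
        for m, w in items[1:]:
            if (done + weight + w) / total <= limit:
                weight += w
                mean += (m - mean) * w / weight  # weighted running mean
            else:
                merged.append((mean, weight))
                done += weight
                limit = self._q_limit(done / total)
                mean, weight = m, w
        merged.append((mean, weight))
        self._centroids = merged

    def quantile(self, q):
        if self._buffer:
            self._compress()
        if not self._centroids:
            raise ValueError("digest is empty")
        total = sum(w for _, w in self._centroids)
        target = q * total
        cum, prev_mean, prev_mid = 0.0, self._min, 0.0
        for mean, w in self._centroids:
            mid = cum + w / 2  # cumulative weight at the centroid's center
            if target <= mid:
                frac = (target - prev_mid) / (mid - prev_mid)
                return prev_mean + frac * (mean - prev_mean)
            cum, prev_mean, prev_mid = cum + w, mean, mid
        frac = (target - prev_mid) / (total - prev_mid)
        return prev_mean + frac * (self._max - prev_mean)


# Usage: estimate the median of 10,000 shuffled integers.
data = list(range(10_000))
random.Random(0).shuffle(data)
digest = MergingDigest()
for value in data:
    digest.update(value)
median_estimate = digest.quantile(0.5)
```

In this sketch the per-value cost is a list append, and the sort and sweep are paid once per `buffer_size` values. That amortization is one plausible source of a speedup like the one described above, though the actual gain depends on the implementation being compared.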
