Can I use TDigest instead of QDigest? #47

@ahmadpriatama

I'm calculating quantiles as described in the LiveRamp blog post, but running the job on our production server fails with:

Caused by: java.lang.IllegalArgumentException: Can only accept values in the range 0..4611686018427387903, got 9223372036854775807
    at com.clearspring.analytics.stream.quantile.QDigest.offer(QDigest.java:125)
    at com.liveramp.cascading_ext.combiner.lib.QuantileExactAggregator.partialAggregate(QuantileExactAggregator.java:38)
    at com.liveramp.cascading_ext.combiner.lib.QuantileExactAggregator.partialAggregate(QuantileExactAggregator.java:17)
    at com.liveramp.cascading_ext.combiner.CombinerFunctionContext.combineAndEvict(CombinerFunctionContext.java:130)
    at com.liveramp.cascading_ext.combiner.CombinerFunction.operate(CombinerFunction.java:130)
    at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
    ... 11 more

tdunning said that I should use TDigest instead of QDigest, but cascading_ext depends on a stream-lib version that does not include TDigest. Any idea how I can use TDigest?
I updated the stream-lib dependency to the latest version, which includes TDigest, but apparently cascading_ext has no ExactAggregator that supports TDigest (QDigest uses QuantileExactAggregator). What should I do?
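
For reference, the upper bound in the exception message, 4611686018427387903, equals Long.MAX_VALUE / 2, and the rejected value, 9223372036854775807, is Long.MAX_VALUE itself. Below is a minimal sketch, using only the JDK, of a guard that checks whether a value falls inside that range before it would be offered to QDigest. The class and member names (QDigestRangeGuard, MAX_OFFERABLE, isOfferable) are hypothetical and are not part of stream-lib or cascading_ext.

```java
// Sketch only: QDigestRangeGuard and its members are hypothetical names,
// not part of stream-lib or cascading_ext.
final class QDigestRangeGuard {

    // Upper bound quoted in the exception message:
    // 4611686018427387903 == Long.MAX_VALUE / 2.
    static final long MAX_OFFERABLE = Long.MAX_VALUE / 2;

    // Returns true when the value lies in 0..MAX_OFFERABLE, the range the
    // exception message reports as acceptable.
    static boolean isOfferable(long value) {
        return value >= 0 && value <= MAX_OFFERABLE;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 42L, MAX_OFFERABLE, MAX_OFFERABLE + 1, Long.MAX_VALUE, -1L};
        for (long v : samples) {
            System.out.println(v + " offerable: " + isOfferable(v));
        }
    }
}
```

A check like this only identifies which input values trigger the exception (for example, a Long.MAX_VALUE sentinel in the data); it does not add TDigest support to cascading_ext.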
