
Question: Sparse heterogenous data #118

@chrisrossi


Hi, Rudiger, et al,

David Phelan has asked me to look at using this library for dClimate, and I was hoping you could help answer a question for me.

We have weather station telemetry data that has the general form:

timestamp var1 var2 var3
2023-05-18 2.1 6.5
2023-05-19 2.1 6.5
2023-05-20
2023-05-21 2.1 4.2
2023-05-22 2.1 4.2
2023-05-23 2.1 4.2 6.5

In plain English, for a specific index, we have several variables, any of which may or may not have a value.

David did point me to this older work where it's clear you were thinking about heterogenous data and missing values, but looking at the Banyan repository, it's not obvious to me how these are handled. Is this use case accounted for with Banyan, or should we be looking elsewhere?

If this use case is accounted for, what's the best way to go about it? Should we use a separate tree for each variable, with Option<T> for the value, or a single tree with a tuple value?
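
To make the two options concrete, here is a rough sketch in plain Rust. It uses the standard library's BTreeMap purely as a stand-in for a tree and does not use or assume anything about Banyan's actual API. The names Reading, PerVariable, split and sample_rows are made up for illustration, and the two sample rows are the one with no values and the one with all three values from the table above.

```rust
use std::collections::BTreeMap;

/// Single-tree option: one record per timestamp, a tuple-like value
/// in which each variable may be absent.
#[derive(Debug, Clone, PartialEq)]
struct Reading {
    var1: Option<f64>,
    var2: Option<f64>,
    var3: Option<f64>,
}

/// Per-variable option: one map per variable, each storing Option<f64>.
/// BTreeMap is only a stand-in here; this is NOT a Banyan type.
#[derive(Debug, Default)]
struct PerVariable {
    var1: BTreeMap<String, Option<f64>>,
    var2: BTreeMap<String, Option<f64>>,
    var3: BTreeMap<String, Option<f64>>,
}

/// Convert the single-map layout into the per-variable layout.
fn split(rows: &BTreeMap<String, Reading>) -> PerVariable {
    let mut out = PerVariable::default();
    for (ts, r) in rows {
        out.var1.insert(ts.clone(), r.var1);
        out.var2.insert(ts.clone(), r.var2);
        out.var3.insert(ts.clone(), r.var3);
    }
    out
}

/// Two rows from the example: one with no values, one with all three.
fn sample_rows() -> BTreeMap<String, Reading> {
    let mut rows = BTreeMap::new();
    rows.insert(
        "2023-05-20".to_string(),
        Reading { var1: None, var2: None, var3: None },
    );
    rows.insert(
        "2023-05-23".to_string(),
        Reading { var1: Some(2.1), var2: Some(4.2), var3: Some(6.5) },
    );
    rows
}

fn main() {
    let rows = sample_rows();
    let per_var = split(&rows);
    println!("combined rows: {:?}", rows);
    println!("var1 only: {:?}", per_var.var1);
}
```

Both layouts carry the same information. The difference that matters to us is whether a query over one variable has to read the other variables' values along the way.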

I'm not sure if this is related or not, but I don't have a good handle on what the "Forest" abstraction is or does. Is there any information you could provide about that?

Thank you!
Chris
