Hi, Rudiger, et al,
David Phelan has asked me to look at using this library for dClimate, and I was hoping you could help answer a question for me.
We have weather station telemetry data that has the general form:
| timestamp | var1 | var2 | var3 |
|---|---|---|---|
| 2023-05-18 | 2.1 | 6.5 | |
| 2023-05-19 | 2.1 | 6.5 | |
| 2023-05-20 | | | |
| 2023-05-21 | 2.1 | 4.2 | |
| 2023-05-22 | 2.1 | 4.2 | |
| 2023-05-23 | 2.1 | 4.2 | 6.5 |
In plain English, for a specific index, we have several variables, any of which may or may not have a value.
David did point me to this older work, where it's clear you were thinking about heterogeneous data and missing values, but looking at the Banyan repository, it's not obvious to me how these are handled. Is this use case supported by Banyan, or should we be looking elsewhere?
If it is supported, what's the best way to go about it: a separate tree for each variable, using `Option<T>` for the value, or a single tree with a tuple value?
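To make the two options concrete, here is a rough sketch of the shapes I have in mind, using the sample rows from the table above. It is plain Rust only and does not use any Banyan types or APIs; the names `TupleRow`, `VarSeries`, `sample_rows`, and `split_by_variable` are placeholders I made up for illustration, and using `f64` for the values is an assumption.

```rust
// Hypothetical sketch: plain Rust only, no Banyan types or APIs are used.

/// Single-tree layout: one entry per timestamp, with a tuple of optional values.
type TupleRow = (&'static str, (Option<f64>, Option<f64>, Option<f64>));

/// Per-variable layout: one series per variable, each value an Option.
type VarSeries = Vec<(&'static str, Option<f64>)>;

/// The sample telemetry from the table, in the single-tree (tuple) layout.
fn sample_rows() -> Vec<TupleRow> {
    vec![
        ("2023-05-18", (Some(2.1), Some(6.5), None)),
        ("2023-05-19", (Some(2.1), Some(6.5), None)),
        ("2023-05-20", (None, None, None)),
        ("2023-05-21", (Some(2.1), Some(4.2), None)),
        ("2023-05-22", (Some(2.1), Some(4.2), None)),
        ("2023-05-23", (Some(2.1), Some(4.2), Some(6.5))),
    ]
}

/// Convert the tuple layout into three per-variable series.
/// Every series keeps an entry for every timestamp, with None for a missing value.
fn split_by_variable(rows: &[TupleRow]) -> [VarSeries; 3] {
    let mut out = [Vec::new(), Vec::new(), Vec::new()];
    for &(timestamp, (var1, var2, var3)) in rows {
        let values = [var1, var2, var3];
        for (series, value) in out.iter_mut().zip(values) {
            series.push((timestamp, value));
        }
    }
    out
}

fn main() {
    let rows = sample_rows();
    let series = split_by_variable(&rows);
    for (i, s) in series.iter().enumerate() {
        // Count how many timestamps actually carry a value for this variable.
        let present = s.iter().filter(|(_, v)| v.is_some()).count();
        println!("var{}: {} of {} timestamps have a value", i + 1, present, s.len());
    }
}
```

In the tuple layout a row with no readings at all (2023-05-20) still occupies an entry, while in the per-variable layout a sparse variable like var3 carries mostly `None` values. I don't know which of these shapes, if either, fits Banyan better.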
I'm not sure if this is related or not, but I don't have a good handle on what the "Forest" abstraction is or does. Is there any information you could provide about that?
Thank you!
Chris