Question: Sparse heterogenous data #118

Hi, Rudiger, et al,

David Phelan has asked me to look at using this library for dClimate, and I was hoping you could help answer a question for me.

We have weather station telemetry data that has the general form:

| timestamp  | var1 | var2 | var3 |
|------------|------|------|------|
| 2023-05-18 |  2.1 |      |  6.5 |
| 2023-05-19 |  2.1 |      |  6.5 |
| 2023-05-20 |      |      |      |
| 2023-05-21 |  2.1 |  4.2 |      |
| 2023-05-22 |  2.1 |  4.2 |      |
| 2023-05-23 |  2.1 |  4.2 |  6.5 |

In plain English, for a specific index, we have several variables, any of which may or may not have a value.

David did point me to this [older work](https://rklaehn.github.io/2018/06/10/efficient-telemetry-storage-on-ipfs/), where it's clear you were thinking about heterogenous data and missing values, but looking at the Banyan repository, it's not obvious to me how these are handled. Does Banyan support this use case, or should we be looking elsewhere?

If this use case is supported, what's the best way to go about it? Should we use a separate tree for each variable, with `Option<T>` for the value, or a single tree with a tuple value?
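
To make the two options concrete, here is a rough sketch of the value types I have in mind. This is plain Rust and does not use any Banyan APIs; the names `Sample`, `Row`, `var1_series`, and `example_rows` are hypothetical and only illustrate the shape of the data.

```rust
// Hypothetical value types for illustration only; nothing here is a Banyan API.

/// Option A: one tree per variable, each entry holding an optional value.
#[derive(Debug, Clone, PartialEq)]
pub struct Sample {
    pub timestamp: String, // e.g. "2023-05-18"
    pub value: Option<f64>,
}

/// Option B: a single tree whose value is one row covering all variables.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub timestamp: String,
    pub var1: Option<f64>,
    pub var2: Option<f64>,
    pub var3: Option<f64>,
}

impl Row {
    /// Number of variables that actually have a value in this row.
    pub fn present(&self) -> usize {
        [self.var1, self.var2, self.var3]
            .iter()
            .filter(|v| v.is_some())
            .count()
    }
}

/// Project option-B rows onto an option-A series for `var1`.
pub fn var1_series(rows: &[Row]) -> Vec<Sample> {
    rows.iter()
        .map(|r| Sample {
            timestamp: r.timestamp.clone(),
            value: r.var1,
        })
        .collect()
}

/// Three rows taken from the table above (2023-05-18, 2023-05-20, 2023-05-21).
pub fn example_rows() -> Vec<Row> {
    vec![
        Row { timestamp: "2023-05-18".into(), var1: Some(2.1), var2: None, var3: Some(6.5) },
        Row { timestamp: "2023-05-20".into(), var1: None, var2: None, var3: None },
        Row { timestamp: "2023-05-21".into(), var1: Some(2.1), var2: Some(4.2), var3: None },
    ]
}
```

With option B every row carries a slot for every variable, even when all of them are empty (as on 2023-05-20); with option A each variable's series could alternatively skip missing timestamps altogether instead of storing `None`.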

I'm not sure if this is related or not, but I don't have a good handle on what the "Forest" abstraction is or does. Is there any information you could provide about that?

Thank you!
Chris

