Questions about generating a dataset using make_dataset.py

Hi, nice work!

I'm trying to explore the SEVIR dataset and I want to generate a dataset with make_dataset.py.

I modified make_dataset.py so that each event is split into 2 training samples.

I ran make_dataset.py and got files named nowcast_training_000.h5, nowcast_testing_000.h5, ..., nowcast_training_008.h5, nowcast_testing_008.h5, along with their corresponding xxx_META.csv files. (I left the parameter "n_chunks" at its default value of 8.)

However, I don't understand how these files relate to each other, and I have the following questions:

1. Does xxx_000.h5 contain the same data as xxx_001.h5 and the others, just in a different order, or does each file contain different data?

2. Should I use a single pair of files for training and testing (such as nowcast_training_000.h5 for training and nowcast_testing_000.h5 for testing), use all of the files, or set the parameter "append" to "True" to write the 8 chunks into 1 training file and 1 testing file?
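
For question 1, one check I could run myself is to compare event IDs across the metadata files. Below is a minimal sketch, assuming each xxx_META.csv has an `id` column that identifies the source event (I have not confirmed the column name, so it is a parameter). `read_ids` and `chunk_overlap` are hypothetical helpers of my own, not part of make_dataset.py.

```python
import csv
from itertools import combinations


def read_ids(csv_path, id_column="id"):
    """Return the set of values found in `id_column` of one metadata CSV.

    `id_column` is an assumption; change it to whatever column
    identifies an event in the xxx_META.csv files.
    """
    with open(csv_path, newline="") as f:
        return {row[id_column] for row in csv.DictReader(f)}


def chunk_overlap(csv_paths, id_column="id"):
    """Count the IDs shared by every pair of metadata files."""
    ids = {path: read_ids(path, id_column) for path in csv_paths}
    return {
        (a, b): len(ids[a] & ids[b])
        for a, b in combinations(csv_paths, 2)
    }
```

If every pair shares zero IDs, the chunks would hold different events. If every pair shares all of its IDs, the chunks would hold the same events in a different order. Because IDs are collected into sets, an event that I split into 2 samples is still counted once.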

Thanks in advance!