Results on stationary MRMS-only data #29
ValterFallenius started this conversation in Show and tell
-
Thanks loads for sharing! Another option might be to pre-train MetNet on a different dataset (e.g. the same dataset that the MetNet authors used) and then fine-tune on your dataset. I think @jacobbieker is busy uploading at least some of the relevant data. A further option might be to use a simpler model. We were recently involved in an ML competition where the task was to predict the next 2 hours of satellite imagery, and U-Nets did really well. Here's the winning ML model: https://github.com/jmather625/climatehack
-
I have finalized my model with a simpler setup than I initially planned. With 8 lead times at 15-minute spacing the model achieves something, but after rigorous testing the results are still pretty poor. See the F1 score plotted below, compared with persistence:
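For readers unfamiliar with the baseline: persistence reuses the last observed frame as the forecast for every lead time. Below is a minimal sketch of how such an F1 comparison could be computed. It assumes each frame has already been thresholded into a binary rain / no-rain mask; the function names and the zero-score convention for empty masks are illustrative assumptions, not taken from the actual evaluation code.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[bool]]  # 2-D rain / no-rain grid (assumed format)


def f1_score(pred: Mask, truth: Mask) -> float:
    """F1 over all pixels of two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    # Assumed convention: score 0.0 when neither mask contains any rain.
    return 2 * tp / denom if denom else 0.0


def persistence_f1(last_observed: Mask, targets: Sequence[Mask]) -> List[float]:
    """Score the 'nothing changes' forecast against each lead-time target."""
    return [f1_score(last_observed, target) for target in targets]


# Toy example with two lead times on a 2x2 grid.
last = [[True, True], [False, False]]
targets = [
    [[True, True], [False, False]],  # lead time 1: rain has not moved
    [[False, True], [False, True]],  # lead time 2: rain has shifted
]
print(persistence_f1(last, targets))  # [1.0, 0.5]
```

A model is only useful at a given lead time if its F1 score exceeds the persistence score computed this way for the same targets.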
I suspect I have too few training samples. MetNet used 1.7M data samples before they stopped observing overfitting. I have trained a network on 4,400 samples that does something, but it doesn't perform nearly as well as MetNet. At the beginning of my project I decided to make it easy for myself and work with a stationary model, which led to less available data.
My available data before preprocessing: 5 years × 365 days × one sample every 90 minutes (16 per day) ---> ~30,000 samples
After filtering out all samples with fewer than 5 pixels of rain in any lead time, only 4,400 samples remain.
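The filter described above could look roughly like the sketch below. It reads "in any lead time" literally, so a sample is rejected as soon as one of its target frames has fewer than 5 rainy pixels; that reading, the zero rain-rate threshold, and all names here are assumptions rather than the actual preprocessing code.

```python
from typing import Sequence

Frame = Sequence[Sequence[float]]  # rain rate per pixel for one lead time

MIN_RAIN_PIXELS = 5   # threshold quoted in the post
RAIN_THRESHOLD = 0.0  # assumption: any rate above zero counts as rain


def rain_pixels(frame: Frame) -> int:
    """Count the pixels whose rain rate exceeds RAIN_THRESHOLD."""
    return sum(1 for row in frame for value in row if value > RAIN_THRESHOLD)


def keep_sample(target_frames: Sequence[Frame]) -> bool:
    """Keep a sample only if every lead-time frame has enough rain."""
    return all(rain_pixels(f) >= MIN_RAIN_PIXELS for f in target_frames)


# Toy example: two samples with two lead times each.
wet = [[1.0, 2.0, 0.5], [0.1, 0.3, 0.0]]  # 5 rainy pixels
dry = [[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]]  # 1 rainy pixel
samples = [[wet, wet], [wet, dry]]
kept = [s for s in samples if keep_sample(s)]
print(len(kept))  # 1
```

Under the looser reading (keep a sample if at least one lead time has enough rain), `all` would be swapped for `any`, and more samples would survive.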
MetNet has an input patch of size 1024 km x 1024 km and a total coverage of 7000 km x 2500 km, which gives ~15 non-overlapping geographical locations.
MetNet's available data: 1.5 years × 365 days × one sample every 90 minutes × 15 non-overlapping geographical locations ---> ~131,000 samples
Applying the same filter, and assuming the same retention rate as in my data (4,400 of ~30,000, roughly 15%), leaves only ~20,000 samples.
Since this is far less than 1.7M, we can assume they do not use non-overlapping geographical locations; instead the patches are randomly sampled, which yields many more data points.
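The back-of-the-envelope numbers above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes one sample every 90 minutes (16 per day), and it assumes the rain filter would retain the same fraction of MetNet's samples as it did of mine. The variable names are illustrative.

```python
SAMPLES_PER_DAY = 24 * 60 // 90  # one sample every 90 minutes -> 16 per day


def raw_samples(years: float, locations: int = 1) -> int:
    """Samples before filtering: years x 365 days x samples/day x locations."""
    return round(years * 365 * SAMPLES_PER_DAY * locations)


own_raw = raw_samples(5)     # 29,200, rounded to ~30,000 in the post
retention = 4_400 / own_raw  # fraction that survived the rain filter

metnet_raw = raw_samples(1.5, locations=15)  # 131,400
metnet_kept = round(metnet_raw * retention)  # assumes the same retention rate

# Two ways to count non-overlapping 1024 km patches in 7000 km x 2500 km.
whole_tiles = (7000 // 1024) * (2500 // 1024)  # strict grid of whole tiles
area_ratio = (7000 * 2500) / (1024 * 1024)     # upper bound from area alone

print(own_raw, metnet_raw, metnet_kept, whole_tiles, round(area_ratio, 1))
```

A strict grid gives 12 whole tiles and the area ratio gives about 16.7, so the ~15 used above sits between the two. The filtered estimate comes out at 19,800, consistent with the ~20,000 quoted, and still almost two orders of magnitude short of 1.7M.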
Some examples of successes and failures:




I am contemplating implementing a non-stationary model; however, this would require time that I might not have, since my thesis is due in 1 month.