Problems with Pandas time formatting when ARM files do not start at 0000 UTC. #403

@AdamTheisen

Pandas and the dateutil parser handle ARM time incorrectly because they do not parse the units string correctly. Xarray uses the pandas Timestamp function, which I think defaults to ISO 8601 standards, which ARM does not follow. I should note that this is only an issue when we are trying to decode the time into an actual datetime in Python, which xarray does by default. The issue boils down to how we indicate the time zone. As an example, with the corkazrcfrgeM1.a1 code the units are stored as

time:units = "seconds since 2019-04-02 22:00:02 0:00" ;

however, when read in using xarray (which uses pandas) or parsed using dateutil, the output is

Timestamp('2019-04-02 00:00:00')

This puts the time back to 00 UTC and drops the 22:00:02 start time. Since most of our files start at 00 UTC, this has not been an issue, but for the wrong reasons: because the result defaults to 00 UTC, we haven’t noticed that the units string is not actually being read as it’s supposed to. If we were to put a + in front of the time zone offset in the units string, it would work, but that would require updating all of ARM’s files.

import pandas as pd

a = '2019-04-02 22:00:02+0:00'
pd.Timestamp(a)
Timestamp('2019-04-02 22:00:02+0000', tz='UTC')
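
As a possible alternative to updating the files, the sign could be added to the units string at read time, before any decoding happens. The following is only a minimal sketch of that idea using the standard library. The helper names `normalize_units` and `base_time` and the regular expression are hypothetical, not part of pandas, xarray, or any ARM tooling, and the pattern assumes the units always look like the example above (an unsigned or signed H:MM offset after the time).

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for "<unit> since YYYY-MM-DD HH:MM:SS [sign]H:MM".
# The sign is optional because the ARM example above omits it.
_UNITS_RE = re.compile(
    r"^(?P<unit>\w+) since "
    r"(?P<date>\d{4}-\d{2}-\d{2})[ T](?P<time>\d{2}:\d{2}:\d{2})"
    r"\s*(?P<sign>[+-]?)(?P<hh>\d{1,2}):(?P<mm>\d{2})$"
)


def normalize_units(units: str) -> str:
    """Return the units string with an explicit, zero-padded UTC offset."""
    match = _UNITS_RE.match(units.strip())
    if match is None:
        # Unrecognized layout: leave the string untouched.
        return units
    sign = match["sign"] or "+"  # assume a missing sign means a positive offset
    offset = f"{sign}{int(match['hh']):02d}:{match['mm']}"
    return f"{match['unit']} since {match['date']} {match['time']}{offset}"


def base_time(units: str) -> datetime:
    """Parse the reference time out of a normalized units string."""
    stamp = normalize_units(units).split(" since ", 1)[1]
    return datetime.fromisoformat(stamp)


arm_units = "seconds since 2019-04-02 22:00:02 0:00"
print(normalize_units(arm_units))
print(base_time(arm_units))
```

With xarray, one way to apply this might be to open the file with `decode_times=False`, overwrite the `units` attribute of `time` with the normalized string, and then call `xr.decode_cf` on the dataset. That has not been tested against ARM files here.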
