DISCUSSION: Datetime Alignment Standards #304

@bradyrx

CC @kpegion, @aaronspring, @ahuang11, @jmunroe, @Thomas-Moore-Creative, @dougiesquire, @jukent. Looking for insight on time stuff here if/when folks have time to think about this!

With the soon-to-be-merged https://github.com/bradyrx/climpred/pull/294, it's time to figure out some assumptions and standards for dealing with pesky datetimes when aligning initialized predictions. I want to work out a system here so we can gradually introduce PRs to handle some of these ideas.

I think we should also add a dedicated docs page on all this datetime stuff. We could draw from some of the thoughts from @jukent in https://ncar.github.io/xdev/posts/time/.

FYI I have some code snippets at https://github.com/bradyrx/esmtools/blob/master/esmtools/temporal.py that could be useful for the below cases.

Here are some thoughts/issues I think are important to discuss (I am going to break them up into separate comments):

  1. Do we assume that "monthly" units in the lead attribute align with the start of the next month? Currently we handle monthly units by calling .shift() on the cftime index with the number of lags and a frequency of 'MS' (month-start).
import cftime
import xarray as xr

# First-of-month initializations for Jan-Mar of 1995 and 1996 (1996 is a leap year).
inits = [cftime.DatetimeProlepticGregorian(y, m, 1) for y in [1995, 1996] for m in [1, 2, 3]]
inits = xr.CFTimeIndex(inits)

print(inits)
# CFTimeIndex([1995-01-01 00:00:00, 1995-02-01 00:00:00, 1995-03-01 00:00:00,
#              1996-01-01 00:00:00, 1996-02-01 00:00:00, 1996-03-01 00:00:00],
#              dtype='object')

inits.shift(1, 'MS')  # Current practice: roll forward to the next month-start
# CFTimeIndex([1995-02-01 00:00:00, 1995-03-01 00:00:00, 1995-04-01 00:00:00,
#              1996-02-01 00:00:00, 1996-03-01 00:00:00, 1996-04-01 00:00:00],
#              dtype='object')

inits.shift(1, 'M')  # Month-end? Rolls forward to the end of the same month
# CFTimeIndex([1995-01-31 00:00:00, 1995-02-28 00:00:00, 1995-03-31 00:00:00,
#              1996-01-31 00:00:00, 1996-02-29 00:00:00, 1996-03-31 00:00:00],
#              dtype='object')
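
To make the month-start behavior concrete, here is a minimal sketch of the same idea using only the standard library. It is not what .shift() does internally; `shift_month_start` is a hypothetical helper, and `datetime.date` stands in for cftime objects (so only the proleptic Gregorian calendar is covered). It contrasts month arithmetic with a naive fixed 30-day offset, which drifts off the first of the month.

```python
from datetime import date, timedelta


def shift_month_start(d, n):
    """Return the first day of the month that is n months after d's month.

    Hypothetical helper: works on month counts only, so month lengths
    (and therefore leap years) never enter the calculation.
    """
    months = d.year * 12 + (d.month - 1) + n
    return date(months // 12, months % 12 + 1, 1)


init = date(1996, 1, 1)  # leap year
for lead in (1, 2, 3):
    by_month = shift_month_start(init, lead)
    by_days = init + timedelta(days=30 * lead)  # naive fixed-length "month"
    print(lead, by_month, by_days)
```

The month-based shift always lands on the first of a month, whereas the fixed-day offset only does so by coincidence.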

We currently use the MS convention, which means we don't have to worry about leap years or calendars with and without leap days. For a monthly forecast, we simply align with the verification product on the first of the next month. Of course, this raises some issues:

  • What if the verification product labels, e.g., the January average with January 31st, as many climate models do? How do we know when to use .shift(1, 'M') instead of .shift(1, 'MS')? Do we just ask users to convert their verification product to MS?
  • What if they label with the middle of the month? Do we keep a lookup table of the number of days in each month, including leap years, so we can shift in units of days?
  • Do we accommodate time bounds? The two decisions above depend on time bounds. Is the January 31st index on a verification product the January 15th to February 15th average or the January 1st to January 31st average? (This is a common issue at any time resolution.)
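
As a starting point for discussion, the three cases above could be sketched roughly as follows. This is only an illustration under stated assumptions: all function names are hypothetical (none of this is climpred API), the standard-library `datetime` stands in for cftime objects, and `calendar.monthrange` only knows the proleptic Gregorian calendar, so noleap or 360-day calendars would need their own month-length lookup.

```python
import calendar
from datetime import datetime


def to_month_start(t):
    """Relabel any timestamp within a month to the first of that month."""
    return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def days_in_month(year, month):
    """Month length in days, leap years included (Gregorian only)."""
    return calendar.monthrange(year, month)[1]


def month_midpoint(year, month):
    """Midpoint of a month, derived from its actual length."""
    start = datetime(year, month, 1)
    end = datetime(year, month, days_in_month(year, month), 23, 59, 59)
    return bounds_midpoint(start, end.replace(hour=0, minute=0, second=0)
                           + (datetime(2000, 1, 2) - datetime(2000, 1, 1)))


def bounds_midpoint(lower, upper):
    """Midpoint of an averaging window given its time bounds [lower, upper)."""
    return lower + (upper - lower) / 2


# A month-end label is converted to the month-start convention.
print(to_month_start(datetime(1996, 1, 31)))

# Mid-month labels depend on the month length, so leap years matter.
print(month_midpoint(1995, 2), month_midpoint(1996, 2))

# The same "January 31st" label could describe either of these windows;
# only the bounds tell them apart.
print(bounds_midpoint(datetime(1996, 1, 1), datetime(1996, 2, 1)))
print(bounds_midpoint(datetime(1996, 1, 15), datetime(1996, 2, 15)))
```

If a verification product ships time bounds, deriving the label from the bounds midpoint would sidestep guessing whether its timestamps mean month-start, month-middle, or month-end; without bounds, the convention would have to be supplied by the user.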
