Description
date_time_parse_RFC_3339() is unique because it parses the RFC 3339 format, like:
2019-01-01T00:00:00Z
2019-01-01T00:00:00+0430
2019-01-01T00:00:00+04:30
Notably, it lets you parse and use the +04:30 offset info, but it's pretty strict about the format itself: you can only customize the separator and offset arguments (which default to separator = "T" and offset = "Z").
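For reference, a minimal sketch of parsing the three shapes above. It assumes the `offset` argument accepts `"%z"` for `+HHMM` and `"%Ez"` for `+HH:MM`, as described in the clock documentation; the variable names are only illustrative.

```r
library(clock)

# Default: a literal "Z" offset, i.e. the timestamp is already in UTC
utc <- date_time_parse_RFC_3339("2019-01-01T00:00:00Z")

# Numeric offsets require `offset` to match the shape of the input
basic <- date_time_parse_RFC_3339("2019-01-01T00:00:00+0430", offset = "%z")
extended <- date_time_parse_RFC_3339("2019-01-01T00:00:00+04:30", offset = "%Ez")

# Both numeric forms describe the same instant, 4.5 hours before `utc`
difftime(utc, basic, units = "hours")
difftime(utc, extended, units = "hours")
```

Anything outside these shapes, such as a timestamp without seconds, is outside what `separator` and `offset` can express.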
This is an example where the abstraction over sys-time leaks:
https://stackoverflow.com/questions/79043367/how-to-read-in-a-character-datetime-with-a-timezone-offset-in-r/79044161#79044161
2019-01-01 00:00+04:30
Notably, no seconds! We have to dip down to sys_time_parse() to parse this:
library(clock)
x <- c(
"2023-10-29 00:00+02:00",
"2023-10-29 01:00+02:00",
"2023-10-29 02:00+02:00",
"2023-10-29 02:00+01:00",
"2023-10-29 03:00+01:00",
"2023-10-29 04:00+01:00"
)
# Parse into (roughly) UTC, respecting `%Ez`, i.e. the `+HH:MM` bit
x <- sys_time_parse(
x,
format = "%Y-%m-%d %H:%M%Ez",
precision = "minute"
)
x
#> <sys_time<minute>[6]>
#> [1] "2023-10-28T22:00" "2023-10-28T23:00" "2023-10-29T00:00" "2023-10-29T01:00"
#> [5] "2023-10-29T02:00" "2023-10-29T03:00"
# Convert to POSIXct with your expected time zone
as_date_time(x, zone = "Europe/Paris")
#> [1] "2023-10-29 00:00:00 CEST" "2023-10-29 01:00:00 CEST"
#> [3] "2023-10-29 02:00:00 CEST" "2023-10-29 02:00:00 CET"
#> [5] "2023-10-29 03:00:00 CET" "2023-10-29 04:00:00 CET"
# Or UTC if you wanted that
as_date_time(x, zone = "UTC")
#> [1] "2023-10-28 22:00:00 UTC" "2023-10-28 23:00:00 UTC"
#> [3] "2023-10-29 00:00:00 UTC" "2023-10-29 01:00:00 UTC"
#> [5] "2023-10-29 02:00:00 UTC" "2023-10-29 03:00:00 UTC"

Possibly we need date_time_parse_UTC() as one more convenience parser to fill this gap. I think this has come up one other time in the past.
We'd end up with:
| | UTC offset YES | UTC offset NO |
|---|---|---|
| Full TZ name YES | date_time_parse_complete | date_time_parse_abbrev* |
| Full TZ name NO | date_time_parse_UTC | date_time_parse |
With:
- date_time_parse_abbrev() being the "UTC offset NO + Full TZ name YES" combo. That isn't exactly accurate: a full tz name like America/New_York alone is not enough to disambiguate a time that sits in the "fall back" overlap, so we don't include that case. Instead we include the abbrev case, because an abbreviation combined with the supplied zone is enough to disambiguate.
- date_time_parse_RFC_3339() being a special case of date_time_parse_UTC() that is restricted to just the common RFC format.
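To make the "fall back" point concrete, here is a small sketch using the Europe/Paris overlap from the example earlier in this issue. It assumes the default formats of `date_time_parse()`, `date_time_parse_abbrev()`, and `date_time_parse_complete()`, and that an ambiguous time is an error by default; the variable names are illustrative.

```r
library(clock)

# Full tz name only: 02:00 happens twice in Paris on 2023-10-29,
# so by default the ambiguous time is rejected
res <- try(
  date_time_parse("2023-10-29 02:00:00", zone = "Europe/Paris"),
  silent = TRUE
)
inherits(res, "try-error")

# Abbreviation + supplied zone: enough to pick one side of the overlap
cest <- date_time_parse_abbrev("2023-10-29 02:00:00 CEST", zone = "Europe/Paris")
cet <- date_time_parse_abbrev("2023-10-29 02:00:00 CET", zone = "Europe/Paris")
difftime(cet, cest, units = "hours")

# UTC offset + full tz name: also unambiguous
complete <- date_time_parse_complete("2023-10-29T02:00:00+02:00[Europe/Paris]")
as.numeric(complete) == as.numeric(cest)
```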
But then it seems like we actually cover the whole spectrum of possible formats to parse!
This table would probably be pretty useful to include in the help docs of ?date_time_parse.
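As a rough idea of what the proposed parser could look like, here is a sketch that wraps the sys_time_parse() workaround from above. The function name `date_time_parse_UTC_sketch()` and its `precision` and `zone` arguments are hypothetical and not part of clock; the real signature would need to be designed.

```r
library(clock)

# Hypothetical helper, not part of clock: parse with a user-supplied format
# that contains `%z` or `%Ez`, then return a POSIXct in `zone`
date_time_parse_UTC_sketch <- function(x, format, precision = "second", zone = "UTC") {
  sys <- sys_time_parse(x, format = format, precision = precision)
  as_date_time(sys, zone = zone)
}

# The Stack Overflow shape: no seconds, `+HH:MM` offset
out <- date_time_parse_UTC_sketch(
  "2019-01-01 00:00+04:30",
  format = "%Y-%m-%d %H:%M%Ez",
  precision = "minute"
)
out
```

Exposing `zone` is only one possible design; always returning UTC, like date_time_parse_RFC_3339() does, would be the more conservative choice.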