Commit 86f821d

update of function documentation for function coerce, concerning date format
1 parent 4af716e commit 86f821d

1 file changed

+13
-52
lines changed

R/utils.R

Lines changed: 13 additions & 52 deletions
@@ -49,60 +49,21 @@ has_star <- function(x) {
 #' @param x A character vector
 #' @param atomicclass A character string indicating the atomic class
 #' @description
-#' We assume that the date is stored as a signed integer in excel, being the
-#' number of days passed since January 1 1970. When read by read_excel it seems
-#' to be converted to the number of seconds passed since January 1 1970. This
-#' is the same as the POSIXct class in R. Hence, we can convert this number `x`
-#' to a date again by using `as.POSIXct(x)` and
-#' `format(as.POSIXct(x, tz=""), format="%Y-%m-%d")` to get a string with format
-#' YYYY-MM-DD.
-#'
-# TODO: check the previous and following statements
+#' About date conversion: read_excel() reads dates as the correct POSIXct object
+#' when the Excel field is formatted as a date. However, if the field is formatted
+#' as a number, it will be read as a numeric value. In this case, the conversion to
+#' a date object must be performed using the as.Date() function. The origin
+#' parameter must be set to "1899-12-30" to account for the fact that Excel
+#' calculates the epoch since January 1, 1900 and incorrectly assumes that 1900 was
+#' a leap year.
 #
 # ACCORDING TO Copilot:
-# ### **Analysis**:
-# 1. **Excel Date Storage**:
-# - Excel does **not** store dates as the number of days since January 1, 1970. Instead:
-# - Excel stores dates as the number of days since **January 1, 1900** (for Windows systems)
-# or **January 1, 1904** (for macOS systems).
-# - Excel also incorrectly assumes that 1900 was a leap year, which introduces an offset of
-# 1 day for dates before March 1, 1900.
-#
-# 2. **`read_excel` Behavior**:
-# - When using `readxl::read_excel`, Excel dates are typically read as numeric values representing
-# the number of days since Excel's epoch (e.g., 1900 or 1904). These values are **not
-# automatically converted to POSIXct** by `read_excel`. The user must manually convert them.
-#
-# 3. **POSIXct Conversion**:
-# - The description mentions converting the number to a date using `as.POSIXct(as.integer(x))`.
-# However:
-# - This assumes that the numeric value `x` is already in seconds since January 1, 1970, which
-# is not the case for Excel dates.
-# - To convert Excel dates to R's `POSIXct`, you need to account for Excel's epoch (e.g.,
-# subtract the appropriate offset for 1900 or 1904).
-#
-# 4. **Formatting**:
-# - The description correctly states that `format(as.POSIXct(x, tz=""), format="%Y-%m-%d")`
-# can be used to format a `POSIXct` object as a string in the `YYYY-MM-DD` format.
-#
-# ### **Corrected Description**:
-# Here’s a revised and accurate version of the description:
-#
-# Excel stores dates as numeric values representing the number of days since
-# January 1, 1900 (Windows) or January 1, 1904 (macOS). Note that Excel's 1900
-# date system incorrectly assumes 1900 was a leap year, which introduces a
-# 1-day offset for dates before March 1, 1900.
-#
-# When read using `readxl::read_excel`, Excel dates are imported as numeric
-# values. To convert these to R's `POSIXct` class, you must account for Excel's
-# epoch. For example, subtract 25569 days (the number of days between January 1,
-# 1900, and January 1, 1970) and convert to seconds by multiplying by 86400.
-#
-# Example conversion:
-# `as.POSIXct((x - 25569) * 86400, origin = "1970-01-01", tz = "")`
-#
-# To format the date as a string in `YYYY-MM-DD` format, use:
-# `format(as.POSIXct(...), format = "%Y-%m-%d")`.
+# **Excel Date Storage**:
+# - Excel does **not** store dates as the number of days since January 1, 1970. Instead:
+# - Excel stores dates as the number of days since **January 1, 1900** (for Windows systems)
+# or **January 1, 1904** (for macOS systems).
+# - Excel also incorrectly assumes that 1900 was a leap year, which introduces an offset of
+# 1 day for dates before March 1, 1900.
 #
 #' @return A vector of the specified atomic class
 #' @noRd
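
The conversion described in the new documentation can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's actual `coerce` implementation: the helper name `to_iso_date` is hypothetical, it assumes the workbook uses the 1900 date system, and it is only valid for serial numbers of 61 or more (dates from March 1, 1900 onward). The 1904 date system is not handled.

```r
# Hypothetical helper (not part of the package): returns a "YYYY-MM-DD" string
# for a value that came out of read_excel() either as POSIXct (cell formatted
# as a date) or as a number / numeric string (cell formatted as a number).
to_iso_date <- function(x) {
  if (inherits(x, "POSIXct")) {
    # Already a date-time: only formatting is needed.
    return(format(x, "%Y-%m-%d"))
  }
  # Excel serial day number in the 1900 date system. The origin 1899-12-30
  # absorbs both Excel's 1-based day count and its phantom 29 February 1900.
  format(as.Date(as.numeric(x), origin = "1899-12-30"), "%Y-%m-%d")
}

to_iso_date(45000)      # numeric serial
to_iso_date("25569")    # serial stored as a character string
to_iso_date(as.POSIXct("2023-03-15 00:00:00", tz = "UTC"))
```

Serial 25569 corresponds to 1970-01-01, which is the same offset the removed Copilot text used when it subtracted 25569 days before converting to seconds.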