Closed
Labels
Bug, Error Reporting, cov/corr
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
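import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3],
    # NaT case: use this line instead of the np.nan line below
    # "b": [pd.Timestamp("1970-01-01 00:00:00.000000001"), pd.Timestamp("1970-01-01 00:00:00.000000002"), pd.NaT],
    "b": [1, 2, np.nan],  # np.nan case
    "c": [1, 2, 3],
})
df.cov()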
Issue Description
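If I understand this correctly, both calculations should be equivalent: the two timestamps in the commented-out "b" column are 1 ns and 2 ns after the epoch, so they correspond to the values 1 and 2, and NaT marks a missing value just as np.nan does. The np.nan case returns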
     a    b    c
a  1.0  0.5  1.0
b  0.5  0.5  0.5
c  1.0  0.5  1.0
while the NaT case returns
              a             b             c
a  1.000000e+00 -4.611686e+18  1.000000e+00
b -4.611686e+18  2.835686e+37 -4.611686e+18
c  1.000000e+00 -4.611686e+18  1.000000e+00
Without any missing values, both variants return the same result.
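The magnitudes in the NaT output suggest a possible cause. This is an inference from the reported numbers, not a confirmed diagnosis of the pandas internals: the values are what you get if NaT is not skipped but enters the calculation as its underlying int64 sentinel (the minimum int64 value). A minimal NumPy-only sketch under that assumption:

```python
import numpy as np

# NaT is stored as the minimum int64 value; NumPy exposes this directly.
nat_as_int = np.datetime64("NaT").astype(np.int64)
assert nat_as_int == np.iinfo(np.int64).min

a = np.array([1.0, 2.0, 3.0])
# Assumption: the NaT entry is used as a plain number instead of being dropped.
b = np.array([1.0, 2.0, float(nat_as_int)])

cov = np.cov(a, b)   # sample covariance matrix, ddof=1 by default
cov_ab = cov[0, 1]   # compare with the reported a/b entry
var_b = cov[1, 1]    # compare with the reported b/b entry

print(f"{cov_ab:e}")
print(f"{var_b:e}")
```

Both printed values agree with the a/b and b/b entries of the NaT output to the displayed precision, which is consistent with the sentinel leaking into the computation.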
Expected Behavior
Both cases should return the same result, with NaT treated as missing in the same way as np.nan.
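For reference, the np.nan result can be checked by hand. The sketch below assumes that each pair of columns is compared using only the rows where both values are present, which matches the np.nan output in this report; `pairwise_cov` is a hypothetical helper written for this illustration, not a pandas function.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, np.nan])

def pairwise_cov(x, y):
    """Sample covariance over the rows where both x and y are present."""
    mask = ~np.isnan(x) & ~np.isnan(y)
    return np.cov(x[mask], y[mask])[0, 1]

cov_ab = pairwise_cov(a, b)  # only the first two rows are used
cov_bb = pairwise_cov(b, b)
cov_aa = pairwise_cov(a, a)  # all three rows are used

print(cov_ab, cov_bb, cov_aa)  # 0.5 0.5 1.0
```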
Installed Versions
main