comparison dataframes in assertions.py #65
Open
Labels
question (Further information is requested)
Description
DISCLAIMER
I came here without deep knowledge of data-detective. The following question is based purely on the contents of the assertions.py file
(https://github.com/Tinkoff/data-detective/blob/master/data-detective-airflow/data_detective_airflow/test_utilities/assertions.py).
The function assert_frame_equal compares two dataframes. How? By computing a scalar value for each dataframe with to_bytes and comparing the two values. This method has some limitations:
import pandas as pd
# to_bytes is the helper from the assertions.py file linked above
a = pd.DataFrame([1, 1, 2, 4])
b = pd.DataFrame([3, 2, 4, 3])
c = pd.DataFrame([2, 4])
d = pd.DataFrame([1, [1, 2], 4])
e = pd.DataFrame([{"a": 2}, {"a": 4}])
assert to_bytes(a) == to_bytes(b) == to_bytes(c) == to_bytes(d) == to_bytes(e)  # True
You probably compare dataframes of the same size or the same type, so there's little chance you'll have dataframes C, D and E in a test at the same time. But dataframes A and B comparing as equal makes one wonder.
Have you considered such cases?
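For comparison, here is a minimal sketch of an element-wise check using pandas' own pandas.testing.assert_frame_equal. I have not checked whether it fits the needs of your test utilities, and frames_differ is just a hypothetical helper name for this illustration, not something from data-detective.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_differ(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Hypothetical helper: True if pandas' element-wise check finds a difference."""
    try:
        # Compares shape, index, columns, dtypes and every value.
        assert_frame_equal(left, right)
    except AssertionError:
        return True
    return False


a = pd.DataFrame([1, 1, 2, 4])
b = pd.DataFrame([3, 2, 4, 3])
c = pd.DataFrame([2, 4])

print(frames_differ(a, b))         # True: same shape, different values
print(frames_differ(a, c))         # True: different shapes
print(frames_differ(a, a.copy()))  # False: identical content
```

With this kind of check, A and B (and A and C) are reported as different, because every cell is compared rather than a single scalar derived from the whole dataframe.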