Skip to content

Conversation

bermuchin
Copy link

Adds the DatetimeOrdinal, a new transformer to convert datetime variables
into their ordinal representation. This allows using date information as a
continuous numerical feature in machine learning models.

The transformer can automatically find datetime variables or work with a
user-defined list. It also provides an option to set a custom start_date
to calculate ordinals relative to a specific point in time, which can
help in creating more meaningful features.

Closes #818

)

Implement the new DatetimeOrdinal transformer for converting datetime
features to their ordinal representation. This commit includes the
transformer class itself and a full suite of pytest unit tests to
ensure its correctness and robustness.
This commit fixes two issues that were causing the CI checks to fail for the DatetimeOrdinal transformer.

- Corrected the docstring format in the transform() method to resolve the sphinx-build error.
- Removed trailing whitespace from a test file to pass the flake8 style check.
This commit fixes issue that were causing the CI checks to fail for the DatetimeOrdinal transformer.

- Removed trailing whitespace from a test file to pass the flake8 style checks.
This commit fixes issue that were causing the CI checks to fail for the DatetimeOrdinal transformer.

- Refactored long lines of code to resolve E501 errors reported by flake8.
@bermuchin
Copy link
Author

Hi, @solegalli. Can I get a review?

Copy link
Collaborator

@solegalli solegalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bermuchin

This is a masterpiece. I am so happy with this PR, it's almost ready to merge, I have very little to add. Thank you so much.

I made a few comments on the tests, one is a nitpick, the others are a bit more important: to test the error message. Here's an example of how we do it for other classes:

with pytest.raises(ValueError) as record:

Going through the code I notice that we are not consistent with how we test errors, but we should move towards testing that the errors match in for all our tests.

Other than that, the last bit missing is adding a user guide, and adding the new class to the readme and the main doc page: https://github.com/feature-engine/feature_engine/blob/main/docs/index.rst

Thanks a lot! I look forward to merging it!

)


def test_datetime_ordinal_no_variables_specified(df_datetime_ordinal):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nitpick: this test could be put together with test_datetime_ordinal_no_start_date and pass the variable names or None with a pytest fixture, because the test and expected results (test logic) are identical.


def test_datetime_ordinal_missing_values_raise(df_datetime_ordinal_na):
transformer = DatetimeOrdinal(missing_values="raise")
with pytest.raises(ValueError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally like testing that the error message matches the expected message as well, because sometimes it fails somewhere else and then we can't be sure.

def test_datetime_ordinal_missing_values_raise(df_datetime_ordinal_na):
transformer = DatetimeOrdinal(missing_values="raise")
with pytest.raises(ValueError):
transformer.fit(df_datetime_ordinal_na)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should also fail in transform, so we could add that test as well.



def test_datetime_ordinal_invalid_start_date():
with pytest.raises(ValueError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like testing that the error message matches the expected one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add DatetimeOrdinal

2 participants