Skip to content

DataFrame.eval error fixed #59145

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions pandas/core/computation/align.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
Core eval alignment algorithms.
"""

import pandas as pd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this import line.

from __future__ import annotations

from functools import (
Expand Down Expand Up @@ -142,10 +143,9 @@ def _align_core(terms):
)

obj = ti.reindex(reindexer, axis=axis)
# Update the term value without converting to ndarray
terms[i].update(obj)

terms[i].update(terms[i].value.values)
Copy link
Contributor

@chaoyihu chaoyihu Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this line does fix the bug, nice job! But I'm curious about potential side effects: this line of code converts terms[i].value from pd.Series to np.ndarray, and removing it might affect the behavior of terms in whatever context it is used. Are the test failures related?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure @chaoyihu I have added the test in the respective file
But here what you want me to do further in this terms[i].value ??

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should wait and see if it passes the checks after this revision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick response! I viewed your changes and left new comments.

The check failures seem to have been caused by the import line you added to the beginning of align.py - from __future__ imports must occur at the beginning of the file.

You probably should also rebase your branch on main, since your branch is significantly behind.

For now, I do not think you need to do anything further with terms[i]. Can you revise the code, do a rebase, then push again and wait to see if it passes the checks?

If it does, someone with write access will likely come to inspect and merge your code.


return typ, _zip_axes_from_type(typ, axes)


Expand Down Expand Up @@ -218,3 +218,12 @@ def reconstruct_object(typ, obj, axes, dtype):
ret_value = np.array([ret_value]).astype(res_t)

return ret_value

# Test with the DataFrame and multiline expression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the test to a proper test file? I think pandas/tests/test_expressions.py is probably the best place for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't forget to remove this block of code from align.py, since you've added it to the test file.

df = pd.DataFrame({"first": [9.76, 9.76, 9.76], "last": [9.76, 9.76, 9.76], "pre": [9.75, 9.76, 9.76]})

expr = """first_ret = first / pre.fillna(first) - 1.0
last_ret = last / pre.fillna(first) - 1.0"""

# Now this should work
df.eval(expr)
25 changes: 23 additions & 2 deletions pandas/tests/test_expressions.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
import re

import numpy as np
import pandas as pd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this line.

import pytest

from pandas import option_context
Expand Down Expand Up @@ -161,14 +162,14 @@ def test_run_arithmetic(self, request, fixture, flex, arith, monkeypatch):
],
)
@pytest.mark.parametrize("flex", [True, False])
def test_run_binary(self, request, fixture, flex, comparison_op, monkeypatch):
def test_run_binary(self, request, fixture, flex, monkeypatch):
"""
tests solely that the result is the same whether or not numexpr is
enabled. Need to test whether the function does the correct thing
elsewhere.
"""
df = request.getfixturevalue(fixture)
arith = comparison_op.__name__
arith = "add" # Use "add" operation for binary test
with option_context("compute.use_numexpr", False):
other = df + 1

Expand Down Expand Up @@ -461,3 +462,23 @@ def test_python_semantics_with_numexpr_installed(
pass
else:
assert scalar_result == expected

# New test added here
def test_multiline_expression(self):
df = pd.DataFrame({"first": [9.76, 9.76, 9.76], "last": [9.76, 9.76, 9.76], "pre": [9.75, 9.76, 9.76]})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use DataFrame instead of pd.DataFrame.


expr = """
first_ret = first / pre.fillna(first) - 1.0
last_ret = last / pre.fillna(first) - 1.0
"""

# Evaluate the multiline expression
result = expr.evaluate(expr, df, df, use_numexpr=True)

# Define expected values manually or calculate based on expectation
expected_first_ret = df['first'] / df['pre'].fillna(df['first']) - 1.0
expected_last_ret = df['last'] / df['pre'].fillna(df['first']) - 1.0
expected = pd.DataFrame({"first_ret": expected_first_ret, "last_ret": expected_last_ret})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use DataFrame instead of pd.DataFrame.


# Assert the result matches the expected values
tm.assert_frame_equal(result, expected)
Loading