Description
Since updating dependencies to allow numpy 2.0 (#986), a test in the OpenData store has been failing with an exception raised from inside pandas. See, for example, this failed test run.
The exception raised is:
    def resolve(self, key: str, is_local: bool):
        """
        Resolve a variable name in a possibly local context.

        Parameters
        ----------
        key : str
            A variable name
        is_local : bool
            Flag indicating whether the variable is local or not (prefixed with
            the '@' symbol)

        Returns
        -------
        value : object
            The value of a particular variable
        """
        try:
            # only look for locals in outer scope
            if is_local:
                return self.scope[key]

            # not a local variable so check in resolvers if we have them
            if self.has_resolvers:
                return self.resolvers[key]

            # if we're here that means that we have no locals and we also have
            # no resolvers
            assert not is_local and not self.has_resolvers
            return self.scope[key]
        except KeyError:
            try:
                # last ditch effort we look in temporaries
                # these are created when parsing indexing expressions
                # e.g., df[df > 0]
                return self.temps[key]
            except KeyError as err:
>               raise UndefinedVariableError(key, is_local) from err
E               pandas.errors.UndefinedVariableError: name 'np' is not defined

/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pandas/core/computation/scope.py:244: UndefinedVariableError
The test triggering the failure is test_update:
def test_update(s3store):
    assert len(s3store.index_data) == 2
    s3store.update(
        pd.DataFrame(
            [
                {
                    "task_id": "mp-199999",
                    "data": "asd",
                    "group": {"level_two": 4},
                    s3store.last_updated_field: datetime.utcnow(),
                }
            ]
        )
    )
I did some debugging on the resolve function in pandas (see source code here) and determined that the number in {"level_two": 4} is getting turned into a np.int64, and that the key and is_local args passed to resolve are 'np' and False, respectively.
Somewhere, pandas is getting confused and treating np as a variable name. I'm not sure how or why this is happening, but I suspect it is a pandas bug. The following may be relevant:
https://numpy.org/devdocs/numpy_2_0_migration_guide.html#windows-default-integer
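One mechanism that would produce exactly this error, offered as an assumption and not confirmed against the store's code: NumPy 2.0 changed the repr of scalars, so repr(np.int64(4)) is now 'np.int64(4)' where it used to be '4'. If a query string for DataFrame.query is assembled from the repr of such a value, pandas would try to resolve np as a name inside the expression and fail. The minimal sketch below reproduces that situation; the DataFrame, the query_raises helper, and both query strings are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd
from pandas.errors import UndefinedVariableError

df = pd.DataFrame({"level_two": [4, 5]})
value = df["level_two"].iloc[0]  # a np.int64 scalar


def query_raises(expr: str) -> bool:
    """Return True if df.query(expr) fails because a name cannot be resolved."""
    try:
        df.query(expr)
    except UndefinedVariableError:
        return True
    return False


# Hypothetical query builders; not taken from the maggma source.
via_repr = f"level_two == {value!r}"         # contains "np.int64(4)" on numpy >= 2.0
via_item = f"level_two == {value.item()!r}"  # "level_two == 4" on any numpy version

print(via_repr, "->", query_raises(via_repr))
print(via_item, "->", query_raises(via_item))

# The numpy 2.0 style repr fails regardless of the installed numpy version,
# because 'np' is not a name pandas can resolve inside the expression.
print(query_raises("level_two == np.int64(4)"))
```

If this is what is happening, the problem would lie in how the query string is built and not in pandas itself, and converting values with .item() (or str()) before formatting would likely avoid it.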
Version
latest
Which OS?
- MacOS
- Windows
- Linux