-
Notifications
You must be signed in to change notification settings - Fork 97
Closes #5155: Bug: MultiIndex .lookup() attempts illegal dtype cast for tuple keys #5156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
2303c7b to
11f3fe1
Compare
11f3fe1 to
a05d13e
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5156 +/- ##
========================================
Coverage ? 100.00%
========================================
Files ? 4
Lines ? 63
Branches ? 0
========================================
Hits ? 63
Misses ? 0
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
tests/pandas/index_test.py
Outdated
| assert mask.dtype == ak.bool_ | ||
|
|
||
| # Expect exactly the first row to match | ||
| assert mask.to_ndarray().tolist() == [True, False, False, False] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I've studied this for a bit, I'd also suggest testing a lookup where the inputs are arrays. This is what I did with the above:
>>> midx.lookup(([ak.array([1,2]),ak.array(['blue','red'])]))
array([False True True False])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't mean to do this as a 'requested changes'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
c32ebc4 to
dd535ad
Compare
PR Description: Improve
Index.lookupandMultiIndex.lookupsemanticsThis pull request refines type-handling, error messaging, and row/column matching
behavior for both
Index.containsandMultiIndex.lookup.It also adds a regression test ensuring that mixed-dtype tuple keys do not
trigger incorrect scalar casting.
The changes improve correctness, readability, and alignment with Pandas semantics.
Summary of Changes
1.
Index.lookupFile:
arkouda/pandas/index.pyboolean
pdarrayof lengthlen(self).TypeErrordescription to reflect that the function expects avalue convertible into an Arkouda array.
akint64).Motivation:
Clarifies semantics and avoids stale / unused imports.
2.
MultiIndex.lookupFile:
arkouda/pandas/index.pyMajor improvements to validation, dtype behavior, and membership logic:
Validation
listortuple.nlevelswith a clearValueError.Two explicit code paths
Per-level arkouda arrays (e.g. list of
pdarray/Strings)Delegated directly to
in1d(self.index, key)for vectorized matching.Scalar tuple keys (e.g.,
(1, "red"))This behavior aligns better with Pandas and eliminates subtle dtype bugs.
3. New Test: Mixed-dtype tuple lookup
File:
tests/pandas/index_test.pyAdded test
test_multiindex_lookup_tuple_mixed_dtypes:(1, "red"):"red"into numeric types.[True, False, False, False]for the provided example.Motivation:
Prevents regressions and captures a real-world bug scenario.
Why This Matters
MultiIndex.lookup.Backward Compatibility
Closes #5155: Bug: MultiIndex .lookup() attempts illegal dtype cast for tuple keys