fix: Map Literal <-> Python Dict conversion #6084
base: main
…instead of lists of tuples
Update tests to expect dicts for map outputs and value_counts results.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b2f73e3760
let map = PyDict::new(py);
for (key, value) in keys.to_literals().into_iter().zip(values.to_literals()) {
    map.set_item(key.into_pyobject(py)?, value.into_pyobject(py)?)?;
Preserve duplicate map keys instead of overwriting
Converting map literals to a Python dict will silently drop earlier entries when a map contains duplicate keys, because later set_item calls overwrite prior ones. Arrow map semantics allow duplicate keys and preserve all pairs, so to_pydict() now loses data for any map array or literal with repeated keys (e.g., data sourced from Arrow where a key appears twice). This is a behavioral regression versus the previous list-of-tuples representation which preserved duplicates.
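The overwrite described above can be reproduced in plain Python without the library. This is a minimal sketch: the `pairs` data is hypothetical and stands in for one map entry whose key `"a"` appears twice, and the loop mimics repeated `set_item` calls rather than calling the real conversion code.

```python
# Hypothetical key/value pairs for a single map entry with a repeated key.
pairs = [("a", 1), ("b", 2), ("a", 3)]

# Old representation: a list of tuples keeps every pair, duplicates included.
as_list = list(pairs)

# New representation: inserting into a dict one pair at a time, as the
# set_item loop does, lets the later "a" replace the earlier one.
as_dict = {}
for key, value in pairs:
    as_dict[key] = value

print(len(as_list))  # 3
print(as_dict)       # {'a': 3, 'b': 2}
```

The pair `("a", 1)` survives in the list form but is gone from the dict form, which is the data loss the review comment points at.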
Greptile Overview
Greptile Summary
This PR fixes the Python representation of Map literals by converting them to Python dicts instead of lists of tuples. The core change in The nested list test case in All changes are consistent and properly maintain the contract between Rust and Python representations of Map types.
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User as User Code
participant Literal as Literal::Map
participant PyDict as Python Dict
participant Series as Series (keys/values)
Note over User,Series: Map Literal to Python Conversion
User->>Literal: .into_pyobject(py)
activate Literal
Note over Literal: Match Self::Map { keys, values }
Literal->>Literal: Assert keys.len() == values.len()
Literal->>PyDict: PyDict::new(py)
activate PyDict
Literal->>Series: keys.to_literals()
Series-->>Literal: Iterator<Literal>
Literal->>Series: values.to_literals()
Series-->>Literal: Iterator<Literal>
loop For each (key, value) pair
Literal->>Literal: key.into_pyobject(py)?
Literal->>Literal: value.into_pyobject(py)?
Literal->>PyDict: map.set_item(key, value)?
end
PyDict-->>Literal: Python Dict Object
deactivate PyDict
Literal-->>User: Dict instead of List[Tuple]
deactivate Literal
Note over User: value_counts() now returns<br/>{"a": 2, "b": 1}<br/>instead of [("a", 2), ("b", 1)]
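The flow in the diagram can be mirrored in pure Python as a rough sketch. The function name `map_to_dict` and its list arguments are hypothetical stand-ins for the keys and values series; the actual conversion lives in the Rust code and goes through PyO3.

```python
def map_to_dict(keys, values):
    """Pair up keys and values and collect them into a dict.

    Hypothetical helper that follows the diagrammed steps: check that the
    lengths match, walk the zipped pairs, and set each item on the dict.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    result = {}
    for key, value in zip(keys, values):
        result[key] = value
    return result


print(map_to_dict(["a", "b"], [2, 1]))  # {'a': 2, 'b': 1}
```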
No files reviewed, no comments
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6084      +/-   ##
==========================================
- Coverage   72.91%   72.90%   -0.02%
==========================================
  Files         973      973
  Lines      126184   126187       +3
==========================================
- Hits        92011    91996      -15
- Misses      34173    34191      +18
Changes Made
Return map literals as Python dicts instead of lists of tuples.
Update tests to expect dicts for map outputs and value_counts results.
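For callers or tests written against the old shape, the two representations convert into each other with built-ins as long as keys are unique. The sketch below uses the `value_counts` example from the review diagram as literal data; it does not call the library, and the variable names are illustrative only.

```python
# Shapes taken from the value_counts example in this PR's review diagram.
old_result = [("a", 2), ("b", 1)]  # previous list-of-tuples form
new_result = {"a": 2, "b": 1}      # dict form returned after this change

# With unique keys the two forms carry the same information.
assert dict(old_result) == new_result
assert list(new_result.items()) == old_result

# Lookups become direct instead of scanning the list for a matching key.
count_of_a = new_result["a"]
print(count_of_a)  # 2
```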
Related Issues
#6081