Skip to content

Conversation

@w2ais
Copy link

@w2ais w2ais commented Jan 23, 2026

Changes Made

Return map literals as Python dicts instead of lists of tuples.

Update tests to expect dicts for map outputs and value_counts results.

Related Issues

#6081

…instead of

lists of tuples

Update tests to expect dicts for map outputs and value_counts results.
@w2ais w2ais changed the title Bug fix : Map Literal <-> Python Dict conversion fix: Map Literal <-> Python Dict conversion Jan 23, 2026
@github-actions github-actions bot added the fix label Jan 23, 2026
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b2f73e3760

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +201 to +203
let map = PyDict::new(py);
for (key, value) in keys.to_literals().into_iter().zip(values.to_literals()) {
map.set_item(key.into_pyobject(py)?, value.into_pyobject(py)?)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve duplicate map keys instead of overwriting

Converting map literals to a Python dict will silently drop earlier entries when a map contains duplicate keys, because later set_item calls overwrite prior ones. Arrow map semantics allow duplicate keys and preserve all pairs, so to_pydict() now loses data for any map array or literal with repeated keys (e.g., data sourced from Arrow where a key appears twice). This is a behavioral regression versus the previous list-of-tuples representation which preserved duplicates.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 23, 2026

Greptile Overview

Greptile Summary

This PR fixes the Python representation of Map literals by converting them to Python dicts instead of lists of tuples. The core change in python.rs replaces the list construction with proper dict creation using PyDict::new() and iterative set_item() calls. All test files have been updated to reflect the new dict-based output format for map types and value_counts() operations.

The nested list test case in test_list_value_counts_nested() now correctly expects a TypeError when calling to_pydict(), as lists are unhashable and cannot be used as dictionary keys in Python. This is the expected behavior when map keys are non-hashable types.

All changes are consistent and properly maintain the contract between Rust and Python representations of Map types.

Confidence Score: 5/5

  • This PR is safe to merge with no risks
  • The implementation correctly converts Map literals to Python dicts, which is the natural and expected representation. The Rust code properly iterates through key-value pairs and uses the appropriate PyDict API. All test updates are consistent with the new behavior, and the nested list test correctly identifies the TypeError for unhashable keys. The change is well-isolated, maintains API consistency, and improves user experience by returning the more Pythonic dict type.
  • No files require special attention

Important Files Changed

Filename Overview
src/daft-core/src/lit/python.rs Changed Map literal conversion to return Python dicts instead of lists of tuples, correctly implementing PyDict iteration and insertion
tests/expressions/test_expressions.py Updated test expectations from list of tuples to dicts for value_counts outputs, and correctly expects TypeError for nested lists (unhashable)
tests/recordbatch/test_from_py.py Updated map array test assertions to expect dicts instead of lists of tuples
tests/series/test_concat.py Updated map access patterns from tuple indexing to dict key access
tests/series/test_if_else.py Updated test expectations for map if-else operations to use dicts instead of lists of tuples

Sequence Diagram

sequenceDiagram
    participant User as User Code
    participant Literal as Literal::Map
    participant PyDict as Python Dict
    participant Series as Series (keys/values)
    
    Note over User,Series: Map Literal to Python Conversion
    
    User->>Literal: .into_pyobject(py)
    activate Literal
    
    Note over Literal: Match Self::Map { keys, values }
    Literal->>Literal: Assert keys.len() == values.len()
    
    Literal->>PyDict: PyDict::new(py)
    activate PyDict
    
    Literal->>Series: keys.to_literals()
    Series-->>Literal: Iterator<Literal>
    
    Literal->>Series: values.to_literals()
    Series-->>Literal: Iterator<Literal>
    
    loop For each (key, value) pair
        Literal->>Literal: key.into_pyobject(py)?
        Literal->>Literal: value.into_pyobject(py)?
        Literal->>PyDict: map.set_item(key, value)?
    end
    
    PyDict-->>Literal: Python Dict Object
    deactivate PyDict
    
    Literal-->>User: Dict instead of List[Tuple]
    deactivate Literal
    
    Note over User: value_counts() now returns<br/>{"a": 2, "b": 1}<br/>instead of [("a", 2), ("b", 1)]
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@codecov
Copy link

codecov bot commented Jan 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.90%. Comparing base (538c593) to head (b2f73e3).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6084      +/-   ##
==========================================
- Coverage   72.91%   72.90%   -0.02%     
==========================================
  Files         973      973              
  Lines      126184   126187       +3     
==========================================
- Hits        92011    91996      -15     
- Misses      34173    34191      +18     
Files with missing lines Coverage Δ
src/daft-core/src/lit/python.rs 83.82% <100.00%> (+0.09%) ⬆️

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant