Skip to content

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Jan 7, 2026

Rationale for this change

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

/// TODO(wesm): should this be a DCHECK? Or checked elsewhere

are all user-facing checks so cannot be DCHECK.

(Generated by ChatGPT)

import pyarrow as pa
import pyarrow.compute as pc

print("=" * 60)
print("PyArrow case_when Validation Examples")
print("=" * 60)

# Helper function to test case_when
def test_case_when(name, cond_array, value1, value2, line_number):
    try:
        result = pc.case_when(cond_array, value1, value2)
        print(f"✅ {name}: Success")
        return True
    except pa.ArrowInvalid as e:
        print(f"❌ {name} (line {line_number}): {e}")
        return False

# Create condition structs
valid_cond = pa.array([
    {'c1': True}, {'c1': False}
], type=pa.struct([pa.field('c1', pa.bool_())]))

invalid_cond = pa.array([
    {'c1': True}, None  # ← OUTER NULL triggers validation!
], type=pa.struct([pa.field('c1', pa.bool_())]))

print("\n1. BINARY TYPE (line 1771 - enable_if_base_binary)")
print("-" * 60)
binary1 = pa.array([b'v1', b'v1'])
binary2 = pa.array([b'v2', b'v2'])
test_case_when("Valid binary", valid_cond, binary1, binary2, 1771)
test_case_when("Invalid binary", invalid_cond, binary1, binary2, 1771)

print("\n2. LIST TYPE (line 1818 - enable_if_var_size_list)")
print("-" * 60)
list1 = pa.array([[1, 2], [3, 4]])
list2 = pa.array([[5, 6], [7, 8]])
test_case_when("Valid list", valid_cond, list1, list2, 1818)
test_case_when("Invalid list", invalid_cond, list1, list2, 1818)

print("\n3. LIST_VIEW TYPE (line 1859 - enable_if_list_view)")
print("-" * 60)
listview1 = pa.array([[1, 2], [3, 4]], type=pa.list_view(pa.int64()))
listview2 = pa.array([[5, 6], [7, 8]], type=pa.list_view(pa.int64()))
test_case_when("Valid list_view", valid_cond, listview1, listview2, 1859)
test_case_when("Invalid list_view", invalid_cond, listview1, listview2, 1859)

print("\n4. MAP TYPE (line 1900)")
print("-" * 60)
map1 = pa.array([[('k1', 100)], [('k2', 200)]], type=pa.map_(pa.string(), pa.int64()))
map2 = pa.array([[('k3', 300)], [('k4', 400)]], type=pa.map_(pa.string(), pa.int64()))
test_case_when("Valid map", valid_cond, map1, map2, 1900)
test_case_when("Invalid map", invalid_cond, map1, map2, 1900)

print("\n5. STRUCT TYPE (line 1917)")
print("-" * 60)
struct1 = pa.array([{'x': 1}, {'x': 2}], type=pa.struct([pa.field('x', pa.int64())]))
struct2 = pa.array([{'x': 3}, {'x': 4}], type=pa.struct([pa.field('x', pa.int64())]))
test_case_when("Valid struct", valid_cond, struct1, struct2, 1917)
test_case_when("Invalid struct", invalid_cond, struct1, struct2, 1917)

print("\n6. FIXED_SIZE_LIST TYPE (line 1934)")
print("-" * 60)
fsl1 = pa.array([[1, 2], [3, 4]], type=pa.list_(pa.int64(), 2))
fsl2 = pa.array([[5, 6], [7, 8]], type=pa.list_(pa.int64(), 2))
test_case_when("Valid fixed_size_list", valid_cond, fsl1, fsl2, 1934)
test_case_when("Invalid fixed_size_list", invalid_cond, fsl1, fsl2, 1934)

print("\n7. DICTIONARY TYPE (line 1974)")
print("-" * 60)
dict1 = pa.array(['a', 'b']).dictionary_encode()
dict2 = pa.array(['c', 'd']).dictionary_encode()
test_case_when("Valid dictionary", valid_cond, dict1, dict2, 1974)
test_case_when("Invalid dictionary", invalid_cond, dict1, dict2, 1974)

print("\n" + "=" * 60)
print("KEY POINT: Outer nulls (None in struct) trigger validation")
print("           Inner nulls ({'c1': None}) are allowed")
print("=" * 60)

What changes are included in this PR?

This PR removes obsolete TODOs. They cannot be DCHECK.

Are these changes tested?

Yes, as described above.

Are there any user-facing changes?

No, dev-only.

@github-actions
Copy link

github-actions bot commented Jan 7, 2026

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@HyukjinKwon HyukjinKwon changed the title [Minor][C++][Compute] Remove resolved TODO comments in case_when null validation MINOR: [C++][Compute] Remove resolved TODO comments in case_when null validation Jan 7, 2026
@HyukjinKwon HyukjinKwon force-pushed the minor-remove-todos branch 2 times, most recently from a8f2b4a to 90cfa99 Compare January 7, 2026 04:27
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@kou kou merged commit 5b1af01 into apache:main Jan 8, 2026
46 checks passed
@kou kou removed the awaiting review Awaiting review label Jan 8, 2026
@github-actions github-actions bot added the awaiting merge Awaiting merge label Jan 8, 2026
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 5b1af01.

There weren't enough matching historic benchmark results to make a call on whether there were regressions.

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants