
Commit 0c171a4

Remove Column docstring section about Arrow's null dtype (#280)
1 parent 25e5a52 commit 0c171a4

1 file changed: +0 -14 lines changed

protocol/dataframe_protocol.py

Lines changed: 0 additions & 14 deletions
@@ -179,20 +179,6 @@ class Column(ABC):
 and an offsets buffer (if variable-size binary; e.g., variable-length
 strings).
 
-TBD: Arrow has a separate "null" dtype, and has no separate mask concept.
-Instead, it seems to use "children" for both columns with a bit mask,
-and for nested dtypes. Unclear whether this is elegant or confusing.
-This design requires checking the null representation explicitly.
-
-The Arrow design requires checking:
-1. the ARROW_FLAG_NULLABLE (for sentinel values)
-2. if a column has two children, combined with one of those children
-having a null dtype.
-
-Making the mask concept explicit seems useful. One null dtype would
-not be enough to cover both bit and byte masks, so that would mean
-even more checking if we did it the Arrow way.
-
 TBD: there's also the "chunk" concept here, which is implicit in Arrow as
 multiple buffers per array (= column here). Semantically it may make
 sense to have both: chunks were meant for example for lazy evaluation
