@@ -53,3 +53,34 @@ respectively, and '_id' that is an `ObjectId`, your schema can be defined as::

Unsupported data types in a schema cause a ``ValueError`` identifying the
field and its data type.
+
+ Null Values and Conversion to Pandas DataFrames
+ -----------------------------------------------
+
+ In Arrow, all arrays are nullable.
+ Pandas has experimental nullable data types, e.g., "Int64" (note the capital "I").
+ You can instruct Arrow to create a pandas DataFrame using nullable dtypes
+ with the code below (taken from `Arrow's pandas integration documentation <https://arrow.apache.org/docs/python/pandas.html>`_):
+
+ .. code-block:: pycon
+
+    >>> import pandas as pd
+    >>> import pyarrow as pa
+    >>> # arrow_table is assumed to be an existing pyarrow.Table
+    >>> dtype_mapping = {
+    ...     pa.int8(): pd.Int8Dtype(),
+    ...     pa.int16(): pd.Int16Dtype(),
+    ...     pa.int32(): pd.Int32Dtype(),
+    ...     pa.int64(): pd.Int64Dtype(),
+    ...     pa.uint8(): pd.UInt8Dtype(),
+    ...     pa.uint16(): pd.UInt16Dtype(),
+    ...     pa.uint32(): pd.UInt32Dtype(),
+    ...     pa.uint64(): pd.UInt64Dtype(),
+    ...     pa.bool_(): pd.BooleanDtype(),
+    ...     pa.float32(): pd.Float32Dtype(),
+    ...     pa.float64(): pd.Float64Dtype(),
+    ...     pa.string(): pd.StringDtype(),
+    ... }
+    >>> df = arrow_table.to_pandas(
+    ...     types_mapper=dtype_mapping.get, split_blocks=True, self_destruct=True
+    ... )
+    >>> del arrow_table
+
+ Defining a conversion for ``pa.string()`` additionally converts Arrow strings to the pandas ``string`` dtype instead of ``object``.