@@ -110,14 +110,18 @@ Or to iterate:
Data Frame Type Mapping
-----------------------

+ Default Data Frame Type Mapping
+ +++++++++++++++++++++++++++++++
+
Internally, python-oracledb's :ref:`DataFrame <oracledataframeobj>` support
makes use of `Apache nanoarrow <https://arrow.apache.org/nanoarrow/>`__
libraries to build data frames.

- The following data type mapping occurs from Oracle Database types to the Arrow
- types used in python-oracledb DataFrame objects. Querying any other data types
- from Oracle Database will result in an exception. :ref:`Output type handlers
- <outputtypehandlers>` cannot be used to map data types.
+ When querying, the following default data type mapping occurs from Oracle
+ Database types to the Arrow types used in python-oracledb DataFrame
+ objects. Querying any other data types from Oracle Database will result in an
+ exception. :ref:`Output type handlers <outputtypehandlers>` cannot be used to
+ map data types.

.. list-table-with-summary:: Mapping from Oracle Database to Arrow data types
    :header-rows: 1
@@ -258,6 +262,99 @@ When converting Oracle Database DATEs and TIMESTAMPs:
    * - 7 - 9
      - nanoseconds

+ Explicit Data Frame Type Mapping
+ ++++++++++++++++++++++++++++++++
+
+ You can explicitly set the data types and names that a :ref:`DataFrame
+ <oracledataframeobj>` will use for query results. This provides fine-grained
+ control over the physical data representation of the resulting Arrow arrays,
+ letting you choose a representation that is more efficient for your specific
+ use case. This can reduce memory consumption and improve processing speed.
+
+ The ``requested_schema`` parameter to
+ :meth:`Connection.fetch_df_all()`, :meth:`Connection.fetch_df_batches()`,
+ :meth:`AsyncConnection.fetch_df_all()`, or
+ :meth:`AsyncConnection.fetch_df_batches()` should be an object implementing the
+ `Arrow PyCapsule schema interface
+ <https://arrow.apache.org/docs/python/generated/pyarrow.Schema.html>`__.
+
+ For example, the ``pyarrow.schema()`` factory function can be used to create a
+ new schema. This takes a list of field definitions as input. Each field can be
+ a tuple of ``(name, DataType)``:
+
+ .. code-block:: python
+
+     import pyarrow
+
+     # Default fetch
+
+     odf = connection.fetch_df_all(
+         "select 123 c1, 'Scott' c2 from dual"
+     )
+     tab = pyarrow.table(odf)
+     print("Default Output:", tab)
+
+     # Fetching with an explicit schema
+
+     schema = pyarrow.schema([
+         ("col_1", pyarrow.int16()),
+         ("C2", pyarrow.string())
+     ])
+
+     odf = connection.fetch_df_all(
+         "select 456 c1, 'King' c2 from dual",
+         requested_schema=schema
+     )
+     tab = pyarrow.table(odf)
+     print("\nNew Output:", tab)
+
+ The schema should have an entry for each queried column.
+
+ Running the example shows that the number column with the explicit schema was
+ fetched into the requested type INT16. Its name has also changed::
+
+     Default Output: pyarrow.Table
+     C1: double
+     C2: string
+     ----
+     C1: [[123]]
+     C2: [["Scott"]]
+
+     New Output: pyarrow.Table
+     col_1: int16
+     C2: string
+     ----
+     col_1: [[456]]
+     C2: [["King"]]
+
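The same ``requested_schema`` parameter is accepted by
``Connection.fetch_df_batches()``. The following is a minimal sketch of that
usage, not part of python-oracledb itself: the helper names ``build_schema``
and ``fetch_batches_with_schema``, the default batch ``size`` of 1000, and the
query and column names in the usage comment are all hypothetical. It assumes
an already-open ``connection`` and that ``pyarrow`` is installed.

```python
def build_schema(fields):
    """Build a pyarrow schema from an iterable of (name, DataType) tuples.

    Hypothetical helper: it only wraps pyarrow.schema().
    """
    import pyarrow  # imported lazily; only needed when a schema is built

    return pyarrow.schema(list(fields))


def fetch_batches_with_schema(connection, statement, schema, size=1000):
    """Yield one pyarrow.Table per fetched batch, using an explicit schema.

    `connection` is assumed to be an open python-oracledb connection, and
    `schema` must have one entry for each queried column.
    """
    import pyarrow

    for odf in connection.fetch_df_batches(
        statement, size=size, requested_schema=schema
    ):
        yield pyarrow.table(odf)


# Example usage (needs a live database connection, so it is left commented):
#
#   import pyarrow
#   schema = build_schema([
#       ("col_1", pyarrow.int16()),
#       ("C2", pyarrow.string()),
#   ])
#   query = "select 456 c1, 'King' c2 from dual"
#   for tab in fetch_batches_with_schema(connection, query, schema):
#       print(tab)
```

Keeping the schema construction in its own function makes it easy to reuse one
schema across ``fetch_df_all()`` and ``fetch_df_batches()`` calls.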
+ **Supported Explicit Type Mapping**
+
+ The following table shows the explicit type mappings that are supported. An
+ error will occur if the database type or the data cannot be represented in the
+ requested schema type.
+
+ .. list-table-with-summary::
+     :header-rows: 1
+     :class: wy-table-responsive
+     :widths: 1 1
+     :align: left
+     :summary: The first column is the Oracle Database data type. The second column shows supported Arrow data types.
+
+     * - Oracle Database Type
+       - Arrow Data Types
+     * - DB_TYPE_NUMBER
+       - INT8, INT16, INT32, INT64, UINT8, UINT16, UINT32, UINT64, DECIMAL128(p, s), DOUBLE, FLOAT
+     * - DB_TYPE_RAW, DB_TYPE_LONG_RAW
+       - BINARY, FIXED SIZE BINARY, LARGE BINARY
+     * - DB_TYPE_BOOLEAN
+       - BOOLEAN
+     * - DB_TYPE_DATE, DB_TYPE_TIMESTAMP, DB_TYPE_TIMESTAMP_LTZ, DB_TYPE_TIMESTAMP_TZ
+       - DATE32, DATE64, TIMESTAMP
+     * - DB_TYPE_BINARY_DOUBLE, DB_TYPE_BINARY_FLOAT
+       - DOUBLE, FLOAT
+     * - DB_TYPE_VARCHAR, DB_TYPE_CHAR, DB_TYPE_LONG, DB_TYPE_NVARCHAR, DB_TYPE_NCHAR, DB_TYPE_LONG_NVARCHAR
+       - STRING, LARGE_STRING
+
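To make the table above easier to apply, the sketch below transcribes it into
a plain Python lookup and checks a planned mapping before any query is run.
This is an illustration only, not part of python-oracledb: the names
``SUPPORTED_MAPPINGS``, ``is_supported``, and ``unsupported_columns`` are
hypothetical, type names are compared as the strings used in the table (with
``DECIMAL128(p, s)`` shortened to ``DECIMAL128``), and the check covers type
families only, not whether the actual data fits the requested type.

```python
# Supported explicit mappings, transcribed from the table above.
# Keys are Oracle Database type names; values are sets of Arrow type names.
_NUMBER = frozenset({
    "INT8", "INT16", "INT32", "INT64",
    "UINT8", "UINT16", "UINT32", "UINT64",
    "DECIMAL128", "DOUBLE", "FLOAT",
})
_BINARY = frozenset({"BINARY", "FIXED SIZE BINARY", "LARGE BINARY"})
_TEMPORAL = frozenset({"DATE32", "DATE64", "TIMESTAMP"})
_FLOATING = frozenset({"DOUBLE", "FLOAT"})
_STRING = frozenset({"STRING", "LARGE_STRING"})

SUPPORTED_MAPPINGS = {
    "DB_TYPE_NUMBER": _NUMBER,
    "DB_TYPE_RAW": _BINARY,
    "DB_TYPE_LONG_RAW": _BINARY,
    "DB_TYPE_BOOLEAN": frozenset({"BOOLEAN"}),
    "DB_TYPE_DATE": _TEMPORAL,
    "DB_TYPE_TIMESTAMP": _TEMPORAL,
    "DB_TYPE_TIMESTAMP_LTZ": _TEMPORAL,
    "DB_TYPE_TIMESTAMP_TZ": _TEMPORAL,
    "DB_TYPE_BINARY_DOUBLE": _FLOATING,
    "DB_TYPE_BINARY_FLOAT": _FLOATING,
    "DB_TYPE_VARCHAR": _STRING,
    "DB_TYPE_CHAR": _STRING,
    "DB_TYPE_LONG": _STRING,
    "DB_TYPE_NVARCHAR": _STRING,
    "DB_TYPE_NCHAR": _STRING,
    "DB_TYPE_LONG_NVARCHAR": _STRING,
}


def is_supported(db_type, arrow_type):
    """Return True if the table lists arrow_type for db_type."""
    return arrow_type in SUPPORTED_MAPPINGS.get(db_type, frozenset())


def unsupported_columns(requested):
    """Return (column, db_type, arrow_type) triples the table does not list.

    `requested` maps a column name to a (db_type, arrow_type) pair.
    """
    return [
        (column, db_type, arrow_type)
        for column, (db_type, arrow_type) in requested.items()
        if not is_supported(db_type, arrow_type)
    ]
```

For instance, ``is_supported("DB_TYPE_NUMBER", "INT16")`` is true, while a
request to fetch a ``DB_TYPE_VARCHAR`` column as ``BOOLEAN`` would be reported
by ``unsupported_columns()``.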
.. _convertingodf:

Converting python-oracledb's DataFrame to Other Data Frames