Commit 5cbd0ef

[Docs] Enhance transforming-data documentation with Polars operations section
Signed-off-by: peterxcli <peterxcli@gmail.com>
1 parent a5d3e3c commit 5cbd0ef

1 file changed: +15 -2 lines changed

doc/source/data/transforming-data.rst

Lines changed: 15 additions & 2 deletions
@@ -212,7 +212,7 @@ In this case, your function would look like:
         # yield the same batch multiple times
         for _ in range(10):
             yield batch
-
+
 Choosing the right batch format
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -249,6 +249,19 @@ program might run into out-of-memory (OOM) errors.
 
 If you encounter an OOM error, try decreasing your ``batch_size``.
 
+Enabling Polars operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can enable Polars globally to optimize certain Ray Data operations. When enabled, Ray Data uses Polars internally to speed up those operations.
+
+To enable Polars sorting, set ``use_polars_sort`` on the current :class:`~ray.data.DataContext`:
+
+.. testcode::
+    ctx = ray.data.DataContext.get_current()
+    ctx.use_polars_sort = True
+
+When you enable this flag, Ray Data automatically uses Polars for sorting operations on tabular datasets, which can significantly improve performance for certain workloads. This setting doesn't affect your UDF code. You can still use any batch format in :meth:`~ray.data.Dataset.map_batches`.
+
 
 .. _stateful_transforms:
 
@@ -365,7 +378,7 @@ You can read more about resources in Ray here: :ref:`resource-requirements`.
     :hide:
 
     import ray
-
+
     ds = ray.data.range(1)
 
 .. testcode::
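
The ``use_polars_sort`` flag that this commit documents is set on the process-wide ``DataContext``, so it stays on for every later sort. The sketch below shows one possible way to scope it to a single block of code. ``polars_sort_enabled`` and ``sort_items_with_polars`` are hypothetical helpers, not part of Ray. The example assumes that both ``ray`` and ``polars`` are installed and that creating and executing the dataset inside the block is enough for the flag to take effect.

```python
from contextlib import contextmanager


@contextmanager
def polars_sort_enabled(ctx):
    """Temporarily set ``ctx.use_polars_sort`` to True (hypothetical helper).

    ``ctx`` is expected to be the object returned by
    ``ray.data.DataContext.get_current()``.
    """
    previous = ctx.use_polars_sort
    ctx.use_polars_sort = True
    try:
        yield ctx
    finally:
        # Restore whatever value was set before entering the block.
        ctx.use_polars_sort = previous


def sort_items_with_polars():
    """Sort a tiny tabular dataset with the flag on (needs ray and polars)."""
    import ray  # imported lazily so the helper above has no hard dependency

    with polars_sort_enabled(ray.data.DataContext.get_current()):
        ds = ray.data.from_items([{"x": 3}, {"x": 1}, {"x": 2}])
        # Execute inside the block so the flag is active while the sort runs.
        return ds.sort("x").take_all()
```

Calling ``sort_items_with_polars()`` runs the sort with the flag enabled and leaves the context as it was afterwards. If you want Polars sorting everywhere, setting the flag once, as in the documentation change above, is simpler.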
