16 changes: 9 additions & 7 deletions doc/source/user_guide/indexing.rst
@@ -20,15 +20,17 @@ this area.

.. note::

The Python and NumPy indexing operators ``[]`` and attribute operator ``.``
provide quick and easy access to pandas data structures across a wide range
of use cases. This makes interactive work intuitive, as there's little new
to learn if you already know how to deal with Python dictionaries and NumPy
The Python and NumPy indexing operators ``[]`` and the attribute operator ``.``
provide quick and easy access to pandas data structures across a wide range of
use cases. This makes interactive work intuitive, as there's little new to
learn if you already know how to deal with Python dictionaries and NumPy
arrays. However, since the type of the data to be accessed isn't known in
advance, directly using standard operators has some optimization limits. For
production code, we recommended that you take advantage of the optimized
pandas data access methods exposed in this chapter.
advance, directly using these standard operators has some optimization limits.

For performance-critical or production code, we recommend using the optimized
pandas data access methods (such as ``.loc`` and ``.iloc``) described in this
chapter.
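
As a minimal sketch of the recommendation above, the following compares plain
``[]`` access with ``.loc`` (label-based) and ``.iloc`` (position-based). The
frame, its column names, and its index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical example frame; names and values are illustrative only.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]},
    index=["x", "y", "z"],
)

# Quick, interactive-style column access with [].
col = df["a"]

# Explicit label-based access: row "y", column "b".
by_label = df.loc["y", "b"]

# Explicit position-based access: second row, second column.
by_pos = df.iloc[1, 1]

# Boolean mask plus column selection in a single .loc call.
subset = df.loc[df["a"] > 1, ["b"]]
```

Because ``.loc`` and ``.iloc`` state whether a key is a label or a position,
the intent stays unambiguous even when the index itself contains integers.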

See the :ref:`MultiIndex / Advanced Indexing <advanced>` for ``MultiIndex`` and more advanced indexing documentation.

See the :ref:`cookbook<cookbook.selection>` for some advanced strategies.
35 changes: 35 additions & 0 deletions doc/source/user_guide/text.rst
@@ -91,6 +91,41 @@ See :ref:`text.four_string_variants` section below for details.

.. _text.string_methods:

String storage: pyarrow vs python
---------------------------------

pandas supports different storage backends for string data.
Depending on the configuration and installed dependencies,
string data may be stored using either a Python object-based
implementation or a pyarrow-backed implementation.

In general, the pyarrow-backed string storage is recommended
for most users, as it provides better performance and a more
compact memory representation.

**pyarrow-backed string storage**

- Pros:

  - More compact memory footprint
  - Faster vectorized string operations

- Cons:

  - The underlying Arrow arrays are immutable, so modifying values creates a
    new array
  - Some edge-case behavior differences compared to Python strings

**Python object string storage**

- Pros:

  - Stores Python ``str`` objects in a NumPy object array, which can be
    modified in place
  - Behavior consistent with standard Python string semantics

- Cons:

  - Higher memory usage
  - Slower string operations, since each element is processed as a separate
    Python object

While pandas aims to provide identical results regardless of
the underlying string storage, some behavior differences may
exist in edge cases (for example, certain Unicode operations).
These differences are documented where relevant.
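
The storage backend can also be requested explicitly. The sketch below assumes
``pd.StringDtype(storage=...)`` is available in the installed pandas version
and treats pyarrow as optional; the sample data is made up for illustration.

```python
import pandas as pd

# Illustrative sample data, including a missing value.
data = ["apple", "banana", None]

# Python-object-backed string storage.
s_py = pd.Series(data, dtype=pd.StringDtype(storage="python"))

# pyarrow-backed string storage, only if a usable pyarrow is installed.
try:
    s_pa = pd.Series(data, dtype=pd.StringDtype(storage="pyarrow"))
except ImportError:
    s_pa = None  # pyarrow is missing or too old for this pandas version

# The same .str accessor is used regardless of the storage backend.
upper_py = s_py.str.upper()
upper_pa = s_pa.str.upper() if s_pa is not None else None

# Deep memory usage gives a rough way to compare the two backends.
mem_py = s_py.memory_usage(deep=True)
mem_pa = s_pa.memory_usage(deep=True) if s_pa is not None else None
```

Exact memory figures depend on the data and on the pandas and pyarrow
versions, so ``mem_py`` and ``mem_pa`` are best compared on your own data.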

String methods
==============
