diff --git a/docs/source/api/dataframe.rst b/docs/source/api/dataframe.rst
deleted file mode 100644
index a9e9e47c8..000000000
--- a/docs/source/api/dataframe.rst
+++ /dev/null
@@ -1,387 +0,0 @@
-.. Licensed to the Apache Software Foundation (ASF) under one
-.. or more contributor license agreements. See the NOTICE file
-.. distributed with this work for additional information
-.. regarding copyright ownership. The ASF licenses this file
-.. to you under the Apache License, Version 2.0 (the
-.. "License"); you may not use this file except in compliance
-.. with the License. You may obtain a copy of the License at
-
-.. http://www.apache.org/licenses/LICENSE-2.0
-
-.. Unless required by applicable law or agreed to in writing,
-.. software distributed under the License is distributed on an
-.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-.. KIND, either express or implied. See the License for the
-.. specific language governing permissions and limitations
-.. under the License.
-
-=================
-DataFrame API
-=================
-
-Overview
---------
-
-The ``DataFrame`` class is the core abstraction in DataFusion that represents tabular data and operations
-on that data. DataFrames provide a flexible API for transforming data through various operations such as
-filtering, projection, aggregation, joining, and more.
-
-A DataFrame represents a logical plan that is lazily evaluated. The actual execution occurs only when
-terminal operations like ``collect()``, ``show()``, or ``to_pandas()`` are called.
-
-Creating DataFrames
--------------------
-
-DataFrames can be created in several ways:
-
-* From SQL queries via a ``SessionContext``:
-
- .. code-block:: python
-
- from datafusion import SessionContext
-
- ctx = SessionContext()
- df = ctx.sql("SELECT * FROM your_table")
-
-* From registered tables:
-
- .. code-block:: python
-
- df = ctx.table("your_table")
-
-* From various data sources:
-
- .. code-block:: python
-
- # From CSV files (see :ref:`io_csv` for detailed options)
- df = ctx.read_csv("path/to/data.csv")
-
- # From Parquet files (see :ref:`io_parquet` for detailed options)
- df = ctx.read_parquet("path/to/data.parquet")
-
- # From JSON files (see :ref:`io_json` for detailed options)
- df = ctx.read_json("path/to/data.json")
-
- # From Avro files (see :ref:`io_avro` for detailed options)
- df = ctx.read_avro("path/to/data.avro")
-
- # From Pandas DataFrame
- import pandas as pd
- pandas_df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
- df = ctx.from_pandas(pandas_df)
-
- # From Arrow data
- import pyarrow as pa
- batch = pa.RecordBatch.from_arrays(
- [pa.array([1, 2, 3]), pa.array([4, 5, 6])],
- names=["a", "b"]
- )
- df = ctx.from_arrow(batch)
-
- For detailed information about reading from different data sources, see the :doc:`I/O Guide <../user-guide/io/index>`.
- For custom data sources, see :ref:`io_custom_table_provider`.
-
-Common DataFrame Operations
----------------------------
-
-DataFusion's DataFrame API offers a wide range of operations:
-
-.. code-block:: python
-
- from datafusion import column, literal
-
- # Select specific columns
- df = df.select("col1", "col2")
-
- # Select with expressions
- df = df.select(column("a") + column("b"), column("a") - column("b"))
-
- # Filter rows
- df = df.filter(column("age") > literal(25))
-
- # Add computed columns
- df = df.with_column("full_name", column("first_name") + literal(" ") + column("last_name"))
-
- # Multiple column additions
- df = df.with_columns(
- (column("a") + column("b")).alias("sum"),
- (column("a") * column("b")).alias("product")
- )
-
- # Sort data
- df = df.sort(column("age").sort(ascending=False))
-
- # Join DataFrames
- df = df1.join(df2, on="user_id", how="inner")
-
- # Aggregate data
- from datafusion import functions as f
- df = df.aggregate(
- [], # Group by columns (empty for global aggregation)
- [f.sum(column("amount")).alias("total_amount")]
- )
-
- # Limit rows
- df = df.limit(100)
-
- # Drop columns
- df = df.drop("temporary_column")
-
-Terminal Operations
--------------------
-
-To materialize the results of your DataFrame operations:
-
-.. code-block:: python
-
- # Collect all data as PyArrow RecordBatches
- result_batches = df.collect()
-
- # Convert to various formats
- pandas_df = df.to_pandas() # Pandas DataFrame
- polars_df = df.to_polars() # Polars DataFrame
- arrow_table = df.to_arrow_table() # PyArrow Table
- py_dict = df.to_pydict() # Python dictionary
- py_list = df.to_pylist() # Python list of dictionaries
-
- # Display results
- df.show() # Print tabular format to console
-
- # Count rows
- count = df.count()
-
-HTML Rendering in Jupyter
--------------------------
-
-When working in Jupyter notebooks or other environments that support rich HTML display,
-DataFusion DataFrames automatically render as nicely formatted HTML tables. This functionality
-is provided by the ``_repr_html_`` method, which is automatically called by Jupyter.
-
-Basic HTML Rendering
-~~~~~~~~~~~~~~~~~~~~
-
-In a Jupyter environment, simply displaying a DataFrame object will trigger HTML rendering:
-
-.. code-block:: python
-
- # Will display as HTML table in Jupyter
- df
-
- # Explicit display also uses HTML rendering
- display(df)
-
-HTML Rendering Customization
-----------------------------
-
-DataFusion provides extensive customization options for HTML table rendering through the
-``datafusion.html_formatter`` module.
-
-Configuring the HTML Formatter
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-You can customize how DataFrames are rendered by configuring the formatter:
-
-.. code-block:: python
-
- from datafusion.html_formatter import configure_formatter
-
- configure_formatter(
- max_cell_length=30, # Maximum length of cell content before truncation
- max_width=800, # Maximum width of table in pixels
- max_height=400, # Maximum height of table in pixels
- max_memory_bytes=2 * 1024 * 1024,# Maximum memory used for rendering (2MB)
- min_rows_display=10, # Minimum rows to display
- repr_rows=20, # Number of rows to display in representation
- enable_cell_expansion=True, # Allow cells to be expandable on click
- custom_css=None, # Custom CSS to apply
- show_truncation_message=True, # Show message when data is truncated
- style_provider=None, # Custom style provider class
- use_shared_styles=True # Share styles across tables to reduce duplication
- )
-
-Custom Style Providers
-~~~~~~~~~~~~~~~~~~~~~~
-
-For advanced styling needs, you can create a custom style provider class:
-
-.. code-block:: python
-
- from datafusion.html_formatter import configure_formatter
-
- class CustomStyleProvider:
- def get_cell_style(self) -> str:
- return "background-color: #f5f5f5; color: #333; padding: 8px; border: 1px solid #ddd;"
-
- def get_header_style(self) -> str:
- return "background-color: #4285f4; color: white; font-weight: bold; padding: 10px;"
-
- # Apply custom styling
- configure_formatter(style_provider=CustomStyleProvider())
-
-Custom Type Formatters
-~~~~~~~~~~~~~~~~~~~~~~
-
-You can register custom formatters for specific data types:
-
-.. code-block:: python
-
- from datafusion.html_formatter import get_formatter
-
- formatter = get_formatter()
-
- # Format integers with color based on value
- def format_int(value):
- return f'<span style="color: {"red" if value > 100 else "blue"}">{value}</span>'
-
- formatter.register_formatter(int, format_int)
-
- # Format date values
- def format_date(value):
- return f'{value.isoformat()}'
-
- formatter.register_formatter(datetime.date, format_date)
-
-Custom Cell Builders
-~~~~~~~~~~~~~~~~~~~~
-
-For complete control over cell rendering:
-
-.. code-block:: python
-
- formatter = get_formatter()
-
- def custom_cell_builder(value, row, col, table_id):
- try:
- num_value = float(value)
- if num_value > 0: # Positive values get green
- return f'<td style="color: green">{value}</td>'
- if num_value < 0: # Negative values get red
- return f'<td style="color: red">{value}</td>'
- except (ValueError, TypeError):
- pass
-
- # Default styling for non-numeric or zero values
- return f'<td>{value}</td>'
-
- formatter.set_custom_cell_builder(custom_cell_builder)
-
-Custom Header Builders
-~~~~~~~~~~~~~~~~~~~~~~
-
-Similarly, you can customize the rendering of table headers:
-
-.. code-block:: python
-
- def custom_header_builder(field):
- tooltip = f"Type: {field.type}"
- return f'<th title="{tooltip}">{field.name}</th>'
-
- formatter.set_custom_header_builder(custom_header_builder)
-
-Managing Formatter State
------------------------~
-
-The HTML formatter maintains global state that can be managed:
-
-.. code-block:: python
-
- from datafusion.html_formatter import reset_formatter, reset_styles_loaded_state, get_formatter
-
- # Reset the formatter to default settings
- reset_formatter()
-
- # Reset only the styles loaded state (useful when styles were loaded but need reloading)
- reset_styles_loaded_state()
-
- # Get the current formatter instance to make changes
- formatter = get_formatter()
-
-Advanced Example: Dashboard-Style Formatting
-------------------------------------------~~
-
-This example shows how to create a dashboard-like styling for your DataFrames:
-
-.. code-block:: python
-
- from datafusion.html_formatter import configure_formatter, get_formatter
-
- # Define custom CSS
- custom_css = """
- .datafusion-table {
- font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
- border-collapse: collapse;
- width: 100%;
- box-shadow: 0 2px 3px rgba(0,0,0,0.1);
- }
- .datafusion-table th {
- position: sticky;
- top: 0;
- z-index: 10;
- }
- .datafusion-table tr:hover td {
- background-color: #f1f7fa !important;
- }
- .datafusion-table .numeric-positive {
- color: #0a7c00;
- }
- .datafusion-table .numeric-negative {
- color: #d13438;
- }
- """
-
- class DashboardStyleProvider:
- def get_cell_style(self) -> str:
- return "padding: 8px 12px; border-bottom: 1px solid #e0e0e0;"
-
- def get_header_style(self) -> str:
- return ("background-color: #0078d4; color: white; font-weight: 600; "
- "padding: 12px; text-align: left; border-bottom: 2px solid #005a9e;")
-
- # Apply configuration
- configure_formatter(
- max_height=500,
- enable_cell_expansion=True,
- custom_css=custom_css,
- style_provider=DashboardStyleProvider(),
- max_cell_length=50
- )
-
- # Add custom formatters for numbers
- formatter = get_formatter()
-
- def format_number(value):
- try:
- num = float(value)
- cls = "numeric-positive" if num > 0 else "numeric-negative" if num < 0 else ""
- return f'<span class="{cls}">{value:,}</span>' if cls else f'{value:,}'
- except (ValueError, TypeError):
- return str(value)
-
- formatter.register_formatter(int, format_number)
- formatter.register_formatter(float, format_number)
-
-Best Practices
---------------
-
-1. **Memory Management**: For large datasets, use ``max_memory_bytes`` to limit memory usage.
-
-2. **Responsive Design**: Set reasonable ``max_width`` and ``max_height`` values to ensure tables display well on different screens.
-
-3. **Style Optimization**: Use ``use_shared_styles=True`` to avoid duplicate style definitions when displaying multiple tables.
-
-4. **Reset When Needed**: Call ``reset_formatter()`` when you want to start fresh with default settings.
-
-5. **Cell Expansion**: Use ``enable_cell_expansion=True`` when cells might contain longer content that users may want to see in full.
-
-Additional Resources
---------------------
-
-* :doc:`../user-guide/dataframe` - Complete guide to using DataFrames
-* :doc:`../user-guide/io/index` - I/O Guide for reading data from various sources
-* :doc:`../user-guide/data-sources` - Comprehensive data sources guide
-* :ref:`io_csv` - CSV file reading
-* :ref:`io_parquet` - Parquet file reading
-* :ref:`io_json` - JSON file reading
-* :ref:`io_avro` - Avro file reading
-* :ref:`io_custom_table_provider` - Custom table providers
-* `API Reference `_ - Full API reference
diff --git a/docs/source/api/index.rst b/docs/source/api/index.rst
deleted file mode 100644
index 7f58227ca..000000000
--- a/docs/source/api/index.rst
+++ /dev/null
@@ -1,27 +0,0 @@
-.. Licensed to the Apache Software Foundation (ASF) under one
-.. or more contributor license agreements. See the NOTICE file
-.. distributed with this work for additional information
-.. regarding copyright ownership. The ASF licenses this file
-.. to you under the Apache License, Version 2.0 (the
-.. "License"); you may not use this file except in compliance
-.. with the License. You may obtain a copy of the License at
-
-.. http://www.apache.org/licenses/LICENSE-2.0
-
-.. Unless required by applicable law or agreed to in writing,
-.. software distributed under the License is distributed on an
-.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-.. KIND, either express or implied. See the License for the
-.. specific language governing permissions and limitations
-.. under the License.
-
-=============
-API Reference
-=============
-
-This section provides detailed API documentation for the DataFusion Python library.
-
-.. toctree::
- :maxdepth: 2
-
- dataframe
diff --git a/docs/source/index.rst b/docs/source/index.rst
index ff1e47280..adec60f48 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -72,7 +72,7 @@ Example
user-guide/introduction
user-guide/basics
user-guide/data-sources
- user-guide/dataframe
+ user-guide/dataframe/index
user-guide/common-operations/index
user-guide/io/index
user-guide/configuration
@@ -93,5 +93,3 @@ Example
:hidden:
:maxdepth: 1
:caption: API
-
- api/index
diff --git a/docs/source/user-guide/basics.rst b/docs/source/user-guide/basics.rst
index 2975d9a6b..7c6820461 100644
--- a/docs/source/user-guide/basics.rst
+++ b/docs/source/user-guide/basics.rst
@@ -73,7 +73,7 @@ DataFrames are typically created by calling a method on :py:class:`~datafusion.c
calling the transformation methods, such as :py:func:`~datafusion.dataframe.DataFrame.filter`, :py:func:`~datafusion.dataframe.DataFrame.select`, :py:func:`~datafusion.dataframe.DataFrame.aggregate`,
and :py:func:`~datafusion.dataframe.DataFrame.limit` to build up a query definition.
-For more details on working with DataFrames, including visualization options and conversion to other formats, see :doc:`dataframe`.
+For more details on working with DataFrames, including visualization options and conversion to other formats, see :doc:`dataframe/index`.
Expressions
-----------
diff --git a/docs/source/user-guide/dataframe/index.rst b/docs/source/user-guide/dataframe/index.rst
new file mode 100644
index 000000000..f69485af7
--- /dev/null
+++ b/docs/source/user-guide/dataframe/index.rst
@@ -0,0 +1,209 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements. See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership. The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied. See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+DataFrames
+==========
+
+Overview
+--------
+
+The ``DataFrame`` class is the core abstraction in DataFusion that represents tabular data and operations
+on that data. DataFrames provide a flexible API for transforming data through various operations such as
+filtering, projection, aggregation, joining, and more.
+
+A DataFrame represents a logical plan that is lazily evaluated. The actual execution occurs only when
+terminal operations like ``collect()``, ``show()``, or ``to_pandas()`` are called.
+
+Creating DataFrames
+-------------------
+
+DataFrames can be created in several ways:
+
+* From SQL queries via a ``SessionContext``:
+
+ .. code-block:: python
+
+ from datafusion import SessionContext
+
+ ctx = SessionContext()
+ df = ctx.sql("SELECT * FROM your_table")
+
+* From registered tables:
+
+ .. code-block:: python
+
+ df = ctx.table("your_table")
+
+* From various data sources:
+
+ .. code-block:: python
+
+ # From CSV files (see :ref:`io_csv` for detailed options)
+ df = ctx.read_csv("path/to/data.csv")
+
+ # From Parquet files (see :ref:`io_parquet` for detailed options)
+ df = ctx.read_parquet("path/to/data.parquet")
+
+ # From JSON files (see :ref:`io_json` for detailed options)
+ df = ctx.read_json("path/to/data.json")
+
+ # From Avro files (see :ref:`io_avro` for detailed options)
+ df = ctx.read_avro("path/to/data.avro")
+
+ # From Pandas DataFrame
+ import pandas as pd
+ pandas_df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
+ df = ctx.from_pandas(pandas_df)
+
+ # From Arrow data
+ import pyarrow as pa
+ batch = pa.RecordBatch.from_arrays(
+ [pa.array([1, 2, 3]), pa.array([4, 5, 6])],
+ names=["a", "b"]
+ )
+ df = ctx.from_arrow(batch)
+
+For detailed information about reading from different data sources, see the :doc:`I/O Guide <../io/index>`.
+For custom data sources, see :ref:`io_custom_table_provider`.
+
+Common DataFrame Operations
+---------------------------
+
+DataFusion's DataFrame API offers a wide range of operations:
+
+.. code-block:: python
+
+ from datafusion import column, literal
+
+ # Select specific columns
+ df = df.select("col1", "col2")
+
+ # Select with expressions
+ df = df.select(column("a") + column("b"), column("a") - column("b"))
+
+ # Filter rows
+ df = df.filter(column("age") > literal(25))
+
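+ # Add computed columns (strings are joined with f.concat; + is arithmetic only)
+ from datafusion import functions as f
+ df = df.with_column("full_name", f.concat(column("first_name"), literal(" "), column("last_name")))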
+
+ # Multiple column additions
+ df = df.with_columns(
+ (column("a") + column("b")).alias("sum"),
+ (column("a") * column("b")).alias("product")
+ )
+
+ # Sort data
+ df = df.sort(column("age").sort(ascending=False))
+
+ # Join DataFrames
+ df = df1.join(df2, on="user_id", how="inner")
+
+ # Aggregate data
+ from datafusion import functions as f
+ df = df.aggregate(
+ [], # Group by columns (empty for global aggregation)
+ [f.sum(column("amount")).alias("total_amount")]
+ )
+
+ # Limit rows
+ df = df.limit(100)
+
+ # Drop columns
+ df = df.drop("temporary_column")
+
+Terminal Operations
+-------------------
+
+To materialize the results of your DataFrame operations:
+
+.. code-block:: python
+
+ # Collect all data as PyArrow RecordBatches
+ result_batches = df.collect()
+
+ # Convert to various formats
+ pandas_df = df.to_pandas() # Pandas DataFrame
+ polars_df = df.to_polars() # Polars DataFrame
+ arrow_table = df.to_arrow_table() # PyArrow Table
+ py_dict = df.to_pydict() # Python dictionary
+ py_list = df.to_pylist() # Python list of dictionaries
+
+ # Display results
+ df.show() # Print tabular format to console
+
+ # Count rows
+ count = df.count()
+
+HTML Rendering
+--------------
+
+When working in Jupyter notebooks or other environments that support HTML rendering, DataFrames will
+automatically display as formatted HTML tables. For detailed information about customizing HTML
+rendering, formatting options, and advanced styling, see :doc:`rendering`.
+
+Core Classes
+------------
+
+**DataFrame**
+ The main DataFrame class for building and executing queries.
+
+ See: :py:class:`datafusion.DataFrame`
+
+**SessionContext**
+ The primary entry point for creating DataFrames from various data sources.
+
+ Key methods for DataFrame creation:
+
+ * :py:meth:`~datafusion.SessionContext.read_csv` - Read CSV files
+ * :py:meth:`~datafusion.SessionContext.read_parquet` - Read Parquet files
+ * :py:meth:`~datafusion.SessionContext.read_json` - Read JSON files
+ * :py:meth:`~datafusion.SessionContext.read_avro` - Read Avro files
+ * :py:meth:`~datafusion.SessionContext.table` - Access registered tables
+ * :py:meth:`~datafusion.SessionContext.sql` - Execute SQL queries
+ * :py:meth:`~datafusion.SessionContext.from_pandas` - Create from Pandas DataFrame
+ * :py:meth:`~datafusion.SessionContext.from_arrow` - Create from Arrow data
+
+ See: :py:class:`datafusion.SessionContext`
+
+Expression Classes
+------------------
+
+**Expr**
+ Represents expressions that can be used in DataFrame operations.
+
+ See: :py:class:`datafusion.Expr`
+
+**Functions for creating expressions:**
+
+* :py:func:`datafusion.column` - Reference a column by name
+* :py:func:`datafusion.literal` - Create a literal value expression
+
+Built-in Functions
+------------------
+
+DataFusion provides many built-in functions for data manipulation:
+
+* :py:mod:`datafusion.functions` - Mathematical, string, date/time, and aggregation functions
+
+For a complete list of available functions, see the :py:mod:`datafusion.functions` module documentation.
+
+
+.. toctree::
+ :maxdepth: 1
+
+ rendering
diff --git a/docs/source/user-guide/dataframe.rst b/docs/source/user-guide/dataframe/rendering.rst
similarity index 72%
rename from docs/source/user-guide/dataframe.rst
rename to docs/source/user-guide/dataframe/rendering.rst
index 23c65b5f6..4c37c7471 100644
--- a/docs/source/user-guide/dataframe.rst
+++ b/docs/source/user-guide/dataframe/rendering.rst
@@ -15,59 +15,37 @@
.. specific language governing permissions and limitations
.. under the License.
-DataFrames
-==========
+HTML Rendering in Jupyter
+=========================
-Overview
---------
+When working in Jupyter notebooks or other environments that support rich HTML display,
+DataFusion DataFrames automatically render as nicely formatted HTML tables. This functionality
+is provided by the ``_repr_html_`` method, which is automatically called by Jupyter to provide
+a richer visualization than plain text output.
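The ``_repr_html_`` hook is a general Jupyter/IPython convention rather than anything specific to DataFusion. The sketch below illustrates it with a hypothetical ``TinyTable`` class (not part of DataFusion): any object that defines ``_repr_html_`` is rendered from the returned HTML string, while ``__repr__`` remains the plain-text fallback.

```python
from html import escape


class TinyTable:
    """Hypothetical stand-in for a DataFrame; not part of DataFusion."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(r) for r in rows]

    def __repr__(self):
        # Plain-text fallback used by terminals and by print().
        return f"TinyTable({len(self.rows)} rows x {len(self.columns)} columns)"

    def _repr_html_(self):
        # Jupyter/IPython looks for this method and renders its return value.
        head = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"


table = TinyTable(["a", "b"], [[1, 4], [2, 5]])
html = table._repr_html_()
```

In a notebook, leaving ``table`` as the last expression in a cell would show the HTML table; in a plain terminal, only the ``__repr__`` text appears.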
-DataFusion's DataFrame API provides a powerful interface for building and executing queries against data sources.
-It offers a familiar API similar to pandas and other DataFrame libraries, but with the performance benefits of Rust
-and Arrow.
+Basic HTML Rendering
+--------------------
-A DataFrame represents a logical plan that can be composed through operations like filtering, projection, and aggregation.
-The actual execution happens when terminal operations like ``collect()`` or ``show()`` are called.
-
-Basic Usage
------------
+In a Jupyter environment, simply displaying a DataFrame object will trigger HTML rendering:
.. code-block:: python
- import datafusion
- from datafusion import col, lit
+ # Will display as HTML table in Jupyter
+ df
- # Create a context and register a data source
- ctx = datafusion.SessionContext()
- ctx.register_csv("my_table", "path/to/data.csv")
-
- # Create and manipulate a DataFrame
- df = ctx.sql("SELECT * FROM my_table")
-
- # Or use the DataFrame API directly
- df = (ctx.table("my_table")
- .filter(col("age") > lit(25))
- .select([col("name"), col("age")]))
-
- # Execute and collect results
- result = df.collect()
-
- # Display the first few rows
- df.show()
+ # Explicit display also uses HTML rendering
+ display(df)
-HTML Rendering
---------------
-
-When working in Jupyter notebooks or other environments that support HTML rendering, DataFrames will
-automatically display as formatted HTML tables, making it easier to visualize your data.
+Customizing HTML Rendering
+---------------------------
-The ``_repr_html_`` method is called automatically by Jupyter to render a DataFrame. This method
-controls how DataFrames appear in notebook environments, providing a richer visualization than
-plain text output.
+DataFusion provides extensive customization options for HTML table rendering through the
+``datafusion.html_formatter`` module.
-Customizing HTML Rendering
---------------------------
+Configuring the HTML Formatter
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-You can customize how DataFrames are rendered in HTML by configuring the formatter:
+You can customize how DataFrames are rendered by configuring the formatter:
.. code-block:: python
@@ -91,7 +69,7 @@ You can customize how DataFrames are rendered in HTML by configuring the formatt
The formatter settings affect all DataFrames displayed after configuration.
Custom Style Providers
-----------------------
+-----------------------
For advanced styling needs, you can create a custom style provider:
@@ -118,7 +96,8 @@ For advanced styling needs, you can create a custom style provider:
configure_formatter(style_provider=MyStyleProvider())
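To make the role of a style provider more concrete, here is a minimal, self-contained sketch. It assumes the provider exposes ``get_cell_style()`` and ``get_header_style()`` returning inline CSS strings; ``render_table`` is a hypothetical helper written for illustration and is not DataFusion's renderer. It only shows where the two strings would plausibly end up in the generated markup.

```python
from html import escape


class MyStyleProvider:
    def get_cell_style(self) -> str:
        return "padding: 8px; border: 1px solid #ddd;"

    def get_header_style(self) -> str:
        return "background-color: #4285f4; color: white; font-weight: bold;"


def render_table(columns, rows, style_provider):
    """Hypothetical renderer: shows where the two style strings would land."""
    header_style = escape(style_provider.get_header_style(), quote=True)
    cell_style = escape(style_provider.get_cell_style(), quote=True)
    head = "".join(
        f'<th style="{header_style}">{escape(str(c))}</th>' for c in columns
    )
    body = "".join(
        "<tr>"
        + "".join(f'<td style="{cell_style}">{escape(str(v))}</td>' for v in row)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


html = render_table(["name", "age"], [["Ada", 36]], MyStyleProvider())
```

Keeping the provider to two small methods means a theme can be swapped by passing a different object, without touching the rendering logic.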
Performance Optimization with Shared Styles
--------------------------------------------
+--------------------------------------------
+
The ``use_shared_styles`` parameter (enabled by default) optimizes performance when displaying
multiple DataFrames in notebook environments:
@@ -138,7 +117,7 @@ When ``use_shared_styles=True``:
- Applies consistent styling across all DataFrames
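The idea behind shared styles can be sketched with a toy renderer. Everything here is hypothetical (the ``_styles_loaded`` flag, ``STYLE_BLOCK``, and ``render`` are illustrative names, not DataFusion internals): the first table rendered emits the ``<style>`` block, and later tables reuse it instead of repeating it.

```python
_styles_loaded = False  # hypothetical module-level flag, for illustration only

STYLE_BLOCK = "<style>.toy-table td { padding: 4px; }</style>"


def render(rows, use_shared_styles=True):
    """Return HTML for one table; with shared styles, emit the <style> block once."""
    global _styles_loaded
    parts = []
    if not use_shared_styles or not _styles_loaded:
        parts.append(STYLE_BLOCK)
        if use_shared_styles:
            # Remember that the styles are already present in the notebook.
            _styles_loaded = True
    cells = "".join(f"<tr><td>{value}</td></tr>" for value in rows)
    parts.append(f'<table class="toy-table">{cells}</table>')
    return "".join(parts)


first = render([1, 2])   # includes the <style> block
second = render([3, 4])  # reuses the styles emitted by the first call
```

With ``use_shared_styles=False`` every call carries its own copy of the styles, which is more robust when outputs are viewed in isolation but produces larger notebook output.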
Creating a Custom Formatter
----------------------------
+----------------------------
For complete control over rendering, you can implement a custom formatter:
@@ -184,7 +163,7 @@ Get the current formatter settings:
print(formatter.theme)
Contextual Formatting
----------------------
+----------------------
You can also use a context manager to temporarily change formatting settings:
@@ -207,12 +186,38 @@ Memory and Display Controls
You can control how much data is displayed and how much memory is used for rendering:
- .. code-block:: python
-
+.. code-block:: python
+
configure_formatter(
max_memory_bytes=4 * 1024 * 1024, # 4MB maximum memory for display
min_rows_display=50, # Always show at least 50 rows
repr_rows=20 # Show 20 rows in __repr__ output
)
-These parameters help balance comprehensive data display against performance considerations.
\ No newline at end of file
+These parameters help balance comprehensive data display against performance considerations.
+
+Best Practices
+--------------
+
+1. **Global Configuration**: Use ``configure_formatter()`` at the beginning of your notebook to set up consistent formatting for all DataFrames.
+
+2. **Memory Management**: Set appropriate ``max_memory_bytes`` limits to prevent performance issues with large datasets.
+
+3. **Shared Styles**: Keep ``use_shared_styles=True`` (default) for better performance in notebooks with multiple DataFrames.
+
+4. **Reset When Needed**: Call ``reset_formatter()`` when you want to start fresh with default settings.
+
+5. **Cell Expansion**: Use ``enable_cell_expansion=True`` when cells might contain longer content that users may want to see in full.
+
+Additional Resources
+--------------------
+
+* :doc:`../dataframe/index` - Complete guide to using DataFrames
+* :doc:`../io/index` - I/O Guide for reading data from various sources
+* :doc:`../data-sources` - Comprehensive data sources guide
+* :ref:`io_csv` - CSV file reading
+* :ref:`io_parquet` - Parquet file reading
+* :ref:`io_json` - JSON file reading
+* :ref:`io_avro` - Avro file reading
+* :ref:`io_custom_table_provider` - Custom table providers
+* `API Reference `_ - Full API reference