@@ -150,238 +150,3 @@ To materialize the results of your DataFrame operations:
150 150     # Count rows
151 151     count = df.count()
152 152
153- HTML Rendering in Jupyter
154- -------------------------
155-
156- When working in Jupyter notebooks or other environments that support rich HTML display,
157- DataFusion DataFrames automatically render as nicely formatted HTML tables. This functionality
158- is provided by the ``_repr_html_`` method, which is automatically called by Jupyter.
159-
160- Basic HTML Rendering
161- ~~~~~~~~~~~~~~~~~~~~
162-
163- In a Jupyter environment, simply displaying a DataFrame object will trigger HTML rendering:
164-
165- .. code-block:: python
166-
167-     # Will display as HTML table in Jupyter
168-     df
169-
170-     # Explicit display also uses HTML rendering
171-     display(df)
172-
173- HTML Rendering Customization
174- ----------------------------
175-
176- DataFusion provides extensive customization options for HTML table rendering through the
177- ``datafusion.html_formatter`` module.
178-
179- Configuring the HTML Formatter
180- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
181-
182- You can customize how DataFrames are rendered by configuring the formatter:
183-
184- .. code-block:: python
185-
186-     from datafusion.html_formatter import configure_formatter
187-
188-     configure_formatter(
189-         max_cell_length=30,  # Maximum length of cell content before truncation
190-         max_width=800,  # Maximum width of table in pixels
191-         max_height=400,  # Maximum height of table in pixels
192-         max_memory_bytes=2 * 1024 * 1024,  # Maximum memory used for rendering (2MB)
193-         min_rows_display=10,  # Minimum rows to display
194-         repr_rows=20,  # Number of rows to display in representation
195-         enable_cell_expansion=True,  # Allow cells to be expandable on click
196-         custom_css=None,  # Custom CSS to apply
197-         show_truncation_message=True,  # Show message when data is truncated
198-         style_provider=None,  # Custom style provider class
199-         use_shared_styles=True  # Share styles across tables to reduce duplication
200-     )
201-
202- Custom Style Providers
203- ~~~~~~~~~~~~~~~~~~~~~~
204-
205- For advanced styling needs, you can create a custom style provider class:
206-
207- .. code-block:: python
208-
209-     from datafusion.html_formatter import configure_formatter
210-
211-     class CustomStyleProvider:
212-         def get_cell_style(self) -> str:
213-             return "background-color: #f5f5f5; color: #333; padding: 8px; border: 1px solid #ddd;"
214-
215-         def get_header_style(self) -> str:
216-             return "background-color: #4285f4; color: white; font-weight: bold; padding: 10px;"
217-
218-     # Apply custom styling
219-     configure_formatter(style_provider=CustomStyleProvider())
220-
221- Custom Type Formatters
222- ~~~~~~~~~~~~~~~~~~~~~~
223-
224- You can register custom formatters for specific data types:
225-
226- .. code-block:: python
227-
228-     from datafusion.html_formatter import get_formatter
229-
230-     formatter = get_formatter()
231-
232-     # Format integers with color based on value
233-     def format_int(value):
234-         return f'<span style="color: {"red" if value > 100 else "blue"}">{value}</span>'
235-
236-     formatter.register_formatter(int, format_int)
237-
238-     # Format date values (assumes ``import datetime`` has already been run)
239-     def format_date(value):
240-         return f'<span class="date-value">{value.isoformat()}</span>'
241-
242-     formatter.register_formatter(datetime.date, format_date)
243-
244- Custom Cell Builders
245- ~~~~~~~~~~~~~~~~~~~~
246-
247- For complete control over cell rendering:
248-
249- .. code-block:: python
250-
251-     formatter = get_formatter()
252-
253-     def custom_cell_builder(value, row, col, table_id):
254-         try:
255-             num_value = float(value)
256-             if num_value > 0:  # Positive values get green
257-                 return f'<td style="background-color: #d9f0d3">{value}</td>'
258-             if num_value < 0:  # Negative values get red
259-                 return f'<td style="background-color: #f0d3d3">{value}</td>'
260-         except (ValueError, TypeError):
261-             pass
262-
263-         # Default styling for non-numeric or zero values
264-         return f'<td style="border: 1px solid #ddd">{value}</td>'
265-
266-     formatter.set_custom_cell_builder(custom_cell_builder)
267-
268- Custom Header Builders
269- ~~~~~~~~~~~~~~~~~~~~~~
270-
271- Similarly, you can customize the rendering of table headers:
272-
273- .. code-block:: python
274-
275-     def custom_header_builder(field):
276-         tooltip = f"Type: {field.type}"
277-         return f'<th style="background-color: #333; color: white" title="{tooltip}">{field.name}</th>'
278-
279-     formatter.set_custom_header_builder(custom_header_builder)
280-
281- Managing Formatter State
282- ------------------------
283-
284- The HTML formatter maintains global state that can be managed:
285-
286- .. code-block:: python
287-
288-     from datafusion.html_formatter import reset_formatter, reset_styles_loaded_state, get_formatter
289-
290-     # Reset the formatter to default settings
291-     reset_formatter()
292-
293-     # Reset only the styles loaded state (useful when styles were loaded but need reloading)
294-     reset_styles_loaded_state()
295-
296-     # Get the current formatter instance to make changes
297-     formatter = get_formatter()
298-
299- Advanced Example: Dashboard-Style Formatting
300- --------------------------------------------
301-
302- This example shows how to create a dashboard-like styling for your DataFrames:
303-
304- .. code-block:: python
305-
306-     from datafusion.html_formatter import configure_formatter, get_formatter
307-
308-     # Define custom CSS
309-     custom_css = """
310-     .datafusion-table {
311-         font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
312-         border-collapse: collapse;
313-         width: 100%;
314-         box-shadow: 0 2px 3px rgba(0,0,0,0.1);
315-     }
316-     .datafusion-table th {
317-         position: sticky;
318-         top: 0;
319-         z-index: 10;
320-     }
321-     .datafusion-table tr:hover td {
322-         background-color: #f1f7fa !important;
323-     }
324-     .datafusion-table .numeric-positive {
325-         color: #0a7c00;
326-     }
327-     .datafusion-table .numeric-negative {
328-         color: #d13438;
329-     }
330-     """
331-
332-     class DashboardStyleProvider:
333-         def get_cell_style(self) -> str:
334-             return "padding: 8px 12px; border-bottom: 1px solid #e0e0e0;"
335-
336-         def get_header_style(self) -> str:
337-             return ("background-color: #0078d4; color: white; font-weight: 600; "
338-                     "padding: 12px; text-align: left; border-bottom: 2px solid #005a9e;")
339-
340-     # Apply configuration
341-     configure_formatter(
342-         max_height=500,
343-         enable_cell_expansion=True,
344-         custom_css=custom_css,
345-         style_provider=DashboardStyleProvider(),
346-         max_cell_length=50
347-     )
348-
349-     # Add custom formatters for numbers
350-     formatter = get_formatter()
351-
352-     def format_number(value):
353-         try:
354-             num = float(value)
355-             cls = "numeric-positive" if num > 0 else "numeric-negative" if num < 0 else ""
356-             return f'<span class="{cls}">{value:,}</span>' if cls else f'{value:,}'
357-         except (ValueError, TypeError):
358-             return str(value)
359-
360-     formatter.register_formatter(int, format_number)
361-     formatter.register_formatter(float, format_number)
362-
363- Best Practices
364- --------------
365-
366- 1. **Memory Management**: For large datasets, use ``max_memory_bytes`` to limit memory usage.
367-
368- 2. **Responsive Design**: Set reasonable ``max_width`` and ``max_height`` values to ensure tables display well on different screens.
369-
370- 3. **Style Optimization**: Use ``use_shared_styles=True`` to avoid duplicate style definitions when displaying multiple tables.
371-
372- 4. **Reset When Needed**: Call ``reset_formatter()`` when you want to start fresh with default settings.
373-
374- 5. **Cell Expansion**: Use ``enable_cell_expansion=True`` when cells might contain longer content that users may want to see in full.
375-
376- Additional Resources
377- --------------------
378-
379- * :doc:`../user-guide/dataframe` - Complete guide to using DataFrames
380- * :doc:`../user-guide/io/index` - I/O Guide for reading data from various sources
381- * :doc:`../user-guide/data-sources` - Comprehensive data sources guide
382- * :ref:`io_csv` - CSV file reading
383- * :ref:`io_parquet` - Parquet file reading
384- * :ref:`io_json` - JSON file reading
385- * :ref:`io_avro` - Avro file reading
386- * :ref:`io_custom_table_provider` - Custom table providers
387- * `API Reference <https://arrow.apache.org/datafusion-python/api/index.html>`_ - Full API reference