feat(pkg-py): Replace pandas with narwhals #175
- """A DataSource implementation that wraps a pandas DataFrame using DuckDB."""
+ """A DataSource implementation that wraps a DataFrame using DuckDB."""

_df: nw.DataFrame | nw.LazyFrame
I'm pretty sure it's going to make sense for us to have a separate LazyFrameSource, which I'll do in a follow-up PR (before the next release). The benefit is that we can be lazier about computation, and possibly have .df() return a LazyFrame in that scenario as well.
# Ensure we're working with a DataFrame, not a LazyFrame
ndf = (
    self._df.head(10).collect()
Note that downstream calculation of ranges and unique values wasn't working properly because they were based on the first 10 rows -- I'll address this when doing the new LazyFrameSource implementation
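The issue described in this comment can be seen without any DataFrame library: statistics computed from a 10-row preview describe only the preview. A minimal sketch, using a plain Python list as a stand-in for one column (the data and the helper name `column_summary` are hypothetical, not taken from the PR):

```python
def column_summary(values):
    """Return (min, max, number of unique values) for a sequence."""
    return min(values), max(values), len(set(values))


# Hypothetical column with 100 rows whose values cycle through 0..24.
column = [i % 25 for i in range(100)]

# Analogous to taking head(10) before computing ranges and unique values.
preview = column[:10]

print(column_summary(preview))  # describes only the first 10 rows
print(column_summary(column))   # describes the full column
```

The two summaries disagree on both the maximum and the unique count, which is why range and unique-value calculations need the full data rather than the preview.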
Remove pandas as a required dependency in favor of narwhals, which provides a unified DataFrame interface supporting both pandas and polars backends.

Changes:
- Add _df_compat.py module with read_csv, read_sql, and duckdb_result_to_nw helpers
- Update DataSource classes to return narwhals DataFrames
- Update df_to_html to generate HTML without pandas dependency
- Make pandas and polars optional dependencies
- Add comprehensive tests for DataFrameSource and df_compat module

Users can now install with either `pip install querychat[pandas]` or `pip install querychat[polars]`. Use `.to_native()` on returned DataFrames to get the underlying pandas or polars DataFrame.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
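The PR description says `_df_compat.py` prefers polars when available. The module itself is not shown here, so the following is only a sketch of how such a preference could be expressed with the standard library. The function name `pick_backend`, its `is_installed` parameter, and the error message are all assumptions made for illustration.

```python
from importlib.util import find_spec


def pick_backend(is_installed=lambda name: find_spec(name) is not None):
    """Return "polars" if it is importable, otherwise "pandas".

    Hypothetical helper: raises ImportError when neither is installed.
    """
    for name in ("polars", "pandas"):  # order encodes the preference
        if is_installed(name):
            return name
    raise ImportError(
        "No DataFrame backend found; install querychat[pandas] "
        "or querychat[polars]"
    )
```

Passing the availability check as a parameter keeps the preference order testable without installing or uninstalling either library.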
Resolved conflicts:
- _datasource.py: Combined narwhals abstraction layer with security configs, check_query validation, and test_query method
- build.qmd: Kept include directive approach from main

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
pkg-py/src/querychat/_utils.py
Outdated
columns = df_short.columns
rows = df_short.rows()

# Build HTML table
html_parts = ['<table border="1" class="dataframe table table-striped">']

# Header
html_parts.append(" <thead>")
html_parts.append(' <tr style="text-align: right;">')
html_parts.extend(f" <th>{_escape_html(col)}</th>" for col in columns)
html_parts.append(" </tr>")
html_parts.append(" </thead>")

# Body
html_parts.append(" <tbody>")
for row in rows:
    html_parts.append(" <tr>")
    html_parts.extend(f" <td>{_escape_html(str(val))}</td>" for val in row)
    html_parts.append(" </tr>")
html_parts.append(" </tbody>")

html_parts.append("</table>")
table_html = "\n".join(html_parts)
I feel a bit torn on whether it's worth adding a great_tables dependency here. It is pretty lightweight, so maybe it's worth it?
I think it makes sense to use great_tables if it drops in nicely
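For comparison, the manual approach under discussion can be written as a self-contained function using only the standard library. This is a sketch, not the PR's code: `rows_to_html` is a hypothetical name, and `html.escape` stands in for the `_escape_html` helper, whose body is not shown in the diff.

```python
from html import escape


def rows_to_html(columns, rows):
    """Build a simple HTML table from column names and row tuples."""
    parts = ['<table border="1" class="dataframe table table-striped">']

    # Header row: escape column names in case they contain markup.
    parts.append("  <thead>")
    parts.append('    <tr style="text-align: right;">')
    parts.extend(f"      <th>{escape(str(col))}</th>" for col in columns)
    parts.append("    </tr>")
    parts.append("  </thead>")

    # Body rows: stringify and escape every cell value.
    parts.append("  <tbody>")
    for row in rows:
        parts.append("    <tr>")
        parts.extend(f"      <td>{escape(str(val))}</td>" for val in row)
        parts.append("    </tr>")
    parts.append("  </tbody>")

    parts.append("</table>")
    return "\n".join(parts)


# Hypothetical data containing characters that must be escaped.
html_out = rows_to_html(["name", "note"], [("a", "<b>1 & 2</b>")])
print(html_out)
```

The trade-off raised in the thread is visible here: the hand-rolled version has no dependencies but leaves escaping and styling to the caller, which a table library would otherwise handle.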
Replace manual HTML table construction in df_to_html() with great_tables GT class for richer, styled table output in chat messages.

- Add great-tables>=0.16.0 as a dependency
- Simplify df_to_html() to use GT().as_raw_html()
- Remove manual _escape_html helper (great_tables handles escaping)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Passing self.client() (calling the method) resulted in tools being registered twice - once in client() and again in mod_server. Instead, pass self.client (the method reference) so mod_server can call it with the update_dashboard and reset_dashboard callbacks.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
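The difference between passing a called method and a method reference can be shown with a toy model. Everything below is hypothetical and only mirrors the shape of the bug described in the commit message: the `Client` class, `make_client`, and this `mod_server` are stand-ins, not querychat's actual implementation.

```python
class Client:
    """Stand-in chat client that records registered tool names."""

    def __init__(self):
        self.tools = []


def make_client(update_dashboard=None, reset_dashboard=None):
    """Create a client and register its tools (assumed behavior)."""
    client = Client()
    client.tools += ["update_dashboard", "reset_dashboard"]
    return client


def mod_server(client):
    """Accept either a ready-made client or a factory for one."""
    if callable(client):
        # Factory: create the client once, with the real callbacks.
        return client(update_dashboard=print, reset_dashboard=print)
    # Ready-made client: its tools are registered a second time.
    client.tools += ["update_dashboard", "reset_dashboard"]
    return client


buggy = mod_server(make_client())  # method called before passing
fixed = mod_server(make_client)    # method reference passed instead

print(len(buggy.tools), len(fixed.tools))
```

In this model the already-built client ends up with each tool registered twice, while the factory form registers each tool once.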
This PR removes `pandas` as a required dependency and replaces it with `narwhals`, a lightweight DataFrame abstraction layer that supports both pandas and polars backends. Users can now choose their preferred DataFrame library.

Motivation

Changes

Dependencies
- Remove `pandas` from required dependencies
- Add `pandas` and `polars` as optional dependencies
- Install with `pip install querychat[pandas]` or `pip install querychat[polars]`

API Changes (Breaking)
- `execute_query()`, `get_data()`, and `df()` now return narwhals DataFrames instead of pandas DataFrames
- Use `.to_native()` on returned DataFrames to get the underlying pandas/polars DataFrame

Internal Changes
- `_df_compat.py` module handles backend selection (prefers polars when available)
- `df_to_html()` generates HTML directly without pandas dependency
- `DataFrameSource` accepts pandas, polars, or narwhals DataFrames

Tests
- `test_df_compat.py` for the compatibility layer
- `test_dataframe_source.py` with comprehensive DataFrameSource tests

Migration Guide