Commit c7a6080

docs: Add docs on how to use Narwhals to generate SQL (#2570)
--------- Co-authored-by: Francesco Bruzzesi <[email protected]>
1 parent 36a3fae commit c7a6080

5 files changed: +85 -8 lines changed


.github/workflows/mkdocs.yml

Lines changed: 1 addition & 1 deletion
@@ -29,6 +29,6 @@ jobs:
           restore-keys: |
             mkdocs-material-
       - name: Install dependencies
-        run: uv pip install -e ".[dask,duckdb,sqlframe]" --group docs --system
+        run: uv pip install -e . --group docs --system
       - name: Deploy docs
         run: mkdocs gh-deploy --force

docs/concepts/order_dependence.md

Lines changed: 5 additions & 7 deletions
@@ -25,8 +25,8 @@ no issue.
 import narwhals as nw
 import pandas as pd
 
-df_pd = pd.DataFrame({"a": [1, 3, 4], "i": [0, 1, 2]})
-df = nw.from_native(df_pd)
+data = {"a": [1, 3, 4], "i": [0, 1, 2]}
+df = nw.from_native(pd.DataFrame(data))
 print(df.with_columns(a_cum_sum=nw.col("a").cum_sum()))
 ```
 
@@ -39,13 +39,11 @@ you specify `order_by`. For example:
 or a `LazyFrame`.
 
 ```python exec="1" result="python" session="order_dependence" source="above"
-from sqlframe.duckdb import DuckDBSession
+import polars as pl
 
-session = DuckDBSession()
-sqlframe_df = session.createDataFrame(df_pd)
-lf = nw.from_native(sqlframe_df)
+lf = nw.from_native(pl.LazyFrame(data))
 result = lf.with_columns(a_cum_sum=nw.col("a").cum_sum().over(order_by="i"))
-print(result.collect("pandas"))
+print(result.collect())
 ```
 
 When writing an order-dependent function, if you want it to be executable by `LazyFrame`
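
To see why the cumulative sum in this hunk needs `order_by` on a lazy backend, here is a minimal plain-Python sketch. It uses only the standard library and is not Narwhals code. The helper `cum_sum_ordered` and the shuffled row list are hypothetical, made up for illustration; the values reuse `data` from the diff.

```python
from itertools import accumulate

# Same values as `data` in the diff, stored row-wise.
rows = [{"a": 1, "i": 0}, {"a": 3, "i": 1}, {"a": 4, "i": 2}]


def cum_sum_ordered(rows, value, order_by):
    """Hypothetical helper: cumulative sum of `value` after sorting by `order_by`."""
    ordered = sorted(rows, key=lambda r: r[order_by])
    totals = accumulate(r[value] for r in ordered)
    return [{**r, f"{value}_cum_sum": t} for r, t in zip(ordered, totals)]


# A backend with no guaranteed row order may hand the rows back in any order.
shuffled = [rows[2], rows[0], rows[1]]

# Without an explicit ordering, the result depends on the arrival order.
naive = list(accumulate(r["a"] for r in shuffled))

# With an explicit ordering column, the result is the same for any arrival order.
ordered = [r["a_cum_sum"] for r in cum_sum_ordered(shuffled, "a", "i")]

print(naive)    # [4, 5, 8]
print(ordered)  # [1, 4, 8]
```

The second result matches what the eager pandas example computes, because sorting by `i` restores the order the eager frame already had.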

docs/generating_sql.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
+# Generating SQL
+
+Suppose you want to write Polars syntax and translate it to SQL.
+For example, what's the SQL equivalent to:
+
+```python exec="1" source="above" session="generating-sql"
+import narwhals as nw
+from narwhals.typing import IntoFrameT
+
+
+def avg_monthly_price(df_native: IntoFrameT) -> IntoFrameT:
+    return (
+        nw.from_native(df_native)
+        .group_by(nw.col("date").dt.truncate("1mo"))
+        .agg(nw.col("price").mean())
+        .sort("date")
+        .to_native()
+    )
+```
+
+?
+
+There are several ways to find out.
+
+## Via SQLFrame (most lightweight solution)
+
+The most lightweight solution which does not require any heavy dependencies, nor
+any actual table or dataframe, is with SQLFrame.
+
+```python exec="1" source="above" session="generating-sql" result="sql"
+from sqlframe.standalone import StandaloneSession
+
+session = StandaloneSession.builder.getOrCreate()
+session.catalog.add_table("prices", column_mapping={"date": "date", "price": "float"})
+df = nw.from_native(session.read.table("prices"))
+
+print(avg_monthly_price(df).sql(dialect="duckdb"))
+```
+
+Or, to print the SQL code in a different dialect (say, databricks):
+
+```python exec="1" source="above" session="generating-sql" result="sql"
+print(avg_monthly_price(df).sql(dialect="databricks"))
+```
+
+## Via DuckDB
+
+You can also generate SQL directly from DuckDB.
+
+```python exec="1" source="above" session="generating-sql" result="sql"
+import duckdb
+
+conn = duckdb.connect()
+conn.sql("""CREATE TABLE prices (date DATE, price DOUBLE);""")
+
+df = nw.from_native(conn.table("prices"))
+print(avg_monthly_price(df).sql_query())
+```
+
+To make it look a bit prettier, we can pass it to [SQLGlot](https://github.com/tobymao/sqlglot):
+
+```python exec="1" source="above" session="generating-sql" result="sql"
+import sqlglot
+
+print(sqlglot.transpile(avg_monthly_price(df).sql_query(), pretty=True)[0])
+```
+
+## Via Ibis
+
+We can also use Ibis to generate SQL:
+
+```python exec="1" source="above" session="generating-sql" result="sql"
+import ibis
+
+t = ibis.table({"date": "date", "price": "double"}, name="prices")
+print(ibis.to_sql(avg_monthly_price(t)))
+```
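
For orientation, the kind of query the new page asks about can also be written by hand. The following is a sketch only: it uses the standard-library `sqlite3` module and SQLite's `strftime` to truncate dates to the month. That choice is an assumption of this example, not the SQL that SQLFrame, DuckDB, or Ibis would emit. The sample rows and the aliases `month` and `avg_price` are invented for illustration.

```python
import sqlite3

# In-memory database with a `prices` table shaped like the one in the docs page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [
        ("2024-01-05", 10.0),  # invented sample data
        ("2024-01-20", 20.0),
        ("2024-02-03", 30.0),
    ],
)

# Hand-written SQLite analogue of: group by month, average the price, sort.
query = """
SELECT strftime('%Y-%m-01', date) AS month, AVG(price) AS avg_price
FROM prices
GROUP BY month
ORDER BY month
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('2024-01-01', 15.0), ('2024-02-01', 30.0)]
```

Each backend in the page generates its own dialect-specific SQL, so the generated text will differ from this query even where the result is equivalent.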

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ nav:
     - basics/series.md
     - basics/complete_example.md
     - basics/dataframe_conversion.md
+  - Narwhals and SQL: generating_sql.md
   - Concepts:
     - concepts/order_dependence.md
     - concepts/pandas_index.md

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ docs = [
   "black", # required by mkdocstrings_handlers
   "jinja2",
   "duckdb",
+  "narwhals[ibis]",
   "markdown-exec[ansi]",
   "mkdocs",
   "mkdocs-autorefs",
