docs: Rewrite overhead section (#2566)

MarcoGorelli · web-flow · commit 8d3de066d83a · 2025-05-17T22:23:07.000+01:00
* docs: Rewrite overhead section

* add some examples of functions where we calculate columns/schema
diff --git a/docs/overhead.md b/docs/overhead.md
@@ -2,19 +2,51 @@
 
 Narwhals converts Polars syntax to non-Polars dataframes.
 
-So, what's the overhead of running pandas vs pandas via Narwhals?
+So, what's the overhead of running "pandas" vs "pandas via Narwhals"?
 
-Based on experiments we've done, the answer is: it's negligible. Here
-are timings from the TPC-H queries, comparing running pandas directly
-vs running pandas via Narwhals:
+Based on experiments we've done, the answer is: it's negligible.
+Sometimes it's even negative, because of how careful we are in Narwhals
+to avoid unnecessary copies and index resets. Here are timings from the
+TPC-H queries, comparing running pandas directly vs running pandas via Narwhals:
 
-![Comparison of pandas vs "pandas via Narwhals" timings on TPC-H queries showing neglibile overhead](https://github.com/narwhals-dev/narwhals/assets/33491632/71029c26-4121-43bb-90fb-5ac1c16ab8a2)
+![Comparison of pandas vs "pandas via Narwhals" timings on TPC-H queries showing negligible overhead](https://github.com/user-attachments/assets/bbd6fcaf-5c25-46a6-8c03-9ce42efca787)
 
-[Here](https://www.kaggle.com/code/marcogorelli/narwhals-tpc-h-results-s-2)'s the code to
-reproduce the plot above, check the input
-sources for notebooks which run each individual query, along with
-the data sources.
+[Complete code to reproduce](https://www.kaggle.com/code/marcogorelli/narwhals-vs-pandas-overhead-tpc-h-s2).
 
-On some runs, the Narwhals code makes things marginally faster, on others
-marginally slower. The overall picture is clear: with Narwhals, you
-can support both Polars and pandas APIs with little to no impact on either.
+## Plotly's story
+
+One big difference between Plotly v5 and Plotly v6 is the handling of non-pandas inputs:
+
+- In v5, Plotly would convert non-pandas inputs to pandas.
+- In v6, Plotly operates on non-pandas inputs natively (via Narwhals).
+
+We expected that this would bring a noticeable performance benefit for non-pandas inputs,
+but that there may be some slight overhead for pandas.
+
+Instead, we observed that things got noticeably faster for both non-pandas inputs and for
+pandas ones!
+
+- Polars plots got 3x, and sometimes even more than 10x, faster.
+- pandas plots were typically no slower, but sometimes ~20% faster.
+
+Full details in [Plotly's write-up](https://plotly.com/blog/chart-smarter-not-harder-universal-dataframe-support/).
+
+## Overhead for DuckDB, PySpark, and other lazy backends
+
+For lazy backends, Narwhals respects the backends' laziness and always keeps
+everything lazy. Narwhals never evaluates a full query unless you ask it to
+(with `.collect()`).
+
+In order to mimic Polars' behaviour, there are some places
+where Narwhals does need to inspect dataframes' schemas, such as:
+
+- joins
+- selectors
+- `nth`
+- `concat` with `how='vertical'`
+- `unique`
+
+This is typically cheap (as it does not require reading a full dataset into memory and
+can often just be done from metadata alone) but it's not free, especially if your
+data lives on the cloud. To minimise the overhead, when Narwhals needs to evaluate
+schemas or column names, it makes sure to cache them.
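
The caching idea in the last paragraph of the diff can be illustrated with a small, self-contained sketch. This is not Narwhals' actual implementation: `FakeRemoteTable`, `CachedSchemaFrame`, and `fetch_schema` are hypothetical names invented for illustration, and the "expensive" schema lookup is simulated with a counter. It only shows how several schema-dependent operations (here `nth` and `columns`) can share a single metadata lookup.

```python
from functools import cached_property


class FakeRemoteTable:
    """Hypothetical stand-in for a lazy backend whose schema lookup is costly."""

    def __init__(self, schema):
        self._schema = dict(schema)
        self.schema_requests = 0  # counts how often the metadata is fetched

    def fetch_schema(self):
        # In a real backend this might be a round trip to cloud storage.
        self.schema_requests += 1
        return dict(self._schema)


class CachedSchemaFrame:
    """Hypothetical wrapper that looks the schema up once and reuses it."""

    def __init__(self, native):
        self._native = native

    @cached_property
    def schema(self):
        # Evaluated on first access only; later accesses reuse the stored dict.
        return self._native.fetch_schema()

    @property
    def columns(self):
        return list(self.schema)

    def nth(self, index):
        # Selecting by position needs the column names, hence the schema.
        return self.columns[index]


table = FakeRemoteTable({"a": "Int64", "b": "String"})
frame = CachedSchemaFrame(table)

first = frame.nth(0)
last = frame.nth(-1)
names = frame.columns
```

In this sketch, three schema-dependent calls trigger only one `fetch_schema` call on the underlying table. The cache is safe here because the wrapper never changes its underlying table; a design in which operations return new frames would need to compute or propagate the schema for each new frame.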