Commit cbc1d48

Merge pull request #126 from marimo-team/bigfix
marimo check cleanup
2 parents 8288699 + 3b7283c commit cbc1d48

70 files changed: +6611 -7774 lines changed
_server/README.md

Lines changed: 6 additions & 1 deletion
@@ -1,3 +1,8 @@
+---
+title: Readme
+marimo-version: 0.18.4
+---
+
 # marimo learn server
 
 This folder contains server code for hosting marimo apps.
@@ -21,4 +26,4 @@ docker build -t marimo-learn .
 
 ```bash
 docker run -p 7860:7860 marimo-learn
-```
+```

daft/01_what_makes_daft_special.py

Lines changed: 30 additions & 35 deletions
@@ -8,37 +8,33 @@
 
 import marimo
 
-__generated_with = "0.13.6"
+__generated_with = "0.18.4"
 app = marimo.App(width="medium")
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     # What Makes Daft Special?
 
     > _By [Péter Ferenc Gyarmati](http://github.com/peter-gy)_.
 
     Welcome to the course on [Daft](https://www.getdaft.io/), the distributed dataframe library! In this first chapter, we'll explore what Daft is and what makes it a noteworthy tool in the landscape of data processing. We'll look at its core design choices and how they aim to help you work with data more effectively, whether you're a data engineer, data scientist, or analyst.
-    """
-    )
+    """)
     return
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🎯 Introducing Daft: A Unified Data Engine
 
     Daft is a distributed query engine designed to handle a wide array of data tasks, from data engineering and analytics to powering ML/AI workflows. It provides both a Python DataFrame API, familiar to users of libraries like Pandas, and a SQL interface, allowing you to choose the interaction style that best suits your needs or the task at hand.
 
     The main goal of Daft is to provide a robust and versatile platform for processing data, whether it's gigabytes on your laptop or petabytes on a cluster.
 
     Let's go ahead and `pip install daft` to see it in action!
-    """
-    )
+    """)
     return
 
 
@@ -86,8 +82,7 @@ def _(mo):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🦀 Built with Rust: Performance and Simplicity
 
     One of Daft's key characteristics is that its core engine is written in Rust. This choice has several implications for users:
@@ -97,8 +92,7 @@ def _(mo):
     * **Simplified Developer Experience**: Rust-based systems typically require less configuration tuning compared to JVM-based systems. You don't need to worry about JVM heap sizes, garbage collection parameters, or managing Java dependencies.
 
     Daft also leverages [Apache Arrow](https://arrow.apache.org/) for its in-memory data format. This allows for efficient data exchange between Daft's Rust core and Python, often with zero-copy data sharing, further enhancing performance.
-    """
-    )
+    """)
     return
 
 
@@ -118,7 +112,9 @@ def _(mo):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(r"""A cornerstone of Daft's design is **lazy execution**. Imagine defining a DataFrame with a trillion rows on your laptop – usually not a great prospect for your device's memory!""")
+    mo.md(r"""
+    A cornerstone of Daft's design is **lazy execution**. Imagine defining a DataFrame with a trillion rows on your laptop – usually not a great prospect for your device's memory!
+    """)
     return
 
 
@@ -135,7 +131,9 @@ def _(daft):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(r"""With Daft, this is perfectly fine. Operations like `with_column` or `filter` don't compute results immediately. Instead, Daft builds a *logical plan* – a blueprint of the transformations you've defined. You can inspect this plan:""")
+    mo.md(r"""
+    With Daft, this is perfectly fine. Operations like `with_column` or `filter` don't compute results immediately. Instead, Daft builds a *logical plan* – a blueprint of the transformations you've defined. You can inspect this plan:
+    """)
     return
 
 
@@ -147,14 +145,15 @@ def _(mo, trillion_rows_df):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(r"""This plan is only executed (and data materialized) when you explicitly request it (e.g., with `.show()`, `.collect()`, or by writing to a file). Before execution, Daft's optimizer works to make your query run as efficiently as possible. This approach allows you to define complex operations on massive datasets without immediate computational cost or memory overflow.""")
+    mo.md(r"""
+    This plan is only executed (and data materialized) when you explicitly request it (e.g., with `.show()`, `.collect()`, or by writing to a file). Before execution, Daft's optimizer works to make your query run as efficiently as possible. This approach allows you to define complex operations on massive datasets without immediate computational cost or memory overflow.
+    """)
     return
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🌐 Scale Your Work: From Laptop to Cluster
 
     Daft is designed with scalability in mind. As the trillion-row dataframe example above illustrates, you can write your data processing logic using Daft's Python API, and this same code can run:
@@ -163,24 +162,21 @@ def _(mo):
     * **On a Cluster**: By integrating with [Ray](https://www.ray.io/), a framework for distributed computing. This allows Daft to scale out to process very large datasets across many machines.
 
     This "write once, scale anywhere" approach means you don't need to significantly refactor your code when moving from local development to large-scale distributed execution. We'll delve into distributed computing with Ray in a later chapter.
-    """
-    )
+    """)
     return
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🖼️ Handling More Than Just Tables: Multimodal Data Support
 
     Modern datasets often contain more than just numbers and text. They might include images, audio clips, URLs pointing to external files, tensor data from machine learning models, or complex nested structures like JSON.
 
     Daft is built to accommodate these **multimodal data types** as integral parts of a DataFrame. This means you can have columns containing image data, embeddings, or other complex Python objects, and Daft provides mechanisms to process them. This is particularly useful for ML/AI pipelines and advanced analytics where diverse data sources are common.
 
     As an example of how Daft simplifies working with such complex data, let's see how we can process image URLs. With just a few lines of Daft code, we can pull open data from the [National Gallery of Art](https://github.com/NationalGalleryOfArt/opendata), then directly fetch, decode, and even resize the images within our DataFrame:
-    """
-    )
+    """)
     return
 
 
@@ -217,20 +213,23 @@ def _(daft):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(r"""> Example inspired by the great post [Exploring Art with TypeScript, Jupyter, Polars, and Observable Plot](https://deno.com/blog/exploring-art-with-typescript-and-jupyter) published on Deno's blog.""")
+    mo.md(r"""
+    > Example inspired by the great post [Exploring Art with TypeScript, Jupyter, Polars, and Observable Plot](https://deno.com/blog/exploring-art-with-typescript-and-jupyter) published on Deno's blog.
+    """)
     return
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(r"""In later chapters, we'll explore in more detail how to work with these image objects and other complex types, including applying User-Defined Functions (UDFs) for custom processing. Until then, you can [take a look at a more complex example](https://blog.getdaft.io/p/we-cloned-over-15000-repos-to-find), in which Daft is used to clone over 15,000 GitHub repos to find the best developers.""")
+    mo.md(r"""
+    In later chapters, we'll explore in more detail how to work with these image objects and other complex types, including applying User-Defined Functions (UDFs) for custom processing. Until then, you can [take a look at a more complex example](https://blog.getdaft.io/p/we-cloned-over-15000-repos-to-find), in which Daft is used to clone over 15,000 GitHub repos to find the best developers.
+    """)
     return
 
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🧑‍💻 Designed for Developers: Python and SQL Interfaces
 
     Daft aims to be developer-friendly by offering flexible ways to interact with your data:
@@ -239,8 +238,7 @@ def _(mo):
     * **SQL Interface**: For those who prefer SQL or have existing SQL-based logic, Daft allows you to write queries using SQL syntax. Daft can execute SQL queries directly or even translate SQL expressions into its native expression system.
 
     This dual-interface approach allows developers to choose the most appropriate tool for their specific task or leverage existing skills.
-    """
-    )
+    """)
     return
 
 
@@ -285,8 +283,7 @@ def _(daft):
 
 @app.cell(hide_code=True)
 def _(mo):
-    mo.md(
-        r"""
+    mo.md(r"""
     ## 🟣 Daft's Value Proposition
 
     So, what makes Daft special? It's the combination of these design choices:
@@ -299,16 +296,14 @@ def _(mo):
     These elements combine to make Daft a versatile tool for tackling modern data challenges.
 
     And this is just scratching the surface. Daft is a growing data engine with an ambitious vision: to unify data engineering, analytics, and ML/AI workflows 🚀.
-    """
-    )
+    """)
     return
 
 
 @app.cell
 def _():
     import daft
     import marimo as mo
-
     return daft, mo
 
 
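Note on the lazy-execution cells reformatted in this file: the notebook text says that `with_column` and `filter` only extend a logical plan, and that work happens when `.show()` or `.collect()` is called. The sketch below is a toy model of that idea in plain Python, not Daft's implementation or API. The `LazyFrame` class and everything in it are hypothetical names chosen for illustration, and the real engine also optimizes the plan before running it, which this sketch does not attempt.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (hypothetical, not Daft)."""

    source: Callable[[], Iterable[dict]]  # called only when rows are requested
    steps: tuple = ()                     # recorded transformations, in order

    def with_column(self, name: str, fn: Callable[[dict], object]) -> "LazyFrame":
        # Record the step; nothing is computed here.
        return LazyFrame(self.source, self.steps + (("with_column", name, fn),))

    def filter(self, predicate: Callable[[dict], bool]) -> "LazyFrame":
        # Record the step; nothing is computed here.
        return LazyFrame(self.source, self.steps + (("filter", predicate),))

    def explain(self) -> str:
        # Describe the recorded plan without touching any data.
        lines = ["source"]
        for step in self.steps:
            if step[0] == "with_column":
                lines.append(f"with_column({step[1]})")
            else:
                lines.append("filter")
        return "\n".join(lines)

    def _rows(self) -> Iterator[dict]:
        # Stream rows through the recorded steps one at a time.
        for row in self.source():
            row = dict(row)
            keep = True
            for step in self.steps:
                if step[0] == "with_column":
                    row[step[1]] = step[2](row)
                elif not step[1](row):
                    keep = False
                    break
            if keep:
                yield row

    def show(self, n: int = 5) -> list:
        # Materialize only the first n surviving rows.
        return list(itertools.islice(self._rows(), n))


# A "trillion-row" source is harmless because it is a generator over a range.
trillion = LazyFrame(lambda: ({"id": i} for i in range(10**12)))
plan = (
    trillion
    .with_column("double", lambda r: r["id"] * 2)
    .filter(lambda r: r["id"] % 2 == 0)
)
first_rows = plan.show(3)
```

Building `plan` costs nothing beyond recording two steps; `plan.show(3)` pulls only as many source rows as it needs to produce three results.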
daft/README.md

Lines changed: 6 additions & 1 deletion
@@ -1,3 +1,8 @@
+---
+title: Readme
+marimo-version: 0.18.4
+---
+
 # Learn Daft
 
 _🚧 This collection is a work in progress. Please help us add notebooks!_
@@ -23,4 +28,4 @@ You can also open notebooks in our online playground by appending marimo.app/ to
 
 **Thanks to all our notebook authors!**
 
-* [Péter Gyarmati](https://github.com/peter-gy)
+* [Péter Gyarmati](https://github.com/peter-gy)
