Commit 96861b7

duckplyr 1.0.0
1 parent aee5583 commit 96861b7

1 file changed

+127
-0
lines changed

@@ -0,0 +1,127 @@
1+
---
2+
output: hugodown::hugo_document
3+
4+
slug: duckplyr-1-0-0
5+
title: duckplyr fully joins the tidyverse!
6+
date: 2025-02-11
7+
author: Kirill Müller and Maëlle Salmon
8+
description: >
9+
duckplyr 1.0.0 is on CRAN and part of the tidyverse! duckplyr is a drop-in
10+
replacement for dplyr, powered by DuckDB for speed.
11+
12+
photo:
13+
url: https://www.pexels.com/photo/a-mallard-duck-on-water-6918877/
14+
author: Kiril Gruev
15+
16+
# one of: "deep-dive", "learn", "package", "programming", "roundup", or "other"
17+
categories: [package]
18+
tags:
19+
- duckplyr
20+
- dplyr
21+
- tidyverse
22+
---
23+
24+
<!--
25+
TODO:
26+
* [x] Look over / edit the post's title in the yaml
27+
* [x] Edit (or delete) the description; note this appears in the Twitter card
28+
* [x] Pick category and tags (see existing with `hugodown::tidy_show_meta()`)
29+
* [x] Find photo & update yaml metadata
30+
* [ ] Create `thumbnail-sq.jpg`; height and width should be equal
31+
* [ ] Create `thumbnail-wd.jpg`; width should be >5x height
32+
* [ ] `hugodown::use_tidy_thumbnails()`
33+
* [x] Add intro sentence, e.g. the standard tagline for the package
34+
* [x] `usethis::use_tidy_thanks()`
35+
-->
36+
37+
We're very chuffed to announce the release of [duckplyr](https://duckplyr.tidyverse.org) 1.0.0.
38+
duckplyr is a drop-in replacement for dplyr, powered by DuckDB for speed.
39+
It joins the ranks of dplyr backends together with [dtplyr](https://dtplyr.tidyverse.org) and [dbplyr](https://dbplyr.tidyverse.org).
40+
41+
You can install it from CRAN with:
42+
43+
```{r, eval = FALSE}
44+
install.packages("duckplyr")
45+
```
46+
47+
In this article, we'll introduce you to the basic usage of duckplyr, show how it can help you handle large data, and explain how you can help improve the package.
48+
49+
50+
## A drop-in replacement for dplyr
51+
52+
The duckplyr package is a drop-in replacement for dplyr that uses DuckDB for speed.
53+
54+
First, data is read in using either conversion functions (for data already in memory) or ingestion functions (for data in files).
55+
Alternatively, calling `library(duckplyr)` overwrites dplyr methods, enabling duckplyr for the entire session no matter how data.frames are created.
56+
57+
```{r load}
58+
library(conflicted)
59+
library(duckplyr)
60+
conflict_prefer("filter", "dplyr", quiet = TRUE)
61+
```
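
As a minimal sketch of the conversion and ingestion routes described above, the chunk below uses `as_duckdb_tibble()` and `read_csv_duckdb()`. The toy data, the temporary CSV file, and the object names are illustrative assumptions, not part of the original post.

```{r}
library(duckplyr)

# Toy data for illustration only (a hypothetical two-column frame).
cars <- data.frame(cyl = mtcars$cyl, mpg = mtcars$mpg)

# Conversion: turn an in-memory data frame into a duckplyr frame.
cars_duck <- as_duckdb_tibble(cars)

# Ingestion: let DuckDB read a file directly (written here just for the demo).
path <- tempfile(fileext = ".csv")
write.csv(cars, path, row.names = FALSE)
cars_file <- read_csv_duckdb(path)

# Either object can then be used in an ordinary dplyr pipeline.
by_cyl <- cars_file |>
  summarize(.by = cyl, mean_mpg = mean(mpg)) |>
  arrange(cyl)
```

Both objects behave like regular data frames, so the rest of a pipeline should not need to know which route was used.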
62+
63+
Then, the data manipulation pipeline uses the exact same syntax as a dplyr pipeline.
64+
The duckplyr package performs the computation using DuckDB or, if a specific operation is not supported, falls back to dplyr.
65+
66+
67+
```{r}
68+
library("babynames")
69+
out <- babynames |>
70+
filter(n > 1000) |>
71+
summarize(
72+
.by = c(sex, year),
73+
babies_n = sum(n)
74+
) |>
75+
filter(sex == "F")
76+
```
77+
78+
Finally, the result can be materialized in memory, computed into a temporary table, or written to a file.
79+
80+
```{r}
81+
# to memory
82+
out
83+
84+
# to a file
85+
csv_file <- withr::local_tempfile()
86+
file.size(csv_file)
87+
compute_csv(out, csv_file)
88+
file.size(csv_file)
89+
```
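
The chunk above covers memory and files. As a rough sketch of the remaining option, computing temporarily, the dplyr generics `compute()` and `collect()` can be called on a duckplyr frame. The toy data and object names below are assumptions made for illustration.

```{r}
library(duckplyr)

# Hypothetical toy data, created directly as a duckplyr frame.
df <- duckdb_tibble(x = 1:5, g = c("a", "b", "a", "b", "a"))

res <- df |>
  summarize(.by = g, total = sum(x))

# Compute temporarily: the result is stored on the DuckDB side.
tmp <- compute(res)

# Bring the result into R's memory as a regular tibble.
mem <- collect(res)
```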
90+
91+
When duckplyr itself does not support specific functionality, it falls back to dplyr.
92+
For instance, row names are not supported yet:
93+
94+
```{r}
95+
mtcars |>
96+
summarize(
97+
.by = cyl,
98+
sd = sd(disp, na.rm = TRUE),
99+
disp = mean(disp, na.rm = TRUE)
100+
)
101+
```
102+
103+
Current limitations are documented in [`vignette("limits")`](https://duckplyr.tidyverse.org/articles/limits.html).
104+
You can change the verbosity of fallbacks; refer to [`duckplyr::fallback_sitrep()`](https://duckplyr.tidyverse.org/reference/fallback.html).
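
A small sketch of inspecting and changing fallback verbosity follows. It assumes that the `DUCKPLYR_FALLBACK_INFO` environment variable, described on the help page linked above, controls whether each fallback is reported; check that page for the authoritative list of settings.

```{r}
library(duckplyr)

# Assumption: setting DUCKPLYR_FALLBACK_INFO to TRUE turns on a message
# for every fallback to dplyr in this session.
Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE)

# Print the current fallback settings.
fallback_sitrep()

# mtcars has row names, so this pipeline is expected to fall back to dplyr.
by_cyl <- mtcars |>
  summarize(.by = cyl, mean_disp = mean(disp))
```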
105+
106+
107+
108+
## A handy tool for large data
109+
110+
## Help us improve duckplyr!
111+
112+
Our goals for future development of duckplyr include:
113+
114+
- Enabling users to provide custom translations of dplyr functionality;
115+
- Making it easier to contribute code to duckplyr.
116+
117+
You can already help though, in three main ways:
118+
119+
- Please report any issues, especially regarding unknown incompatibilities. See [`vignette("limits")`](https://duckplyr.tidyverse.org/articles/limits.html).
120+
- Contribute to the codebase after reading duckplyr's contributing guide.
121+
- Turn on telemetry to help us hear about the most frequent fallbacks so we can prioritize working on the corresponding missing dplyr translation. See [`vignette("telemetry")`](https://duckplyr.tidyverse.org/articles/telemetry.html) and the [`duckplyr::fallback_sitrep()`](https://duckplyr.tidyverse.org/reference/fallback.html) function.
122+
123+
## Acknowledgements
124+
125+
A big thanks to all 54 folks who filed issues, created PRs and generally helped to improve duckplyr!
126+
127+
[&#x0040;adamschwing](https://github.com/adamschwing), [&#x0040;andreranza](https://github.com/andreranza), [&#x0040;apalacio9502](https://github.com/apalacio9502), [&#x0040;apsteinmetz](https://github.com/apsteinmetz), [&#x0040;barracuda156](https://github.com/barracuda156), [&#x0040;beniaminogreen](https://github.com/beniaminogreen), [&#x0040;bob-rietveld](https://github.com/bob-rietveld), [&#x0040;brichards920](https://github.com/brichards920), [&#x0040;cboettig](https://github.com/cboettig), [&#x0040;davidjayjackson](https://github.com/davidjayjackson), [&#x0040;DavisVaughan](https://github.com/DavisVaughan), [&#x0040;Ed2uiz](https://github.com/Ed2uiz), [&#x0040;eitsupi](https://github.com/eitsupi), [&#x0040;era127](https://github.com/era127), [&#x0040;etiennebacher](https://github.com/etiennebacher), [&#x0040;eutwt](https://github.com/eutwt), [&#x0040;fmichonneau](https://github.com/fmichonneau), [&#x0040;github-actions[bot]](https://github.com/github-actions[bot]), [&#x0040;hadley](https://github.com/hadley), [&#x0040;hannes](https://github.com/hannes), [&#x0040;hawkfish](https://github.com/hawkfish), [&#x0040;IndrajeetPatil](https://github.com/IndrajeetPatil), [&#x0040;JanSulavik](https://github.com/JanSulavik), [&#x0040;JavOrraca](https://github.com/JavOrraca), [&#x0040;jeroen](https://github.com/jeroen), [&#x0040;jhk0530](https://github.com/jhk0530), [&#x0040;joakimlinde](https://github.com/joakimlinde), [&#x0040;JosiahParry](https://github.com/JosiahParry), [&#x0040;krlmlr](https://github.com/krlmlr), [&#x0040;larry77](https://github.com/larry77), [&#x0040;lnkuiper](https://github.com/lnkuiper), [&#x0040;lorenzwalthert](https://github.com/lorenzwalthert), [&#x0040;luisDVA](https://github.com/luisDVA), [&#x0040;maelle](https://github.com/maelle), [&#x0040;math-mcshane](https://github.com/math-mcshane), [&#x0040;meersel](https://github.com/meersel), [&#x0040;multimeric](https://github.com/multimeric), [&#x0040;mytarmail](https://github.com/mytarmail), 
[&#x0040;nicki-dese](https://github.com/nicki-dese), [&#x0040;PMassicotte](https://github.com/PMassicotte), [&#x0040;prasundutta87](https://github.com/prasundutta87), [&#x0040;rafapereirabr](https://github.com/rafapereirabr), [&#x0040;Robinlovelace](https://github.com/Robinlovelace), [&#x0040;romainfrancois](https://github.com/romainfrancois), [&#x0040;sparrow925](https://github.com/sparrow925), [&#x0040;stefanlinner](https://github.com/stefanlinner), [&#x0040;thomasp85](https://github.com/thomasp85), [&#x0040;TimTaylor](https://github.com/TimTaylor), [&#x0040;Tmonster](https://github.com/Tmonster), [&#x0040;toppyy](https://github.com/toppyy), [&#x0040;wibeasley](https://github.com/wibeasley), [&#x0040;yjunechoe](https://github.com/yjunechoe), [&#x0040;ywhcuhk](https://github.com/ywhcuhk), and [&#x0040;zhjx19](https://github.com/zhjx19).
