Duckdb equivalent to dplyr's separate() or separate_wider_delim()? #1480

adamschwing · 2024-11-06T19:50:40Z

Hello!

I would like to take a comma-separated string and put each element in its own row. This is easy to do in dplyr using the separate() or separate_wider_delim() functions. However, my dataset is very large: each string has thousands of elements, and the dataset contains thousands of these strings across many columns and rows, so doing this separation purely in dplyr is impractical.

Is there an equivalent function in duckdb-r or duckplyr for this?

nbc · 2024-11-21T17:01:49Z

Something like this?

library(duckdb)
#> Loading required package: DBI

con <- dbConnect(duckdb())

cat(readr::read_file('/tmp/split.csv'))
#> str1;str2
#> string;a1,a2,a3
#> string;a4,a5,a6

dbGetQuery(con, "SELECT str1, str_split(str2, ',').UNNEST() FROM read_csv('/tmp/split.csv', delim=';')")
#>     str1 unnest(str_split(str2, ','))
#> 1 string                           a1
#> 2 string                           a2
#> 3 string                           a3
#> 4 string                           a4
#> 5 string                           a5
#> 6 string                           a6

Created on 2024-11-21 with reprex v2.1.0
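
The same query can also be run against a data frame that is already in the R session, without reading a CSV first. The following is a minimal sketch, not a definitive recipe: the data frame `df`, the table name `split_demo`, and the column alias `element` are hypothetical names chosen for illustration, and the data simply mirrors the two rows of `/tmp/split.csv` above.

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())

# Hypothetical example data mirroring the two rows of /tmp/split.csv
df <- data.frame(
  str1 = c("string", "string"),
  str2 = c("a1,a2,a3", "a4,a5,a6")
)

# Copy the data frame into DuckDB under the (hypothetical) name split_demo
dbWriteTable(con, "split_demo", df)

# str_split() turns each string into a list; unnest() emits one row per
# list element and repeats str1 alongside it. The alias gives the result
# column a short name instead of the full expression.
res <- dbGetQuery(con, "
  SELECT str1, unnest(str_split(str2, ',')) AS element
  FROM split_demo
")

dbDisconnect(con)

print(res)
```

Aliasing the unnested column makes the result easier to reference in later dplyr or SQL steps. For very large inputs it may be preferable to keep the result inside DuckDB (for example with `CREATE TABLE ... AS SELECT ...`) rather than pulling every row back into R with `dbGetQuery()`.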
