A Rust-based transpiler that converts R dplyr syntax to SQL queries.
libdplyr enables R users to write database queries using familiar dplyr syntax and converts them to efficient SQL for execution. It supports multiple SQL dialects (PostgreSQL, MySQL, SQLite, DuckDB) for use across various database environments.
- dplyr Syntax Support: Full support for `select()`, `filter()`, `mutate()`, `arrange()`, `group_by()`, `summarise()`
- Pipeline Operations: Chain operations using the `%>%` pipe operator
- Multiple Dialects: PostgreSQL, MySQL, SQLite, DuckDB
- Performance: High-performance Rust implementation
- Dual Mode: Use as a Rust library or standalone CLI tool
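Identifier quoting is one of the most visible ways the supported dialects differ. The sketch below is not libdplyr's implementation: the `Dialect` enum and `quote_ident` function are hypothetical names, used only to illustrate the kind of difference a multi-dialect transpiler has to account for.

```rust
/// Hypothetical dialect enum for illustration; not part of the libdplyr API.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Dialect {
    PostgreSql,
    MySql,
    Sqlite,
    DuckDb,
}

/// Quote an identifier the way each database conventionally expects.
/// MySQL defaults to backticks; the other three accept standard double quotes.
/// An embedded quote character is escaped by doubling it.
pub fn quote_ident(dialect: Dialect, name: &str) -> String {
    let q = match dialect {
        Dialect::MySql => '`',
        Dialect::PostgreSql | Dialect::Sqlite | Dialect::DuckDb => '"',
    };
    let doubled = format!("{}{}", q, q);
    let escaped = name.replace(q, &doubled);
    format!("{}{}{}", q, escaped, q)
}
```

Under these assumptions, the same column `name` would be written as `"name"` for PostgreSQL and with backticks for MySQL.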
Linux/macOS:
curl -sSL https://raw.githubusercontent.com/mrchypark/libdplyr/main/install.sh | bash
Windows (PowerShell):
Irm https://raw.githubusercontent.com/mrchypark/libdplyr/main/install.ps1 | iex
Try it out:
echo "select(name, age) %>% filter(age > 18)" | libdplyr --pretty
Note: For detailed installation options, troubleshooting, and platform support, see the Installation Guide.
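The same stdin-to-stdout pipeline can be driven from a Rust program using only the standard library. This is a minimal sketch, assuming the `libdplyr` binary is installed and on `PATH`, and that `--dialect` and `--pretty` can be combined in one invocation (the README shows each flag on its own). `build_args` and `transpile_via_cli` are hypothetical helper names, not part of libdplyr.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Build CLI arguments from flags shown in this README (`--dialect`, `--pretty`).
pub fn build_args(dialect: &str, pretty: bool) -> Vec<String> {
    let mut args = vec!["--dialect".to_string(), dialect.to_string()];
    if pretty {
        args.push("--pretty".to_string());
    }
    args
}

/// Pipe `query` into the `libdplyr` binary and return whatever it prints to stdout.
/// Assumption: `libdplyr` is installed and discoverable on PATH.
pub fn transpile_via_cli(query: &str, dialect: &str) -> std::io::Result<String> {
    let mut child = Command::new("libdplyr")
        .args(build_args(dialect, true))
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Write the query; the stdin handle is dropped at the end of this
    // statement, so the child process sees end-of-input.
    child
        .stdin
        .take()
        .expect("stdin was configured as piped")
        .write_all(query.as_bytes())?;

    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

The sketch ignores the exit status for brevity; a caller that needs to distinguish failures would also inspect `output.status`.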
The most efficient way to use libdplyr is through stdin/stdout pipelines:
# Basic usage
echo "select(name, age) %>% filter(age > 18)" | libdplyr
# Specify dialect (postgres, mysql, sqlite, duckdb)
echo "select(name)" | libdplyr --dialect mysql
# Output formatting
echo "select(name)" | libdplyr --pretty
echo "select(name)" | libdplyr --json
echo "select(name)" | libdplyr --compact
As a Rust library:
use libdplyr::{Transpiler, PostgreSqlDialect};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a transpiler that targets PostgreSQL.
    let transpiler = Transpiler::new(Box::new(PostgreSqlDialect::new()));
    let dplyr_code = "select(name, age) %>% filter(age > 18)";
    // Convert the dplyr pipeline to SQL; `?` propagates any transpile error.
    let sql = transpiler.transpile(dplyr_code)?;
    println!("{}", sql);
    Ok(())
}
libdplyr supports a wide range of dplyr verbs and R functions.
| Function | Description | Example |
|---|---|---|
| `select()` | Select/rename columns | `select(id, name)` |
| `filter()` | Filter rows | `filter(age > 18)` |
| `mutate()` | Create/modify columns | `mutate(total = price * qty)` |
| `rename()` | Rename columns | `rename(new = old)` |
| `arrange()` | Sort rows | `arrange(desc(date))` |
| `group_by()` | Group rows | `group_by(dept)` |
| `summarise()` | Aggregate data | `summarise(avg = mean(val))` |
| `*_join()` | Joins (inner, left, etc.) | `left_join(other, by="id")` |
| Set Ops | `union`, `intersect`, `setdiff` | `union(other)` |
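Each verb in the table conventionally corresponds to one SQL clause. The sketch below outlines a pipeline by splitting it on `%>%` and labelling each stage with that clause. It reflects common dplyr-to-SQL practice rather than libdplyr's internals; `clause_for` and `outline` are hypothetical names, and the mapping is simplified (for instance, a `filter()` after `summarise()` would normally become `HAVING`).

```rust
/// SQL clause that each dplyr verb conventionally corresponds to.
/// Simplified for illustration; not taken from libdplyr's source.
pub fn clause_for(verb: &str) -> Option<&'static str> {
    match verb {
        "select" | "rename" | "mutate" => Some("SELECT"),
        "filter" => Some("WHERE"),
        "arrange" => Some("ORDER BY"),
        "group_by" => Some("GROUP BY"),
        "summarise" => Some("SELECT (aggregates)"),
        "union" => Some("UNION"),
        "intersect" => Some("INTERSECT"),
        "setdiff" => Some("EXCEPT"),
        v if v.ends_with("_join") => Some("JOIN"),
        _ => None,
    }
}

/// Naively split a pipeline on `%>%` and pair each verb with its clause.
/// A real parser would tokenize first, so that `%>%` inside a string
/// literal is not mistaken for a pipe.
pub fn outline(pipeline: &str) -> Vec<(String, Option<&'static str>)> {
    pipeline
        .split("%>%")
        .map(str::trim)
        .filter(|stage| !stage.is_empty())
        .map(|stage| {
            // The verb is everything before the first opening parenthesis.
            let verb = stage.split('(').next().unwrap_or("").trim();
            (verb.to_string(), clause_for(verb))
        })
        .collect()
}
```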
- Aggregation: `mean`, `sum`, `min`, `max`, `n`, `count`, `median`, `mode`
- Window: `row_number`, `rank`, `lead`, `lag`, `ntile`
- Math: `abs`, `sqrt`, `round`, `floor`, `log`, `exp`
- String: `tolower`, `toupper`, `substr`, `trimws`
- Logic: `ifelse`, `is.na`, `coalesce`
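Several of these R functions have a different name in SQL. The lookup below lists the conventional SQL spelling for some of them, as a rough guide to what to expect in generated queries. It is an assumption based on standard SQL, not a dump of libdplyr's translation table; `sql_equivalent` is a hypothetical name, and functions whose translation varies widely by dialect (such as `median` and `mode`) are left out.

```rust
/// Conventional SQL counterpart for some of the R functions listed above.
/// Standard SQL spellings only; actual output may differ by dialect.
pub fn sql_equivalent(r_function: &str) -> Option<&'static str> {
    match r_function {
        "mean" => Some("AVG"),
        "sum" => Some("SUM"),
        "min" => Some("MIN"),
        "max" => Some("MAX"),
        "n" => Some("COUNT(*)"),
        "row_number" => Some("ROW_NUMBER"),
        "lead" => Some("LEAD"),
        "lag" => Some("LAG"),
        "tolower" => Some("LOWER"),
        "toupper" => Some("UPPER"),
        "trimws" => Some("TRIM"),
        "is.na" => Some("IS NULL"),
        "ifelse" => Some("CASE WHEN"),
        "coalesce" => Some("COALESCE"),
        // Anything else is left unmapped rather than guessed.
        _ => None,
    }
}
```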
let transpiler = Transpiler::new(Box::new(PostgreSqlDialect::new()));
let sql = transpiler.transpile("select(name) %>% filter(age > 18)")?;
// SELECT "name" FROM "data" WHERE "age" > 18
let code = r#"
select(dept, salary) %>%
filter(salary > 50000) %>%
group_by(dept) %>%
summarise(avg_sal = mean(salary)) %>%
arrange(desc(avg_sal))
"#;
let sql = transpiler.transpile(code)?;
libdplyr reports the outcome of a run through its exit code:
- Exit 0: Success
- Exit 4: Validation Error (syntax issues)
- Exit 5: Transpilation Error (generation failed)
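A wrapper script or service can turn these exit codes into readable diagnostics. The sketch below covers only the three codes listed above and reports anything else as undocumented rather than guessing its meaning; `describe_exit_code` and `is_input_problem` are hypothetical helper names.

```rust
/// Describe the exit codes documented in this README.
/// Codes not listed there are reported as undocumented rather than guessed.
pub fn describe_exit_code(code: i32) -> &'static str {
    match code {
        0 => "success",
        4 => "validation error (syntax issues)",
        5 => "transpilation error (generation failed)",
        _ => "undocumented exit code",
    }
}

/// True when the failure points at the dplyr input itself (exit 4)
/// rather than at SQL generation (exit 5).
pub fn is_input_problem(code: i32) -> bool {
    code == 4
}
```

When the CLI is launched through `std::process::Command`, the code comes from `ExitStatus::code()`, which returns `Option<i32>` and is `None` if the process was terminated by a signal on Unix.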
Debug Mode:
libdplyr --verbose --debug
For more troubleshooting details, see INSTALL.md.
libdplyr is optimized for speed using Rust's zero-cost abstractions. Run benchmarks:
cargo bench
# Setup
git clone https://github.com/mrchypark/libdplyr.git
cd libdplyr
cargo build
# Test
cargo test
- Documentation: docs.rs/libdplyr
- Installation: INSTALL.md
- Issues: GitHub Issues
MIT License. See LICENSE.