|
2 | 2 |
|
3 | 3 |  |
4 | 4 |
|
5 | | -Serving any [datafusion](https://datafusion.apache.org) `SessionContext` with full PostgreSQL compatibility, including authentication, role-based access control, and SSL/TLS encryption. Available as a library and a CLI tool. |
6 | | - |
7 | | -This project adds a comprehensive [PostgreSQL compatible access layer](https://github.com/sunng87/pgwire) to the [Apache DataFusion](https://github.com/apache/arrow-datafusion) query engine, making it a drop-in replacement for PostgreSQL in analytics workloads. |
| 5 | +A PostgreSQL-compatible server for [Apache DataFusion](https://datafusion.apache.org), supporting authentication, role-based access control, and SSL/TLS encryption. Available as both a library and a CLI tool. |
8 | 6 |
|
| 7 | +Built on [pgwire](https://github.com/sunng87/pgwire) to provide PostgreSQL wire protocol compatibility for analytical workloads. |
9 | 8 | It was originally an example of the [pgwire](https://github.com/sunng87/pgwire) |
10 | 9 | project. |
11 | 10 |
|
12 | 11 | ## ✨ Key Features |
13 | 12 |
|
14 | 13 | - 🔌 **Full PostgreSQL Wire Protocol** - Compatible with all PostgreSQL clients and drivers |
15 | | -- 🛡️ **Enterprise Security** - Authentication, RBAC, and SSL/TLS encryption |
| 14 | +- 🛡️ **Security Features** - Authentication, RBAC, and SSL/TLS encryption |
16 | 15 | - 🏗️ **Complete System Catalogs** - Real `pg_catalog` tables with accurate metadata |
17 | 16 | - 📊 **Advanced Data Types** - Comprehensive Arrow ↔ PostgreSQL type mapping |
18 | | -- 🔄 **Transaction Support** - Full ACID transaction lifecycle (BEGIN/COMMIT/ROLLBACK) |
| 17 | +- 🔄 **Transaction Support** - ACID transaction lifecycle (BEGIN/COMMIT/ROLLBACK) |
19 | 18 | - ⚡ **High Performance** - Apache DataFusion's columnar query execution |
20 | 19 |
|
21 | | -## 🎯 Roadmap & Status |
22 | | - |
23 | | -- [x] **Core Features** |
24 | | - - [x] datafusion-postgres as a CLI tool |
25 | | - - [x] datafusion-postgres as a library |
26 | | - - [x] datafusion information schema |
27 | | - - [x] Complete `pg_catalog` system tables (pg_type, pg_attribute, pg_proc, pg_class, etc.) |
28 | | - - [x] Comprehensive Arrow ↔ PostgreSQL data type mapping |
29 | | - - [x] Essential PostgreSQL functions (version(), current_schema(), has_table_privilege(), etc.) |
30 | | - |
31 | | -- [x] **Security & Authentication** 🆕 |
32 | | - - [x] User authentication and management |
33 | | - - [x] Role-based access control (RBAC) |
34 | | - - [x] Granular permissions (SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, etc.) |
35 | | - - [x] Role inheritance and grant management |
36 | | - - [x] SSL/TLS connection encryption |
37 | | - - [x] Query-level permission checking |
38 | | - |
39 | | -- [x] **Transaction Support** 🆕 |
40 | | - - [x] Full ACID transaction lifecycle |
41 | | - - [x] BEGIN/COMMIT/ROLLBACK with all variants |
42 | | - - [x] Failed transaction handling and recovery |
43 | | - - [x] Transaction state management |
44 | | - |
45 | | -- [ ] **Future Enhancements** |
46 | | - - [ ] Connection pooling and performance optimizations |
47 | | - - [ ] Advanced authentication methods (LDAP, certificates) |
48 | | - - [ ] More PostgreSQL functions and operators |
49 | | - - [ ] COPY protocol for bulk data loading |
50 | | - |
51 | | -## 🔐 Authentication & Security |
| 20 | +## 🎯 Features |
52 | 21 |
|
53 | | -datafusion-postgres supports enterprise-grade authentication through pgwire's standard mechanisms: |
| 22 | +### Core Functionality |
| 23 | +- ✅ Library and CLI tool |
| 24 | +- ✅ PostgreSQL wire protocol compatibility |
| 25 | +- ✅ Complete `pg_catalog` system tables |
| 26 | +- ✅ Arrow ↔ PostgreSQL data type mapping |
| 27 | +- ✅ PostgreSQL functions (version, current_schema, has_table_privilege, etc.) |
54 | 28 |
|
55 | | -### Production Authentication Setup |
| 29 | +### Security & Authentication |
| 30 | +- ✅ User authentication and RBAC |
| 31 | +- ✅ Granular permissions (SELECT, INSERT, UPDATE, DELETE, CREATE, DROP) |
| 32 | +- ✅ Role inheritance and grant management |
| 33 | +- ✅ SSL/TLS encryption |
| 34 | +- ✅ Query-level permission checking |
56 | 35 |
|
57 | | -Proper pgwire authentication: |
| 36 | +### Transaction Support |
| 37 | +- ✅ ACID transaction lifecycle |
| 38 | +- ✅ BEGIN/COMMIT/ROLLBACK with all variants |
| 39 | +- ✅ Failed transaction handling and recovery |
58 | 40 |
|
59 | | -```rust |
60 | | -use pgwire::api::auth::cleartext::CleartextStartupHandler; |
61 | | -use datafusion_postgres::auth::{DfAuthSource, AuthManager}; |
62 | | - |
63 | | -// Setup authentication |
64 | | -let auth_manager = Arc::new(AuthManager::new()); |
65 | | -let auth_source = Arc::new(DfAuthSource::new(auth_manager)); |
66 | | - |
67 | | -// Choose authentication method: |
68 | | -// 1. Cleartext (simple) |
69 | | -let authenticator = CleartextStartupHandler::new( |
70 | | - auth_source, |
71 | | - Arc::new(DefaultServerParameterProvider::default()) |
72 | | -); |
| 41 | +### Future Enhancements |
| 42 | +- ⏳ Connection pooling optimizations |
| 43 | +- ⏳ Advanced authentication (LDAP, certificates) |
| 44 | +- ⏳ COPY protocol for bulk data loading |
73 | 45 |
|
74 | | -// 2. MD5 (recommended) |
75 | | -// let authenticator = MD5StartupHandler::new(auth_source, params); |
| 46 | +## 🔐 Authentication |
76 | 47 |
|
77 | | -// 3. SCRAM (enterprise - requires "server-api-scram" feature) |
78 | | -// let authenticator = SASLScramAuthStartupHandler::new(auth_source, params); |
79 | | -``` |
| 48 | +Supports standard pgwire authentication methods: |
80 | 49 |
|
81 | | -### User Management |
| 50 | +- **Cleartext**: `CleartextStartupHandler` for simple password authentication |
| 51 | +- **MD5**: `MD5StartupHandler` for MD5-hashed passwords |
| 52 | +- **SCRAM**: `SASLScramAuthStartupHandler` for secure authentication |
82 | 53 |
|
83 | | -```rust |
84 | | -// Add users to the RBAC system |
85 | | -auth_manager.add_user("admin", "secure_password", vec!["dbadmin".to_string()]).await; |
86 | | -auth_manager.add_user("analyst", "password123", vec!["readonly".to_string()]).await; |
87 | | -``` |
| 54 | +See `auth.rs` for complete implementation examples using `DfAuthSource`. |
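
To make the role inheritance and query-level permission checks listed above more concrete, here is a minimal, standard-library-only sketch of how such a check could work. It is not the crate's actual API: `Permission`, `Role`, `Rbac`, `add_role`, `has_permission`, and `demo_rbac` are all hypothetical names chosen for illustration, and the real implementation in `auth.rs` may be structured differently.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical permission kinds, mirroring the list in this README.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Permission {
    Select,
    Insert,
    Update,
    Delete,
    Create,
    Drop,
}

/// Hypothetical role: its direct grants plus the roles it inherits from.
#[derive(Default)]
struct Role {
    grants: HashSet<Permission>,
    inherits: Vec<String>,
}

/// Hypothetical in-memory role registry (an assumption, not the crate's type).
#[derive(Default)]
struct Rbac {
    roles: HashMap<String, Role>,
}

impl Rbac {
    /// Register a role with its direct grants and parent roles.
    fn add_role(&mut self, name: &str, grants: &[Permission], inherits: &[&str]) {
        let role = Role {
            grants: grants.iter().copied().collect(),
            inherits: inherits.iter().map(|s| s.to_string()).collect(),
        };
        self.roles.insert(name.to_string(), role);
    }

    /// Return true if `role`, or any role it transitively inherits from,
    /// holds `perm`. The `seen` set guards against inheritance cycles.
    fn has_permission(&self, role: &str, perm: Permission) -> bool {
        let mut seen: HashSet<String> = HashSet::new();
        let mut stack = vec![role.to_string()];
        while let Some(name) = stack.pop() {
            if !seen.insert(name.clone()) {
                continue; // already visited
            }
            if let Some(r) = self.roles.get(&name) {
                if r.grants.contains(&perm) {
                    return true;
                }
                stack.extend(r.inherits.iter().cloned());
            }
        }
        false
    }
}

/// Build a small example: `dbadmin` inherits from `readonly`.
fn demo_rbac() -> Rbac {
    let mut rbac = Rbac::default();
    rbac.add_role("readonly", &[Permission::Select], &[]);
    rbac.add_role(
        "dbadmin",
        &[
            Permission::Insert,
            Permission::Update,
            Permission::Delete,
            Permission::Create,
            Permission::Drop,
        ],
        &["readonly"],
    );
    rbac
}
```

In this sketch, `demo_rbac().has_permission("dbadmin", Permission::Select)` succeeds only through inheritance from `readonly`, while an unknown role is denied by default. A server would run a check like this per statement, before handing the query to DataFusion.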
88 | 55 |
|
89 | 56 | ## 🚀 Quick Start |
90 | 57 |
|
@@ -129,12 +96,11 @@ serve(session_context, &server_options).await |
129 | 96 |
|
130 | 97 | ### The CLI `datafusion-postgres-cli` |
131 | 98 |
|
132 | | -As a command-line application, this tool serves any JSON/CSV/Arrow/Parquet/Avro |
133 | | -files as tables, and exposes them via PostgreSQL compatible protocol with full security features. |
| 99 | +A command-line tool that serves JSON/CSV/Arrow/Parquet/Avro files as tables over a PostgreSQL-compatible protocol. |
134 | 100 |
|
135 | 101 | ``` |
136 | 102 | datafusion-postgres-cli 0.6.1 |
137 | | -A secure postgres interface for datafusion. Serve any CSV/JSON/Arrow/Parquet files as tables. |
| 103 | +A PostgreSQL interface for DataFusion. Serve CSV/JSON/Arrow/Parquet files as tables. |
138 | 104 |
|
139 | 105 | USAGE: |
140 | 106 | datafusion-postgres-cli [OPTIONS] |
@@ -187,12 +153,7 @@ Listening on 127.0.0.1:5432 (unencrypted) |
187 | 153 |
|
188 | 154 | ### Connect with psql |
189 | 155 |
|
190 | | -> **🔐 PRODUCTION AUTHENTICATION**: For production deployments, implement proper authentication by using `DfAuthSource` with pgwire's standard authentication handlers: |
191 | | -> - **Cleartext**: `CleartextStartupHandler` for simple password auth |
192 | | -> - **MD5**: `MD5StartupHandler` for MD5-hashed passwords |
193 | | -> - **SCRAM**: `SASLScramAuthStartupHandler` for enterprise-grade security |
194 | | -> |
195 | | -> See `auth.rs` for complete implementation examples. The default setup is for development only. |
| 156 | +> **🔐 Authentication**: The default setup allows connections without authentication for development. For secure deployments, use `DfAuthSource` with standard pgwire authentication handlers (cleartext, MD5, or SCRAM). See `auth.rs` for implementation examples. |
196 | 157 |
|
197 | 158 | ```bash |
198 | 159 | psql -h 127.0.0.1 -p 5432 -U postgres |
|