Commit 12be47e

feat: s3 configurations and examining table instructions added to delta-rs docs added
1 parent 962cf28 commit 12be47e

1 file changed, +58 -30 lines changed
docs/data_engineering/data_lakehouse/delta_lake/delta_rs.md

Lines changed: 58 additions & 30 deletions
@@ -1,5 +1,27 @@
 # delta-rs
 
+## S3 configurations
+
+Every time you want to use S3 as a data store, you need to add the following:
+
+```rust
+// Register S3 handlers
+deltalake::aws::register_handlers(None);
+
+// Set S3 configuration options
+let mut storage_options = HashMap::new();
+storage_options.insert("AWS_ENDPOINT_URL".to_string(), "http://localhost:5561".to_string());
+storage_options.insert("AWS_REGION".to_string(), "us-east-1".to_string());
+storage_options.insert("AWS_ACCESS_KEY_ID".to_string(), "admin".to_string());
+storage_options.insert("AWS_SECRET_ACCESS_KEY".to_string(), "password".to_string());
+storage_options.insert("AWS_ALLOW_HTTP".to_string(), "true".to_string());
+storage_options.insert("AWS_S3_ALLOW_UNSAFE_RENAME".to_string(), "true".to_string());
+```
+
+The S3 configuration options can also be set as environment variables.
+
+S3 requires a locking provider by default ([more information](https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/)). If you don't want to use a locking provider, you can disable it by setting the `AWS_S3_ALLOW_UNSAFE_RENAME` variable to `true`.
+
 ## Create table
 
 ```rust
@@ -11,14 +33,33 @@ async fn main() {
 
     let table = delta_ops
         .create()
-        .with_table_name("table_01")
+        .with_table_name("employee")
         .with_column("id", DataType::INTEGER, false, Default::default())
         .with_column("name", DataType::STRING, false, Default::default())
         .await
         .expect("Table creation failed");
 }
 ```
 
+To save the table in S3 storage, you need to configure the `DeltaOps` object differently:
+
+```rust
+use deltalake::{DeltaOps, DeltaTableBuilder};
+
+#[tokio::main()]
+async fn main() {
+    // ...
+
+    let delta_table_builder = DeltaTableBuilder::from_uri("s3://data-lakehouse/employee")
+        .with_storage_options(storage_options)
+        .build()
+        .expect("Connection to s3 failed");
+    let delta_ops = DeltaOps::from(delta_table_builder);
+
+    // ...
+}
+```
+
 ## Insert data
 
 ```rust
@@ -78,7 +119,9 @@ Open table:
 async fn main() {
     // ...
 
-    let table = deltalake::open_table("s3://data-lakehouse/employee").await.expect("Load failed");
+    let table = deltalake::open_table_with_storage_options("s3://data-lakehouse/employee", storage_options)
+        .await
+        .expect("Load failed");
 }
 ```
 
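The hunks above pass a `storage_options` map to both `DeltaTableBuilder::with_storage_options` and `open_table_with_storage_options`. A minimal sketch, assuming only the standard library, of building that map in one place; the helper name `s3_storage_options` is hypothetical and not part of delta-rs, and the values are the local test settings from this commit:

```rust
use std::collections::HashMap;

/// Hypothetical helper (not a delta-rs API): returns the S3 storage options
/// used in this commit for a local S3-compatible endpoint.
pub fn s3_storage_options() -> HashMap<String, String> {
    [
        ("AWS_ENDPOINT_URL", "http://localhost:5561"),
        ("AWS_REGION", "us-east-1"),
        ("AWS_ACCESS_KEY_ID", "admin"),
        ("AWS_SECRET_ACCESS_KEY", "password"),
        // Plain HTTP is allowed because the endpoint above is local.
        ("AWS_ALLOW_HTTP", "true"),
        // Disables the locking-provider requirement described in the docs.
        ("AWS_S3_ALLOW_UNSAFE_RENAME", "true"),
    ]
    .into_iter()
    .map(|(key, value)| (key.to_string(), value.to_string()))
    .collect()
}
```

The returned map can be passed wherever the diff uses `storage_options`, for example `open_table_with_storage_options("s3://data-lakehouse/employee", s3_storage_options())`. The sketch also shows the `std::collections::HashMap` import, which the added S3 configuration snippet uses without importing.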
@@ -99,34 +142,6 @@ async fn main() {
 }
 ```
 
-### S3 storage
-
-```rust
-use std::collections::HashMap;
-
-#[tokio::main()]
-async fn main() {
-    // Register AWS S3 handlers for Delta Lake operations
-    deltalake::aws::register_handlers(None);
-
-    let mut storage_options = HashMap::new();
-    storage_options.insert("AWS_ENDPOINT_URL".to_string(), "http://localhost:5561".to_string());
-    storage_options.insert("AWS_REGION".to_string(), "us-east-1".to_string());
-    storage_options.insert("AWS_ACCESS_KEY_ID".to_string(), "admin".to_string());
-    storage_options.insert("AWS_SECRET_ACCESS_KEY".to_string(), "password".to_string());
-    storage_options.insert("AWS_ALLOW_HTTP".to_string(), "true".to_string());
-    storage_options.insert("AWS_S3_ALLOW_UNSAFE_RENAME".to_string(), "true".to_string());
-
-    let table = deltalake::open_table_with_storage_options("s3://data-lakehouse/employee", storage_options)
-        .await
-        .expect("Load failed");
-}
-```
-
-You can set the storage option parameters as environment variables too.
-
-S3 requires a locking provider by default ([more information](https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/)). If you don't want to use a locking provider, you can disable it by setting the `AWS_S3_ALLOW_UNSAFE_RENAME` variable to `true`.
-
 ## Time travel
 
 To load the previous state of a table, you can use the `open_table_with_version` function:
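The docs in this commit note that the S3 options can also be set as environment variables, and that setting `AWS_S3_ALLOW_UNSAFE_RENAME` to `true` disables the locking-provider requirement. A rough sketch, assuming only the standard library, of collecting the six keys used in this commit from the environment and checking that flag; `s3_options_from`, `s3_options_from_env`, and `unsafe_rename_enabled` are hypothetical helper names, not delta-rs APIs, and the case-insensitive `"true"` check is an assumption rather than a description of how delta-rs parses the value:

```rust
use std::collections::HashMap;
use std::env;

/// The option keys that appear in this commit.
const S3_OPTION_KEYS: [&str; 6] = [
    "AWS_ENDPOINT_URL",
    "AWS_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_ALLOW_HTTP",
    "AWS_S3_ALLOW_UNSAFE_RENAME",
];

/// Hypothetical helper: builds a storage-options map from any key lookup.
/// Keys for which the lookup returns `None` are skipped.
pub fn s3_options_from(lookup: impl Fn(&str) -> Option<String>) -> HashMap<String, String> {
    S3_OPTION_KEYS
        .iter()
        .filter_map(|&key| lookup(key).map(|value| (key.to_string(), value)))
        .collect()
}

/// Hypothetical helper: same as above, reading from the process environment.
pub fn s3_options_from_env() -> HashMap<String, String> {
    s3_options_from(|key| env::var(key).ok())
}

/// Hypothetical check: true when the map sets `AWS_S3_ALLOW_UNSAFE_RENAME`
/// to "true" (compared case-insensitively, which is an assumption).
pub fn unsafe_rename_enabled(options: &HashMap<String, String>) -> bool {
    options
        .get("AWS_S3_ALLOW_UNSAFE_RENAME")
        .map_or(false, |value| value.eq_ignore_ascii_case("true"))
}
```

Taking the lookup as a closure keeps the collection logic testable without mutating the process environment. A caller could log a warning when `unsafe_rename_enabled` returns `true`, since that setting means no locking provider is in use.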
@@ -141,3 +156,16 @@ If the table is already loaded and you want to change the version number, just u
 ```rust
 table.load_version(2).await.expect("Load failed");
 ```
+
+## Examining Table
+
+You can find more information about this inside the [official documentation](https://delta-io.github.io/delta-rs/usage/examining-table/).
+
+```rust
+// Metadata
+println!("{:?}", table.metadata().unwrap());
+// Schema
+println!("{:?}", table.schema().unwrap());
+// History
+println!("{:?}", table.history(None).await.unwrap());
+```
