Merged
29 changes: 0 additions & 29 deletions docs/guide/formats/index.md

This file was deleted.

139 changes: 0 additions & 139 deletions docs/guide/integrations/jdbc.md

This file was deleted.

@@ -1,18 +1,13 @@
---
title: Delta Lake
title: Examples
rank: 1
---

# Delta Lake
# Examples

You can use the `delta` format in Sail to work with [Delta Lake](https://delta.io/).
You can use the Spark DataFrame API or Spark SQL to read and write Delta tables.
<!--@include: ../../_common/spark-session.md-->

## Examples

<!--@include: ../_common/spark-session.md-->

### Basic Usage
## Basic Usage

::: code-group

@@ -44,7 +39,7 @@ SELECT * FROM users;

:::

### Data Partitioning
## Data Partitioning

You can work with partitioned Delta tables using the Spark DataFrame API.
Partitioned Delta tables organize data into directories based on the values of one or more columns.
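The directory layout and the pruning it enables can be sketched in plain Python. This is a toy model of Hive-style partition directories, not Sail's or Delta's actual implementation; all names here are illustrative:

```python
# Toy model of Hive-style partition directories and partition pruning.

def partition_path(table_root, partition_values):
    """Build a Hive-style directory path like metrics/year=2025."""
    parts = [f"{col}={val}" for col, val in partition_values.items()]
    return "/".join([table_root] + parts)

def prune(files, predicate):
    """Keep only files whose partition values satisfy the predicate."""
    return [f for f in files if predicate(f["partition"])]

files = [
    {"path": partition_path("metrics", {"year": y}), "partition": {"year": y}}
    for y in (2023, 2024, 2025)
]

# A query like `WHERE year > 2024` only needs to read one directory.
selected = prune(files, lambda p: p["year"] > 2024)
print([f["path"] for f in selected])  # ['metrics/year=2025']
```

Because each partition column value maps to its own directory, a filter on that column lets the engine skip whole directories without opening any data files.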
@@ -78,7 +73,7 @@ SELECT * FROM metrics WHERE year > 2024;

:::

### Schema Evolution
## Schema Evolution

Delta Lake handles schema evolution gracefully.
By default, writing data whose schema differs from the schema of the existing Delta table results in an error.
@@ -96,7 +91,7 @@ But this works only if you set the write mode to `overwrite`.
df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").save(path)
```
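The two behaviors can be illustrated with a toy model of schema validation and merging. This is plain Python for illustration only, not Delta's implementation; the function and schema representation are made up:

```python
def merge_schema(existing, incoming, merge=False):
    """Validate an incoming schema against an existing table schema.

    Without merge, any new column is an error (the default behavior).
    With merge, new columns are appended to the table schema.
    """
    existing_cols = {name for name, _ in existing}
    new_cols = [(n, t) for n, t in incoming if n not in existing_cols]
    if new_cols and not merge:
        raise ValueError(f"schema mismatch: unexpected columns {new_cols}")
    return existing + new_cols

table = [("id", "bigint"), ("name", "string")]
batch = [("id", "bigint"), ("name", "string"), ("email", "string")]

# Analogous to writing with the `mergeSchema` option enabled.
merged = merge_schema(table, batch, merge=True)
print(merged)  # [('id', 'bigint'), ('name', 'string'), ('email', 'string')]
```

The default path rejects the write; the merge path widens the table schema so old files (lacking the new column) and new files coexist.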

### Time Travel
## Time Travel

You can use the time travel feature to query historical versions of a Delta table.

@@ -107,7 +102,7 @@ df = spark.read.format("delta").option("timestampAsOf", "2025-01-02T03:04:05.678

Time travel is not available for Spark SQL in Sail yet, but we plan to support it soon.
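Conceptually, a timestamp query resolves to the latest table version committed at or before that timestamp. A minimal sketch of that resolution in plain Python (the commit list, function name, and timestamps are illustrative, not part of any API):

```python
def version_as_of_timestamp(commits, ts):
    """Return the latest version whose commit time is at or before ts.

    `commits` is a list of (version, commit_timestamp) pairs; ISO-8601
    timestamp strings compare correctly as plain strings.
    """
    eligible = [v for v, t in commits if t <= ts]
    if not eligible:
        raise ValueError("no version exists at or before the given timestamp")
    return max(eligible)

commits = [
    (0, "2025-01-01T00:00:00"),
    (1, "2025-01-02T00:00:00"),
    (2, "2025-01-03T00:00:00"),
]
print(version_as_of_timestamp(commits, "2025-01-02T03:04:05"))  # 1
```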

### Column Mapping
## Column Mapping

You can write Delta tables with column mapping enabled. The supported column mapping modes are `name` and `id`. You must write to a new Delta table to enable column mapping.

@@ -118,7 +113,7 @@ df.write.format("delta").option("columnMappingMode", "id").save(path)

Existing Delta tables with column mapping can be read as usual.
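Column mapping decouples a column's logical name from the physical field name stored in data files. The indirection can be sketched as a toy model in plain Python (the dictionaries and names below are illustrative, not the Delta protocol's actual encoding):

```python
# In `id` mode each column gets a stable numeric id; renaming a logical
# column only updates the mapping, never the data files.
mapping = {"user_id": 1, "full_name": 2}   # logical name -> column id
physical = {1: "col-1", 2: "col-2"}        # column id -> physical field name

def physical_name(logical):
    """Resolve a logical column name to its physical field name."""
    return physical[mapping[logical]]

print(physical_name("full_name"))  # col-2

# Rename the logical column: the physical data stays untouched.
mapping["display_name"] = mapping.pop("full_name")
print(physical_name("display_name"))  # col-2
```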

### More Features
## More Features

We will continue adding more examples for advanced Delta Lake features as they become available in Sail.
In the meantime, feel free to reach out to us on [Slack](https://lakesail.com/slack) or [GitHub Discussions](https://github.com/lakehq/sail/discussions) if you have questions!
56 changes: 56 additions & 0 deletions docs/guide/sources/delta/features.md
@@ -0,0 +1,56 @@
---
title: Supported Features
rank: 2
---

# Supported Features

## Core Table Operations

| Feature | Supported |
| ------------------------------------------- | ------------------ |
| Read | :white_check_mark: |
| Write (append) | :white_check_mark: |
| Write (overwrite) | :white_check_mark: |
| Data skipping (partition pruning) | :white_check_mark: |
| Data skipping (pruning via file statistics) | :white_check_mark: |
| Schema validation | :white_check_mark: |
| Schema evolution | :white_check_mark: |
| Time travel (by version) | :white_check_mark: |
| Time travel (by timestamp) | :white_check_mark: |

Both non-partitioned and partitioned tables are supported for reading and writing.

## DML Operations

| Feature | Supported |
| ------------------------ | ------------------ |
| `DELETE` (copy-on-write) | :white_check_mark: |
| `MERGE` (copy-on-write) | :white_check_mark: |
| `DELETE` (merge-on-read) | :construction: |
| `MERGE` (merge-on-read) | :construction: |
| `UPDATE` | :construction: |

The "merge-on-read" mode refers to updating the table with deletion vectors. This reduces the amount of data that needs to be rewritten during DML operations, but incurs additional read overhead when querying the table.
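The trade-off can be sketched in plain Python: a deletion vector marks row positions as deleted without rewriting the file, and the reader filters them out. This is a toy model, not the Delta protocol's actual deletion-vector encoding:

```python
def read_with_deletion_vector(rows, deletion_vector):
    """Skip the row positions recorded in the deletion vector at read time."""
    return [row for i, row in enumerate(rows) if i not in deletion_vector]

rows = ["a", "b", "c", "d"]

# Copy-on-write would rewrite the whole file to delete "b";
# merge-on-read just records position 1 in a deletion vector.
deletion_vector = {1}
print(read_with_deletion_vector(rows, deletion_vector))  # ['a', 'c', 'd']
```

The write is cheap (one small vector instead of a rewritten file), but every subsequent read pays the filtering cost until the file is compacted.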

## Table Maintenance Operations

| Feature | Supported |
| ---------- | -------------- |
| `VACUUM` | :construction: |
| `OPTIMIZE` | :construction: |
| `RESTORE` | :construction: |

## Protocol Internals

| Feature | Supported |
| -------------------------------- | ------------------ |
| Checkpointing | :white_check_mark: |
| Log clean-up | :white_check_mark: |
| Column mapping | :white_check_mark: |
| Deletion vectors | :construction: |
| Constraints | :construction: |
| Identity columns | :construction: |
| Generated columns | :construction: |
| Transaction (conflict detection) | :construction: |
| Change data feed | :construction: |
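Checkpointing can be sketched as replaying the table log from the most recent checkpoint instead of from version 0. The log structure below is a heavy simplification of the Delta protocol, in plain Python, for illustration only:

```python
def latest_state(checkpoint, commits):
    """Rebuild table state from a checkpoint plus later commits.

    `checkpoint` is (version, live_files); each commit is
    (version, files_added, files_removed).
    """
    version, files = checkpoint
    files = set(files)
    for v, added, removed in commits:
        if v <= version:
            continue  # already folded into the checkpoint
        files |= set(added)
        files -= set(removed)
        version = v
    return version, files

checkpoint = (10, {"part-0", "part-1"})
commits = [(11, {"part-2"}, set()), (12, {"part-3"}, {"part-0"})]

version, files = latest_state(checkpoint, commits)
print(version, sorted(files))  # 12 ['part-1', 'part-2', 'part-3']
```

Because the checkpoint summarizes everything up to its version, log clean-up can then safely delete the older per-commit files.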
5 changes: 5 additions & 0 deletions docs/guide/sources/delta/index.data.ts
@@ -0,0 +1,5 @@
import { createContentLoader } from "vitepress";

export default createContentLoader([
"/guide/sources/delta/!(index|_*/**|**/_*/**).md",
]);
18 changes: 18 additions & 0 deletions docs/guide/sources/delta/index.md
@@ -0,0 +1,18 @@
---
title: Delta Lake
rank: 1
---

# Delta Lake

You can use the `delta` format in Sail to work with [Delta Lake](https://delta.io/).
You can use the Spark DataFrame API or Spark SQL to read and write Delta tables.

## Topics

<PageList :data="data" :prefix="['guide', 'sources', 'delta']" />

<script setup>
import PageList from "@theme/components/PageList.vue";
import { data } from "./index.data.ts";
</script>
@@ -1,18 +1,13 @@
---
title: Iceberg
rank: 2
title: Examples
rank: 1
---

# Iceberg
# Examples

You can use the `iceberg` format in Sail to work with [Apache Iceberg](https://iceberg.apache.org/).
You can use the Spark DataFrame API or Spark SQL to read and write Iceberg tables.
<!--@include: ../../_common/spark-session.md-->

## Examples

<!--@include: ../_common/spark-session.md-->

### Basic Usage
## Basic Usage

::: code-group

@@ -44,7 +39,7 @@ SELECT * FROM users;

:::

### Data Partitioning
## Data Partitioning

You can work with partitioned Iceberg tables using the Spark DataFrame API.
Partitioned Iceberg tables organize data into directories based on the values of one or more columns.
@@ -78,7 +73,7 @@ SELECT * FROM metrics WHERE year > 2024;

:::

### Time Travel
## Time Travel

You can use the time travel feature to query tags, branches, or historical versions of an Iceberg table.

@@ -90,7 +85,7 @@ df = spark.read.format("iceberg").option("branch", "main").load(path)

Time travel is not available for Spark SQL in Sail yet, but we plan to support it soon.
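Tags and branches are named references to Iceberg snapshots, so resolving one is a lookup in the table's ref map. A minimal sketch in plain Python (the snapshot ids, ref names, and function are illustrative, not Iceberg's API):

```python
# An Iceberg table keeps a set of snapshots plus named refs pointing at them.
snapshots = {100: "snap-100.avro", 200: "snap-200.avro"}
refs = {
    "main": {"type": "branch", "snapshot_id": 200},
    "v1.0": {"type": "tag", "snapshot_id": 100},
}

def resolve(ref_name):
    """Return the snapshot id that a tag or branch currently points at."""
    return refs[ref_name]["snapshot_id"]

print(resolve("main"))  # 200
print(resolve("v1.0"))  # 100
```

A branch ref moves forward as new snapshots are committed, while a tag stays pinned to one snapshot, which is why tags suit reproducible historical reads.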

### More Features
## More Features

We will continue adding more examples for advanced Iceberg features as they become available in Sail.
In the meantime, feel free to reach out to us on [Slack](https://lakesail.com/slack) or [GitHub Discussions](https://github.com/lakehq/sail/discussions) if you have questions!