Changes from 5 commits
3 changes: 2 additions & 1 deletion docs.json
Original file line number Diff line number Diff line change
@@ -184,7 +184,8 @@
"group": "Database tables (non-CDC)",
"pages": [
"integrations/sources/postgresql-table",
"integrations/sources/mysql-table"
"integrations/sources/mysql-table",
"ingestion/sources/snowflake"
]
},
{
6 changes: 6 additions & 0 deletions iceberg/integ-snowflake.mdx
@@ -6,6 +6,12 @@ description: "Sink data from RisingWave to an Apache Iceberg table and query it

This guide shows how to sink data from RisingWave into an Apache Iceberg table and make it available for querying in Snowflake. This integration allows you to use RisingWave for real-time stream processing and Snowflake for large-scale analytics and data warehousing.

<Note>
For direct Snowflake integration, see:
- [Ingest data from Snowflake](/ingestion/sources/snowflake) - Load data from Snowflake tables into RisingWave
- [Sink data to Snowflake](/integrations/destinations/snowflake) - Write data from RisingWave to Snowflake
</Note>

**How it works**

RisingWave → Iceberg table on S3 → AWS Glue or REST Catalog → Snowflake
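
The first hop in this pipeline — writing from RisingWave into an Iceberg table on S3 — can be sketched roughly as follows. The sink name, source relation, bucket path, and catalog/database/table names are all placeholders, and the exact `WITH` options depend on your catalog type (Glue vs. REST), so treat this as an outline rather than a working configuration:

```sql
-- Hypothetical sketch: sink a RisingWave relation into an Iceberg table on S3.
-- All names and paths are placeholders; catalog options vary by setup.
CREATE SINK orders_iceberg_sink FROM orders
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'order_id',
    warehouse.path = 's3://my-bucket/warehouse',
    catalog.type = 'glue',
    database.name = 'analytics',
    table.name = 'orders'
);
```

Once the table exists in the catalog, Snowflake can query it as an externally managed Iceberg table, as described in the rest of this guide.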
2 changes: 2 additions & 0 deletions ingestion/overview.mdx
@@ -36,6 +36,7 @@ Below is a complete list of source connectors in RisingWave. Click a connector n
| [Webhook](/integrations/sources/webhook) | Built-in |
| [Events API (HTTP)](/integrations/sources/events-api) | External service |
| [Apache Iceberg](/ingestion/sources/iceberg) | |
| [Snowflake](/ingestion/sources/snowflake) | Latest |
| [Load generator (datagen)](/ingestion/sources/datagen) | Built-in |

For information on supported data formats and encodings, and whether you need to use `CREATE SOURCE` or `CREATE TABLE` with each format, see [Data formats and encoding options](/ingestion/formats-and-encoding-options).
@@ -285,6 +286,7 @@ CREATE TABLE test_users (
| **Google Cloud Storage** | ❌ | ✅ | ⚠️ | Batch only; periodic via external tools |
| **Azure Blob** | ❌ | ✅ | ⚠️ | Batch only; periodic via external tools |
| **Apache Iceberg** | ❌ | ✅ | ⚠️ | Batch only; periodic via external tools |
| **Snowflake** | ❌ | ❌ | ✅ | Periodic refresh with `refresh_interval_sec` |
Copilot AI Feb 20, 2026

The support matrix shows "❌" for One-Time Batch ingestion for Snowflake, but the connector supports manual refresh via REFRESH TABLE command (similar to Iceberg). According to the documentation, if refresh_interval_sec is omitted, the table will only refresh when manually triggered. This is effectively one-time batch ingestion. Consider marking this as "✅" to be consistent with Apache Iceberg (line 288) which also supports FULL_RELOAD mode with manual refresh.

Suggested change
| **Snowflake** | ❌ | ❌ | ✅ | Periodic refresh with `refresh_interval_sec` |
| **Snowflake** | ✅ | ❌ | ✅ | Manual `REFRESH TABLE` when `refresh_interval_sec` is omitted; periodic when set |

| **Datagen** | ✅ | ❌ | ❌ | Test data generation only |
| **Direct INSERT** | ❌ | ✅ | ⚠️ | Manual insertion; periodic via external tools |
| **Webhook** | ✅ | ✅ | ⚠️ | Push-based HTTP ingestion; best for SaaS webhooks + request validation/signatures |
209 changes: 209 additions & 0 deletions ingestion/sources/snowflake.mdx
@@ -0,0 +1,209 @@
---
title: "Ingest data from Snowflake"
sidebarTitle: Snowflake
description: "Load data from Snowflake tables into RisingWave using the ADBC connector."
---

This guide describes how to ingest batch data from Snowflake tables into RisingWave using the ADBC (Arrow Database Connectivity) connector. This enables you to create refreshable tables that periodically pull data from Snowflake.

Snowflake is a cloud-based data warehousing platform that allows for scalable and efficient data storage and analysis. For more information about Snowflake, see [Snowflake official website](https://www.snowflake.com/en/).

## Prerequisites

* A Snowflake account with access to the database and tables you want to ingest.
* The Snowflake account identifier (e.g., `myaccount.us-east-1`).
* Valid authentication credentials (username/password, OAuth token, JWT private key, etc.).
* Network access from RisingWave to your Snowflake instance.

Comment on lines +11 to +17
Copilot AI Feb 20, 2026

Similar to other new features in the documentation (e.g., MySQL table source has "Added in v2.2.0"), this Snowflake source connector should include a version note indicating when it was added to RisingWave. This helps users understand feature availability across different RisingWave versions. The note should be added after the introductory description, following the pattern seen in other connector documentation.

## Connecting to Snowflake

RisingWave supports loading data from Snowflake tables using the `adbc_snowflake` connector. This creates a refreshable table that periodically fetches the latest data from Snowflake.

### Syntax

```sql
CREATE TABLE table_name (
primary key (order_id) -- Replace with your actual primary key column(s)
) WITH (
connector = 'adbc_snowflake',
refresh_mode = 'FULL_RELOAD',
refresh_interval_sec = 'interval_in_seconds',
adbc_snowflake.account = 'snowflake_account',
adbc_snowflake.username = 'username',
adbc_snowflake.password = 'password',
adbc_snowflake.database = 'database_name',
adbc_snowflake.schema = 'schema_name',
adbc_snowflake.warehouse = 'warehouse_name',
adbc_snowflake.table = 'table_name'
Copilot AI Feb 20, 2026

In the syntax example, the parameter adbc_snowflake.table is set to 'table_name', which creates ambiguity since table_name is also used as the CREATE TABLE name on line 25. Consider using a different placeholder like 'source_table_name' or 'snowflake_table_name' to make it clear that these refer to different tables (the Snowflake source table vs. the RisingWave table being created).

Suggested change
adbc_snowflake.table = 'table_name'
adbc_snowflake.table = 'source_table_name'

);
```

<Note>
**Automatic Schema Inference**

Column definitions are automatically inferred from the Snowflake table and should not be manually specified in the `CREATE TABLE` statement. However, you must specify the primary key if your table requires one.
</Note>

## Parameters

All parameters are required unless specified otherwise.

| Parameter | Description |
| :--------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| connector | Must be `adbc_snowflake`. |
| refresh_mode | Must be `FULL_RELOAD`. The entire table is re-read on each refresh. |
| refresh_interval_sec | The refresh interval in seconds. Determines how frequently data is fetched from Snowflake. |
Copilot AI Feb 20, 2026

The parameter description for refresh_interval_sec states it's required ("All parameters are required unless specified otherwise"), but this conflicts with the behavior described in the Limitations section and similar connectors like Iceberg, where omitting this parameter means manual refresh only. The description should clarify that this parameter is optional, and if omitted, the table will only refresh when manually triggered via REFRESH TABLE.

Suggested change
| refresh_interval_sec | The refresh interval in seconds. Determines how frequently data is fetched from Snowflake. |
| refresh_interval_sec | **Optional.** The refresh interval in seconds. Determines how frequently data is fetched from Snowflake. If omitted, the table is not refreshed automatically and only updates when you run `REFRESH TABLE`. |

| adbc_snowflake.account | The Snowflake account identifier (e.g., `myaccount.us-east-1` or `myaccount`). |
| adbc_snowflake.username | The Snowflake username for authentication. |
| adbc_snowflake.password | **Optional**. The password for username/password authentication. Required if using the default `auth_snowflake` authentication type. |
Copilot AI Feb 20, 2026

The parameter description states that adbc_snowflake.password is "Required if using the default auth_snowflake authentication type", but the default authentication type is described as auth_snowflake on line 63. There's an inconsistency in naming - the description uses "auth_snowflake" but should verify this matches the actual implementation. Based on the parameter name pattern and auth type examples, this should likely be just the default (username/password) rather than specifically named "auth_snowflake".

Suggested change
| adbc_snowflake.password | **Optional**. The password for username/password authentication. Required if using the default `auth_snowflake` authentication type. |
| adbc_snowflake.password | **Optional**. The password for username/password authentication. Required when using username/password authentication (the default auth type). |

| adbc_snowflake.database | The name of the Snowflake database. |
| adbc_snowflake.schema | The Snowflake schema containing the table. |
| adbc_snowflake.warehouse | The Snowflake warehouse to use for queries. |
| adbc_snowflake.table | The name of the Snowflake table to ingest. |
| adbc_snowflake.auth_type | **Optional**. The authentication method. Default is `auth_snowflake` (username/password). Other options: `auth_oauth`, `auth_jwt`, `auth_ext_browser`, `auth_okta`, `auth_mfa`, `auth_pat`, `auth_wif`. |
| adbc_snowflake.jwt_private_key_path | Required when `adbc_snowflake.auth_type` is `auth_jwt`. Local file path on the RisingWave server to the JWT private key file (e.g., `/path/to/key.pem`). |

## Authentication methods

The Snowflake connector supports multiple authentication methods:

### Username and password (default)

```sql
CREATE TABLE my_snowflake_table (
primary key ("order_id")
) WITH (
connector = 'adbc_snowflake',
refresh_mode = 'FULL_RELOAD',
refresh_interval_sec = '3600',
adbc_snowflake.account = 'myaccount.us-east-1',
adbc_snowflake.username = 'myuser',
adbc_snowflake.password = 'mypassword',
adbc_snowflake.database = 'SALES_DB',
adbc_snowflake.schema = 'PUBLIC',
adbc_snowflake.warehouse = 'COMPUTE_WH',
adbc_snowflake.table = 'ORDERS'
);
```

### JWT authentication

```sql
CREATE TABLE my_snowflake_table (
primary key ("order_id")
) WITH (
connector = 'adbc_snowflake',
refresh_mode = 'FULL_RELOAD',
refresh_interval_sec = '7200',
adbc_snowflake.account = 'myaccount',
adbc_snowflake.username = 'myuser',
adbc_snowflake.database = 'SALES_DB',
adbc_snowflake.schema = 'PUBLIC',
adbc_snowflake.warehouse = 'COMPUTE_WH',
adbc_snowflake.table = 'ORDERS',
adbc_snowflake.auth_type = 'auth_jwt',
adbc_snowflake.jwt_private_key_path = '/path/to/key.pem'
);
```

### OAuth authentication

```sql
CREATE TABLE my_snowflake_table (
primary key ("order_id")
) WITH (
connector = 'adbc_snowflake',
refresh_mode = 'FULL_RELOAD',
refresh_interval_sec = '3600',
adbc_snowflake.account = 'myaccount.us-east-1',
adbc_snowflake.username = 'myuser',
adbc_snowflake.database = 'SALES_DB',
adbc_snowflake.schema = 'PUBLIC',
adbc_snowflake.warehouse = 'COMPUTE_WH',
adbc_snowflake.table = 'ORDERS',
adbc_snowflake.auth_type = 'auth_oauth'
);
```

Copilot AI Feb 20, 2026

The OAuth authentication example sets adbc_snowflake.auth_type = 'auth_oauth' but doesn't include any OAuth-specific parameters such as an OAuth token or token endpoint. OAuth authentication typically requires additional parameters beyond just setting the auth type. The example should either include the required OAuth parameters or add a note explaining what additional configuration is needed for OAuth to work properly.

Suggested change
Note: This example only shows how to set the authentication type to OAuth. You must also configure the required OAuth credentials and related settings for the ADBC Snowflake connector (for example, via its configuration or environment) for authentication to work correctly.

## Data type mapping

The following table shows the corresponding data types between Snowflake and RisingWave. For details on native RisingWave data types, see [Overview of data types](/sql/data-types/overview).
Collaborator
@copilot you need to handle the doc ref well.

Contributor Author
Could you clarify what needs to be changed with the doc ref on line 130? The current reference [Overview of data types](/sql/data-types/overview) follows the same pattern used in other source/sink documentation (e.g., Snowflake sink line 57, SQL Server CDC line 220). Should I use a different link format or reference a different page?


| Snowflake type | RisingWave type | Notes |
| :--------------- | :------------------------------- | :------------------------------------------------- |
| STRING | VARCHAR | |
| NUMBER | DECIMAL or BIGINT | Depends on scale and precision |
| FLOAT | DOUBLE PRECISION | |
| DECIMAL | DECIMAL | |
| CHAR | VARCHAR | |
| TEXT | VARCHAR | |
| DATE | DATE | |
| TIME | Not supported | Will report an error |
| TIMESTAMP_NTZ | TIMESTAMP WITHOUT TIME ZONE | |
| TIMESTAMP_LTZ | TIMESTAMP WITH TIME ZONE | |
| TIMESTAMP_TZ | TIMESTAMP WITH TIME ZONE | |
| BOOLEAN | BOOLEAN | |
| BINARY | BYTEA | |
| VARIANT | VARCHAR | JSON data stored as string |
| OBJECT | VARCHAR | JSON objects stored as string |
| ARRAY | VARCHAR | Arrays stored as string |
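
Because `VARIANT`, `OBJECT`, and `ARRAY` columns arrive as `VARCHAR` containing JSON text, you can cast them to `JSONB` in RisingWave to extract individual fields. A minimal sketch, assuming a hypothetical `payload` VARIANT column exists in the ingested table:

```sql
-- Hypothetical: `payload` is a Snowflake VARIANT column ingested as VARCHAR.
-- Cast the JSON text to JSONB, then pull out fields with the ->> operator.
SELECT
    order_id,
    payload::JSONB ->> 'customer_id' AS customer_id,
    (payload::JSONB ->> 'amount')::DECIMAL AS amount
FROM snowflake_orders;
```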

## Complete example

This example demonstrates how to create a refreshable table that loads data from a Snowflake table every hour.

### Step 1: Create the refreshable table

```sql
CREATE TABLE snowflake_orders (
primary key ("order_id")
) WITH (
connector = 'adbc_snowflake',
refresh_mode = 'FULL_RELOAD',
refresh_interval_sec = '3600', -- Refresh every hour

-- Snowflake connection parameters
adbc_snowflake.account = 'myaccount.us-east-1',
adbc_snowflake.username = 'analytics_user',
adbc_snowflake.password = 'secure_password',
adbc_snowflake.database = 'PRODUCTION',
adbc_snowflake.schema = 'SALES',
adbc_snowflake.warehouse = 'ANALYTICS_WH',
adbc_snowflake.table = 'ORDERS'
);
```

### Step 2: Query the data

```sql
SELECT * FROM snowflake_orders LIMIT 10;
```

### Step 3: Create materialized views

You can create materialized views based on the Snowflake data:

```sql
-- The columns order_date and total_amount are automatically inferred from the Snowflake table
CREATE MATERIALIZED VIEW daily_sales AS
SELECT
DATE_TRUNC('day', order_date) AS sale_date,
COUNT(*) AS order_count,
SUM(total_amount) AS total_revenue
FROM snowflake_orders
GROUP BY DATE_TRUNC('day', order_date);
```

## Limitations and requirements

* **Refresh mode**: Only `FULL_RELOAD` mode is supported. The entire table is re-read on each refresh interval.
* **Schema inference**: Column definitions are automatically inferred from the Snowflake table. Do not manually specify columns in the `CREATE TABLE` statement.
* **Feature flag**: The Snowflake connector requires the `source-adbc_snowflake` feature to be enabled at compile time. This is enabled by default in official RisingWave builds.
* **Consistent snapshots**: The connector uses Snowflake's time travel feature to ensure all data is read from the same point in time during a refresh.
* **Performance**: For large tables, consider the refresh interval carefully to balance data freshness with query costs in Snowflake.
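
Between scheduled refreshes, a reload can also be triggered on demand. A minimal sketch, assuming the `REFRESH TABLE` command mentioned in the review comments above applies to this connector as it does to other refreshable batch sources:

```sql
-- Manually trigger a full reload from Snowflake outside the scheduled interval.
REFRESH TABLE snowflake_orders;
```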

## What's next?

* [Sink data to Snowflake](/integrations/destinations/snowflake) - Learn how to write data from RisingWave back to Snowflake
* [Work with Snowflake and Iceberg](/iceberg/integ-snowflake) - Integrate Snowflake with Apache Iceberg catalogs
* [Data formats and encoding options](/ingestion/formats-and-encoding-options) - Understand supported data formats
Copilot AI Feb 20, 2026

The "What's next?" section links to "Data formats and encoding options", but the Snowflake ADBC connector doesn't use FORMAT/ENCODE options - it relies on automatic schema inference from the Snowflake table. This link may confuse users as it's not relevant to the Snowflake source connector. Consider replacing this with a more relevant link, such as documentation about refreshable tables (REFRESH TABLE command) or other batch ingestion patterns.

Suggested change
* [Data formats and encoding options](/ingestion/formats-and-encoding-options) - Understand supported data formats

2 changes: 2 additions & 0 deletions integrations/destinations/snowflake.mdx
@@ -6,6 +6,8 @@ description: This guide describes how to sink data from RisingWave to Snowflake

Snowflake is a cloud-based data warehousing platform that allows for scalable and efficient data storage and analysis. For more information about Snowflake, see [Snowflake official website](https://www.snowflake.com/en/).

This page describes how to **sink data to** Snowflake. To **ingest data from** Snowflake, see [Ingest data from Snowflake](/ingestion/sources/snowflake).

Sinking from RisingWave to Snowflake utilizes [Snowpipe](https://docs.snowflake.com/en/user-guide/data-load-snowpipe-rest-apis) for data loading. Initially, data is staged in a user-managed S3 bucket in JSON format, and then loaded into the Snowflake table via Snowpipe. For more information, see [Overview of the Snowpipe REST endpoints to load data](https://docs.snowflake.com/user-guide/data-load-snowpipe-rest-overview).

<Tip>