Commit fcf6996

docs: Attach tbl (#1656)
* Update 92-attach-table.md * Update 92-attach-table.md * Update 92-attach-table.md * Create link-tables.md * updates
1 parent 4fb4313 commit fcf6996

File tree

2 files changed

+190
-79
lines changed

docs/en/sql-reference/10-sql-commands/00-ddl/01-table/92-attach-table.md

Lines changed: 23 additions & 79 deletions
@@ -5,24 +5,29 @@ sidebar_position: 6
55

66
import FunctionDescription from '@site/src/components/FunctionDescription';
77

8-
<FunctionDescription description="Introduced or updated: v1.2.549"/>
8+
<FunctionDescription description="Introduced or updated: v1.2.698"/>
99

1010
import EEFeature from '@site/src/components/EEFeature';
1111

1212
<EEFeature featureName='ATTACH TABLE'/>
1313

1414
Attaches an existing table to another one. The command makes the data and schema of a table available in another database without copying or moving the data. Instead, it creates a link that points to the original table data.
1515

16-
Attach Table enables you to seamlessly connect a table in the cloud service platform to an existing table deployed in a private deployment environment without the need to physically move the data. This is particularly useful when you want to migrate data from a private deployment of Databend to [Databend Cloud](https://www.databend.com) while minimizing the data transfer overhead.
16+
- Attach Table enables you to seamlessly connect a table in the cloud service platform to an existing table deployed in a private deployment environment without the need to physically move the data. This is particularly useful when you want to migrate data from a private deployment of Databend to [Databend Cloud](https://www.databend.com) while minimizing the data transfer overhead.
1717

18-
The attached table operates in READ_ONLY mode. In this mode, changes in the source table are instantly reflected in the attached table. However, the attached table is exclusively for querying purposes and does not support updates. This means INSERT, UPDATE, and DELETE operations are not allowed on the attached table; only SELECT queries can be executed.
18+
- The attached table operates in READ_ONLY mode. In this mode, changes in the source table are instantly reflected in the attached table. However, the attached table is exclusively for querying purposes and does not support updates. This means INSERT, UPDATE, and DELETE operations are not allowed on the attached table; only SELECT queries can be executed.
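The READ_ONLY behavior can be sketched as follows. This is a minimal illustration, assuming an attached table named `population_readonly` with `city` and `population` columns; the inserted values are hypothetical, and the exact error text depends on the Databend version, so it is not shown.

```sql
-- Reading from the attached table is allowed.
SELECT * FROM population_readonly;

-- Writes are expected to be rejected because the attached table is READ_ONLY.
-- The row below is a made-up value used only for illustration.
INSERT INTO population_readonly (city, population) VALUES ('Ottawa', 1);

-- UPDATE and DELETE are expected to be rejected for the same reason.
UPDATE population_readonly SET population = 0 WHERE city = 'Toronto';
DELETE FROM population_readonly WHERE city = 'Toronto';
```

To change the data, run the write statements against the source table instead; the attached table then reflects the result.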
1919

2020
## Syntax
2121

2222
```sql
23-
ATTACH TABLE <target_table_name> '<source_table_data_URI>'
23+
ATTACH TABLE <target_table_name> [ ( <column_list> ) ] '<source_table_data_URI>'
2424
CONNECTION = ( <connection_parameters> )
2525
```
26+
- `<column_list>`: An optional, comma-separated list of columns to include from the source table, allowing users to specify only the necessary columns instead of including all of them. If not specified, all columns from the source table will be included.
27+
28+
- Renaming an included column in the source table updates its name in the attached table, and it must be accessed using the new name.
29+
- Dropping an included column in the source table makes it inaccessible in the attached table.
30+
- Changes to non-included columns, such as renaming or dropping them in the source table, do not affect the attached table.
2631

2732
- `<source_table_data_URI>` represents the path to the source table's data. For S3-like object storage, the format is `s3://<bucket-name>/<database_ID>/<table_ID>`, for example, _s3://databend-toronto/1/23351/_, which represents the exact path to the table folder within the bucket.
2833

@@ -50,89 +55,28 @@ CONNECTION = ( <connection_parameters> )
5055

5156
- `CONNECTION` specifies the connection parameters required for establishing a link to the object storage where the source table's data is stored. The connection parameters vary for different storage services based on their specific requirements and authentication mechanisms. For more information, see [Connection Parameters](../../../00-sql-reference/51-connect-parameters.md).
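The effect of renaming an included column, as described for `<column_list>`, can be sketched like this. The sketch assumes a source table `population` and an attached table `population_only` created with the column list `(city, population)`; the new name `city_name` is hypothetical.

```sql
-- On the source deployment: rename a column that the attached table includes.
ALTER TABLE population RENAME COLUMN city TO city_name;

-- On the attached side: the column must now be accessed by its new name.
SELECT city_name, population FROM population_only;

-- Referencing the old name is expected to fail, since the column
-- is no longer known under it:
-- SELECT city FROM population_only;
```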
5257

53-
## Examples
58+
## Tutorials
5459

55-
This example illustrates how to link a new table in Databend Cloud with an existing table in Databend, which stores data within an Amazon S3 bucket named "databend-toronto".
60+
- [Linking Tables with ATTACH TABLE](/tutorials/databend-cloud/link-tables)
5661

57-
#### Step 1. Creating Table in Databend
62+
## Examples
5863

59-
Create a table named "population" and insert some sample data:
64+
This example creates an attached table that includes all columns from a source table stored in AWS S3:
6065

61-
```sql title='Databend:'
62-
CREATE TABLE population (
63-
city VARCHAR(50),
64-
population INT
66+
```sql
67+
ATTACH TABLE population_all_columns 's3://databend-doc/1/16/' CONNECTION = (
68+
REGION='us-east-2',
69+
AWS_KEY_ID = '<your_aws_key_id>',
70+
AWS_SECRET_KEY = '<your_aws_secret_key>'
6571
);
66-
67-
INSERT INTO population (city, population) VALUES
68-
('Toronto', 2731571),
69-
('Montreal', 1704694),
70-
('Vancouver', 631486);
7172
```
7273

73-
#### Step 2. Obtaining Database ID and Table ID
74-
75-
Use the [FUSE_SNAPSHOT](../../../20-sql-functions/16-system-functions/fuse_snapshot.md) function to obtain the database ID and table ID. The result below indicates that the database ID is **1**, and the table ID is **556**:
76-
77-
```sql title='Databend:'
78-
SELECT * FROM FUSE_SNAPSHOT('default', 'population');
79-
80-
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
81-
│ snapshot_id │ snapshot_location │ format_version │ previous_snapshot_id │ segment_count │ block_count │ row_count │ bytes_uncompressed │ bytes_compressed │ index_size │ timestamp
82-
├──────────────────────────────────┼───────────────────────────────────────────────────┼────────────────┼──────────────────────┼───────────────┼─────────────┼───────────┼────────────────────┼──────────────────┼────────────┼────────────────────────────┤
83-
│ f252dd43d1aa44898a04827808342daf │ 1/556/_ss/f252dd43d1aa44898a04827808342daf_v4.mpk4NULL113704485312023-11-01 02:35:47.325319
84-
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
85-
```
86-
87-
When you access the bucket page on Amazon S3, you'll observe that the data is organized within the path `databend-toronto` > `1` > `556`, like this:
88-
89-
![Alt text](/img/sql/attach-table-2.png)
90-
91-
#### Step 3. Linking Table in Databend Cloud
74+
This example creates an attached table that includes only selected columns (`city` and `population`) from a source table stored in AWS S3:
9275

93-
Sign in to Databend Cloud and run the following command in a worksheet to link a table named "population_readonly":
94-
95-
```sql title='Databend Cloud:'
96-
ATTACH TABLE population_readonly 's3://databend-toronto/1/556/' CONNECTION = (
76+
```sql
77+
ATTACH TABLE population_only (city, population) 's3://databend-doc/1/16/' CONNECTION = (
78+
REGION='us-east-2',
9779
AWS_KEY_ID = '<your_aws_key_id>',
9880
AWS_SECRET_KEY = '<your_aws_secret_key>'
9981
);
100-
```
101-
102-
To verify the success of the link, run the following query in Databend Cloud:
103-
104-
```sql title='Databend Cloud:'
105-
SELECT * FROM population_readonly;
106-
107-
-- Expected result:
108-
┌────────────────────────────────────┐
109-
│ city │ population │
110-
├──────────────────┼─────────────────┤
111-
│ Toronto │ 2731571 │
111-
│ Montreal │ 1704694 │
112-
│ Vancouver │ 631486 │
114-
└────────────────────────────────────┘
115-
```
116-
117-
You're all set! If you update the source table in Databend, you can observe the same changes reflected in the target table on Databend Cloud. For example, if you change the population of Toronto to 2,371,571 in the source table:
118-
119-
```sql title='Databend:'
120-
UPDATE population
121-
SET population = 2371571
122-
WHERE city = 'Toronto';
123-
```
124-
125-
You can see that the updates are synced to the attached table in Databend Cloud:
126-
127-
```sql title='Databend Cloud:'
128-
SELECT * FROM population_readonly;
129-
130-
-- Expected result:
131-
┌────────────────────────────────────┐
132-
│ city │ population │
133-
├──────────────────┼─────────────────┤
134-
│ Toronto │ 2371571 │
134-
│ Montreal │ 1704694 │
135-
│ Vancouver │ 631486 │
137-
└────────────────────────────────────┘
138-
```
82+
```
Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
1+
---
2+
title: Linking Tables with ATTACH TABLE
3+
---
4+
5+
In this tutorial, we'll walk you through how to link a table in Databend Cloud with an existing Databend table stored in an S3 bucket using the [ATTACH TABLE](/sql/sql-commands/ddl/table/attach-table) command.
6+
7+
## Before You Start
8+
9+
Before you start, ensure you have the following prerequisites in place:
10+
11+
- [Docker](https://www.docker.com/) is installed on your local machine, as it will be used to launch a self-hosted Databend.
12+
- An AWS S3 bucket used as storage for your self-hosted Databend. [Learn how to create an S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html).
13+
- AWS Access Key ID and Secret Access Key with sufficient permissions for accessing your S3 bucket. [Manage your AWS credentials](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys).
14+
- BendSQL is installed on your local machine. See [Installing BendSQL](/guides/sql-clients/bendsql/#installing-bendsql) for instructions on how to install BendSQL using various package managers.
15+
16+
## Step 1: Launch Databend in Docker
17+
18+
1. Start a Databend container on your local machine. The command below launches a Databend container with S3 as the storage backend, using the `databend-doc` bucket, along with the specified S3 endpoint and authentication credentials.
19+
20+
```bash
21+
docker run \
22+
-p 8000:8000 \
23+
-e QUERY_STORAGE_TYPE=s3 \
24+
-e AWS_S3_ENDPOINT="https://s3.us-east-2.amazonaws.com" \
25+
-e AWS_S3_BUCKET=databend-doc \
26+
-e AWS_ACCESS_KEY_ID=<your-aws-access-key-id> \
27+
-e AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key> \
28+
datafuselabs/databend:v1.2.699-nightly
29+
```
30+
31+
2. Create a table named `population` to store city, province, and population data, and insert sample records as follows:
32+
33+
```sql
34+
CREATE TABLE population (
35+
city VARCHAR(50),
36+
province VARCHAR(50),
37+
population INT
38+
);
39+
40+
INSERT INTO population (city, province, population) VALUES
41+
('Toronto', 'Ontario', 2731571),
42+
('Montreal', 'Quebec', 1704694),
43+
('Vancouver', 'British Columbia', 631486);
44+
```
45+
46+
3. Run the following statement to retrieve the table's location in S3. The snapshot path in the result begins with `<database_ID>/<table_ID>` (here `1/16`), so the S3 URI for the table in this tutorial is `s3://databend-doc/1/16/`.
47+
48+
```sql
49+
SELECT snapshot_location FROM FUSE_SNAPSHOT('default', 'population');
50+
51+
┌──────────────────────────────────────────────────┐
52+
│ snapshot_location │
53+
├──────────────────────────────────────────────────┤
54+
│ 1/16/_ss/513c5100aa0243fe863b4cc2df0e3046_v4.mpk │
55+
└──────────────────────────────────────────────────┘
56+
```
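If you prefer not to assemble the URI by hand, it can also be built in SQL from the snapshot path. The following is a minimal sketch, assuming the bucket name `databend-doc` used in this tutorial and that `snapshot_location` begins with `<database_ID>/<table_ID>/`; the alias `table_uri` is only illustrative.

```sql
-- Build the table's S3 URI from the first two path segments of snapshot_location.
-- Assumption: the bucket is `databend-doc`, as set in the docker command above.
SELECT CONCAT(
         's3://databend-doc/',
         SPLIT_PART(snapshot_location, '/', 1), '/',  -- database ID
         SPLIT_PART(snapshot_location, '/', 2), '/'   -- table ID
       ) AS table_uri
FROM FUSE_SNAPSHOT('default', 'population')
LIMIT 1;
```

For the snapshot path shown above, this should evaluate to `s3://databend-doc/1/16/`.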
57+
58+
## Step 2: Set Up Attached Tables in Databend Cloud
59+
60+
1. Connect to Databend Cloud using BendSQL. If you're unfamiliar with BendSQL, refer to this tutorial: [Connecting to Databend Cloud using BendSQL](../connect/connect-to-databendcloud-bendsql.md).
61+
62+
2. Execute the following statements to create two attached tables:
63+
- The first table, `population_all_columns`, includes all columns from the source data.
64+
- The second table, `population_only`, includes only the selected columns (`city` & `population`).
65+
66+
```sql
67+
-- Create an attached table with all columns from the source
68+
ATTACH TABLE population_all_columns 's3://databend-doc/1/16/' CONNECTION = (
69+
REGION='us-east-2',
70+
AWS_KEY_ID = '<your_aws_key_id>',
71+
AWS_SECRET_KEY = '<your_aws_secret_key>'
72+
);
73+
74+
-- Create an attached table with selected columns (city & population) from the source
75+
ATTACH TABLE population_only (city, population) 's3://databend-doc/1/16/' CONNECTION = (
76+
REGION='us-east-2',
77+
AWS_KEY_ID = '<your_aws_key_id>',
78+
AWS_SECRET_KEY = '<your_aws_secret_key>'
79+
);
80+
```
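Before querying the data, you may want to confirm which columns each attached table exposes. A minimal sketch, assuming `DESC` behaves for attached tables as it does for regular tables:

```sql
-- Expected to list city, province, and population.
DESC population_all_columns;

-- Expected to list only city and population, matching the column list
-- given in the ATTACH TABLE statement.
DESC population_only;
```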
81+
82+
## Step 3: Verify Attached Tables
83+
84+
1. Query the two attached tables to verify their contents:
85+
86+
```sql
87+
SELECT * FROM population_all_columns;
88+
89+
┌───────────────────────────────────────────────────────┐
90+
│ city │ province │ population │
91+
├──────────────────┼──────────────────┼─────────────────┤
92+
│ Toronto │ Ontario │ 2731571 │
92+
│ Montreal │ Quebec │ 1704694 │
93+
│ Vancouver │ British Columbia │ 631486 │
95+
└───────────────────────────────────────────────────────┘
96+
97+
SELECT * FROM population_only;
98+
99+
┌────────────────────────────────────┐
100+
│ city │ population │
101+
├──────────────────┼─────────────────┤
102+
│ Toronto │ 2731571 │
102+
│ Montreal │ 1704694 │
103+
│ Vancouver │ 631486 │
105+
└────────────────────────────────────┘
106+
```
107+
108+
2. If you update the source table in Databend, you can observe the same changes reflected in the attached table on Databend Cloud. For example, if you change the population of Toronto to 2,371,571 in the source table:
109+
110+
```sql
111+
UPDATE population
112+
SET population = 2371571
113+
WHERE city = 'Toronto';
114+
```
115+
116+
After executing the update, you can query both attached tables to verify that the changes are reflected:
117+
118+
```sql
119+
-- Check the updated population in the attached table with all columns
120+
SELECT population FROM population_all_columns WHERE city = 'Toronto';
121+
122+
-- Check the updated population in the attached table with only the city and population columns
123+
SELECT population FROM population_only WHERE city = 'Toronto';
124+
```
125+
126+
Expected output for both queries above:
127+
128+
```sql
129+
┌─────────────────┐
130+
│ population │
131+
├─────────────────┤
132+
│ 2371571 │
133+
└─────────────────┘
134+
```
135+
136+
3. If you drop the `province` column from the source table, it will no longer be available in the attached table for queries.
137+
138+
```sql
139+
ALTER TABLE population DROP province;
140+
```
141+
142+
After the column is dropped, any query that references it fails with an error. The remaining columns can still be queried successfully.
143+
144+
For example, attempting to query the dropped `province` column will fail:
145+
146+
```sql
147+
SELECT province FROM population_all_columns;
148+
error: APIError: QueryFailed: [1065]error:
149+
--> SQL:1:8
150+
|
151+
1 | SELECT province FROM population_all_columns
152+
| ^^^^^^^^ column province doesn't exist
153+
```
154+
155+
However, you can still retrieve the `city` and `population` columns:
156+
157+
```sql
158+
SELECT city, population FROM population_all_columns;
159+
160+
┌────────────────────────────────────┐
161+
│ city │ population │
162+
├──────────────────┼─────────────────┤
163+
│ Toronto │ 2371571 │
164+
│ Montreal │ 1704694 │
165+
│ Vancouver │ 631486 │
166+
└────────────────────────────────────┘
167+
```
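The second attached table never included `province`, so dropping that column in the source should not affect it. A minimal check, assuming the `population_only` table created in Step 2:

```sql
-- `province` is not part of population_only's column list, so the drop
-- in the source table is not expected to change what this query returns.
SELECT city, population FROM population_only;
```

This matches the rule that changes to non-included columns in the source table do not affect the attached table.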
