@@ -91,9 +91,113 @@ aws s3 ls s3://databend-doc/logs/
9191If the log file has been successfully synced to S3, you should see output similar to this:
9292
9393``` bash
94+ 2024-12-10 15:22:13 0
95+ 2024-12-10 17:52:42 112 1733871161-7b89e50a-6eb4-4531-8479-dd46981e4674.log.gz
96+ ```
97+
98+ You can now download the synced log file from your bucket:
99+
100+ ``` bash
101+ aws s3 cp s3://databend-doc/logs/1733871161-7b89e50a-6eb4-4531-8479-dd46981e4674.log.gz ~/Documents/
102+ ```
103+
104+ Compared to the original log, the synced log is in NDJSON format, with each record wrapped in an outer `log` field:
105+
106+ ``` json
107+ {"log":{"event":"login","timestamp":"2024-12-08T10:00:00Z","user_id":1}}
108+ {"log":{"event":"purchase","timestamp":"2024-12-08T10:05:00Z","user_id":2}}
109+ ```
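
To check the wrapping locally before loading anything, the downloaded archive can be decompressed to standard output. The following is a minimal sketch, not part of the original tutorial: `unwrap_log` is a hypothetical helper name, and the `sed` expressions assume every line has exactly the compact `{"log":{...}}` shape shown above (a JSON-aware tool would be more robust).

```shell
# Hypothetical helper: print each record of a gzipped NDJSON file
# with the outer "log" wrapper removed.
# Assumption: every line looks exactly like {"log":{...}} with no extra whitespace.
unwrap_log() {
  # gunzip -c writes the decompressed content to stdout and leaves the file in place.
  # The first sed expression drops the leading {"log": and the second drops the final }.
  gunzip -c "$1" | sed -e 's/^{"log"://' -e 's/}$//'
}

# Example call (the path follows the download command above):
# unwrap_log ~/Documents/1733871161-7b89e50a-6eb4-4531-8479-dd46981e4674.log.gz
```

Each printed line should then match the fields that the `logs` table expects: `event`, `timestamp`, and `user_id`.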
110+
111+ ## Step 4: Create Task in Databend Cloud
112+
113+ 1. Open a worksheet, and create an external stage that links to the `logs` folder in your bucket:
114+
115+ ``` sql
116+ CREATE STAGE mylog 's3://databend-doc/logs/' CONNECTION=(
117+     ACCESS_KEY_ID = '<your-access-key-id>',
118+     SECRET_ACCESS_KEY = '<your-secret-access-key>'
119+ );
120+ ```
121+
122+ Once the stage is successfully created, you can list the files in it:
123+
124+ ``` sql
125+ LIST @mylog;
126+
127+ ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
128+ │                          name                          │  size  │                md5                 │         last_modified         │     creator      │
129+ ├────────────────────────────────────────────────────────┼────────┼────────────────────────────────────┼───────────────────────────────┼──────────────────┤
130+ │ 1733871161-7b89e50a-6eb4-4531-8479-dd46981e4674.log.gz │    112 │ "231ddcc590222bfaabd296b151154844" │ 2024-12-10 22:52:42.000 +0000 │ NULL             │
131+ └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
132+ ```
133+
134+ 2. Create a table with columns mapped to the fields in the log:
135+
136+ ``` sql
137+ CREATE TABLE logs (
138+     event String,
139+     timestamp Timestamp,
140+     user_id Int32
141+ );
142+ ```
94143
144+ 3. Create a scheduled task to load logs from the external stage into the `logs` table:
145+
146+ ``` sql
147+ CREATE TASK IF NOT EXISTS myvectortask
148+     WAREHOUSE = 'eric'
149+     SCHEDULE = 1 MINUTE
150+     SUSPEND_TASK_AFTER_NUM_FAILURES = 3
151+ AS
152+ COPY INTO logs
153+ FROM (
154+     SELECT $1:log:event, $1:log:timestamp, $1:log:user_id
155+     FROM @mylog/
156+ )
157+ FILE_FORMAT = (TYPE = NDJSON, COMPRESSION = AUTO)
158+ MAX_FILES = 10000
159+ PURGE = TRUE;
95160```
96161
162+ 4. Start the task:
163+
164+ ``` sql
165+ ALTER TASK myvectortask RESUME;
166+ ```
167+
168+ Wait a moment, then check whether the logs have been loaded into the table:
169+
170+ ``` sql
171+ SELECT * FROM logs;
172+
173+ ┌──────────────────────────────────────────────────────────┐
174+ │      event       │      timestamp      │     user_id     │
175+ ├──────────────────┼─────────────────────┼─────────────────┤
176+ │ login            │ 2024-12-08 10:00:00 │               1 │
177+ │ purchase         │ 2024-12-08 10:05:00 │               2 │
178+ └──────────────────────────────────────────────────────────┘
179+ ```
180+
181+ If you run `LIST @mylog;` now, you will see no files listed. This is because the task is configured with `PURGE = TRUE`, which deletes the synced files from S3 after the logs are loaded.
182+
183+ Now, let's simulate generating two more logs in the local log file `app.log`:
184+
185+ ``` bash
186+ echo '{"user_id": 3, "event": "logout", "timestamp": "2024-12-08T10:10:00Z"}' >> /Users/eric/Documents/logs/app.log
187+ echo '{"user_id": 4, "event": "login", "timestamp": "2024-12-08T10:15:00Z"}' >> /Users/eric/Documents/logs/app.log
188+ ```
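
For repeated experiments, a small shell function can append records instead of typing each `echo` by hand. This is only a sketch under stated assumptions: `append_log` is a hypothetical helper (not part of Vector or Databend), it stamps each record with the current UTC time rather than the fixed timestamps used in this tutorial, and it does not escape quotes inside the event name.

```shell
# Hypothetical helper: append one JSON log record to a file.
# Usage: append_log <file> <user_id> <event>
# Assumptions: <user_id> is an integer and <event> contains no double quotes.
append_log() {
  # date -u prints the current time in UTC, formatted like 2024-12-08T10:10:00Z.
  printf '{"user_id": %d, "event": "%s", "timestamp": "%s"}\n' \
    "$2" "$3" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$1"
}

# Example call (path taken from the commands above):
# append_log /Users/eric/Documents/logs/app.log 5 purchase
```

Records added this way carry the time at which the function ran, so the query results will differ from the sample output shown in this tutorial.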
97189
190+ Wait a moment for the log to sync to S3 (a new file should appear in the `logs` folder). The scheduled task will then load the new logs into the table. If you query the table again, you will see the new records alongside the earlier ones:
98191
192+ ``` sql
193+ SELECT * FROM logs;
99194
195+ ┌──────────────────────────────────────────────────────────┐
196+ │      event       │      timestamp      │     user_id     │
197+ ├──────────────────┼─────────────────────┼─────────────────┤
198+ │ logout           │ 2024-12-08 10:10:00 │               3 │
199+ │ login            │ 2024-12-08 10:15:00 │               4 │
200+ │ login            │ 2024-12-08 10:00:00 │               1 │
201+ │ purchase         │ 2024-12-08 10:05:00 │               2 │
202+ └──────────────────────────────────────────────────────────┘
203+ ```