Skip to content

Commit 2c4aa3c

Browse files
beryllwwangjunbo
andauthored
[FLINK-36052][docs][cdc-pipeline-connector/es] Add flink cdc pipeline elasticsearch connector docs (#3686)
Co-authored-by: wangjunbo <[email protected]>
1 parent 4d0174c commit 2c4aa3c

File tree

3 files changed

+551
-1
lines changed
  • docs

3 files changed

+551
-1
lines changed
Lines changed: 275 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,275 @@
1+
---
2+
title: "Elasticsearch"
3+
weight: 7
4+
type: docs
5+
aliases:
6+
- /connectors/pipeline-connectors/elasticsearch
7+
---
8+
<!--
9+
Licensed to the Apache Software Foundation (ASF) under one
10+
or more contributor license agreements. See the NOTICE file
11+
distributed with this work for additional information
12+
regarding copyright ownership. The ASF licenses this file
13+
to you under the Apache License, Version 2.0 (the
14+
"License"); you may not use this file except in compliance
15+
with the License. You may obtain a copy of the License at
16+
17+
http://www.apache.org/licenses/LICENSE-2.0
18+
19+
Unless required by applicable law or agreed to in writing,
20+
software distributed under the License is distributed on an
21+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
22+
KIND, either express or implied. See the License for the
23+
specific language governing permissions and limitations
24+
under the License.
25+
-->
26+
27+
# Elasticsearch Pipeline Connector
28+
29+
Elasticsearch Pipeline 连接器可以用作 Pipeline 的 Data Sink, 将数据写入 Elasticsearch。 本文档介绍如何设置 Elasticsearch Pipeline 连接器。
30+
31+
32+
How to create Pipeline
33+
----------------
34+
35+
从 MySQL 读取数据同步到 Elasticsearch 的 Pipeline 可以定义如下:
36+
37+
```yaml
38+
source:
39+
type: mysql
40+
name: MySQL Source
41+
hostname: 127.0.0.1
42+
port: 3306
43+
username: admin
44+
password: pass
45+
tables: adb.\.*, bdb.user_table_[0-9]+, [app|web].order_\.*
46+
server-id: 5401-5404
47+
48+
sink:
49+
type: elasticsearch
50+
name: Elasticsearch Sink
51+
hosts: http://127.0.0.1:9092,http://127.0.0.1:9093
52+
53+
route:
54+
- source-table: adb.\.*
55+
sink-table: default_index
56+
description: sync adb.\.* table to default_index
57+
58+
pipeline:
59+
name: MySQL to Elasticsearch Pipeline
60+
parallelism: 2
61+
```
62+
63+
Pipeline Connector Options
64+
----------------
65+
<div class="highlight">
66+
<table class="colwidths-auto docutils">
67+
<thead>
68+
<tr>
69+
<th class="text-left" style="width: 25%">Option</th>
70+
<th class="text-left" style="width: 8%">Required</th>
71+
<th class="text-left" style="width: 7%">Default</th>
72+
<th class="text-left" style="width: 10%">Type</th>
73+
<th class="text-left" style="width: 50%">Description</th>
74+
</tr>
75+
</thead>
76+
<tbody>
77+
<tr>
78+
<td>type</td>
79+
<td>required</td>
80+
<td style="word-wrap: break-word;">(none)</td>
81+
<td>String</td>
82+
<td>指定要使用的连接器, 这里需要设置成 <code>'elasticsearch'</code>.</td>
83+
</tr>
84+
<tr>
85+
<td>name</td>
86+
<td>optional</td>
87+
<td style="word-wrap: break-word;">(none)</td>
88+
<td>String</td>
89+
<td>Sink 的名称。</td>
90+
</tr>
91+
<tr>
92+
<td>hosts</td>
93+
<td>required</td>
94+
<td style="word-wrap: break-word;">(none)</td>
95+
<td>String</td>
96+
<td>要连接到的一台或多台 Elasticsearch 主机,例如: 'http://host_name:9092,http://host_name:9093'.</td>
97+
</tr>
98+
<tr>
99+
<td>version</td>
100+
<td>optional</td>
101+
<td style="word-wrap: break-word;">7</td>
102+
<td>Integer</td>
103+
<td>指定要使用的连接器,有效值为:
104+
<ul>
105+
<li>6: 连接到 Elasticsearch 6.x 的集群。</li>
106+
<li>7: 连接到 Elasticsearch 7.x 的集群。</li>
107+
<li>8: 连接到 Elasticsearch 8.x 的集群。</li>
108+
</ul>
109+
</td>
110+
</tr>
111+
<tr>
112+
<td>username</td>
113+
<td>optional</td>
114+
<td style="word-wrap: break-word;">(none)</td>
115+
<td>String</td>
116+
<td>用于连接 Elasticsearch 实例认证的用户名。</td>
117+
</tr>
118+
<tr>
119+
<td>password</td>
120+
<td>optional</td>
121+
<td style="word-wrap: break-word;">(none)</td>
122+
<td>String</td>
123+
<td>用于连接 Elasticsearch 实例认证的密码。</td>
124+
</tr>
125+
<tr>
126+
<td>batch.size.max</td>
127+
<td>optional</td>
128+
<td style="word-wrap: break-word;">500</td>
129+
<td>Integer</td>
130+
<td>每个批量请求的最大缓冲操作数。 可以设置为'0'来禁用它。</td>
131+
</tr>
132+
<tr>
133+
<td>inflight.requests.max</td>
134+
<td>optional</td>
135+
<td style="word-wrap: break-word;">5</td>
136+
<td>Integer</td>
137+
<td>连接器将尝试执行的最大并发请求数。</td>
138+
</tr>
139+
<tr>
140+
<td>buffered.requests.max</td>
141+
<td>optional</td>
142+
<td style="word-wrap: break-word;">1000</td>
143+
<td>Integer</td>
144+
<td>每个批量请求的内存缓冲区中保留的最大请求数。</td>
145+
</tr>
146+
<tr>
147+
<td>batch.size.max.bytes</td>
148+
<td>optional</td>
149+
<td style="word-wrap: break-word;">5242880</td>
150+
<td>Long</td>
151+
<td>每个批量请求的缓冲操作在内存中的最大值(以byte为单位)。</td>
152+
</tr>
153+
<tr>
154+
<td>buffer.time.max.ms</td>
155+
<td>optional</td>
156+
<td style="word-wrap: break-word;">5000</td>
157+
<td>Long</td>
158+
<td>每个批量请求的缓冲 flush 操作的间隔(以ms为单位)。</td>
159+
</tr>
160+
<tr>
161+
<td>record.size.max.bytes</td>
162+
<td>optional</td>
163+
<td style="word-wrap: break-word;">10485760</td>
164+
<td>Long</td>
165+
<td>单个记录的最大大小(以byte为单位)。</td>
166+
</tr>
167+
</tbody>
168+
</table>
169+
</div>
170+
171+
Usage Notes
172+
--------
173+
174+
* 写入 Elasticsearch 的 index 默认为与上游表同名字符串,可以通过 pipeline 的 route 功能进行修改。
175+
176+
* 如果写入 Elasticsearch 的 index 不存在,不会被默认创建。
177+
178+
Data Type Mapping
179+
----------------
180+
Elasticsearch 将文档存储在 JSON 字符串中,数据类型之间的映射关系如下表所示:
181+
<div class="wy-table-responsive">
182+
<table class="colwidths-auto docutils">
183+
<thead>
184+
<tr>
185+
<th class="text-left">CDC type</th>
186+
<th class="text-left">JSON type</th>
187+
<th class="text-left" style="width:60%;">NOTE</th>
188+
</tr>
189+
</thead>
190+
<tbody>
191+
<tr>
192+
<td>TINYINT</td>
193+
<td>NUMBER</td>
194+
<td></td>
195+
</tr>
196+
<tr>
197+
<td>SMALLINT</td>
198+
<td>NUMBER</td>
199+
<td></td>
200+
</tr>
201+
<tr>
202+
<td>INT</td>
203+
<td>NUMBER</td>
204+
<td></td>
205+
</tr>
206+
<tr>
207+
<td>BIGINT</td>
208+
<td>NUMBER</td>
209+
<td></td>
210+
</tr>
211+
<tr>
212+
<td>FLOAT</td>
213+
<td>NUMBER</td>
214+
<td></td>
215+
</tr>
216+
<tr>
217+
<td>DOUBLE</td>
218+
<td>NUMBER</td>
219+
<td></td>
220+
</tr>
221+
<tr>
222+
<td>DECIMAL(p, s)</td>
223+
<td>STRING</td>
224+
<td></td>
225+
</tr>
226+
<tr>
227+
<td>BOOLEAN</td>
228+
<td>BOOLEAN</td>
229+
<td></td>
230+
</tr>
231+
<tr>
232+
<td>DATE</td>
233+
<td>STRING</td>
234+
<td>with format: date (yyyy-MM-dd), example: 2024-10-21</td>
235+
</tr>
236+
<tr>
237+
<td>TIMESTAMP</td>
238+
<td>STRING</td>
239+
<td>with format: date-time (yyyy-MM-dd HH:mm:ss.SSSSSS, with UTC time zone), example: 2024-10-21 14:10:56.000000</td>
240+
</tr>
241+
<tr>
242+
<td>TIMESTAMP_LTZ</td>
243+
<td>STRING</td>
244+
<td>with format: date-time (yyyy-MM-dd HH:mm:ss.SSSSSS, with UTC time zone), example: 2024-10-21 14:10:56.000000</td>
245+
</tr>
246+
<tr>
247+
<td>CHAR(n)</td>
248+
<td>STRING</td>
249+
<td></td>
250+
</tr>
251+
<tr>
252+
<td>VARCHAR(n)</td>
253+
<td>STRING</td>
254+
<td></td>
255+
</tr>
256+
<tr>
257+
<td>ARRAY</td>
258+
<td>ARRAY</td>
259+
<td></td>
260+
</tr>
261+
<tr>
262+
<td>MAP</td>
263+
<td>STRING</td>
264+
<td></td>
265+
</tr>
266+
<tr>
267+
<td>ROW</td>
268+
<td>STRING</td>
269+
<td></td>
270+
</tr>
271+
</tbody>
272+
</table>
273+
</div>
274+
275+
{{< top >}}

0 commit comments

Comments
 (0)