Commit 7ad245c

Merge branch 'v1.5.0_dev_feature_kudu' into 'v1.5.0_dev'
Add documentation for kuduSink.md and kuduSide.md. See merge request !114
2 parents 3ee2a1d + d7ac275 commit 7ad245c

3 files changed, +193 -2 lines changed

README.md

Lines changed: 4 additions & 2 deletions
@@ -10,8 +10,8 @@
 
 # Supported
 * Source tables: kafka 0.9, 1.x
-* Side tables: mysql,SQlServer,oracle,hbase,mongo,redis,cassandra
-* Result tables: mysql,SQlServer,oracle,hbase,elasticsearch5.x,mongo,redis,cassandra
+* Side tables: mysql, SQlServer, oracle, hbase, mongo, redis, cassandra, kudu
+* Result tables: mysql, SQlServer, oracle, hbase, elasticsearch5.x, mongo, redis, cassandra, kudu
 
 # Development roadmap
 * Add CEP support to SQL
@@ -154,13 +154,15 @@ sh submit.sh -sql D:\sideSql.txt -name xctest -remoteSqlPluginPath /opt/dtstack
 * [mongo result table plugin](docs/mongoSink.md)
 * [redis result table plugin](docs/redisSink.md)
 * [cassandra result table plugin](docs/cassandraSink.md)
+* [kudu result table plugin](docs/kuduSink.md)
 
 ### 2.3 Side table plugins
 * [hbase side table plugin](docs/hbaseSide.md)
 * [mysql side table plugin](docs/mysqlSide.md)
 * [mongo side table plugin](docs/mongoSide.md)
 * [redis side table plugin](docs/redisSide.md)
 * [cassandra side table plugin](docs/cassandraSide.md)
+* [kudu side table plugin](docs/kuduSide.md)
 
 ## 3 Performance metrics (new)

docs/kuduSide.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
## 1. Format:
ALL:
```
create table sideTable(
    id int,
    tablename1 VARCHAR,
    PRIMARY KEY(id),
    PERIOD FOR SYSTEM_TIME
)WITH(
    type='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.testSide',
    cache ='ALL',
    primaryKey='id,xx',
    lowerBoundPrimaryKey='10,xx',
    upperBoundPrimaryKey='15,xx',
    workerCount='1',
    defaultOperationTimeoutMs='600000',
    defaultSocketReadTimeoutMs='6000000',
    batchSizeBytes='100000000',
    limitNum='1000',
    isFaultTolerant='false',
    partitionedJoin='false'
);
```
LRU:
```
create table sideTable(
    id int,
    tablename1 VARCHAR,
    PRIMARY KEY(id),
    PERIOD FOR SYSTEM_TIME
)WITH(
    type='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.testSide',
    cache ='LRU',
    workerCount='1',
    defaultOperationTimeoutMs='600000',
    defaultSocketReadTimeoutMs='6000000',
    batchSizeBytes='100000000',
    limitNum='1000',
    isFaultTolerant='false',
    partitionedJoin='false'
);
```

## 2. Supported versions
kudu 1.9.0+cdh6.2.0

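## 3. Table structure definition

|Parameter|Description|
|----|---|
| tableName | Table name registered in flink (optional; defaults to the name of the corresponding kudu table)|
| colName | Column name|
| colType | Column type; see [types supported by colType](colType.md)|
| PERIOD FOR SYSTEM_TIME | Keyword marking the table as a side table|
| PRIMARY KEY(keyInfo) | Primary key of the side table; separate multiple columns with commas|
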
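A side table declared with `PERIOD FOR SYSTEM_TIME` is only useful once a stream is joined against it. The following is a minimal sketch of such a join, not a definitive query: `MyTable` and `MyResult` are hypothetical tables assumed to be declared elsewhere in the same SQL file, and the plain `join ... on` form is assumed to match what the project's other side table plugins accept.

```sql
-- Sketch only. MyTable (a stream source) and MyResult (a result table)
-- are hypothetical and must be declared before this statement.
-- sideTable is the kudu side table from the format section above.
insert into MyResult
select
    s.id,
    d.tablename1
from MyTable s
join sideTable d
    on s.id = d.id;  -- join on the side table's PRIMARY KEY column
```

Joining on the column named in `PRIMARY KEY(...)` keeps each lookup to a single key, which is what the cache options are built around.
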
## 4. Parameters

|Parameter|Description|Required|Default|
|----|---|---|-----|
| type | Side table type [hbase\|mysql\|kudu]|||
| kuduMasters | Addresses of the kudu master nodes, comma-separated (e.g. ip1,ip2,ip3)|||
| tableName | Name of the kudu table|||
| workerCount | Number of worker threads|||
| defaultOperationTimeoutMs | Operation timeout, in ms|||
| defaultSocketReadTimeoutMs | Socket read timeout, in ms|||
| primaryKey | Primary key columns to filter on; ALL mode only|||
| lowerBoundPrimaryKey | Lower bound of the primary key filter; ALL mode only|||
| upperBoundPrimaryKey | Upper bound of the primary key filter (exclusive); ALL mode only|||
| batchSizeBytes | Size of the returned data, in bytes|||
| limitNum | Number of rows returned|||
| isFaultTolerant | Whether the scan is fault tolerant, i.e. whether a failed scan is retried against another replica||false|
| cache | Side table cache policy (NONE/LRU/ALL)||NONE|
| partitionedJoin | Whether to apply a keyBy on the configured key before the side table join (this can reduce the amount of side table data cached)||false|

--------------
> Cache policy
* NONE: no in-memory cache
* LRU:
    * cacheSize: number of entries to cache
    * cacheTTLMs: cache expiry time (ms)

## 5. Examples
ALL:
```
create table sideTable(
    id int,
    tablename1 VARCHAR,
    PRIMARY KEY(id),
    PERIOD FOR SYSTEM_TIME
)WITH(
    type='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.testSide',
    cache ='ALL',
    primaryKey='id,xx',
    lowerBoundPrimaryKey='10,xx',
    upperBoundPrimaryKey='15,xx',
    partitionedJoin='false'
);
```
LRU:
```
create table sideTable(
    id int,
    tablename1 VARCHAR,
    PRIMARY KEY(id),
    PERIOD FOR SYSTEM_TIME
)WITH(
    type='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.testSide',
    cache ='LRU',
    partitionedJoin='false'
);
```
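
The cache policy notes list `cacheSize` and `cacheTTLMs` for the LRU policy, but the LRU example leaves both unset. A minimal sketch of an LRU side table that bounds the cache explicitly, assuming the two options are passed in the `WITH` clause like every other option; the values `10000` and `60000` are illustrative, not recommended defaults.

```sql
-- Sketch only: cacheSize and cacheTTLMs values are placeholders.
create table sideTable(
    id int,
    tablename1 VARCHAR,
    PRIMARY KEY(id),
    PERIOD FOR SYSTEM_TIME
)WITH(
    type='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.testSide',
    cache ='LRU',
    cacheSize ='10000',     -- assumed: keep at most 10000 entries in memory
    cacheTTLMs ='60000',    -- assumed: expire cached entries after 60 s
    partitionedJoin='false'
);
```

A shorter `cacheTTLMs` trades extra kudu lookups for fresher side table data; a larger `cacheSize` does the opposite at the cost of memory.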

docs/kuduSink.md

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
## 1. Format:
```
CREATE TABLE tableName(
    colName colType,
    ...
    colNameX colType
)WITH(
    type ='kudu',
    kuduMasters ='ip1,ip2,ip3',
    tableName ='impala::default.test',
    writeMode='upsert',
    workerCount='1',
    defaultOperationTimeoutMs='600000',
    defaultSocketReadTimeoutMs='6000000',
    parallelism ='parllNum'
);
```

## 2. Supported versions
kudu 1.9.0+cdh6.2.0

## 3. Table structure definition

|Parameter|Description|
|----|---|
| tableName | Name used in SQL, i.e. the name registered on flink-table-env|
| colName | Column name|
| colType | Column type; see [types supported by colType](colType.md)|

## 4. Parameters:

|Parameter|Description|Required|Default|
|----|---|---|-----|
| type | Output table type [mysql\|hbase\|elasticsearch\|redis\|kudu]|||
| kuduMasters | Addresses of the kudu master nodes, comma-separated (e.g. ip1,ip2,ip3)|||
| tableName | Name of the kudu table|||
| writeMode | Write mode for kudu: insert\|update\|upsert | No | upsert |
| workerCount | Number of worker threads|||
| defaultOperationTimeoutMs | Write operation timeout, in ms|||
| defaultSocketReadTimeoutMs | Socket read timeout, in ms|||
| parallelism | Parallelism setting||1|

## 5. Example:
```
CREATE TABLE MyResult(
    id int,
    title VARCHAR,
    amount decimal,
    tablename1 VARCHAR
)WITH(
    type ='kudu',
    kuduMasters ='localhost1,localhost2,localhost3',
    tableName ='impala::default.test',
    writeMode='upsert',
    parallelism ='1'
);
```
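
The result table only receives data once a query writes into it. A minimal sketch of such a statement, assuming a hypothetical source table `MyTable` that is declared elsewhere in the same SQL file and exposes the same four columns:

```sql
-- Sketch only. MyTable is a hypothetical source table, not part of this doc.
-- Column order follows the MyResult definition in the example above.
insert into MyResult
select
    id,
    title,
    amount,
    tablename1
from MyTable;
```

With `writeMode='upsert'`, a row whose primary key already exists in the kudu table is overwritten rather than rejected, so replaying the same input should leave the table unchanged.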
