Commit 553b91f

docs(weekly): add this week in databend 85

Signed-off-by: Chojan Shang <[email protected]>
---
title: "This Week in Databend #85"
date: 2023-03-17
slug: 2023-03-17-databend-weekly
tags: [databend, weekly]
description: "Get to know the latest updates on Databend this week!"
contributors:
  - name: andylokandy
  - name: ariesdevil
  - name: b41sh
  - name: BohuTANG
  - name: Carlosfengv
  - name: Chasen-Zhang
  - name: dantengsky
  - name: dependabot[bot]
  - name: drmingdrmer
  - name: everpcpc
  - name: jun0315
  - name: leiysky
  - name: lichuang
  - name: mergify[bot]
  - name: PsiACE
  - name: RinChanNOWWW
  - name: soyeric128
  - name: sundy-li
  - name: TCeason
  - name: wubx
  - name: Xuanwo
  - name: xudong963
  - name: youngsofun
  - name: zhang2014
  - name: zhyass
authors:
  - name: PsiACE
    url: https://github.com/psiace
    image_url: https://github.com/psiace.png
---

[Databend](https://github.com/datafuselabs/databend) is a modern cloud data warehouse, serving your massive-scale analytics needs at low cost and complexity. It is an open-source alternative to Snowflake and is also available in the cloud: <https://app.databend.com>.

> :loudspeaker: Read our blog *[Way to Go: OpenDAL successfully entered Apache Incubator](https://databend.rs/blog/opendal-enters-apache-incubator)* to learn about the story of [OpenDAL](https://github.com/apache/incubator-opendal).

## What's On In Databend

Stay connected with the latest news about Databend.

### Data Type: MAP

The MAP data structure holds `Key:Value` pairs using a nested `Array(Tuple(key, value))` and is useful when the data type is constant but the Key's value cannot be fully determined. The Key must be of a specified basic data type, and duplicates are not allowed; the Value can be any data type, including nested arrays or tuples. A bloom filter index is created for the MAP, making it easier and faster to search for values.

```sql
select * from nginx_log where log['ip'] = '205.91.162.148';
+----+----------------------------------------+
| id | log                                    |
+----+----------------------------------------+
| 1  | {'ip':'205.91.162.148','url':'test-1'} |
+----+----------------------------------------+
1 row in set
```

If you want to learn more about the MAP data type, please read the following material:

- [Docs | Data Types - Map](https://databend.rs/doc/sql-reference/data-types/data-type-map)
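
As a sketch, a table like the one queried above might be defined as follows. The DDL below is illustrative only; see the docs above for the authoritative syntax.

```sql
-- Illustrative sketch: a MAP column keyed by STRING.
CREATE TABLE nginx_log (
    id  INT,
    log MAP(STRING, STRING)
);

INSERT INTO nginx_log VALUES (1, {'ip':'205.91.162.148','url':'test-1'});
```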

### Data Transformation During Loading Process

Do you remember the two RFCs mentioned last week? Databend now supports transforming data while loading it into tables. Basic transformations can be performed with the `COPY INTO <table>` command.

```sql
CREATE TABLE my_table(id int, name string, time date);

COPY INTO my_table
FROM (SELECT t.id, t.name, to_date(t.timestamp) FROM @mystage t)
FILE_FORMAT = (type = parquet) PATTERN='.*parquet';
```

This feature avoids storing pre-transformed data in temporary tables and supports column reordering, column omission, and type conversion operations. In addition, partial data can be loaded from staged Parquet files, or their columns can be rearranged. It simplifies and streamlines ETL processes, allowing users to focus on data analysis rather than mechanically moving data around.
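
For instance, column omission can be sketched like this (hypothetical table and stage names; only the columns referenced in the SELECT are read from the staged files):

```sql
-- Illustrative: load only two of the columns present in the staged Parquet files.
CREATE TABLE names_only(id int, name string);

COPY INTO names_only
FROM (SELECT t.id, t.name FROM @mystage t)
FILE_FORMAT = (type = parquet) PATTERN='.*parquet';
```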

If you're interested, check the following documentation:

- [Docs | Transforming Data During a Load](https://databend.rs/doc/load-data/data-load-transform)
- [PR | feat(storage): Map data type support bloom filter](https://github.com/datafuselabs/databend/pull/10457)
## Code Corner

Discover some fascinating code snippets or projects that showcase our work or learning journey.

### Run Multiple Futures in Parallel

Are you interested in how to run futures in parallel? By utilizing this technique, Databend has greatly improved scanning performance in situations with a huge number of files.

The following code, which is less than 30 lines long, shows how it all works.
```rust
/// Run multiple futures in parallel,
/// using a semaphore to limit the degree of parallelism and a specified thread pool to run the futures.
/// It waits for all futures to complete and returns their results.
pub async fn execute_futures_in_parallel<Fut>(
    futures: impl IntoIterator<Item = Fut>,
    thread_nums: usize,
    permit_nums: usize,
    thread_name: String,
) -> Result<Vec<Fut::Output>>
where
    Fut: Future + Send + 'static,
    Fut::Output: Send + 'static,
{
    // 1. Build the runtime.
    let semaphore = Semaphore::new(permit_nums);
    let runtime = Arc::new(Runtime::with_worker_threads(
        thread_nums,
        Some(thread_name),
    )?);

    // 2. Spawn all the tasks to the runtime, gated by the semaphore.
    let join_handlers = runtime.try_spawn_batch(semaphore, futures).await?;

    // 3. Collect all the results.
    future::try_join_all(join_handlers)
        .await
        .map_err(|e| ErrorCode::Internal(format!("try join all futures failure, {}", e)))
}
```

If you are interested in this Rust trick, you can read this PR: [feat: improve the parquet get splits to parallel](https://github.com/datafuselabs/databend/pull/10514).
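
The same pattern can be sketched outside Databend with a semaphore-capped gather. This is a hypothetical Python analogue, not Databend's API: a semaphore bounds how many coroutines run at once, and the results come back in input order.

```python
import asyncio

# Hypothetical analogue of execute_futures_in_parallel: the semaphore
# caps how many coroutines run concurrently; gather preserves input order.
async def run_in_parallel(coros, permit_nums):
    sem = asyncio.Semaphore(permit_nums)

    async def guarded(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))

async def square(i):
    await asyncio.sleep(0.01)  # stand-in for real I/O work
    return i * i

results = asyncio.run(run_in_parallel([square(i) for i in range(5)], permit_nums=2))
print(results)  # [0, 1, 4, 9, 16]
```

Unlike the Rust version, this sketch reuses the current event loop rather than spinning up a dedicated thread pool, but the semaphore-gating idea is the same.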

### How to Create a System Table

System tables provide information about Databend's internal state, such as databases, tables, functions, and settings.

If you are interested in creating system tables, check out our recently released documentation, which introduces the implementation, registration, and testing of system tables, using the `system.credits` table as an example.

Here is a code snippet:
```rust
let table_info = TableInfo {
    desc: "'system'.'credits'".to_string(),
    name: "credits".to_string(),
    ident: TableIdent::new(table_id, 0),
    meta: TableMeta {
        schema,
        engine: "SystemCredits".to_string(),
        ..Default::default()
    },
    ..Default::default()
};
```

- [Docs | How to Create a System Table](https://databend.rs/doc/contributing/how-to-write-a-system-table)

## Highlights

Here are some noteworthy items recorded here; perhaps you can find something that interests you.

- We suggest that users run `unset max_storage_io_requests` to fall back to `num_cpu` as the default value when upgrading to **1.0.17-nightly** or above.
- Databend can now integrate with MindsDB to provide users with machine learning workflow support. *[Bringing in-database ML to Databend](https://mindsdb.com/integrations/databend-machine-learning)*
- If you happen to use HDFS and are interested in Databend, why not try our WebHDFS storage backend? This blog post may help: *[How to Configure WebHDFS as a Storage Backend for Databend](https://databend.rs/blog/2023-03-13-webhdfs-storage-for-backend)*
## What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

### Support Quantile with a List

Since PR [#10474](https://github.com/datafuselabs/databend/pull/10474) was merged, Databend has supported quantile aggregation functions, but currently only a single floating-point value can be set as the level. Also supporting a list of levels could simplify SQL writing in some scenarios.

```sql
SELECT QUANTILE([0.25, 0.5, 0.75])(number) FROM numbers(25);
+-------------------------------------+
| quantile([0.25, 0.5, 0.75])(number) |
+-------------------------------------+
| [6, 12, 18]                         |
+-------------------------------------+
```

[Feature: quantile support list and add functions kurtosis() and skewness()](https://github.com/datafuselabs/databend/issues/10589)
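
To see why `numbers(25)` yields exactly `[6, 12, 18]`, here is a plain-Python sketch of list-valued quantiles with linear interpolation (illustrative only; the engine's interpolation rules may differ):

```python
# Linear-interpolation quantiles over sorted values, taking a list of levels.
def quantiles(values, levels):
    data = sorted(values)
    n = len(data)
    out = []
    for q in levels:
        pos = q * (n - 1)                  # fractional rank of the level
        lo, frac = int(pos), pos - int(pos)
        hi = min(lo + 1, n - 1)
        out.append(data[lo] + (data[hi] - data[lo]) * frac)
    return out

# numbers(25) produces 0..24, so the three quartiles land exactly on values.
print(quantiles(range(25), [0.25, 0.5, 0.75]))  # [6.0, 12.0, 18.0]
```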

Additionally, the `kurtosis(x)` and `skewness(x)` functions mentioned in the issue may also be a good starting point for contributing to Databend.
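
The moments behind those two proposed aggregates can be sketched in a few lines (illustrative population-moment definitions; the eventual semantics, e.g. sample vs. population or excess kurtosis, are up to the implementation):

```python
# Population skewness: third central moment over the 1.5th power of variance.
def skewness(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Population kurtosis: fourth central moment over squared variance.
def kurtosis(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2  # subtract 3 for excess kurtosis

print(skewness([1.0, 2.0, 3.0]), kurtosis([1.0, 2.0, 3.0]))  # 0.0 1.5
```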

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at <https://link.databend.rs/i-m-feeling-lucky> to get started.

## Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

**Full Changelog**: <https://github.com/datafuselabs/databend/compare/v1.0.11-nightly...v1.0.21-nightly>

website/src/components/BaseComponents/AvatarGroup/index.js (4 additions, 0 deletions)

```diff
@@ -10,6 +10,10 @@ const Avatar = ({ name }) => {
     url = "http://github.com/app/mergify[bot]";
     image_url = "https://avatars.githubusercontent.com/in/10562";
   }
+  if (name == "dependabot[bot]") {
+    url = "http://github.com/app/dependabot[bot]";
+    image_url = "https://avatars.githubusercontent.com/in/29110";
+  }
   return (
     <Tooltip content={name}>
       <a href={url} className={styles.Avatar}>
```
