Commit 959bc86

docs: update as eric's comments
Signed-off-by: Chojan Shang <[email protected]>
1 parent 8bbd6cf commit 959bc86

1 file changed

+84
-112
lines changed

@@ -1,155 +1,135 @@
11
---
2-
title: How to Write a System Table
2+
title: How to Create a System Table
33
---
44

5-
System tables are special tables that provide information about Databend's internal state, such as databases, tables, functions, settings, etc. In this document, we will show you how to write a new system table for Databend using the credits table as an example.
5+
System tables are tables that provide information about Databend's internal state, such as databases, tables, functions, and settings. If you're familiar with the Databend code structure and have basic knowledge of Rust, you can create your own system tables as needed.
66

7-
The credits table returns information about the upstream dependencies used by Databend, including their names, versions and licenses.
7+
Creating a system table mainly involves defining the table information (table name and schema) and how to generate and retrieve data for the table. This is done by implementing either the `SyncSystemTable` or the `AsyncSystemTable` trait.
88

9-
## Prerequisites
9+
This guide will show you how to create a new system table for Databend, using the table [system.credits](https://databend.rs/doc/sql-reference/system-tables/system-credits) as an example. The table provides information about Databend's upstream dependencies, and its code is located at `src/query/storage/system/src/credits_table.rs`.
1010

11-
To write a new system table for Databend, you need to have some basic knowledge of Rust programming language and Databend's code structure.
11+
:::note
12+
Databend suggests that you store the code for new system tables in the directory `src/query/storage/system/src/`. However, there may be situations where you cannot do so, such as issues related to the build process. In such cases, you can place it temporarily in a directory called `src/query/service/src/databases/system` (although this is not recommended).
13+
:::
1214

13-
## Location
15+
## Creating a System Table
1416

15-
The existing system tables for Databend are located in the `query/storage` directory. You should place your new system table file in this directory as well, unless there are some special build reasons that prevent you from doing so. In that case, you can temporarily place it in the `service/databases/system` directory (not recommended).
17+
The following walks through the implementation of the table `system.credits` step by step.
1618

17-
## Definition
19+
1. Define a struct for your system table that contains only the fields for storing the table information.
1820

19-
The definition of a system table mainly focuses on two aspects: one is the table information, which includes the table `name` and `schema`, etc.; the other is the data generation/retrieval logic for the table content. These two aspects correspond to two traits: `SyncSystemTable` and `AsyncSystemTable`. You need to implement one of these traits depending on whether your data retrieval involves asynchronous function calls or not.
20-
21-
## Implementation
22-
23-
In this section, we will walk through the implementation of the credits table step by step. The code file is located at `credits_table.rs`.
24-
25-
Firstly, you need to define a struct for your system table that contains only the fields for storing the table information. For example:
26-
27-
```rust
28-
pub struct CreditsTable {
29-
table_info: TableInfo,
30-
}
31-
```
32-
33-
Next, you need to implement a create method for your system table struct that takes a `table_id` as an argument and returns an `Arc<dyn Table>`. The `table_id` is generated by `sys_db_meta.next_table_id()` when creating a new system table.
34-
35-
```rust
36-
pub fn create(table_id: u64) -> Arc<dyn Table>
37-
```
21+
```rust
22+
pub struct CreditsTable {
23+
    table_info: TableInfo,
24+
}
25+
```
3826

39-
Inside this method, you need to define a schema for your system table using `TableSchemaRefExt` and `TableField`. The schema describes the structure of your system table with field names and types depending on the data you want to store in it.
27+
2. Implement a `create` method for your system table struct that takes `table_id` as an argument and returns `Arc<dyn Table>`. The `table_id` is generated by `sys_db_meta.next_table_id()` when creating a new system table.
4028

41-
For example:
29+
```rust
30+
pub fn create(table_id: u64) -> Arc<dyn Table>
31+
```
4232

43-
```rust
44-
let schema = TableSchemaRefExt::create(vec![
45-
TableField::new("name", TableDataType::String),
46-
TableField::new("version", TableDataType::String),
47-
TableField::new("license", TableDataType::String),
48-
]);
49-
```
33+
3. Define a schema for your system table using `TableSchemaRefExt` and `TableField`. The schema describes the structure of your system table with field names and types depending on the data you want to store in it.
5034

51-
For string-type data, you can use `TableDataType::String`; other basic types are similar. But if you need to allow null values in your field, such as an optional 64-bit unsigned integer field, you can use `TableDataType::Nullable(Box::new(TableDataType::Number(NumberDataType::UInt64)))` instead. `TableDataType::Nullable` indicates that null values are allowed; `TableDataType::Number(NumberDataType::UInt64)` represents that the type is 64-bit unsigned integer.
35+
For string-type data, you can use `TableDataType::String`; other basic types are similar. If a field must allow null values, such as an optional 64-bit unsigned integer field, use `TableDataType::Nullable(Box::new(TableDataType::Number(NumberDataType::UInt64)))` instead. `TableDataType::Nullable` indicates that null values are allowed; `TableDataType::Number(NumberDataType::UInt64)` specifies a 64-bit unsigned integer type.
5236

53-
After defining the schema, you need to define some metadata for your system table, such as description (`desc`), `name`, `meta`, etc. You can follow other existing examples and fill in these fields accordingly.
37+
```rust
38+
let schema = TableSchemaRefExt::create(vec![
39+
    TableField::new("name", TableDataType::String),
40+
    TableField::new("version", TableDataType::String),
41+
    TableField::new("license", TableDataType::String),
42+
]);
43+
```
5444

55-
For example:
45+
4. Define metadata for your system table, such as description (`desc`), `name`, `meta`, etc. You can follow other existing examples and fill in these fields accordingly.
5646

57-
```rust
58-
let table_info = TableInfo {
59-
desc: "'system'.'credits'".to_string(),
60-
name: "credits".to_string(),
61-
ident: TableIdent::new(table_id, 0),
62-
meta: TableMeta {
63-
schema,
64-
engine: "SystemCredits".to_string(),
65-
..Default::default()
66-
},
67-
..Default::default()
68-
};
69-
70-
SyncOneBlockSystemTable::create(CreditsTable { table_info })
71-
```
47+
```rust
48+
let table_info = TableInfo {
49+
    desc: "'system'.'credits'".to_string(),
50+
    name: "credits".to_string(),
51+
    ident: TableIdent::new(table_id, 0),
52+
    meta: TableMeta {
53+
        schema,
54+
        engine: "SystemCredits".to_string(),
55+
        ..Default::default()
56+
    },
57+
    ..Default::default()
58+
};
7259

73-
Finally, you need to create an instance of your system table struct with these fields and wrap it with either `SyncOneBlockSystemTable` or `AsyncOneBlockSystemTable` depending on whether your data retrieval logic is synchronous or asynchronous.
60+
SyncOneBlockSystemTable::create(CreditsTable { table_info })
61+
```
7462

75-
Next, you need to implement either `SyncSystemTable` or `AsyncSystemTable` trait for your system table struct. `SyncSystemTable` requires you to define `NAME` constant and implement four methods: `get_table_info()`, `get_full_data()`, `get_partitions()` and `truncate()`. However, the last two methods have default implementations, so you don't need to implement them yourself in most cases. (`AsyncSystemTable` is similar, but it doesn't have `truncate()` method.)
63+
5. Create an instance of your system table struct with these fields and wrap it with either `SyncOneBlockSystemTable` or `AsyncOneBlockSystemTable`, depending on whether your data retrieval is synchronous or asynchronous.
7664

77-
`NAME` constant follows the format of `system.<name>`.
65+
6. Implement either the `SyncSystemTable` or the `AsyncSystemTable` trait for your system table struct. `SyncSystemTable` requires you to define a `NAME` constant and implement four methods: `get_table_info()`, `get_full_data()`, `get_partitions()`, and `truncate()`. However, the last two methods have default implementations, so you don't need to implement them yourself in most cases. (`AsyncSystemTable` is similar, but it doesn't have a `truncate()` method.)
7866

79-
```rust
80-
const NAME: &'static str = "system.credits";
81-
```
67+
The `NAME` constant follows the format `system.<name>`.
8268

83-
`get_table_info()` method returns the table information stored in the struct.
69+
```rust
70+
const NAME: &'static str = "system.credits";
71+
```
8472

85-
```rust
86-
fn get_table_info(&self) -> &TableInfo {
87-
&self.table_info
88-
}
89-
```
73+
The `get_table_info()` method returns the table information stored in the struct.
9074

91-
`get_full_data()` method is the most important part, because it contains the logic for generating or retrieving the data for your system table. The credits table has three fields that are similar, so we will only show the license field as an example.
75+
```rust
76+
fn get_table_info(&self) -> &TableInfo {
77+
    &self.table_info
78+
}
79+
```
9280

93-
The license field information is obtained from an environment variable named `DATABEND_CREDITS_LICENSES` (see `common-building`). Each data item is separated by a comma.
81+
The `get_full_data()` method is the most important part, because it contains the logic for generating or retrieving the data for your system table. The three fields of the credits table are handled in a similar way, so only the license field is shown here as an example.
9482

95-
String-type columns are eventually converted from `Vec<Vec<u8>>`, where each string needs to be converted to `Vec<u8>`. So we use `.as_bytes().to_vec()` to do this conversion when iterating over the data.
83+
The license field information is obtained from an environment variable named `DATABEND_CREDITS_LICENSES` (see `common-building`). Each data item is separated by a comma.
9684

97-
```rust
98-
let licenses: Vec<Vec<u8>> = env!("DATABEND_CREDITS_LICENSES")
99-
.split_terminator(',')
100-
.map(|x| x.trim().as_bytes().to_vec())
101-
.collect();
102-
```
85+
String-type columns are built from a `Vec<Vec<u8>>`, so each string must first be converted to a `Vec<u8>`. We use `.as_bytes().to_vec()` to do this conversion when iterating over the data.
10386

104-
After getting all the data, you can return them in a `DataBlock` format. For non-null types, use `from_data`; for nullable types, use `from_opt_data`.
87+
```rust
88+
let licenses: Vec<Vec<u8>> = env!("DATABEND_CREDITS_LICENSES")
89+
    .split_terminator(',')
90+
    .map(|x| x.trim().as_bytes().to_vec())
91+
    .collect();
92+
```
10593

106-
For example:
94+
7. Return the retrieved data in a `DataBlock` format. Use `from_data` for non-null types and `from_opt_data` for nullable types. For example:
10795

108-
```rust
109-
Ok(DataBlock::new_from_columns(vec![
110-
StringType::from_data(names),
111-
StringType::from_data(versions),
112-
StringType::from_data(licenses),
113-
]))
114-
```
96+
```rust
97+
Ok(DataBlock::new_from_columns(vec![
98+
    StringType::from_data(names),
99+
    StringType::from_data(versions),
100+
    StringType::from_data(licenses),
101+
]))
102+
```
115103

116-
Lastly, if you want to integrate your system table into Databend, you also need to edit `system_database.rs` and register it to `SystemDatabase`.
104+
8. Edit `system_database.rs` to register the new table with `SystemDatabase`.
117105

118-
```rust
119-
impl SystemDatabase {
120-
pub fn create(sys_db_meta: &mut InMemoryMetas, config: &Config) -> Self {
121-
...
122-
CreditsTable::create(sys_db_meta.next_table_id()),
123-
...
106+
```rust
107+
impl SystemDatabase {
108+
    pub fn create(sys_db_meta: &mut InMemoryMetas, config: &Config) -> Self {
109+
        ...
110+
        CreditsTable::create(sys_db_meta.next_table_id()),
111+
        ...
112+
    }
124113
}
125-
}
126-
```
127-
128-
## Testing
114+
```
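The steps above can be condensed into a small, dependency-free sketch. It does not use Databend's crates: `DataType`, `Field`, `TableInfo`, `MiniSystemTable`, and `split_items` are simplified stand-ins invented for illustration, the row values are placeholders, and the erroring default for `truncate()` is an assumption rather than Databend's actual behavior. It only shows the overall shape: table information, a `NAME` constant, and a data method, with a default method covering the rest.

```rust
// Simplified stand-ins for illustration; none of these are Databend's real types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    String,
    UInt64,
    // Wrapping a type marks the column as accepting NULL values.
    Nullable(Box<DataType>),
}

struct Field {
    name: &'static str,
    data_type: DataType,
}

struct TableInfo {
    desc: String,
    name: String,
    schema: Vec<Field>,
}

// Hypothetical, reduced counterpart of a synchronous system-table trait.
trait MiniSystemTable {
    const NAME: &'static str;

    fn get_table_info(&self) -> &TableInfo;

    // Returns one Vec<Vec<u8>> per column, mirroring how string columns are built.
    fn get_full_data(&self) -> Vec<Vec<Vec<u8>>>;

    // Default implementation (assumed behavior): most tables never override it.
    fn truncate(&self) -> Result<(), String> {
        Err(format!("{} cannot be truncated", Self::NAME))
    }
}

// Splits a comma-separated list into trimmed byte strings.
fn split_items(raw: &str) -> Vec<Vec<u8>> {
    raw.split_terminator(',')
        .map(|x| x.trim().as_bytes().to_vec())
        .collect()
}

struct CreditsTable {
    table_info: TableInfo,
}

impl CreditsTable {
    fn create() -> Self {
        let schema = vec![
            Field { name: "name", data_type: DataType::String },
            Field { name: "version", data_type: DataType::String },
            Field { name: "license", data_type: DataType::String },
        ];
        let table_info = TableInfo {
            desc: "'system'.'credits'".to_string(),
            name: "credits".to_string(),
            schema,
        };
        CreditsTable { table_info }
    }
}

impl MiniSystemTable for CreditsTable {
    const NAME: &'static str = "system.credits";

    fn get_table_info(&self) -> &TableInfo {
        &self.table_info
    }

    fn get_full_data(&self) -> Vec<Vec<Vec<u8>>> {
        // Placeholder inputs; the real table reads build-time environment variables.
        vec![
            split_items("crate-a, crate-b"),
            split_items("0.1.0, 0.2.0"),
            split_items("MIT, Apache-2.0"),
        ]
    }
}
```

Because `truncate()` has a default body in this sketch, `CreditsTable` only has to supply the constant and two methods, which mirrors the note in step 6 that most tables can rely on the defaults.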
129115

130-
The tests for system tables are currently located at `tests/it/storages/system.rs`.
116+
## Testing a New System Table
131117

132-
For tables whose content does not change frequently, you can use Golden File testing. Its logic is to write the corresponding table into a specified file and compare it with an expected file. If they match, then the test passes; otherwise, it fails.
133-
134-
For example:
118+
The system table tests are located at `tests/it/storages/system.rs`. For tables with infrequent content changes, Golden File testing can be used, which involves writing the table to a specified file and comparing it to an expected file. For example:
135119

136120
```rust
137121
#[tokio::test(flavor = "multi_thread")]
138122
async fn test_columns_table() -> Result<()> {
139123
    let (_guard, ctx) = crate::tests::create_query_context().await?;
140-
141124
    let mut mint = Mint::new("tests/it/storages/testdata");
142125
    let file = &mut mint.new_goldenfile("columns_table.txt").unwrap();
143126
    let table = ColumnsTable::create(1);
144-
145127
    run_table_tests(file, ctx, table).await?;
146128
    Ok(())
147129
}
148130
```
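The compare-against-an-expected-file idea behind the test above can be illustrated with the standard library alone. This is a rough sketch, not a description of how the `goldenfile` crate's `Mint` works internally; `check_golden` and its `update` flag are hypothetical names chosen for this example.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Compares `actual` with the content stored at `golden_path`.
// When `update` is true, or no golden file exists yet, the file is
// (re)written with `actual` and the check passes.
fn check_golden(golden_path: &Path, actual: &str, update: bool) -> io::Result<bool> {
    if update || !golden_path.exists() {
        fs::write(golden_path, actual)?;
        return Ok(true);
    }
    let expected = fs::read_to_string(golden_path)?;
    Ok(expected == actual)
}
```

A test would render the table to a string, call the helper with the path of the expected file, and fail when it returns `false`. The `update` flag is one possible way to regenerate the expected file after an intentional change.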
149131

150-
For tables whose content may change dynamically or depend on external factors, there is a lack of sufficient testing methods. You can choose to test the parts that have relatively fixed patterns, such as the number of rows and columns; or you can verify whether the output contains specific content.
151-
152-
For example:
132+
For tables with dynamically changing content or external dependencies, testing methods are limited. You can test relatively fixed patterns such as the number of rows and columns, or verify whether the output contains specific content. For example:
153133

154134
```rust
155135
#[tokio::test(flavor = "multi_thread")]
@@ -159,18 +139,10 @@ async fn test_metrics_table() -> Result<()> {
159139
    let block = &result[0];
160140
    assert_eq!(block.num_columns(), 4);
161141
    assert!(block.num_rows() >= 1);
162-
163142
    let output = pretty_format_blocks(result.as_slice())?;
164143
    assert!(output.contains("test_test_metrics_table_count"));
165144
    #[cfg(feature = "enable_histogram")]
166145
    assert!(output.contains("test_test_metrics_table_histogram"));
167-
168146
    Ok(())
169147
}
170148
```
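The same "check only what stays stable" approach can be written as a reusable helper. The sketch below is self-contained and makes several assumptions: `Block` is a stand-in for a query result rather than Databend's `DataBlock`, `render` is a crude substitute for pretty-printing, and `check_stable_properties` is a hypothetical name.

```rust
// Stand-in for a query result; not Databend's DataBlock.
struct Block {
    rows: Vec<Vec<String>>,
}

impl Block {
    fn num_rows(&self) -> usize {
        self.rows.len()
    }

    // Column count is taken from the first row; an empty block has zero columns.
    fn num_columns(&self) -> usize {
        self.rows.first().map_or(0, |row| row.len())
    }

    // Renders each row as "a | b", one row per line.
    fn render(&self) -> String {
        self.rows
            .iter()
            .map(|row| row.join(" | "))
            .collect::<Vec<_>>()
            .join("\n")
    }
}

// Verifies only the properties expected to stay stable between runs:
// the column count, at least one row, and the presence of known substrings.
fn check_stable_properties(
    block: &Block,
    expected_columns: usize,
    must_contain: &[&str],
) -> Result<(), String> {
    if block.num_columns() != expected_columns {
        return Err(format!(
            "expected {} columns, found {}",
            expected_columns,
            block.num_columns()
        ));
    }
    if block.num_rows() < 1 {
        return Err("expected at least one row".to_string());
    }
    let output = block.render();
    for needle in must_contain {
        if !output.contains(*needle) {
            return Err(format!("missing `{}` in output", needle));
        }
    }
    Ok(())
}
```

Returning a `Result` with a message, instead of asserting inside the helper, lets each test decide how to report a failure.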
171-
172-
## Summary
173-
174-
In this document, we have shown you how to write a new system table for Databend using the credits table as an example. We hope this document helps you understand the basic steps and principles of creating a system table for Databend. If you have any questions or feedback, please feel free to contact us on GitHub or Slack. Thank you for your interest and contribution to Databend!
175-
176-
