diff --git a/.translation-init b/.translation-init
index b5c1625b7b..12cb9eb887 100644
--- a/.translation-init
+++ b/.translation-init
@@ -1 +1 @@
-Translation initialization: 2025-10-17T11:13:57.833983
+Translation initialization: 2025-10-20T12:06:14.189361
diff --git a/docs/cn/sql-reference/20-sql-functions/08-window-functions/first-value.md b/docs/cn/sql-reference/20-sql-functions/08-window-functions/first-value.md
index c8e095b31e..3c071aa96c 100644
--- a/docs/cn/sql-reference/20-sql-functions/08-window-functions/first-value.md
+++ b/docs/cn/sql-reference/20-sql-functions/08-window-functions/first-value.md
@@ -6,7 +6,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';

-Returns the first value in the window frame (window frame).
+Returns the first value in the window frame (Window Frame).

 See also:

@@ -16,7 +16,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 ## Syntax

 ```sql
-FIRST_VALUE(expression)
+FIRST_VALUE(expression) [ { RESPECT | IGNORE } NULLS ]
 OVER (
     [ PARTITION BY partition_expression ]
     ORDER BY sort_expression [ ASC | DESC ]
@@ -26,49 +26,122 @@ OVER (

 **Arguments:**
 - `expression`: Required. The column or expression whose first value is returned.
-- `PARTITION BY`: Optional. Divides the rows into partitions.
-- `ORDER BY`: Required. Determines the sort order within the window.
-- `window_frame`: Optional. Defines the window frame (default: RANGE UNBOUNDED PRECEDING).
+- `PARTITION BY`: Optional. Divides the rows into partitions (Partition).
+- `ORDER BY`: Required. Determines the ordering within the window.
+- `window_frame`: Optional. Defines the window frame. The default is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.

-**Note:**
+**Notes:**
 - Returns the first value in the ordered window frame.
-- Supports the `IGNORE NULLS` and `RESPECT NULLS` options.
-- Useful for finding the earliest/lowest value in each group.
+- Supports `IGNORE NULLS` to skip NULL values and `RESPECT NULLS` to keep the default behavior.
+- When you need row-based semantics instead of the default range-based frame, specify an explicit window frame (for example, `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`).
+- Useful for finding the earliest or lowest value in each group or time window.

 ## Examples

 ```sql
--- Create sample data
-CREATE TABLE scores (
-    student VARCHAR(20),
-    score INT
+-- Sample order data
+CREATE OR REPLACE TABLE orders_window_demo (
+    customer VARCHAR,
+    order_id INT,
+    order_time TIMESTAMP,
+    amount INT,
+    sales_rep VARCHAR
 );

-INSERT INTO scores VALUES
-    ('Alice', 95),
-    ('Bob', 87),
-    ('Charlie', 82),
-    ('David', 78),
-
    ('Eve', 92);
+INSERT INTO orders_window_demo VALUES
+    ('Alice', 1001, to_timestamp('2024-05-01 09:00:00'), 120, 'Erin'),
+    ('Alice', 1002, to_timestamp('2024-05-01 11:00:00'), 135, NULL),
+    ('Alice', 1003, to_timestamp('2024-05-02 14:30:00'), 125, 'Glen'),
+    ('Bob', 1004, to_timestamp('2024-05-01 08:30:00'), 90, NULL),
+    ('Bob', 1005, to_timestamp('2024-05-01 20:15:00'), 105, 'Kai'),
+    ('Bob', 1006, to_timestamp('2024-05-03 10:00:00'), 95, NULL),
+    ('Carol', 1007, to_timestamp('2024-05-04 09:45:00'), 80, 'Lily');
 ```

-**Get the highest score (the first value when ordered by score descending):**
+**Example 1: First purchase per customer**

 ```sql
-SELECT student, score,
-       FIRST_VALUE(score) OVER (ORDER BY score DESC) AS highest_score,
-       FIRST_VALUE(student) OVER (ORDER BY score DESC) AS top_student
-FROM scores
-ORDER BY score DESC;
+SELECT customer,
+       order_id,
+       order_time,
+       amount,
+       FIRST_VALUE(amount) OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       ) AS first_order_amount
+FROM orders_window_demo
+ORDER BY customer, order_time;
 ```

 Result:
 ```
-student | score | highest_score | top_student
---------+-------+---------------+------------
-Alice   | 95    | 95            | Alice
-Eve     | 92    | 95            | Alice
-Bob     | 87    | 95            | Alice
-Charlie | 82    | 95            | Alice
-David   | 78    | 95            | Alice
+customer | order_id | order_time           | amount | first_order_amount
+---------+----------+----------------------+--------+--------------------
+Alice    | 1001     | 2024-05-01 09:00:00  | 120    | 120
+Alice    | 1002     | 2024-05-01 11:00:00  | 135    | 120
+Alice    | 1003     | 2024-05-02 14:30:00  | 125    | 120
+Bob      | 1004     | 2024-05-01 08:30:00  | 90     | 90
+Bob      | 1005     | 2024-05-01 20:15:00  | 105    | 90
+Bob      | 1006     | 2024-05-03 10:00:00  | 95     | 90
+Carol    | 1007     | 2024-05-04 09:45:00  | 80     | 80
+```
+
+**Example 2: First order within the past 24 hours**
+
+```sql
+SELECT customer,
+       order_id,
+       order_time,
+       FIRST_VALUE(order_id) OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           RANGE BETWEEN INTERVAL 1 DAY PRECEDING AND CURRENT ROW
+       ) AS first_order_in_24h
+FROM orders_window_demo
+ORDER
BY customer, order_time;
+```
+
+Result:
+```
+customer | order_id | order_time           | first_order_in_24h
+---------+----------+----------------------+--------------------
+Alice    | 1001     | 2024-05-01 09:00:00  | 1001
+Alice    | 1002     | 2024-05-01 11:00:00  | 1001
+Alice    | 1003     | 2024-05-02 14:30:00  | 1003
+Bob      | 1004     | 2024-05-01 08:30:00  | 1004
+Bob      | 1005     | 2024-05-01 20:15:00  | 1004
+Bob      | 1006     | 2024-05-03 10:00:00  | 1006
+Carol    | 1007     | 2024-05-04 09:45:00  | 1007
+```
+
+**Example 3: Skip NULLs to find the first assigned sales rep**
+
+```sql
+SELECT customer,
+       order_id,
+       sales_rep,
+       FIRST_VALUE(sales_rep) RESPECT NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+       ) AS first_rep_respect,
+       FIRST_VALUE(sales_rep) IGNORE NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+       ) AS first_rep_ignore
+FROM orders_window_demo
+ORDER BY customer, order_id;
+```
+
+Result:
+```
+customer | order_id | sales_rep | first_rep_respect | first_rep_ignore
+---------+----------+-----------+-------------------+------------------
+Alice    | 1001     | Erin      | Erin              | Erin
+Alice    | 1002     | NULL      | Erin              | Erin
+Alice    | 1003     | Glen      | Erin              | Erin
+Bob      | 1004     | NULL      | NULL              | NULL
+Bob      | 1005     | Kai       | NULL              | Kai
+Bob      | 1006     | NULL      | NULL              | Kai
+Carol    | 1007     | Lily      | Lily              | Lily
 ```
\ No newline at end of file
diff --git a/docs/cn/sql-reference/20-sql-functions/08-window-functions/last-value.md b/docs/cn/sql-reference/20-sql-functions/08-window-functions/last-value.md
index d33fba4602..49e09a70e2 100644
--- a/docs/cn/sql-reference/20-sql-functions/08-window-functions/last-value.md
+++ b/docs/cn/sql-reference/20-sql-functions/08-window-functions/last-value.md
@@ -4,9 +4,9 @@ title: LAST_VALUE
 import FunctionDescription from '@site/src/components/FunctionDescription';

-
+

-Returns the last value in the window frame (Window Frame).
+Returns the last value in the window frame.

 See also:

@@ -16,7 +16,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
 ## Syntax

 ```sql
-LAST_VALUE(expression)
+LAST_VALUE(expression) [ { RESPECT | IGNORE } NULLS ]
 OVER (
     [ PARTITION BY
partition_expression ]
     ORDER BY sort_expression [ ASC | DESC ]
@@ -26,56 +26,124 @@ OVER (

 **Arguments:**
 - `expression`: Required. The column or expression whose last value is returned.
-- `PARTITION BY`: Optional. Divides the rows into partitions.
+- `PARTITION BY`: Optional. Divides the rows into partitions (Partition).
 - `ORDER BY`: Required. Determines the sort order within the window.
-- `window_frame`: Optional. Defines the window frame (default: RANGE UNBOUNDED PRECEDING).
+- `window_frame`: Optional. Defines the window frame. The default is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.

-**Note:**
+**Notes:**
 - Returns the last value in the ordered window frame.
-- Supports the `IGNORE NULLS` and `RESPECT NULLS` options.
-- An explicit window frame is usually required to get the expected result.
-- Useful for finding the latest/highest value in each group.
+- Supports `IGNORE NULLS` to skip NULL values and `RESPECT NULLS` to keep the default behavior.
+- When you need the true last row of the partition (Partition), use a frame that ends after the current row (for example, `ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING`).
+- Useful for finding the latest value in each group, or the most recent value within a forward-looking window.

 ## Examples

 ```sql
--- Create sample data
-CREATE TABLE scores (
-    student VARCHAR(20),
-    score INT
+-- Sample order data
+CREATE OR REPLACE TABLE orders_window_demo (
+    customer VARCHAR,
+    order_id INT,
+    order_time TIMESTAMP,
+    amount INT,
+    sales_rep VARCHAR
 );

-INSERT INTO scores VALUES
-    ('Alice', 95),
-    ('Bob', 87),
-    ('Charlie', 82),
-    ('David', 78),
-    ('Eve', 92);
+INSERT INTO orders_window_demo VALUES
+    ('Alice', 1001, to_timestamp('2024-05-01 09:00:00'), 120, 'Erin'),
+    ('Alice', 1002, to_timestamp('2024-05-01 11:00:00'), 135, NULL),
+    ('Alice', 1003, to_timestamp('2024-05-02 14:30:00'), 125, 'Glen'),
+    ('Bob', 1004, to_timestamp('2024-05-01 08:30:00'), 90, NULL),
+    ('Bob', 1005, to_timestamp('2024-05-01 20:15:00'), 105, 'Kai'),
+    ('Bob', 1006, to_timestamp('2024-05-03 10:00:00'), 95, NULL),
+    ('Carol', 1007, to_timestamp('2024-05-04 09:45:00'), 80, 'Lily');
 ```

-**Get the lowest score (the last value when ordered by score descending):**
+**Example 1: Latest order in each customer partition**

 ```sql
-SELECT student, score,
-       LAST_VALUE(score) OVER (
-           ORDER BY score DESC
-           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
-       ) AS lowest_score,
-       LAST_VALUE(student) OVER (
-           ORDER BY score DESC
-           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
-       ) AS lowest_student
-FROM scores
-ORDER BY score DESC;
+SELECT customer,
+       order_id,
+       order_time,
+       LAST_VALUE(order_id) OVER
(
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+       ) AS last_order_for_customer
+FROM orders_window_demo
+ORDER BY customer, order_time;
 ```

 Result:
 ```
-student | score | lowest_score | lowest_student
---------+-------+--------------+---------------
-Alice   | 95    | 78           | David
-Eve     | 92    | 78           | David
-Bob     | 87    | 78           | David
-Charlie | 82    | 78           | David
-David   | 78    | 78           | David
+customer | order_id | order_time           | last_order_for_customer
+---------+----------+----------------------+-------------------------
+Alice    | 1001     | 2024-05-01 09:00:00  | 1003
+Alice    | 1002     | 2024-05-01 11:00:00  | 1003
+Alice    | 1003     | 2024-05-02 14:30:00  | 1003
+Bob      | 1004     | 2024-05-01 08:30:00  | 1006
+Bob      | 1005     | 2024-05-01 20:15:00  | 1006
+Bob      | 1006     | 2024-05-03 10:00:00  | 1006
+Carol    | 1007     | 2024-05-04 09:45:00  | 1007
+```
+
+**Example 2: Look ahead 12 hours within each customer**
+
+```sql
+SELECT customer,
+       order_id,
+       order_time,
+       amount,
+       LAST_VALUE(amount) OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           RANGE BETWEEN CURRENT ROW AND INTERVAL 12 HOUR FOLLOWING
+       ) AS last_amount_next_12h
+FROM orders_window_demo
+ORDER BY customer, order_time;
+```
+
+Result:
+```
+customer | order_id | order_time           | amount | last_amount_next_12h
+---------+----------+----------------------+--------+----------------------
+Alice    | 1001     | 2024-05-01 09:00:00  | 120    | 135
+Alice    | 1002     | 2024-05-01 11:00:00  | 135    | 135
+Alice    | 1003     | 2024-05-02 14:30:00  | 125    | 125
+Bob      | 1004     | 2024-05-01 08:30:00  | 90     | 105
+Bob      | 1005     | 2024-05-01 20:15:00  | 105    | 105
+Bob      | 1006     | 2024-05-03 10:00:00  | 95     | 95
+Carol    | 1007     | 2024-05-04 09:45:00  | 80     | 80
+```
+
+**Example 3: Skip NULLs when scanning forward for the last sales rep**
+
+```sql
+SELECT customer,
+       order_id,
+       sales_rep,
+       LAST_VALUE(sales_rep) RESPECT NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+       ) AS last_rep_respect,
+       LAST_VALUE(sales_rep) IGNORE NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS
BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+       ) AS last_rep_ignore
+FROM orders_window_demo
+ORDER BY customer, order_id;
+```
+
+Result:
+```
+customer | order_id | sales_rep | last_rep_respect | last_rep_ignore
+---------+----------+-----------+------------------+-----------------
+Alice    | 1001     | Erin      | Glen             | Glen
+Alice    | 1002     | NULL      | Glen             | Glen
+Alice    | 1003     | Glen      | Glen             | Glen
+Bob      | 1004     | NULL      | NULL             | Kai
+Bob      | 1005     | Kai       | NULL             | Kai
+Bob      | 1006     | NULL      | NULL             | NULL
+Carol    | 1007     | Lily      | Lily             | Lily
 ```
\ No newline at end of file
diff --git a/docs/cn/sql-reference/20-sql-functions/08-window-functions/nth-value.md b/docs/cn/sql-reference/20-sql-functions/08-window-functions/nth-value.md
index f90a54e51b..7975d95706 100644
--- a/docs/cn/sql-reference/20-sql-functions/08-window-functions/nth-value.md
+++ b/docs/cn/sql-reference/20-sql-functions/08-window-functions/nth-value.md
@@ -20,7 +20,7 @@ NTH_VALUE(
     expression,
     n
 )
-[ { IGNORE | RESPECT } NULLS ]
+[ { RESPECT | IGNORE } NULLS ]
 OVER (
     [ PARTITION BY partition_expression ]
     ORDER BY order_expression
@@ -29,67 +29,74 @@ OVER (
 ```

 **Arguments:**
-- `expression`: The column or expression to evaluate
-- `n`: The position of the value to return (1-based index)
-- `IGNORE NULLS`: Optional. When specified, NULL values are skipped when counting positions
-- `RESPECT NULLS`: Default behavior. NULL values are included when counting positions
-
-**Note:**
-- Position counting starts at 1 (not 0)
-- Returns NULL if the specified position does not exist in the window frame
-- For window frame syntax, see [Window Frame Syntax](index.md#window-frame-syntax)
+- `expression`: Required. The column or expression to evaluate.
+- `n`: Required. The position of the value to return (starting from 1).
+- `IGNORE NULLS`: Optional. Skips NULL values when counting positions.
+- `RESPECT NULLS`: Optional. Keeps NULL values when counting positions (default).
+- `window_frame`: Optional. Defines the window frame. The default is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.
+
+**Notes:**
+- `n` must be a positive integer; `n = 1` is equivalent to `FIRST_VALUE`.
+- Returns `NULL` if the specified position does not exist in the frame.
+- Combine with `ROWS BETWEEN ...` to control whether the position is evaluated over the whole partition (Partition) or over the rows processed so far.
+- For window frame syntax, see [Window Frame Syntax](index.md#window-frame-syntax).

 ## Examples

 ```sql
--- Create sample data
-CREATE TABLE scores (
-    student VARCHAR(20),
-    score INT
+-- Sample order data
+CREATE OR REPLACE TABLE orders_window_demo (
+    customer VARCHAR,
+    order_id INT,
+    order_time TIMESTAMP,
+    amount INT,
+    sales_rep VARCHAR
 );

-INSERT INTO scores VALUES
-    ('Alice', 85),
-    ('Bob', 90),
-    ('Charlie', 78),
-    ('David', 92),
-    ('Eve', 88);
-```
-
-**Get the student with the second-highest score:**
-
-```sql
-SELECT student, score,
-       NTH_VALUE(student, 2) OVER (ORDER BY score DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS second_highest_student
-FROM scores;
-```
-
-Result:
-```
-student  | score | second_highest_student
----------+-------+-----------------------
-David    | 92    | Bob
-Bob      | 90    | Bob
-Eve      | 88    | Bob
-Alice    | 85    | Bob
-Charlie  | 78    | Bob
+INSERT INTO orders_window_demo VALUES
+    ('Alice', 1001, to_timestamp('2024-05-01 09:00:00'), 120, 'Erin'),
+    ('Alice', 1002, to_timestamp('2024-05-01 11:00:00'), 135, NULL),
+    ('Alice', 1003, to_timestamp('2024-05-02 14:30:00'), 125, 'Glen'),
+    ('Bob', 1004, to_timestamp('2024-05-01 08:30:00'), 90, NULL),
+    ('Bob', 1005, to_timestamp('2024-05-01 20:15:00'), 105, 'Kai'),
+    ('Bob', 1006, to_timestamp('2024-05-03 10:00:00'), 95, NULL),
+    ('Carol', 1007, to_timestamp('2024-05-04 09:45:00'), 80, 'Lily');
 ```

-**Get the student with the third-highest score:**
+**Find the second order, and illustrate NULL handling for the second sales rep:**

 ```sql
-SELECT student, score,
-       NTH_VALUE(student, 3) OVER (ORDER BY score DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS third_highest_student
-FROM scores;
+SELECT customer,
+       order_id,
+       order_time,
+       NTH_VALUE(order_id, 2) OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       ) AS second_order_so_far,
+       NTH_VALUE(sales_rep, 2) RESPECT NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       ) AS second_rep_respect,
+       NTH_VALUE(sales_rep, 2) IGNORE NULLS OVER (
+           PARTITION BY customer
+           ORDER BY order_time
+           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+       ) AS second_rep_ignore
+FROM orders_window_demo
+ORDER BY customer, order_time;
 ```

 Result:
 ```
-student  | score | third_highest_student
----------+-------+----------------------
-David    | 92    | Eve
-Bob      | 90
| Eve
-Eve      | 88    | Eve
-Alice    | 85    | Eve
-Charlie  | 78    | Eve
+customer | order_id | order_time           | second_order_so_far | second_rep_respect | second_rep_ignore
+---------+----------+----------------------+---------------------+--------------------+-------------------
+Alice    | 1001     | 2024-05-01 09:00:00  | NULL                | NULL               | NULL
+Alice    | 1002     | 2024-05-01 11:00:00  | 1002                | NULL               | NULL
+Alice    | 1003     | 2024-05-02 14:30:00  | 1002                | NULL               | Glen
+Bob      | 1004     | 2024-05-01 08:30:00  | NULL                | NULL               | NULL
+Bob      | 1005     | 2024-05-01 20:15:00  | 1005                | Kai                | NULL
+Bob      | 1006     | 2024-05-03 10:00:00  | 1005                | Kai                | NULL
+Carol    | 1007     | 2024-05-04 09:45:00  | NULL                | NULL               | NULL
 ```
\ No newline at end of file
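
Review note on the frame guidance in the LAST_VALUE and NTH_VALUE hunks: the difference between the default frame and an explicit one can be checked outside Databend. The following is a minimal sketch, assuming Python's bundled SQLite (3.25 or later) as a stand-in engine rather than Databend itself. The table name `orders`, the `run_demo` helper, and the reduced three-column dataset are hypothetical, and SQLite does not support `IGNORE NULLS` or interval-based `RANGE` frames, so only the row-based cases are covered.

```python
import sqlite3

# Hypothetical subset of orders_window_demo: (customer, order_id, order_time).
ORDERS = [
    ("Alice", 1001, "2024-05-01 09:00:00"),
    ("Alice", 1002, "2024-05-01 11:00:00"),
    ("Alice", 1003, "2024-05-02 14:30:00"),
    ("Bob", 1004, "2024-05-01 08:30:00"),
    ("Bob", 1005, "2024-05-01 20:15:00"),
    ("Bob", 1006, "2024-05-03 10:00:00"),
]

QUERY = """
SELECT order_id,
       -- Default frame ends at the current row, so this is the current order.
       LAST_VALUE(order_id) OVER (
           PARTITION BY customer ORDER BY order_time
       ) AS last_default_frame,
       -- Frame extends to the end of the partition: the true last order.
       LAST_VALUE(order_id) OVER (
           PARTITION BY customer ORDER BY order_time
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) AS last_explicit_frame,
       -- Second order among the rows seen so far; NULL until two rows exist.
       NTH_VALUE(order_id, 2) OVER (
           PARTITION BY customer ORDER BY order_time
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS second_so_far
FROM orders
ORDER BY customer, order_time
"""


def run_demo():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, order_id INTEGER, order_time TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

With the default frame, `LAST_VALUE` simply echoes the current row's `order_id`, which is why the documentation change recommends an explicit frame ending after the current row.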