Skip to content

窗口聚合查询

aggregate_window/_interval 标志

aggregate_window_interval 互为别名,可相互转换:

sql
... 
GROUP BY aggregate_window(time_column, <interval>, [ignore_fractions]), key_column

相当于

sql
...
WHERE ... AND _interval = '<interval>[,ignore_fractions]'

函数签名

  1. 使用 aggregate_window
sql
SELECT ...
FROM ...
WHERE ...
GROUP BY aggregate_window(
    time_column: string,
    every: duration,
    ?ignore_fractions: bool,
    ?create_empty: bool,
)
  1. 使用 _interval
sql
SELECT ...
FROM ...
WHERE ...
    AND _interval = '<interval>[,ignore_fractions]'

参数

time_column

用于窗口切分的时间列。

every

duration 表示每个窗口的长度,格式为 <int><unit>,其中 unit 可以是:

unit (influxdb 风格)unit (openplant 风格)描述
nsns纳秒
usus微秒
msms毫秒
ss
mm分钟
hh小时
dD
wW
moM月 /日历月
qQ季度 /日历季度
yY年 /日历年

示例:

  • 1h (1 小时)
  • 10m (10 分钟)
  • 3d (1 天)

ignore_fractions

指定划分时间槽的方式,默认为 true

  • true: 以整点划分时间槽
  • false: 以开始时间为基准划分时间槽

示例:

sql
-- 以整点划分时间槽
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-03 02:15:16'
GROUP BY aggregate_window(time_column(e_t_date), every('1h'), ignore_fractions(true))

结果窗口槽为:

  • [2018-11-03 01:02:03, 2018-11-03 02:00:00)
  • [2018-11-03 02:00:00, 2018-11-03 02:15:16)
sql
-- 以开始时间为基准划分时间槽
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-03 02:15:16'
GROUP BY aggregate_window(time_column(e_t_date), every('1h'), ignore_fractions(false))

结果窗口槽为:

  • [2018-11-03 01:02:03, 2018-11-03 02:02:03)
  • [2018-11-03 02:02:03, 2018-11-03 02:15:16)
sql
-- 以整点划分时间槽
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-01-31 01:02:03'
    AND e_t_date <= '2018-03-05 02:15:16'
GROUP BY aggregate_window(time_column(e_t_date), every('1mo'), ignore_fractions(true))

结果窗口槽为:

  • [2018-01-31 01:02:03, 2018-02-01 00:00:00)
  • [2018-02-01 00:00:00, 2018-03-01 00:00:00)
  • [2018-03-01 00:00:00, 2018-03-05 02:15:16)
sql
-- 以开始时间为基准划分时间槽
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-01-31 01:02:03'
    AND e_t_date <= '2018-03-05 02:15:16'
GROUP BY aggregate_window(time_column(e_t_date), every('1mo'), ignore_fractions(false))

结果窗口槽为:

  • [2018-01-31 01:02:03, 2018-02-28 01:02:03)
  • [2018-02-28 01:02:03, 2018-03-05 02:15:16)

注意,这里的划分时间为 2 月 28 日,因为 2 月没有 31 日,所以划分时间被调整到了当月的最后一天。

create_empty

是否为空的时间槽返回数据,默认为 true

  • true: 返回空的时间槽,count 表达式返回 0,其他表达式返回 null
  • false: 不返回空的时间槽

查询规则

时间对齐

对于大于小时的时间单位,窗口将对齐到查询时区的一天的开始。查询的时区由开始时间确定。

(在 influxdb 中,默认窗口边界对齐到 UTC 时区的一天的开始。)

时间包含/排除

全局窗口的开始和结束时间由 WHERE 子句确定。这与 influxdb 的 range(https://docs.influxdata.com/flux/v0.x/stdlib/universe/range/) 不同。在 influxdb 中,开始时间包含在内,结束时间排除在外。

由 "Aggregate Window" 产生的每个切片点都不包含在前一个切片中,但是包含在下一个切片中。

例如:
对于以下查询:

sql
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date < '2018-11-03 03:05:31'
GROUP BY aggregate_window(e_t_date, '1h', true, true)

结果窗口槽为:

  • [2018-11-03 01:02:03, 2018-11-03 02:00:00)
  • [2018-11-03 02:00:00, 2018-11-03 03:00:00)
  • [2018-11-03 03:00:00, 2018-11-03 03:05:31)

示例用法

带名称的参数:

sql
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-04 02:05:31'
GROUP BY aggregate_window(time_column(e_t_date), every('1h'), ignore_fractions(true), create_empty(true))

固定位置的参数:

sql
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-04 02:05:31'
GROUP BY aggregate_window(e_t_date, '1h', true, true)

不带 GROUP BY 子句的窗口查询:

sql
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-04 02:05:31'

不带 GROUP BY 子句和时间过滤器的窗口查询:

sql
SELECT last(spd)
FROM   kudu_gb32960_test
sql
SELECT last(spd)
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'

带额外 GROUP BY 子句的窗口查询:

sql
SELECT last(spd), vin
FROM   kudu_gb32960_test
WHERE  vin = 'LJNEFE2T1JN120123'
    AND e_t_date >= '2018-11-03 01:02:03'
    AND e_t_date <= '2018-11-04 02:05:31'
GROUP BY aggregate_window(e_t_date, '1h', true, true), vin

输出

sql 中可添加伪列:_start 窗口开始时间,_end 窗口结束时间,_time 值的时间,key_column key, time_column time, 当不存在这些伪列时,不予返回

sql
SELECT last(spd, true) 
FROM ...
WHERE ... 
    <AND _interval = '1h' 
| GROUP BY aggregate_window(time_column, '1h', true) 
    >

-- 输出列:last(spd), last_spd_timestamp

SELECT _start, _end, _time, last(spd, true) 
FROM ...
WHERE ... 
    <AND _interval = '1h' 
| GROUP BY aggregate_window(time_column, '1h', true) 
    >

-- 输出列:_start, _end, _time, last(spd), last_spd_timestamp

SELECT key, tm, gap_fill(spd) 
FROM ...
WHERE ...
    <AND _interval = '1h'
| GROUP BY aggregate_window(time_column, '1h', true)
    >

-- 输出列:key, tm, gap_fill(spd)

SELECT _start, _end, _time, key, tm, gap_fill(spd) 
FROM ...
WHERE ...
    <AND _interval = '1h'
| GROUP BY aggregate_window(time_column, '1h', true)
    >

-- 输出列:_start, _end, _time, key, tm, gap_fill(spd)