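Oracle's analytic functions are a great help for reporting and statistics, and they can replace constructs such as subqueries. The windowing clause is the part we tend to use least, so this post focuses on how to use it.
First, the semantics of the window. There are two kinds, RANGE and ROWS. When an ORDER BY is present and no window is given, the default is RANGE UNBOUNDED PRECEDING, that is, RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
Value window (RANGE window), e.g. RANGE N PRECEDING: valid only when the sort column is a number or a date. With an ascending sort, the window contains the rows whose sort-column value lies between (current row's value - N) and the current row's value; with N FOLLOWING it runs from the current row's value up to (current row's value + N). The window therefore depends on the ORDER BY clause.
Row window (ROWS window), e.g. ROWS N PRECEDING: the window is the current row plus the N rows before it. Both kinds also accept the BETWEEN ... AND ... form. For example, ROWS BETWEEN m PRECEDING AND n FOLLOWING means that the window of each row runs from the m rows before it to the n rows after it.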
SELECT empno,
       sal,
       mgr,
       deptno,
       SUM(sal) OVER (PARTITION BY deptno ORDER BY sal RANGE BETWEEN 0 PRECEDING AND 100 FOLLOWING) dd
  FROM emp;
Here the rows are partitioned by deptno and sorted by sal in ascending order. For each row, dd is the sum of every sal in the same department that lies between the current sal and the current sal + 100, both ends included.
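To make the difference between the two window types concrete, here is a small sketch on made-up values. The inline table t and the column names rows_sum and range_sum are illustrative and are not part of the emp example. ROWS counts physical rows, while RANGE compares values of the sort column.

```sql
-- Illustrative data only: four values with uneven gaps.
WITH t AS (
  SELECT 10 n FROM dual UNION ALL
  SELECT 12 n FROM dual UNION ALL
  SELECT 20 n FROM dual UNION ALL
  SELECT 50 n FROM dual
)
SELECT n,
       -- the current row plus the one physical row before it
       SUM(n) OVER (ORDER BY n ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) rows_sum,
       -- every row whose n lies in [current n - 10, current n]
       SUM(n) OVER (ORDER BY n RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) range_sum
  FROM t
 ORDER BY n;
-- Expected result:
--  N  ROWS_SUM  RANGE_SUM
-- 10        10         10
-- 12        22         22
-- 20        32         42
-- 50        70         50
```

The two columns diverge at n = 20 (the value window also reaches back to 10) and at n = 50 (no other value lies within 10 of it).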
Analytic functions are commonly used to compute cumulative, moving, centered, and reporting aggregates.
[Syntax diagrams from the Oracle documentation: analytic_function, analytic_clause, query_partition_clause, order_by_clause, windowing_clause]
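The syntax diagrams above give the full syntax of the windowing clause and can be used as a reference.
Windowing by value (RANGE): the sort value can only be a date or a number.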
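Here is a requirement that came up in a forum thread:
1. The query result is sorted by value, e.g. the SQL: select value from t;
Sample result:
50
70
90
130
160
190
2. Group the data. Starting from the first value above, every value within +50 of it belongs to the same group; once a value is more than 50 away, it starts a new group. For example:
50 50
70 50
90 50
130 130
160 130
190 190
3. The final result is the number of values in each group. Sample result:
50 3
130 2
190 1
Original thread: http://www.itpub.net/thread-985707-1-1.html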
WITH T AS (
SELECT 50 N FROM DUAL UNION ALL
SELECT 70 N FROM DUAL UNION ALL
SELECT 90 N FROM DUAL UNION ALL
SELECT 130 N FROM DUAL UNION ALL
SELECT 160 N FROM DUAL UNION ALL
SELECT 190 N FROM DUAL
)
SELECT *
  FROM (SELECT n,
               row_number() OVER(ORDER BY n) rn,
               COUNT(*) OVER(ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) cn
          FROM t)
 START WITH rn = 1
CONNECT BY RN = PRIOR CN + PRIOR RN;
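Here a value (RANGE) window counts, for each row, how many values fall between that row's value and that value + 50, and CONNECT BY then walks from the first row of one group to the first row of the next.
The CONNECT BY condition adds cn to rn: the row number of a group's first row plus the size of that group is the row number at which the next group starts. It is a neat idea and well worth studying.
When writing this kind of statistical query, often all we really need is to construct the right scenario and condition.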
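To see why the CONNECT BY chain lands on the right rows, it can help to run the inner query on its own. The sketch below reuses the same six values; the annotations in the comments trace the walk and are my own reading of the query.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
)
SELECT n,
       ROW_NUMBER() OVER (ORDER BY n) rn,
       -- number of rows whose n lies in [current n, current n + 50]
       COUNT(*) OVER (ORDER BY n RANGE BETWEEN CURRENT ROW AND 50 FOLLOWING) cn
  FROM t
 ORDER BY n;
-- Expected result:
--   N  RN  CN
--  50   1   3   <- START WITH rn = 1; next group starts at rn 1 + 3 = 4
--  70   2   2
--  90   3   2
-- 130   4   2   <- next group starts at rn 4 + 2 = 6
-- 160   5   2
-- 190   6   1   <- rn 6 + 1 = 7 does not exist, so the walk ends
```

Only the rows reached by the walk (rn 1, 4 and 6) are returned by the full query, and their cn values are exactly the group sizes 3, 2 and 1.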
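The requirement above can also be met with the following SQL, which assigns a group number to every row. This version uses a different data set and a group width of 5 (that is, 4 FOLLOWING):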
WITH T AS (
SELECT 1 N FROM DUAL UNION ALL
SELECT 3 N FROM DUAL UNION ALL
SELECT 4 N FROM DUAL UNION ALL
SELECT 7 N FROM DUAL UNION ALL
SELECT 10 N FROM DUAL UNION ALL
SELECT 11 N FROM DUAL UNION ALL
SELECT 12 N FROM DUAL UNION ALL
SELECT 12 N FROM DUAL UNION ALL
SELECT 19 N FROM DUAL UNION ALL
SELECT 20 N FROM DUAL
)
SELECT T2.N
      ,DENSE_RANK() OVER(ORDER BY T2.G) G
  FROM (
        SELECT T.N
              ,MAX(T1.N)OVER(ORDER BY T.N) G
          FROM (
                SELECT N
                  FROM (
                        SELECT N
                              ,COUNT(*) OVER(ORDER BY N RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING) CNT
                              ,ROW_NUMBER()OVER(ORDER BY N) RN
                          FROM T
                       )
               CONNECT BY RN = PRIOR RN + PRIOR CNT
                 START WITH RN = 1
               ) T1 , T
         WHERE T1.N(+) = T.N
       ) T2;
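The points to note here are CONNECT BY, the DENSE_RANK function, and the use of MAX(T1.N)OVER(ORDER BY T.N) G. T1 holds only the first value of each group, so after the outer join T1.N is NULL on every other row; the running MAX carries the latest group start down to the rows that follow it, and DENSE_RANK turns those starts into consecutive group numbers.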
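The fill-down step can be tried in isolation. The sketch below is a simplified stand-in for the outer-join result, assuming a hypothetical inline table s in which start_n is set only on the first row of each group (the names s, start_n and grp_no are illustrative).

```sql
-- s imitates the outer join: start_n is NULL except on group-start rows.
WITH s AS (
  SELECT 1 n, 1 start_n FROM dual UNION ALL
  SELECT 3 n, NULL start_n FROM dual UNION ALL
  SELECT 4 n, NULL start_n FROM dual UNION ALL
  SELECT 7 n, 7 start_n FROM dual UNION ALL
  SELECT 10 n, NULL start_n FROM dual
)
SELECT n,
       start_n,
       g,
       -- consecutive group numbers derived from the filled-down starts
       DENSE_RANK() OVER (ORDER BY g) grp_no
  FROM (SELECT n,
               start_n,
               -- running maximum; MAX ignores NULLs, so the last start is carried down
               MAX(start_n) OVER (ORDER BY n) g
          FROM s)
 ORDER BY n;
-- Expected result:
--  N  START_N  G  GRP_NO
--  1        1  1       1
--  3   (null)  1       1
--  4   (null)  1       1
--  7        7  7       2
-- 10   (null)  7       2
```

The running MAX works as a fill-down here only because the group starts increase along the sort order, which is always the case when the starts are values of the sort column itself.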
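Below is a solution that an expert wrote with a recursive WITH clause. Of course, the problem can also be solved with the more familiar CONNECT BY.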
WITH T AS
 (SELECT 1 N FROM DUAL UNION ALL
  SELECT 4 N FROM DUAL UNION ALL
  SELECT 3 N FROM DUAL UNION ALL
  SELECT 7 N FROM DUAL UNION ALL
  SELECT 10 N FROM DUAL UNION ALL
  SELECT 11 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 12 N FROM DUAL UNION ALL
  SELECT 19 N FROM DUAL UNION ALL
  SELECT 20 N FROM DUAL),
v AS
 (SELECT n, row_number() over(ORDER BY n) rn FROM t),
v1(flag, n, rn) AS
 (SELECT n, n, rn
    FROM v
   WHERE rn = 1
  UNION ALL
  SELECT CASE
           WHEN v.n - v1.flag >= 5 THEN
            v.n
           ELSE
            v1.flag
         END,
         v.n,
         v.rn
    FROM v, v1
   WHERE v.rn = v1.rn + 1)
SELECT * FROM v1;
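As a sketch, the same recursive approach can be applied to the original 50 to 190 data set to produce the per-group counts asked for in step 3. The threshold v.n - v1.flag > 50 is my reading of "within +50" as inclusive, and the alias cnt is illustrative. Recursive subquery factoring requires Oracle 11g Release 2 or later.

```sql
WITH t AS (
  SELECT 50 n FROM dual UNION ALL
  SELECT 70 n FROM dual UNION ALL
  SELECT 90 n FROM dual UNION ALL
  SELECT 130 n FROM dual UNION ALL
  SELECT 160 n FROM dual UNION ALL
  SELECT 190 n FROM dual
),
v AS (
  SELECT n, ROW_NUMBER() OVER (ORDER BY n) rn FROM t
),
v1 (flag, n, rn) AS (
  -- anchor member: the smallest value starts the first group
  SELECT n, n, rn FROM v WHERE rn = 1
  UNION ALL
  -- recursive member: step to the next row; open a new group when the
  -- value is more than 50 above the current group's first value
  SELECT CASE WHEN v.n - v1.flag > 50 THEN v.n ELSE v1.flag END,
         v.n,
         v.rn
    FROM v, v1
   WHERE v.rn = v1.rn + 1
)
SELECT flag, COUNT(*) cnt
  FROM v1
 GROUP BY flag
 ORDER BY flag;
-- Expected result:
-- FLAG  CNT
--   50    3
--  130    2
--  190    1
```

Unlike the CONNECT BY version, this one visits every row once, so the group label is available on each row before the final GROUP BY.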
Of course, some experts also solved this with the MODEL clause; see the original thread for details.