Oracle 分析函數(shù)的使用
Oracle 分析函數(shù)使用介紹 分析函數(shù)是oracle816引入的一個全新的概念,為我們分析數(shù)據(jù)提供了一種簡單高效的處理方式.在分析函數(shù)出現(xiàn)以前,我們必須使用自聯(lián)查詢,子查詢或者內(nèi)聯(lián)視圖,甚至復(fù)雜的存儲過程實現(xiàn)的語句,現(xiàn)在只要一條簡單的sql語句就可以實現(xiàn)了,而且在執(zhí)行效率方面也有相當(dāng)大的提高.下面我將針對分析函數(shù)做一些具體的說明.
今天我主要給大家介紹一下以下幾個函數(shù)的使用方法 1. 自動匯總函數(shù)rollup,cube, 2. rank 函數(shù), rank,dense_rank,row_number 3. lag,lead函數(shù) 4. sum,avg,的移動增加,移動平均數(shù) 5. ratio_to_report報表處理函數(shù) 6. first,last取基數(shù)的分析函數(shù)
基礎(chǔ)數(shù)據(jù)
Code:
|
[Copy to clipboard]
|
06:34:23 SQL> select * from t;
BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE --------------- ---------- ---------- -------------- 200405 5761 G 7393344.04 200405 5761 J 5667089.85 200405 5762 G 6315075.96 200405 5762 J 6328716.15 200405 5763 G 8861742.59 200405 5763 J 7788036.32 200405 5764 G 6028670.45 200405 5764 J 6459121.49 200405 5765 G 13156065.77 200405 5765 J 11901671.70 200406 5761 G 7614587.96 200406 5761 J 5704343.05 200406 5762 G 6556992.60 200406 5762 J 6238068.05 200406 5763 G 9130055.46 200406 5763 J 7990460.25 200406 5764 G 6387706.01 200406 5764 J 6907481.66 200406 5765 G 13562968.81 200406 5765 J 12495492.50 200407 5761 G 7987050.65 200407 5761 J 5723215.28 200407 5762 G 6833096.68 200407 5762 J 6391201.44 200407 5763 G 9410815.91 200407 5763 J 8076677.41 200407 5764 G 6456433.23 200407 5764 J 6987660.53 200407 5765 G 14000101.20 200407 5765 J 12301780.20 200408 5761 G 8085170.84 200408 5761 J 6050611.37 200408 5762 G 6854584.22 200408 5762 J 6521884.50 200408 5763 G 9468707.65 200408 5763 J 8460049.43 200408 5764 G 6587559.23
BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE --------------- ---------- ---------- -------------- 200408 5764 J 7342135.86 200408 5765 G 14450586.63 200408 5765 J 12680052.38
40 rows selected.
Elapsed: 00:00:00.00
|
|
1. 使用rollup函數(shù)的介紹
Quote:
|
下面是直接使用普通sql語句求出各地區(qū)的匯總數(shù)據(jù)的例子 06:41:36 SQL> set autot on 06:43:36 SQL> select area_code,sum(local_fare) local_fare 06:43:50 2 from t 06:43:51 3 group by area_code 06:43:57 4 union all 06:44:00 5 select ‘合計‘ area_code,sum(local_fare) local_fare 06:44:06 6 from t 06:44:08 7 /
AREA_CODE LOCAL_FARE ---------- -------------- 5761 54225413.04 5762 52039619.60 5763 69186545.02 5764 53156768.46 5765 104548719.19 合計 333157065.31
6 rows selected.
Elapsed: 00:00:00.03
Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=1310 Bytes= 24884)
1 0 UNION-ALL 2 1 SORT (GROUP BY) (Cost=5 Card=1309 Bytes=24871) 3 2 TABLE ACCESS (FULL) OF ‘T‘ (Cost=2 Card=1309 Bytes=248 71)
4 1 SORT (AGGREGATE) 5 4 TABLE ACCESS (FULL) OF ‘T‘ (Cost=2 Card=1309 Bytes=170 17)
Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 561 bytes sent via SQL*Net to client 503 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 6 rows processed
下面是使用分析函數(shù)rollup得出的匯總數(shù)據(jù)的例子 06:44:09 SQL> select nvl(area_code,‘合計‘) area_code,sum(local_fare) local_fare 06:45:26 2 from t 06:45:30 3 group by rollup(nvl(area_code,‘合計‘)) 06:45:50 4 /
AREA_CODE LOCAL_FARE ---------- -------------- 5761 54225413.04 5762 52039619.60 5763 69186545.02 5764 53156768.46 5765 104548719.19 333157065.31
6 rows selected.
Elapsed: 00:00:00.00
Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1309 Bytes= 24871)
1 0 SORT (GROUP BY ROLLUP) (Cost=5 Card=1309 Bytes=24871) 2 1 TABLE ACCESS (FULL) OF ‘T‘ (Cost=2 Card=1309 Bytes=24871 )
Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 557 bytes sent via SQL*Net to client 503 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 6 rows processed
從上面的例子我們不難看出使用rollup函數(shù),系統(tǒng)的sql語句更加簡單,耗用的資源更少,從6個consistent gets降到4個consistent gets,如果基表很大的話,結(jié)果就可想而知了.
|
|
1. 使用cube函數(shù)的介紹
Quote:
|
為了介紹cube函數(shù)我們再來看看另外一個使用rollup的例子 06:53:00 SQL> select area_code,bill_month,sum(local_fare) local_fare 06:53:37 2 from t 06:53:38 3 group by rollup(area_code,bill_month) 06:53:49 4 /
AREA_CODE BILL_MONTH LOCAL_FARE ---------- --------------- -------------- 5761 200405 13060433.89 5761 200406 13318931.01 5761 200407 13710265.93 5761 200408 14135782.21 5761 54225413.04 5762 200405 12643792.11 5762 200406 12795060.65 5762 200407 13224298.12 5762 200408 13376468.72 5762 52039619.60 5763 200405 16649778.91 5763 200406 17120515.71 5763 200407 17487493.32 5763 200408 17928757.08 5763 69186545.02 5764 200405 12487791.94 5764 200406 13295187.67 5764 200407 13444093.76 5764 200408 13929695.09 5764 53156768.46 5765 200405 25057737.47 5765 200406 26058461.31 5765 200407 26301881.40 5765 200408 27130639.01 5765 104548719.19 333157065.31
26 rows selected.
Elapsed: 00:00:00.00
系統(tǒng)只是根據(jù)rollup的第一個參數(shù)area_code對結(jié)果集的數(shù)據(jù)做了匯總處理,而沒有對bill_month做匯總分析處理,cube函數(shù)就是為了這個而設(shè)計的. 下面,讓我們看看使用cube函數(shù)的結(jié)果
06:58:02 SQL> select area_code,bill_month,sum(local_fare) local_fare 06:58:30 2 from t 06:58:32 3 group by cube(area_code,bill_month) 06:58:42 4 order by area_code,bill_month nulls last 06:58:57 5 /
AREA_CODE BILL_MONTH LOCAL_FARE ---------- --------------- -------------- 5761 200405 13060.43 5761 200406 13318.93 5761 200407 13710.27 5761 200408 14135.78 5761 54225.41 5762 200405 12643.79 5762 200406 12795.06 5762 200407 13224.30 5762 200408 13376.47 5762 52039.62 5763 200405 16649.78 5763 200406 17120.52 5763 200407 17487.49 5763 200408 17928.76 5763 69186.54 5764 200405 12487.79 5764 200406 13295.19 5764 200407 13444.09 5764 200408 13929.69 5764 53156.77 5765 200405 25057.74 5765 200406 26058.46 5765 200407 26301.88 5765 200408 27130.64 5765 104548.72 200405 79899.53 200406 82588.15 200407 84168.03 200408 86501.34 333157.05
30 rows selected.
Elapsed: 00:00:00.01
可以看到,在cube函數(shù)的輸出結(jié)果比使用rollup多出了幾行統(tǒng)計數(shù)據(jù).這就是cube函數(shù)根據(jù)bill_month做的匯總統(tǒng)計結(jié)果
|
|
|
1 rollup 和 cube函數(shù)的再深入
Quote:
|
從上面的結(jié)果中我們很容易發(fā)現(xiàn),每個統(tǒng)計數(shù)據(jù)所對應(yīng)的行都會出現(xiàn)null, 我們?nèi)绾蝸韰^(qū)分到底是根據(jù)那個字段做的匯總呢, 這時候,oracle的grouping函數(shù)就粉墨登場了. 如果當(dāng)前的匯總記錄是利用該字段得出的,grouping函數(shù)就會返回1,否則返回0
1 select decode(grouping(area_code),1,‘a(chǎn)ll area‘,to_char(area_code)) area_code, 2 decode(grouping(bill_month),1,‘a(chǎn)ll month‘,bill_month) bill_month, 3 sum(local_fare) local_fare 4 from t 5 group by cube(area_code,bill_month) 6* order by area_code,bill_month nulls last 07:07:29 SQL> /
AREA_CODE BILL_MONTH LOCAL_FARE ---------- --------------- -------------- 5761 200405 13060.43 5761 200406 13318.93 5761 200407 13710.27 5761 200408 14135.78 5761 all month 54225.41 5762 200405 12643.79 5762 200406 12795.06 5762 200407 13224.30 5762 200408 13376.47 5762 all month 52039.62 5763 200405 16649.78 5763 200406 17120.52 5763 200407 17487.49 5763 200408 17928.76 5763 all month 69186.54 5764 200405 12487.79 5764 200406 13295.19 5764 200407 13444.09 5764 200408 13929.69 5764 all month 53156.77 5765 200405 25057.74 5765 200406 26058.46 5765 200407 26301.88 5765 200408 27130.64 5765 all month 104548.72 all area 200405 79899.53 all area 200406 82588.15 all area 200407 84168.03 all area 200408 86501.34 all area all month 333157.05
30 rows selected.
Elapsed: 00:00:00.01 07:07:31 SQL>
可以看到,所有的空值現(xiàn)在都根據(jù)grouping函數(shù)做出了很好的區(qū)分,這樣利用rollup,cube和grouping函數(shù),我們做數(shù)據(jù)統(tǒng)計的時候就可以輕松很多了.
|
|
2. rank函數(shù)的介紹
介紹完rollup和cube函數(shù)的使用,下面我們來看看rank系列函數(shù)的使用方法.
問題2.我想查出這幾個月份中各個地區(qū)的總話費的排名.
Quote:
|
為了將rank,dense_rank,row_number函數(shù)的差別顯示出來,我們對已有的基礎(chǔ)數(shù)據(jù)做一些修改,將5763的數(shù)據(jù)改成與5761的數(shù)據(jù)相同. 1 update t t1 set local_fare = ( 2 select local_fare from t t2 3 where t1.bill_month = t2.bill_month 4 and t1.net_type = t2.net_type 5 and t2.area_code = ‘5761‘ 6* ) where area_code = ‘5763‘ 07:19:18 SQL> /
8 rows updated.
Elapsed: 00:00:00.01
我們先使用rank函數(shù)來計算各個地區(qū)的話費排名. 07:34:19 SQL> select area_code,sum(local_fare) local_fare, 07:35:25 2 rank() over (order by sum(local_fare) desc) fare_rank 07:35:44 3 from t 07:35:45 4 group by area_codee 07:35:50 5 07:35:52 SQL> select area_code,sum(local_fare) local_fare, 07:36:02 2 rank() over (order by sum(local_fare) desc) fare_rank 07:36:20 3 from t 07:36:21 4 group by area_code 07:36:25 5 /
AREA_CODE LOCAL_FARE FARE_RANK ---------- -------------- ---------- 5765 104548.72 1 5761 54225.41 2 5763 54225.41 2 5764 53156.77 4 5762 52039.62 5
Elapsed: 00:00:00.01
我們可以看到紅色標(biāo)注的地方出現(xiàn)了,跳位,排名3沒有出現(xiàn) 下面我們再看看dense_rank查詢的結(jié)果.
07:36:26 SQL> select area_code,sum(local_fare) local_fare, 07:39:16 2 dense_rank() over (order by sum(local_fare) desc ) fare_rank 07:39:39 3 from t 07:39:42 4 group by area_code 07:39:46 5 /
AREA_CODE LOCAL_FARE FARE_RANK ---------- -------------- ---------- 5765 104548.72 1 5761 54225.41 2 5763 54225.41 2 5764 53156.77 3 這是這里出現(xiàn)了第三名 5762 52039.62 4
Elapsed: 00:00:00.00
在這個例子中,出現(xiàn)了一個第三名,這就是rank和dense_rank的差別, rank如果出現(xiàn)兩個相同的數(shù)據(jù),那么后面的數(shù)據(jù)就會直接跳過這個排名,而dense_rank則不會, 差別更大的是,row_number哪怕是兩個數(shù)據(jù)完全相同,排名也會不一樣,這個特性在我們想找出對應(yīng)沒個條件的唯一記錄的時候又很大用處
1 select area_code,sum(local_fare) local_fare, 2 row_number() over (order by sum(local_fare) desc ) fare_rank 3 from t 4* group by area_code 07:44:50 SQL> /
AREA_CODE LOCAL_FARE FARE_RANK ---------- -------------- ---------- 5765 104548.72 1 5761 54225.41 2 5763 54225.41 3 5764 53156.77 4 5762 52039.62 5
在row_nubmer函數(shù)中,我們發(fā)現(xiàn),哪怕sum(local_fare)完全相同,我們還是得到了不一樣排名,我們可以利用這個特性剔除數(shù)據(jù)庫中的重復(fù)記錄.
這個帖子中的幾個例子是為了說明這三個函數(shù)的基本用法的. 下個帖子我們將詳細(xì)介紹他們的一些用法.
|
|
|
2. rank函數(shù)的介紹
a. 取出數(shù)據(jù)庫中最后入網(wǎng)的n個用戶 select user_id,tele_num,user_name,user_status,create_date from ( select user_id,tele_num,user_name,user_status,create_date, rank() over (order by create_date desc) add_rank from user_info ) where add_rank <= :n;
b.根據(jù)object_name刪除數(shù)據(jù)庫中的重復(fù)記錄 create table t as select obj#,name from sys.obj$; 再insert into t1 select * from t1 數(shù)次. delete from t1 where rowid in ( select row_id from ( select rowid row_id,row_number() over (partition by obj# order by rowid ) rn ) where rn <> 1 );
c. 取出各地區(qū)的話費收入在各個月份排名. SQL> select bill_month,area_code,sum(local_fare) local_fare, 2 rank() over (partition by bill_month order by sum(local_fare) desc) area_rank 3 from t 4 group by bill_month,area_code 5 /
BILL_MONTH AREA_CODE LOCAL_FARE AREA_RANK --------------- --------------- -------------- ---------- 200405 5765 25057.74 1 200405 5761 13060.43 2 200405 5763 13060.43 2 200405 5762 12643.79 4 200405 5764 12487.79 5 200406 5765 26058.46 1 200406 5761 13318.93 2 200406 5763 13318.93 2 200406 5764 13295.19 4 200406 5762 12795.06 5 200407 5765 26301.88 1 200407 5761 13710.27 2 200407 5763 13710.27 2 200407 5764 13444.09 4 200407 5762 13224.30 5 200408 5765 27130.64 1 200408 5761 14135.78 2 200408 5763 14135.78 2 200408 5764 13929.69 4 200408 5762 13376.47 5
20 rows selected. SQL>
3. lag和lead函數(shù)介紹
取出每個月的上個月和下個月的話費總額 1 select area_code,bill_month, local_fare cur_local_fare, 2 lag(local_fare,2,0) over (partition by area_code order by bill_month ) pre_local_fare, 3 lag(local_fare,1,0) over (partition by area_code order by bill_month ) last_local_fare, 4 lead(local_fare,1,0) over (partition by area_code order by bill_month ) next_local_fare, 5 lead(local_fare,2,0) over (partition by area_code order by bill_month ) post_local_fare 6 from ( 7 select area_code,bill_month,sum(local_fare) local_fare 8 from t 9 group by area_code,bill_month 10* ) SQL> / AREA_CODE BILL_MONTH CUR_LOCAL_FARE PRE_LOCAL_FARE LAST_LOCAL_FARE NEXT_LOCAL_FARE POST_LOCAL_FARE --------- ---------- -------------- -------------- --------------- --------------- --------------- 5761 200405 13060.433 0 0 13318.93 13710.265 5761 200406 13318.93 0 13060.433 13710.265 14135.781 5761 200407 13710.265 13060.433 13318.93 14135.781 0 5761 200408 14135.781 13318.93 13710.265 0 0 5762 200405 12643.791 0 0 12795.06 13224.297 5762 200406 12795.06 0 12643.791 13224.297 13376.468 5762 200407 13224.297 12643.791 12795.06 13376.468 0 5762 200408 13376.468 12795.06 13224.297 0 0 5763 200405 13060.433 0 0 13318.93 13710.265 5763 200406 13318.93 0 13060.433 13710.265 14135.781 5763 200407 13710.265 13060.433 13318.93 14135.781 0 5763 200408 14135.781 13318.93 13710.265 0 0 5764 200405 12487.791 0 0 13295.187 13444.093 5764 200406 13295.187 0 12487.791 13444.093 13929.694 5764 200407 13444.093 12487.791 13295.187 13929.694 0 5764 200408 13929.694 13295.187 13444.093 0 0 5765 200405 25057.736 0 0 26058.46 26301.881 5765 200406 26058.46 0 25057.736 26301.881 27130.638 5765 200407 26301.881 25057.736 26058.46 27130.638 0 5765 200408 27130.638 26058.46 26301.881 0 0 20 rows selected.
利用lag和lead函數(shù),我們可以在同一行中顯示前n行的數(shù)據(jù),也可以顯示后n行的數(shù)據(jù).
4. sum,avg,max,min移動計算數(shù)據(jù)介紹
計算出各個連續(xù)3個月的通話費用的平均數(shù) 1 select area_code,bill_month, local_fare, 2 sum(local_fare) 3 over ( partition by area_code 4 order by to_number(bill_month) 5 range between 1 preceding and 1 following ) "3month_sum", 6 avg(local_fare) 7 over ( partition by area_code 8 order by to_number(bill_month) 9 range between 1 preceding and 1 following ) "3month_avg", 10 max(local_fare) 11 over ( partition by area_code 12 order by to_number(bill_month) 13 range between 1 preceding and 1 following ) "3month_max", 14 min(local_fare) 15 over ( partition by area_code 16 order by to_number(bill_month) 17 range between 1 preceding and 1 following ) "3month_min" 18 from ( 19 select area_code,bill_month,sum(local_fare) local_fare 20 from t 21 group by area_code,bill_month 22* ) SQL> /
AREA_CODE BILL_MONTH LOCAL_FARE 3month_sum 3month_avg 3month_max 3month_min --------- ---------- ---------------- ---------- ---------- ---------- ---------- 5761 200405 13060.433 26379.363 13189.6815 13318.93 13060.433 5761 200406 13318.930 40089.628 13363.2093 13710.265 13060.433 5761 200407 13710.265 41164.976 13721.6587 14135.781 13318.93 40089.628 = 13060.433 + 13318.930 + 13710.265 13363.2093 = (13060.433 + 13318.930 + 13710.265) / 3 13710.265 = max(13060.433 + 13318.930 + 13710.265) 13060.433 = min(13060.433 + 13318.930 + 13710.265) 5761 200408 14135.781 27846.046 13923.023 14135.781 13710.265 5762 200405 12643.791 25438.851 12719.4255 12795.06 12643.791 5762 200406 12795.060 38663.148 12887.716 13224.297 12643.791 5762 200407 13224.297 39395.825 13131.9417 13376.468 12795.06 5762 200408 13376.468 26600.765 13300.3825 13376.468 13224.297 5763 200405 13060.433 26379.363 13189.6815 13318.93 13060.433 5763 200406 13318.930 40089.628 13363.2093 13710.265 13060.433 5763 200407 13710.265 41164.976 13721.6587 14135.781 13318.93 5763 200408 14135.781 27846.046 13923.023 14135.781 13710.265 5764 200405 12487.791 25782.978 12891.489 13295.187 12487.791 5764 200406 13295.187 39227.071 13075.6903 13444.093 12487.791 5764 200407 13444.093 40668.974 13556.3247 13929.694 13295.187 5764 200408 13929.694 27373.787 13686.8935 13929.694 13444.093 5765 200405 25057.736 51116.196 25558.098 26058.46 25057.736 5765 200406 26058.460 77418.077 25806.0257 26301.881 25057.736 5765 200407 26301.881 79490.979 26496.993 27130.638 26058.46 5765 200408 27130.638 53432.519 26716.2595 27130.638 26301.881
20 rows selected.
5. ratio_to_report函數(shù)的介紹
Quote:
|
1 select bill_month,area_code,sum(local_fare) local_fare, 2 ratio_to_report(sum(local_fare)) over 3 ( partition by bill_month ) area_pct 4 from t 5* group by bill_month,area_code SQL> break on bill_month skip 1 SQL> compute sum of local_fare on bill_month SQL> compute sum of area_pct on bill_month SQL> /
BILL_MONTH AREA_CODE LOCAL_FARE AREA_PCT ---------- --------- ---------------- ---------- 200405 5761 13060.433 .171149279 5762 12643.791 .165689431 5763 13060.433 .171149279 5764 12487.791 .163645143 5765 25057.736 .328366866 ********** ---------------- ---------- sum 76310.184 1
200406 5761 13318.930 .169050772 5762 12795.060 .162401542 5763 13318.930 .169050772 5764 13295.187 .168749414 5765 26058.460 .330747499 ********** ---------------- ---------- sum 78786.567 1
200407 5761 13710.265 .170545197 5762 13224.297 .164500127 5763 13710.265 .170545197 5764 13444.093 .167234221 5765 26301.881 .327175257 ********** ---------------- ---------- sum 80390.801 1
200408 5761 14135.781 .170911147 5762 13376.468 .161730539 5763 14135.781 .170911147 5764 13929.694 .168419416 5765 27130.638 .328027751 ********** ---------------- ---------- sum 82708.362 1
20 rows selected.
|
|
6 first,last函數(shù)使用介紹
Quote:
|
取出每月通話費最高和最低的兩個用戶. 1 select bill_month,area_code,sum(local_fare) local_fare, 2 first_value(area_code) 3 over (order by sum(local_fare) desc 4 rows unbounded preceding) firstval, 5 first_value(area_code) 6 over (order by sum(local_fare) asc 7 rows unbounded preceding) lastval 8 from t 9 group by bill_month,area_code 10* order by bill_month SQL> /
BILL_MONTH AREA_CODE LOCAL_FARE FIRSTVAL LASTVAL ---------- --------- ---------------- --------------- --------------- 200405 5764 12487.791 5765 5764 200405 5762 12643.791 5765 5764 200405 5761 13060.433 5765 5764 200405 5765 25057.736 5765 5764 200405 5763 13060.433 5765 5764 200406 5762 12795.060 5765 5764 200406 5763 13318.930 5765 5764 200406 5764 13295.187 5765 5764 200406 5765 26058.460 5765 5764 200406 5761 13318.930 5765 5764 200407 5762 13224.297 5765 5764 200407 5765 26301.881 5765 5764 200407 5761 13710.265 5765 5764 200407 5763 13710.265 5765 5764 200407 5764 13444.093 5765 5764 200408 5762 13376.468 5765 5764 200408 5764 13929.694 5765 5764 200408 5761 14135.781 5765 5764 200408 5765 27130.638 5765 5764 200408 5763 14135.781 5765 5764
20 rows selected.
|
|
|