Thursday, January 05, 2012

Oracle - SQL - Evenly dividing a range of Data using NTILE, MIN, and MAX

Problem: Over a widely dispersed set of IDs, create 10 evenly divided ranges each in terms of count per group. Determine the ID range for each group.
Practical: We needed to scrub a bunch of data in parallel streams. The easist answer was to create 10 simultaneous UPDATEs, but we didn't want the updates stepping on each other.
Solution: Use NTILE to create 10 evenly "divided" temporary groups [from a 'rowcount' standpoint]. Use MIN and MAX against each temporary group to determine the starting and ending points. Query:

SELECT 
   Quartile,
   MIN("ID") "StartingID",
   MAX("ID") "EndingID",
   COUNT(*) "CountPerGroup"
FROM
   ( 
      SELECT 
         a.contract_payment_id "ID",
         NTILE(10) OVER(ORDER BY a.contract_payment_id) AS quartile
      FROM   
         contract_payment a 
      WHERE
         a.transaction_type_code = 'CC' and 
         a.credit_card_type_code = 'VISA'
         and rownum <= 10000  -- Reduce resultset
 ) 
GROUP BY 
   quartile
ORDER BY 
   1;
Results:
QUARTILE  StartingID  EndingID  CountPerGroup
    1        59261      93271       1000
    2        93281     110492       1000
    3       110512     122382       1000
    4       122391     134062       1000
    5       134072     144371       1000
    6       144372     154481       1000
    7       154482     162741       1000
    8       162752     169163       1000
    9       169173     176913       1000
   10       176921     183833       1000

1 comment:

Anonymous said...

EXCELLENT !!!
Thanks so much
Sergio