Skip Headers

Oracle® Database SQL Reference
10g Release 1 (10.1)

Part Number B10759-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Feedback

Go to previous page
Previous
Go to next page
Next
View PDF

REGR_ (Linear Regression) Functions

The linear regression functions are:


Syntax

linear_regr::=
Description of linear_regr.gif follows
Description of the illustration linear_regr.gif


See Also:

"Analytic Functions " for information on syntax, semantics, and restrictions


Purpose

The linear regression functions fit an ordinary-least-squares regression line to a set of number pairs. You can use them as both aggregate and analytic functions.


See Also:

"Aggregate Functions " and "About SQL Expressions " for information on valid forms of expr

These functions take as arguments any numeric datatype or any nonnumeric datatype that can be implicitly converted to a numeric datatype. Oracle determines the argument with the highest numeric precedence, implicitly converts the remaining arguments to that datatype, and returns that datatype.


See Also:

Table 2-11, "Implicit Type Conversion Matrix" for more information on implicit conversion and "Numeric Precedence " for information on numeric precedence

Oracle applies the function to the set of (expr1, expr2) pairs after eliminating all pairs for which either expr1 or expr2 is null. Oracle computes all the regression functions simultaneously during a single pass through the data.

expr1 is interpreted as a value of the dependent variable (a y value), and expr2 is interpreted as a value of the independent variable (an x value).

All of the remaining regression functions return a numeric datatype and can be null:

REGR_SXY, REGR_SXX, REGR_SYY are auxiliary functions that are used to compute various diagnostic statistics.

The following examples are based on the sample tables sh.sales and sh.products.


General Linear Regression Example

The following example provides a comparison of the various linear regression functions:

SELECT
s.channel_id,
REGR_SLOPE(s.quantity_sold, p.prod_list_price)  SLOPE ,
REGR_INTERCEPT(s.quantity_sold, p.prod_list_price)  INTCPT ,
REGR_R2(s.quantity_sold, p.prod_list_price)  RSQR ,
REGR_COUNT(s.quantity_sold, p.prod_list_price)  COUNT ,
REGR_AVGX(s.quantity_sold, p.prod_list_price)  AVGLISTP ,
REGR_AVGY(s.quantity_sold, p.prod_list_price)  AVGQSOLD
FROM  sales s, products p
WHERE s.prod_id=p.prod_id AND
p.prod_category='Men'  AND
s.time_id=to_DATE('10-OCT-2000')
GROUP BY s.channel_id;

C      SLOPE     INTCPT       RSQR      COUNT   AVGLISTP   AVGQSOLD
- ---------- ---------- ---------- ---------- ---------- ----------
C -.03529838 16.4548382 .217277422         17 87.8764706 13.3529412
I  -.0108044 13.3082392 .028398018         43  116.77907 12.0465116
P -.01729665 11.3634927 .026191191         33 80.5818182 9.96969697
S -.01277499  13.488506 .000473089         71  52.571831 12.8169014
T -.01026734 5.01019929 .064283727         21       75.2 4.23809524

REGR_SLOPE and REGR_INTERCEPT Examples

The following example determines the slope and intercept of the regression line for the amount of sales and sale profits for each fiscal year:

SELECT t.fiscal_year,
   REGR_SLOPE(s.amount_sold, s.quantity_sold) "Slope",
   REGR_INTERCEPT(s.amount_sold, s.quantity_sold) "Intercept"
   FROM sales s, times t
   WHERE s.time_id = t.time_id
   GROUP BY t.fiscal_year;

FISCAL_YEAR      Slope  Intercept
----------- ---------- ----------
       1998 49.3934247 71.6015479
       1999 49.3443482 70.1502601
       2000 49.2262135 75.0287476

The following example determines the cumulative slope and cumulative intercept of the regression line for the amount of and quantity of sales for two products (270 and 260) for weekend transactions (day_number_in_week = 6 or 7) during the last three weeks (fiscal_week_number of 50, 51, or 52) of 1998:

SELECT t.fiscal_month_number "Month", t.day_number_in_month "Day", 
   REGR_SLOPE(s.amount_sold, s.quantity_sold) 
      OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_SLOPE,
   REGR_INTERCEPT(s.amount_sold, s.quantity_sold) 
      OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month) AS CUM_ICPT 
   FROM sales s, times t
   WHERE s.time_id = t.time_id 
      AND s.prod_id IN (270, 260)
      AND t.fiscal_year=1998 
      AND t.fiscal_week_number IN (50, 51, 52)
      AND t.day_number_in_week IN (6,7)
   ORDER BY t.fiscal_month_desc, t.day_number_in_month;
Month        Day  CUM_SLOPE   CUM_ICPT
---------- ---------- ---------- ----------
        12         12        -68       1872
        12         12        -68       1872
        12         13 -20.244898 1254.36735
        12         13 -20.244898 1254.36735
        12         19 -18.826087       1287
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         20 62.4561404  125.28655
        12         26 67.2658228 58.9712313
        12         26 67.2658228 58.9712313
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221
        12         27 37.5245541 284.958221

REGR_COUNT Examples

The following example returns the number of customers in the customers table (out of a total of 319) who have account managers.

SELECT REGR_COUNT(customer_id, account_mgr_id) FROM customers;

REGR_COUNT(CUSTOMER_ID,ACCOUNT_MGR_ID)
--------------------------------------
                                   231

The following example computes the cumulative number of transactions for each day in April of 1998:

SELECT UNIQUE t.day_number_in_month,
   REGR_COUNT(s.amount_sold, s.quantity_sold) 
      OVER (PARTITION BY t.fiscal_month_number 
   ORDER BY t.day_number_in_month) "Regr_Count"
FROM sales s, times t
WHERE s.time_id = t.time_id 
AND t.fiscal_year = 1998 AND t.fiscal_month_number = 4; 
 
DAY_NUMBER_IN_MONTH Regr_Count
------------------- ----------
                  1        825
                  2       1650
                  3       2475
                  4       3300
. . .
                 26      21450
                 30      22200

REGR_R2 Examples

The following example computes the coefficient of determination of the regression line for amount of sales greater than 5000 and quantity sold:

SELECT REGR_R2(amount_sold, quantity_sold) FROM sales
   WHERE amount_sold > 5000;

REGR_R2(AMOUNT_SOLD,QUANTITY_SOLD)
----------------------------------
                        .024087453

The following example computes the cumulative coefficient of determination of the regression line for monthly sales amounts and quantities for each month during 1998:

SELECT t.fiscal_month_number,
      REGR_R2(SUM(s.amount_sold), SUM(s.quantity_sold))
      OVER (ORDER BY t.fiscal_month_number) "Regr_R2"
   FROM sales s, times t
   WHERE s.time_id = t.time_id
   AND t.fiscal_year = 1998
   GROUP BY t.fiscal_month_number
   ORDER BY t.fiscal_month_number;

FISCAL_MONTH_NUMBER    Regr_R2
------------------- ----------
                  1
                  2          1
                  3 .927372984
                  4 .807019972
                  5 .932745567
                  6  .94682861
                  7 .965342011
                  8 .955768075
                  9 .959542618
                 10 .938618575
                 11 .880931415
                 12 .882769189

REGR_AVGY and REGR_AVGX Examples

The following example calculates the regression average for the amount and quantity of sales for each year:

SELECT t.fiscal_year,
   REGR_AVGY(s.amount_sold, s.quantity_sold) "Regr_AvgY",
   REGR_AVGX(s.amount_sold, s.quantity_sold) "Regr_AvgX"
FROM sales s, times t
WHERE s.time_id = t.time_id
GROUP BY t.fiscal_year;

FISCAL_YEAR  Regr_AvgY  Regr_AvgX
----------- ---------- ----------
       1998 716.602044 13.0584283
       1999 714.910831 13.0665536
       2000 717.331304 13.0479781

The following example calculates the cumulative averages for the amount and quantity of sales profits for product 260 during the last two weeks of December 1998:

SELECT t.day_number_in_month,
   REGR_AVGY(s.amount_sold, s.quantity_sold)
      OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month)
      "Regr_AvgY",
   REGR_AVGX(s.amount_sold, s.quantity_sold)
      OVER (ORDER BY t.fiscal_month_desc, t.day_number_in_month)
      "Regr_AvgX"
   FROM sales s, times t
   WHERE s.time_id = t.time_id 
      AND s.prod_id = 260
      AND t.fiscal_month_desc = '1998-12'
      AND t.fiscal_week_number IN (51, 52)
   ORDER BY t.day_number_in_month;

DAY_NUMBER_IN_MONTH  Regr_AvgY  Regr_AvgX
------------------- ---------- ----------
                 14        882       24.5
                 14        882       24.5
                 15        801      22.25
                 15        801      22.25
                 16      777.6       21.6
                 18 642.857143 17.8571429
                 18 642.857143 17.8571429
                 20      589.5     16.375
                 21        544 15.1111111
                 22 592.363636 16.4545455
                 22 592.363636 16.4545455
                 24 553.846154 15.3846154
                 24 553.846154 15.3846154
                 26        522       14.5
                 27      578.4 16.0666667

REGR_SXY, REGR_SXX, and REGR_SYY Examples

The following example computes the REGR_SXY, REGR_SXX, and REGR_SYY values for the regression analysis of amount and quantity of sales for each year in the sample sh.sales table:

SELECT t.fiscal_year,
   REGR_SXY(s.amount_sold, s.quantity_sold) "Regr_sxy",
   REGR_SYY(s.amount_sold, s.quantity_sold) "Regr_syy",
   REGR_SXX(s.amount_sold, s.quantity_sold) "Regr_sxx"
FROM sales s, times t
WHERE s.time_id = t.time_id
GROUP BY t.fiscal_year;

FISCAL_YEAR   Regr_sxy   Regr_syy   Regr_sxx
----------- ---------- ---------- ----------
       1998 1620591607 2.3328E+11 32809865.2
       1999 1955866724 2.7695E+11 39637097.2
       2000 2127877398 3.0630E+11 43226509.7

The following example computes the cumulative REGR_SXY, REGR_SXX, and REGR_SYY statistics for amount and quantity of weekend sales for products 270 and 260 for each year-month value in 1998:

SELECT t.day_number_in_month,
   REGR_SXY(s.amount_sold, s.quantity_sold)
      OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxy",
   REGR_SYY(s.amount_sold, s.quantity_sold)
      OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_syy",
   REGR_SXX(s.amount_sold, s.quantity_sold)
      OVER (ORDER BY t.fiscal_year, t.fiscal_month_desc) "Regr_sxx"
FROM sales s, times t
WHERE s.time_id = t.time_id 
   AND prod_id IN (270, 260)
   AND t.fiscal_month_desc = '1998-02'
   AND t.day_number_in_week IN (6,7)
ORDER BY t.day_number_in_month;

DAY_NUMBER_IN_MONTH   Regr_sxy   Regr_syy   Regr_sxx
------------------- ---------- ---------- ----------
                  1  130973783 1.8916E+10 2577797.94
. . .
                 30  130973783 1.8916E+10 2577797.94