Oracle Data Mining Application Developer's Guide
10g Release 1 (10.1)

Part Number B10699-01

A Binning

This appendix provides a detailed example of binning.

Table A-1 displays original data before binning. Table A-2 shows the bin boundaries for numeric data; Table A-3 shows bin boundaries for categorical data. Table A-4 shows the results of binning.

Table A-1 Binning Illustration: Data before Binning
PERSON_ID  AGE  WORKCLASS  EDUCATION  MARITAL_STATUS  OCCUPATION
2          27   Private    HS-grad    Married         Crafts
8          46   Private    Bach.      Separ.          Prof.
10         34   Private    HS-grad    Separ.          Agricultural
11         23   Sta-gov    < Bach.    NeverM          Cleric.
41         30   Private    < Bach.    Married         Sales

Table A-2 Binning Illustration: Bin Boundaries for Numeric Data
COLUMN_NAME  LOWER_BOUNDARY  UPPER_BOUNDARY  BIN_ID  DISPLAY_NAME
AGE          17              24.3            1       17-24.3
AGE          24.3            31.6            2       24.3-31.6
AGE          31.6            38.9            3       31.6-38.9
AGE          38.9            46.2            4       38.9-46.2
AGE          46.2            53.5            5       46.2-53.5
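
The boundaries in Table A-2 are evenly spaced: each AGE bin spans 7.3 years between 17 and 53.5. The following Python sketch is an illustration only and does not use the ODM API. It rebuilds those boundaries and assigns an age to a bin. The function names are hypothetical, and the rule for a value that falls exactly on a boundary (it goes to the higher bin, except at the top of the range) is an assumption, because the tables do not state it.

```python
from bisect import bisect_right

# Bin edges copied from Table A-2 (AGE): five bins, each 7.3 wide.
AGE_EDGES = [17, 24.3, 31.6, 38.9, 46.2, 53.5]


def equal_width_edges(low, high, n_bins):
    """Return n_bins + 1 evenly spaced edges from low to high.

    Edges are rounded to one decimal place to match the precision
    shown in Table A-2; this rounding is specific to the example.
    """
    width = (high - low) / n_bins
    return [round(low + i * width, 1) for i in range(n_bins + 1)]


def numeric_bin_id(value, edges):
    """Return the 1-based bin ID for value, or None if it is out of range.

    Assumption (not stated in the tables): each bin includes its lower
    boundary and excludes its upper boundary, except the last bin,
    which includes both ends.
    """
    if value < edges[0] or value > edges[-1]:
        return None  # outside the closed range covered by the bins
    if value == edges[-1]:
        return len(edges) - 1  # top of the range belongs to the last bin
    return bisect_right(edges, value)
```

Under these assumptions, `equal_width_edges(17, 53.5, 5)` reproduces `AGE_EDGES`, and the ages 27, 46, 34, 23, and 30 from Table A-1 map to bins 2, 4, 3, 1, and 2, which agrees with the AGE column of Table A-4.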

Table A-3 Binning Illustration: Bin Boundaries for Categorical Data
COLUMN_NAME     CATEGORY  BIN_ID  DISPLAY_NAME
WORKCLASS       Loc-gov   1       Government
WORKCLASS       Fed-gov   1       Government
WORKCLASS       Sta-gov   1       Government
WORKCLASS       Private   2       Others
EDUCATION       HS-grad   1       HS-grad
EDUCATION       < Bach.   2       < Bach.
EDUCATION       Bach.     3       Bach.
EDUCATION       Masters   4       Masters
MARITAL_STATUS  Married   1       Married
MARITAL_STATUS  NeverM    2       NeverM
MARITAL_STATUS  Divorc.   3       Divorc.
MARITAL_STATUS  Widowed   4       Widowed
MARITAL_STATUS  Separ.    5       Separ.
OCCUPATION      Prof      1       Prof
OCCUPATION      Crafts    2       Crafts
OCCUPATION      Exec.     3       Exec.
OCCUPATION      Sales     4       Sales
OCCUPATION      Cleric    5       Cleric
OCCUPATION                6       Other_occ
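
Categorical binning amounts to a lookup from category to bin ID. The Python sketch below is an illustration only and does not use the ODM API. It encodes Table A-3 as a dictionary. Two assumptions are made here: the OCCUPATION row with a blank category is treated as a catch-all for unlisted occupations, and labels are compared after dropping a trailing period, because Table A-1 writes "Prof." and "Cleric." where Table A-3 writes "Prof" and "Cleric". The names `CATEGORY_BINS`, `DEFAULT_BINS`, and `categorical_bin_id` are hypothetical.

```python
# Category-to-bin assignments copied from Table A-3.
CATEGORY_BINS = {
    "WORKCLASS": {"Loc-gov": 1, "Fed-gov": 1, "Sta-gov": 1, "Private": 2},
    "EDUCATION": {"HS-grad": 1, "< Bach.": 2, "Bach.": 3, "Masters": 4},
    "MARITAL_STATUS": {
        "Married": 1, "NeverM": 2, "Divorc.": 3, "Widowed": 4, "Separ.": 5,
    },
    "OCCUPATION": {"Prof": 1, "Crafts": 2, "Exec.": 3, "Sales": 4, "Cleric": 5},
}

# Assumption: the blank-category OCCUPATION row (bin 6, Other_occ)
# collects every occupation that is not listed explicitly.
DEFAULT_BINS = {"OCCUPATION": 6}


def _normalize(label):
    """Strip surrounding whitespace and trailing periods before comparing."""
    return label.strip().rstrip(".")


def categorical_bin_id(column, category):
    """Return the bin ID for category in column, or None if there is no bin."""
    bins = {_normalize(k): v for k, v in CATEGORY_BINS[column].items()}
    return bins.get(_normalize(category), DEFAULT_BINS.get(column))
```

With these assumptions, the occupations Crafts, Prof., Agricultural, Cleric., and Sales from Table A-1 map to bins 2, 1, 6, 5, and 4, which agrees with the OCCUPATION column of Table A-4. A category with no matching row and no catch-all bin, such as an unlisted WORKCLASS value, returns `None`.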

Table A-4 Binning Illustration: Assignment of Original Data to Bins
PERSON_ID  AGE  WORKCLASS  WEIGHT  EDUCATION  MARITAL_STATUS  OCCUPATION
2          2    2          2       1          1               2
8          4    2          1       3          5               1
10         3    2          1       1          5               6
11         1    1          1       2          2               5
41         2    2          2       2          1               4

A.1 Use of Automated Binning

The Java interface supports automated binning. An important advantage of automated binning is that it allows ODM to handle raw data. Automated binning also allows initial exploration of problems about which there is little or no information to guide binning decisions.

Currently, automated binning requires closed intervals for numerical bins. As a result, values that fall outside those intervals can be ignored. For example, if the salary range in the build data table is 0 to 1,000,000, any salary greater than 1,000,000 is ignored when the model is applied. If you are trying to identify likely purchasers of a high-end consumer product, attribute values indicating the wealthiest individuals are likely to be discarded, and you probably will not find the best targets. With manual binning, you have the option of making the extreme bins open-ended, that is, giving them infinite boundaries.
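
The difference between closed and open-ended bins can be sketched in a few lines of Python. This is an illustration only and does not use the ODM API. The salary bin edges are hypothetical values chosen to span the 0 to 1,000,000 range from the example above, and the rule that each bin includes its lower boundary is an assumption.

```python
import math
from bisect import bisect_right


def bin_id(value, edges):
    """Return the 1-based bin ID for value, or None if no bin contains it.

    Assumption: each bin includes its lower boundary; the last bin also
    includes its upper boundary.
    """
    if value < edges[0] or value > edges[-1]:
        return None  # the value is ignored, as with closed intervals
    if value == edges[-1]:
        return len(edges) - 1
    return bisect_right(edges, value)


# Hypothetical closed bins built from salaries between 0 and 1,000,000.
CLOSED_EDGES = [0, 250_000, 500_000, 750_000, 1_000_000]

# The same bins with the two extreme bins made open-ended.
OPEN_EDGES = [-math.inf, 250_000, 500_000, 750_000, math.inf]

salary = 1_500_000
closed_result = bin_id(salary, CLOSED_EDGES)  # None: outside every bin
open_result = bin_id(salary, OPEN_EDGES)      # 4: caught by the top bin
```

With the closed edges, a salary of 1,500,000 falls outside every bin and is dropped. With the open-ended edges, the same salary lands in the highest bin, so the wealthiest individuals remain visible to the model.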