Oracle Data Mining Application Developer's Guide
10g Release 1 (10.1)

Part Number B10699-01

5 ODM PL/SQL Sample Programs

This chapter provides sample code using DBMS_DATA_MINING for all the supported algorithms. The dataset used is the Drug Depot dataset that is available as part of the sample schema in Oracle10g. Please refer to Oracle Database Sample Schemas for information on sample schemas.

All samples are available in the directory $ORACLE_HOME/dm/demo/sample/plsql.

The ODM sample datasets must be loaded into a user schema before you run the sample programs. Refer to the following scripts for creating the Oracle tablespace and user schema and for loading the ODM sample datasets:

$ORACLE_HOME/dm/admin/odmtbs.sql
$ORACLE_HOME/dm/admin/odmuser.sql
$ORACLE_HOME/dm/admin/dmuserld.sql
$ORACLE_HOME/dm/admin/dmshgrants.sql

5.1 Overview of ODM PL/SQL Sample Programs

The ODM PL/SQL sample programs illustrate the main operations of the data mining process: preparing the input data, building a model, testing the model, and applying the model to new data.

Data mining models can be either supervised or unsupervised.

Supervised models predict the value of a specified variable, called the target variable, together with the confidence associated with each prediction. Supervised models are illustrated in the sample programs for Naive Bayes (NB), Adaptive Bayes Networks (ABN), and Support Vector Machines (SVM).

Unsupervised models have no target variable; they are used to predict group membership or relationships of an individual. Unsupervised models are illustrated in the sample programs for Clustering, Association Rules, and Non-Negative Matrix Factorization (NMF). Attribute Importance is also illustrated.

The PL/SQL sample programs rely on two sets of data: individual datasets (Table 5-1) and the SH sample schema (Table 5-2).

The file $ORACLE_HOME/dm/demo/data/README.txt explains the datasets.

Each sample program that demonstrates Classification (NB, ABN, SVM) contains code that prepares the input data using DBMS_DATA_MINING_TRANSFORM, builds a model, tests the model, and then applies the model to new data. Each program also shows how to generate test results such as a confusion matrix, lift, ROC, and ranked Apply results.
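
The build, apply, and test steps for a classification model can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code shipped in nbdemo.sql: the model name and the table and column names (nb_sample_settings, NB_BUILD_DATA, NB_TEST_DATA, NB_TEST_TARGETS, CASE_ID, TARGET_VALUE) are hypothetical, and the input tables are assumed to exist and to be already prepared. Check the parameter names against the DBMS_DATA_MINING reference for your release.

```sql
-- Hypothetical settings table that selects the Naive Bayes algorithm.
CREATE TABLE nb_sample_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(128));

DECLARE
  v_accuracy NUMBER;
BEGIN
  INSERT INTO nb_sample_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_NAIVE_BAYES);
  COMMIT;

  -- Build: NB_BUILD_DATA is an assumed, already prepared build table.
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'NB_SAMPLE_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'NB_BUILD_DATA',
    case_id_column_name => 'CASE_ID',
    target_column_name  => 'TARGET_VALUE',
    settings_table_name => 'NB_SAMPLE_SETTINGS');

  -- Apply: score the assumed test table into a new result table.
  DBMS_DATA_MINING.APPLY(
    model_name          => 'NB_SAMPLE_MODEL',
    data_table_name     => 'NB_TEST_DATA',
    case_id_column_name => 'CASE_ID',
    result_table_name   => 'NB_TEST_RESULT');

  -- Test: compare predictions with the known targets (assumed table).
  DBMS_DATA_MINING.COMPUTE_CONFUSION_MATRIX(
    accuracy                    => v_accuracy,
    apply_result_table_name     => 'NB_TEST_RESULT',
    target_table_name           => 'NB_TEST_TARGETS',
    case_id_column_name         => 'CASE_ID',
    target_column_name          => 'TARGET_VALUE',
    confusion_matrix_table_name => 'NB_CONFUSION_MATRIX');

  DBMS_OUTPUT.PUT_LINE('Accuracy: ' || ROUND(v_accuracy, 4));
END;
/
```

Run the block with SET SERVEROUTPUT ON in SQL*Plus to see the accuracy value. The shipped samples additionally transform the data with DBMS_DATA_MINING_TRANSFORM before the build, which this sketch omits.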

The samples for Regression using SVM normalize the input data, build models, test the models using metrics such as root mean squared error, apply the models to new data, and generate ranked results.
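
Root mean squared error is the square root of the average squared difference between predicted and actual values. As a rough sketch, it could be computed with a query such as the one below, assuming a hypothetical Apply result table svmr_test_result (columns case_id, prediction) and a hypothetical table svmr_test_targets holding the known values (columns case_id, target_value). These names are illustrative and are not taken from svmrdemo.sql.

```sql
-- RMSE = sqrt( avg( (prediction - actual)^2 ) ) over all test cases.
-- Both table names and all column names are assumptions for illustration.
SELECT SQRT(AVG(POWER(a.prediction - t.target_value, 2))) AS rmse
  FROM svmr_test_result  a
  JOIN svmr_test_targets t
    ON t.case_id = a.case_id;
```

If the target was normalized before the build, the predictions are on the normalized scale, so the error is only comparable with the original units after the normalization is reversed.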

The samples for Association demonstrate model build, and show how to obtain frequent itemsets and association rules for a given support and confidence.
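
As an illustration of how support and confidence thresholds might be supplied and the results retrieved, consider the sketch below. The settings table ar_sample_settings, the model name AR_SAMPLE_MODEL, and the threshold values are hypothetical; the model is assumed to have been built with CREATE_MODEL using this settings table before the queries are run. Column names follow the itemset and rule object types as documented for DBMS_DATA_MINING and should be verified for your release.

```sql
-- Hypothetical settings table holding the minimum support and confidence.
CREATE TABLE ar_sample_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(128));

BEGIN
  INSERT INTO ar_sample_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ASSO_MIN_SUPPORT, '0.1');     -- example value
  INSERT INTO ar_sample_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ASSO_MIN_CONFIDENCE, '0.5');  -- example value
  COMMIT;
END;
/

-- After the model is built: frequent itemsets, highest support first.
SELECT f.itemset_id, f.number_of_items, f.support
  FROM TABLE(DBMS_DATA_MINING.GET_FREQUENT_ITEMSETS('AR_SAMPLE_MODEL')) f
 ORDER BY f.support DESC;

-- Association rules, highest confidence first.
SELECT r.rule_id, r.rule_support, r.rule_confidence
  FROM TABLE(DBMS_DATA_MINING.GET_ASSOCIATION_RULES('AR_SAMPLE_MODEL')) r
 ORDER BY r.rule_confidence DESC, r.rule_support DESC;
```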

The samples for Clustering demonstrate model build, and show how to obtain clustering details such as histograms, child nodes, and rules. The clusters are scored and ranked based on their probability.
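
Cluster details are exposed through a table function, so they can be inspected with an ordinary query. The following is a minimal sketch that assumes a k-Means model named KM_SAMPLE_MODEL (a hypothetical name) already exists; it lists each cluster with its size and position in the cluster hierarchy. The attribute names are those of the cluster object type as documented for DBMS_DATA_MINING and should be checked for your release.

```sql
-- One row per cluster: identifier, number of cases, parent, and tree level.
-- KM_SAMPLE_MODEL is an assumed model name.
SELECT c.id, c.record_count, c.parent, c.tree_level
  FROM TABLE(DBMS_DATA_MINING.GET_MODEL_DETAILS_KM('KM_SAMPLE_MODEL')) c
 ORDER BY c.id;
```

The histogram, child, centroid, and rule details are nested collections in the same result and can be unnested with further TABLE() expressions.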

The samples for Feature Extraction demonstrate model build, and show how to obtain details of various features. The features are scored and ranked based on their probability.
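
Feature details can be retrieved in a similar way. The sketch below assumes an NMF model named NMF_SAMPLE_MODEL (a hypothetical name) and unnests the attribute set of each feature to show which attributes carry the largest coefficients. Verify the attribute names against the DBMS_DATA_MINING reference for your release.

```sql
-- One row per (feature, attribute) pair, strongest coefficients first.
-- NMF_SAMPLE_MODEL is an assumed model name.
SELECT f.feature_id,
       a.attribute_name,
       a.attribute_value,
       a.coefficient
  FROM TABLE(DBMS_DATA_MINING.GET_MODEL_DETAILS_NMF('NMF_SAMPLE_MODEL')) f,
       TABLE(f.attribute_set) a
 ORDER BY f.feature_id, a.coefficient DESC;
```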

There is one sample program demonstrating the BLAST interface for biological sequence match and alignment.

Finally, there are three sample programs that demonstrate text mining for extracting features from a text document into a nested table column, text classification using SVM, and text feature extraction using NMF, respectively.

5.2 Summary of ODM PL/SQL Sample Programs

All the sample programs listed in the tables below are located in the directory $ORACLE_HOME/dm/demo/sample/plsql.

A summary description of these sample programs is also provided in $ORACLE_HOME/dm/demo/sample/plsql/README.txt.

Table 5-1  PL/SQL Samples Based on Individual Datasets

Sample Program    Description
--------------    -----------
aidemo.sql        Attribute Importance using an MDL-based algorithm
abndemo.sql       Classification using Adaptive Bayes Network algorithm
ardemo.sql        Association using Apriori algorithm
blastdemo.sql     BLAST sequence matching and alignment
kmdemo.sql        Clustering using k-Means algorithm
nbdemo.sql        Classification using Naive Bayes algorithm
nmfdemo.sql       Feature Extraction using NMF algorithm
svmcdemo.sql      Classification using SVM algorithm
svmrdemo.sql      Regression using SVM algorithm

Table 5-2  PL/SQL Samples Based on SH Schema

Sample Program    Description
--------------    -----------
ai_sh.sql         Attribute Importance using an MDL-based algorithm
abn_sh.sql        Classification using Adaptive Bayes Network algorithm
ar_sh_.sql        Association using Apriori algorithm
akm_sh.sql        Clustering using k-Means algorithm
nb_sh.sql         Classification using Naive Bayes algorithm
nmf_sh.sql        Feature Extraction using NMF algorithm
svmc_sh.sql       Classification using SVM algorithm
svmr_sh.sql       Regression using SVM algorithm
textfe.sql        Extracts text features from a CLOB/VARCHAR2 column into a nested table column in a table that can be provided as input to CREATE_MODEL
textnmf.sql       Text feature extraction using NMF
textsvmc.sql      Text classification using SVM