Monday, November 21, 2011

How Many Sleeps to Santa

select to_date('25/12/2011','DD/MM/YYYY') - trunc(sysdate) "How Many Sleep to Santa"
from dual;

How Many Sleep to Santa

Thursday, November 17, 2011

Call for Presentations : OUG Ireland Conference 2012

The call for presentations for the annual Oracle User Group Ireland conference has been posted in the last few days.
The conference is planned for March 2012 and the venue will be picked over the next few weeks.
I’m on the organising committee this year. It is hoped to have a number of parallel streams covering core Database Technology, BI (& EPM) and Development (including Fusion).
If you are interested in presenting a short presentation of approx. 45 minutes (including time for questions), then you will need to submit your Topic and Abstract using the following link :
The conference is not limited to presenters from Ireland and it is hoped to get a number of well known Oracle experts and Oracle ACEs to come to Dublin for the day.
What kind of topics are of interest? Well, pretty much anything Oracle. We have all come across something interesting in our jobs that we could share, be it using a particular technique, new features, sharing experiences, best practices, product demos, etc.
I’ve already submitted a presentation on Oracle Data Miner.
There is a Twitter hash tag for the Oracle Conference #oug_ire2012.  So add this to your Twitter tool to follow developments and announcements about the conference.
If you have any questions about the conference, drop me an email.

Wednesday, November 16, 2011

My UKOUG Conference 2011 Schedule

UK Oracle User Group Conference 2011

The UKOUG conference will be in a couple of weeks. I have my flights and hotel booked, and I’ve just finished selecting my agenda of presentations. I really enjoy this conference as it serves many purposes including, finding new directions Oracle is taking, new product features, some upskilling/training, confirming that the approaches that I have been using on projects are valid, getting lots of hints and tips, etc.

One thing that I always try to do, and I strongly encourage everyone (in particular first timers) to do, is to go to one session every day that is on a topic or product that you know (nearly) nothing about.  You might discover that you know more than you think, or you may learn something new that can be fed into some project on your return or over the next 12 months.

My agenda for the conference currently looks very busy, and in between these sessions there is the exhibition hall, meetings with old and new friends, meetings with product/business unit managers, asking people to write articles for Oracle Scene, checking out possible presenters to come to Ireland for our conference in March 2012, etc.  Then there is my presentation on the Wednesday afternoon.


I’ll miss most of the Oak Table event on the Sunday but I hope to make it in time for

16:40-17:30 : Performance & High Availability Panel Session


Monday

9:20-9:50 : Keynote by Mark Sunday, Oracle (H1)
10:00-10:45 : The Future of BI & Oracle roadmap, Mike Durran, Oracle (H5)
11:05-12:05 : Implementing Interactive Maps with OBIEE 11g, Antony Heljula, Peak Indicators (H10A)
12:15-13:15 : OBI 11g Analysis & Reporting New Features, Mark Rittman (8A)
14:30-15:15 : Master Data Management – What is it & how to make it work – Robert Barnett, Hub Solutions Designs (H10A)
16:20-17:35 : Dummies Guide to Oracle ADF, Grant Ronald, Oracle, (Media Suite)
16:35-18:30 : The DB Time Performance Method, Graham Wood, Oracle (H8A)
17:45-18:30 : Performance & Stability with Oracle 11g SQL Plan Management, Doug Burns (H1)
17:45-18:30 : Experiences in Virtualization, Michael Doherty (H10A)
19:45-20:45 : Exhibition Welcome Drinks
20:45-Late : Focus Pubs


Tuesday

9:00-11:00 : Next Generation BI Architectures Masterclass, Andrew Bond, Oracle (H10B)
10:10-10:55 : Who’s afraid of Analytic Functions, Alex Nuijten, Maxima (H5)
11:15-12:15 : Analysing Your Data with Analytic Functions, Carl Dudley, (H9)
11:25-13:25 : Using a Physical Standby to Minimize Downtime for DB Release or Server Change, Michael Abbey, Pythian (Media Suite)
14:40-15:25 : How not to make the headlines, Mark Clewett, Hitachi (H10A)
14:40-15:25 : APEX Back to Basics, Paul Broughton, APEX Evangelists (H9)
15:35-16:20 : Can People be identified in the database, Pete Finnigan (H1)
16:40-18:35 : OTN Hands-on Workshop, Todd Trichler, Oracle (H8A)
17:50-18:35 : SQL Developer Data Modeler as a replacement for Oracle Designer, Paul Bainbridge, Fujitsu, (H8B)
18:45-19:45 : Keynote : Future of Enterprise Software and Oracle, Ray Wang, Constellation Research (H1)
20:00-Late : Evening Social & Networking


Wednesday

9:00-10:00 : Oracle 11g Database: Automatic Parallelism, Joel Goodman, Oracle (H9)
9:00-10:00 : Big Data: Learn how to predict the future, Keith Laker, Oracle (H8B)
10:10-10:55 : All about indexes – What to index, when and how, Mark Bobak, ProQuest (H5)
11:20-12:30 : Using Application Express to Build Highly Accessible Products, Anthony Rayner, Oracle (H8A)
12:30-13:30 : Practical uses for APEX Dictionary, John Scott, APEX Evangelists (H8A)
15:20-16:05 : How to deploy your Oracle Data Miner 11g R2 Workflows in a Live Environment – Me  (H7B)
16:15-17:00 : Next Generation Data Warehousing, Kulvinder Hari, Oracle (H8A)
16:15-17:00 : Beyond RTFM and WTF Message Moments. Introducing a new standard: Oracle Fusion Applications User Assistance, Ultan O’Broin (Executive Room 7)

I know I have some overlapping sessions, but I will decide on the day which of these I will attend.

As you can see I will be following the BI stream mainly, with a few sessions on the Database and Development streams too.

This year there is a smart phone app to help us organise our agenda, meetings, etc. The only downside is that the app does not import the agenda that I created on the website, so I have to do it again. Maybe for next year they will have an import agenda feature.

New UKOUG mobile app – Launched October 2011

Wednesday, November 9, 2011

ODM–PL/SQL API for Exporting & Importing Models

In a previous blog post I talked about how you can take a copy of a workflow developed in Oracle Data Miner, and load it into a new schema.
When your data mining project gets to a mature stage and you need to productionalise the data mining process and model updates, you will need to use a different set of tools.

As you gather more and more data and cases, you will be updating/refreshing your models to reflect this new data. The newly updated data mining model needs to be moved from the development/test environment to the production environment. As with all things in IT we would like to automate this updating of the model in production.
There are a number of database features and packages that we can use to automate the update and it involves the setting up of some scripts on the development/test database and also on the production database.

These steps include:

  • Creation of a directory on the development/test database
  • Exporting of the updated Data Mining model
  • Copying of the exported Data Mining model to the production server
  • Removing the existing Data Mining model from production
  • Importing of the new Data Mining model.
  • Renaming of the imported model to the standard name

The DBMS_DATA_MINING PL/SQL package has 2 procedures that allow us to export a model and to import a model. These procedures are an API to Oracle Data Pump. The procedure to export a model is DBMS_DATA_MINING.EXPORT_MODEL and the procedure to import a model is DBMS_DATA_MINING.IMPORT_MODEL. The parameters to these procedures are what you would expect to use if you were using Data Pump directly, but they have been tailored for data mining models.

Let's start by listing the models that we have in our development/test schema:

SQL> connect dmuser2/dmuser2
SQL> SELECT model_name FROM user_mining_models;


Create/define the directory on the server where the models will be exported to.

CREATE OR REPLACE DIRECTORY DataMiningDir AS 'c:\app\Data_Mining_Exports';

The schema you are using will need to have the CREATE ANY DIRECTORY privilege.

Now we can export our model. In this example we are going to export the Decision Tree model (CLAS_DT_1_6).

The procedure has the following structure:

DBMS_DATA_MINING.EXPORT_MODEL (
     filename IN VARCHAR2,
     directory IN VARCHAR2,
     model_filter IN VARCHAR2 DEFAULT NULL,
     filesize IN VARCHAR2 DEFAULT NULL,
     operation IN VARCHAR2 DEFAULT NULL,
     remote_link IN VARCHAR2 DEFAULT NULL,
     jobname IN VARCHAR2 DEFAULT NULL);

If we wanted to export all the models into a file called Exported_DM_Models, we would run:

EXEC DBMS_DATA_MINING.EXPORT_MODEL('Exported_DM_Models', 'DataMiningDir');

If we just wanted to export our Decision Tree model to file Exported_CLASS_DT_Model, we would run:

EXEC DBMS_DATA_MINING.EXPORT_MODEL('Exported_CLASS_DT_Model', 'DataMiningDir', 'name in (''CLAS_DT_1_6'')');

Before we can load the newly updated data mining model into the production database, we need to drop the existing model. This must be done when the model is not in use, so it would be advisable to schedule the dropping of the model during a quiet time, like before or after the nightly backups/processes.
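The drop can be done with DBMS_DATA_MINING.DROP_MODEL. The following is a minimal sketch rather than the exact script used here: it assumes it is run in the production schema and that the model currently in use is called CLAS_DT_PROD, a hypothetical name chosen for illustration.

```sql
-- Sketch only: CLAS_DT_PROD is a hypothetical name for the model
-- that is currently in use in the production schema.
BEGIN
  DBMS_DATA_MINING.DROP_MODEL(model_name => 'CLAS_DT_PROD');
END;
/
```

Keeping the drop in its own PL/SQL block makes it easy to call from a scheduled procedure during the quiet period.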


Warning : When importing the data mining model, you need to import into a tablespace that has the same name as the tablespace in the development/test database.  If the USERS tablespace is used in the development/test database, then the model will be imported into the USERS tablespace in the production database.

Hint : Create a DATAMINING tablespace in your development/test and production databases. This tablespace can be used solely for data mining purposes.

To import the decision tree model we exported previously, we would run

EXEC DBMS_DATA_MINING.IMPORT_MODEL('Exported_CLASS_DT_Model', 'DataMiningDir', 'name=''CLAS_DT_1_6''', 'IMPORT', null, null, 'dmuser2:dmuser3');

We now have the new updated data mining model loaded into the production database.

The final step before we can start using the new updated model in our production database is to rename the imported model to the standard name that is being used in the production database.
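The rename can be done with DBMS_DATA_MINING.RENAME_MODEL. This is a sketch under the assumption that the standard production name is CLAS_DT_PROD (a hypothetical name, not taken from the original environment) and that no model with that name still exists in the schema.

```sql
-- Sketch only: CLAS_DT_PROD is a hypothetical "standard" production name.
-- The existing model with that name must already have been dropped.
BEGIN
  DBMS_DATA_MINING.RENAME_MODEL(
    model_name     => 'CLAS_DT_1_6',
    new_model_name => 'CLAS_DT_PROD');
END;
/
```

Any SQL that scores data with the standard model name can then stay unchanged from one model refresh to the next.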


Scheduling of these steps
We can wrap most of this up into stored procedures and schedule them to run on a semi-regular basis using the DBMS_JOB package. The following example schedules a procedure that controls the importing, dropping and renaming of the models.

VARIABLE jobnum NUMBER
EXEC DBMS_JOB.SUBMIT(:jobnum, 'import_new_data_mining_model;', trunc(sysdate), 'add_months(trunc(sysdate), 1)');

This schedules the procedure that imports the new data mining models to run immediately and then every month. The job is only picked up once the submitting session commits.

Saturday, November 5, 2011

What Conference ? If I had the time and money

If I had lots of free time and enough money, what conferences would I go to around the world? I regularly get asked for recommendations on what conferences a person should attend. It all depends on what you want to get out of your conference trip, be it training, education, information building, networking, etc. or to enjoy the local attractions.

The table below is my preferred list of conferences to attend. All of the conferences below are focused on two main areas. The first area is Oracle  and the second area is that of Data Mining/Predictive Analytics.

I hope you find the list useful. If you can recommend some others let me know.

Month Conference


Annual Ireland Oracle Conference – Dublin, Ireland

Predictive Analytics World – USA (San Francisco)

Text Analytics World

Hotsos Symposium


Collaborate (IOUG Conference USA)

Enterprise Data World (USA)

Miracle OpenWorld (Denmark)


OUG Harmony (Finland)


Oracle Development Tools User Group Kaleidoscope (Kscope)

Data Governance – Summer Conference

Oracle Benelux User Group Conference


VirtaThon – Online Oracle Conference


ACM SIGKDD Conference on KDD & Data Mining


Oracle Open World – San Francisco, USA

Predictive Analytics World – USA (New York)

SAS Analytics Conference


TDWI World Conference

Data Governance – Winter Conference (USA)

Predictive Analytics World – UK

International Conference on Data Mining & Engineering (ICDMKE)

Australia Oracle User Group Conference

Germany Oracle User Group Conference (DOAG)


Annual UKOUG Conference – Birmingham, UK

IEEE International Conference on Data Mining (ICDM)

Oracle Open World Latin America

There are a lot of conferences in October, November and December. Some of these are on overlapping dates, which is a pity. Perhaps the organisers of some of these conferences could coordinate their dates. Also, during January and February there do not seem to be any conferences in these areas.

If you would like to sponsor a trip to one or more of these then drop me an email :-)

Thursday, November 3, 2011

ODM 11.2 Data Dictionary Views.

The Oracle 11.2 database contains the following Oracle Data Mining views. These allow you to query the database for the metadata relating to what Data Mining Models you have, what the configurations are and what data is involved.


ALL_MINING_MODELS

Describes the high level information about the data mining models in the database.  Related views include DBA_MINING_MODELS and USER_MINING_MODELS.

Attribute        Data Type         Description
OWNER            Varchar2(30) NN   Owner of the mining model
MODEL_NAME       Varchar2(30) NN   Name of the mining model
MINING_FUNCTION  Varchar2(30)      What data mining function to use
ALGORITHM        Varchar2(30)      Algorithm used by the model
CREATION_DATE    Date NN           Date model was created
BUILD_DURATION   Number            Time in seconds for the model build process
MODEL_SIZE       Number            Size of model in MBytes
COMMENTS         Varchar2(4000)

Let's query my DMUSER2 data mining schema. This was created during a previous post, where we exported some ODM models from one schema and loaded them into the DMUSER2 schema.

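SELECT model_name, mining_function, algorithm, build_duration, model_size
FROM all_mining_models;

MODEL_NAME     MINING_FUNCTION  ALGORITHM                  BUILD_DURATION MODEL_SIZE
-------------  ---------------- -------------------------- -------------- ----------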
CLAS_SVM_1_6   CLASSIFICATION    SUPPORT_VECTOR_MACHINES                     3      .1515
CLAS_DT_1_6    CLASSIFICATION    DECISION_TREE                               2      .0842
CLAS_GLM_1_6   CLASSIFICATION    GENERALIZED_LINEAR_MODEL                    3      .0877
CLAS_NB_1_6    CLASSIFICATION    NAIVE_BAYES                                 2      .0459


ALL_MINING_MODEL_ATTRIBUTES

Describes the attributes of the data mining models.  Related views are DBA_MINING_MODEL_ATTRIBUTES and USER_MINING_MODEL_ATTRIBUTES.

Attribute        Data Type         Description
OWNER            Varchar2(30) NN   Owner of the mining model
MODEL_NAME       Varchar2(30) NN   Name of the mining model
ATTRIBUTE_NAME   Varchar2(30) NN   Name of the attribute
ATTRIBUTE_TYPE   Varchar2(11)      Logical type of attribute:
                                   NUMERICAL – numeric data
                                   CATEGORICAL – character data
DATA_TYPE        Varchar2(12)      Data type of attribute
DATA_LENGTH      Number            Length of data type
DATA_PRECISION   Number            Precision of a fixed point number
DATA_SCALE       Number            Scale of the fixed point number
USAGE_TYPE       Varchar2(8)       Indicates if the attribute was used to create the model (ACTIVE) or not (INACTIVE)
TARGET           Varchar2(3)       Indicates if the attribute is the target

If we take one of the data mining models listed above, we can select what attributes are used by that model:

SELECT attribute_name, attribute_type, usage_type, target
from all_mining_model_attributes
where model_name = 'CLAS_DT_1_6';

ATTRIBUTE_NAME                 ATTRIBUTE_T USAGE_TY TAR
------------------------------ ----------- -------- ---
AGE                            NUMERICAL   ACTIVE   NO
Y_BOX_GAMES                    NUMERICAL   ACTIVE   NO

The first thing to note here is that all the attributes are listed as ACTIVE. This is the default and will be the case for all attributes for all the algorithms, so we can ignore this attribute in our queries, but it is good to check just in case.

The second thing to note is that the last row, AFFINITY_CARD, has a TARGET value of YES. This is the target attribute used by the classification algorithm.
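The same view can be used to list the target attribute of every model in the schema. This is a small sketch, assuming you are connected as the schema that owns the models (for example DMUSER2), so that the USER_ version of the view can be used.

```sql
-- List the target attribute for each model owned by the current schema.
SELECT model_name, attribute_name, attribute_type
FROM   user_mining_model_attributes
WHERE  target = 'YES'
ORDER  BY model_name;
```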


ALL_MINING_MODEL_SETTINGS

Describes the settings of the data mining models. The settings associated with a model are algorithm dependent. The setting values can be provided as input to the model build process. Alternatively, a separate settings table can be used. If no setting values are defined or provided, then the algorithm will use its default settings.

Attribute        Data Type         Description
OWNER            Varchar2(30) NN   Owner of the mining model
MODEL_NAME       Varchar2(30) NN   Name of the mining model
SETTING_NAME     Varchar2(30) NN   Name of the setting
SETTING_VALUE    Varchar2(4000)    Value of the setting
SETTING_TYPE     Varchar2(7)       Indicates whether the default value (DEFAULT) or a user specified value (INPUT) is used by the model

Let's take our previous example of the 'CLAS_DT_1_6' model and query the database to see what the settings are.

column setting_value format a30
select setting_name, setting_value, setting_type
from all_mining_model_settings
where model_name = 'CLAS_DT_1_6';

SETTING_NAME            SETTING_VALUE                SETTING
----------------------- ---------------------------- -------
ALGO_NAME               ALGO_DECISION_TREE           INPUT
PREP_AUTO               ON                           INPUT
TREE_TERM_MINPCT_NODE   .05                          INPUT
TREE_TERM_MINREC_SPLIT  20                           INPUT
TREE_TERM_MINPCT_SPLIT  .1                           INPUT
TREE_TERM_MAX_DEPTH     7                            INPUT
TREE_TERM_MINREC_NODE   10                           INPUT
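To compare settings across models, the SETTING_TYPE column can be used as a filter. The query below is a sketch that assumes you are connected as the schema that owns the models; it lists only the settings that were supplied as input, leaving out those where the algorithm default was used.

```sql
-- Show user-specified (INPUT) settings for every model in the current schema.
SELECT model_name, setting_name, setting_value
FROM   user_mining_model_settings
WHERE  setting_type = 'INPUT'
ORDER  BY model_name, setting_name;
```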

Wednesday, November 2, 2011

Tom Kyte Seminar Day–Dublin

On Wednesday 2nd November, I attended a full day of presentations given by Tom Kyte of Oracle. Tom covered a number of topics, including some of his Oracle Open World presentations.

The topics that were covered included

  • 5 things about SQL (OOW11)
  • Database Option Packs
  • 5 things about PL/SQL (OOW11)
  • Q&A Ask Tom Session

All of these presentations can be downloaded from Tom’s website.

Tom won’t be presenting at the annual UKOUG conference in December, but he is hoping to be there next year (2012).


Monday, October 31, 2011

ODM 11.2–Data Mining PL/SQL Packages

The Oracle 11.2 database contains 3 PL/SQL packages that allow you to perform all (well almost all) of your data mining functions.

So instead of using the Oracle Data Miner tool you can write some PL/SQL code that will allow you to do the same things.

Before you can start using these PL/SQL packages you need to ensure that the schema that you are going to use has been setup with the following:

  • Create a schema or use an existing one
  • Grant the schema all the data mining privileges: see my earlier posting and YouTube video on how to set up an Oracle schema for data mining
  • Grant all necessary privileges to the data that you will be using for data mining

The first PL/SQL package that you will use is DBMS_DATA_MINING_TRANSFORM. This PL/SQL package allows you to transform the data to make it suitable for data mining. There are a number of functions in this package that allow you to transform the data, but depending on the data you may need to write your own code to perform the transformations. When you apply your data model to the test or the apply data sets, ODM will automatically take the transformation functions defined using this package and apply them to the new data sets.

The second PL/SQL package is DBMS_DATA_MINING. This is the main data mining PL/SQL package. It contains functions to allow you to:

  • Create a Model
  • Describe the Model
  • Export and import Models
  • Compute costs and test metrics for classification Models
  • Apply the Model to new data
  • Administer Models, like dropping, renaming, etc.
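As an illustration of the first item, here is a minimal sketch of building a Decision Tree model with DBMS_DATA_MINING.CREATE_MODEL. The names are assumptions made for the example: DEMO_DT_SETTINGS and DEMO_CLAS_DT are hypothetical, and MINING_DATA_BUILD_V with its CUST_ID and AFFINITY_CARD columns is assumed to be the Oracle sample view being available in your schema.

```sql
-- Sketch only: DEMO_DT_SETTINGS and DEMO_CLAS_DT are hypothetical names.
-- MINING_DATA_BUILD_V (case id CUST_ID, target AFFINITY_CARD) is assumed
-- to exist in the current schema.
CREATE TABLE demo_dt_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

BEGIN
  -- Choose the algorithm and switch on automatic data preparation
  INSERT INTO demo_dt_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.algo_name, dbms_data_mining.algo_decision_tree);
  INSERT INTO demo_dt_settings (setting_name, setting_value)
  VALUES (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
  COMMIT;

  -- Build the classification model using the settings table
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'DEMO_CLAS_DT',
    mining_function     => dbms_data_mining.classification,
    data_table_name     => 'MINING_DATA_BUILD_V',
    case_id_column_name => 'CUST_ID',
    target_column_name  => 'AFFINITY_CARD',
    settings_table_name => 'DEMO_DT_SETTINGS');
END;
/
```

Any setting not listed in the settings table falls back to the algorithm default.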

The next (and last) PL/SQL package is DBMS_PREDICTIVE_ANALYTICS. The routines included in this package allow you to prepare data, build a model, score a model and return the results of model scoring. The routines include EXPLAIN, which ranks attributes in order of influence in explaining a target column; PREDICT, which predicts the value of a target attribute based on values in the input; and PROFILE, which generates rules that describe the cases from the input data.
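A short sketch of the EXPLAIN routine is shown here. It again assumes the sample view MINING_DATA_BUILD_V with an AFFINITY_CARD column is available; DEMO_EXPLAIN_RESULT is a hypothetical name for the result table, which the routine creates.

```sql
-- Sketch only: DEMO_EXPLAIN_RESULT is a hypothetical result table name.
-- The table is created by EXPLAIN and must not already exist.
BEGIN
  DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
    data_table_name     => 'MINING_DATA_BUILD_V',
    explain_column_name => 'AFFINITY_CARD',
    result_table_name   => 'DEMO_EXPLAIN_RESULT');
END;
/

-- Attributes with the most influence on AFFINITY_CARD come first
SELECT *
FROM   demo_explain_result
ORDER  BY explanatory_value DESC;
```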

Over the coming weeks I will have separate blog posts on each of these PL/SQL packages. These will cover the functions that are part of each package and will include some examples of using the packages and functions.

Saturday, October 29, 2011

ODM PL/SQL API 11.2 New Features

The PL/SQL API for Oracle Data Miner has had a number of new features added in 11.2. These are listed below along with the new API features added with the 11.1 release.

  • Support for Native Transactional Data with Association Rules: you can build association rule models without first transforming the transactional data.
  • SVM class weights specified with CLAS_WEIGHTS_TABLE_NAME: including the GLM class weights
  • FORCE argument to DROP_MODEL: you can now force a drop model operation even if a serious system error has interrupted the model build process
  • GET_MODEL_DETAILS_SVM has a new REVERSE_COEF parameter: you can obtain the transformed attribute coefficients used internally by an SVM model by setting the new REVERSE_COEF parameter to 1

11.1g API New Features

  • Mining Model schema objects: in previous releases, DM models were implemented as a collection of tables and metadata within the DMSYS schema. In 11.1, models are implemented as data dictionary objects in the SYS schema. A new set of DD views present DM models and their properties
  • Automatic and Embedded Data Preparation: previously data preparation was the responsibility of the user. Now it can be automated
  • Scoping of Nested Data: supports nested data types for both categorical and numerical data. Most algorithms require multi-record case data to be presented as columns of nested rows, each containing an attribute name/value pair. ODM processes each nested row as a separate attribute.
  • Standardised Handling of Sparse Data & Missing Values: standardised across all algorithms.
  • Generalised Linear Models: has a new algorithm and supports classification (logistic regression) and regression (linear regression)
  • New SQL Data Mining Function: PREDICTION_BOUNDS has been introduced for Generalised Linear Models. This returns the confidence bounds on predicted values (regression models) or predicted probabilities (classification)
  • Enhanced Support for Cost-Sensitive Decision Making: cost matrices can be added or removed using DBMS_DATA_MINING.ADD_COST_MATRIX and DBMS_DATA_MINING.REMOVE_COST_MATRIX.
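To give a feel for the PREDICTION_BOUNDS function mentioned in the list, here is a hedged sketch. DEMO_GLM_REGR is a hypothetical GLM regression model, and MINING_DATA_APPLY_V with a CUST_ID column is assumed to be the data set being scored; the 0.95 confidence level is just an example value.

```sql
-- Sketch only: DEMO_GLM_REGR is a hypothetical GLM regression model and
-- MINING_DATA_APPLY_V is assumed to be the data to score.
SELECT cust_id,
       PREDICTION(demo_glm_regr USING *)                    AS predicted_value,
       PREDICTION_BOUNDS(demo_glm_regr, 0.95 USING *).LOWER AS lower_bound,
       PREDICTION_BOUNDS(demo_glm_regr, 0.95 USING *).UPPER AS upper_bound
FROM   mining_data_apply_v;
```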

Friday, October 21, 2011

Interesting quotes from Predictive Analytics World

The Predictive Analytics World conference is finishing up today in New York. Over the past few days the conference has had some of the leading analytic type people presenting at it.

Twitter, as usual, has been busy and there have been some very interesting and important quotes.

The list of tweets (#pawcon) below are the ones I found most interesting:

Manu Sharma from LinkedIn: "Guru" job title is down, "Ninja" is up.

Despite the "data science" buzz, the biggest skill among #pawcon attendees is #DataMining

Andrea Medinaceli: Visualization is very powerful for making analytics results accessible to upper management (and for buy-in)

Social Network Analytics (SNA) with Zynga, 20M daily active users, 90M monthly active users; 10K nodes, 45K edges (big!)

Vertica: Zynga is an analytics company in the disguise of a gaming company; graph analytics find users/influencers

Colin Shearer: Find me something interesting in my data is a question from hell (analysis should be guided by business goals)

John Elder advocates ensemble methods - usually improve analytics results

Tom Davenport: to get real value, #analytics need to move from one-time craft to industrialized activity

10 years from now all Fortune 500 companies will have a Chief Analytics Officer at the level of COO or CFO

Must be a sign of the economy, so much of the focus on the value of predictive is on retaining customers. #PAWCON.

Tom Davenport: #Analytics is not about math, it is about relationships (with your business client) - says Intel Chief Mathematician

Karl Rexer: companies with higher analytic capabilities are doing better than their peers

Wednesday, October 19, 2011

ODM API Demos in PL/SQL (& Java)

If you have been using Oracle Data Miner to develop your data mining workflows and models, at some point you will want to move away from the tool and start using the ODM APIs.

Oracle Data Mining provides a PL/SQL API and a Java API for creating supervised and unsupervised data mining models. The two APIs are fully interoperable, so that a model can be created with one API and then modified or applied using the other API.

I will cover the Java APIs in a later post, so watch out for that.

To help you get started with using the APIs there are a number of demo PL/SQL programs available. These were available as part of the pre-11.2g version of the tool, but they don’t seem to be packaged up with the 11.2 (SQL Developer 3) application.

The following table gives a list of the PL/SQL demo programs that are available. Although these were part of the pre-11.2g tool, they still seem to work on your 11.2g database.

You can download a zip of these files from here.

The sample PL/SQL programs illustrate each of the algorithms supported by Oracle Data Mining. They include examples of data transformations appropriate for each algorithm.


I will be exploring the main APIs, how to set them up, the parameters, etc.,  over the next few weeks, so check back for these posts.

Tuesday, October 18, 2011

Book Donation by Oracle

Today I received two boxes, containing 48 books of

The Performance Management Revolution by Howard Dresner


These books have been kindly donated by Duncan Fitter, UK Business Development Director at Oracle.

I will be distributing these books to my MSc Data Mining students over the next week.

Thanks Duncan and Oracle