Wednesday, March 5, 2014
Monday, March 3, 2014
The main conference event is on Tuesday 11th March in the DCC in Dublin. Things kick off at 9:20 with Debra Lilley welcoming everyone to the event. Then Jon Paul from Oracle in Ireland will give the opening keynote. Then we break into the 7 streams, with lots of local case studies and some well-known speakers from around the world, including many Oracle ACEs and ACE Directors (my presentation is at 12:15).
The day ends with 2 keynote presentations: one focused on the Apps streams (Nadia Bendjedou, Oracle) and a separate keynote for the Tech streams (by Tom Kyte).
Throughout the day there will be a RAC Attack event. Look out for their tables in the exhibition hall. Again there will be some well-known experts from around the world on hand to help you get RAC set up and running on your own laptop, answer your questions and engage in lots of discussions about all things Oracle. The RAC Attack Ninjas will include Osama Mustafa, Philippe Fierens, Marcin Przepiorowski, Martin Bach and Tim Hall. Some of these are giving presentations throughout the day, so when they are not presenting you will find them at the RAC Attack table. Even if you are not going to install RAC, drop by and have a chat with them.
On Wednesday 12th March the OUG Ireland Conference ventures into a second day of sessions. This will be a full day of topics by Tom Kyte. This is certainly a day not to be missed. Places are limited, so book your place today.
Click on the following image to view the agenda for the 2 days and to book your place on the 11th and 12th March.
Sunday, March 2, 2014
Wednesday, February 26, 2014
Jeff Smith has a blog post on some of the bug fixes in SQL Developer.
Kris Rice also has a blog post on the new updated release.
So what about Oracle Data Miner? There seem to be a couple of minor new features, such as being able to select statistical outputs for the Transform node. Also, the model and test results viewers now automatically refresh if they are open. ODM can now be installed on Oracle Personal Edition (I haven't tried this out yet).
Plus the Graph node can now have line charts based on multiple y-axis attributes. I'll have a blog post on this soon.
Thursday, February 20, 2014
People keep asking me what is the best way to test their data mining models, with most people expecting that they have to do lots and lots of statistics. They are then confused when I say 'Oh no you don't'. All you need to do is follow the approaches detailed in the article. One thing these approaches all have in common is that they keep the business problem in mind, and what the results you obtain mean for that business problem.
- Lift charts and decile tables to compare performance against random results
- Target shuffling to determine validity of the results
- Bootstrap sampling to test the consistency of the model
OK, some statistics are used, but not too many!
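As an illustration, target shuffling can be sketched in SQL by randomly re-pairing the target column with the predictor rows; a model rebuilt on the shuffled data should perform no better than chance. The table and column names below (MINING_DATA_BUILD_V, AFFINITY_CARD, CUST_ID) are the usual Oracle sample-schema names and are my assumptions, not from the article.

```sql
-- Hypothetical sketch: build a copy of the data where the target
-- (AFFINITY_CARD) has been randomly shuffled across the rows.
CREATE TABLE mining_data_shuffled AS
SELECT d.*, s.affinity_card AS shuffled_target
FROM  (SELECT t.*, ROW_NUMBER() OVER (ORDER BY cust_id) rn
       FROM   mining_data_build_v t) d
JOIN  (SELECT affinity_card, ROW_NUMBER() OVER (ORDER BY dbms_random.value) rn
       FROM   mining_data_build_v) s
ON    d.rn = s.rn;
```

A model trained against SHUFFLED_TARGET should score close to random; if it still looks good, the original result is suspect.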
View highlights from the report below or read it in its entirety here. Alternatively have a look at the article summary on SlideShare.
Friday, February 14, 2014
The agenda for OUG Ireland 2014 is now live. You can view the agenda and register for the event by clicking on the following link.
Over the past couple of weeks some of the presenters have been using Twitter to share the news that they will be presenting at OUG Ireland. If you are not following them on Twitter, now is the time. So here is the list (in no particular order), and I'll start it off with myself:
Brendan Tierney @brendantierney
Debra Lilley @debralilley
Tom Kyte @OracleAskTom
Tim Hall @oraclebase
Jon Paul @jonpauldublin
Roel Hartman @RoelH
Uli Bethke @ubethke
Antony Heljula @aheljula
Stewart Bryson @stewartbryson
Patrick Hurley @phurley
Joel Goodman @JoelJGoodman
Philippe Fierens @pfierens
Simon Haslam @simon_haslam
Martin Nash @mpnsh
Uwe Hesse @UweHesse
Martin Bach @MartinDBA
If I'm missing anyone, let me know and I'll add you to the list.
Thursday, February 6, 2014
UPDATED list of conferences
We are just a few weeks into 2014 and it has been a busy time with Oracle User Group Conferences.
January : BIWA Summit
In January I gave 2 presentations at the BIWA Summit. This conference was held in the Oracle Conference Center at Oracle Head Office.
March : OUG Ireland
I also have one presentation at the OUG Ireland conference in Dublin on the 11th March. As always this is a great day filled with sessions from well-known speakers from around the world. This year we will have 6 tracks packed full. It is also a great opportunity to catch up with some friends I have known for 20+ years. Click on the following image for details of the agenda and how to register for the conference.
April : OUG Norway
I’ve also received notice that I will have 2 presentations at the Norway Oracle User Group. I’m delighted with this, as I was at this conference last year and really enjoyed it. This conference will be on the ship again this year between April 3-5, getting back into Oslo around 10am on the 5th April. Click on the following image for details of the agenda and how to register for the conference.
June : OUG Finland
In the past few days I’ve also received news that I will have 2 presentations at the Finland Oracle User Group conference. This will be my first time in Finland and I hope to get a few hours to do some exploring of Helsinki when I’m there. One of my presentations will be on using Oracle Data Miner and the second presentation will be on using R in the Oracle Database (or more correctly Oracle R Enterprise). Click on the following link for more details of the conference.
Hopefully I will see you at one of these conferences. Do make sure you say hello to me and let me know if you have any questions about the Oracle Advanced Analytics Option.
Second half of 2014
This second half of 2014 will probably be a bit quieter, but hopefully I’ll be at Oracle Open World in September (speaking or not) and also at the UKOUG Annual Conference (TECH14 or whatever it will be called) in December (speaking or not).
My travels (flights and hotel costs) to present at these conferences are made possible thanks to the Oracle ACE Director program, and thanks also to DIT for allowing me to go.
Wednesday, January 29, 2014
Oracle has made available the BigDataLite VM appliance to download. This VM is for evaluation purposes only and is a great way to try out the various products that Oracle has in the Big Data area.
Another major advantage of downloading and using the VM is that you don't have the "fun" of trying to install everything yourself and getting everything configured and working together.
The BigDataLite 2.4.1 VM comes with the following:
- Oracle Database 12c (12.1)
- Cloudera’s Distribution including Apache Hadoop (CDH4.5)
- Cloudera Manager 4.8
- Oracle Big Data Connectors 2.4
- Oracle NoSQL Database 2.1.54
- Oracle JDeveloper 11g
- Oracle SQL Developer 4.0
- Oracle Data Integrator 12c R1
- Oracle R Distribution 3.0.1
There are a number of Hands-on-Labs that you can run on the VM and it comes with the MoviePlex demo data.
Get all the details and links for downloads at
WARNING: you will need a decent spec PC or laptop to host this VM. The recommendation is that you can dedicate 2 cores, at least 5GB of RAM and more than 30GB of disk to the VM, and the install requires ~40GB of space. So this might not be for everyone.
Wednesday, January 15, 2014
The agenda has just gone live for the OUG Ireland Conference that will be on 11th March, 2014. The conference will again be in the Dublin Convention Centre (DCC). I'm on the conference committee again this year. Part of the duties includes presentation selection and agenda planning.
This year we had over 100 submissions from well-known experts from around the world and from a variety of customer case studies. With a limited number of slots available, some very difficult decisions had to be made. To include everyone that I wanted to present at the conference we would need to run it over 3 days. Sadly this was not possible.
A new feature of the conference this year is that Tom Kyte will be giving a full day of sessions the day after the conference. This will be a paid-for event.
To view the agenda for the conference you can click on the image below.
To register for the conference and the extra one-day workshops, either with Tom Kyte or the 12c workshop with Joel Goodman and Uwe Hesse, go to the following link.
I have one presentation at the conference, so hopefully I’ll see you there.
Friday, January 10, 2014
Cloudera and other sites have made available a number of resources to help all of us get up to speed with using Hadoop etc.
So if you are starting out with Hadoop here is a short list of key resources that I have found very useful.
Saturday, December 21, 2013
The BIWA Summit 2014 is on from January 14th-16th, and is located in the Oracle Conference Center, at Oracle Head Office, in Redwood City (CA, USA). This conference is organised by a very dedicated and experienced group of people, including some very senior people in Oracle who are responsible for various analytics offerings from Oracle.
I presented at this conference last January (2013), and I've been tempted into presenting again in January (2014).
The conference has been expanded with more parallel tracks, a hands-on track, and a meet the experts/presenters session. So lots and lots more content and learning experiences.
I will be giving two presentations. The first one is on how universities in the UK are using Oracle Data Miner and OBIEE to manage their student churn. I gave this presentation at Oracle Open World (Sept, 2013) along with Antony Heljula from Peak Indicators. This time (Tuesday 14th @10am) I'll be giving the presentation on my own. My second presentation is a demonstration of how you can use Oracle Data Miner to do Sentiment Analysis using a sample data set from Kaggle (Wednesday 15th @11:15am). I've given this presentation a couple of times already and the feedback that I keep on hearing is 'I didn't know you could do that in Oracle'. So it is an alternative to using Endeca, R and any of the other tools that we keep on hearing about. Instead we can just use SQL.
If you come to one of my presentations make sure you ask me for one of my Oracle Data Scientist conference ribbons. I got these made up for Oracle Open World and there was lots of interest in them.
I've agreed to take part in the meet the experts/presenters sessions. This is where attendees at the conference can sign up for a 15 minute 1-to-1 slot with one of the experts/presenters. I'll be available for this from 3pm on Wednesday 15th. If you would like to sign up for one of these slots then there will be a sign-up sheet at the conference. I will be hanging out at the conference for most of the 2.5 days, so do make sure you say hello at some stage.
The full agenda is live (subject to change of course) and can be found by clicking on the image below
Hopefully I’ll see you there.
Friday, December 13, 2013
The production releases of SQL Developer 4 and Oracle Data Miner 4 have just been made available. If you are like me you will want to upgrade and start using this latest release. For me, I particularly want to be using the new Oracle Data Miner 4. Over the past (almost) 6 months I've been working with the Early Adopter versions (EAs) with some degree of frustration, so hopefully it will all be working now.
To download the production version of SQL Developer 4, which includes Oracle Data Miner, go here.
The following are the steps that I followed to get SQL Developer installed and to migrate my Oracle Data Miner Repository. I'm running a 12c (12.1) Oracle Database.
1. Download and unzip the SQL Developer software. Go to the \sqldeveloper folder to locate the sqldeveloper.exe file. I created a shortcut on my desktop for this. When ready then run this file.
2. As SQL Developer is opening you will get the typical splash screen and at some point you will be asked about migrating your preferences from your previous release. In my case I’m migrating from EA1. I select Yes.
After a few more seconds SQL Developer should open with all your previous settings.
3. Now to update and migrate your existing Oracle Data Miner Repository to the new version. To start this process, go to the Tools menu and then select Data Miner –> Make Visible.
This will open the Oracle Data Miner Connections tab and the Workflow Jobs tab. If you don't do this step then your Oracle Data Miner workflows may not run.
4. Double click on one of your schemas in the Data Miner Connection tab.
5. Before you upgrade your repository it is advisable to take a full backup of your database, and to export your workflows, just in case anything happens during the repository upgrade. I cannot stress this enough, because during a previous upgrade my repository got wiped and I had to rely on my backups.
6. The version of the repository will be checked and if it needs updating then you will get the following window. I'm migrating from EA1, so you might get a slightly different message; it all depends on what version you were previously using. Select Yes.
7. Next you will need to give the SYS password (or talk nicely to your DBA). Then you will get a warning about disconnecting your session from the repository. Click OK.
Then you can click on the Start button.
Everything should finish after a few minutes.
8. Open one of your workflows and run it to make sure all is OK.
Based on my initial few hours of working with the production versions of SQL Developer 4 and Oracle Data Miner 4, they seem to run a lot quicker than the Early Adopter versions.
Watch out for some blog posts over the coming weeks about some of the new features that are available in SQL Developer 4. Like my previous blog posts, the new posts will be how-to type of articles.
Wednesday, December 11, 2013
As your data volumes increase, particularly as you evolve into the big data world, you will start to see that your Oracle Data Mining scoring functions take longer and longer. Applying an Oracle Data Mining model to new data is a very quick process. The models are, what Oracle calls, first class objects in the database. This basically means that they run very quickly with very little overhead.
But as the data volumes increase you will start to see that your Apply process, or scoring the data, takes longer and longer. As with all OLTP or OLAP environments, as the data grows you will start to use other in-database features to help your code run quicker. One example of this is the Parallel option.
You can use the Parallel option to run your Oracle Data Mining functions in real-time and in batch processing mode. The examples given below show how you can do this.
Let us first start with some basics: what are the typical commands necessary to set up our schema or objects to use Parallel? The following commands are examples of what we can use:
ALTER session enable parallel dml;
ALTER TABLE table_name PARALLEL (DEGREE 8);
ALTER TABLE table_name NOPARALLEL;
CREATE TABLE … PARALLEL degree …
ALTER TABLE … PARALLEL degree …
CREATE INDEX … PARALLEL degree …
ALTER INDEX … PARALLEL degree …
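To make those generic commands concrete, here is a small hypothetical example (the table and source names are illustrative, borrowed from the examples later in this post) that creates a table with a default degree of parallelism and then switches it back to serial:

```sql
-- Create a scoring table with a default degree of parallelism of 4
CREATE TABLE new_data_to_score
PARALLEL (DEGREE 4)
AS SELECT * FROM mining_data_apply_v;

-- Later, switch the table back to serial processing
ALTER TABLE new_data_to_score NOPARALLEL;
```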
You can force parallel operations for tables that have a degree of 1 by using the force option.
ALTER SESSION ENABLE PARALLEL DDL;
ALTER SESSION ENABLE PARALLEL DML;
ALTER SESSION ENABLE PARALLEL QUERY;
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 2;
You can disable parallel processing with the following session statements.
ALTER SESSION DISABLE PARALLEL DDL;
ALTER SESSION DISABLE PARALLEL DML;
ALTER SESSION DISABLE PARALLEL QUERY;
We can also tell the database what degree of parallelism to use:
ALTER SESSION FORCE PARALLEL DDL PARALLEL 32;
ALTER SESSION FORCE PARALLEL DML PARALLEL 32;
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 32;
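As a quick sanity check, you can see what degree a table is currently set to by querying the data dictionary (the table name here is illustrative):

```sql
SELECT table_name, degree
FROM   user_tables
WHERE  table_name = 'NEW_DATA_TO_SCORE';
```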
Using your Oracle Data Mining model in real-time using Parallel
When you want to use your Oracle Data Mining model in real-time, on one record or a set of records, you will be using the PREDICTION and PREDICTION_PROBABILITY functions. The following example shows how a Classification model is applied to some data in a view called MINING_DATA_APPLY_V.
column prob format 99.99999
SELECT cust_id,
       PREDICTION(DEMO_CLASS_DT_MODEL USING *) Pred,
       PREDICTION_PROBABILITY(DEMO_CLASS_DT_MODEL USING *) Prob
FROM   mining_data_apply_v
WHERE  rownum <= 18;
CUST_ID PRED PROB
---------- ---------- ---------
100574 0 .63415
100577 1 .73663
100586 0 .95219
100593 0 .60061
100598 0 .95219
100599 0 .95219
100601 1 .73663
100603 0 .95219
100612 1 .73663
100619 0 .95219
100621 1 .73663
100626 1 .73663
100627 0 .95219
100628 0 .95219
100633 1 .73663
100640 0 .95219
100648 1 .73663
100650 0 .60061
If the volume of data warrants the use of the Parallel option then we can add the necessary hint to the above query as illustrated in the example below.
SELECT /*+ PARALLEL(mining_data_apply_v, 4) */
       cust_id,
       PREDICTION(DEMO_CLASS_DT_MODEL USING *) Pred,
       PREDICTION_PROBABILITY(DEMO_CLASS_DT_MODEL USING *) Prob
FROM   mining_data_apply_v
WHERE  rownum <= 18;
If you turn on autotrace you will see that Parallel was used. So you should now be able to use your Oracle Data Mining models on a very large number of records, and by adjusting the degree of parallelism you can achieve further improvements.
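If you want to verify this for yourself, one way (a sketch, using SQL*Plus) is autotrace or DBMS_XPLAN; look for PX (parallel execution) steps in the plan:

```sql
SET AUTOTRACE TRACEONLY EXPLAIN

-- Or generate and display the plan explicitly
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(mining_data_apply_v, 4) */
       cust_id,
       PREDICTION(DEMO_CLASS_DT_MODEL USING *) Pred
FROM   mining_data_apply_v;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```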
Using your Oracle Data Mining model in Batch mode using Parallel
When you want to perform some batch scoring of your data using your Oracle Data Mining model you will have to use the APPLY procedure that is part of the DBMS_DATA_MINING package. But the problem with using a procedure or function is that you cannot give it a hint to tell it to use the Parallel option. So unless you have the table(s) set up with parallel and/or the session set to use parallel, you cannot run your Oracle Data Mining model in parallel using the APPLY procedure.
So how can you get the DBMS_DATA_MINING.APPLY procedure to run in parallel?
The answer is that you can use the DBMS_PARALLEL_EXECUTE package. The following steps walk you through what you need to do to use the DBMS_PARALLEL_EXECUTE package to run your Oracle Data Mining models in parallel.
The first step required is for you to put the DBMS_DATA_MINING.APPLY code into a stored procedure. The following code shows how our DEMO_CLASS_DT_MODEL can be used by the APPLY procedure and how all of this can be incorporated into a stored procedure called SCORE_DATA.
create or replace procedure score_data
as
begin
   dbms_data_mining.apply(
      model_name          => 'DEMO_CLASS_DT_MODEL',
      data_table_name     => 'NEW_DATA_TO_SCORE',
      case_id_column_name => 'CUST_ID',
      result_table_name   => 'NEW_DATA_SCORED');
end;
/
Next we need to create a Parallel Task for the DBMS_PARALLEL_EXECUTE package. In the following example this is called ODM_SCORE_DATA.
-- Create the TASK
BEGIN
   DBMS_PARALLEL_EXECUTE.CREATE_TASK('ODM_SCORE_DATA');
END;
/
Next we need to define the Parallel Workload Chunks details
-- Chunk the table by ROWID
BEGIN
   DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('ODM_SCORE_DATA', 'DMUSER', 'NEW_DATA_TO_SCORE', true, 100);
END;
/
The scheduled jobs take an unassigned workload chunk, process it and will then move onto the next unassigned chunk.
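While the task is running you can watch the chunks being picked up and processed by querying USER_PARALLEL_EXECUTE_CHUNKS, the dictionary view that this package maintains:

```sql
SELECT chunk_id, status, start_rowid, end_rowid
FROM   user_parallel_execute_chunks
WHERE  task_name = 'ODM_SCORE_DATA'
ORDER  BY chunk_id;
```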
Now you are ready to execute the stored procedure for your Oracle Data Mining model, in parallel with a parallel level of 10.
-- Execute the DML in parallel
DECLARE
   l_sql_stmt  VARCHAR2(100);
BEGIN
   l_sql_stmt := 'begin score_data(); end;';
   DBMS_PARALLEL_EXECUTE.RUN_TASK('ODM_SCORE_DATA', l_sql_stmt, DBMS_SQL.NATIVE,
                                  parallel_level => 10);
END;
/
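Once RUN_TASK returns, it is worth checking that the task completed successfully before cleaning up; the task status is available in USER_PARALLEL_EXECUTE_TASKS:

```sql
SELECT task_name, status
FROM   user_parallel_execute_tasks
WHERE  task_name = 'ODM_SCORE_DATA';
```

If the status is not FINISHED, DBMS_PARALLEL_EXECUTE.RESUME_TASK can be used to retry the failed chunks.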
When everything is finished you can then clean up and remove the task using:
BEGIN
   DBMS_PARALLEL_EXECUTE.DROP_TASK('ODM_SCORE_DATA');
END;
/
NOTE: The schema that will be running the above code will need to have the necessary privileges to run DBMS_SCHEDULER, for example
grant create job to dmuser;
Tuesday, December 3, 2013
If you are brave enough to be using the early adopter releases of ODMr you may have run into the issue with your workflows not running.
When you go to run your workflow you will get the following window and nothing else happens.
To get past this you will need to kill SQL Developer using the task manager or equivalent.
So how do you stop this from happening so that you can get your workflows to run? The simple solution is that you need to have the workflow tab open for the workflow to run correctly.
To do this you need to make Oracle Data Miner visible, by selecting Tools from the menu, then Data Miner and finally Make Visible
Then you will need to go to the View menu option, then select Data Miner and then Workflow Jobs
Now your workflows will work and complete.
Hopefully this will be fixed in the production release of ODMr 4 (SQL Developer 4)