Tuesday, April 22, 2014

Installing ORE - Part A

This blog post will look at how you can go about installing ORE in your environment.

The install involves a 4 steps. The first step is the install on the Oracle Database server. The second step involves the install on your client machine. The third steps involves creating a schema for ORE. The fourth steps is connecting to the database using ORE.

In this Part A blog post I will cover the first two steps in this process. The other steps will be coved in another blog post.

NB : A the time of writing this blog post ORE 1.4 cannot be installed on a 12c database if it has a CDB/PDB configuration. If you want to use ORE with 12c then you need to do a traditional install that does not create a CDB with a PDB. The ORE team are working hard on this and I'm sure it will be available in the next release (or two or ...) of ORE.

Step 1 : Installing ORE on the Database Server

Before you being looking at ORE you need to ensure that you have the correct version of database. If you have version 11.2.0.3 or 11.2.0.4 then you can go ahead and perform the installation below. But if you have 11.2.0.1 or 11.2.0.2 then you will need to apply a patch to your database. See my note above about 12c.

Download the Oracle R Distribution from their website. Download here.

Although you can use the standard version of R, Oracle R Distribution comes with some highly tuned packages. If you are going to use the standard R download then you will need to ensure that you download the correct version. ORE 1.4 will require R version 3.0.1. Yes this is not the current version of R.

Accept at the defaults during the installation of ROracle, and within a minute or two ROracle will be installed.

Download the Oracle R Enterprise software. Download here. This will include the Server and Supporting downloads.

Uncompress the downloaded ORE files and go to the server directory. Here you will find the install.bat (other other similar name for your platform).

Make sure your ORACLE_HOME and ORACLE_SID environment variables are set.

A number of environment and environment variables are checked. When prompted accept the defaults.

When prompted for the password for the RQSYS user, enter an appropriate password and take careful note of it.

Now go back to the Oracle download page for ORE and download the supporting packages. Unzip the downloaded file. Noting the directory that they were installed in you can now load them in R. To do this open R and run the following commands. You will need to change the directory to where these are located on your server.

install.packages("C:/app/supporting/ROracle_1.1-11.zip", repos=NULL)

install.packages("C:/app/supporting/DBI_0.2-7.zip", repos=NULL)

install.packages("C:/app/supporting/png_0.1-7.zip", repos=NULL)

install.packages("C:/app/supporting/cairo_1.5-5.zip", repos=NULL)


Or you can use the R Gui to import these packages

WARNING:If you are installing on a Windows server you may encounter some issues when importing these packages. I will have a separate blog post on this soon.

NB: The ORE installation instructions make reference to Cario-_1.5-2.zip. This is incorrect. ORE 1.4 comes with Cario-_1.5-5.zip.

At this point, assuming you didn't have any errors, you now have ORE installed on your server.


Step 2 : Installing ORE on the Client

Download the Oracle R Distribution from their website. Download here.

NOTE: If your database and client are on the one machine then there is no need to install ROracle again.

The client install is much simpler and less involved. After you have installed ROracle the next step is to install the client packages for ORE. These can be downloaded from here.

After you have unzipped the file you can use the import packages from zip feature of the R Gui tool or using RStudio. Then import the supporting packages that you also installed as part of the server install.

Now you can install the supporting packages. Unzip them and then use the R Gui or RStudio to importing them. These supporting packages can be downloaded from here.

That should be the client R software and ORE packages installed on your client machine. The next steps is to test a connection to your Oracle database using ORE. Before you can do that you will need to setup a Schema in the database to use R and also grant the necessary privileges to your other schemas that you want to access using R


Check out my next blog post (Installing ORE - Part B) for Steps 3 and 4.

Monday, April 14, 2014

Oracle R Enterprise and Oracle 12c

A few of weeks ago we had the release of Oracle R Enterprise (ORE).

There has been some posts on the R/ORE on the Oracle discussion forums about installing ORE on Oracle 12c.

It turns out that the only way to install ORE on an Oracle 12c database is if you do a traditional install. What this means is that you do not have a CDB and PDBs configuration of Oracle 12c.

I'll assume that Oracle are currently working on this particular issue, as you can imagine that that there is considerable amount of complexity in getting ORE to work with the PDBs.

If you are not using Oracle 12c then you are OK, as long as you are using 11.2.0.3 or 11.2.0.4 versions of the database. If you are using a lower version of the 11.2 database then you need to apply a patch to allow ORE to run.

As they say I'm sure it will be "fixed in the next release" :-)

Oracle Advanced Analytics and Oracle Fusion Apps

At a recent Oracle User Group conference, I was part of a round table discussion on Apps and BI. Unfortunately most of the questions were focused on Apps and the new Fusion Applications from Oracle. I mentioned that there was data mining functionality (using the Oracle Advanced Analytics Option) built into the Fusion Apps, it seems to come as a surprise to the Apps people. They were not aware of this built in functionality and capabilities. Well Oracle Data Mining and Oracle Advanced Analytics has been built into the following Oracle Fusion Applications.
  • Oracle Fusion HCM Workforce Predictions
  • Oracle Fusion CRM Sales Prediction Engine
  • Oracle Spend Classification
  • Oracle Sales Prospector
  • Oracle Adaptive Access Manager
Oracle Data Mining and Oracle Advanced Applications are also being used in the following applications:
  • Oracle Airline Data Model
  • Oracle Communications Data Model
  • Oracle Retail Data Model
  • Oracle Security Governor for Healthcare
I intend to submit a presentation on this topic to future Oracle User Group conferences as a way of spreading the Advanced Analytics message within the Oracle user community. If you would like me to present on this topic at your conference or SIG drop me an email and we can make the necessary arrangement :-)

Friday, April 11, 2014

Oracle 11.2g install on OLE 6.x

This notes are really just a reminder to myself of the typical "issues" that I encounter every time I do a new install of OEL 6.x and 11.2.0.4

These notes are in addition to the excellent installation instructions given by oracle-base.com: oel install, DB 11.2.0.x install

The notes listed below are just a reminder to myself of things that I seem to always have to look up. If you finish them useful then great.

1. Display issue & Installer not able to run

install says to do xhost +:0.0 this can give an error

instead do host +:0.0 and that should allow the installer to run


2. Now enough swap space when installer checks the pre-requisites

Need to add an addition 500M to the swap space


su (and then enter the password)

dd if=/dev/zero of=/tmp/swapfile bs=1M count=500

mkswap /tmp/swapfile

swapon /tmp/swapfile

exit (to return to the oracle user)


you can then turn off the extra space (if you really need to) after the install is finished


swapoff /tmp/swapfile

rm /tmp/swapfile


3. Post-Installation task

don't forget the final step, to set to restart flag

as root

vi /etc/oratab

change the following line to have the Y at the end (instead of the N)

DB11G:/u01/app/oracle/product/11.2.0/db_1:Y


4. Set up the automated start and stop of the DB

Again Oracle-Base gives an excellent set of instructions for doing this. Click here.

Wednesday, April 9, 2014

Oracle Text and Oracle Data Miner

This blog post is a follow up to comment on a previous blog post and to some emails.

Basically the people are asking about some messages they get when they open the Oracle Data Miner tool, that is part of SQL Developer.

If you are just using the SQL and PL/SQL functions in the database then you do not have to worried about Oracle Text. You will receive no warning message.

But if you use the Oracle Data Miner tool you will get a warning message.

Why do you get this message? Some of the functionality in the Oracle Data Miner tool relies on having Oracle Text enabled/installed in the database. You can locate this functionality under the Text section of the Component Workflow Editor palette of Oracle Data Miner.

So if you are getting these warning messages then Oracle Text was not installed when the database was created.

How can you install Oracle Text? There are 2 scripts that you need to run.

For the first script you will need to log into SYS as SYSDBA and run the following script.

ctx/admin/catctx.sql password SYSAUX TEMP NOLOCK

This script will create a user called CTXSYS with the password of password (give above), with the default tablespace of SYSAUX, the temporary tablespace of TEMP and when the account is created don't lock it (NOLOCK).

This script will also install a number of CTX packages.

The next step is to log into the CTXSYS schema (using the password above) and run the following script.

/ctx/admin/defaults/dr0defin.sql

This takes a parameter to specify the language you want to use. For example "English", "AMERICAN", etc.

The final step is to connect as SYS again and lock the CTXSYS account.

alter user ctxsys account lock password expire;

If you are using Oracle 12c then the above steps will be automatically done for you during the process. If you are using an earlier version of the database or a database that has been upgraded through some version then Oracle Text may not have been installed. In this case you can run the able commands.

Sunday, April 6, 2014

The ORE Packages

If you are interested in using ORE or just to get an idea of what does ORE give you that does not already exist in one of the other R packages then the table below lists the packages that come as part of ORE.

Before you can use then you will need to load these into your workspace. To do this you can issue the following command from the R prompt or from the prompt in RStudio.

> library(ORE)

RStudio is my preferred R interface and is widely used around the world.
ORE Installed Packages Description
ORE Oracle R Enterprise
OREbase ORE - base
OREdm The ORE functions that use the in-database Oracle Data Miner algorithms
OREeda The ORE functions used for exploratory data analysis
OREgraphics The ORE functions used for graphics
OREpredict The ORE functions used for model predictions
OREstats The ORE stats functions
ORExml The ORE functions that convert R objects to XML
DBI R Database Interface
ROracle OCI based Oracle database interface for R
XML Tools for parsing and generating XML within R and S-Plus.
bitops Functions for Bitwise operations
png Read and write PNG images

In addition to these core ORE packages, ORE also uses some R packages as part of the core ORE packages listed above. The following table lists the R packages that are used in the ORE packages. So make sure you have these packages installed. They should have come with your installation of R, but if something has happened then you can download them again.

R Packages used by ORE Description
base The R Base Package
boot Bootstrap Functions (originally by Angelo Canty for S)
class Functions for Classification
cluster Cluster Analysis Extended Rousseeuw et al
codetools Code Analysis Tools for R
compiler The R Compiler Package
datasets The R Datasets Package
foreign Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, ..
graphics The R Graphics Package
grDevices The R Graphics Devices and Support for Colours and Fonts
grid The Grid Graphics Package
KernSmooth Functions for kernel smoothing for Wand & Jones (1995)
lattice Lattice Graphics
MASS Support Functions and Datasets for Venables and Ripley's MASS
Matrix Sparse and Dense Matrix Classes and Methods
methods Formal Methods and Classes
mgcv GAMs with GCV/AIC/REML smoothness estimation and GAMMs by PQL
nlme Linear and Nonlinear Mixed Effects Models


I've been using R a lot over the past few years and I've had a number of projects involving R particularly over the past 12 month. I just found out that I will now have another short duration R project in May and June.

So watch out for lots more blog posts on R and ORE. Plus the usual blog posts on using Oracle Data Mining. ORE and Oracle Data Mining are very closely linked.

Sunday, March 30, 2014

Gartner 2014 Advanced Analytics Quadrant

The Gartner 2014 Advanced Analytics Quadrant is out now. Well it is if you can find it.

Some of the companies have put it up on their websites to promote their position.

For some reason Oracle hasn't and I wonder why?

Gartner Advanced Analytics MQ Feb2014

You can see that some typical technologies are missing from this, but this is to be expected. How much are companies really deploying these alternatives on real problems and in production. Perhaps the positioning of Revolution Analysis might be an indicator. At some point there might be a shift from investigative analysis into more main stream projects and then into production.

What is still evident from this years quadrant is that SAS and IBM (SPSS) still have very dominant positions and perhaps will have for some time to come.

It will be interesting how this will all play out over the next few years.

Friday, March 28, 2014

ODM Repository upgrade Issue with 4.0.1

An important announcement was made on the Oracle Data Mining discussion forum last night and I haven't seen anything on twitter about it yet (but maybe I missed it). It was about some ODM Repository migration issues that you might encounter with using ODM in SQL Developer 4.0.1 and using the Oracle Database 11.2.0.3.

Check out the full announcement here.

Make sure you have a full backup of your ODM schema and the repository before you perform your ODM repository upgrade.

As most people are still on Oracle 11g then this is a potential problem that most of you maybe facing.


I had a a repository migration issues last September during Oracle Open World. EA2 was release and in my eagerness to upgrade (and because I was writing my book on it) I had an issue where my repository go dropped and a new repository created. But nothing was migrated over to the new repository.

Guess what? I lost all my work. I was at OOW and my back ups were back home in Ireland. So you can imagine how I felt.

Here is a link to my blog post about it.

Thursday, March 27, 2014

Oracle BigDataLite version 2.5.1 is now available

Back at the end of January Oracle finally go round to releasing the updated version of the Oracle BigDataLite virtual machine. Check out my previous blog post of this.

Yesterday (27th March) I say on Facebook that a new updated versions of the BigDataLite VM was released. I must have missed the tweet and other publicity on this somewhere :-(

This is a great VM that allows you to play with the various Big Data technologies without the hassle of going through the who install and configuration thing.

If you are interested in this then here are the details of what it contains and where you can find more details.

The following components are included on Oracle Big Data Lite Virtual Machine v 2.5:

Oracle Enterprise Linux 6.4

Oracle Database 12c Release 1 Enterprise Edition (12.1.0.1)

Cloudera’s Distribution including Apache Hadoop (CDH4.6)

Cloudera Manager 4.8.2

Cloudera Enterprise Technology, including:

   Cloudera RTQ (Impala 1.2.3)

   Cloudera RTS (Search 1.2)

Oracle Big Data Connectors 2.5

   Oracle SQL Connector for HDFS 2.3.0

   Oracle Loader for Hadoop 2.3.1

   Oracle Data Integrator 11g

   Oracle R Advanced Analytics for Hadoop 2.3.1

   Oracle XQuery for Hadoop 2.4.0

Oracle NoSQL Database Enterprise Edition 12cR1 (2.1.54)

Oracle JDeveloper 11g

Oracle SQL Developer 4.0

Oracle Data Integrator 12cR1/

Oracle R Distribution 3.0.1


Go to the Oracle Big Data Lite Virtual Machine landing page on OTN to download the latest release.

Wednesday, March 26, 2014

Predicting using ORE package

In a previous post I gave a an overview of the various in-database data mining algorithms that you can use in your Oracle R Enterprise scripts.

To create data mining models based on those algorithms you need to use the ore.odm functions.

After you have developed and tested your models you will select one of these to score your new data.

How can you do this using ORE? There is a suite of ORE functions called ore.predict that you can use to apply your data mining model to score or label new data.

The following table lists the ore.predict functions:

ORE Predict Function Description
ore.predict-glm Generalized linear model
ore.predict-kmeans k-Means clustering mode
ore.predict-lm Linear regression model
ore.predict-matrix A matrix with no more than 1000 rows
ore.predict-multinom Multinomial log-linear model
ore.predict-nnet Neural network models
ore.predict-ore.model An Oracle R Enterprise model
ore.predict-prcomp Principal components analysis on a matrix
ore.predict-princomp Principal components analysis on a numeric matrix
ore.predict-rpart Recursive partitioning and regression tree model


As you will see from the above table there are more ore.predict functions than there are ore.odm functions. The reason for this is that ORE comes with some additional data mining algorithms. These are in addition to the sub-set of Oracle Data Mining algorithms that it uses. These include the ore.glm, ore.lm, ore.neural and ore.stepwise.

You also need to watch out for the data mining algorithms that are not used in prediction. These include the Minimum Description Length, Apriori and Non-Negative Matrix Factorization.

Remember that these ore.predict functions are run inside the Oracle Database. No data is extracted to the data analyst laptop or desktop. All the data stays in the database. The ORE functions are run in the database on the data in the database

Sunday, March 23, 2014

Using the in-database ODM algorithms in ORE

Oracle R Enterprise is the version of R that Oracle has that runs in the database instead of on your laptop or desktop.

Oracle already has a significant number of data mining algorithms in the database. With ORE they have exposed these so that they can be easily called from your R (ORE) scripts.

To access these in-database data mining algorithms you will need to use the ore.odm package.

ORE is continually being developed with new functionality being added all the time. Over the past 2 years Oracle have released and updated version of ORE about every 6 months. ORE is generally not certified with the latest version of R. But is slightly behind but only a point or two of the current release. For example the current version of ORE 1.4 (released only last week) is certified for R version 3.0.1. But the current release of R is 3.0.3.

Will ORE work with the latest version of R? The simple answer is maybe or in theory it should, but is not certified.

Let's get back to ore.dm. The following table maps the ore.odm functions to the in-database Oracle Data Mining functions.

ORE Function Oracle Data Mining Algorithm What Algorithm can be used for
ore.odmAI Minimum Description Length Attribute Importance
ore.odmAssocRules Apriori Association Rules
ore.odmDT Decision Tree Classification
ore.odmGLM Generalized Linear Model Classification and Regression
ore.odmKMeans k-Means Clustering
ore.odmNB Naïve Bayes Classification
ore.odmNMF Non-Negative Matrix Factorization Feature Extraction
ore.odmOC O-Cluster Clustering
ore.odmSVM Support Vector Machines Classification and Regression

As you can see we only have a subset of the in-database Oracle Dat Miner algorithms. This is a pity really, but I'm sure as we get newer releases of ORE these will be added.

Thursday, March 20, 2014

Issues with using latest release of ODM

The title of this blog post makes it sound more dramatic than it actually is.

The reason for this blog post is down to me receiving a recent comment on the blog, plus having received numerous emails and a recent OTN Discussion Forum topic for Oracle Data Mining.

The main thing that they have in common is that if I use the latest version of Oracle Data Mining (ODM) it tells me that I need to upgrade my ODM Repository. What impact will this have?

The ODM Repository stores lots of information about the workflows you create using the (free) Oracle Data Mining tool that comes as part of SQL Developer. Yes you do have to pay for the OAA option, so is it really free? Well some part are like the explore node and the graph node.

If you download and want to use the latest version of the ODM tool or you want to try it out before rolling it out to others then you will need to upgrade your ODM repository.

And this the problem that people are facing.

If you upgrade then the ODM Repository it is updated to work with the latest version of the ODM tool. But what happens to everyone else who is using the previous release of the tool? The answer to that is they can no longer use ODM against their database.

Why is that? Well the version of the tool is tied to a version of the Repository. If you upgrade to the newer tool and repository then your older versions of the ODM tool no longer work.

The result of all of this is that you cannot have a mixture of versions of the ODM tool (SQL Developer) being used in your team/company.

There is a very simple solution to all of this. Everyone uses the same version of the ODM tool (i.e. the same version of SQL Developer). For example your team might be using SQL Dev 4 that was released last December. But in early March there was a new patch release 4.1. In order to use this new version of the tool all of your team needs to start using it at the same time. The first person to use it will be prompted to migrate the ODM repository. This is automatically done once you enter the password for SYS.

But in some teams this is not possible to do, you want to try out the tool to see that it works correctly before getting others to use it. The way around this is to have a separate database and use it for your testing. You can easily copy across your workflows and ODM objects to the test database.

This might not be possible for everyone, so what can you do. Create a Virtual Machine and try it out on your own desktop is one way.

The answer to this problem is not ideal, but hopefully you have a better idea of why things are happening this way and what you can or cannot do about it.

Like I said at the topic of this blog post that the title is a bit more dramatic than is really the case :-)


My next blog post will be on another question I've been asked a few times and this is 'When I go to use the ODM tool it tells me that the Oracle Text feature of Oracle needs to be enabled'

Sunday, March 16, 2014

ORE 1.4 New Parallel feature

Oracle R Enterprise (ORE) 1.4 has just been released and can downloaded from here. Remember there is a client and server side install required and ORE 1.4 is certified against R 3.0.1 and the Oracle R Distribution

ORE

One of the interesting new features is the PARALLEL option. You can set this to significantly improve the performance of your R server side code by using the PARALLEL database option. You can set the degree of PARALLEL at a global level in your code by using the ore.parallel setting.

The default setting for this ore.parallel setting is FALSE or 1. Otherwise it must be set to a minimum of 2 of more to enable the Parallel database option.

Alternatively you can set the ore.parallel setting to TRUE to use the default degree of parallelism that is set for the database object or set to NULL to use the default database setting

You will also be able to set the degree of parallel (DOP) using the parallel enabled functions ore.groupApply, ore.rowApply and ore.indexApply.

They have also made available or as they say exposed some more of the in-database Oracle Data Mining algorithms. These include the ODM algorithms for Association rules (ore.odmAssocRules), the feature extraction algorithm called Non-Negative Matrix Factorization (NMF) (ore.odmNMF) and the ODM Clustering algorithm O-Cluster (ore.odmOC)

Watch out of some blog posts on these over the coming weeks.


Check out the OTN page for the R Technologies from Oracle

R

Wednesday, March 12, 2014

ODM: Changing the bar chart format in Explore Node

In Oracle Data Miner you can use the Explore Node to gather an initial set of statistics for your dataset. As part of this you will also get a bar chart that shows the distributions of the values contained within each attribute. The following example shows the default layout of the bar charts. Explore1

These graphs a very useful for presenting the initial data exploration results from to your business users. In addition to these graphs you can also use the Graph node to give some additional graphical representations.

But the default bar chart that is produced by the Explore Node can appear to be a bit basic.

So what if we could change the layout to have a 3-D effect. People like 3-D bar charts.

Is this possible in Oracle Data Miner? If so then how can we do it?

Well it is possible and you can use the following steps to change your bar charts to 3-D.

To access the Explore Node settings go the the Tools menu and then select Preferences from the drop down menu.

Explore2

Then the Preferences window opens scroll down to the Data Miner option and expand the available options.

Explore3

The Explorer Data Viewer allows you to change the Precision settings. The section option is the Graphical Settings. You can change the Depth Radius setting. By default this is set to Zero. By increasing this value you can change the degree of the 3-D effect of the bar charts. You can also change the colour scheme too.

Explore4

I'm not a fan of the other colour schemes that are available and mu favourite is still the default Nautical. The following bar chart is the same as the one at the top of this post but has the 3-D effect.

Explore5