Oracle BDA (Big Data Appliance)

What is Oracle BDA?

The Oracle Big Data Appliance (BDA) is an engineered system from Oracle.

This means that the hardware and the software come pre-configured. The advantage is that there is no need to buy each hardware and software component separately and then try to make them all work together. The disadvantage is that you are stuck with whatever configuration Oracle gives you.

The latest version is the X6-2; see its data sheet for the complete specification. It can be bought as a full rack or a starter rack.

Below is a summary of its hardware and software offerings.

Hardware

  • Compute/storage nodes: 18 (full rack) or 6 (starter rack)
  • 2 x 32-port QDR InfiniBand leaf switches
  • 1 x 36-port QDR InfiniBand spine switch
  • Ethernet admin switch
  • 2 x redundant power distribution units (PDUs)

Software

Operating System:

  • Oracle Linux 5 or Oracle Linux 6

Integrated Software:

  • Cloudera Enterprise 5 – Data Hub Edition with support for:
    • Cloudera’s Distribution including Apache Hadoop (CDH)
    • Cloudera Impala
    • Cloudera Search
    • Apache HBase and Apache Accumulo
    • Apache Spark
    • Apache Kafka
    • Cloudera Manager with support for:
      • Cloudera Navigator
      • Cloudera Back-up and Disaster Recovery (BDR)
  • Oracle Perfect Balance
  • Oracle Table Access for Hadoop

Other:

  • Oracle Java JDK 8
  • MySQL Database Enterprise Server – Advanced Edition
  • Oracle Big Data Appliance Enterprise Manager Plug-In
  • Oracle R Distribution
  • Oracle NoSQL Database Community Edition (CE)

Big Data Appliance X6-2 – Optional Software:

  • Oracle Big Data SQL
  • Oracle Big Data Connectors:
    • Oracle SQL Connector for Hadoop
    • Oracle Loader for Hadoop
    • Oracle XQuery for Hadoop
    • Oracle R Advanced Analytics for Hadoop
    • Oracle Data Integrator
  • Oracle Audit Vault and Database Firewall for Hadoop Auditing
  • Oracle Data Integrator
  • Oracle GoldenGate
  • Oracle NoSQL Database Enterprise Edition
  • Oracle Big Data Spatial and Graph
  • Oracle Big Data Discovery

For more information, see the full BDA Documentation.

Why should I get one?

Two Relational DBAs walk into a NoSQL bar . . .

They soon walk out . . .

They could not find a table.

I know Big Data is super trendy right now, but it’s not a pair of jeans. You need a business case to justify spending about half a million dollars on a BDA.

In a relational world . . .

Below is a relational diagram of that oldie but goodie orders application that comes with MS Access.

[Image: relational diagram of the orders schema]

It is the perfect relational schema: orders and order details (a one-to-many relationship), customers, suppliers, products, and so on, all in third normal form.

First normal form (1NF): each attribute holds only one value.
Second normal form (2NF): in 1NF, and all non-primary-key attributes depend on the whole primary key.
Third normal form (3NF): in 2NF, and all attributes are determined by the primary key and not by any non-primary-key attributes.
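To make the normal forms a bit more concrete, here is a minimal sketch of such a schema using Python's built-in sqlite3 module. The table and column names are my own illustrative choices, not the actual MS Access sample schema, and the data is made up.

```python
import sqlite3

# In-memory database; all table/column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL          -- depends only on product_id (3NF)
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL
);
-- One order has many order_details rows (one-to-many).
CREATE TABLE order_details (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity   INTEGER NOT NULL,      -- depends on the whole key (2NF)
    PRIMARY KEY (order_id, product_id)
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Acme Ltd')")
conn.executemany("INSERT INTO products VALUES (?, ?, ?)",
                 [(10, "Widget", 2.5), (11, "Gadget", 4.0)])
conn.execute("INSERT INTO orders VALUES (100, 1, '2016-01-15')")
conn.executemany("INSERT INTO order_details VALUES (?, ?, ?)",
                 [(100, 10, 4), (100, 11, 1)])


def order_total(conn, order_id):
    """Sum quantity * unit_price over all lines of one order."""
    row = conn.execute(
        """SELECT SUM(d.quantity * p.unit_price)
           FROM order_details d
           JOIN products p ON p.product_id = d.product_id
           WHERE d.order_id = ?""",
        (order_id,),
    ).fetchone()
    return row[0]


print(order_total(conn, 100))
```

Each fact lives in exactly one place (a product's price is stored once, in products), and a join puts the pieces back together when you need the order total.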

The records are transactions (each order is a sale, $$$$) that must conform to ACID:

A – atomicity: The transaction is indivisible. If part of the transaction fails, the entire transaction fails. No partial transactions.
C – consistency: Only valid data that follows database rules can be written.
I – isolation: Effects of an incomplete transaction are not visible to another transaction.
D – durability: Once the transaction is committed, it will remain in the database.
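Atomicity and consistency are easy to see in a small experiment. The sketch below uses Python's sqlite3 module with two made-up tables; the `place_order` helper and the "quantity must be positive" rule are assumptions for illustration only. When one order line breaks the rule, the whole order is rolled back.

```python
import sqlite3

# Hypothetical, simplified tables for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE order_details (
           order_id INTEGER NOT NULL,
           product  TEXT NOT NULL,
           quantity INTEGER NOT NULL CHECK (quantity > 0)  -- a database rule
       )"""
)
conn.commit()


def place_order(conn, order_id, customer, lines):
    """Insert an order and its lines as one transaction; True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer))
            conn.executemany(
                "INSERT INTO order_details VALUES (?, ?, ?)",
                [(order_id, product, qty) for product, qty in lines],
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = place_order(conn, 1, "Acme", [("Widget", 2), ("Gadget", 1)])
# The second line violates the CHECK rule, so nothing from order 2 is kept.
bad = place_order(conn, 2, "Globex", [("Widget", 1), ("Gadget", 0)])

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
detail_count = conn.execute("SELECT COUNT(*) FROM order_details").fetchone()[0]
print(ok, bad, order_count, detail_count)
```

The failed order leaves no trace: neither its header row nor its one valid line survives, which is exactly the "no partial transactions" promise.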

This is our relational world, and the relational model worked well for transactional data (sale/purchase orders, employee payment records, things to do with money).

The Internet broke the relational model

That is, until that thing called the Internet came around and dramatically increased the volume of data out there.

New types of data emerged that looked nothing like transactions (well, maybe a little):

  • Clickstream data
  • Twitter, Facebook, blogs, and comments
  • Geotagged data
  • Sensor data
  • Server logs

This data can be semi-structured or unstructured (very hard to normalize).

And the data is looked at in aggregate (so there is no need for ACID-compliant transactions).

Without normalization and ACID (these are not transactions!), this data needs to be stored and processed in a different way, outside an RDBMS (relational database management system).
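As a rough illustration of what "semi-structured and looked at in aggregate" means, here is a small Python sketch. The events are invented, and `count_by_type` is a hypothetical helper; the point is that every record has a different shape, one is plain garbage, and we only care about totals, not about any single record.

```python
import json
from collections import Counter

# Made-up events: the fields differ from record to record.
raw_events = [
    '{"type": "click", "page": "/home", "user": "u1"}',
    '{"type": "tweet", "text": "big data!", "geo": {"lat": 51.5, "lon": -0.1}}',
    '{"type": "click", "page": "/cart", "user": "u2", "referrer": "/home"}',
    '{"type": "sensor", "device": "t-17", "reading": 21.4}',
    "not valid json",
]


def count_by_type(lines):
    """Count events per type, tolerating malformed or odd-shaped records."""
    counts = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            counts["unparseable"] += 1
            continue
        if not isinstance(event, dict):
            counts["unparseable"] += 1
            continue
        counts[event.get("type", "unknown")] += 1
    return counts


counts = count_by_type(raw_events)
print(counts)
```

Trying to force these records into one normalized table would mean a lot of mostly empty columns, while a bad record here costs nothing more than a bump in the "unparseable" counter.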

So, the questions that need to be answered before considering a Big Data solution are:

  1. Look at your data
  2. Do you see transactions?
    1. Can they be normalized?
    2. Do they need to conform to ACID?

If the answers are YES: use the relational model. It will make your life easier.

If the answers are NO: then start looking at Big Data.
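The checklist can be written down as a tiny Python function. This is only a sketch: `suggest_model` is a hypothetical name, and the "unclear" result for mixed answers is my own assumption, since the checklist only covers the all-YES and all-NO cases.

```python
def suggest_model(has_transactions, can_normalize=False, needs_acid=False):
    """Map the checklist answers to a suggestion.

    "unclear" for mixed answers is an assumption, not part of the checklist.
    """
    if not has_transactions:
        return "big data"
    if can_normalize and needs_acid:
        return "relational"
    return "unclear"


print(suggest_model(True, can_normalize=True, needs_acid=True))
print(suggest_model(False))
```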

What can I do with it?

Oracle Big Data Lite Virtual Machine

There is a way to test-drive the BDA without buying it first.

Oracle has a VM (Virtual Machine) with Big Data Lite that you can use.

Go to the home page for Big Data Lite. You will need to download and install Oracle VM VirtualBox first, then follow the instructions to deploy Big Data Lite.

There is a demo you can play with, the MoviePlex, with videos and labs.

More on what can be done with the BDA in the following posts.
