Installing Hortonworks Data Platform 2.5 on Microsoft Azure

I presented this topic to the Big Data Meetup in Nottingham on Thursday but sometimes people prefer a blog to a presentation, so I’ve fashioned this article from the slides…

This article assumes the following:

Start by navigating to the Azure login page and enter your details. If you have never visited before your screen will look like this:

If you’ve logged in before the page will show your login and you can just click it:

After you log in, you’ll arrive at the Dashboard:

Choose the “Marketplace” link at the bottom right, which leads to the following screen. Type “HDP” and it will show you the options for Hortonworks Data Platform. There are currently two options, 2.4 and 2.5 – I chose 2.5:

When you choose 2.5 it will bring up this screen which shows the details of the option you have chosen and offers you the “Create” button to go ahead and start the creation process – click on Create:

After clicking on Create, the process moves on to a five-step wizard, the first step of which allows you to choose “Basic options” for the VM. I set the following options:

Name: oramosshdp25sandbox

VM Disk Type: SSD

User name: jeff

SSH Public key: my public SSH key

Subscription: Leave set to Free Trial if that’s what you are using (as per the screenshot), or choose your Corporate/Pay As You Go subscription if you have one

Resource Group: Create New called hdp25sandbox_rg

Location: UK West

A screenshot of these options looks like this:

Click on OK and move on to the 2nd step in the wizard for choosing the size of the VM. I chose the DS3_V2 size which seemed to work OK – you might be able to get away with something smaller, perhaps.

Click on Select and move on to step 3 of the wizard which is about configuring optional features. For this step I set the following:

Use managed disks: Yes

Leaving all other options as defaults this looks like:

Click on OK and move on to step 4 which is just a summary of the configuration:

If you’re happy, click on OK and move on to step 5 where you accept the terms of use and “buy” the VM:

If you’re happy, click on Purchase and that’s the end of the wizard. Azure then goes off to deploy the VM, which can take a few minutes. You’ll be returned to the dashboard screen where you’ll see the VM at the top right with the word Deploying on it:

As I say, it takes a few minutes to complete, but when it does, you’ll see a popup notification in the top right of the screen and the VM tile will change to look as below:

So, you now have the Sandbox VM up and running.

The VM by default only has inbound SSH access enabled and can only be accessed by IP address so we’ll make some changes to these next. First we’ll give the VM a DNS name which allows you to access it on the internet via a name rather than an IP address. From the dashboard screen (above) click on the VM and it takes you to this screen:

You’ll notice the Public IP address which is a hyperlink…click on that link and it takes you to the following screen where you can specify the DNS Name which means the machine will have a Fully Qualified Domain Name that you can access via the internet. I set my DNS Name to oramosshdp25sandbox and given I’d previously chosen to use UK West as the location, the Fully Qualified Domain Name is thus oramosshdp25sandbox.ukwest.cloudapp.azure.com as per the screenshot below:

Now, navigate to the Inbound Security Rules page which is under the Network Security Group page (accessed from the Resource List on the dashboard). Notice that the only existing rule is one to allow inbound SSH communication:

In order to facilitate additional capabilities you should open up a few more ports, as follows:

  • 8888 – HDP
  • 8080 – Ambari
  • 4200 – Web SSH access
  • 50070 – HDFS NameNode web UI
  • 21000 – Atlas
  • 9995 – Zeppelin
  • 15000 – Falcon
  • 6080 – Ranger

Click on Inbound Security Rule which takes you to the page for maintaining these rules and enter the details for the 8888 port. I specified the name as default-allow-8888 and the port as 8888 as shown below:

Click on OK to create the rule. Carry out the same process for the other ports.
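If you would rather not click through the portal eight times, the same rules could be scripted with the Azure CLI. The following is a minimal sketch that only prints one command per port so you can review them before running anything. The NSG name (oramosshdp25sandbox-nsg) and the starting priority (1010) are assumptions rather than values from my environment, so look up the real NSG name in the portal, make sure the priorities don't clash with existing rules, and confirm the flags with az network nsg rule create --help for your CLI version.

```shell
#!/usr/bin/env bash
# Print (not run) one "az network nsg rule create" command per port.
# ASSUMPTIONS: the NSG name and the starting priority are hypothetical.
RG="hdp25sandbox_rg"              # resource group created in the wizard
NSG="oramosshdp25sandbox-nsg"     # hypothetical: look up the real NSG name
PORTS="8888 8080 4200 50070 21000 9995 15000 6080"

print_nsg_rule_commands() {
  local priority=1010             # hypothetical; must be unique per rule
  local port
  for port in $PORTS; do
    echo "az network nsg rule create" \
         "--resource-group $RG --nsg-name $NSG" \
         "--name default-allow-$port --priority $priority" \
         "--direction Inbound --access Allow --protocol Tcp" \
         "--destination-port-ranges $port"
    priority=$((priority + 10))
  done
}

print_nsg_rule_commands
```

Once the printed commands look right for your environment, you could pipe the output to bash to apply them.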

Now that we’ve undertaken these additional activities we can access the VM using an SSH terminal logging onto oramosshdp25sandbox.ukwest.cloudapp.azure.com as the user you have created (jeff in my case) and the private SSH key:

Whilst you are in the SSH terminal you can reset the Ambari password. This is not strictly necessary unless you want to log in to Ambari as admin, but I’ll describe it anyway.

First become root with:

sudo su - root

Now SSH into the Docker container as root:

ssh root@172.17.0.2

You will be prompted to change the password for root on this first login – the current password is hadoop.

After changing the password run the Ambari password reset process:

ambari-admin-password-reset

Follow the instructions to reset the password and after that it will start the Ambari server process.

Once all that is done, exit out of the sessions and the original SSH terminal.

Now go into HDP via the web interface by logging on to the following URL:

http://oramosshdp25sandbox.ukwest.cloudapp.azure.com:8888
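If the page doesn't load, it may be worth checking that the inbound rules actually took effect before digging any further. This is a rough sketch, assuming bash and the coreutils timeout command are available on your workstation; the function names are my own invention. It uses bash's built-in /dev/tcp redirection, so it can't tell the difference between a port that is closed and one that is filtered by the Network Security Group.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within the timeout.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed
# beyond the coreutils "timeout" command.
port_open() {
  local host="$1" port="$2" secs="${3:-3}"
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report the state of each sandbox port listed in the inbound rules.
check_sandbox_ports() {
  local host="$1" port
  for port in 8888 8080 4200 50070 21000 9995 15000 6080; do
    if port_open "$host" "$port"; then
      echo "$port open"
    else
      echo "$port closed or filtered"
    fi
  done
}
```

For my VM the call would be check_sandbox_ports oramosshdp25sandbox.ukwest.cloudapp.azure.com.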

The first time you access this URL you’ll be given a welcome (marketing) page which asks for your details:

Fill out the details and hit Submit which will take you to the main entry page for HDP:

Choose the Launch Dashboard option on the left, which brings up a pair of browser windows that use the entire desktop and show the Ambari login page on the left hand browser and the Tutorials website on the right hand browser like this:

You can use either the admin user that you just reset the password for or the predefined user raj_ops (password raj_ops) to access Ambari. Click on Sign In on the left hand browser once you have entered the credentials and it takes you into the main Ambari homepage:

This is the main systems management environment for Hortonworks – more documentation here.

If we close this pair of browsers now and go back to the main HDP entry page and choose the Quick Links option on the right we get this page:

From here you can choose to use any of these specific components.

NOTE – I couldn’t get Atlas and Falcon to work – they need more configuration/setup to get them functional. Ranger, Zeppelin and the Web SSH client work fine though.

Just a basic introduction but I hope you find it useful.

12cR2 tightens up ORA-01841 for zero year ANSI dates, but not for Oracle SQL syntax

In moving some more code from an 11gR2 database to a 12cR2 database, I found another change where a piece of code that works in 11gR2 doesn’t compile in 12cR2.

In this instance a view was being created with a projected date column which used the ANSI DATE syntax. Here is a simplified test script:

CREATE OR REPLACE VIEW test1 AS
SELECT date '0000-01-01' date_col
FROM dual
/
DROP VIEW test1
/

CREATE OR REPLACE VIEW test2 AS
SELECT TO_DATE('0000-01-01','YYYY-MM-DD') date_col
FROM dual
/

Running this on 11gR2 gives:

SQL>CREATE OR REPLACE VIEW test1 AS
  2  SELECT date '0000-01-01' date_col
  3  FROM   dual
  4  /

View created.

SQL>CREATE OR REPLACE VIEW test2 AS
  2  SELECT TO_DATE('0000-01-01','YYYY-MM-DD') date_col
  3  FROM   dual
  4  /

View created.

Now running this on 12cR2 gives:

SQL> CREATE OR REPLACE VIEW test1 AS
  2  SELECT date '0000-01-01' date_col
  3  FROM   dual
  4  /
SELECT date '0000-01-01' date_col
            *
ERROR at line 2:
ORA-01841: (full) year must be between -4713 and +9999, and not be 0


SQL> CREATE OR REPLACE VIEW test2 AS
  2  SELECT TO_DATE('0000-01-01','YYYY-MM-DD') date_col
  3  FROM   dual
  4  /

View created.

The year is zero, so ORA-01841 is the correct response to the ANSI DATE literal in 12cR2. The TO_DATE version with the same zero year is still accepted, just as it was in 11gR2.
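If you have a lot of code to move, it might be quicker to look for the offending literals up front than to wait for the compile errors. A minimal sketch, assuming your DDL is sitting in text files and that each literal is written on a single line; the function name is hypothetical. It flags ANSI DATE literals with a zero year and deliberately ignores TO_DATE calls.

```shell
#!/usr/bin/env bash
# List lines in the given SQL files that contain an ANSI DATE literal with
# a zero year, e.g. date '0000-01-01'. TO_DATE('0000-...') is deliberately
# not matched, because 12cR2 still accepts it.
find_zero_year_date_literals() {
  grep -nEi "(^|[^[:alnum:]_])date[[:space:]]+'0000-" "$@"
}
```

Pass it one or more script names; each hit is printed with its line number.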

ORA-54002 when trying to create Virtual Column using REGEXP_REPLACE on Oracle 12cR2

I encountered an issue today trying to create a table in an Oracle 12cR2 database, using DDL which I had extracted from an Oracle 11gR2 database. The error returned when trying to create the table was:

ORA-54002: only pure functions can be specified in a virtual column expression

The definition of the table included a Virtual Column which used a REGEXP_REPLACE call to derive a value from another column on the table.

Here is a simplified test case illustrating the scenario (Thanks Tim for the REGEXP_REPLACE example code):

select * from v$version
/
create table test_ora54002_12c(
 col1 VARCHAR2(20 CHAR) NOT NULL
 ,virtual_column1 VARCHAR2(4000 CHAR) GENERATED ALWAYS AS(REGEXP_REPLACE(col1, '([A-Z])', ' \1', 2)) VIRTUAL VISIBLE
)
/
drop table test_ora54002_12c purge
/

Running this on 11gR2 gives:

SQL> select * from v$version
 2 /

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

5 rows selected.

Elapsed: 00:00:00.40
SQL> create table test_ora54002_12c(
 2 col1 VARCHAR2(20 CHAR) NOT NULL
 3 ,virtual_column1 VARCHAR2(4000 CHAR) GENERATED ALWAYS AS(REGEXP_REPLACE(col1, '([A-Z])', ' \1', 2)) VIRTUAL VISIBLE
 4 )
 5 /

Table created.

Elapsed: 00:00:00.24
SQL> drop table test_ora54002_12c purge
 2 /

Table dropped.

Running this on 12cR2 gives:

SQL> select * from v$version
 2 /

BANNER CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production 0
PL/SQL Release 12.2.0.1.0 - Production 0
CORE 12.2.0.1.0 Production 0
TNS for Linux: Version 12.2.0.1.0 - Production 0
NLSRTL Version 12.2.0.1.0 - Production 0

SQL> create table test_ora54002_12c(
 2 col1 VARCHAR2(20 CHAR) NOT NULL
 3 ,virtual_column1 VARCHAR2(4000 CHAR) GENERATED ALWAYS AS(REGEXP_REPLACE(col1, '([A-Z])', ' \1', 2)) VIRTUAL VISIBLE
 4 )
 5 /
 ,virtual_column1 VARCHAR2(4000 CHAR) GENERATED ALWAYS AS(REGEXP_REPLACE(col1, '([A-Z])', ' \1', 2)) VIRTUAL VISIBLE
 *
ERROR at line 3:
ORA-54002: only pure functions can be specified in a virtual column expression


SQL> drop table test_ora54002_12c purge
 2 /
drop table test_ora54002_12c purge
 *
ERROR at line 1:
ORA-00942: table or view does not exist

As you can see, 12cR2 gives the ORA-54002 error.

Searching MOS turns up this article, which suggests that you shouldn’t have been able to do this in 11gR2: it was a bug, 12cR2 has fixed it, and thus you can no longer create such a virtual column (the article refers to functional index and check constraint use cases as well).

In my case, I was able to rewrite the virtual column using simple string functions such as SUBSTR, TRANSLATE and INSTR, and the virtual column was allowed to be created with these. Problem solved, although it’s a shame as the REGEXP_REPLACE approach was far neater.
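To find other tables that might hit the same error, you could scan the extracted DDL for virtual columns that call a regular expression function. This is only a sketch: it assumes the GENERATED ALWAYS AS clause and the function call sit on the same line (as they do in my test case), and it flags every REGEXP_ function as a candidate for review even though REGEXP_REPLACE is the only one I actually saw fail. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# List virtual column definitions that call a REGEXP_ function on the same
# line. Hits are candidates to review, not confirmed ORA-54002 failures.
find_regexp_virtual_columns() {
  grep -nEi "generated[[:space:]]+always[[:space:]]+as.*regexp_" "$@"
}
```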

Installing Oracle 12c Release 2 Database on a Proxmox Container

Obviously nobody could beat Tim to getting the comprehensive installation instructions out first, but here are my notes for installing it in a Proxmox container environment, which is what I use as my research platform. Some of the calls used are from or based on Tim’s prior 12cR1 installation article – thanks Tim.

NOTE – this post is just a guide and is based on my environment – you will likely need to make changes to suit your own environment.

Environment

root@billy:~# pveversion
pve-manager/4.4-12/e71b7a74 (running kernel: 4.4.40-1-pve)

Host Preparation

Some of the activities required involve changing Linux kernel parameters but these can’t be applied inside a Proxmox container – you’ll see errors like these if you try:

[root@db12cr2 ~]# sysctl -p
sysctl: setting key "fs.file-max": Read-only file system

Instead you have to do these at the host level – and only if you think they are relevant and that those settings wouldn’t upset all of your other environments running on that host. I haven’t tried but you could potentially just tell the GUI installer to ignore the warnings relating to these entries and not make these changes at all especially if you’re only using it for small scale research purposes.

As root on the Proxmox host, run the following:

echo "fs.file-max = 6815744" >>/etc/sysctl.d/98-oracle.conf
echo "kernel.panic_on_oops = 1" >>/etc/sysctl.d/98-oracle.conf
echo "net.ipv4.conf.default.rp_filter = 2" >>/etc/sysctl.d/98-oracle.conf
#Load the new file explicitly - a bare "sysctl -p" only reads /etc/sysctl.conf
/sbin/sysctl -p /etc/sysctl.d/98-oracle.conf
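To confirm the host has picked the values up, you could compare the file with what the kernel currently reports. A minimal sketch, assuming the simple "key = value" layout used in 98-oracle.conf above; the function name is hypothetical, and the optional second argument is only there so it can be tried against a scratch directory instead of the real /proc/sys.

```shell
#!/usr/bin/env bash
# Compare each "key = value" line in a sysctl conf file with the live value
# under /proc/sys. The second argument (root of the proc tree) exists only
# to make the function easy to try out against a scratch directory.
check_sysctl_conf() {
  local conf="$1" root="${2:-/proc/sys}"
  local key value file current
  while IFS='=' read -r key value; do
    key="$(echo "$key" | tr -d '[:space:]')"
    value="$(echo "$value" | tr -d '[:space:]')"
    # Skip blank lines and comments
    case "$key" in ''|'#'*) continue ;; esac
    # fs.file-max lives at /proc/sys/fs/file-max, and so on
    file="$root/${key//.//}"
    current=""
    if [ -r "$file" ]; then
      current="$(tr -d '[:space:]' < "$file")"
    fi
    if [ "$current" = "$value" ]; then
      echo "OK   $key = $value"
    else
      echo "DIFF $key: wanted $value, found ${current:-nothing}"
    fi
  done < "$conf"
}
```

On the host this would be run as check_sysctl_conf /etc/sysctl.d/98-oracle.conf.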

Create And Prepare The Container

I use CentOS 7 as the template for most of my activities and these notes are based around that.

pct create 130 u01:vztmpl/centos-7-default_20160205_amd64.tar.xz -rootfs 60 -hostname db12cr2 -memory 10240 -nameserver 192.168.1.25 -searchdomain oramoss.com -net0 name=eth0,bridge=vmbr0,gw=192.168.1.1,ip=192.168.1.130/24 -swap 10240 -cpulimit 4 -storage local

You’ll have your own way of getting the installation files to be available to the container but I do it by adding a mount point so I can access the area where all my software is:

vi /etc/pve/nodes/${HOSTNAME}/lxc/130.conf

…and add this:

mp0: /mnt/backups/common_share,mp=/mnt/common_share
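If you'd rather not edit the config file by hand, pct set should be able to add the same mount point entry. A minimal sketch, to be run as root on the Proxmox host; the wrapper function name is my own invention, and the paths are the ones from my environment.

```shell
#!/usr/bin/env bash
# Add the shared software area to a container as bind mount point mp0,
# without opening the container's conf file in an editor.
add_common_share() {
  local vmid="$1"
  pct set "$vmid" -mp0 /mnt/backups/common_share,mp=/mnt/common_share
}
```

For the container created above that would be add_common_share 130.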

Start And Enter The Container

pct start 130
pct enter 130

Install Additional Packages

I’m going to use the Oracle Preinstall package but there are still a few things to add:

yum install gcc-c++ wget openssh-server -y

gcc-c++ is not necessary according to the 12cR2 installation manuals, but the GUI installer complains during the prerequisite checks if it’s not there.

wget is needed to download some files and it’s not on the CentOS 7 template.

openssh-server allows me to log in remotely via SSH for the GUI install later.

Get OpenSSH To Autostart

systemctl enable sshd.service
systemctl start sshd.service
systemctl status sshd.service

Install Oracle Preinstall Package

#Get the Oracle Linux 7 repo - this works for CentOS 7.
cd /etc/yum.repos.d/
wget http://public-yum.oracle.com/public-yum-ol7.repo
#The following stops GPG Key errors:
wget http://public-yum.oracle.com/RPM-GPG-KEY-oracle-ol7 -O /etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
#Update everything
yum update -y
#Install the preinstall package
yum install oracle-database-server-12cR2-preinstall -y

Configure System Limits

echo "oracle soft nofile 1024" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle hard nofile 65536" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle soft nproc 16384" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle hard nproc 16384" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle soft stack 10240" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle hard stack 32768" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle hard memlock 134217728" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf
echo "oracle soft memlock 134217728" >>/etc/security/limits.d/oracle-rdbms-server-12cR2-preinstall.conf

Change Password For “oracle” User

passwd oracle
   <<set a password>>

Create Oracle Home Directory

mkdir -p /u01/app/oracle/product/12.2.0.1/db_1
chown -R oracle:oinstall /u01
chmod -R 775 /u01

Modify The Profile Of “oracle” User

echo "# Oracle Settings" >>/home/oracle/.bash_profile
echo "export TMP=/tmp" >>/home/oracle/.bash_profile
echo "export TMPDIR=\$TMP" >>/home/oracle/.bash_profile
echo "export ORACLE_HOSTNAME=db12cr2.oramoss.com" >>/home/oracle/.bash_profile
echo "export ORACLE_UNQNAME=cdb1" >>/home/oracle/.bash_profile
echo "export ORACLE_BASE=/u01/app/oracle" >>/home/oracle/.bash_profile
echo "export ORACLE_HOME=\$ORACLE_BASE/product/12.2.0.1/db_1" >>/home/oracle/.bash_profile
echo "export ORACLE_SID=cdb1" >>/home/oracle/.bash_profile
echo "export PATH=/usr/sbin:\$PATH" >>/home/oracle/.bash_profile
echo "export PATH=\$ORACLE_HOME/bin:\$PATH" >>/home/oracle/.bash_profile
echo "export LD_LIBRARY_PATH=\$ORACLE_HOME/lib:/lib:/usr/lib" >>/home/oracle/.bash_profile
echo "export CLASSPATH=\$ORACLE_HOME/jlib:\$ORACLE_HOME/rdbms/jlib" >>/home/oracle/.bash_profile

Create Software Directory And Copy Files Over

mkdir -p /u01/software
cp /mnt/common_share/linuxx64_12201_database.zip /u01/software
cd /u01/software
unzip linuxx64_12201_database.zip
rm /u01/software/linuxx64_12201_database.zip

Run The Installer

Log in as the “oracle” user

cd /u01/software/database
./runInstaller

Install the software and a database by running through the GUI screens and following the instructions. The installer complains on the prerequisite checks screen about some of the kernel memory parameters (rmem%, wmem%) which you can ignore.

Configure Auto Start

Follow these instructions from Tim to set up auto start using the runuser method – make sure you change the ORACLE_HOME to be 12.2.0.1, not the 12.1.0.2 that is mentioned.

Now reboot the container and it should return with the database automatically started.

Check Oracle Database Auto Starts

[oracle@db12cr2 ~]$ sqlplus /nolog

SQL*Plus: Release 12.2.0.1.0 Production on Thu Mar 2 14:16:53 2017

Copyright (c) 1982, 2016, Oracle. All rights reserved.

SQL> conn sys/Password01 as sysdba
Connected.
SQL> show sga

Total System Global Area 3221225472 bytes
Fixed Size 8797928 bytes
Variable Size 687866136 bytes
Database Buffers 2516582400 bytes
Redo Buffers 7979008 bytes
SQL>
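For a quicker check after a reboot that doesn't need a password, you could look for the instance's PMON background process instead. A minimal sketch, assuming the ps command from procps is present in the container; the function name is hypothetical. Oracle names the process ora_pmon_<SID>, so an exact match on the whole command line avoids the check matching itself.

```shell
#!/usr/bin/env bash
# Return 0 if the PMON background process for the given SID is running.
# The process shows up in ps as exactly "ora_pmon_<SID>".
oracle_instance_running() {
  local sid="$1"
  ps -eo args= | grep -x "ora_pmon_${sid}" >/dev/null
}
```

In my container, oracle_instance_running cdb1 && echo "cdb1 is up" would do the job. Note that this only shows the instance has started, not that the database is open.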

Conclusion

All pretty painless and relatively quick. I’ll take a dump of the container next in order to use it as a template for building future containers.