The Oracle Big Data Lite VM, available on Oracle Technet, provides a pre-built environment for learning about a number of key Oracle products, including Oracle Database 12c, Big Data Discovery and Oracle Data Integrator, as well as the Cloudera Distribution including Apache Hadoop (CDH 5.8.0).
The download ultimately delivers an OVA “appliance” file for use with Oracle VirtualBox, but with a bit of effort nothing stops you from running it as a VM on Proxmox 4, as follows.
NOTE – Things to read which can help with this process:
- Oracle Big Data Lite Deployment Guide.
- James Coyle's guide to uploading an OVA to Proxmox: https://www.jamescoyle.net/how-to/1218-upload-ova-to-proxmox-kvm
- Converting to RAW and pushing to a raw LVM partition: https://www.nnbfn.net/2011/03/convert-kvm-qcow2-to-lvm-raw-partition/
- Firstly download the files that make up the OVA from here.
- Follow the instructions on the download page to convert the multiple files into one single OVA file.
- For Oracle VirtualBox, simply follow the rest of the instructions in the Deployment Guide.
- For Proxmox, where I was running LVM storage for the virtual machines, first rename the single OVA file to .iso (the rename only lets the file be uploaded as an ISO image; its content is unchanged), then upload that file (BigDataLite460.iso) to a storage area on your Proxmox host; mine was called “data”. You can upload the file through the Proxmox GUI or manually via the command line. My files were uploaded through the GUI and ended up in “/mnt/pve-data/template/iso”.
- Now, bring up a shell and navigate to the ISO directory and then unpack the ISO file by running “tar xvf BigDataLite460.iso”. This should create five files which include one OVF file (Open Virtualisation Format) and four VMDK files (Virtual Machine Disk).
    root@HP20052433:/mnt/pve-data/template/iso# ls -l
    total 204127600
    -rw------- 1 root root  8680527872 Oct 25 02:43 BigDataLite460-disk1.vmdk
    -rw------- 1 root root  1696855040 Oct 25 02:45 BigDataLite460-disk2.vmdk
    -rw------- 1 root root 23999689216 Oct 25 03:11 BigDataLite460-disk3.vmdk
    -rw------- 1 root root      220160 Oct 25 03:11 BigDataLite460-disk4.vmdk
    -rw-r--r-- 1 root root 34377315328 Nov 14 10:59 BigDataLite460.iso
    -rw------- 1 root root       20056 Oct 25 02:31 BigDataLite460.ovf
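
An OVA is an ordinary tar archive, which is why tar can unpack the renamed file. As a minimal sketch, the archive can be inspected before extracting it, to confirm that it holds the expected OVF and VMDK files. The helper names `ova_members` and `ova_disk_count` are made up for this example.

```shell
# List the members of an OVA (a plain tar archive) without extracting it.
# ova_members is a hypothetical helper name, not part of any tool.
ova_members() {
    tar -tf "$1"
}

# Count the VMDK disk images inside the archive.
ova_disk_count() {
    ova_members "$1" | grep -c '\.vmdk$'
}

# Example (file name taken from the listing above):
#   ova_members BigDataLite460.iso
#   ova_disk_count BigDataLite460.iso
```

If the disk count is not what you expect, it is worth re-checking the step that joined the downloaded parts before spending time on the conversion.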
- Now, create a new VM in Proxmox via the GUI or manually. The VM I created had the memory and CPUs required by the Deployment Guide, together with four hard disks. Mine were all on the SCSI interface and were initially set to 10G in size; this will change later.
- The hard disks were using a storage area on Proxmox that was defined as type LVM.
- Now convert the VMDK files to RAW files which we’ll then push to the LVM Hard Disks as follows:
    qemu-img convert -f vmdk BigDataLite460-disk1.vmdk -O raw BigDataLite460-disk1.raw
    qemu-img convert -f vmdk BigDataLite460-disk2.vmdk -O raw BigDataLite460-disk2.raw
    qemu-img convert -f vmdk BigDataLite460-disk3.vmdk -O raw BigDataLite460-disk3.raw
    qemu-img convert -f vmdk BigDataLite460-disk4.vmdk -O raw BigDataLite460-disk4.raw
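
The four conversions above follow one pattern, so they can be sketched as a loop. This is only an illustration: `convert_vmdks` and the `DRY_RUN` variable are hypothetical names, and by default the function just prints the commands so they can be reviewed first.

```shell
# Print (dry run, the default) or execute one qemu-img conversion per VMDK.
# convert_vmdks and DRY_RUN are hypothetical names used for this sketch.
convert_vmdks() {
    for vmdk in "$@"; do
        # Derive the output name by swapping the .vmdk suffix for .raw.
        raw="${vmdk%.vmdk}.raw"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "qemu-img convert -f vmdk -O raw $vmdk $raw"
        else
            qemu-img convert -f vmdk -O raw "$vmdk" "$raw"
        fi
    done
}

# Example:
#   convert_vmdks BigDataLite460-disk*.vmdk            # show the commands
#   DRY_RUN=0 convert_vmdks BigDataLite460-disk*.vmdk  # run them
```

Running `qemu-img info` on a VMDK reports its virtual size, which is the size the raw file will have. That is why the raw files listed in the next step are far larger than the VMDK files they came from.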
- Now list those raw files, so we can see their sizes:
    root@HP20052433:/mnt/pve-data/template/iso# ls -l *.raw
    -rw-r--r-- 1 root root 104857600000 Nov 16 07:58 BigDataLite460-disk1.raw
    -rw-r--r-- 1 root root 214748364800 Nov 16 08:01 BigDataLite460-disk2.raw
    -rw-r--r-- 1 root root 128849018880 Nov 16 08:27 BigDataLite460-disk3.raw
    -rw-r--r-- 1 root root  32212254720 Nov 16 08:27 BigDataLite460-disk4.raw
- Now resize the LVM hard disks to the corresponding sizes in bytes (the ID of my Proxmox VM was 106 and my hard disks were on SCSI):
    qm resize 106 scsi0 104857600000
    qm resize 106 scsi1 214748364800
    qm resize 106 scsi2 128849018880
    qm resize 106 scsi3 32212254720
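
Rather than copying the sizes by hand, the resize commands can be generated from the raw files themselves. A minimal sketch, assuming GNU `stat` and that the raw files are passed in disk order (first file to scsi0); `resize_cmds` is a hypothetical helper name, and it only prints the commands so they can be checked before being run.

```shell
# Print one "qm resize" command per raw file, using its exact size in bytes.
# Arguments: the VM ID, then the raw files in disk order (first -> scsi0).
# resize_cmds is a hypothetical helper; stat -c %s assumes GNU coreutils.
resize_cmds() {
    vmid="$1"
    shift
    i=0
    for raw in "$@"; do
        bytes=$(stat -c %s "$raw")
        echo "qm resize $vmid scsi$i $bytes"
        i=$((i + 1))
    done
}

# Example:
#   resize_cmds 106 BigDataLite460-disk*.raw
```

Piping the output through `sh` would execute the commands, but reading them first is the safer habit, since a disk cannot be shrunk again with `qm resize`.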
- Now copy the content of the raw files over to the corresponding LVM hard disks:
    dd if=BigDataLite460-disk1.raw of=/dev/vm_storage_group/vm-106-disk-1
    dd if=BigDataLite460-disk2.raw of=/dev/vm_storage_group/vm-106-disk-2
    dd if=BigDataLite460-disk3.raw of=/dev/vm_storage_group/vm-106-disk-3
    dd if=BigDataLite460-disk4.raw of=/dev/vm_storage_group/vm-106-disk-4
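
Since dd overwrites its target without asking, it can be reassuring to check each copy afterwards. The sketch below compares a raw image with its target over the length of the image only, because a logical volume may be slightly larger than the image written to it. `verify_copy` is a hypothetical helper, and `cmp -n` and `stat -c` assume GNU tools.

```shell
# Compare the first N bytes of a raw image against its target device,
# where N is the size of the raw image in bytes.
# Returns 0 if they match, non-zero otherwise.
# verify_copy is a hypothetical helper; cmp -n and stat -c assume GNU tools.
verify_copy() {
    raw="$1"
    target="$2"
    bytes=$(stat -c %s "$raw")
    cmp -n "$bytes" "$raw" "$target"
}

# Example (paths taken from the dd commands above):
#   verify_copy BigDataLite460-disk1.raw /dev/vm_storage_group/vm-106-disk-1
```

On images this large the comparison reads every byte of both sides, so expect it to take about as long as the copy itself. Separately, dd uses 512-byte blocks by default; with GNU dd, adding `bs=1M` to the commands above is a common way to speed up the copy.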
- Now start the VM and, hey presto, there it is.
- You could stop there, as it’s a self-contained environment, but you can also set up networking to make it visible on your network.