The following post continues a series of high-level overview posts on Isilon and Hadoop implementations. It covers the core tasks needed to complete the setup and get a basic operational Hadoop cluster running with Isilon; additional topics will be covered later or in upcoming documents. Since the process involves many steps, I'll break this post up into two parts.
This procedure is based on the following:
Isilon OneFS: 184.108.40.206
CDH 5 parcel: 5.7.1-1.cdh5.7.1.p0.11
OneFS 220.127.116.11 contains a number of updates to facilitate the integration and deployment of Hadoop against OneFS, so it is highly recommended to use this version. The procedure may require additional steps prior to 18.104.22.168 that are not documented in this post.
Before installing any Hadoop cluster, the OneFS supportability matrix should be consulted for compatibility: https://community.emc.com/docs/DOC-37101
This blog assumes the following Isilon Hadoop environment is configured and operational:
-Isilon is licensed for HDFS
-A dedicated Isilon Access Zone is in use (not the system zone).
-Isilon HDFS root directory in the Access Zone exists
-The Isilon SmartConnect Zone configuration is implemented per best practice for Isilon HDFS access.
-The Isilon HDFS settings are correctly configured.
-A simple access model will exist between Hadoop and Isilon; user UID and GID parity will exist.
The best approach to achieving parity is beyond the scope of this post and will be addressed in upcoming posts.
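As a rough illustration of what parity means here, the sketch below compares the UID and primary GID of a few accounts between two passwd-format files. The function name `check_parity` and both input files are hypothetical; how the account list is exported from the Isilon Access Zone is not covered, and this is only a way to spot mismatches, not a method for achieving parity.

```shell
#!/bin/sh
# Sketch: report accounts whose UID:GID differ between two passwd-format files.
# Usage: check_parity <hadoop_passwd_file> <isilon_passwd_file> user...
# Both files are assumed to use the standard name:x:uid:gid:... layout.
check_parity() {
    hadoop_file=$1
    isilon_file=$2
    shift 2
    rc=0
    for user in "$@"; do
        # Extract "uid:gid" for the user from each file (empty if missing).
        a=$(awk -F: -v u="$user" '$1 == u { print $3 ":" $4; exit }' "$hadoop_file")
        b=$(awk -F: -v u="$user" '$1 == u { print $3 ":" $4; exit }' "$isilon_file")
        if [ -z "$a" ] || [ -z "$b" ] || [ "$a" != "$b" ]; then
            echo "MISMATCH $user hadoop=${a:-missing} isilon=${b:-missing}"
            rc=1
        fi
    done
    return $rc
}
```

The function returns non-zero if any listed account is missing from either file or has a different UID or GID, so it can be used as a simple pre-flight check before installing.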
Assuming the Isilon is set up and configured for integration with Cloudera, we can begin the deployment of Cloudera Manager.
This post does not address the setup, configuration and deployment of the Linux hosts on which the Hadoop services are deployed. The Cloudera documentation should be consulted to set up and prepare the hosts correctly: Overview of Cloudera and the Cloudera Documentation Set. The post also does not address advanced Cloudera installs; the focus is to highlight the Isilon integration into the installer and how to complete the install.
A good overview of the procedure can be found here: Installation Path A - Automated Installation by Cloudera Manager (Non-Production Mode). This post begins with the download of the bits and installation of CM.
# chmod u+x cloudera-manager-installer.bin
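A minimal sketch of the same step wrapped in a check, assuming the installer has already been downloaded to the current directory. The helper name `prepare_installer` is hypothetical; it only confirms the file exists and sets the execute bit, and launching the installer is left as a commented example.

```shell
#!/bin/sh
# Sketch: confirm the installer is present and executable before launching it.
prepare_installer() {
    installer=$1
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer" >&2
        return 1
    fi
    chmod u+x "$installer" || return 1
    [ -x "$installer" ]
}

# Example (run as root on the Cloudera Manager host):
#   prepare_installer ./cloudera-manager-installer.bin && ./cloudera-manager-installer.bin
```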
On running the installer, you'll get the following:
Next, Accept the Cloudera License
Next, we will let Cloudera Manager install the JDK
Yes, accept the Oracle license for the JDK
OK, note the URL and the user/pass for the Cloudera Manager WebUI
You can validate that the Cloudera Manager service is running; if you see problems, tail the cloudera-scm-server.log as you start the service.
# service cloudera-scm-server status
cloudera-scm-server (pid 10487) is running...
# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
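Instead of watching the log by eye, the wait can be scripted. The sketch below polls a log file until a pattern shows up or a timeout expires. The helper name `wait_for_log` is hypothetical, and the pattern in the commented example ("Started Jetty server") is an assumption about what the server writes once the WebUI is up; check your own log for the exact wording.

```shell
#!/bin/sh
# Sketch: poll a log file until a pattern appears or a timeout expires.
# Usage: wait_for_log <file> <pattern> <timeout_seconds>
wait_for_log() {
    log=$1
    pattern=$2
    remaining=$3
    while [ "$remaining" -gt 0 ]; do
        if grep -q -- "$pattern" "$log" 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    # One last check after the final sleep.
    grep -q -- "$pattern" "$log" 2>/dev/null
}

# Example (the pattern is an assumption, adjust to what your log shows):
#   wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log "Started Jetty server" 300
```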
Log in to the Cloudera Manager WebUI; user:admin, password:admin
Select the Yes check box to accept the EULA and Continue,
Select the version you wish to deploy,
In this post we will deploy to just a single Linux host, but the process is the same when multiple hosts are used in the Hadoop cluster.
Add the FQDN of the Linux hosts to be deployed, Search,
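Before entering host names in the wizard, it can help to confirm that each one is fully qualified and resolves. The sketch below is one way to do that; `is_fqdn` and `check_host` are hypothetical helper names, the FQDN test is a simple shape check (at least one dot, no empty labels), and `getent` is assumed to be available on the Linux host.

```shell
#!/bin/sh
# Sketch: sanity-check host names before handing them to the installer.

# Returns 0 if the name contains a dot and has no empty labels.
is_fqdn() {
    case $1 in
        *.*) ;;
        *) return 1 ;;
    esac
    case $1 in
        .*|*.|*..*) return 1 ;;
    esac
    return 0
}

# Checks the shape of the name, then asks the resolver for it.
check_host() {
    name=$1
    is_fqdn "$name" || { echo "$name: not fully qualified"; return 1; }
    getent hosts "$name" >/dev/null || { echo "$name: does not resolve"; return 1; }
    echo "$name: ok"
}

# Example:
#   check_host "$(hostname -f)"
```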
On completion of the search, select the host(s) to deploy to,
Select to use Parcels
Select the CDH Stack you wish to deploy
Select the Additional Parcels and Agent configuration as needed, limited to what is supported by Isilon,
Select to install the JDK,
Select to install the Java Unlimited Strength Encryption Policy (JUSEP) files if you intend to secure this cluster later, Continue,
We will not deploy in Single User Mode, Continue
Provide the SSH credentials, either root password or SSH keys depending on how you set your Linux hosts up and wish to manage them, Continue
The installation will begin
Installation completes and the installer continues,
Parcels being downloaded,
Parcels being distributed
Parcels unpacked and activated
The host inspector will then validate hosts, versions and additional software installed
The installer will check and validate the hosts; if any deviations are seen, recommendations are presented to optimize the hosts. If the validation check fails, follow the recommendations and then retry the validation.
Common errors are seen with:
Make the recommended changes to the hosts and Run Again,
This completes part 1 of the install; deploying Hadoop services with Cloudera Manager is continued in Part 2.
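Host settings can also be checked from the shell before re-running the validation. The sketch below is not taken from the inspector output in this post; it assumes, purely for illustration, that the kernel's vm.swappiness value is one of the items being reviewed. The helper name `swappiness_ok` and the default threshold of 10 are assumptions, so pass whatever limit the inspector actually recommends.

```shell
#!/bin/sh
# Sketch: compare a swappiness value against a maximum.
# Usage: swappiness_ok <value> [max]
swappiness_ok() {
    value=$1
    max=${2:-10}   # assumed default threshold, override as needed
    [ "$value" -le "$max" ]
}

# Example (Linux exposes the current value under /proc):
#   swappiness_ok "$(cat /proc/sys/vm/swappiness)" || echo "consider lowering vm.swappiness"
```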