Ambari Automated Kerberos Configuration with EMC Isilon

 

Kerberos is at the heart of strong authentication and encryption for Hadoop, but has always been a challenge to configure and administer. Ambari 2.0 introduced wizard-driven automated Kerberos configuration, which makes the process much faster and less error-prone. Beginning in OneFS 8.0.0.1, EMC Isilon customers will be able to leverage this excellent feature.

 

My name is Karthik Palaniappan, and I am a developer on Isilon's HDFS team. In this blog post, I will walk you through configuring Kerberos security for your Ambari-managed Hadoop cluster.

 

Prerequisites

  • OneFS 8.0.0.1 or higher.

  • Ambari 2.0 or higher.

  • MIT KDC running (Heimdal is not supported). Follow the steps here to set up your Kerberos infrastructure.

  • Forward and reverse DNS between all hosts.

  • All services are running (green) on the Ambari Dashboard.
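
The DNS prerequisite is the one that most often goes unnoticed, so here is a minimal sketch of a check you could run on each host before starting. It is not part of Ambari or OneFS: the function name check_dns is made up for illustration, and it assumes the getent command is available (as on most Linux distributions) and that you pass fully qualified host names.

```shell
# check_dns HOST: succeed only if HOST resolves to an address and that
# address resolves back to HOST. Hypothetical helper; assumes `getent`.
check_dns() {
  host=$1
  # Forward lookup: the first field of the first result line is the address.
  addr=$(getent hosts "$host" | awk 'NR == 1 { print $1 }')
  if [ -z "$addr" ]; then
    echo "FAIL $host: no forward record"
    return 1
  fi
  # Reverse lookup: the second field is the canonical name for the address.
  name=$(getent hosts "$addr" | awk 'NR == 1 { print $2 }')
  if [ "$name" != "$host" ]; then
    echo "FAIL $host: $addr resolves back to '${name:-nothing}'"
    return 1
  fi
  echo "OK   $host <-> $addr"
}
```

To use it, call check_dns once per host, for example check_dns ambari.example.com (a placeholder name), and look for any FAIL lines.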

     

Enable Kerberos

 

Pre-Configuration

 

Before launching the wizard, you must set two configurations and restart all services.

  • In HDFS -> Custom core-site set "hadoop.security.token.service.use_ip" to "false"
  • In MapReduce2 -> Advanced mapred-site add "`hadoop classpath`:" to the beginning of "mapreduce.application.classpath". Note the colon and backticks (but do not copy the quotation marks).
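
To make the second setting concrete, the sketch below builds the new property value from the old one. The function name with_hadoop_classpath is hypothetical, and the sample classpath in the usage note is a placeholder rather than the value from a real cluster. The only point is that the backticks and the colon are stored literally at the front of the value.

```shell
# with_hadoop_classpath VALUE: print VALUE with the literal prefix
# `hadoop classpath`: added, unless it is already there.
# Hypothetical helper, for illustration only.
with_hadoop_classpath() {
  prefix='`hadoop classpath`:'   # single quotes keep the backticks literal
  case $1 in
    "$prefix"*) printf '%s\n' "$1" ;;            # already prefixed
    *)          printf '%s%s\n' "$prefix" "$1" ;;
  esac
}
```

For example, with_hadoop_classpath '/opt/app/lib/*' prints the string you would paste into the "mapreduce.application.classpath" field, and running it on an already prefixed value leaves that value unchanged.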

 

Get Started

 

Navigate to Admin -> Kerberos and press the "Enable Kerberos" button. The titles within this section refer to the titles of the Kerberos wizard pages.

 

step1.png

 

Select "Existing MIT KDC", ensure that the prerequisites are met, then click "Next". Note that Isilon does not use Java, and does not need the JCE.

 

Configure Kerberos / Install and Test Kerberos Client

 

Fill in all KDC and admin server information. On step 3 (Install and Test Kerberos Client), the Ambari server will do a smoke test to ensure you have configured Kerberos correctly.

step3.png

 

Configure Identities / Confirm Configuration

 

a) Ambari User Principals (UPNs)

 

Ambari creates user principals in the form ${username}-${cluster_name}@${realm}, then uses hadoop.security.auth_to_local in core-site.xml to map each principal to just ${username} on the filesystem.

 

Isilon does not honor these mapping rules, so you must remove the -${cluster_name} suffix from every principal in the "Ambari Principals" section. Isilon strips off the @${realm} itself, so no aliasing is necessary. My Ambari 2.2.1 cluster runs HDFS, YARN, MapReduce2, Tez, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Falcon, Storm, Flume, Accumulo, Ambari Metrics, Kafka, Knox, Mahout, Slider, and Spark. On that cluster, I made the following modifications in the "General" tab:

 

  • Smokeuser Principal Name: ${cluster-env/smokeuser}-${cluster_name}@${realm} => ${cluster-env/smokeuser}@${realm}

  • spark.history.kerberos.principal: ${spark-env/spark_user}-${cluster_name}@${realm} => ${spark-env/spark_user}@${realm}

  • Storm principal name: ${storm-env/storm_user}-${cluster_name}@${realm} => ${storm-env/storm_user}@${realm}

  • HBase user principal: ${hbase-env/hbase_user}-${cluster_name}@${realm} => ${hbase-env/hbase_user}@${realm}

  • HDFS user principal: ${hadoop-env/hdfs_user}-${cluster_name}@${realm} => ${hadoop-env/hdfs_user}@${realm}

  • accumulo_principal_name: ${accumulo-env/accumulo_user}-${cluster_name}@${realm} => ${accumulo-env/accumulo_user}@${realm}

  • trace.user: tracer-${cluster_name}@${realm} => tracer@${realm}
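
Every change in this list is the same edit: drop the -${cluster_name} part that sits right before the @. As a minimal sketch of that rule (the function name strip_cluster_suffix is hypothetical, and this only transforms text; it does not talk to Ambari), you could check your edited values like this:

```shell
# strip_cluster_suffix TEMPLATE: remove the literal "-${cluster_name}"
# that Ambari puts in front of the "@" in user principal templates.
# Hypothetical helper; templates without the suffix pass through unchanged.
strip_cluster_suffix() {
  # Single quotes keep ${cluster_name} literal, so the shell never expands it.
  printf '%s\n' "$1" | sed 's/-\${cluster_name}@/@/'
}
```

For example, strip_cluster_suffix 'tracer-${cluster_name}@${realm}' prints tracer@${realm}, which matches the last entry above.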

     

step4-UPNs.png

 

b) Service Principals (SPNs)

 

Ambari creates service principals, some of which differ from their UNIX usernames. Again, since Isilon does not honor the mapping rules, you must modify the principal names to match their UNIX usernames. In my Ambari 2.2.1 cluster, I made the following modifications in the "Advanced" tab:

 

  • HDFS -> dfs.namenode.kerberos.principal: nn/_HOST@${realm} => hdfs/_HOST@${realm}

  • YARN -> yarn.resourcemanager.principal: rm/_HOST@${realm} => yarn/_HOST@${realm}

  • YARN -> yarn.nodemanager.principal: nm/_HOST@${realm} => yarn/_HOST@${realm}

  • MapReduce2 -> mapreduce.jobhistory.principal: jhs/_HOST@${realm} => mapred/_HOST@${realm}

  • Falcon -> *.dfs.namenode.kerberos.principal: nn/_HOST@${realm} => hdfs/_HOST@${realm}
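
The renames above amount to a small lookup table from Ambari's default service primary to the UNIX username. The sketch below encodes only the four primaries from my cluster (nn, rm, nm, jhs). The function names unix_primary and fix_spn are hypothetical, and other services in your cluster may need entries I have not listed.

```shell
# unix_primary PRIMARY: print the UNIX username that replaces an Ambari
# default service primary. Only the primaries from the list above are
# mapped; anything else is passed through unchanged.
unix_primary() {
  case $1 in
    nn)    echo hdfs ;;
    rm|nm) echo yarn ;;
    jhs)   echo mapred ;;
    *)     echo "$1" ;;
  esac
}

# fix_spn PRINCIPAL: rewrite "primary/instance@realm" using unix_primary.
# Principals without a "/" are printed as they are.
fix_spn() {
  case $1 in
    */*) printf '%s/%s\n' "$(unix_primary "${1%%/*}")" "${1#*/}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}
```

For example, fix_spn 'jhs/_HOST@${realm}' prints mapred/_HOST@${realm}.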

     

step4-SPNs.png

 

After configuring the appropriate principals, press "Next". At the "Confirm Configuration" screen, press "Next".

 

Stop Services / Kerberize Cluster

 

Stopping and Kerberizing services should succeed.

 

step7.png

 

Do not press "Next" yet. Isilon does not allow Ambari to create keytabs for Isilon principals, so you must first configure Kerberos on Isilon manually using the steps below.

 

a) Create KDC as an Isilon auth provider

 

Note: If this Isilon zone is already configured to use your MIT KDC, you can skip these steps.

 

isi auth krb5 create --realm=$REALM --admin-server=$admin_server --kdc=$kdc_server --user=$admin_principal --password=$admin_password

isi zone zones modify --zone=$isilon_zone --add-auth-provider=krb5:$REALM

 

b) Create service principals for HDFS and HTTP (for WebHDFS).

 

isi auth krb5 spn create --provider-name=$REALM --spn=hdfs/$isilon_smartconnect@$REALM --user=$admin_principal --password=$admin_password

isi auth krb5 spn create --provider-name=$REALM --spn=HTTP/$isilon_smartconnect@$REALM --user=$admin_principal --password=$admin_password

 

c) Create any necessary proxy users

 

In unsecured clusters, any user can impersonate any other user. In secured clusters, proxy users need to be explicitly specified.

If you have Hive or Oozie, add the appropriate proxy users.

 

isi hdfs proxyusers create oozie --zone=$isilon_zone --add-user=ambari-qa

isi hdfs proxyusers create hive --zone=$isilon_zone --add-user=ambari-qa

 

d) Disable simple authentication

 

Only Kerberos or delegation token authentication will be allowed.

 

isi hdfs settings modify --zone=$isilon_zone --authentication-mode=kerberos_only
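
If you repeat this on several zones, steps a) through d) can be collected into one function. This is only a sketch built from the exact commands above: the names isi_run and isilon_kerberize and the RUN switch are my own invention, and by default it only prints the commands (including the password) so you can review them before running anything.

```shell
# isi_run CMD...: print the command, or execute it when RUN=1.
isi_run() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# isilon_kerberize: the Isilon-side steps a) to d) in order.
# Hypothetical wrapper. Expects REALM, admin_server, kdc_server,
# admin_principal, admin_password, isilon_zone and isilon_smartconnect
# to be set, exactly as in the commands above.
isilon_kerberize() {
  # a) register the KDC and attach it to the access zone
  #    (skip if the zone already uses your MIT KDC)
  isi_run isi auth krb5 create --realm="$REALM" --admin-server="$admin_server" \
    --kdc="$kdc_server" --user="$admin_principal" --password="$admin_password"
  isi_run isi zone zones modify --zone="$isilon_zone" --add-auth-provider="krb5:$REALM"
  # b) service principals for HDFS and HTTP (WebHDFS)
  for svc in hdfs HTTP; do
    isi_run isi auth krb5 spn create --provider-name="$REALM" \
      --spn="$svc/$isilon_smartconnect@$REALM" \
      --user="$admin_principal" --password="$admin_password"
  done
  # c) proxy users (only needed if you have Oozie or Hive)
  for svc in oozie hive; do
    isi_run isi hdfs proxyusers create "$svc" --zone="$isilon_zone" --add-user=ambari-qa
  done
  # d) turn off simple authentication
  isi_run isi hdfs settings modify --zone="$isilon_zone" --authentication-mode=kerberos_only
}
```

Call isilon_kerberize once to see the seven commands, then run RUN=1 isilon_kerberize on the Isilon cluster when the output looks right.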

 

 

Now that Isilon is configured as well, press "Next" in Ambari to move on to the last step of the wizard.

 

Start and Test Services

 

step8.png

 

If services do not start up, here are some tricks for debugging Kerberos issues:

  1. Due to a bug in YARN, you need to set the "yarn.resourcemanager.principal" to yarn/$rm_hostname@$REALM in YARN -> Custom yarn-site. The "_HOST" syntax does not work with Kerberos enabled.
  2. To debug Java GSSAPI/Kerberos errors, add "-Dsun.security.krb5.debug=true" to HADOOP_OPTS.
  3. For HTTP 401 errors, use curl with -iv for extra debug information.
  4. Ensure forward and reverse DNS are set up between all hosts.
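
Tips 2 and 3 can be sketched as two small helpers for the shell session where you reproduce the failure. Both function names are hypothetical. The curl flags --negotiate and -u : (which ask curl to authenticate with your Kerberos ticket) are my addition to the -iv suggested above, and the WebHDFS port 8082 and the LISTSTATUS request are assumptions that you should adjust to your cluster.

```shell
# enable_krb5_debug: append the JVM Kerberos debug flag to HADOOP_OPTS,
# once, and export it for commands started from this shell.
enable_krb5_debug() {
  flag=-Dsun.security.krb5.debug=true
  case " ${HADOOP_OPTS:-} " in
    *" $flag "*) ;;  # already present, nothing to do
    *) HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }$flag" ;;
  esac
  export HADOOP_OPTS
}

# webhdfs_probe_cmd HOST [PORT]: print a verbose, SPNEGO-authenticated
# WebHDFS request to paste into a terminal after kinit.
# Port 8082 and the LISTSTATUS operation are assumptions.
webhdfs_probe_cmd() {
  echo "curl -iv --negotiate -u : 'http://$1:${2:-8082}/webhdfs/v1/?op=LISTSTATUS'"
}
```

For example, run enable_krb5_debug and then retry the failing hadoop command, or run webhdfs_probe_cmd with your SmartConnect name and paste the printed curl command to inspect the response headers behind a 401.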

 

(Optional) Strong RPC Security

 

In HDFS -> Custom core-site set "hadoop.rpc.protection" to "integrity" or "privacy". In addition to authentication, integrity guarantees messages have not been tampered with, and privacy encrypts all messages.

 

Run a job!

 

From any client host, try a MapReduce job!

 

kinit <some-user>

yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 1 1000

 

Job Finished in 37.635 seconds

Estimated value of Pi is 3.14800000000000000000
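
If the job fails immediately with a GSS error, the usual cause is a missing or expired ticket. A minimal sketch of a guard, assuming the MIT klist client (whose -s option exits non-zero when the credentials cache is missing or expired); the wrapper name run_pi_job is hypothetical:

```shell
# run_pi_job: submit the example job only if the ticket cache holds a
# valid ticket. Hypothetical wrapper around the commands above.
run_pi_job() {
  if ! klist -s; then
    echo "No valid Kerberos ticket; run kinit first" >&2
    return 1
  fi
  yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 1 1000
}
```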

 

Congratulations! You have secured your cluster with Kerberos.

 

(Optional) Disable Kerberos

 

Clean up Isilon

 

Let's clean up Isilon first. This is essentially the inverse of enabling Kerberos.

 

a) Disable Kerberos authentication

 

isi hdfs settings modify --authentication-mode=simple_only --zone=$isilon_zone

 

b) Delete any proxy users

 

isi hdfs proxyusers delete oozie --zone=$isilon_zone

isi hdfs proxyusers delete hive --zone=$isilon_zone

 

c) Delete principals

 

isi auth krb5 spn delete --provider-name=$REALM --spn=hdfs/$isilon_smartconnect@$REALM --all

isi auth krb5 spn delete --provider-name=$REALM --spn=HTTP/$isilon_smartconnect@$REALM --all

 

 

Note that the above commands only remove those principals from Isilon, but do not remove them from the KDC. Use these commands to remove the Isilon principals from the KDC:

 

kadmin -p $admin_principal

 

kadmin: delete_principal hdfs/$isilon_smartconnect@$REALM

kadmin: delete_principal HTTP/$isilon_smartconnect@$REALM
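
If you prefer not to open an interactive kadmin session, the same deletions can be issued one query at a time. The sketch below only prints the commands for review. It relies on kadmin's -q option (run a single query) and on the -force option of delete_principal (skip the confirmation prompt), and the function name kdc_cleanup_cmds is hypothetical. kadmin will still prompt for the admin password when you run each printed command.

```shell
# kdc_cleanup_cmds: print one non-interactive kadmin command per Isilon
# principal. Hypothetical helper; expects admin_principal,
# isilon_smartconnect and REALM to be set as in the commands above.
kdc_cleanup_cmds() {
  for svc in hdfs HTTP; do
    echo "kadmin -p $admin_principal -q 'delete_principal -force $svc/$isilon_smartconnect@$REALM'"
  done
}
```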

 

d) Remove KDC as an Isilon authentication provider

 

isi zone zones modify --zone=$isilon_zone --remove-auth-provider=krb5:$REALM

isi auth krb5 delete --provider-name=$REALM

 

Clean up clients using Ambari

 

Press "Disable Kerberos" in Admin -> Kerberos. All the services should come up green.