With the recent publication on this blog of the high-level overviews of deploying Kerberos authentication against Isilon and Hadoop, I thought I'd return and discuss some of the configuration considerations and methodologies used within OneFS to facilitate Kerberized Hadoop on Isilon.

 

One of the cornerstones of this implementation is leveraging Active Directory's ability to provide UNIX identities for users, alongside their normal SIDs, through additional schema attributes that comply with RFC 2307. Using these attributes, we can simplify user mapping and identity management on Isilon from a permissions management perspective. The rfc2307 extension is definitely not the only way to achieve this, but it does provide an elegant and simple solution.

 

Let's discuss some of the considerations on OneFS when implementing Kerberized Hadoop with AD.

 

 

PREREQUISITES

  • The cluster must be joined correctly to the target Active Directory.
  • The Access Zone that the HDFS root lives under must be configured for this Active Directory provider.
  • All IP addresses within the required SmartConnect zone must be added to reverse DNS with the same FQDN used for the cluster delegation.
  • Isilon will leverage the Active Directory schema extension that supports UNIX identities, known as Microsoft Services for UNIX (SFU) or Microsoft Identity Management for UNIX. These schema attributes extend Active Directory objects to provide UIDs and GIDs for user accounts in Active Directory.
  • Users running Hadoop jobs must be Active Directory user principals with UNIX attributes allocated.
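
The reverse DNS prerequisite is easy to get wrong on a SmartConnect zone with many IP addresses, so it can help to check it programmatically. The following is a minimal Python sketch, not an Isilon tool: the function names are hypothetical, and it assumes it runs on a host that uses the same DNS servers as the Hadoop nodes.

```python
import socket


def ptr_name(ip):
    """Return the reverse DNS (PTR) name for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


def find_reverse_dns_mismatches(ips, expected_fqdn, lookup=ptr_name):
    """Map each IP whose PTR name differs from expected_fqdn to the name found.

    Names are compared case-insensitively and without a trailing dot.
    `lookup` is injectable so the check can be exercised without live DNS.
    """
    expected = expected_fqdn.rstrip(".").lower()
    mismatches = {}
    for ip in ips:
        name = lookup(ip)
        normalized = name.rstrip(".").lower() if name else None
        if normalized != expected:
            mismatches[ip] = name
    return mismatches
```

Calling find_reverse_dns_mismatches with the list of SmartConnect zone IP addresses and the zone's FQDN returns an empty dictionary when every address resolves back to that FQDN. Any entries that come back are the addresses still to be fixed in reverse DNS.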

 

 

OneFS ACTIVE DIRECTORY SETTINGS

To enable Kerberized Hadoop authentication operations where Active Directory is the authentication authority, a couple of advanced options need to be enabled on the Active Directory provider.

 

1.png

 

 

From the Isilon WebUI: Access > Authentication Providers > Active Directory > View Details > Advanced Active Directory Settings

 

  • Enable rfc2307: this leverages the Identity Management for UNIX attributes in the Active Directory schema.
  • Map user/group into primary domain: Yes. Without this setting, the domain name must be prefixed to the user name at login.

 

The example below shows the advanced Active Directory settings used for the test domain FOO.COM. If the status indicator appears in any color other than green, the Active Directory provider is out of synchronization with OneFS and will need to be restored before continuing.

 

2.png

 

Currently, enabling rfc2307 for SFU support can be managed from the CLI, but the assume default domain switch is missing from the CLI. It will likely return in a maintenance release (MR) shortly.

 

# isi auth ads modify --sfu-support=rfc2307 FOO.COM

# isi auth ads view --provider-name=FOO.COM -v

 

5.png

 

Having enabled these features, we can validate that lookups are working for both the short and long names:

# isi auth mapping token --user=administrator --zone=rip2-cd1

# isi auth mapping token --user=administrator@FOO.COM --zone=rip2-cd1

 

 

6.png

 

7.png
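
When several accounts need the same check, the short-name and long-name lookups can be generated rather than typed by hand. The following is a minimal Python sketch: the helper name token_lookup_commands is hypothetical, and it only builds the command lines using the --user and --zone options shown above; it does not run them.

```python
import shlex


def token_lookup_commands(user, domain, zone):
    """Build the short-name and long-name `isi auth mapping token` commands."""
    names = [user, f"{user}@{domain}"]
    return [
        ["isi", "auth", "mapping", "token", f"--user={name}", f"--zone={zone}"]
        for name in names
    ]


# Print both lookups for the test domain used in this post.
for argv in token_lookup_commands("administrator", "FOO.COM", "rip2-cd1"):
    print(shlex.join(argv))
```

Each argument list could be passed to subprocess.run on a cluster node; if both lookups return the same token, the primary domain mapping is working for that user.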

 

 

 

 

SFU-RFC2307 ENABLEMENT ON THE ACTIVE DIRECTORY PROVIDER

By leveraging the Active Directory provider with SFU support for rfc2307 enabled, we maintain consistent user and identity mapping between the users executing Hadoop jobs and Isilon. This allows a standard Isilon permissioning model that uses the OneFS permission model with POSIX file permissions. Without SFU-rfc2307 support, Isilon would need to leverage user mapping to a different provider, such as LDAP, that can supply UNIX UIDs and GIDs for the user.

 

I will discuss the permissioning model specifically in an upcoming post, but for great background, check out the following series of multiprotocol posts I coauthored a while back.

 

 

 

But what is enabling SFU-rfc2307 doing for us? The short answer is that it provides UIDs and GIDs from Active Directory for our AD user accounts. Since our access tokens now contain directory-service-based UIDs and GIDs as well as SIDs, we can permission directly against these AD identities to support full multiprotocol access.

 

 

User in Active Directory: 3.png (the user's UNIX ID as seen in Active Directory)

Isilon user access token: 4.png (the user token as seen on Isilon)

 

 

The token validates that the Active Directory provider is pulling the correct information from Active Directory and that the UNIX identities are present.

 

Since AD is now providing the correct UID for the users running jobs, the on-disk permissions will be based on UIDs and GIDs (as can be seen in the token), and the permission model can easily be based on POSIX-authoritative permissions and managed with existing tools: chown and chmod.
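
To confirm that files written by a job carry the AD-supplied identities, the numeric owner and mode can be read back from any client that mounts the cluster. The following is a minimal Python sketch, assuming the path is reachable through a mount of the cluster (for example over NFS); the function name describe_ownership is illustrative only.

```python
import os
import stat


def describe_ownership(path):
    """Return the numeric owner, numeric group, and POSIX mode string of a path."""
    st = os.stat(path)
    return {
        "uid": st.st_uid,                      # should match the UID in the user's token
        "gid": st.st_gid,                      # should match the primary GID in the token
        "mode": stat.filemode(st.st_mode),     # e.g. the form shown by `ls -l`
    }
```

If the UID and GID returned here match those shown in the user's access token, the on-disk identity is the one Active Directory supplied, and chown and chmod can be used against it in the usual way.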

 

This wraps up the high-level overview of how to leverage the Active Directory provider for Kerberized Hadoop access using the SFU-rfc2307 extension in AD.

 

 

Next up: Isilon permissioning strategies for multiprotocol access with Kerberized Hadoop.

 

 

 

 

 

 

russ_stevenson
