With the recent publication on this blog of the high-level overviews of deploying Kerberos authentication against Isilon and Hadoop, I thought I'd return and discuss some of the considerations around the configuration and methodologies used within OneFS to facilitate Kerberized Hadoop on Isilon.
One of the cornerstones of this implementation is leveraging Active Directory's ability to provide UNIX identities for users, in addition to normal SIDs, through additional schema attributes that comply with RFC 2307. Using these additional features, we can simplify user mapping and identity management on Isilon from a permissions management perspective. Using the RFC 2307 extension is definitely not the only method to achieve this, but it does provide an elegant and simplified solution.
Let's discuss some of the considerations on OneFS when implementing Kerberized Hadoop with AD.
- The cluster must be joined correctly to the target Active Directory.
- The Access Zone the HDFS root lives under must be configured to use this Active Directory provider.
- All IP addresses within the required SmartConnect zone must be added to reverse DNS with the same FQDN for the cluster delegation.
- Isilon will leverage the Active Directory schema extensions that support UNIX identities, known as Microsoft Services for UNIX (SFU) or Microsoft Identity Management for UNIX. These schema attributes extend Active Directory objects to provide UIDs and GIDs for a user account in Active Directory.
- Users running Hadoop jobs are Active Directory user principals with UNIX attributes allocated.
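The reverse DNS requirement in the list above is easy to get wrong when a SmartConnect zone has many addresses. A minimal Python sketch of how it could be checked follows. The function names and the sample addresses are hypothetical, and the check assumes the machine running it uses the same DNS servers as the Hadoop clients.

```python
import socket


def default_reverse_lookup(ip):
    """Return the primary PTR name for an IP using the system resolver."""
    return socket.gethostbyaddr(ip)[0]


def check_reverse_dns(ips, expected_fqdn, lookup=default_reverse_lookup):
    """Return {ip: True/False}, True when the PTR name matches expected_fqdn.

    The comparison ignores case and a trailing dot. An address with no
    PTR record (the lookup raises OSError) is reported as False.
    """
    want = expected_fqdn.rstrip(".").lower()
    results = {}
    for ip in ips:
        try:
            name = lookup(ip)
        except OSError:
            # socket.herror and socket.gaierror are subclasses of OSError
            results[ip] = False
            continue
        results[ip] = name.rstrip(".").lower() == want
    return results
```

Calling `check_reverse_dns(["10.0.0.1", "10.0.0.2"], "hdfs.foo.com")` with the real pool addresses and SmartConnect zone name (the values here are placeholders) returns a dictionary in which any `False` entry points at an address that still needs a matching PTR record.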
OneFS ACTIVE DIRECTORY SETTINGS
In order to enable Kerberized Hadoop authentication operations where Active Directory is the authentication authority, a couple of advanced options need to be enabled on the Active Directory provider.
From the Isilon WebUI:
Advanced Active Directory Settings
- Enable - rfc2307: This leverages the Identity Management for UNIX services in the Active Directory schema
- Map user/group into primary domain: Yes. Without this setting, the domain name will need to be prefixed to the user name during login.
The example below shows the advanced Active Directory settings used for the test domain FOO.COM. If the status indicator appears in any color other than green, Active Directory is out of synchronization with OneFS, and synchronization will need to be restored before continuing.
Currently, enabling rfc2307 for SFU support can be managed from the CLI, but the assume default domain switch is missing from the CLI. It will likely return in a maintenance release (MR) shortly.
#isi auth ads modify --sfu-support=rfc2307 FOO.COM
#isi auth ads view --provider-name=FOO.COM -v
Having enabled these features, we can validate that lookups work for both the short and long names:
#isi auth mapping token --user=administrator --zone=rip2-cd1
#isi auth mapping token --user=administrator@FOO.COM --zone=rip2-cd1
SFU-RFC2307 Enablement on the Active Directory Provider
By leveraging the Active Directory provider with SFU support for rfc2307 enabled, we maintain a consistent user and identity mapping between users executing Hadoop jobs and Isilon. This allows us to implement a standard OneFS permission model based on POSIX file permissions. Without SFU-rfc2307 support, Isilon will need to rely on user mapping to a different LDAP provider that can supply UNIX UIDs and GIDs for the user.
I will discuss the permissioning model specifically in an upcoming post, but for great background, check out the following series of multiprotocol posts I coauthored a while back.
- Multiprotocol Concept Series Part 1: Overview
- Multiprotocol Concepts Series part 2: Access Tokens, User Mapping, and ID Mapping: Covers access tokens, user mapping, ID mapping, and briefly touches on directory services and on-disk identity.
- Multiprotocol Concepts Series part 3: On-disk identity: Covers on-disk identity, including how OneFS determines on-disk identity and handles different types of identity across directory services.
- Multiprotocol Concepts Series part 4: Isilon file access checking: how OneFS presents protocol-specific views of permissions so that NFS exports display mode bits and SMB shares show ACLs.
- Multiprotocol Concepts Series part 5: Troubleshooting permissions issues and the permissions repair job: Covers how to troubleshoot permissions issues and how to use the OneFS Job Engine's permissions repair job.
So what does enabling SFU-rfc2307 do for us? The short answer is that it provides UIDs and GIDs from Active Directory for our AD user accounts. Since our access tokens now contain a directory-service-based UID, GID, and SID, we can permission directly against these AD identities to support full multiprotocol access.
User in Active Directory
Isilon User Access Token
User’s UNIX ID as seen in Active Directory
User Token as seen on Isilon
The token validates that the Active Directory provider is pulling the correct information from Active Directory and that the UNIX identities are present.
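Before looking at the token on Isilon, it can help to confirm that the AD account actually carries the UNIX attributes. The Python sketch below assumes the account has already been fetched as a plain dictionary of attribute names to values (for example from an LDAP query, which is not shown). `uidNumber` and `gidNumber` are the attribute names defined by RFC 2307. The function name and the sample account are hypothetical.

```python
# RFC 2307 attribute names that carry the numeric UNIX identity.
REQUIRED_UNIX_ATTRS = ("uidNumber", "gidNumber")


def missing_unix_attrs(entry):
    """Return the RFC 2307 identity attributes absent or empty in entry.

    entry is a dict of attribute name -> value, as it might be returned
    by an LDAP query against Active Directory (the query is not shown).
    """
    missing = []
    for attr in REQUIRED_UNIX_ATTRS:
        value = entry.get(attr)
        if value is None or str(value).strip() == "":
            missing.append(attr)
    return missing


# Hypothetical account used only for illustration.
sample_user = {"sAMAccountName": "hduser", "uidNumber": "10001", "gidNumber": "10001"}
print(missing_unix_attrs(sample_user))
```

An empty list means both identifiers are populated. Any attribute listed would need to be set in Active Directory before the Isilon token can show a directory-provided UID or GID for that user.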
Since AD is now providing the correct UID for the users running jobs, the on-disk permissions will be based on UIDs and GIDs (as can be seen in the token), and the permission model can easily be based on POSIX-authoritative permissions and managed with existing tools: chown and chmod.
This wraps up the high-level overview of how to leverage the Active Directory provider for Kerberized Hadoop access by using the SFU-rfc2307 extension in AD.
Next up: Isilon permissioning strategies for multiprotocol access with Kerberized Hadoop.
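As an illustration of managing these permissions with standard tools, here is a minimal Python sketch of the chown plus chmod pair using the standard library. The helper name is hypothetical, and the path, UID, GID, and mode are whatever your own layout calls for. In practice the same thing is usually done with the chown and chmod commands themselves.

```python
import os
import stat


def apply_posix_permissions(path, uid, gid, mode):
    """Set owner, group, and mode bits on path, then report what is on disk.

    Equivalent to running chown uid:gid followed by chmod mode.
    Changing the owner to another user normally requires root.
    """
    os.chown(path, uid, gid)
    os.chmod(path, mode)
    st = os.stat(path)
    return st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode)
```

For example, `apply_posix_permissions(path, uid, gid, 0o750)` applied to a user's HDFS home directory, with the UID and GID taken from that user's token, gives the owner full access and the group read and execute access.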