
Thank you! The Ask the Expert session featuring the EMC Isilon Information Experience team was a lively exchange. It was great to share experiences and ideas about Isilon content with you. In this follow-up post, we'll update you about what actions we're taking based on your suggestions.

Finding just what you need, quickly

One item that came through loud and clear is that we produce a lot of dense information about Isilon products. Those of you who manage several EMC platforms can get inundated, and separating what’s important from what can wait can be quite a task. We should make it a lot easier and faster for you to find just what’s relevant to you. We’ve created a proposal for doing that, and it is now in review.

Cool Commands videos

Thanks for the great ideas for the new Cool Commands video series. The first batch of videos will cover broad commands, and you’ll see more specific commands like isi statistics covered in subsequent videos.

When in doubt, script with the OneFS API

Command line interface (CLI) commands can change from release to release and that can be a source of concern. We have the answer for you: the OneFS Platform API (PAPI). Even if the underlying CLI commands change for a particular release, PAPI scripting will be consistent from release to release. Check out the OneFS API Info Hub and OneFS Uptime Hub for information about getting started.

New! Updated! Lookie!

Oh yes, those New! and Updated! tags on our Info Hubs can be distracting. Special notices can really clutter up the content, too. You let us know that if tags and notices are left up for too long, they can be confusing. So we’ll remove New! and Updated! tags after 2 weeks and remove notices as soon as they’re no longer relevant.

Tell us more

Your feedback is invaluable – don’t be shy about letting us know about your experience accessing and using EMC Isilon technical content. Email us at isicontent@emc.com with your feedback. And thank you!

There have been a couple of recent inquiries from the field around OneFS read cache persistence, and I felt the topic might be of interest to a wider audience.

 

Firstly, a quick OneFS caching architecture review…


Cache_architecture_1.jpg

 

The Level 1 cache (L1), or front-end cache, is memory that is nearest to the protocol layers (e.g. NFS, SMB) used by clients, or initiators, connected to that node. The primary purpose of L1 cache is to prefetch data from remote nodes.

 

Level 2 cache (L2), or back-end cache, refers to local memory on the node on which a particular block of data is stored. L2 cache is globally accessible from any node in the cluster and is used to reduce the latency of a read operation by avoiding a seek on the disk drives.

 

Also known as SmartFlash, the level 3 cache (L3) refers to a subsystem which caches evicted L2 blocks on one or more SSDs on the node owning the L2 blocks. Unlike L1 and L2, not all nodes or clusters have an L3 cache, since it requires solid state drives (SSDs) to be present and exclusively reserved and configured for caching use.

 

L3 serves as a cost-effective method of extending a node’s cache from gigabytes to terabytes. This allows clients to retain a larger working set of data in cache, before being forced to retrieve data from higher latency spinning disk (HDD). The L3 cache is populated with “interesting” L2 blocks that are being dropped from memory. Since L3 is based on persistent flash storage, once the cache is populated, or warmed, it’s highly durable and persists across node reboots, etc.


Cache Eviction

 

Having an efficient eviction and replacement policy is absolutely vital for cache performance. This is evident in OneFS, where each level of the cache hierarchy employs a different strategy for eviction, tailored to the attributes of that cache type.

 

For L1 cache in storage nodes, cache aging is based on a drop-behind algorithm. The L2 cache uses a Least Recently Used algorithm (LRU), since it is relatively simple to implement, low-overhead, and performs well in general. By contrast, the L3 cache employs a first-in, first-out eviction policy (or FIFO), since it’s writing to what is effectively a specialized linear filesystem on SSD.

 

Note: One drawback of LRU is that it is not scan resistant. For example, a OneFS Job Engine job or backup process that scans a large amount of data can cause the L2 cache to be flushed. This can be mitigated to a large degree by the L3 cache.
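To make the difference between the two policies concrete, here is a minimal Python sketch. It is a toy model only, not OneFS code: the class names, the capacities, and the `scan_flushes_lru` helper are all assumptions made for illustration. It shows an LRU cache reordering blocks on a hit, a FIFO cache ignoring hits when choosing a victim, and a one-pass scan pushing a warmed hot set out of an LRU cache.

```python
from collections import OrderedDict, deque


class LRUCache:
    """Toy LRU cache (hypothetical): a hit moves the block to the most-recent end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # oldest entry first

    def access(self, block):
        """Return True on a hit; on a miss, insert and evict the least recently used block."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return False


class FIFOCache:
    """Toy FIFO cache (hypothetical): blocks leave in insertion order; hits do not reorder."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.order = deque()   # oldest entry on the left
        self.members = set()

    def access(self, block):
        """Return True on a hit; on a miss, insert and evict the oldest inserted block."""
        if block in self.members:
            return True
        self.order.append(block)
        self.members.add(block)
        if len(self.order) > self.capacity:
            self.members.discard(self.order.popleft())
        return False


def scan_flushes_lru(capacity=4):
    """Warm an LRU cache with a hot set, run a large one-pass scan, return surviving hot blocks."""
    cache = LRUCache(capacity)
    hot = ["hot%d" % i for i in range(capacity)]
    for block in hot:
        cache.access(block)
    for i in range(capacity * 10):
        cache.access("scan%d" % i)
    return [b for b in hot if b in cache.blocks]
```

In this model, calling `scan_flushes_lru()` returns an empty list: every hot block has been displaced by scan blocks that are never read again, which is the behavior the note describes.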

 

Cache Persistence

 

Effective caching is all about keeping hot data hot! So the most frequently accessed data and metadata on a node should just remain in L2 cache and not get evicted to L3.

 

For the next tier of cached data that’s accessed frequently enough to live in L3, but not frequently enough to always live in RAM, there’s a mechanism in place to keep these semi-frequently accessed blocks in L3.

 

To maintain this L3 cache persistence, when the kernel goes to read a metadata or data block, the following process occurs:

 

1) First, L1 cache is checked. Then, if no hit, L2 cache is consulted.

 

2) If a hit is found in memory, it’s done.

 

3) If not in memory, L3 is then checked.

 

4) If there’s an L3 hit, and that item is near the end of the L3 FIFO (last 10%), a flag is set on the block which causes it to be evicted into L3 again when it is evicted out of L2.

 

This marking process helps guard against the chronological eviction of blocks that are accessed while they are in the last 10% of the cache, and serves to keep most of the useful data in cache.
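The read path and the marking step above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the OneFS implementation: L1 and L2 are collapsed into one in-memory LRU tier, every block dropped from memory is treated as "interesting", and the class name `TieredReadCache`, its parameters, and the return strings are all hypothetical.

```python
from collections import OrderedDict


class TieredReadCache:
    """Toy model (hypothetical): in-memory LRU tier in front of a FIFO L3 tier."""

    def __init__(self, l2_capacity, l3_capacity, tail_fraction=0.10):
        self.l2_capacity = l2_capacity
        self.l3_capacity = l3_capacity
        self.tail_fraction = tail_fraction  # "last 10%" of the FIFO
        self.l2 = OrderedDict()  # block -> re-insert flag; least recently used first
        self.l3 = OrderedDict()  # block -> True; oldest inserted first

    def _near_l3_tail(self, block):
        """True if the block sits in the oldest tail_fraction of the L3 FIFO."""
        tail_len = max(1, int(len(self.l3) * self.tail_fraction))
        return block in list(self.l3)[:tail_len]

    def _insert_l3(self, block):
        """Write the block at the head of the FIFO, evicting the oldest entries if full."""
        self.l3.pop(block, None)
        self.l3[block] = True
        while len(self.l3) > self.l3_capacity:
            self.l3.popitem(last=False)

    def _insert_l2(self, block, reinsert):
        """Load a block into memory; blocks dropped from memory may be written to L3."""
        self.l2[block] = reinsert
        while len(self.l2) > self.l2_capacity:
            victim, flagged = self.l2.popitem(last=False)
            # Flagged blocks are written to L3 again; unflagged blocks are
            # written only if L3 does not already hold a copy.
            if flagged or victim not in self.l3:
                self._insert_l3(victim)

    def read(self, block):
        """Return where the block was found: 'memory', 'l3', or 'disk'."""
        if block in self.l2:                      # steps 1 and 2: memory hit
            self.l2.move_to_end(block)
            return "memory"
        if block in self.l3:                      # step 3: L3 hit
            flag = self._near_l3_tail(block)      # step 4: mark if near the FIFO end
            self._insert_l2(block, reinsert=flag)
            return "l3"
        self._insert_l2(block, reinsert=False)    # miss everywhere: read from disk
        return "disk"
```

With `l2_capacity=1` and `l3_capacity=3`, reading blocks a, b, c, d and then a again finds a in L3 at the end of the FIFO and flags it. When a later read pushes a out of memory, a is written to the head of L3 instead of ageing out, so the next read of a is still an L3 hit rather than a disk read.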

 

 

Read process on local node

 

Cache_architecture_2.jpg

 

 

Read process on remote node

 

Cache_architecture_3.jpg

Isilon Support is now breaking into the social space. As you may or may not have noticed in the past couple of months, Isilon Support engineers have been actively participating in your Isilon Community questions. If you'd like to know more about us, check us out at sjogrd, Shane Dekart, and johnsonka here on the Community! Send us a message or ask us your questions; we are here to help! Aside from the three Support Engineers who are here for your questions, the Isilon team is breaking into other social media. Check us out!

IDTV.PNG.png

Isilon has put together a comprehensive video playlist that really gets you involved in managing your cluster. The playlist covers a wide variety of topics ranging from SmartPools to InsightIQ. Check out what we have available today! [New content is added from time to time, so check back or subscribe to notifications to see new videos!]

 

If you have an idea for a video or there is a video you would like to see created, please send us your ideas! Email us at isicontent@emc.com, let us know on Twitter @EMCSupport, or leave your comments below!

 

 

 

 

EMC-Cluster-Talk-v02.jpg

If videos aren't your thing, there is always the ClusterTalk podcast to satisfy your desire to learn more about your Isilon cluster's inner workings, tips, tricks, or gotchas.

 

Made by and for people who work with Big Data, ClusterTalk explores how next-gen data storage touches down in real-world innovation. Hosts Scott Pinzon and cadiletta, along with guest experts, also help EMC Isilon scale-out NAS users get the most performance, efficiency, and insight from their OneFS clusters.

 

 

 

Each monthly episode will be posted here on the Community along with the show notes and conversations.

 

Subscribe to the podcast on iTunes: EMC Isilon ClusterTalk Podcast by EMC Corporation

Listen on Stitcher: EMC Isilon ClusterTalk Podcast

 

Or our PodBean hosting site - EMC Isilon ClusterTalk Podcast

 

If there is ever a topic you'd like to hear our experts chime in on in this podcast, please let us know! The ClusterTalk team can be reached at clustertalk@emc.com or by leaving your comments below. Also, if you tweet us @EMCSupport we would be more than happy to pass your feedback along.

ECN.png

 

Last, but certainly not least, we're on Twitter! If you have a question, comment, or concern, or are looking for an Isilon-related chat, you can reach out to us @EMCSupport or @EMCIsilon.

 

Questions, comments, or concerns for me, David, or Shane? Leave your thoughts below or reach out to us in any of the communication methods above. Let's get social!

katie.png

 

Katie Johnson

Technical Support Engineer II/Administration Team Coach

Twitter: @EMCSupport

shane.png

 

Shane Dekart

Technical Support Engineer II

Twitter: @EMCSupport

dave.png

 

David Sjogren

Technical Support Engineer II/Windows Protocols Team Coach

Twitter: @EMCSupport


This blog post is the first in a series covering basic multiprotocol concepts in OneFS. The goal is to offer a simple and clear explanation of how OneFS handles multiprotocol data access. The first blog posts in this series will cover high-level concepts. Subsequent blog posts will dive a little deeper and provide examples.

 

If you find multiprotocol data access in the Isilon OneFS operating system confusing, you’re not alone. Every network-attached storage (NAS) platform approaches multiprotocol data access differently, and there is no formal industry standard for how to implement multiprotocol access in a file system. Each vendor integrates permission and security models in its own way to enable access to the same file through different protocols.

 

So what does multiprotocol mean in OneFS? Essentially, it means ensuring the consistency of secured data access, regardless of protocol. Different users, operating systems, and implementations can write and read the same files on the cluster.

 

Setting the stage: single protocol vs. multiprotocol

To highlight the benefits of multiprotocol data access, let’s first focus on the differences between single protocol access and multiprotocol access.

 

Single data access protocols are self-contained. Windows users access Windows file servers through the Server Message Block (SMB) or Common Internet File System (CIFS) protocol. UNIX users access file servers through the Network File System (NFS) protocol. When a user connects to a file server to read and write files, the protocol assesses the files’ security against a set of permissions to determine whether access will be allowed. Each protocol has its own way of representing users and file permissions, which prevents a UNIX user from accessing Windows file servers, and vice versa. Each protocol is a closed system.

 

Multiprotocol access puts the NAS platform in the middle, creating a system where different users can connect to the same file server (or cluster) through different protocols. The multiprotocol NAS platform handles and stores the permissions for each protocol and user.

multiprotocol overview_nas.jpg

 

In OneFS, multiprotocol means that users who connect through NFS, SMB, and other protocols can access the same files and directories. If necessary, you can create a file or a directory that can be accessed only by a Windows or UNIX client. But unlike other file systems or NAS systems, which might maintain protocol permissions separately or rely on user mapping, Isilon OneFS uses a single unified permission model. This is the key to understanding multiprotocol access in OneFS.

multiprotocol overview.jpg

 

The unified permission model is implemented by creating a common access token. The access token is generated when a user connects to the cluster. In OneFS, your identity (or multiple identities from different directory services) is encapsulated into a single token that represents you to OneFS. The access token contains your user identifier (UID), user security identifier (SID), Windows group memberships (SIDs), group identifiers (GIDs) from LDAP group memberships, and more. All those identities are rolled into one, contained in the token. This token is then presented directly against the file permissions stored on the OneFS file system.
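As a rough mental model of the idea, the Python sketch below rolls several identities into one token and checks that token against the identities a file allows. Everything in it is an assumption for illustration: the `AccessToken` class, its field names, the `is_allowed` helper, and the sample UID, GID, and SID values are hypothetical, and the real OneFS permission evaluation is considerably richer than a set intersection.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical token: every identity of one user, gathered in one place."""
    uid: int
    user_sid: str
    gids: frozenset = field(default_factory=frozenset)
    group_sids: frozenset = field(default_factory=frozenset)

    def identities(self):
        """Return all identities in the token as (kind, value) pairs."""
        ids = {("uid", self.uid), ("sid", self.user_sid)}
        ids.update(("gid", gid) for gid in self.gids)
        ids.update(("sid", sid) for sid in self.group_sids)
        return ids


def is_allowed(token, allowed_identities):
    """Toy check: access is granted if any identity in the token is allowed on the file."""
    return not token.identities().isdisjoint(allowed_identities)


# Sample values below are made up for the example.
token = AccessToken(
    uid=1001,
    user_sid="S-1-5-21-1-2-3-1105",
    gids=frozenset({2000}),
    group_sids=frozenset({"S-1-5-21-1-2-3-513"}),
)
unix_style_file = {("gid", 2000)}                      # permission granted to a UNIX group
windows_style_file = {("sid", "S-1-5-21-1-2-3-513")}   # permission granted to a Windows group
```

Here `is_allowed(token, unix_style_file)` and `is_allowed(token, windows_style_file)` are both true for the same token, which is the point of the unified model: the user is evaluated once, with all identities, whichever protocol was used to connect.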

Takeaways

Here are some highlights:

  • Every NAS platform implements multiprotocol differently. No industry standard exists.
  • Multiprotocol in OneFS refers to consistent file access regardless of protocol.
  • The key to how multiprotocol works in OneFS is the unified permission model.
  • An access token in OneFS contains all identities associated with a single user.
  • The access token is presented against file permissions stored in OneFS to define file access.

Coming soon

The next blog post in this series will expand on the multiprotocol concepts covered here and will address common questions about generating access tokens, on-disk identities, user mapping, and directory services. Additional posts will address how OneFS stores file and share permissions, POSIX permissions, access control list (ACL) policies, and how to check permissions. The following common multiprotocol commands will also be covered in more detail:

  • ls -le / ls -led
  • ls -len / ls -lend
  • chmod / chown
  • isi auth mapping token

 

Tell us what you think of this article. Was this level of information useful? Do you have questions that you would like us to cover in future blog posts? Let us know by leaving a comment.

Isilon InsightIQ 3.2: now a more customizable version! In his recent blog post, Patrick Kreuch goes over all the cool new features of 3.2. If you're interested, you can find it here: http://isiblog.emc.com/2015/07/insightiq-3-2-cool-new-stuff/

 

All of the cool new features aside (they are pretty cool, so this may be hard), InsightIQ 3.2 is the most usable version yet for managing your cluster statistics. While there is no option to limit the time frame for your performance statistics, you can now be sure the ones you collect are the ones you want. Are you concerned about the disk IOPS on your cluster? No? Well, stop collecting that statistic. Are you concerned with Per File heat maps? No? Get rid of them! In this mini tutorial of sorts, I'd like to walk you through customizing what your InsightIQ instance collects from your Isilon clusters. I will also cover what the configurable data sets mean so you can decide whether you need them in your data gathering.

 

It's as easy as accessing your InsightIQ Web Interface and disabling what you don't need. Start on the Settings page logged in as your administrative account:

 

SettingsPage.PNG.png

 

Click on Configure next to the cluster in question (mine is called groot). On the resulting page, click on Data Set Configuration:

 

DataSetConfiguration1.PNG.png

It's the last tab in your options, highlighted here with a stellar red arrow. Once you click that you'll be here:

 

DataSetConfiguration.PNG.png

 

From this page you can disable or enable any data set you wish! Note: for the most part, these are performance or "live" reporting modules. The exception is Quotas, which are in the File System Reporting tab. (Note 2: You can also see if you have any delayed data sets here, like my job statistics in the groot example.)

 

Don't know which data sets are important for your business monitoring? Here is a quick breakdown of what they cover:

  • Active Client Count
    • Active Clients
      • Displays the number of unique client addresses generating protocol traffic on the monitored cluster. Clients that are connected, but not generating any traffic, are not counted.
    • Clients Summary
      • Displays clients that are currently consuming the most bandwidth.

 

  • Aggregated External Network Counters
    • External Network Packets Rate
      • Displays the total number of packets that passed through the external network interfaces in the monitored cluster. You can optionally break out this data by direction, interface, or node.
    • External Network Throughput Rate
      • Displays the total amount of data that passed through the external network interfaces in the monitored cluster. You can optionally break out this data by interface, direction, client, operation class, protocol, or node.

 

  • CPU usage
    • CPU % Use
      • Displays the average CPU usage for all nodes in the monitored cluster.

 

  • Connected Clients
    • Connected Clients
      • Displays the number of unique client addresses with established TCP connections to the cluster on known ports.

 

  • Deduplication
    • Deduplication Summary (Physical)
      • Displays the amount of space that deduplication has saved on the cluster and the amount of data that has been deduplicated. This module refers to the estimated physical space and data.
    • Deduplication Summary (Logical)
      • Displays the amount of space that deduplication has saved on the cluster and the amount of data that has been deduplicated. This module refers to the logical space and data.

 

  • Disk Performance
    • Disk Activity
      • Displays the average percentage of time that disks in the cluster spend performing operations instead of sitting idle.
    • Disk Operations Rate
      • Displays the average rate at which the disks in the cluster are servicing data read/write/change requests, also referred to as operations or disk transfers.
    • Disk Throughput Rate
      • Displays the total amount of data being read from and written to the disks in the cluster. You can optionally break out this data by disk, direction, or node.
    • Average Disk Hardware Latency
      • Displays the average amount of time it takes for the physical disk hardware to service an operation or transfer.
    • Average Disk Operation Size
      • Displays the average size of the operations or transfers that the disks in the cluster are servicing.
    • Average Pending Disk Operations Count
      • Displays the average number of operations or transfers that are in the processing queue for each disk in the cluster.
    • Pending Disk Operations Latency
      • Displays the average amount of time that disk operations spend in the input/output scheduler.
    • Slow Disk Access Rate
      • Displays the rate at which slow (long-latency) disk operations occur.

 

  • Disk Storage
    • Cluster Capacity
      • Displays the amount of storage on the cluster.
    • Total Capacity
      • The total amount of storage capacity on the cluster.
    • Total Used Physical Space
      • Displays the total amount of physical space that is currently being used on the monitored cluster.

 

  • Events
    • Event Summary
      • Displays events generated by the monitored cluster during the specified time range.
    • Event Timeline
      • The horizontal red event timeline represents the same period of time that is currently represented in the performance charts.

 

  • External NIC counters
    • External Network Errors
      • Displays the number of errors generated for the external network interfaces.

 

  • IFS Cache Performance
    • L1 Cache Throughput Rate, L2 Cache Throughput Rate, L3 Cache Throughput Rate, L1 and L2 Cache Prefetch Throughput Rate, Overall Cache Throughput Rate
    • Average Cached Data Age
      • Indicates the average amount of time that data has been in the L1 and L2 caches.

 

  • IFS Operation counters
    • Blocking File System Events Rate
      • Displays the number of file blocking events occurring in the file system per second.
    • Contended File System Events Rate
      • Displays the number of file contention events, such as lock contention or read/write contention, occurring in the file system per second.
    • Deadlocked File System Events Rate
      • Displays the number of file system deadlock events that the file system is processing per second.

 

  • IFS Throughput
    • File System Events Rate
      • Displays the number of file system events, or operations, (such as read, write, lookup, or rename) that the file system is servicing per second.
    • File System Output Rate
      • Displays the rate at which data is being read from the file system.
    • File System Throughput Rate
      • Displays the rate at which data is being read from and written to the file system.

 

  • IFS Usable Capacity
    • Allocated Capacity
      • The storage capacity of nodes that belong to node pools that include three or more nodes.
    • Writable Capacity
      • The amount of storage capacity that user data can be written to on the cluster.

 

  • IFS Usage
    • User Data Including Protection
      • The amount of storage capacity that is occupied by user data and protection for that user data.

 

  • Job Engine Statistics
    • Jobs
      • Displays the number of active and inactive jobs on the cluster.
    • Job Workers
      • Displays the number of active and assigned workers on the cluster.

 

  • Per File Operation Counters
    • More granular Operation Counters.

  • Per Protocol Performance
    • Protocol Operation Average Latency
      • Displays the average amount of time required for protocols to process incoming operations.
    • Protocol Operations Rate
      • Displays the total number of requests that were originated by clients for all file data access protocols.

  • Quotas
    • Quota Reporting

Once you have customized your Performance Gathering from your clusters, watch as your InsightIQ becomes more relevant and usable!
