OneFS heat statistics deep dive

NOTE: This topic is part of the Uptime Information Hub.

 

The Isilon OneFS operating system incorporates a powerful statistics engine that constantly collects statistics from various system components. In particular, detailed statistics are available for the protocols supported by the operating system, for the HDD and SSD drives in each node, and for the clients that connect to the cluster. There is also another, more unusual class of statistics, called “heat” statistics. This article covers that class in depth.

 

What are heat statistics?

 

Heat statistics exist to enable an administrator to locate hotspots within the OneFS file system, and determine which operations are being performed against which file system paths/objects. To understand how these operations work, let’s dive into the technical details.

 

Background

 

The metadata for file system objects in OneFS is stored in LINs (Logical Inode Numbers). These are similar to inodes in other POSIX-compliant file systems. A few relevant differences are that the LINs themselves, which are mirrored for protection, describe the location of the on-disk inode mirrors for each LIN, and each on-disk inode mirror contains the metadata for a file. Additionally, the OneFS LIN space is huge (2^64 possible values). Because of this, LINs are never reused, and a newly created file system object always receives a new (higher) LIN. The size of this space is what makes the heat statistics somewhat different from the other statistics classes. By contrast, any given protocol has a small, finite number of protocol operations, so statistics for latency and operations (such as read, write, and so on) can be accurately captured for each of them.

 

The number of paths in a OneFS file system is frequently measured in the billions, and therefore it is not practical to track operations occurring against every path. Of course, not every path in the system is active at the same time. Nonetheless, it is quite possible for the “working set of paths” to be large. Because of this, the heat statistics use a sampling method. A fixed-size buffer is used to track the LIN and the operations occurring against that LIN. If the buffer overflows, the sampling rate is reduced. The upshot is that the heat statistics are a statistical estimate rather than an absolute count. In other words, if the heat statistics show 50 lock operations per second against a particular LIN, there may be 50 per second, or there may be more than 50 per second. The important thing to understand is that the numbers are accurate relative to one another. For example, if one directory shows 1,000 operations per second and another shows 100 per second, you know that the former is ten times busier.
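
The sampling idea described above can be illustrated with a minimal Python sketch. It is not the OneFS implementation: the class name HeatSampler, the halving policy on overflow, and the eviction of entries whose count drops to zero are all assumptions chosen only to show how a fixed-size buffer with a shrinking sampling rate keeps counts comparable to one another.

```python
import random
from collections import Counter


class HeatSampler:
    """Toy fixed-size heat sampler (illustrative; not the OneFS implementation)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity        # max distinct (lin, event) keys tracked
        self.sample_rate = 1.0          # probability that an operation is recorded
        self.counts = Counter()
        self._rng = random.Random(seed)

    def record(self, lin, event):
        # Record the operation only with probability sample_rate.
        if self._rng.random() >= self.sample_rate:
            return
        key = (lin, event)
        if key not in self.counts and len(self.counts) >= self.capacity:
            self._reduce_rate()
            if len(self.counts) >= self.capacity:
                return                  # still full: drop this sample
        self.counts[key] += 1

    def _reduce_rate(self):
        # Assumed overflow policy: halve the sampling rate and halve the
        # existing counts so old and new samples stay on the same scale.
        # Keys whose count falls to zero are evicted, which frees space.
        self.sample_rate /= 2
        self.counts = Counter(
            {key: n // 2 for key, n in self.counts.items() if n // 2 > 0}
        )
```

In this sketch, a LIN that receives ten times the operations of another ends up with roughly ten times the count, even after the sampling rate has dropped several times. The raw counts themselves understate the true number of operations, which mirrors the point that the values are meaningful relative to one another rather than as absolutes.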

 

For LINs that map to externally visible paths, the affected file system path is displayed. Some LINs in OneFS are internal to the file system operation and therefore do not have a “/ifs/…” pathname available. In older OneFS releases, these show up as “UNKNOWN”. In newer releases, a number of them have been added to the code and show up as, for example, “SYSTEM (0x0)” for LIN 0 (the “write” LIN). For more information on cluster statistics, see Troubleshooting using cluster statistics.

 

What statistics can be measured?

 

The OneFS statistics engine implements a number of orthogonal dimensions that can be specified to narrow or sort queries. The heat statistics are no exception. There are 12 event types, grouped into five classes. Queries can be sorted by any of six parameters (class, event, LIN, node, operation rate, or path).

 

The following table details the classes and operations available:

 

Class            Operation    Description
read             read         File data reads
write            write        File data writes
namespace_read   lookup       Filename lookups
namespace_read   getattr      Read file metadata
namespace_write  rename       Rename/move file
namespace_write  link         Create hardlink
namespace_write  unlink       Unlink/delete
namespace_write  setattr      Set file metadata
other            lock         Get a LIN lock on the file
other            blocked      Lock initiator blocked because another thread holds the lock
other            contended    Lock holder informed that another thread wants the lock
other            deadlocked   LIN lock domain deadlock detected and broken
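
For scripting against exported heat data, the classes and operations listed above can be captured as a small lookup structure. The following Python sketch is illustrative only: the record layout (a dict with "event" and "ops" keys) and the function name rate_by_class are hypothetical and do not reflect any OneFS output format.

```python
# Class-to-event mapping transcribed from the classes and operations in this article.
HEAT_CLASSES = {
    "read": ("read",),
    "write": ("write",),
    "namespace_read": ("lookup", "getattr"),
    "namespace_write": ("rename", "link", "unlink", "setattr"),
    "other": ("lock", "blocked", "contended", "deadlocked"),
}

# Reverse lookup: event name -> class name.
EVENT_TO_CLASS = {
    event: cls for cls, events in HEAT_CLASSES.items() for event in events
}


def rate_by_class(records):
    """Sum the operation rate per class.

    Each record is a hypothetical dict: {"event": str, "ops": number}.
    """
    totals = {}
    for rec in records:
        cls = EVENT_TO_CLASS[rec["event"]]
        totals[cls] = totals.get(cls, 0) + rec["ops"]
    return totals
```

The mapping contains the 12 event types and five classes described earlier, so it can also serve as a quick validity check on event names found in collected data.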

 

How do I use heat statistics?

 

The heat statistics are a powerful tool. They can be extremely helpful when investigating cluster performance, but you must be careful about interpreting the output.

 

First, as mentioned earlier, the numbers are relative to one another rather than absolute. They are valuable for comparison within the same output, but comparing the operation rate from one timeframe with the rate from a different timeframe is not necessarily valid, because there is no way to determine whether the scaling factor for data collection was the same.
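
One way to keep comparisons within a single output is to convert the rates to ratios before looking at them. The Python sketch below assumes a hypothetical snapshot structure (a dict mapping a path to its sampled operations per second) and a hypothetical helper named relative_heat. It expresses each path as a fraction of the busiest path in that same snapshot.

```python
def relative_heat(snapshot):
    """Express each path's rate as a fraction of the busiest path.

    snapshot: hypothetical dict mapping path -> sampled operations per second,
    all taken from the same heat output.
    """
    if not snapshot:
        return {}
    busiest = max(snapshot.values())
    if busiest == 0:
        # No activity recorded: every path is equally idle.
        return {path: 0.0 for path in snapshot}
    return {path: ops / busiest for path, ops in snapshot.items()}
```

With the earlier example of one directory at 1,000 operations per second and another at 100, the second directory comes out at one tenth of the first. Ratios computed from two different snapshots should not be mixed, for the reason given above.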

 

One way of looking at the heat statistics is to ask whether they indicate “healthy” (normal) workflow or “unhealthy” (problematic) workflow. Some of the statistics are useful for one, and some for the other, as follows:

  • The read and write operations are the most obvious: they show whether there is heavy read or write activity against a handful of files. Similarly, since every operation in the file system requires some kind of locking, the “lock” operation is a workload indicator and not, in itself, a problem indicator.
  • The namespace operations fall somewhat in the middle. They are an indicator of workload, but if, for example, there are heavy metadata updates concentrated in a single directory, then this is likely to limit performance, and the application responsible for the workload should be modified (if possible) to spread the data over a larger number of directories to eliminate false/unnecessary sharing.
  • Finally, the “blocked”, “contended” and “deadlocked” operations all fall more toward the “problem” or at least suboptimal performance end of the spectrum.
    • Blocked and contended operations are two sides of the same coin. When a lock is already held and another thread wants the same lock, the new locker blocks, and the holder is notified of the contention. If there are high blocked/contended numbers on a path, this suggests that the workflow is not taking advantage of the scale-out nature of the cluster and will be limited in performance.
    • Deadlocked is the most expensive of all operations. The OneFS file system generally tries to avoid deadlocks, but it is written in such a way that they are expected in certain situations, and the LIN lock domain in OneFS has code to detect and break deadlocks. This is, however, very expensive, because any work has to be unwound and retried (transparently) from scratch. As such, if the deadlock events show up high in the list of heat events (for example, in the top 10), this is indicative of a serious problem. You should open a Dell EMC support case to determine the cause and help remediate the issue.
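
The interpretation rules in the list above can be summarized as a small triage routine. This Python sketch is a simplification under stated assumptions: the record layout (dicts with "path", "event", and "ops" keys), the label names, and the function name triage are hypothetical, and the default cutoff of 10 simply follows the "top 10" example given for deadlock events.

```python
WORKLOAD_EVENTS = {"read", "write", "lock"}
NAMESPACE_EVENTS = {"lookup", "getattr", "rename", "link", "unlink", "setattr"}
CONTENTION_EVENTS = {"blocked", "contended"}


def triage(records, top_n=10):
    """Rank heat records by rate and attach a rough interpretation label.

    Each record is a hypothetical dict: {"path": str, "event": str, "ops": number}.
    Returns a list of (path, event, label) tuples, busiest first.
    """
    ranked = sorted(records, key=lambda rec: rec["ops"], reverse=True)
    results = []
    for rank, rec in enumerate(ranked, start=1):
        event = rec["event"]
        if event in WORKLOAD_EVENTS:
            label = "workload"      # normal activity indicator
        elif event in NAMESPACE_EVENTS:
            label = "review"        # check for concentration in one directory
        elif event in CONTENTION_EVENTS:
            label = "suboptimal"    # lock contention limits scale-out
        elif event == "deadlocked" and rank <= top_n:
            label = "serious"       # deadlocks high in the list: open a support case
        else:
            label = "suboptimal"    # deadlocks present but not near the top
        results.append((rec["path"], event, label))
    return results
```

The labels are only a first-pass filter. A "review" entry, for example, still needs a look at whether the metadata updates are concentrated in a single directory before any conclusion is drawn.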

Conclusion

 

The OneFS heat statistics are a powerful and useful tool when looking at the performance of a OneFS cluster. We hope that this document explains the nature of the available statistics and how they should be used.