Got asked the following question from the field recently:
“If I configure a nodepool to use L3 cache, will it be overridden by the ‘Metadata SSD Strategy’ setting in a file pool policy?”
The SSDs in a OneFS nodepool can be used exclusively for L3 cache or for a SmartPools SSD strategy, but not both. The SSDs in the pool are formatted entirely differently for each of these options.
If the SSDs are reserved for L3, they will be formatted as a large linear device for use as an LRU (least recently used) read cache. Otherwise, the SSDs will be formatted as a regular storage device for use in the OneFS file system under SmartPools SSD strategies.
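To make the LRU behavior concrete, here is a minimal sketch of a least-recently-used read cache in Python. It is only a toy illustration of the eviction policy, not how OneFS implements L3. The class name `LRUReadCache`, the `fetch_from_disk` callback, and the block addresses are all hypothetical.

```python
from collections import OrderedDict


class LRUReadCache:
    """Toy LRU read cache keyed by block address (illustrative only)."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self._blocks = OrderedDict()  # least recently used entry first

    def read(self, address, fetch_from_disk):
        # Cache hit: mark the block as most recently used and return it.
        if address in self._blocks:
            self._blocks.move_to_end(address)
            return self._blocks[address]
        # Cache miss: read from the slower tier, then cache the result.
        data = fetch_from_disk(address)
        self._blocks[address] = data
        # Over capacity: evict the least recently used block.
        if len(self._blocks) > self.capacity_blocks:
            self._blocks.popitem(last=False)
        return data

    def cached_addresses(self):
        # Ordered from least to most recently used.
        return list(self._blocks)


def fake_disk_read(address):
    # Hypothetical stand-in for a read from HDD.
    return f"block-{address}"


cache = LRUReadCache(capacity_blocks=2)
for addr in (1, 2, 1, 3):
    cache.read(addr, fake_disk_read)
print(cache.cached_addresses())
```

Because a cache like this holds only copies of data that also lives elsewhere, losing it costs performance rather than data.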
You can tell how a pool’s SSDs are being utilized from the WebUI, by navigating to Dashboard > Cluster Overview > Cluster Status.
Any SSDs reserved for L3 caching will be exclusively reserved and explicitly marked as such, and their capacity will not be included in any SSD usage stats, etc.
For example, take a Gen5 cluster with two node pools:
Nodes 1-3 are X410s using their SSDs for metadata read.
Nodes 4-6 are S210s with their SSDs reserved exclusively for L3 cache.
L3 cache is enabled per node pool via a simple on or off configuration setting. Other than this, there are no additional visible configuration settings available. When enabled, L3 consumes all the SSDs in the node pool.
Please note that L3 cache is enabled by default on any new node pool containing SSDs.
The WebUI also provides a global setting to enable L3 cache by default for new node pools.
Enabling L3 cache on an existing nodepool with SSDs takes some time, since the data and metadata on the SSDs need to be evacuated to other drives before the SSDs can be formatted for caching. Conversely, disabling L3 cache is a very fast operation, since no data needs to be moved and drive reformatting can begin right away.
Although there’s no percentage completion reporting shown when converting nodepools to use L3 cache, this can be estimated by tracking SSD space usage throughout the job run. The impact policy of the FlexProtect_Plus or SmartPools job that performs the L3 conversion can also be adjusted to make the job run faster or slower.
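As a rough sketch of that estimate, the Python snippet below compares the SSD space in use when the conversion started with the space still in use now. It assumes SSD usage drains toward zero as data and metadata are evacuated. The function name and the sample byte counts are hypothetical, and how you collect the usage figures is left to whatever monitoring you already have.

```python
def estimate_l3_conversion_progress(initial_used_bytes, current_used_bytes):
    """Return the approximate percent of SSD data evacuated, clamped to 0-100.

    Assumption: SSD used space falls toward zero as the conversion proceeds.
    """
    if initial_used_bytes <= 0:
        # Nothing was on the SSDs to begin with, so there is nothing to evacuate.
        return 100.0
    evacuated = initial_used_bytes - current_used_bytes
    percent = 100.0 * evacuated / initial_used_bytes
    return max(0.0, min(100.0, percent))


# Hypothetical samples: 800 GiB in use at job start, 200 GiB in use now.
GIB = 1024 ** 3
print(estimate_l3_conversion_progress(800 * GIB, 200 * GIB))
```

This is an estimate only, since any writes that land on the SSDs during the job will skew the figure.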
Unlike HDDs and SSDs that are used for storage, when an SSD used for L3 cache fails, the drive state immediately changes to ‘REPLACE’ without a FlexProtect job running. This is because an SSD used for L3 cache contains only cache data, which does not have to be protected by FlexProtect. Once the drive status is reported as ‘REPLACE’, the failed SSD can safely be pulled and swapped.