Here is the output from my lab cluster and the explanation:
isi01-1# isi storagepool nodepools list -v
Nodes: 1, 2, 3
Protection Policy: +2d:1n
L3 Enabled: Yes
L3 Migration Status: l3
Avail Bytes: 50.8274T
Avail SSD Bytes: 0b
Free Bytes: 56.5289T
Free SSD Bytes: 0b
Total Bytes: 73.0445T
Total SSD Bytes: 0b
Virtual Hot Spare Bytes: 5.7014T
isi01-1# isi_hw_status | grep Product
Product: X200-2U-Single-6144MB-2x1GE-2x10GE SFP+-27TB-600GB SSD
So you can see my cluster is comprised of 3 nodes, each with 27TB of space, right? On paper that's 27*3 = 81TB. But that figure is base-10 math: the drive manufacturers call them 3TB drives (I have 9 of them per node), and a manufacturer's 3TB (base-10) is actually only 3,000,000,000,000 bytes.
A binary gigabyte is really 1,073,741,824 bytes (2^30), so a binary 3TB is in reality 3,298,534,883,328 bytes. That gives a conversion factor of 3,000,000,000,000/3,298,534,883,328 = .909494701
I have 27 of those disks, so that means my cluster capacity isn't 81TB, it's 81*.909494701=73.66907TB
Look again at my capacity figures: the total is listed as 73.0445T, so that's a very tiny loss to formatted space. Now subtract the VHS space of 5.7014T, and you'll get the total space shown to the user: 67.3431TB.
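The arithmetic above can be double-checked with a few lines of Python, using only the figures from the command output:

```python
# Convert marketing (base-10) drive capacity to binary (base-2) terabytes
# and subtract the Virtual Hot Spare reservation, as described above.

TIB = 1024 ** 4          # one binary terabyte (TiB) in bytes
TB = 1000 ** 4           # one manufacturer terabyte in bytes

drives_per_node = 9
nodes = 3
drive_tb = 3             # each drive is sold as "3 TB" (base 10)

raw_tb = drives_per_node * nodes * drive_tb   # 81 "TB" on paper
factor = TB / TIB                             # ~0.909494701
raw_tib = raw_tb * factor                     # ~73.669 binary TB before formatting

vhs_tib = 5.7014                              # from the nodepool listing
total_tib = 73.0445                           # formatted total from the cluster
usable = total_tib - vhs_tib                  # space shown to the user

print(f"conversion factor: {factor:.9f}")
print(f"raw binary TB    : {raw_tib:.5f}")
print(f"usable after VHS : {usable:.4f}")
```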
A quick verification of that at the CLI:
isi01-1# isi stat -d -q
Cluster Name: isi01
Cluster Health: [ ATTN]
Cluster Storage: HDD SSD Storage
Size: 67T (73T Raw) 0 (0 Raw)
VHS Size: 5.7T
Used: 17T (25%) 0 (n/a)
Avail: 51T (75%) 0 (n/a)
Throughput (bps) HDD Storage SSD Storage
Name Health| In Out Total| Used / Size |Used / Size
x200_6.0tb_6.0gb-ram| OK | 44K| 12M| 12M| 17T/ 67T( 25%)| L3: 1.6T
So now if I subtract the Free Bytes of 56.5289 from the total of 73.0445, I get 16.5156 used, which in the output above is rounded up to 17T.
There is no way to bulk-report how much parity overhead you have, but here is one method that does work, though it will take a good deal of time because it has to do a treewalk:
Compare the output of these 2 commands on the directory you want to look at:
isi01-1# du -sh /ifs/DMTEST/few_large_nfs
isi01-1# du -shA /ifs/DMTEST/few_large_nfs
Adding the -A flag makes du report the apparent size of the files themselves, excluding protection overhead; the plain du output shows the physical on-disk size including parity.
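Once you have the two figures, the overhead is a simple subtraction; for example (the sizes here are hypothetical, not taken from the cluster above):

```python
# Given the two du figures (with and without -A), compute the protection
# overhead for a directory. Both sizes below are hypothetical examples.

physical_bytes = 1.5 * 1024**4   # du -sh  : on-disk size, including parity
apparent_bytes = 1.0 * 1024**4   # du -shA : apparent file size, no parity

overhead = physical_bytes - apparent_bytes
overhead_pct = 100 * overhead / physical_bytes

print(f"protection overhead: {overhead_pct:.1f}%")
```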
I hope this helps explain what's going on here. Please note that my syntax was from OneFS 7.2, so some of what you see may be a little different (as far as syntax goes) if you're on an older release.
Senior Solution Architect
EMC Isilon Offer & Enablement Team
Here is the generally accepted formula used when sizing:
1) Find total raw TB
2) Multiply that result by (1000^4/1024^4) to get base 2 TB
3) Subtract 1 GB per drive for the OS partitions
4) Subtract 0.83% of that result (a factor of 0.0083) to account for the file system format
5) Subtract the protection overhead from that result
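As a sketch, the five steps above can be expressed in Python. The example call uses the 3-node lab cluster described earlier (81 TB raw, 27 drives); the protection overhead is an input you take from the overhead chart:

```python
# Sketch of the five sizing steps above. The drive count and protection
# overhead are inputs; the constants come from the steps as listed.

def usable_tib(raw_tb, n_drives, protection_overhead):
    """raw_tb: total raw capacity in manufacturer (base-10) TB.
    protection_overhead: fractional parity cost, e.g. 1/3 for 2+1."""
    tib = raw_tb * (1000**4 / 1024**4)    # step 2: convert to base-2 TB
    tib -= n_drives * (10**9 / 1024**4)   # step 3: 1 GB per drive for OS partitions
    tib *= 1 - 0.0083                     # step 4: file system format loss
    tib *= 1 - protection_overhead        # step 5: protection overhead
    return tib

# Example: 81 TB raw, 27 drives, ~33% overhead (4+2 on 3 nodes)
print(f"{usable_tib(81, 27, 1/3):.2f} TB usable")
```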
As for the protection overhead that you are planning to use, look to the "OneFS Administration Guide" on support.emc.com. Skip to the section "OneFS data protection", where it talks about N+M data protection, protection schemes such as N+1, N+2:1 (default), 2x, etc., and the associated cost/parity overhead. There you will also find a very good matrix listing the percent overhead, which begins by reminding us: "The parity overhead for each protection level depends on the file size and the number of nodes in the cluster." You can also refer to the chart below for capacity calculation.
Nodes   +1 overhead   +2:1 overhead  +2 overhead   +3:1 overhead  +3 overhead   +4 overhead
3       2+1 (33%)     4+2 (33%)      3x            3+3 (50%)      3x            3x
4       3+1 (25%)     6+2 (25%)      2+2 (50%)     9+3 (25%)      4x            4x
5       4+1 (20%)     8+2 (20%)      3+2 (40%)     12+3 (20%)     4x            5x
6       5+1 (17%)     10+2 (17%)     4+2 (34%)     15+3 (17%)     3+3 (50%)     5x
7       6+1 (14%)     12+2 (14%)     5+2 (28%)     16+3 (15%)     4+3 (43%)     5x
8       7+1 (12.5%)   14+2 (12.5%)   6+2 (25%)     16+3 (15%)     5+3 (38%)     4+4 (50%)
9       8+1 (11%)     16+2 (11%)     7+2 (22%)     16+3 (15%)     6+3 (33%)     5+4 (44%)
10      9+1 (10%)     16+2 (11%)     8+2 (20%)     16+3 (15%)     7+3 (30%)     6+4 (40%)
12      11+1 (9%)     16+2 (11%)     10+2 (17%)    16+3 (15%)     9+3 (25%)     8+4 (33%)
14      13+1 (8%)     16+2 (11%)     12+2 (15%)    16+3 (15%)     11+3 (21%)    10+4 (29%)
16      15+1 (6%)     16+2 (11%)     14+2 (13%)    16+3 (15%)     13+3 (19%)    12+4 (25%)
18      16+1 (5%)     16+2 (11%)     16+2 (11%)    16+3 (15%)     15+3 (17%)    14+4 (22%)
20      16+1 (5%)     16+2 (11%)     16+2 (11%)    16+3 (15%)     16+3 (15%)    16+4 (20%)
30      16+1 (5%)     16+2 (11%)     16+2 (11%)    16+3 (15%)     16+3 (15%)    16+4 (20%)
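Each percentage in that chart is simply M/(N+M) for an N+M stripe layout (subject to rounding), which a couple of lines of Python can confirm:

```python
# Each percentage in the overhead matrix is M / (N + M) for an N+M layout.

def parity_overhead(n, m):
    """Fraction of raw space consumed by parity in an N+M protection stripe."""
    return m / (n + m)

# Spot-check a few cells from the chart:
print(round(100 * parity_overhead(2, 1)))   # 3 nodes, +1   -> 2+1
print(round(100 * parity_overhead(8, 2)))   # 5 nodes, +2:1 -> 8+2
print(round(100 * parity_overhead(3, 2)))   # 5 nodes, +2   -> 3+2
```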
Base 2 is for us old ***** and operating systems that were mostly written by old *****. The current standard is base 10, where 1 TB is 1000^4 bytes. We're not just talking about "manufacturer" units - we're talking about well-documented industry standards that are more than a decade old. See also Binary prefix - Wikipedia, the free encyclopedia
The other problem we have is that the quoted drive capacity is *before* formatting. A single 1TB drive is neither 1TB nor 1TiB.
To quote wikipedia:
The unit kilobyte is commonly used to indicate either 1000 or 1024 bytes. The value 1024 originated as compromise technical jargon for the byte multiples that needed to be expressed by powers of 2, but lacked a convenient name. As 1024 (2^10) approximates 1000 (10^3), roughly corresponding SI multiples were used for binary multiples. In 1998 the International Electrotechnical Commission (IEC) enacted standards for binary prefixes, specifying the use of kilobyte to strictly denote 1000 bytes and kibibyte to denote 1024 bytes. By 2007, the IEC Standard had been adopted by the IEEE, EU, and NIST and is now part of the International System of Quantities. Nevertheless, the term kilobyte continues to be widely used with both of the following two meanings:
- 1 kB = 1000 bytes = 10^3 bytes is the definition recommended by the International Electrotechnical Commission (IEC). This definition is used in networking contexts and most storage media, particularly hard drives, Flash-based storage, and DVDs, and is also consistent with the other uses of the SI prefix in computing, such as CPU clock speeds or measures of performance. The Mac OS X 10.6 file manager is a notable example of this usage in software. Since Snow Leopard, file sizes are reported in decimal units.
- 1 KB (or KiB) = 1024 bytes = 2^10 bytes is the definition used by most vendors of memory devices and software when referring to amounts of computer memory, such as Microsoft Windows and Linux.[unreliable source?] In the unambiguous IEC standard the unit for this amount of information is one kibibyte (KiB).
On Isilon (and every other NAS array and even file servers running Windows or Linux), it's really hard to figure out how much data you can really store. You might be able to store a 1GB file but not be able to store 1,000 1MB files because of the overhead involved. Not just in overall protection (RAID-5, RAID-6, OneFS protection level, etc.), but in per-file overhead with minimum blocksize requirements. If your average file size is 1K, all answers suck (and OneFS might not have been the right solution for you). If your average file size is 128K or larger, OneFS is an excellent choice.
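The small-file effect can be illustrated with a rough model. The figures here are assumptions for illustration (8 KB blocks, small files mirrored at 2x instead of striped), not the exact OneFS layout logic:

```python
# Rough illustration of per-file overhead. Assumed model (not exact OneFS
# behavior): 8 KB blocks, files smaller than one 128 KB stripe unit are
# mirrored (2x) instead of striped; larger files pay the parity fraction.

BLOCK = 8 * 1024
STRIPE_UNIT = 128 * 1024

def on_disk(file_size, parity_overhead=0.25):
    blocks = -(-file_size // BLOCK) * BLOCK      # round up to whole 8 KB blocks
    if file_size < STRIPE_UNIT:
        return blocks * 2                        # small file: mirrored
    return int(blocks * (1 + parity_overhead))   # large file: striped + parity

for size in (1024, 128 * 1024, 1024**2):
    print(f"{size:>8} logical -> {on_disk(size):>8} physical "
          f"({on_disk(size) / size:.2f}x)")
```

Under this model a 1 KB file costs 16x its logical size on disk, while a 1 MB file costs only 1.25x, which is why average file size matters so much.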
To help with determining how much more data can be added to a cluster, InsightIQ 3.1 has added a capacity report. It looks at the last file system analytics job, using the logical vs. physical space consumed by the file data already on your cluster to estimate how much more data you can add. This makes the assumption that the data you add to your cluster will be similar to the data already on it. It also gives a very nice breakdown as to how the numbers were arrived at. The only downside is that it currently only works with clusters running OneFS 7.2.
Here is a screenshot of the calculations with a quick explanation of each line that I wrote. Admittedly, it's not the best screenshot, as this test cluster has a lot of thin-provisioned VMDK files on it, so the overhead looks much lower than in most environments.
There is a lot of data here, but it does help administrators understand what is currently consumed and how the estimate of additional capacity is calculated. This is very useful: the adaptive nature of OneFS is so different from other storage systems that it can bring a lot of clarity to administrators. The page is laid out and calculated like a simple subtraction equation to make it easy to follow. Note: all numbers are in base 2.
- "Total Capacity": the total raw capacity of the entire cluster, not including overhead.
- "Unallocated Capacity": in OneFS, there should only be a number here when a new node type has been added but there are not yet three nodes of that type. That capacity exists on the new node, but data cannot be placed on it until there are at least three nodes of that type.
- "Allocated Capacity": the capacity available for storing data. This is usually the same as Total Capacity.
- "Reserved for Virtual Hot Spares": a cluster-level reservation of space for drive failures, ensuring there is enough capacity for a drive rebuild to complete, as OneFS does not have stand-by hot spares and uses all the drives, all the time.
- "Writeable Capacity": the capacity to which data can be written.
- "User Data including Protection": all the physical space consumed by the data currently stored on the cluster, including the protection overhead needed to store it safely.
- "Snapshots Usage": the space consumed by all the snapshots on the cluster.
- "Remaining Capacity": the leftover raw capacity that can still have data written to it.
- "Estimated Additional Protection Overhead": this number is based on the latest File System Analytics job that was run. That job tells InsightIQ how much data is physically vs. logically consumed, which is used to estimate the overhead for data that will be added. This assumes the data to be added will be similar to the data currently stored. Note that this number is very low here because there are many thin-provisioned files on the cluster: they are logically very large but physically very small, which makes the number small. If the data to be added is similar to what is already stored, this should not be an issue.
- There is a drop-down that lists all the File System Analytics (FSA) jobs and gives the option to base the estimate on a report other than the latest one.
- "Estimated XXX of Usable Capacity": an estimate of how much more logical data (what the end user would understand) can be added to the cluster.
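The report's subtraction flow can be sketched as follows; all figures here are placeholders, not taken from a real cluster, and the variable names simply follow the list above:

```python
# Placeholder figures; the names follow the InsightIQ capacity report fields.
total_capacity = 73.0   # TiB, raw capacity of the entire cluster
unallocated = 0.0       # capacity on node types with fewer than 3 nodes
vhs_reserved = 5.7      # Virtual Hot Spare reservation
user_data_prot = 17.0   # user data including protection overhead
snapshots = 0.5         # snapshot usage

allocated = total_capacity - unallocated
writeable = allocated - vhs_reserved
remaining = writeable - user_data_prot - snapshots

# The usable estimate scales remaining raw space by the physical/logical
# ratio observed in the latest FSA job (assumed here: 1.25x).
physical_per_logical = 1.25
estimated_usable = remaining / physical_per_logical

print(f"remaining raw: {remaining:.1f} TiB, "
      f"estimated usable: {estimated_usable:.1f} TiB")
```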
I have a question regarding total capacity. We are dealing with more than 10 Isilon clusters, and we find it really difficult to calculate the total provisioning. I mean, if I add up the total hard quota limits, the amount could exceed the total size of the cluster, and that's without counting the snapshots and the directories without quotas. I was looking for a good report that shows this data, but I can't find one. I would need a report showing the total capacity, the total hard quota limits of all shares, the snapshot usage, and the directories without limits; that, plus the protection overhead, would give a good approximation of the total space remaining.
To separate out the effects of snapshots per directory, the easiest way is probably not immediately obvious -- it requires defining two distinct quota sets (domains) per directory: the first includes snapshots, the second excludes them.
Both quotas can be queried inclusive or exclusive of the protection overhead, which gives you four numbers in total for each directory; any kind of report can in principle be created from these.
Of course the hard quota limits can be set in multiple ways as needed.
Don't know whether InsightIQ can make anything reasonably pretty out of such a specific quota setup, but the key point is: two quota sets per directory provide all the relevant information -- and simplify the creation of home-made reporting scripts.
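As a sketch, here is how the four numbers combine into a per-directory breakdown (all input values are hypothetical):

```python
# Deriving a per-directory breakdown from two quota domains (with snapshots /
# without snapshots), each queried with and without protection overhead.
# All four input numbers below are hypothetical.

with_snaps_with_prot = 1.50   # TiB
with_snaps_no_prot = 1.10
no_snaps_with_prot = 1.20
no_snaps_no_prot = 0.90

snapshot_physical = with_snaps_with_prot - no_snaps_with_prot
snapshot_logical = with_snaps_no_prot - no_snaps_no_prot
protection_overhead = no_snaps_with_prot - no_snaps_no_prot

print(f"snapshot physical  : {snapshot_physical:.2f} TiB")
print(f"snapshot logical   : {snapshot_logical:.2f} TiB")
print(f"protection overhead: {protection_overhead:.2f} TiB")
```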