A couple of questions have come in recently about how the node quorum requirement differs for an entire cluster versus an individual OneFS nodepool.


For example:

 

“If I set protection on a six node X410 nodepool in a twenty node cluster to N+4, why does the data get 5x mirrored rather than erasure code parity protected? Clearly the cluster has far more than the nine nodes total which are needed for cluster quorum?”

 

In order for OneFS to function properly and accept data writes, a quorum of nodes must be active and responding. A quorum is defined as a simple majority: a cluster with x nodes must have floor(x/2) + 1 nodes online in order to allow writes. The same quorum requirement also applies to the individual nodepools within a heterogeneous cluster. A minimum of three nodes of a specific hardware configuration is needed in order to create a new nodepool. So, in the case above, you'd need at least nine X410 nodes in the nodepool to allow for four node failures and still satisfy quorum for that pool.
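To make the arithmetic concrete, here's the quorum rule as a short Python sketch (the function names here are purely illustrative, not part of OneFS):

```python
def quorum(node_count: int) -> int:
    """Simple majority: floor(x/2) + 1 nodes must be online."""
    return node_count // 2 + 1

def tolerable_node_failures(node_count: int) -> int:
    """Largest number of node failures that still leaves a quorum."""
    return node_count - quorum(node_count)

print(tolerable_node_failures(6))  # 2 -- a 6-node pool can only lose 2 nodes
print(tolerable_node_failures(9))  # 4 -- tolerating 4 failures needs 9+ nodes
```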

 

OneFS clustering is designed around the CAP theorem and does not compromise on Consistency or Availability. As such, it uses quorum to prevent the Partitioning, or "split-brain", condition that can arise if a cluster temporarily divides into two independent halves. The quorum rule guarantees that, regardless of how many nodes fail or come back online, if a write takes place, it can be made consistent with any previous writes that have ever taken place.


So quorum dictates the number of nodes required in order to move to a given data protection level. For an erasure code (FEC) based protection level of N+M, the cluster must contain at least 2M + 1 nodes. For example, a minimum of nine nodes is required for a +4n configuration; this allows for the simultaneous loss of four nodes while still maintaining a quorum of five nodes, keeping the cluster fully operational.
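Extending the sketch above, the 2M + 1 rule answers the original question directly (again, these are hypothetical helper functions, not OneFS code):

```python
def min_nodes_for_fec(m: int) -> int:
    """An N+M FEC layout requires at least 2M + 1 nodes."""
    return 2 * m + 1

def protection_style(node_count: int, m: int) -> str:
    """Decide whether +Mn protection can use FEC or must fall back
    to (M+1)x mirroring (illustrative logic, not the OneFS source)."""
    if node_count >= min_nodes_for_fec(m):
        return f"FEC (N+{m})"
    return f"{m + 1}x mirroring"

print(protection_style(6, 4))  # 5x mirroring -- the six-node X410 pool above
print(protection_style(9, 4))  # FEC (N+4)
```

A six-node pool fails the 2M + 1 check for +4n, so OneFS falls back to 5x mirroring, which is exactly the behavior described in the question.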

 

If a cluster does drop below quorum, the file system will automatically be placed into a protected, read-only state, denying writes, but still allowing read access to the available data.

 

In instances where a protection level is set too high for OneFS to achieve using FEC, the default behavior is to protect that data using mirroring instead. Obviously, this has a negative impact on space utilization. Here's how that works in practice:

 

 

In the table below, each cell shows the actual protection layout applied for the requested protection level: "N + M" is an FEC stripe of N data units plus M FEC units, with the protection overhead in brackets, while "Xn" indicates a fallback to n-times mirroring.

| Number of nodes | [+1n] | [+2d:1n] | [+2n] | [+3d:1n] | [+3d:1n1d] | [+3n] | [+4d:1n] | [+4d:2n] | [+4n] |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 2+1 (33%) | 4+2 (33%) | X3 | 6+3 (33%) | 3+3 (50%) | X3 | 8+4 (33%) | X3 | X3 |
| 4 | 3+1 (25%) | 6+2 (25%) | X3 | 9+3 (25%) | 5+3 (38%) | X4 | 12+4 (25%) | 4+4 (50%) | X4 |
| 5 | 4+1 (20%) | 8+2 (20%) | 3+2 (40%) | 12+3 (20%) | 7+3 (30%) | X4 | 16+4 (20%) | 6+4 (40%) | X5 |
| 6 | 5+1 (17%) | 10+2 (17%) | 4+2 (33%) | 15+3 (17%) | 9+3 (25%) | 3+3 (50%) | 16+4 (20%) | 8+4 (33%) | X5 |
| 7 | 6+1 (14%) | 12+2 (14%) | 5+2 (29%) | 15+3 (17%) | 11+3 (21%) | 4+3 (43%) | 16+4 (20%) | 10+4 (29%) | X5 |
| 8 | 7+1 (13%) | 14+2 (12.5%) | 6+2 (25%) | 15+3 (17%) | 13+3 (19%) | 5+3 (38%) | 16+4 (20%) | 12+4 (25%) | 4+4 (50%) |
| 9 | 8+1 (11%) | 16+2 (11%) | 7+2 (22%) | 15+3 (17%) | 15+3 (17%) | 6+3 (33%) | 16+4 (20%) | 14+4 (22%) | 5+4 (44%) |
| 10 | 9+1 (10%) | 16+2 (11%) | 8+2 (20%) | 15+3 (17%) | 15+3 (17%) | 7+3 (30%) | 16+4 (20%) | 16+4 (20%) | 6+4 (40%) |
| 12 | 11+1 (8%) | 16+2 (11%) | 10+2 (17%) | 15+3 (17%) | 15+3 (17%) | 9+3 (25%) | 16+4 (20%) | 16+4 (20%) | 8+4 (33%) |
| 14 | 13+1 (7%) | 16+2 (11%) | 12+2 (14%) | 15+3 (17%) | 15+3 (17%) | 11+3 (21%) | 16+4 (20%) | 16+4 (20%) | 10+4 (29%) |
| 16 | 15+1 (6%) | 16+2 (11%) | 14+2 (13%) | 15+3 (17%) | 15+3 (17%) | 13+3 (19%) | 16+4 (20%) | 16+4 (20%) | 12+4 (25%) |
| 18 | 16+1 (6%) | 16+2 (11%) | 16+2 (11%) | 15+3 (17%) | 15+3 (17%) | 15+3 (17%) | 16+4 (20%) | 16+4 (20%) | 14+4 (22%) |
| 20 | 16+1 (6%) | 16+2 (11%) | 16+2 (11%) | 16+3 (16%) | 16+3 (16%) | 16+3 (16%) | 16+4 (20%) | 16+4 (20%) | 16+4 (20%) |
| 30 | 16+1 (6%) | 16+2 (11%) | 16+2 (11%) | 16+3 (16%) | 16+3 (16%) | 16+3 (16%) | 16+4 (20%) | 16+4 (20%) | 16+4 (20%) |


Note that the protection overhead % (in brackets) is a very rough guide and will vary across different datasets, depending on quantities of small files, etc.
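For reference, the overhead percentages in the table are simply M / (N + M) for an N+M FEC layout, and (x - 1) / x for an x-times mirror. A quick sanity check in Python (illustrative only):

```python
def fec_overhead(n: int, m: int) -> float:
    """Fraction of raw capacity consumed by the M FEC units in an N+M stripe."""
    return m / (n + m)

def mirror_overhead(x: int) -> float:
    """Fraction of raw capacity consumed by an x-times mirror."""
    return (x - 1) / x

print(f"{fec_overhead(8, 4):.0%}")   # 33% -- the 8+4 layout in the table
print(f"{fec_overhead(16, 4):.0%}")  # 20% -- the 16+4 layout
print(f"{mirror_overhead(5):.0%}")   # 80% -- 5x mirroring
```

This makes clear why mirroring fallback is so costly: a 5x mirror burns 80% of raw capacity on protection, versus 20% for a 16+4 FEC stripe at the same +4n protection level.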


More information on OneFS storage protection and overhead can be found in the following blog article:

 

https://community.emc.com/community/products/isilon/blog/2015/03/09/exploring-onefs-storage-protection-overhead