When a cluster is degraded, we do not uphold our consistency guarantee: we write fewer erasure-coded shards and rely on healing to recreate the missing ones. In practice, replacing known bad disks can take days.

We want to change the behavior of a known degraded system so that it keeps the erasure code promise of the storage class for each object. Objects will then be created with the same confidence as on a fully functional cluster. The tradeoff is that objects created during a partial outage will take up slightly more space.

This means that when the storage class is EC:4, 4 parity shards should always be written, even if some disks are unavailable. When an object is created on a set, the disks are checked immediately. If any disks are unavailable, an additional parity shard is created for each offline disk, up to 50% of the number of disks.

We add an internal metadata field that records the actual and intended erasure code level. The scanner can optionally pick this up later if we decide that data like this should be re-sharded.
Erasure code sizing guide
Toy Setups
These setups are for capacity-constrained environments. MinIO will work on them, but they are not recommended for production.
servers | drives (per node) | stripe_size | parity chosen (default) | tolerance for reads (servers) | tolerance for writes (servers) |
---|---|---|---|---|---|
1 | 1 | 1 | 0 | 0 | 0 |
1 | 4 | 4 | 2 | 0 | 0 |
4 | 1 | 4 | 2 | 2 | 1 |
5 | 1 | 5 | 2 | 2 | 2 |
6 | 1 | 6 | 3 | 3 | 2 |
7 | 1 | 7 | 3 | 3 | 3 |
Minimum System Configuration for Production
servers | drives (per node) | stripe_size | parity chosen (default) | tolerance for reads (servers) | tolerance for writes (servers) |
---|---|---|---|---|---|
4 | 2 | 8 | 4 | 2 | 1 |
5 | 2 | 10 | 4 | 2 | 2 |
6 | 2 | 12 | 4 | 2 | 2 |
7 | 2 | 14 | 4 | 2 | 2 |
8 | 1 | 8 | 4 | 4 | 3 |
8 | 2 | 16 | 4 | 2 | 2 |
9 | 2 | 9 | 4 | 4 | 4 |
10 | 2 | 10 | 4 | 4 | 4 |
11 | 2 | 11 | 4 | 4 | 4 |
12 | 2 | 12 | 4 | 4 | 4 |
13 | 2 | 13 | 4 | 4 | 4 |
14 | 2 | 14 | 4 | 4 | 4 |
15 | 2 | 15 | 4 | 4 | 4 |
16 | 2 | 16 | 4 | 4 | 4 |
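The tolerance columns in both tables appear to follow from two rules: reads survive the loss of up to `parity` drives per stripe, and writes survive the same number unless parity is exactly half the stripe, in which case one more drive must stay online. The Go sketch below illustrates this under those assumptions. It is not MinIO's implementation; the function name `tolerance` and the rounding used to map drives to servers are hypothetical choices made to match the rows above.

```go
package main

import "fmt"

// tolerance estimates how many whole servers may be offline while reads
// and writes still succeed for a single stripe (erasure set).
// Hypothetical helper: the name and the per-server rounding are assumptions.
func tolerance(servers, stripeSize, parity int) (reads, writes int) {
	// Drives that one server contributes to a stripe, rounded up.
	perServer := (stripeSize + servers - 1) / servers

	readDrives := parity
	writeDrives := parity
	// With parity equal to half the stripe, writes need one drive more
	// than reads, so one fewer drive may be lost.
	if parity > 0 && 2*parity == stripeSize {
		writeDrives = parity - 1
	}
	// Convert drive tolerance to whole-server tolerance.
	return readDrives / perServer, writeDrives / perServer
}

func main() {
	rows := []struct{ servers, stripe, parity int }{
		{4, 8, 4},  // 4 servers x 2 drives
		{8, 8, 4},  // 8 servers x 1 drive
		{9, 9, 4},  // 9 servers x 2 drives, split into stripes of 9
		{16, 16, 4}, // 16 servers x 2 drives, split into stripes of 16
	}
	for _, r := range rows {
		reads, writes := tolerance(r.servers, r.stripe, r.parity)
		fmt.Printf("servers=%d stripe=%d parity=%d reads=%d writes=%d\n",
			r.servers, r.stripe, r.parity, reads, writes)
	}
}
```

Dividing by the drives each server contributes to a stripe is what makes the 4-server, 2-drive row tolerate fewer server failures than the 8-server, 1-drive row, even though both use a stripe of 8 with parity 4.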
If one or more disks are offline at the start of a PutObject or NewMultipartUpload operation, additional parity shards are added to the object automatically, one for each offline disk, up to 50% of the number of disks. This gives these objects the regular level of safety and allows normal write operations to take place on systems that exceed the write tolerance.
This means that in the production examples above, where the default parity is 4, the system will always write 4 parity shards, at the expense of slightly higher disk usage.
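The parity adjustment described above can be sketched in a few lines of Go. This is a minimal illustration of the stated rule (one extra parity shard per offline disk, capped at half the disks in the set), not MinIO's actual code; the function name `upgradedParity` and its parameters are hypothetical.

```go
package main

import "fmt"

// upgradedParity returns the parity to use for a new object when some
// disks in the erasure set are offline at the start of the write.
// Hypothetical helper illustrating the rule described in the text.
func upgradedParity(totalDisks, wantParity, offlineDisks int) int {
	// Add one parity shard for each offline disk.
	parity := wantParity + offlineDisks
	// Never exceed 50% of the number of disks in the set.
	if limit := totalDisks / 2; parity > limit {
		parity = limit
	}
	return parity
}

func main() {
	const totalDisks, wantParity = 16, 4
	for _, offline := range []int{0, 2, 6} {
		parity := upgradedParity(totalDisks, wantParity, offline)
		fmt.Printf("offline=%d parity=%d data=%d\n",
			offline, parity, totalDisks-parity)
	}
}
```

Under this sketch, a 16-disk set configured for EC:4 with 2 disks offline would use a parity of 6. Two of the 16 shards cannot be written, so the object still ends up with 4 shards of redundancy on the online disks, which is where the slightly higher disk usage comes from.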