Add more erasure codes on degraded systems. (#11852)

In cases where a cluster is degraded, we do not uphold our consistency 
guarantee and we will write fewer erasure codes and rely on healing 
to recreate the missing shards.

In practice, replacing known bad disks can take days in some cases.
We want to change the behavior of a known degraded system to keep
the erasure code promise of the storage class for each object.

This will create the objects with the same confidence as a fully 
functional cluster. The tradeoff will be that objects created 
during a partial outage will take up slightly more space.

This means that when the storage class is EC:4, 4 parity shards should
always be written, even if some disks are unavailable.

When an object is created on a set, the disks are checked immediately.
If any disks are unavailable, an additional parity shard is written
for each offline disk, up to 50% of the number of disks.
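
The rule above can be sketched as a small calculation. This is an illustrative sketch only, not the code from this commit: `effectiveParity` and its parameters are hypothetical names, and it assumes the 50% cap applies to the total parity count of the set.

```go
package main

import "fmt"

// effectiveParity is a hypothetical helper (not part of this commit).
// It adds one parity shard per offline disk to the parity configured by
// the storage class, and caps the result at half the disks in the set.
// Assumption: the 50% limit applies to the total parity count.
func effectiveParity(totalDisks, configuredParity, offlineDisks int) int {
	maxParity := totalDisks / 2
	parity := configuredParity + offlineDisks
	if parity > maxParity {
		parity = maxParity
	}
	return parity
}

func main() {
	// A 16-disk set with storage class EC:4 and a varying number of offline disks.
	for _, offline := range []int{0, 2, 6} {
		p := effectiveParity(16, 4, offline)
		fmt.Printf("offline=%d parity=%d data=%d\n", offline, p, 16-p)
	}
}
```

With 2 of 16 disks offline, the object would be split into 10 data and 6 parity shards. Only 14 shards reach disk, but any 10 of them are enough to read the object, so 4 more disks can still be lost, the same margin EC:4 gives on a healthy set.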

We add an internal metadata field with the actual and intended
erasure code level. The scanner can optionally pick this up later
if we decide that data like this should be re-sharded.
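
One way such a field could be recorded and read back is sketched below. The key name `x-example-internal-erasure-upgraded` and the `intended->actual` value format are assumptions made for illustration; the commit message does not state the real key or encoding.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// upgradedKey is a hypothetical metadata key used only for this sketch.
const upgradedKey = "x-example-internal-erasure-upgraded"

// recordUpgrade stores "intended->actual" when extra parity was written.
// Objects written at the intended level get no marker.
func recordUpgrade(meta map[string]string, intended, actual int) {
	if actual > intended {
		meta[upgradedKey] = strconv.Itoa(intended) + "->" + strconv.Itoa(actual)
	}
}

// parseUpgrade reads the marker back, as a scanner-like process might do
// when deciding whether an object is a candidate for re-sharding.
func parseUpgrade(meta map[string]string) (int, int, bool) {
	v, found := meta[upgradedKey]
	if !found {
		return 0, 0, false
	}
	parts := strings.SplitN(v, "->", 2)
	if len(parts) != 2 {
		return 0, 0, false
	}
	intended, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, false
	}
	actual, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, false
	}
	return intended, actual, true
}

func main() {
	meta := map[string]string{}
	recordUpgrade(meta, 4, 6)
	intended, actual, ok := parseUpgrade(meta)
	fmt.Println(intended, actual, ok)
}
```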
Klaus Post
2021-05-27 20:38:09 +02:00
committed by GitHub
parent be541dba8a
commit acc452b7ce
6 changed files with 197 additions and 6 deletions

@@ -10,6 +10,8 @@ MinIO in distributed mode can help you setup a highly-available storage system w
Distributed MinIO provides protection against multiple node/drive failures and [bit rot](https://github.com/minio/minio/blob/master/docs/erasure/README.md#what-is-bit-rot-protection) using [erasure code](https://docs.min.io/docs/minio-erasure-code-quickstart-guide). As the minimum disks required for distributed MinIO is 4 (same as minimum disks required for erasure coding), erasure code automatically kicks in as you launch distributed MinIO.
If one or more disks are offline at the start of a PutObject or NewMultipartUpload operation the object will have additional data protection bits added automatically to provide additional safety for these objects.
### High availability
A stand-alone MinIO server would go down if the server hosting the disks goes offline. In contrast, a distributed MinIO setup with _m_ servers and _n_ disks will have your data safe as long as _m/2_ servers or _m*n_/2 or more disks are online.

@@ -32,3 +32,8 @@ Capacity constrained environments, MinIO will work but not recommended for produ
| 15 | 2 | 15 | 4 | 4 | 4 |
| 16 | 2 | 16 | 4 | 4 | 4 |
If one or more disks are offline at the start of a PutObject or NewMultipartUpload operation the object will have additional data
protection bits added automatically to provide the regular safety for these objects up to 50% of the number of disks.
This will allow normal write operations to take place on systems that exceed the write tolerance.
This means that in the examples above the system will always write 4 parity shards at the expense of slightly higher disk usage.