This change provides new implementations of the XL backend operations:
- create file
- read file
- heal file
Further this change adds table based tests for all three operations.
This affects also the bitrot algorithm integration. Algorithms are now
integrated in an idiomatic way (like crypto.Hash).
Fixes#4696Fixes#4649Fixes#4359
xl.storageDisks is sometimes passed to some low-level XL functions. Some disks in
xl.storageDisks are set to nil when they encounter some errors. This means all
elements in xl.storageDisks will be nil after some time which lead to an unusable XL.
This is an enhancement to the XL/distributed-XL mode. FS mode is
unaffected.
The ReadFileWithVerify storage-layer call is similar to ReadFile with
the additional functionality of performing bit-rot checking. It
accepts additional parameters for a hashing algorithm to use and the
expected hex-encoded hash string.
This patch provides significant performance improvement because:
1. combines the step of reading the file (during
erasure-decoding/reconstruction) with bit-rot verification;
2. limits the number of file-reads; and
3. avoids transferring the file over the network for bit-rot
verification.
ReadFile API is implemented as ReadFileWithVerify with empty hashing
arguments.
Credits to AB and Harsha for the algorithmic improvement.
Fixes#4236.
Ignore a disk which wasn't able to successfully perform an action to
avoid eventual perturbations when the disk comes back in the middle
of write change.
Current implementation didn't honor quorum properly and didn't
handle the errors generated properly. This patch addresses that
and also moves common code `cleanupMultipartUploads` into xl
specific private function.
Fixes#3665
This change brings in changes at multiple places
- Reuse buffers at almost all locations ranging
from rpc, fs, xl, checksum etc.
- Change caching behavior to disable itself
under low memory conditions i.e < 8GB of RAM.
- Only objects cached are of size 1/10th the size
of the cache for example if 4GB is the cache size
the maximum object size which will be cached
is going to be 400MB. This change is an
optimization to cache more objects rather
than few larger objects.
- If object cache is enabled default GC
percent has been reduced to 20% in lieu
with newly found behavior of GC. If the cache
utilization reaches 75% of the maximum value
GC percent is reduced to 10% to make GC
more aggressive.
- Do not use *bytes.Buffer* due to its growth
requirements. For every allocation *bytes.Buffer*
allocates an additional buffer for its internal
purposes. This is undesirable for us, so
implemented a new cappedWriter which is capped to a
desired size, beyond this all writes rejected.
Possible fix for #3403.
This is needed as explained by @krisis
Lets say we have following errors.
```
[]error{nil, errFileNotFound, errDiskAccessDenied, errDiskAccesDenied}
```
Since the last two errors are filtered, the maximum is nil,
depending on map order.
Let's say we get nil from reduceErr. Clearly at this point
we don't have quorum nodes agreeing about the data and since
GetObject only requires N/2 (Read quorum) and isDiskQuorum
would have returned true. This is problematic and can lead to
undersiable consequences.
Fixes#3298