Add new ReadFileWithVerify storage-layer API (#4349)

This is an enhancement to the XL/distributed-XL mode. FS mode is
unaffected.

The ReadFileWithVerify storage-layer call is similar to ReadFile with
the additional functionality of performing bit-rot checking. It
accepts additional parameters for a hashing algorithm to use and the
expected hex-encoded hash string.

This patch provides a significant performance improvement because it:

1. combines the step of reading the file (during
erasure-decoding/reconstruction) with bit-rot verification;

2. limits the number of file reads; and

3. avoids transferring the file over the network for bit-rot
verification.

The ReadFile API is now implemented as ReadFileWithVerify with empty
hashing arguments.

Credits to AB and Harsha for the algorithmic improvement.

Fixes #4236.
Aditya Manthramurthy
2017-05-16 14:21:52 -07:00
committed by Harshavardhana
parent cae4683971
commit 8975da4e84
20 changed files with 471 additions and 88 deletions


@@ -49,19 +49,33 @@ func (t byObjectPartNumber) Less(i, j int) bool { return t[i].Number < t[j].Numb
// checkSumInfo - carries checksums of individual scattered parts per disk.
type checkSumInfo struct {
-Name string `json:"name"`
-Algorithm string `json:"algorithm"`
-Hash string `json:"hash"`
+Name string `json:"name"`
+Algorithm HashAlgo `json:"algorithm"`
+Hash string `json:"hash"`
}
-// Various algorithms supported by bit-rot protection feature.
+// HashAlgo - represents a supported hashing algorithm for bitrot
+// verification.
+type HashAlgo string
const (
-// "sha256" is specifically used on arm64 bit platforms.
-sha256Algo = "sha256"
-// Rest of the platforms default to blake2b.
-blake2bAlgo = "blake2b"
+// HashBlake2b represents the Blake 2b hashing algorithm
+HashBlake2b HashAlgo = "blake2b"
+// HashSha256 represents the SHA256 hashing algorithm
+HashSha256 HashAlgo = "sha256"
)
+// isValidHashAlgo - function that checks if the hash algorithm is
+// valid (known and used).
+func isValidHashAlgo(algo HashAlgo) bool {
+switch algo {
+case HashSha256, HashBlake2b:
+return true
+default:
+return false
+}
+}
// Constant indicates current bit-rot algo used when creating objects.
// Depending on the architecture we are choosing a different checksum.
var bitRotAlgo = getDefaultBitRotAlgo()
@@ -70,7 +84,7 @@ var bitRotAlgo = getDefaultBitRotAlgo()
// Currently this function defaults to "blake2b" as the preferred
// checksum algorithm on all architectures except ARM64. On ARM64
// we use sha256 (optimized using sha2 instructions of ARM NEON chip).
-func getDefaultBitRotAlgo() string {
+func getDefaultBitRotAlgo() HashAlgo {
switch runtime.GOARCH {
case "arm64":
// As a special case for ARM64 we use an optimized
@@ -79,17 +93,17 @@ func getDefaultBitRotAlgo() string {
// This would also allows erasure coded writes
// on ARM64 servers to be on-par with their
// counter-part X86_64 servers.
-return sha256Algo
+return HashSha256
default:
// Default for all other architectures we use blake2b.
-return blake2bAlgo
+return HashBlake2b
}
}
// erasureInfo - carries erasure coding related information, block
// distribution and checksums.
type erasureInfo struct {
-Algorithm string `json:"algorithm"`
+Algorithm HashAlgo `json:"algorithm"`
DataBlocks int `json:"data"`
ParityBlocks int `json:"parity"`
BlockSize int64 `json:"blockSize"`