This commit fixes a bug in the put-part
implementation. The SSE headers should be
set as specified by AWS - See:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html
Now, the MinIO server should set SSE-C headers,
like `x-amz-server-side-encryption-customer-algorithm`.
Fixes #11991
It is inefficient to decide to heal an object before checking its
lifecycle for expiration or transition. This commit simply reverses
the order of actions: evaluate the lifecycle first, and heal only if
asked and the lifecycle evaluation resulted in a NoneAction.
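The ordering can be sketched as below. This is only an illustration: the
`lifecycleAction` type, its constants and `nextStep` are hypothetical
names, not the actual MinIO types.

```go
package main

import "fmt"

// lifecycleAction is a hypothetical stand-in for the result of a
// lifecycle evaluation; the names here are assumptions for illustration.
type lifecycleAction int

const (
	noneAction lifecycleAction = iota
	expireAction
	transitionAction
)

// nextStep applies the lifecycle result first and falls back to healing
// only when the lifecycle asked for nothing and healing was requested.
func nextStep(action lifecycleAction, healRequested bool) string {
	switch action {
	case expireAction:
		return "expire"
	case transitionAction:
		return "transition"
	}
	if healRequested {
		return "heal"
	}
	return "skip"
}

func main() {
	fmt.Println(nextStep(expireAction, true)) // expiry wins, no heal is attempted
	fmt.Println(nextStep(noneAction, true))   // nothing to expire, so heal
}
```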
This PR makes the following fixes:
- close the leaking bandwidth report channel
- remove the closer requirement for bandwidth monitor;
instead, if Read() fails, remember the error and return
that error for all subsequent reads.
- use locking for usage-cache.bin updates; with inline
data we cannot afford to have concurrent writes to
usage-cache.bin corrupting xl.meta
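A minimal sketch of the "remember the error" behavior described in the
second item. `stickyReader` and `flakyReader` are hypothetical names used
only for illustration; this is not the bandwidth monitor code itself.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

var errBoom = errors.New("boom")

// stickyReader (hypothetical) remembers the first Read error and returns
// it for every later call, so callers do not need to Close() it.
type stickyReader struct {
	r   io.Reader
	err error
}

func (s *stickyReader) Read(p []byte) (int, error) {
	if s.err != nil {
		return 0, s.err
	}
	n, err := s.r.Read(p)
	if err != nil {
		s.err = err
	}
	return n, err
}

// flakyReader fails on the first call only, to show that the wrapper
// keeps returning the remembered error afterwards.
type flakyReader struct{ calls int }

func (f *flakyReader) Read(p []byte) (int, error) {
	f.calls++
	if f.calls == 1 {
		return 0, errBoom
	}
	return copy(p, "data"), nil
}

func main() {
	s := &stickyReader{r: &flakyReader{}}
	buf := make([]byte, 4)
	_, err1 := s.Read(buf)
	_, err2 := s.Read(buf)
	fmt.Println(err1, err2) // boom boom
}
```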
Multiple disks from the same set would be writing concurrently.
```
WARNING: DATA RACE
Write at 0x00c002100ce0 by goroutine 166:
github.com/minio/minio/cmd.(*erasureSets).connectDisks.func1()
d:/minio/minio/cmd/erasure-sets.go:254 +0x82f
Previous write at 0x00c002100ce0 by goroutine 129:
github.com/minio/minio/cmd.(*erasureSets).connectDisks.func1()
d:/minio/minio/cmd/erasure-sets.go:254 +0x82f
Goroutine 166 (running) created at:
github.com/minio/minio/cmd.(*erasureSets).connectDisks()
d:/minio/minio/cmd/erasure-sets.go:210 +0x324
github.com/minio/minio/cmd.(*erasureSets).monitorAndConnectEndpoints()
d:/minio/minio/cmd/erasure-sets.go:288 +0x244
Goroutine 129 (finished) created at:
github.com/minio/minio/cmd.(*erasureSets).connectDisks()
d:/minio/minio/cmd/erasure-sets.go:210 +0x324
github.com/minio/minio/cmd.(*erasureSets).monitorAndConnectEndpoints()
d:/minio/minio/cmd/erasure-sets.go:288 +0x244
```
Service accounts were no longer inheriting parent policies
due to refactors in PolicyDBGet() from the latest
release. Fix this behavior properly.
Replication didn't work as expected when deletion of
delete markers was requested in the DeleteMultipleObjects
API. This was due to incorrect lookup elements being
used to look for delete markers.
A single PUT operation for a large object that takes more
than 3 minutes to respond can time out prematurely,
as the 'ResponseHeader' timeout fires at 3 minutes. Avoid
this by keeping the connection active during the CreateFile
phase.
baseDirFromPrefix(prefix) for object names without a
parent directory incorrectly uses an empty path, leading
to long listings at various paths that are not useful
for healing - avoid this listing completely: if "baseDir"
returns empty, simply use the "prefix" as is.
This improves startup performance significantly.
Some SDKs might incorrectly send duplicate
entries for keys such as "conditions". The Go
stdlib JSON unmarshaler does not reject
duplicate keys - instead it silently drops the
earlier duplicates and preserves only the last entry.
This can lead to issues where a policy JSON,
while being valid, might not properly apply
the required conditions, allowing situations
where POST policy JSON would end up allowing
uploads to unauthorized buckets and paths.
This PR fixes this properly.
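The stdlib behavior can be reproduced with a small program. The `policy`
struct and `decodeConditions` helper are simplified, hypothetical
stand-ins and not the actual POST policy parser.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy is a hypothetical, simplified stand-in for a POST policy document.
type policy struct {
	Conditions []string `json:"conditions"`
}

// decodeConditions decodes doc with encoding/json and returns whatever
// conditions survive the decoding.
func decodeConditions(doc string) ([]string, error) {
	var p policy
	err := json.Unmarshal([]byte(doc), &p)
	return p.Conditions, err
}

func main() {
	// The key "conditions" appears twice; no error is reported.
	doc := `{"conditions": ["bucket=private"], "conditions": ["bucket=public"]}`
	conds, err := decodeConditions(doc)
	fmt.Println(conds, err) // [bucket=public] <nil>
}
```

Only the last "conditions" entry is visible to the caller, so a check
that looks at the decoded value alone never sees the first one.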
This commit adds a `MarshalText` implementation
to the `crypto.Context` type.
The `MarshalText` implementation replaces the
`WriteTo` and `AppendTo` implementations.
It is slightly slower than the `AppendTo` implementation:
```
goos: darwin
goarch: arm64
pkg: github.com/minio/minio/cmd/crypto
BenchmarkContext_AppendTo/0-elems-8 381475698 2.892 ns/op 0 B/op 0 allocs/op
BenchmarkContext_AppendTo/1-elems-8 17945088 67.54 ns/op 0 B/op 0 allocs/op
BenchmarkContext_AppendTo/3-elems-8 5431770 221.2 ns/op 72 B/op 2 allocs/op
BenchmarkContext_AppendTo/4-elems-8 3430684 346.7 ns/op 88 B/op 2 allocs/op
```
vs.
```
BenchmarkContext/0-elems-8 135819834 8.658 ns/op 2 B/op 1 allocs/op
BenchmarkContext/1-elems-8 13326243 89.20 ns/op 128 B/op 1 allocs/op
BenchmarkContext/3-elems-8 4935301 243.1 ns/op 200 B/op 3 allocs/op
BenchmarkContext/4-elems-8 2792142 428.2 ns/op 504 B/op 4 allocs/op
```
However, the `AppendTo` benchmark used a pre-allocated buffer. While
this improves its performance it does not match the actual usage of
`crypto.Context` which is passed to a `KMS` and always encoded into
a newly allocated buffer.
Therefore, this change seems acceptable since it should not impact the
actual performance but reduces the overall code for Context marshaling.
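For illustration, a `MarshalText` method on a map-based context could
look roughly like the sketch below. The `Context` type here is a
hypothetical `map[string]string`, and the sorted-key JSON encoding is an
assumption; the encoding actually used by `crypto.Context` is not
described in this message.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// Context is a hypothetical stand-in for crypto.Context.
type Context map[string]string

// MarshalText returns a deterministic encoding of c: a JSON object with
// keys in sorted order. It always allocates a new buffer.
func (c Context) MarshalText() ([]byte, error) {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b bytes.Buffer
	b.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			b.WriteByte(',')
		}
		kb, err := json.Marshal(k)
		if err != nil {
			return nil, err
		}
		vb, err := json.Marshal(c[k])
		if err != nil {
			return nil, err
		}
		b.Write(kb)
		b.WriteByte(':')
		b.Write(vb)
	}
	b.WriteByte('}')
	return b.Bytes(), nil
}

func main() {
	text, err := Context{"b": "2", "a": "1"}.MarshalText()
	fmt.Println(string(text), err) // {"a":"1","b":"2"} <nil>
}
```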
When an object is removed, its parent directory is inspected and
removed if it is empty.
However, we can call os.Remove() directly, since it is only able to
remove a file or an empty directory.
RenameData renames xl.meta and the data dir and removes the parent
directory if empty. However, there is a duplicate check for the empty
dir, since the parent dir of xl.meta is always the same as the data-dir.
On freshReads, if the drive returns errInvalidArgument, we
should simply turn off DirectIO and read normally. There
are situations in k8s-like environments where the drives
behave sporadically in a single deployment and may not
have been implemented properly to handle O_DIRECT for
reads.
This PR adds a deadline per Write() call, such
that slow drives are timed out appropriately and
the overall responsiveness of writes always stays
within a predefined threshold, providing applications
sustained latency even if one of the drives is slow
to respond.
MRF was starting to heal whenever it received a disk connection event,
which is not good when a node with multiple disks reconnects to the
cluster. Besides, MRF needs the Remove healing option to remove stale files.
- write in o_dsync instead of o_direct for smaller
objects to avoid unaligned double Write() situations
that may arise for smaller objects < 128KiB
- avoid fallocate() as it's not useful since we do not
use Append() semantics anymore; fallocate is not useful
for streaming I/O, and we can save on a syscall
- createFile() doesn't need to validate `bucket` name
with a Lstat() call since createFile() is only used
to write at `minioTmpBucket`
- use io.Copy() for unaligned writes to allow
usage of ReadFrom() from *os.File, providing zero
buffer writes.
```
mc admin info --json
```
provides these details for now; we shall eventually
expose this at the Prometheus level.
Co-authored-by: Harshavardhana <harsha@minio.io>
This commit fixes a security issue in the signature v4 chunked
reader. Before, the reader returned unverified data to the caller
and would only verify the chunk signature once it had encountered
the end of the chunk payload.
Now, the chunk reader reads the entire chunk into an in-memory buffer,
verifies the signature and then returns data to the caller.
In general, this is a common security problem. When verifying data
streams, the verifier MUST NOT return data to the upper layers / its
callers as long as it has not verified the current data chunk / data
segment:
```
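func (r *Reader) Read(buffer []byte) (int, error) {
	// Read the complete chunk into the internal buffer first.
	if err := r.readNext(r.internalBuffer); err != nil {
		return 0, err
	}
	// Verify the chunk before exposing any of it to the caller.
	if err := r.verify(r.internalBuffer); err != nil {
		return 0, err
	}
	return copy(buffer, r.internalBuffer), nil
}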
```
For operations that require the object to exist make it possible to
detect if the file isn't found in *any* pool.
This will allow these to return the error early without having to re-check.