improvements in this PR include
- decommission objects that have __XLDIR__ suffix
- decommission objects that have `null` version on
a versioned bucket.
- make sure to look for any "decom" failures to ensure
that we do not wrong conclude decom as complete without
all files getting copied over.
- break out eagerly upon first error for objects with
multiple versions, leave the object as is for support
debugging and analysis.
heal bucket metadata and IAM entries for
sites participating in site replication from
the site with the most updated entry.
Co-authored-by: Harshavardhana <harsha@minio.io>
Co-authored-by: Aditya Manthramurthy <aditya@minio.io>
The site replication status call was using a loop iteration variable sent
directly into go-routines instead of being passed as an argument. As the
variable is being updated in the loop, previously launched go routines do not
necessarily use the value at the time they were launched.
This PR fixes two issues
- The first fix is a regression from #14555, the fix itself in #14555
is correct but the interpretation of that information by the
object layer code for "replication" was not correct. This PR
tries to fix this situation by making sure the "Delete" replication
works as expected when "VersionPurgeStatus" is already set.
Without this fix, there is a DELETE marker created incorrectly on
the source where the "DELETE" was triggered.
- The second fix is perhaps an older problem started since we inlined-data
on the disk for small objects, CopyObject() incorrectly inline's
a non-inlined data. This is due to the fact that we have code where
we read the `part.1` under certain conditions where the size of the
`part.1` is less than the specific "threshold".
This eventually causes problems when we are "deleting" the data that
is only inlined, which means dataDir is ignored leaving such
dataDir on the disk, that looks like an inconsistent content on
the namespace.
fixes#14767
It is wasteful to allow parallel upgrades of MinIO server. This also generates
weird error invoked by selfupdate module when it happens such as:
'rename /opt/bin/.minio.old /opt/bin/..minio.old.old'
currently filterPefix was never used and set
that would filter out entries when needed
when `prefix` doesn't end with `/` - this
often leads to objects getting Walked(), Healed()
that were never requested by the caller.
without this wait there is a potential for some objects
that are in actively being decommissioned would cancel,
however the decommission status might wrongly conclude
this as "Complete".
To avoid this make sure to add waitgroups on the parallel
workers, allowing parallel copies to complete fully before
we return.
In previous releases, mc admin user list would return the list of users
that have policies mapped in IAM database. However, this was removed but
this commit will bring it back until we revamp this.
- This change switches to a new parquet library
- SelectObjectContent now takes a single lock at the beginning and holds it
during the operation. Previously the operation took a lock every time the
parquet library performed a Seek on the underlying object stream.
- Add basic support for LogicalType annotations for timestamps.
Execute the object, drive and net speedtests as part of the healthinfo
(if requested by the client), and include their result in the response.
The options for the speedtests have been picked from the default values
used by `mc support perf` command.
This commit improves the listing of encrypted objects:
- Use `etag.Format` and `etag.Decrypt`
- Detect SSE-S3 single-part objects in a single iteration
- Fix batch size to `250`
- Pass request context to `DecryptAll` to not waste resources
when a client cancels the operation.
Signed-off-by: Andreas Auernhammer <hi@aead.dev>
This commit adds two new functions to the
internal `etag` package:
- `ETag.Format`
- `Decrypt`
The `Decrypt` function decrypts an encrypted
ETag using a decryption key. It returns not
encrypted / multipart ETags unmodified.
The `Decrypt` function is mainly used when
handling SSE-S3 encrypted single-part objects.
In particular, the ETag of an SSE-S3 encrypted
single-part object needs to be decrypted since
S3 clients expect that this ETag is equal to the
content MD5.
The `ETag.Format` method also covers SSE ETag handling.
MinIO encrypts all ETags of SSE single part objects.
However, only the ETag of SSE-S3 encrypted single part
objects needs to be decrypted.
The ETag of an SSE-C or SSE-KMS single part object
does not correspond to its content MD5 and can be
a random value.
The `ETag.Format` function formats an ETag such that
it is an AWS S3 compliant ETag. In particular, it
returns non-encrypted ETags (single / multipart)
unmodified. However, for encrypted ETags it returns
the trailing 16 bytes as ETag. For encrypted ETags
the last 16 bytes will be a random value.
The main purpose of `Format` is to format ETags
such that clients accept them as well-formed AWS S3
ETags.
It differs from the `String` method since `String`
will return string representations for encrypted
ETags that are not AWS S3 compliant.
Signed-off-by: Andreas Auernhammer <hi@aead.dev>
The deployment id was being written to the health report towards the end
of the handler. Because of this, if there was a timeout in any of the
data fetching, the deployment id was not getting written at all. Upload
of such reports fails on SUBNET as deployment id is the unique
identifier for a cluster in subnet.
Fixed by writing the deployment id at the beginning of the processing.
This commit simplifies the ETag decryption and size adjustment
when listing object parts.
When listing object parts, MinIO has to decrypt the ETag of all
parts if and only if the object resp. the parts is encrypted using
SSE-S3.
In case of SSE-KMS and SSE-C, MinIO returns a pseudo-random ETag.
This is inline with AWS S3 behavior.
Further, MinIO has to adjust the size of all encrypted parts due to
the encryption overhead.
The ListObjectParts does specifically not use the KMS bulk decryption
API (4d2fc530d0) since the ETags of all
parts are encrypted using the same object encryption key. Therefore,
MinIO only has to connect to the KMS once, even if there are multiple
parts resp. ETags. It can simply reuse the same object encryption key.
Signed-off-by: Andreas Auernhammer <hi@aead.dev>