Add up to 256 bytes of padding for compressed+encrypted files.
This will obscure the obvious cases of extremely compressible content
and leave a similar output size for a very wide variety of inputs.
This does *not* mean the compression ratio doesn't leak information
about the content, but the outcome space is much smaller,
so often *less* information is leaked.
Make bucket requests sent after decommissioning is started are not
created in a suspended pool. Therefore listing buckets should avoid
suspended pools as well.
Rename Trigger -> Event to be a more appropriate
name for the audit event.
Bonus: fixes a bug in AddMRFWorker() it did not
cancel the waitgroup, leading to waitgroup leaks.
There is no point in compressing very small files.
Typically the effective size on disk will be the same due to disk blocks.
So don't waste resources on extremely small files.
We don't check on multipart. 1) because we don't know and 2) this is very likely a big object anyway.
This commit adds a minimal set of KMS-related metrics:
```
# HELP minio_cluster_kms_online Reports whether the KMS is online (1) or offline (0)
# TYPE minio_cluster_kms_online gauge
minio_cluster_kms_online{server="127.0.0.1:9000"} 1
# HELP minio_cluster_kms_request_error Number of KMS requests that failed with a well-defined error
# TYPE minio_cluster_kms_request_error counter
minio_cluster_kms_request_error{server="127.0.0.1:9000"} 16790
# HELP minio_cluster_kms_request_success Number of KMS requests that succeeded
# TYPE minio_cluster_kms_request_success counter
minio_cluster_kms_request_success{server="127.0.0.1:9000"} 348031
```
Currently, we report whether the KMS is available and how many requests
succeeded/failed. However, KES exposes much more metrics that can be
exposed if necessary. See: https://pkg.go.dev/github.com/minio/kes#Metric
Signed-off-by: Andreas Auernhammer <hi@aead.dev>
If more than 1M folders (objects or prefixes) are found at the top level in a bucket allow it to be compacted.
While very suboptimal structure we should limit memory usage at some point.
GetDiskInfo() uses timedValue to cache the disk info for one second.
timedValue behavior was recently changed to return an old cached value
when calculating a new value returns an error.
When a mount point is empty, GetDiskInfo() will return errUnformattedDisk,
timedValue will return cached disk info with unexpected IsRootDisk value,
e.g. false if the mount point belongs to a root disk. Therefore, the mount
point will be considered a valid disk and will be formatted as well.
This commit will also add more defensive code when marking root disks:
always mark a disk offline for any GetDiskInfo() error except
errUnformattedDisk. The server will try anyway to reconnect to those
disks every 10 seconds.
it is not safe to pass around sync.Map
through pointers, as it may be concurrently
updated by different callers.
this PR simplifies by avoiding sync.Map
altogether, we do not need sync.Map
to keep object->erasureMap association.
This PR fixes a crash when concurrently
using this value when audit logs are
configured.
```
fatal error: concurrent map iteration and map write
goroutine 247651580 [running]:
runtime.throw({0x277a6c1?, 0xc002381400?})
runtime/panic.go:992 +0x71 fp=0xc004d29b20 sp=0xc004d29af0 pc=0x438671
runtime.mapiternext(0xc0d6e87f18?)
runtime/map.go:871 +0x4eb fp=0xc004d29b90 sp=0xc004d29b20 pc=0x41002b
```
The current code uses approximation using a ratio. The approximation
can skew if we have multiple pools with different disk capacities.
Replace the algorithm with a simpler one which counts data
disks and ignore parity disks.
fix: allow certain mutation on objects during decommission
currently by mistake deletion of objects was skipped,
if the object resided on the pool being decommissioned.
delete's are okay to be allowed since decommission is
designed to run on a cluster with active I/O.
Small uploads spend a significant amount of time (~5%) fetching disk info metrics. Also maps are allocated for each call.
Add a 100ms cache to disk metrics.
versioned buckets were not creating the delete markers
present in the versioned stack of an object, this essentially
would stop decommission to succeed.
This PR fixes creating such delete markers properly during
a decommissioning process, adds tests as well.
Current code incorrectly passed the
config asset object name while decommissioning,
make sure that we pass the right object name
to be hashed on the newer set of pools.
This PR fixes situations after a successful
decommission, the users and policies might go
missing due to wrong hashed set.
also use designated names for internal
calls
- storageREST calls are storageR
- lockREST calls are lockR
- peerREST calls are just peer
Named in this fashion to facilitate wildcard matches
by having prefixes of the same name.
Additionally, also enable funcNames for generic handlers
that return errors, currently we disable '<unknown>'
In a replicated setup, when an object is updated in one cluster but
still waiting to be replicated to the other cluster, GET requests with
if-match, and range headers will likely fail. It is better to proxy
requests instead.
Also, this commit avoids printing verbose logs about precondition &
range errors.
fix: change timedvalue to return previous cached value
caller can interpret the underlying error and decide
accordingly, places where we do not interpret the
errors upon timedValue.Get() - we should simply use
the previously cached value instead of returning "empty".
Bonus: remove some unused code
Add a generic handler that adds a new tracing context to the request if
tracing is enabled. Other handlers are free to modify the tracing
context to update information on the fly, such as, func name, enable
body logging etc..
With this commit, requests like this
```
curl -H "Host: ::1:3000" http://localhost:9000/
```
will be traced as well.