- bucket metadata does not need to look for legacy things
anymore if b.Created is non-zero
- stagger bucket metadata loads across lots of nodes to
avoid the current thundering herd problem.
- Remove deadlines for RenameData, RenameFile - these
calls should not ever be timed out and should wait
until completion or wait for client timeout. Do not
choose timeouts for applications during the WRITE phase.
- increase R/W buffer size, increase maxMergeMessages to 30
We have observed cases where a blocked stream will block for cancellations.
This happens when response channel is blocked and we want to push an error.
This will have the response mutex locked, which will prevent all other operations until upstream is unblocked.
Make this behavior non-blocking and if blocked spawn a goroutine that will send the response and close the output.
Still a lot of "dancing". Added a test for this and reviewed.
Depending on when the context cancelation is picked up the handler may return and close the channel before `SubscribeJSON` returns, causing:
```
Feb 05 17:12:00 s3-us-node11 minio[3973657]: panic: send on closed channel
Feb 05 17:12:00 s3-us-node11 minio[3973657]: goroutine 378007076 [running]:
Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON.func1()
Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub/pubsub.go:139 +0x12d
Feb 05 17:12:00 s3-us-node11 minio[3973657]: created by github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON in goroutine 378010884
Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub/pubsub.go:124 +0x352
```
Wait explicitly for the goroutine to exit.
Bonus: Listen for doneCh when sending to not risk getting blocked there is channel isn't being emptied.
this fixes rare bugs we have seen but never really found a
reproducer for
- PutObjectRetention() returning 503s
- PutObjectTags() returning 503s
- PutObjectMetadata() updates during replication returning 503s
These calls return errors, and this perpetuates with
no apparent fix.
This PR fixes with correct quorum requirement.
To force limit the duration of STS accounts, the user can create a new
policy, like the following:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": ["sts:AssumeRoleWithWebIdentity"],
"Condition": {"NumericLessThanEquals": {"sts:DurationSeconds": "300"}}
}]
}
And force binding the policy to all OpenID users, whether using a claim name or role
ARN.
disk tokens usage is not necessary anymore with the implementation
of deadlines for storage calls and active monitoring of the drive
for I/O timeouts.
Functionality kicking off a bad drive is still supported, it's just that
we do not have to serialize I/O in the manner tokens would do.
Recycle would always be called on the dummy value `any(newRT())` instead of the actual value given to the recycle function.
Caught by race tests, but mostly harmless, except for reduced perf.
Other minor cleanups. Introduced in #18940 (unreleased)
Interpret `null` inline policy for access keys as inheriting parent
policy. Since MinIO Console currently sends this value, we need to honor it
for now. A larger fix in Console and in the server are required.
Fixes#18939.
Allow internal types to support a `Recycler` interface, which will allow for sharing of common types across handlers.
This means that all `grid.MSS` (and similar) objects are shared across in a common pool instead of a per-handler pool.
Add internal request reuse of internal types. Add for safe (pointerless) types explicitly.
Only log params for internal types. Doing Sprint(obj) is just a bit too messy.
With this change, only a user with `UpdateServiceAccountAdminAction`
permission is able to edit access keys.
We would like to let a user edit their own access keys, however the
feature needs to be re-designed for better security and integration with
external systems like AD/LDAP and OpenID.
This change prevents privilege escalation via service accounts.
for actionable, inspections we have `mc support inspect`
we do not need double logging, healing will report relevant
errors if any, in terms of quorum lost etc.
Each Put, List, Multipart operations heavily rely on making
GetBucketInfo() call to verify if bucket exists or not on
a regular basis. This has a large performance cost when there
are tons of servers involved.
We did optimize this part by vectorizing the bucket calls,
however its not enough, beyond 100 nodes and this becomes
fairly visible in terms of performance.