move away from map[string]interface{} to map[string]string
to simplify the audit, and also provide concise information.
avoids large allocations under load(), reduces the amount
of audit information generated, as the current implementation
was a bit free-form. instead all datastructures must be
flattened.
Directory objects are used by applications that simulate the folder
structure of an on-disk filesystem. These are zero-byte objects with names
ending with '/'. They are only used to check whether a 'folder' exists in
the namespace.
This change uses the updated ldap library in minio/pkg (bumped
up to v3). A new config parameter is added for LDAP configuration to
specify extra user attributes to load from the LDAP server and to store
them as additional claims for the user.
A test is added in sts_handlers.go that shows how to access the LDAP
attributes as a claim.
This is in preparation for adding SSH pubkey authentication to MinIO's SFTP
integration.
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.
Also introduce bugLogIf() for errors that should not happen in the first
place.
When objects have more versions than their ILM policy expects to retain
via NewerNoncurrentVersions, but they don't qualify for expiry due to
NoncurrentDays are configured in that rule.
In this case, applyNewerNoncurrentVersionsLimit method was enqueuing empty
tasks, which lead to a panic (panic: runtime error: index out of range [0] with
length 0) in newerNoncurrentTask.OpHash method, which assumes the task
to contain at least one version to expire.
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment
https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
@klauspost
- Add error logs when expiryState and transition state are not
initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
This PR fixes a bug that perhaps has been long introduced,
with no visible workarounds. In any deployment, if an entire
erasure set is deleted, there is no way the cluster recovers.
It also fixes a long-standing bug in expiring transitioned objects.
The expiration action was deleting the current version in the case'
of tiered objects instead of adding a delete marker.
minio_node_tier_ttlb_seconds - Distribution of time to last byte for streaming objects from warm tier
minio_node_tier_requests_success - Number of requests to download object from warm tier that were successful
minio_node_tier_requests_failure - Number of requests to download object from warm tier that failed
it is okay if the warm-tier cannot keep up, we should continue
to take I/O at hot-tier, only fail hot-tier or block it when
we are disk full.
Bonus: add metrics counter for these missed tasks, we will
know for sure if one of the node is lagging behind or is
losing too many tasks during transitioning.
AWS S3 closes keep-alive connections frequently
leading to frivolous logs filling up the MinIO
logs when the transition tier is an AWS S3 bucket.
Ignore such transient errors, let MinIO retry
it when it can.
Immediate transition use case and is mostly used to fill warm
backend with a lot of data when a new deployment is created
Currently, if the transition queue is complete, the transition will be
deferred to the scanner; change this behavior by blocking the PUT request
until the transition queue has a new place for a transition task.
globalDeploymentID was being read while it was being set.
Fixes race:
```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
github.com/minio/minio/cmd.connectLoadInitFormats()
github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
github.com/minio/minio/cmd.waitForFormatErasure()
github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...
Previous read at 0x0000079605a0 by goroutine 105:
github.com/minio/minio/cmd.newContext()
github.com/minio/minio/cmd/utils.go:817 +0x31e
github.com/minio/minio/cmd.adminMiddleware.func1()
github.com/minio/minio/cmd/admin-router.go:110 +0x96
net/http.HandlerFunc.ServeHTTP()
net/http/server.go:2136 +0x47
github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
net/http.HandlerFunc.ServeHTTP()
net/http/server.go:2136 +0x47
...
```
* Reduce allocations
* Add stringsHasPrefixFold which can compare string prefixes, while ignoring case and not allocating.
* Reuse all msgp.Readers
* Reuse metadata buffers when not reading data.
* Make type safe. Make buffer 4K instead of 8.
* Unslice
This PR changes the handling of bucket deletes for site
replicated setups to hold on to deleted bucket state until
it syncs to all the clusters participating in site replication.
Spark/Hadoop workloads which use Hadoop MR
Committer v1/v2 algorithm upload objects to a
temporary prefix in a bucket. These objects are
'renamed' to a different prefix on Job commit.
Object storage admins are forced to configure
separate ILM policies to expire these objects
and their versions to reclaim space.
Our solution:
This can be avoided by simply marking objects
under these prefixes to be excluded from versioning,
as shown below. Consequently, these objects are
excluded from replication, and don't require ILM
policies to prune unnecessary versions.
- MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Status>Enabled</Status>
<ExcludeFolders>true</ExcludeFolders>
<ExcludedPrefixes>
<Prefix>app1-jobs/*/_temporary/</Prefix>
</ExcludedPrefixes>
<ExcludedPrefixes>
<Prefix>app2-jobs/*/__magic/</Prefix>
</ExcludedPrefixes>
<!-- .. up to 10 prefixes in all -->
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket
from versioning. This is required to prevent the parent
folders from accumulating delete markers, especially
those which are shared across spark workloads
spanning projects/teams.
- To enable version exclusion on a list of prefixes
```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/_magic," --exclude-prefix-marker myminio/test
```