avoids error during upgrades such as
```
API: SYSTEM()
Time: 19:19:22 UTC 03/18/2024
DeploymentID: 24e4b574-b28d-4e94-9bfa-03c363a600c2
Error: Invalid api configuration: found invalid keys (expiry_workers=100 ) for 'api' sub-system, use 'mc admin config reset myminio api' to fix invalid keys (*fmt.wrapError)
11: internal/logger/logger.go:260:logger.LogIf()
...
```
at scale customers might start with failed drives,
causing skew in the overall usage ratio per EC set.
make this configurable such that customers can turn
this off as needed depending on how comfortable they
are.
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment
https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
@klauspost
- Add error logs when expiryState and transition state are not
initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
Bonus: enable audit alerts for object versions
beyond the configured value, default is '100'
versions per object beyond which scanner will
alert for each such objects.
disk tokens usage is not necessary anymore with the implementation
of deadlines for storage calls and active monitoring of the drive
for I/O timeouts.
Functionality kicking off a bad drive is still supported, it's just that
we do not have to serialize I/O in the manner tokens would do.
Add a hidden configuration under the scanner sub section to configure if
the scanner should sleep between two objects scan. The configuration has
only effect when there is no drive activity related to s3 requests or
healing.
By default, the code will keep the current behavior which is doing
sleep between objects.
To forcefully enable the full scan speed in idle mode, you can do this:
`mc admin config set myminio scanner idle_speed=full`
NOTE: This feature is not retro-active; it will not cater to previous transactions
on existing setups.
To enable this feature, please set ` _MINIO_DRIVE_QUORUM=on` environment
variable as part of systemd service or k8s configmap.
Once this has been enabled, you need to also set `list_quorum`.
```
~ mc admin config set alias/ api list_quorum=auto`
```
A new debugging tool is available to check for any missing counters.
use memory for async events when necessary and dequeue them as
needed, for all synchronous events customers must enable
```
MINIO_API_SYNC_EVENTS=on
```
Async events can be lost but is upto to the admin to
decide what they want, we will not create run-away number
of goroutines per event instead we will queue them properly.
Currently the max async workers is set to runtime.GOMAXPROCS(0)
which is more than sufficient in general, but it can be made
configurable in future but may not be needed.
When minio runs with MINIO_CI_CD=on, it is expected to communicate
with the locally running SUBNET. This is happening in the case of MinIO
via call home functionality. However, the subnet-related functionality inside the
console continues to talk to the SUBNET production URL. Because of this,
the console cannot be tested with a locally running SUBNET.
Set the env variable CONSOLE_SUBNET_URL correctly in such cases.
(The console already has code to use the value of this variable
as the subnet URL)
Optionally allows customers to enable
- Enable an external cache to catch GET/HEAD responses
- Enable skipping disks that are slow to respond in GET/HEAD
when we have already achieved a quorum
sendfile implementation to perform DMA on all platforms
Go stdlib already supports sendfile/splice implementations
for
- Linux
- Windows
- *BSD
- Solaris
Along with this change however O_DIRECT for reads() must be
removed as well since we need to use sendfile() implementation
The main reason to add O_DIRECT for reads was to reduce the
chances of page-cache causing OOMs for MinIO, however it would
seem that avoiding buffer copies from user-space to kernel space
this issue is not a problem anymore.
There is no Go based memory allocation required, and neither
the page-cache is referenced back to MinIO. This page-
cache reference is fully owned by kernel at this point, this
essentially should solve the problem of page-cache build up.
With this now we also support SG - when NIC supports Scatter/Gather
https://en.wikipedia.org/wiki/Gather/scatter_(vector_addressing)
replace io.Discard usage to fix NUMA copy() latencies
On NUMA systems copying from 8K buffer allocated via
io.Discard leads to large latency build-up for every
```
copy(new8kbuf, largebuf)
```
can in-cur upto 1ms worth of latencies on NUMA systems
due to memory sharding across NUMA nodes.
`GetParityForSC` has a value receiver, so Config is copied before the lock is obtained.
Make it pointer receiver.
Fixes:
```
WARNING: DATA RACE
Read at 0x0000079cdd10 by goroutine 190:
github.com/minio/minio/cmd.(*erasureServerPools).BackendInfo()
github.com/minio/minio/cmd/erasure-server-pool.go:579 +0x6f
github.com/minio/minio/cmd.(*erasureServerPools).LocalStorageInfo()
github.com/minio/minio/cmd/erasure-server-pool.go:614 +0x3c6
github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler()
github.com/minio/minio/cmd/peer-rest-server.go:347 +0x4ea
github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler-fm()
...
WARNING: DATA RACE
Read at 0x0000079cdd10 by goroutine 190:
github.com/minio/minio/cmd.(*erasureServerPools).BackendInfo()
github.com/minio/minio/cmd/erasure-server-pool.go:579 +0x6f
github.com/minio/minio/cmd.(*erasureServerPools).LocalStorageInfo()
github.com/minio/minio/cmd/erasure-server-pool.go:614 +0x3c6
github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler()
github.com/minio/minio/cmd/peer-rest-server.go:347 +0x4ea
github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler-fm()
```
- we already have MRF for most recent failures
- we trigger healing during HEAD/GET operation
These are enough, also change the default max wait
from 5sec to 1sec for default scanner speed.
configs from 2020 server throws an
error due to deprecation of the keys
however an attempt is made to parse
them, we should have chosen existing
defaults - this PR fixes that.
Bonus:
- avoid calling DiskInfo() calls when missing blocks
instead heal the object using MRF operation.
- change the max_sleep to 250ms beyond that we will
not stop healing.