unlinking() at two different locations on a disk when there
are lots to purge, this can lead to huge IOwaits, instead
rely on rename() to .trash to avoid running multiple unlinks()
in parallel.
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.
Also introduce bugLogIf() for errors that should not happen in the first
place.
globalLocalDrives seem to be not updated during the
HealFormat() leads to a requirement where the server
needs to be restarted for the healing to continue.
for actionable, inspections we have `mc support inspect`
we do not need double logging, healing will report relevant
errors if any, in terms of quorum lost etc.
- Move RenameFile to websockets
- Move ReadAll that is primarily is used
for reading 'format.json' to to websockets
- Optimize DiskInfo calls, and provide a way
to make a NoOp DiskInfo call.
NOTE: This feature is not retro-active; it will not cater to previous transactions
on existing setups.
To enable this feature, please set ` _MINIO_DRIVE_QUORUM=on` environment
variable as part of systemd service or k8s configmap.
Once this has been enabled, you need to also set `list_quorum`.
```
~ mc admin config set alias/ api list_quorum=auto`
```
A new debugging tool is available to check for any missing counters.
This PR also increases per node bpool memory from 1024 entries
to 2048 entries; along with that, it also moves the byte pool
centrally instead of being per pool.
`OpMuxConnectError` was not handled correctly.
Remove local checks for single request handlers so they can
run before being registered locally.
Bonus: Only log IAM bootstrap on startup.
Tiering statistics have been broken for some time now, a regression
was introduced in 6f2406b0b6
Bonus fixes an issue where the objects are not assumed to be
of the 'STANDARD' storage-class for the objects that have
not yet tiered, this should be conditional based on the object's
metadata not a default assumption.
This PR also does some cleanup in terms of implementation,
fixes#18070
Errors such as
```
returned an error (context deadline exceeded) (*fmt.wrapError)
```
```
(msgp: too few bytes left to read object) (*fmt.wrapError)
```
Bonus:
- avoid calling DiskInfo() calls when missing blocks
instead heal the object using MRF operation.
- change the max_sleep to 250ms beyond that we will
not stop healing.
this randomness is needed to avoid scanning
the same buckets across different erasure sets,
in the same order.
allow random buckets to be scanned instead
allowing a wider spread of ILM, replication
checks.
Additionally do not loop over twice to fill
the channel, fill the channel regardless of
having bucket new or old.
Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner.
Also removes a tiny bit of CPU by each write operation.