data-dir not being present is okay, however we can still
rely on the `rename()` atomic call instead of relying on
write xl.meta write which may truncate the io.EOF.
Add a new function logger.Event() to send the log to Console and
http/kafka log webhooks. This will include some internal events such as
disk healing and rebalance/decommissioning
the PR in #16541 was incorrect and hand wrong assumptions
about the overall setup, revert this since this expectation
to have offline servers is wrong and we can end up with a
bigger chicken and egg problem.
This reverts commit 5996c8c4d5.
Bonus:
- preserve disk in globalLocalDrives properly upon connectDisks()
- do not return 'nil' from newXLStorage(), getting it ready for
the next set of changes for 'format.json' loading.
The previous logic of calculating per second values for disk io stats
divides the stats by the host uptime. This doesn't work in k8s
environment as the uptime is of the pod, but the stats (from
/proc/diskstats) are from the host.
Fix this by storing the initial values of uptime and the stats at the
timme of server startup, and using the difference between current and
initial values when calculating the per second values.
globalLocalDrives seem to be not updated during the
HealFormat() leads to a requirement where the server
needs to be restarted for the healing to continue.
a/prefix
a/prefix/1.txt
where `a/prefix` is an object which does not have `/` at the end,
we do not have to aggressively recursively delete all the sub-folders
as well. Instead convert the call into self contained to deleting
'xl.meta' and then subsequently attempting to Remove the parent.
Bonus: enable audit alerts for object versions
beyond the configured value, default is '100'
versions per object beyond which scanner will
alert for each such objects.
when we expand via pools, there is no reason to stick
with the same distributionAlgo as the rest. Since the
algo only makes sense with-in a pool not across pools.
This allows for newer pools to use newer codepaths to
avoid legacy file lookups when they have a pre-existing
deployment from 2019, they can expand their new pool
to be of a newer distribution format, allowing the
pool to be more performant.
Fix reported races that are actually synchronized by network calls.
But this should add some extra safety for untimely disconnects.
Race reported:
```
WARNING: DATA RACE
Read at 0x00c00171c9c0 by goroutine 214:
github.com/minio/minio/internal/grid.(*muxClient).addResponse()
e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:519 +0x111
github.com/minio/minio/internal/grid.(*muxClient).error()
e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:470 +0x21d
github.com/minio/minio/internal/grid.(*Connection).handleDisconnectClientMux()
e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1391 +0x15b
github.com/minio/minio/internal/grid.(*Connection).handleMsg()
e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1190 +0x1ab
github.com/minio/minio/internal/grid.(*Connection).handleMessages.func1()
e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:981 +0x610
Previous write at 0x00c00171c9c0 by goroutine 1081:
github.com/minio/minio/internal/grid.(*muxClient).roundtrip()
e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:94 +0x324
github.com/minio/minio/internal/grid.(*muxClient).traceRoundtrip()
e:/gopath/src/github.com/minio/minio/internal/grid/trace.go:74 +0x10e4
github.com/minio/minio/internal/grid.(*Subroute).Request()
e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:366 +0x230
github.com/minio/minio/internal/grid.(*SingleHandler[go.shape.*github.com/minio/minio/cmd.DiskInfoOptions,go.shape.*github.com/minio/minio/cmd.DiskInfo]).Call()
e:/gopath/src/github.com/minio/minio/internal/grid/handlers.go:554 +0x3fd
github.com/minio/minio/cmd.(*storageRESTClient).DiskInfo()
e:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:314 +0x270
github.com/minio/minio/cmd.erasureObjects.getOnlineDisksWithHealingAndInfo.func1()
e:/gopath/src/github.com/minio/minio/cmd/erasure.go:293 +0x171
```
This read will always happen after the write, since there is a network call in between.
However a disconnect could come in while we are setting up the call, so we protect against that with extra checks.
- bucket metadata does not need to look for legacy things
anymore if b.Created is non-zero
- stagger bucket metadata loads across lots of nodes to
avoid the current thundering herd problem.
- Remove deadlines for RenameData, RenameFile - these
calls should not ever be timed out and should wait
until completion or wait for client timeout. Do not
choose timeouts for applications during the WRITE phase.
- increase R/W buffer size, increase maxMergeMessages to 30
We have observed cases where a blocked stream will block for cancellations.
This happens when response channel is blocked and we want to push an error.
This will have the response mutex locked, which will prevent all other operations until upstream is unblocked.
Make this behavior non-blocking and if blocked spawn a goroutine that will send the response and close the output.
Still a lot of "dancing". Added a test for this and reviewed.