Ensure delete marker replication success, especially since the
recent optimizations to heal on HEAD, LIST and GET can force
replication attempts on delete marker before underlying object
version could have synced.
PUT shall only proceed if pre-conditions are met, the new
code uses
- x-minio-source-mtime
- x-minio-source-etag
to verify if the object indeed needs to be replicated
or not, allowing us to avoid StatObject() call.
If replication config could not be read from bucket metadata for some
reason, issue a panic so that unexpected replication outcomes can
be avoided for replicated buckets.
For similar reasons, adding a panic while fetching object-lock config
if it failed for reason other than non-existence of config.
to avoid relying on scanner-calculated replication metrics.
This will improve the accuracy of the replication stats reported.
This PR also adds on to #15556 by handing replication
traffic that could not be queued by available workers to the
MRF queue so that entries in `PENDING` status are healed faster.
Currently, there is a short time window where the code is allowed
to save the status of a replication resync. Currently, the window is
`now.Sub(st.EndTime) <= resyncTimeInterval`. Also, any failure to
write in the backend disks is not retried.
Refactor the code a little bit to rely on the last timestamp of a
successful write of the resync status of any given bucket in the
backend disks.
When replication is enabled in a particular bucket, the listing will send
objects to bucket replication, but it is also sending prefixes for non
recursive listing which is useless and shows a lot of error logs.
This commit will ignore prefixes.
This PR improves the replication failure healing by persisting
most recent failures to disk and re-queuing them until the replication
is successful.
While this does not eliminate the need for healing during a full scan,
queuing MRF vastly improves the ETA to keeping replicated buckets
in sync as it does not wait for the scanner visit to detect unreplicated
object versions.
competing calls on the same object on versioned bucket
mutating calls on the same object may unexpected have
higher delays.
This can be reproduced with a replicated bucket
overwriting the same object writes, deletes repeatedly.
For longer locks like scanner keep the 1sec interval
This PR fixes possible leaks that may emanate from not
listening on context cancelation or timeouts.
```
goroutine 60957610 [chan send, 16 minutes]:
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1.1.1(...)
github.com/minio/minio/cmd/erasure-server-pool.go:1724 +0x368
github.com/minio/minio/cmd.listPathRaw({0x4a9a740, 0xc0666dffc0},...
github.com/minio/minio/cmd/metacache-set.go:1022 +0xfc4
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1.1()
github.com/minio/minio/cmd/erasure-server-pool.go:1764 +0x528
created by github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1
github.com/minio/minio/cmd/erasure-server-pool.go:1697 +0x1b7
```