When a fresh drive healing is finished, add more checks for the drive listing
errors. If any, re-list and heal again. Although this is an infrequent use
case to have listPathRaw() returning nil when minDisks is set to 1, we
still need to handle all possible use cases to avoid missing healing
any object.
Also, check for HealObject result to decide of an object is healed in the
fresh disk since HealObject returns nil if an object is healed in any
disk, and not in the new fresh drive.
due to a historic bug in CopyObject() where
an inlined object loses its metadata, the
check causes an incorrect fallback verifying
data-dir.
CopyObject() bug was fixed in ffa91f9794 however
the occurrence of this problem is historic, so
the aforementioned check is stretching too much.
Bonus: simplify fileInfoRaw() to read xl.json as well,
also recreate buckets properly.
In the very rare case when all drives in a erasure set need to be healed,
remove .healing.bin from all drives, otherwise it will be stuck in a
loop
Also, fix a unit test that fails sometimes due to wrong test.
This change uses the updated ldap library in minio/pkg (bumped
up to v3). A new config parameter is added for LDAP configuration to
specify extra user attributes to load from the LDAP server and to store
them as additional claims for the user.
A test is added in sts_handlers.go that shows how to access the LDAP
attributes as a claim.
This is in preparation for adding SSH pubkey authentication to MinIO's SFTP
integration.
instead upon any error in renameData(), we still
preserve the existing dataDir in some form for
recoverability in strange situations such as out
of disk space type errors.
Bonus: avoid running list and heal() instead allow
versions disparity to return the actual versions,
uuid to heal. Currently limit this to 100 versions
and lesser disparate objects.
an undo now reverts back the xl.meta from xl.meta.bkp
during overwrites on such flaky setups.
Bonus: Save N depth syscalls via skipping the parents
upon overwrites and versioned updates.
Flaky setup examples are stretch clusters with regular
packet drops etc, we need to add some defensive code
around to avoid dangling objects.
This PR makes a feasible approach to handle all the scenarios
that we must face to avoid returning "panic."
Instead, we must return "errServerNotInitialized" when a
bucketMetadataSys.Get() is called, allowing the caller to
retry their operation and wait.
Bonus fix the way data-usage-cache stores the object.
Instead of storing usage-cache.bin with the bucket as
`.minio.sys/buckets`, the `buckets` must be relative
to the bucket `.minio.sys` as part of the object name.
Otherwise, there is no way to decommission entries at
`.minio.sys/buckets` and their final erasure set positions.
A bucket must never have a `/` in it. Adds code to read()
from existing data-usage.bin upon upgrade.
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.
Also introduce bugLogIf() for errors that should not happen in the first
place.
New disk healing code skips/expires objects that ILM supposed to expire.
Add more visibility to the user about this activity by calculating those
objects and print it at the end of healing activity.
This PR fixes a bug that perhaps has been long introduced,
with no visible workarounds. In any deployment, if an entire
erasure set is deleted, there is no way the cluster recovers.
Add a new function logger.Event() to send the log to Console and
http/kafka log webhooks. This will include some internal events such as
disk healing and rebalance/decommissioning
- Move RenameFile to websockets
- Move ReadAll that is primarily is used
for reading 'format.json' to to websockets
- Optimize DiskInfo calls, and provide a way
to make a NoOp DiskInfo call.
`OpMuxConnectError` was not handled correctly.
Remove local checks for single request handlers so they can
run before being registered locally.
Bonus: Only log IAM bootstrap on startup.
Bonus:
- avoid calling DiskInfo() calls when missing blocks
instead heal the object using MRF operation.
- change the max_sleep to 250ms beyond that we will
not stop healing.
When healing is parallelized by setting the ` _MINIO_HEAL_WORKERS`
environment variable, multiple goroutines may race while updating the disk's
healing tracker. This change serializes only these concurrent updates using a
channel. Note, the healing tracker is still not concurrency safe in other contexts.
Directories markers are not healed when healing a new fresh disk. A
a proper fix would be moving object names encoding/decoding to erasure
object level but it is too late now since the object to set distribution is
calculated at a higher level.
- Always reformat all disks when a new disk is detected, this will
ensure new uploads to be written in new fresh disks
- Always heal all buckets first when an erasure set started to be healed
- Use a lock to prevent two disks belonging to different nodes but in
the same erasure set to be healed in parallel
- Heal different sets in parallel
Bonus:
- Avoid logging errUnformattedDisk when a new fresh disk is inserted but
not detected by healing mechanism yet (10 seconds lag)
Main motivation is move towards a common backend format
for all different types of modes in MinIO, allowing for
a simpler code and predictable behavior across all features.
This PR also brings features such as versioning, replication,
transitioning to single drive setups.