minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	6cb2f56395	Revert "Revert "tests: Add context cancelation (#15374 )"" This reverts commit `564a0afae1`.	2022-10-14 03:08:40 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Minio Trusted	564a0afae1	Revert "tests: Add context cancelation (#15374 )" This reverts commit `1e332f0eb1`. Reverting this as tests are failing randomly.	2022-07-21 13:58:56 -07:00
Klaus Post	1e332f0eb1	tests: Add context cancelation (#15374 ) A huge number of goroutines would build up from various monitors When creating test filesystems provide a context so they can shut down when no longer needed.	2022-07-21 11:52:18 -07:00
Anis Elleuch	b3eda248a3	Parallelize new disks healing of different erasure sets (#15112 ) - Always reformat all disks when a new disk is detected, this will ensure new uploads to be written in new fresh disks - Always heal all buckets first when an erasure set started to be healed - Use a lock to prevent two disks belonging to different nodes but in the same erasure set to be healed in parallel - Heal different sets in parallel Bonus: - Avoid logging errUnformattedDisk when a new fresh disk is inserted but not detected by healing mechanism yet (10 seconds lag)	2022-06-21 07:53:55 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Anis Elleuch	16431d222c	heal: Enable periodic bitrot scan configuration (#14464 )	2022-04-07 08:10:40 -07:00
Anis Elleuch	bbc914e174	heal: Do not override heal scan mode mode if it is set (#14476 ) mc admin heal has --scan=deep flag which enforces bitrot checking when doing the healing. Do not force override an existing heal scan option.	2022-03-04 18:25:06 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	b883803b21	fix: healing across pools removing dangling objects (#13990 ) adds other simplifications to the code when running namespace heals across pools.	2021-12-25 09:01:44 -08:00
Harshavardhana	661b263e77	add gocritic/ruleguard checks back again, cleanup code. (#13665 ) - remove some duplicated code - reported a bug, separately fixed in #13664 - using strings.ReplaceAll() when needed - using filepath.ToSlash() use when needed - remove all non-Go style comments from the codebase Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2021-11-16 09:28:29 -08:00
Harshavardhana	a19e3bc9d9	add more dangling heal related tests (#13140 ) also make sure that HealObject() never returns 'ObjectNotFound' or 'VersionNotFound' errors, as those are meaningless and not useful for the caller.	2021-09-02 20:56:13 -07:00
Harshavardhana	ed16ce9b73	add healing workers support to parallelize healing (#13081 ) Faster healing as well as making healing more responsive for faster scanner times. also fixes a bug introduced in #13079, newly replaced disks were not healing automatically.	2021-08-26 20:32:58 -07:00
Harshavardhana	c11a2ac396	refactor healing to remove certain structs (#13079 ) - remove sourceCh usage from healing we already have tasks and resp channel - use read locks to lookup globalHealConfig - fix healing resolver to pick candidates quickly that need healing, without this resolver was unexpectedly skipping.	2021-08-26 14:06:04 -07:00
Harshavardhana	0559f46bbb	fix: make healObject() make non-blocking (#13071 ) healObject() should be non-blocking to ensure that scanner is not blocked for a long time, this adversely affects performance of the scanner and also affects the way usage is updated subsequently. This PR allows for a non-blocking behavior for healing, dropping operations that cannot be queued anymore.	2021-08-25 17:46:20 -07:00
Harshavardhana	b4bf82c751	do not heal "backend-encrypted" out-of-band with migration (#12556 ) backend-encrypted doesn't need to be explicitly healed anymore since this file is deleted upon upgrade and migration to the KMS based encrypted config/IAM credentials.	2021-06-23 12:09:10 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	ebf75ef10d	fix: remove all unused code (#12360 )	2021-05-24 09:28:19 -07:00
Harshavardhana	1aa5858543	move madmin to github.com/minio/madmin-go (#12239 )	2021-05-06 08:52:02 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	d971061305	use listPathRaw for HealObjects() instead of expensive WalkVersions() (#11675 )	2021-03-06 09:25:48 -08:00
Klaus Post	fa9cf1251b	Imporve healing and reporting (#11312 ) * Provide information on actively healing, buckets healed/queued, objects healed/failed. * Add concurrent healing of multiple sets (typically on startup). * Add bucket level resume, so restarts will only heal non-healed buckets. * Print summary after healing a disk is done.	2021-03-04 14:36:23 -08:00
Klaus Post	c5b2a8441b	fix: faster healing when disk is replaced. (#11520 )	2021-02-18 11:06:54 -08:00
Harshavardhana	82f0471d1b	honor maxWait heal config when maxIO hits (#11338 )	2021-01-25 07:53:12 -08:00
Harshavardhana	59d3639396	fix: inherit heal opts globally, including bitrot settings (#11166 ) Bonus re-use ReadFileStream internal io.Copy buffers, fixes lots of chatty allocations when reading metacache readers with many sustained concurrent listing operations ``` 17.30GB 1.27% 84.80% 35.26GB 2.58% io.copyBuffer ```	2020-12-24 23:04:03 -08:00
Harshavardhana	7c9ef76f66	fix: timer deadlock on expired timers (#11124 ) issue was introduced in #11106 the following pattern <-t.C // timer fired if !t.Stop() { <-t.C // timer hangs } Seems to hang at the last `t.C` line, this issue happens because a fired timer cannot be Stopped() anymore and t.Stop() returns `false` leading to confusing state of usage. Refactor the code such that use timers appropriately with exact requirements in place.	2020-12-17 12:35:02 -08:00
Harshavardhana	c606c76323	fix: prioritized latest buckets for crawler to finish the scans faster (#11115 ) crawler should only ListBuckets once not for each serverPool, buckets are same across all pools, across sets and ListBuckets always returns an unified view, once list buckets returns sort it by create time to scan the latest buckets earlier with the assumption that latest buckets would have lesser content than older buckets allowing them to be scanned faster and also to be able to provide more closer to latest view.	2020-12-15 17:34:54 -08:00
Harshavardhana	8368ab76aa	fix: remove the requirement for healing buckets in ListBucketsHeal (#11098 ) With new refactor of bucket healing, healing bucket happens automatically including its metadata, there is no need to redundant heal buckets also in ListBucketsHeal remove it.	2020-12-14 12:07:07 -08:00
Harshavardhana	2eb52ca5f4	fix: heal bucket metadata right before healing bucket (#11097 ) optimization mainly to avoid listing the entire `.minio.sys/buckets/.minio.sys` directory, this can get really huge and comes in the way of startup routines, contents inside `.minio.sys/buckets/.minio.sys` are rather transient and not necessary to be healed.	2020-12-13 11:57:08 -08:00
Klaus Post	a896125490	Add crawler delay config + dynamic config values (#11018 )	2020-12-04 09:32:35 -08:00
Harshavardhana	951b6b203b	skip metacache entries healing to speed up startup	2020-12-02 21:30:54 -08:00
Harshavardhana	44e23b7f4f	fix: startup being slow - wait only if IOCount > 0	2020-12-02 21:06:17 -08:00
Harshavardhana	96c0ce1f0c	add support for tuning healing to make healing more aggressive (#11003 ) supports `mc admin config set <alias> heal sleep=100ms` to enable more aggressive healing under certain times. also optimize some areas that were doing extra checks than necessary when bitrotscan was enabled, avoid double sleeps make healing more predictable. fixes #10497	2020-12-02 11:12:00 -08:00
Harshavardhana	cbdab62c1e	fix: heal user/metadata right away upon server startup (#10863 ) this is needed such that we make sure to heal the users, policies and bucket metadata right away as we do listing based on list cache which only lists '3' sufficiently good drives, to avoid possibly losing access to these users upon upgrade make sure to heal them.	2020-11-10 09:02:06 -08:00
Harshavardhana	a0d0645128	remove safeMode behavior in startup (#10645 ) In almost all scenarios MinIO now is mostly ready for all sub-systems independently, safe-mode is not useful anymore and do not serve its original intended purpose. allow server to be fully functional even with config partially configured, this is to cater for availability of actual I/O v/s manually fixing the server. In k8s like environments it will never make sense to take pod into safe-mode state, because there is no real access to perform any remote operation on them.	2020-10-09 09:59:52 -07:00
Harshavardhana	66174692a2	add '.healing.bin' for tracking currently healing disk (#10573 ) add a hint on the disk to allow for tracking fresh disk being healed, to allow for restartable heals, and also use this as a way to track and remove disks. There are more pending changes where we should move all the disk formatting logic to backend drives, this PR doesn't deal with this refactor instead makes it easier to track healing in the future.	2020-09-28 19:39:32 -07:00
Anis Elleuch	b302c8a5f4	heal: Fix periodic healing cleanup (#10569 ) isEnded() was incorrectly calculating if the current healing sequence is ended or not. h.currentStatus.Items could be empty if healing is very slow and mc admin heal consumed all items.	2020-09-25 10:29:00 -07:00
Harshavardhana	96997d2b21	allow ctrl+c to be consistent at early startup (#10435 ) fixes #10431	2020-09-08 09:10:55 -07:00
Harshavardhana	b0e1d4ce78	re-attach offline drive after new drive replacement (#10416 ) inconsistent drive healing when one of the drive is offline while a new drive was replaced, this change is to ensure that we can add the offline drive back into the mix by healing it again.	2020-09-04 17:09:02 -07:00
Harshavardhana	7778fef6bb	update continous heal metrics appropriately for scanned items (#10352 ) bonus make sure to ignore objectNotFound, and versionNotFound errors properly at all layers, since HealObjects() returns objectNotFound error if the bucket or prefix is empty.	2020-08-26 08:53:33 -07:00
Klaus Post	c097ce9c32	continous healing based on crawler (#10103 ) Design: https://gist.github.com/klauspost/792fe25c315caf1dd15c8e79df124914	2020-08-24 13:47:01 -07:00
Klaus Post	95ae6c4b49	Fix missing unlock in *healSequence.hasEnded() (#10305 ) The background healing sequence would always hang when this function is called.	2020-08-20 08:48:09 -07:00
Klaus Post	bb5976d727	healbucket: Send object version ID (#10263 ) Based on our previous conversations I assume we should send the version id when healing an object. Maybe we should even list object versions and heal all?	2020-08-17 08:25:44 -07:00
Harshavardhana	2a9819aff8	fix: refactor background heal for cluster health (#10225 )	2020-08-07 19:43:06 -07:00
Harshavardhana	6c6137b2e7	add cluster maintenance healthcheck drive heal affinity (#10218 )	2020-08-07 13:22:53 -07:00
Harshavardhana	17747db93f	fix: support healing older content (#10076 ) This PR adds support for healing older content i.e from 2yrs, 1yr. Also handles other situations where our config was not encrypted yet. This PR also ensures that our Listing is consistent and quorum friendly, such that we don't list partial objects	2020-07-17 17:41:29 -07:00
Harshavardhana	187c3f62df	fix: heal replaced drives properly (#10069 ) healing was not working properly when drives were replaced, due to the error check in root disk calculation this PR fixes this behavior This PR also adds additional fix for missing metadata entries from .minio.sys as part of disk healing as well. Added code to ignore and print more context sensitive errors for better debugging. This PR is continuation of fix in `7b14e9b660`	2020-07-17 10:08:04 -07:00
Harshavardhana	cdb0e6ffed	support proper values for listMultipartUploads/listParts (#9970 ) object KMS is configured with auto-encryption, there were issues when using docker registry - this has been left unnoticed for a while. This PR fixes an issue with compatibility. Additionally also fix the continuation-token implementation infinite loop issue which was missed as part of #9939 Also fix the heal token to be generated as a client facing value instead of what is remembered by the server, this allows for the server to be stateless regarding the token's behavior.	2020-07-03 19:27:13 -07:00
Anis Elleuch	2be20588bf	Reroute requests based token heal/listing (#9939 ) When manual healing is triggered, one node in a cluster will become the authority to heal. mc regularly sends new requests to fetch the status of the ongoing healing process, but a load balancer could land the healing request to a node that is not doing the healing request. This PR will redirect a request to the node based on the node index found described as part of the client token. A similar technique is also used to proxy ListObjectsV2 requests by encoding this information in continuation-token	2020-07-03 11:53:03 -07:00
Harshavardhana	810a4f0723	fix: return proper errors Get/HeadObject for deleteMarkers (#9957 )	2020-07-02 16:17:27 -07:00

1 2 3

101 Commits