minio

Commit Graph

Author	SHA1	Message	Date
Klaus Post	2229509362	fix: leaking offline disks in MarkOffline() thread (#18414 ) `monitorAndConnectEndpoints` will continue to attempt to reconnect offline disks. Since disks were never closed, a `MarkOffline` would continue to try to check these disks forever. Close previous disks.	2023-11-09 09:33:32 -08:00
Aditya Manthramurthy	1c99fb106c	Update to minio/pkg/v2 (#17967 )	2023-09-04 12:57:37 -07:00
Harshavardhana	eb55034dfe	optimize deletePrefix, use direct set location via object name (#17827 ) * optimize deletePrefix, use direct set location via object name instead of fanning out the calls for an object force delete we can assume the set location and not do fan-out calls * Apply suggestions from code review Co-authored-by: Krishnan Parthasarathi <krisis@users.noreply.github.com> --------- Co-authored-by: Krishnan Parthasarathi <krisis@users.noreply.github.com>	2023-08-09 16:30:22 -07:00
Harshavardhana	b0f0e53bba	fix: make sure to correctly initialize health checks (#17765 ) health checks were missing for drives replaced since - HealFormat() would replace the drives without a health check - disconnected drives when they reconnect via connectEndpoint() the loop also loses health checks for local disks and merges these into a single code. - other than this separate cleanUp, health check variables to avoid overloading them with similar requirements. - also ensure that we compete via context selector for disk monitoring such that the canceled disks don't linger around longer waiting for the ticker to trigger. - allow disabling active monitoring.	2023-08-01 10:54:26 -07:00
Harshavardhana	81be718674	fix: optimize DiskInfo() call avoid metrics when not needed (#17763 )	2023-07-31 15:20:48 -07:00
Aditya Manthramurthy	5a1612fe32	Bump up madmin-go and pkg deps (#17469 )	2023-06-19 17:53:08 -07:00
Klaus Post	c839b64f6a	fix: compressed+encrypted block overhead (#17289 )	2023-05-26 10:57:07 -07:00
jiuker	f037c9b286	Protecting the read index is not out of bounds (#17226 )	2023-05-17 12:09:41 -07:00
Praveen raj Mani	72802a5972	Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069 )	2023-04-25 22:57:40 -07:00
Harshavardhana	84f31ed45d	simplify MRF, converge it to regular healing (#17026 )	2023-04-19 07:47:42 -07:00
Harshavardhana	6825bd7e75	fix: inlined objects don't need to honor long locks (#17039 )	2023-04-17 12:16:37 -07:00
Poorna	d1e775313d	support decommissioning of tiered objects (#16751 )	2023-03-16 07:48:05 -07:00
Harshavardhana	b4ef5ff294	remove unnecessary code checking for supported features (#16423 )	2023-01-17 19:37:47 +05:30
Anis Elleuch	2146ed4033	xl: Quit early when EC config is incorrect (#16390 ) Co-authored-by: Anis Elleuch <anis@min.io>	2023-01-09 23:07:45 -08:00
Anis Elleuch	0333412148	fix: heal only once per disk per set among multiple disks (#16358 )	2023-01-05 20:41:19 -08:00
Harshavardhana	a15a2556c3	converge listBuckets() as a peer call (#16346 )	2023-01-03 23:39:40 -08:00
Harshavardhana	f1bbb7fef5	vectorize cluster-wide calls such as bucket operations (#16313 )	2023-01-03 08:16:39 -08:00
Harshavardhana	f93183f66e	fix: a deadlock by refactoring listBuckets() under site replication (#16323 )	2022-12-29 00:08:31 -08:00
Harshavardhana	b882310e2b	avoid locks for internal and invalid buckets in MakeBucket() (#16302 )	2022-12-23 07:46:00 -08:00
Aditya Manthramurthy	a30cfdd88f	Bump up madmin-go to v2 (#16162 )	2022-12-06 13:46:50 -08:00
Harshavardhana	5a8df7efb3	re-implement StorageInfo to be a peer call (#16155 )	2022-12-01 14:31:35 -08:00
Klaus Post	cc1d8f0057	Check for abandoned data when healing (#16122 )	2022-11-28 10:20:55 -08:00
Klaus Post	98ba622679	Reduce temporary file clean-up waits (#16110 )	2022-11-22 07:23:36 -08:00
Klaus Post	ecc932d5dd	Clean entire tmp-old on restart (#15979 )	2022-10-31 07:27:50 -07:00
Harshavardhana	23b329b9df	remove gateway completely (#15929 )	2022-10-24 17:44:15 -07:00
Klaus Post	3c605c93fe	warn when 0 parity has been set as default parity (#15790 )	2022-10-04 22:41:42 -07:00
Klaus Post	a9f1ad7924	Add extended checksum support (#15433 )	2022-08-29 16:57:16 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464 )	2022-08-04 16:10:08 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Anis Elleuch	ed0cbfb31e	fix: rootdisk detection by not using cached value when GetDiskInfo() errors out (#15249 ) GetDiskInfo() uses timedValue to cache the disk info for one second. timedValue behavior was recently changed to return an old cached value when calculating a new value returns an error. When a mount point is empty, GetDiskInfo() will return errUnformattedDisk, timedValue will return cached disk info with unexpected IsRootDisk value, e.g. false if the mount point belongs to a root disk. Therefore, the mount point will be considered a valid disk and will be formatted as well. This commit will also add more defensive code when marking root disks: always mark a disk offline for any GetDiskInfo() error except errUnformattedDisk. The server will try anyway to reconnect to those disks every 10 seconds.	2022-07-07 17:05:23 -07:00
Harshavardhana	32b2f6117e	fix: do not pass around sync.Map (#15250 ) it is not safe to pass around sync.Map through pointers, as it may be concurrently updated by different callers. this PR simplifies by avoiding sync.Map altogether, we do not need sync.Map to keep object->erasureMap association. This PR fixes a crash when concurrently using this value when audit logs are configured. ``` fatal error: concurrent map iteration and map write goroutine 247651580 [running]: runtime.throw({0x277a6c1?, 0xc002381400?}) runtime/panic.go:992 +0x71 fp=0xc004d29b20 sp=0xc004d29af0 pc=0x438671 runtime.mapiternext(0xc0d6e87f18?) runtime/map.go:871 +0x4eb fp=0xc004d29b90 sp=0xc004d29b20 pc=0x41002b ```	2022-07-07 17:04:25 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
Harshavardhana	bd099f5e71	fix: change timedValue to return the previously cached value (#15169 ) fix: change timedvalue to return previous cached value caller can interpret the underlying error and decide accordingly, places where we do not interpret the errors upon timedValue.Get() - we should simply use the previously cached value instead of returning "empty". Bonus: remove some unused code	2022-06-25 08:50:16 -07:00
Poorna	cb097e6b0a	CopyObject: fix read/write err on closed pipe (#15135 ) Fixes: #15128 Regression from PR#14971	2022-06-21 19:20:11 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Harshavardhana	03f8b25b50	disable connectDisks loop under testing (#14920 ) avoids races during tests, keeps tests predictable	2022-05-16 05:36:00 -07:00
Anis Elleuch	44a3b58e52	Add audit log for decommissioning (#14858 )	2022-05-04 00:45:27 -07:00
Harshavardhana	eda34423d7	update gofumpt -w - new changes	2022-04-13 12:00:11 -07:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Xuehan Xu	becec6cb6b	correct mrf.newSetReconnected invocation's param order (#14426 ) Signed-off-by: xuxuehan <xuxuehan@qianxin.com>	2022-02-28 09:13:19 -08:00
Harshavardhana	0cbdc458c5	fix: do not reload disk format.json on a reconnected disk (#14351 ) An onlineDisk means its a valid disk but it may be a re-connected disk, this PR verifies that based on LastConn() to only trigger MRF. Current code would again re-load the disk 'format.json' which is not necessary and perhaps an unnecessary call. A potential side affect of this is closing perfectly online disks and getting re-replaced by reloading 'format.json'. This PR tries to avoid this situation by making sure MRF is triggered but not reloading 'format.json' because of MRF.	2022-02-21 15:51:54 -08:00
Anis Elleuch	1f92fc3fc0	Always check for root disks unless MINIO_CI_CD is set (#14232 ) The current code considers a pool with all root disks to be as part of a testing environment even if there are other pools with mounted disks. This will result to illegitimate writing in root disks. Fix this by simplifing the logic: require MINIO_CI_CD in order to skip root disk check.	2022-02-13 15:42:07 -08:00
Harshavardhana	fad3d66093	parallelize background cleanup on local disks across sets (#14290 )	2022-02-11 14:22:48 -08:00
Harshavardhana	6123377e66	speedup getFormatErasureInQuorum use driveCount (#14239 ) startup speed-up, currently getFormatErasureInQuorum() would spend up to 2-3secs when there are 3000+ drives for example in a setup, simplify this implementation to use drive counts.	2022-02-04 12:21:21 -08:00
Anis Elleuch	127e8bf3b6	heal: Avoid printing repetitive error to heal a root disk (#14220 ) The healing code repeatedly tries to heal a root disk when it is empty the reason is that connectEndpoint() returns errUnformattedDisk even if the disk is a root disk. Changing that to returning another error will avoid queueing the disk to the healing code in each connect disks iteration.	2022-01-31 17:28:20 -08:00
Harshavardhana	b68f0cbde4	ignore remote disks with diskID empty as offline (#14168 ) concurrent loading of erasure sets can now expose a situation in a distributed setup that might return diskID as empty, treat such disks as offline.	2022-01-24 19:40:02 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis once read during init is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json' only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
yinhen	d300e775a6	Avoid reconnect of disk during startup sequence (#14070 )	2022-01-10 23:33:58 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00

1 2 3 4

168 Commits