minio

Commit Graph

Author	SHA1	Message	Date
Anis Eleuch	135874ebdc	heal: Avoid marking a bucket as done when remote drives are offline (#19587 )	2024-04-25 23:32:14 -07:00
Harshavardhana	cb06aee5ac	convert multipart-cleanup from a blocking unlink() to a rename to trash (#19495 ) unlinking() at two different locations on a disk when there are lots to purge, this can lead to huge IOwaits, instead rely on rename() to .trash to avoid running multiple unlinks() in parallel.	2024-04-15 03:02:39 -07:00
Anis Eleuch	95bf4a57b6	logging: Add subsystem to log API (#19002 ) Create new code paths for multiple subsystems in the code. This will make maintaing this easier later. Also introduce bugLogIf() for errors that should not happen in the first place.	2024-04-04 05:04:40 -07:00
Klaus Post	3d6194e93c	Remove empty replication stats (#19385 ) When sending final stats upstream also trim empty ReplicationStats.	2024-03-29 11:57:52 -07:00
Harshavardhana	d7520f0ae6	fix: make sure maintenance=true is honored properly (#19156 ) fixes a regression from #18700	2024-02-29 08:37:57 -08:00
Harshavardhana	f965434022	fix: re-use endpoint strings to avoid allocation during audit (#19116 )	2024-02-23 16:19:13 -08:00
Harshavardhana	eac4e4b279	honor replaced disk properly by updating globalLocalDrives (#19038 ) globalLocalDrives seem to be not updated during the HealFormat() leads to a requirement where the server needs to be restarted for the healing to continue.	2024-02-12 13:00:20 -08:00
Harshavardhana	caac9d216e	remove all the frivolous logs, that may or may not be actionable (#18922 ) for actionable, inspections we have `mc support inspect` we do not need double logging, healing will report relevant errors if any, in terms of quorum lost etc.	2024-01-30 18:11:45 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace for 'nil' channel closes(), if not safely closes it.	2024-01-28 10:04:17 -08:00
Harshavardhana	74851834c0	further bootstrap/startup optimization for reading 'format.json' (#18868 ) - Move RenameFile to websockets - Move ReadAll that is primarily is used for reading 'format.json' to to websockets - Optimize DiskInfo calls, and provide a way to make a NoOp DiskInfo call.	2024-01-25 12:45:46 -08:00
Harshavardhana	e11d851aee	add new drive I/O waiting/tokens metric (#18836 ) Bonus: add virtual memory used as well part of the system resource metrics.	2024-01-19 14:51:36 -08:00
Anis Eleuch	a47fc75c26	xl: Remove wrong wording for errCorruptedFormat (#18775 ) Also add errCorruptedBackend to make it easier to differentiate between corrupted content or something else wrong in the backend drive	2024-01-12 14:48:44 -08:00
Anis Eleuch	3f4488c589	scanner: Allow full throttle if there is no parallel disk ops (#18109 )	2024-01-02 13:51:24 -08:00
Harshavardhana	a50ea92c64	feat: introduce list_quorum="auto" to prefer quorum drives (#18084 ) NOTE: This feature is not retro-active; it will not cater to previous transactions on existing setups. To enable this feature, please set ` _MINIO_DRIVE_QUORUM=on` environment variable as part of systemd service or k8s configmap. Once this has been enabled, you need to also set `list_quorum`. ``` ~ mc admin config set alias/ api list_quorum=auto` ``` A new debugging tool is available to check for any missing counters.	2023-12-29 15:52:41 -08:00
Anis Eleuch	8432fd5ac2	prom: Add online and healing drives metrics per erasure set (#18700 )	2023-12-21 16:56:43 -08:00
Harshavardhana	7c948adf88	allow pre-allocating buffers to reduce frequent GCs during growth (#18686 ) This PR also increases per node bpool memory from 1024 entries to 2048 entries; along with that, it also moves the byte pool centrally instead of being per pool.	2023-12-21 08:59:38 -08:00
Klaus Post	5f971fea6e	Fix Mux Connect Error (#18567 ) `OpMuxConnectError` was not handled correctly. Remove local checks for single request handlers so they can run before being registered locally. Bonus: Only log IAM bootstrap on startup.	2023-12-01 00:18:04 -08:00
jiuker	be02333529	feat: drive sub-sys to max timeout reload (#18501 )	2023-11-27 09:15:06 -08:00
Harshavardhana	877e0cac03	fix: tiering statistics handling a bug in clone() implementation (#18342 ) Tiering statistics have been broken for some time now, a regression was introduced in `6f2406b0b6` Bonus fixes an issue where the objects are not assumed to be of the 'STANDARD' storage-class for the objects that have not yet tiered, this should be conditional based on the object's metadata not a default assumption. This PR also does some cleanup in terms of implementation, fixes #18070	2023-10-30 09:59:51 -07:00
Harshavardhana	c60f54e5be	make ListMultipart/ListParts more reliable skip healing disks (#18312 ) this PR also fixes old flaky tests, by properly marking disk offline-based tests.	2023-10-24 23:33:25 -07:00
Harshavardhana	9f7044aed0	fix: ignore transient errors in read path (#18006 ) Errors such as ``` returned an error (context deadline exceeded) (fmt.wrapError) ``` ``` (msgp: too few bytes left to read object) (fmt.wrapError) ```	2023-09-11 15:29:59 -07:00
Aditya Manthramurthy	1c99fb106c	Update to minio/pkg/v2 (#17967 )	2023-09-04 12:57:37 -07:00
Harshavardhana	4643efe6be	fix: add deadline worker pattern for local disk removers (#17845 )	2023-08-14 12:28:13 -07:00
Harshavardhana	c45bc32d98	skip disks under scanning when healing disks (#17822 ) Bonus: - avoid calling DiskInfo() calls when missing blocks instead heal the object using MRF operation. - change the max_sleep to 250ms beyond that we will not stop healing.	2023-08-09 12:51:47 -07:00
Harshavardhana	a7a7533190	add new errors for Disks with timeouts (#17770 )	2023-08-01 12:47:50 -07:00
Harshavardhana	81be718674	fix: optimize DiskInfo() call avoid metrics when not needed (#17763 )	2023-07-31 15:20:48 -07:00
Anis Eleuch	756d6aa729	fix: report correct pool/set/disk indexes for offline disks (#17695 )	2023-07-20 07:48:21 -07:00
Harshavardhana	bdddf597f6	shuffle buckets randomly before being scanned (#17644 ) this randomness is needed to avoid scanning the same buckets across different erasure sets, in the same order. allow random buckets to be scanned instead allowing a wider spread of ILM, replication checks. Additionally do not loop over twice to fill the channel, fill the channel regardless of having bucket new or old.	2023-07-14 02:25:40 -07:00
Kaan Kabalak	f64d62b01d	Fix style of logOnceIf calls w/unique identifiers (#17631 )	2023-07-11 13:17:45 -07:00
Kaan Kabalak	21fbe88e1f	Print certain log messages once per error (#17484 )	2023-06-24 20:29:13 -07:00
Aditya Manthramurthy	5a1612fe32	Bump up madmin-go and pkg deps (#17469 )	2023-06-19 17:53:08 -07:00
Praveen raj Mani	72802a5972	Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069 )	2023-04-25 22:57:40 -07:00
Harshavardhana	8fd07bcd51	simplify sort.Sort by using sort.Slice (#17066 )	2023-04-24 13:28:18 -07:00
Anis Eleuch	1f1c267b6c	Add used inodes to disk info (#16994 )	2023-04-07 07:52:14 -07:00
Klaus Post	a547bf517d	Remove locks on usage cache (#16786 )	2023-03-09 15:15:46 -08:00
Klaus Post	d07089ceac	Fix scanner deadlock on lost global lock (#16726 )	2023-02-28 21:34:45 -08:00
Klaus Post	9acf1024e4	Remove bloom filter (#16682 ) Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner. Also removes a tiny bit of CPU by each write operation.	2023-02-24 09:03:31 +05:30
Harshavardhana	b4ef5ff294	remove unnecessary code checking for supported features (#16423 )	2023-01-17 19:37:47 +05:30
Harshavardhana	f1bbb7fef5	vectorize cluster-wide calls such as bucket operations (#16313 )	2023-01-03 08:16:39 -08:00
Aditya Manthramurthy	a30cfdd88f	Bump up madmin-go to v2 (#16162 )	2022-12-06 13:46:50 -08:00
Harshavardhana	5a8df7efb3	re-implement StorageInfo to be a peer call (#16155 )	2022-12-01 14:31:35 -08:00
Klaus Post	98ba622679	Reduce temporary file clean-up waits (#16110 )	2022-11-22 07:23:36 -08:00
Harshavardhana	23b329b9df	remove gateway completely (#15929 )	2022-10-24 17:44:15 -07:00
Anis Elleuch	18fb86b7be	convert context.DeadlineExceed to offline disk in DiskInfo() (#15886 )	2022-10-18 03:01:16 -07:00
Harshavardhana	c79bcc8838	Revert "convert context.DeadlineExceed to offline disk in DiskInfo() (#15869 )" This reverts commit `0fe58dbb34`.	2022-10-14 20:37:50 -07:00
Anis Elleuch	0fe58dbb34	convert context.DeadlineExceed to offline disk in DiskInfo() (#15869 )	2022-10-14 19:32:13 -07:00
Anis Elleuch	db7a9b2c37	heal-info: Return the endpoint of a disk with unknown state (#15854 )	2022-10-13 16:41:44 -07:00
Anis Elleuch	5682685c80	Introduce disk io stats metrics (#15512 )	2022-08-16 07:13:49 -07:00
Harshavardhana	a406bb0288	restrict number of disks used for scanning buckets upto GOMAXPROCS (#15492 ) control scanner parallelism to avoid higher CPU usage on nodes that have more drives but an old CPU.	2022-08-08 16:16:44 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464 )	2022-08-04 16:10:08 -07:00

1 2 3

131 Commits