minio

Commit Graph

Author	SHA1	Message	Date
Anis Eleuch	a47fc75c26	xl: Remove wrong wording for errCorruptedFormat (#18775) Also add errCorruptedBackend to make it easier to differentiate between corrupted content or something else wrong in the backend drive	2024-01-12 14:48:44 -08:00
Anis Eleuch	3f4488c589	scanner: Allow full throttle if there is no parallel disk ops (#18109)	2024-01-02 13:51:24 -08:00
Harshavardhana	a50ea92c64	feat: introduce list_quorum="auto" to prefer quorum drives (#18084) NOTE: This feature is not retroactive; it will not cater to previous transactions on existing setups. To enable this feature, set the `_MINIO_DRIVE_QUORUM=on` environment variable as part of the systemd service or k8s configmap. Once this has been enabled, you also need to set `list_quorum`: `mc admin config set alias/ api list_quorum=auto`. A new debugging tool is available to check for any missing counters.	2023-12-29 15:52:41 -08:00
Anis Eleuch	8432fd5ac2	prom: Add online and healing drives metrics per erasure set (#18700)	2023-12-21 16:56:43 -08:00
Harshavardhana	7c948adf88	allow pre-allocating buffers to reduce frequent GCs during growth (#18686) This PR also increases per node bpool memory from 1024 entries to 2048 entries; along with that, it also moves the byte pool centrally instead of being per pool.	2023-12-21 08:59:38 -08:00
Klaus Post	5f971fea6e	Fix Mux Connect Error (#18567) `OpMuxConnectError` was not handled correctly. Remove local checks for single request handlers so they can run before being registered locally. Bonus: Only log IAM bootstrap on startup.	2023-12-01 00:18:04 -08:00
jiuker	be02333529	feat: drive sub-sys to max timeout reload (#18501)	2023-11-27 09:15:06 -08:00
Harshavardhana	877e0cac03	fix: tiering statistics handling a bug in clone() implementation (#18342) Tiering statistics have been broken for some time now; a regression was introduced in `6f2406b0b6`. Bonus: fixes an issue where objects that have not yet been tiered are not assumed to be of the 'STANDARD' storage-class; this should be conditional on the object's metadata, not a default assumption. This PR also does some cleanup of the implementation. Fixes #18070	2023-10-30 09:59:51 -07:00
Harshavardhana	c60f54e5be	make ListMultipart/ListParts more reliable, skip healing disks (#18312) This PR also fixes old flaky tests by properly marking disk offline-based tests.	2023-10-24 23:33:25 -07:00
Harshavardhana	9f7044aed0	fix: ignore transient errors in read path (#18006) Errors such as `returned an error (context deadline exceeded) (fmt.wrapError)` and `(msgp: too few bytes left to read object) (fmt.wrapError)`	2023-09-11 15:29:59 -07:00
Aditya Manthramurthy	1c99fb106c	Update to minio/pkg/v2 (#17967)	2023-09-04 12:57:37 -07:00
Harshavardhana	4643efe6be	fix: add deadline worker pattern for local disk removers (#17845)	2023-08-14 12:28:13 -07:00
Harshavardhana	c45bc32d98	skip disks under scanning when healing disks (#17822) Bonus: avoid DiskInfo() calls when blocks are missing, instead heal the object using an MRF operation; change max_sleep to 250ms, beyond that we will not stop healing.	2023-08-09 12:51:47 -07:00
Harshavardhana	a7a7533190	add new errors for Disks with timeouts (#17770)	2023-08-01 12:47:50 -07:00
Harshavardhana	81be718674	fix: optimize DiskInfo() call avoid metrics when not needed (#17763)	2023-07-31 15:20:48 -07:00
Anis Eleuch	756d6aa729	fix: report correct pool/set/disk indexes for offline disks (#17695)	2023-07-20 07:48:21 -07:00
Harshavardhana	bdddf597f6	shuffle buckets randomly before being scanned (#17644) This randomness is needed to avoid scanning the same buckets in the same order across different erasure sets. Scanning buckets in random order allows a wider spread of ILM and replication checks. Additionally, do not loop twice to fill the channel; fill the channel regardless of whether a bucket is new or old.	2023-07-14 02:25:40 -07:00
Kaan Kabalak	f64d62b01d	Fix style of logOnceIf calls w/unique identifiers (#17631)	2023-07-11 13:17:45 -07:00
Kaan Kabalak	21fbe88e1f	Print certain log messages once per error (#17484)	2023-06-24 20:29:13 -07:00
Aditya Manthramurthy	5a1612fe32	Bump up madmin-go and pkg deps (#17469)	2023-06-19 17:53:08 -07:00
Praveen raj Mani	72802a5972	Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069)	2023-04-25 22:57:40 -07:00
Harshavardhana	8fd07bcd51	simplify sort.Sort by using sort.Slice (#17066)	2023-04-24 13:28:18 -07:00
Anis Eleuch	1f1c267b6c	Add used inodes to disk info (#16994)	2023-04-07 07:52:14 -07:00
Klaus Post	a547bf517d	Remove locks on usage cache (#16786)	2023-03-09 15:15:46 -08:00
Klaus Post	d07089ceac	Fix scanner deadlock on lost global lock (#16726)	2023-02-28 21:34:45 -08:00
Klaus Post	9acf1024e4	Remove bloom filter (#16682) Removes the bloom filter since it has such limited usability, often gets saturated anyway, and adds a bunch of complexity to the scanner. Also removes a tiny bit of CPU usage from each write operation.	2023-02-24 09:03:31 +05:30
Harshavardhana	b4ef5ff294	remove unnecessary code checking for supported features (#16423)	2023-01-17 19:37:47 +05:30
Harshavardhana	f1bbb7fef5	vectorize cluster-wide calls such as bucket operations (#16313)	2023-01-03 08:16:39 -08:00
Aditya Manthramurthy	a30cfdd88f	Bump up madmin-go to v2 (#16162)	2022-12-06 13:46:50 -08:00
Harshavardhana	5a8df7efb3	re-implement StorageInfo to be a peer call (#16155)	2022-12-01 14:31:35 -08:00
Klaus Post	98ba622679	Reduce temporary file clean-up waits (#16110)	2022-11-22 07:23:36 -08:00
Harshavardhana	23b329b9df	remove gateway completely (#15929)	2022-10-24 17:44:15 -07:00
Anis Elleuch	18fb86b7be	convert context.DeadlineExceed to offline disk in DiskInfo() (#15886)	2022-10-18 03:01:16 -07:00
Harshavardhana	c79bcc8838	Revert "convert context.DeadlineExceed to offline disk in DiskInfo() (#15869)" This reverts commit `0fe58dbb34`.	2022-10-14 20:37:50 -07:00
Anis Elleuch	0fe58dbb34	convert context.DeadlineExceed to offline disk in DiskInfo() (#15869)	2022-10-14 19:32:13 -07:00
Anis Elleuch	db7a9b2c37	heal-info: Return the endpoint of a disk with unknown state (#15854)	2022-10-13 16:41:44 -07:00
Anis Elleuch	5682685c80	Introduce disk io stats metrics (#15512)	2022-08-16 07:13:49 -07:00
Harshavardhana	a406bb0288	restrict number of disks used for scanning buckets up to GOMAXPROCS (#15492) Control scanner parallelism to avoid higher CPU usage on nodes that have more drives but an old CPU.	2022-08-08 16:16:44 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464)	2022-08-04 16:10:08 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161)	2022-07-05 14:45:49 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970) The main motivation is to move towards a common backend format for all the different modes in MinIO, allowing for simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, and transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Anis Elleuch	16431d222c	heal: Enable periodic bitrot scan configuration (#14464)	2022-04-07 08:10:40 -07:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456)	2022-03-03 13:21:16 -08:00
Harshavardhana	57118919d2	cached diskIDs are not needed for scanner healing (#14170) This PR removes unnecessary state that gets passed around for DiskIDs; it is not needed since each disk knows exactly which pool and which set it belongs to on a running system. Currently, cached DiskIDs do not work properly because healing always ends up skipping offline disks and never runs when disks are offline, as it expects all the cached diskIDs to always be present. This also made things somewhat inflexible in terms of, perhaps, a new diskID for `format.json` (however, this is not a big issue). Requiring all drives to be online for healing via the scanner is unnecessary; healing should trigger even when only some nodes and drives are available. This ensures that we keep the SLA intact on the objects when disks are offline for a prolonged period of time.	2022-01-26 08:34:56 -08:00
Harshavardhana	9d588319dd	support site replication to replicate IAM users, groups (#14128) Site replication was missing replicating users and groups when an empty site was added. Add site replication for groups and users when they are disabled and enabled. Add support for replicating bucket quota config.	2022-01-19 20:02:24 -08:00
Krishnan Parthasarathi	070c31eac5	Wait for updates collector when disk.NSScanner returns error (#14127)	2022-01-19 00:46:43 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015)	2022-01-02 09:15:06 -08:00
Harshavardhana	866a95de38	fix: choose appropriate quorum for a given erasure set (#13998) multiObject delete should honor expected quorum	2021-12-28 12:41:52 -08:00
Anis Elleuch	1d9e91e00f	Fix wrong reporting of total disks after restart (#13326) After a cluster restart with a failed disk, the total number of disks is counted wrongly.	2021-09-29 11:36:19 -07:00
Klaus Post	88d719689c	Synchronize bucket cycle numbers (#13058) Synchronize bucket cycles so it is much more likely that the same prefixes will be picked up for scanning. Use the global bloom filter cycle for that. Bump bloom filter versions to clear those.	2021-08-25 08:25:26 -07:00

120 Commits