minio

mirror of https://github.com/minio/minio.git synced 2024-12-25 06:35:56 -05:00

Author	SHA1	Message	Date
Poorna	21bf5b4db7	replication: heal proactively upon access (#15501 ) Queue failed/pending replication for healing during listing and GET/HEAD API calls. This includes healing of existing objects that were never replicated or those in the middle of a resync operation. This PR also fixes a bug in ListObjectVersions where lifecycle filtering should be done.	2022-08-09 15:00:24 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464 )	2022-08-04 16:10:08 -07:00
Harshavardhana	a6e0ec4e6f	Add support converting non-inlined to inlined (#15444 ) This is a feature to allow for inode compaction on large clusters that use a lot of small files spread across a large hierarchy.	2022-08-02 23:10:22 -07:00
Harshavardhana	cbd70d26b5	optimize speedtest for smaller setups (#15414 ) this has been observed in multiple environments where the setups are small `speedtest` naturally fails with default '10s' and the concurrency of '32' is big for such clusters. choose a smaller value i.e equal to number of drives in such clusters and let 'autotune' increase the concurrency instead.	2022-07-27 14:41:59 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Harshavardhana	7da9e3a6f8	support encrypted/compressed objects properly during decommission (#15320 ) fixes #15314	2022-07-16 19:35:24 -07:00
Harshavardhana	1b339ea062	allow force delete on decom pool (#15302 ) Bonus: - skip suspended pool from being considered for multipart uploads - add more context for decomErrors()	2022-07-14 20:44:22 -07:00
Anis Elleuch	996cac5fed	Avoid listing buckets from a suspended pool (#15283 ) Make sure that buckets requested after decommissioning has started are not created in a suspended pool. Therefore listing buckets should avoid suspended pools as well.	2022-07-13 07:44:50 -07:00
Harshavardhana	ae92521310	remove unnecessary nAgreed value in partial() func (#15242 )	2022-07-07 13:45:34 -07:00
Anis Elleuch	8d98282afd	Better reporting of total/free usable capacity of the cluster (#15230 ) The current code uses approximation using a ratio. The approximation can skew if we have multiple pools with different disk capacities. Replace the algorithm with a simpler one which counts data disks and ignore parity disks.	2022-07-06 13:29:49 -07:00
Harshavardhana	2518af5f9e	fix: allow certain mutations on objects during decommissioning (#15231 ) Currently, deletion of objects was skipped by mistake if the object resided on the pool being decommissioned. Deletes are okay to allow since decommission is designed to run on a cluster with active I/O.	2022-07-06 09:53:16 -07:00
Harshavardhana	9d80ff5a05	fix: decommission delete markers for non-current objects (#15225 ) Versioned buckets were not creating the delete markers present in the versioned stack of an object, which would essentially prevent decommission from succeeding. This PR fixes creating such delete markers properly during a decommissioning process, and adds tests as well.	2022-07-05 07:37:24 -07:00
Harshavardhana	9c605ad153	allow support for parity '0', '1' enabling support for 2,3 drive setups (#15171 ) allows for further granular setups - 2 drives (1 parity, 1 data) - 3 drives (1 parity, 2 data) Bonus: allows '0' parity as well.	2022-06-27 20:22:18 -07:00
Anis Elleuch	73733a8fb9	heal: Report correctly in multi-pool setups (#15117 ) `mc admin heal -r <alias>` in a multi-pool setup incorrectly returns grey objects. The reason is that erasure-server-pools.HealObject() runs HealObject in all pools and returns the result of the first nil error. However, at the lower erasureObject level, HealObject() returns nil if an object does not exist, plus a missing error for each disk of the object in that pool, therefore confusing mc. Make erasureObject.HealObject() return a not found error at the lower level, so at least erasureServerPools will know which pools to ignore.	2022-06-20 08:07:45 -07:00
Harshavardhana	b0d7332a0c	healthcheck cluster endpoint should honor write/readQuorum per pool (#15053 )	2022-06-07 19:08:21 -07:00
Harshavardhana	31c4fdbf79	fix: resyncing 'null' version on pre-existing content (#15043 ) PR #15041 fixed replicating 'null' version however due to a regression from #14994 caused the target versions for these 'null' versioned objects to have different 'versions', this may cause confusion with bi-directional replication and cause double replication. This PR fixes this properly by making sure we replicate the correct versions on the objects.	2022-06-06 15:14:56 -07:00
Harshavardhana	52221db7ef	fix: for unexpected errors in reading versioning config panic (#14994 ) We need to make sure if we cannot read bucket metadata for some reason, and bucket metadata is not missing and returning corrupted information we should panic such handlers to disallow I/O to protect the overall state on the system. In-case of such corruption we have a mechanism now to force recreate the metadata on the bucket, using `x-minio-force-create` header with `PUT /bucket` API call. Additionally fix the versioning config updated state to be set properly for the site replication healing to trigger correctly.	2022-05-31 02:57:57 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Klaus Post	c0bf02b8b2	Ignore disks with 0 total space (#14981 ) Mainly defensive, to ensure no `/0` in percent calculation.	2022-05-26 06:01:50 -07:00
Klaus Post	a4be0b88f6	Add server pool reserved space (#14974 ) If one or more pools reach 85% usage in a set, we will only use pools that have more free space. In case all pools are above 85% we allow all of them to be used with the regular distribution.	2022-05-25 13:20:20 -07:00
Klaus Post	41cdb357bb	Compensate for different server pool sizes (#14968 ) When a server pool with a different number of sets is added they are not compensated when choosing a destination pool for new objects. This leads to the unbalanced placement of objects with smaller pools getting a bigger number of objects since we only compare the destination sets directly. This change will compensate for differences in set sizes when choosing the destination pool. Different set sizes are already compensated by fewer disks.	2022-05-24 18:57:14 -07:00
Anis Elleuch	1e037883b0	pools: GetObjectNInfo should cover locking during object read (#14887 ) In case of multi-pools setup, GetObjectNInfo returns a GetObjectReader but it unlocks the read lock when quitting GetObjectNInfo. This should not happen, unlock should only happen when GetObjectReader is closed.	2022-05-10 07:47:40 -07:00
Harshavardhana	598ce1e354	supply prefix filtering when necessary (#14772 ) Currently filterPrefix was never set or used, although it would filter out entries when `prefix` doesn't end with `/`. This often leads to objects getting Walked(), Healed() that were never requested by the caller.	2022-04-19 08:20:48 -07:00
Harshavardhana	153a612253	fetch bucket retention config once for ILM evalAction (#14727 ) This is mainly an optimization, does not change any existing functionality.	2022-04-11 13:25:32 -07:00
Harshavardhana	e77ad3f9bb	make sure to pass Lifecycle if set for List filtering (#14722 ) PR #14606 never really passed the Lifecycle filter down to the listing callers to ensure skipping the entries.	2022-04-10 11:14:52 -07:00
Anis Elleuch	16431d222c	heal: Enable periodic bitrot scan configuration (#14464 )	2022-04-07 08:10:40 -07:00
Harshavardhana	7956ff0313	fix: multiple pool setup return incorrect DeleteMarker metadata (#14642 )	2022-03-27 23:39:50 -07:00
Klaus Post	2ac54e5a7b	ListObjects: Filter lifecycle expired objects (#14606 ) For ListObjects and ListObjectsV2 perform lifecycle checks on all objects before returning. This will filter out objects that are pending lifecycle expiration. Bonus: Cheaper server pool conflict resolution by not converting to FileInfo.	2022-03-22 12:39:45 -07:00
Harshavardhana	bd6f7b6d83	fix: make decommission restart non-blocking (#14591 ) currently an on-going decommission, during a server restart might block the startup sequence for relatively longer periods, instead start the decommission in background lazily.	2022-03-20 14:46:43 -07:00
Harshavardhana	5d6f6d8d5b	create missing .minio.sys/config, .minio.sys/buckets during decommission (#14497 )	2022-03-07 16:18:57 -08:00
Harshavardhana	7e803adf13	do not attempt force delete on bucket (#14452 ) caller needs to ask explicitly for force delete otherwise, the force delete might end up deleting an existing bucket with data. fixes #14445	2022-03-02 20:47:53 -08:00
Harshavardhana	e43cc316ff	remove errCh usage from HealObjects() simplify it (#14414 ) errCh is not needed instead, rely on errs slice to capture and return errors instead. most probably fixes #14247	2022-02-25 12:20:41 -08:00
Klaus Post	60cd513a33	Fix leaked healing goroutines (#14322 ) Only the first `listAndHeal` would ever be able to write on errCh, blocking all others infinitely. Instead read all errors but return the first non-nil, if any. The intention appears to be that this should cancel on any error, so that part is kept. Regression from #13990	2022-02-16 08:40:18 -08:00
Poorna	ed3418c046	Refactor replication resync to be an active process (#14266 ) When resync is triggered, walk the bucket namespace and resync objects that are unreplicated. This PR also adds an API to report resync progress.	2022-02-10 10:16:52 -08:00
Harshavardhana	f19a414e09	fix: allow dangling objects to be purged properly in deleteMultipleObjects() (#14273 ) Deleting bulk objects had an issue since the relevant versionID is not passed through the layers to ensure that the dangling object purge actually works cleanly. This is a continuation of the quorum related error returned by the multi-object delete API from #14248. This PR ensures that we pass down correct information as well as extend the scope of dangling object detection.	2022-02-08 20:08:23 -08:00
Harshavardhana	dbd05d6e82	remove FIFO bucket quota, use ILM expiration instead (#14206 )	2022-01-31 11:07:04 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis once read during init is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json' only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Harshavardhana	404b05a44c	fix: ignore drained pool in Healing, hold lock additionally (#14080 )	2022-01-11 12:27:47 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Harshavardhana	001b77e7e1	use readConfig/saveConfig to simplify I/O on usage/tracker info (#14019 )	2022-01-03 10:22:58 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	b883803b21	fix: healing across pools removing dangling objects (#13990 ) adds other simplifications to the code when running namespace heals across pools.	2021-12-25 09:01:44 -08:00
Harshavardhana	2f1e8ba612	add more directory marker tests and fix a bug (#13871 ) ListObjects() should never list a delete-marked folder if latest is delete marker and delimiter is not provided. ListObjectVersions() should list a delete-marked folder even if latest is delete marker and delimiter is not provided. Enhance further versioning listing on the buckets	2021-12-09 14:59:23 -08:00
Harshavardhana	dcff6c996d	fix: do not list delete-marked objects (#13864 ) Delete-marked objects should not be considered for listing when listing is delimited. This issue was introduced in PR #13804, which was mainly to address listing of directories when delimited. This PR fixes this properly and adds tests to ensure that we behave in accordance with how an S3 API behaves for ListObjects() without versions.	2021-12-08 17:34:52 -08:00
Harshavardhana	b120bcb60a	validate if cached value is empty before use (#13830 ) fixes a crash reproduced while running hadoop tests ``` goroutine 201564 [running]: github.com/minio/minio/cmd.metaCacheEntries.resolve({0xc0206ab7a0, 0x4, 0xc0015b1908}, 0xc0212a7040) github.com/minio/minio/cmd/metacache-entries.go:352 +0x58a ``` Bonus: HeadBucket() should always provide content-type	2021-12-06 02:59:51 -08:00
Klaus Post	3db931dc0e	Improve listing consistency with version merging (#13723 )	2021-12-02 11:29:16 -08:00
Harshavardhana	21c868a646	fix: do not ignore delete-marker directories in ListObjects() (#13804 ) Following scenario such as objects that exist inside a prefix say `folder/` must be included in the listObjects() response. ``` 2aa16073-387e-492c-9d59-b4b0b7b6997a v2 DEL folder/ a5b9ce68-7239-4921-90ab-20aed402c7a2 v1 PUT folder/ f2211798-0eeb-4d9e-9184-fcfeae27d069 v1 PUT folder/1.txt ``` Current master does not handle this scenario, because it ignores the top level delete-marker on folders. This is however unexpected. It is expected that list-objects returns the top level prefix in this situation. ``` aws s3api list-objects --bucket harshavardhana --prefix unique/ \ --delimiter / --profile minio --endpoint-url http://localhost:9000 { "CommonPrefixes": [ { "Prefix": "unique/folder/" } ] } ``` There are applications in the wild such as Hadoop s3a connector that exploit this behavior and expect the folder to be present in the response. This also makes the behavior consistent with AWS S3.	2021-12-02 08:46:33 -08:00
Harshavardhana	b280a37c4d	add delete-marker proactively in DeleteObject() (#13795 ) single object delete was not working properly on a bucket when versioning was suspended, current version 'null' object was never removed. added unit tests to cover the behavior fixes #13783	2021-11-30 18:30:06 -08:00
Harshavardhana	4545ecad58	ignore swapped drives instead of throwing errors (#13655 ) - add checks such that swapped disks are detected and ignored - never used for normal operations. - implement `unrecognizedDisk` to be ignored with all operations returning `errDiskNotFound`. - also add checks such that we do not load unexpected disks while connecting automatically. - additionally humanize the values when printing the errors. Bonus: fixes handling of non-quorum situations in getLatestFileInfo(), that does not work when 2 drives are down, currently this function would return errors incorrectly.	2021-11-15 09:46:55 -08:00
Klaus Post	23d6770ff9	Inspect: Preserve permission flags (#13490 ) Preserve permission from disk files. Can help identify issues. Refactor GetRawData function to be cleaner.	2021-10-21 11:20:13 -07:00
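
Commits 60cd513a33 (#14322) and e43cc316ff (#14414) above describe the same idea: instead of several healing goroutines racing to write on one error channel, collect every result and return the first non-nil error, if any. The following is a minimal sketch of that pattern only, not the MinIO implementation; `firstError`, `errDriveOffline`, and the task functions are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDriveOffline is a hypothetical error used only for this illustration.
var errDriveOffline = errors.New("drive offline")

// firstError runs every task concurrently, waits for all of them to finish,
// and returns the first non-nil error in task order (nil if all succeed).
// Each goroutine writes to its own slot of errs, so no goroutine can block
// on a channel that nobody reads from.
func firstError(tasks []func() error) error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() error) {
			defer wg.Done()
			errs[i] = task()
		}(i, task)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tasks := []func() error{
		func() error { return nil },
		func() error { return errDriveOffline },
		func() error { return nil },
	}
	fmt.Println(firstError(tasks))
}
```

Writing into a pre-sized slice keeps every goroutine independent of the others; the sketch deliberately leaves out the cancel-on-error behavior that the commit message says is kept.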

131 Commits
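
Commit a4be0b88f6 (#14974) above states that once one or more pools reach 85% usage, only pools with more free space are used, and that all pools are used again when every pool is above 85%. That rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the MinIO code: the `pool` type, its fields, and `eligiblePools` are hypothetical, and usage is assumed to be a simple used/total byte ratio.

```go
package main

import "fmt"

// pool is a hypothetical summary of one server pool's capacity.
type pool struct {
	Name  string
	Total uint64 // total bytes
	Used  uint64 // used bytes
}

// reservedThreshold is the 85% usage mark named in the commit message.
const reservedThreshold = 0.85

// eligiblePools returns the pools whose usage is below the threshold.
// If no pool is below it, every pool is returned so that writes can
// still be distributed across all of them.
func eligiblePools(pools []pool) []pool {
	var below []pool
	for _, p := range pools {
		if p.Total == 0 {
			continue // no capacity reported; cannot compute a usage ratio
		}
		if float64(p.Used)/float64(p.Total) < reservedThreshold {
			below = append(below, p)
		}
	}
	if len(below) == 0 {
		return pools
	}
	return below
}

func main() {
	pools := []pool{
		{Name: "pool-1", Total: 100, Used: 90},
		{Name: "pool-2", Total: 100, Used: 50},
	}
	for _, p := range eligiblePools(pools) {
		fmt.Println(p.Name)
	}
}
```

How the destination is then chosen among the eligible pools (the "regular distribution" in the commit message) is outside the scope of this sketch.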