minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	013cc66d8e	add dataErrs for healing debug log (#15092 )	2022-06-16 09:42:45 -07:00
Harshavardhana	dea8220eee	do not heal outdated disks > parityBlocks (#14976 ) this PR also fixes a situation where incorrect partsMetadata slice was used where fi.Data was re-used from a single drive causing duplication of the shards across all drives. This happens for situations where shouldHeal() returns true for all drives > parityBlocks. To avoid this we should never attempt to heal on all drives > parityBlocks, unless we are doing metadata migration from xl.json -> xl.meta	2022-05-25 15:17:10 -07:00
Harshavardhana	eda34423d7	update gofumpt -w - new changes	2022-04-13 12:00:11 -07:00
Anis Elleuch	3fca4055d2	heal: Re-heal an object when a corruption is found during normal scan (#14482 ) When scanning using normal mode, HealObject() can report an error saying that it found a corrupted part. This doesn't happen when HealObject() is called with the bitrot scan flag. However, when this happens, we can still restart HealObject() with the bitrot scan. This is also important because this means the scanner and the new disks healer will not be able to heal an object that doesn't exist in a specific disk and has corruption in another disk. Also without this PR, mc admin heal command without bitrot will report an error.	2022-03-04 18:24:34 -08:00
Harshavardhana	f19a414e09	fix: allow dangling objects to be purged properly deleteMultipleObjects() (#14273 ) Deleting bulk objects had an issue since the relevant versionID is not passed through the layers to ensure that the dangling object purge actually works cleanly. This is a continuation of quorum related error returned by multi-object delete API from #14248 This PR ensures that we pass down correct information as well as extend the scope of dangling object detection.	2022-02-08 20:08:23 -08:00
Harshavardhana	f546636c52	fix: use renameAll instead of deleteObject() for purging temporary files (#14096 ) This PR simplifies a few things - Multipart parts are renamed, and upon failure are unrenamed(); keep this multipart specific behavior, it is needed and works fine. - AbortMultipart should blindly delete once lock is acquired instead of re-reading metadata and calculating quorum, abort is a delete() operation and client has no business looking for errors on this. - Skip Access() calls to folders that are operating on `.minio.sys/multipart` folder as well.	2022-01-13 11:07:41 -08:00
Harshavardhana	38ccc4f672	fix: make sure to avoid calling RenameData() on disconnected disks. (#14094 ) Large clusters with multiple sets, or multi-pool setups at times might fail and report unexpected "file not found" errors. This can become a problem during startup sequence when some files need to be created at multiple locations. - This PR ensures that we nil the erasure writers such that they are skipped in RenameData() call. - RenameData() doesn't need to "Access()" calls for `.minio.sys` folders they always exist. - Make sure PutObject() never returns ObjectNotFound{} for any errors, make sure it always returns "WriteQuorum" when renameData() fails with ObjectNotFound{}. Return appropriate errors for all other cases.	2022-01-12 18:49:01 -08:00
Harshavardhana	b7c5e45fff	heal: isObjectDangling should return false when it cannot decide (#14053 ) In a multi-pool setup when disks are coming up, or in a single pool setup let's say with 100's of erasure sets with a slow network. It's possible when healing is attempted on `.minio.sys/config` folder, it can lead to healing unexpectedly deleting some policy files as dangling due to a mistake in understanding when `isObjectDangling` is considered to be 'true'. This issue happened in commit `30135eed86` when we assumed the validMeta with empty ErasureInfo is considered to be fully dangling. This implementation issue gets exposed when the server is starting up. This is most easily seen with multiple-pool setups because of the disconnected fashion pools that come up. The decision to purge the object as dangling is taken incorrectly prior to the correct state being achieved on each pool, when the corresponding drive let's say returns 'errDiskNotFound', a 'delete' is triggered. At this point, the 'drive' comes online because this is part of the startup sequence as drives can come online lazily. This kind of situation exists because we allow (totalDisks/2) number of drives to be online when the server is being restarted. Implementation made an incorrect assumption here leading to policies getting deleted. Added tests to capture the implementation requirements.	2022-01-07 19:11:54 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	79df2c7ce7	correctly calculate read quorum based on the available fileInfo (#14000 ) The current usage of assuming `default` parity of `4` is not correct for all objects stored on MinIO, objects in .minio.sys have maximum parity, healing won't trigger on these objects due to incorrect verification of quorum.	2021-12-28 15:33:03 -08:00
Harshavardhana	b883803b21	fix: healing across pools removing dangling objects (#13990 ) adds other simplifications to the code when running namespace heals across pools.	2021-12-25 09:01:44 -08:00
Harshavardhana	7e3a7d7044	add healing for invalid shards by skipping the blocks (#13978 ) Built on top of #13945, now we need to simply skip the shards and it's automated.	2021-12-23 23:01:46 -08:00
Harshavardhana	0e3037631f	skip inconsistent shards if possible (#13945 ) data shards were wrong due to a healing bug reported in #13803 mainly with unaligned object sizes. This PR is an attempt to automatically avoid these shards, with available information about the `xl.meta` and actual disk mtime.	2021-12-21 10:08:26 -08:00
jiangfucheng	7460fb8349	fix padding error and compatible with uploaded objects (#13803 )	2021-12-03 09:26:30 -08:00
Harshavardhana	28f95f1fbe	quorum calculation getLatestFileInfo should be itself (#13717 ) FileInfo quorum shouldn't be passed down, instead inferred after obtaining a maximally occurring FileInfo. This PR also changes other functions that rely on wrong quorum calculation. Update tests as well to handle the proper requirement. All these changes are needed when migrating from older deployments where we used to set N/2 quorum for reads to EC:4 parity in newer releases.	2021-11-22 09:36:29 -08:00
Harshavardhana	c791de0e1e	re-implement pickValidInfo dataDir, move to quorum calculation (#13681 ) dataDir loosely based on maxima is incorrect and does not work in all situations such as disks in the following order - xl.json migration to xl.meta there may be partial xl.json's leftover if some disks are not yet connected when the disk is yet to come up, since xl.json mtime and xl.meta is same the dataDir maxima doesn't work properly leading to quorum issues. - it's also possible that XLV1 might be true among the disks available, make sure to keep FileInfo based on common quorum and skip unexpected disks with the older data format. Also, this PR tests upgrade from older to a newer release if the data is readable and matches the checksum. NOTE: this is just initial work we can build on top of this to do further tests.	2021-11-21 10:41:30 -08:00
Harshavardhana	4545ecad58	ignore swapped drives instead of throwing errors (#13655 ) - add checks such that swapped disks are detected and ignored - never used for normal operations. - implement `unrecognizedDisk` to be ignored with all operations returning `errDiskNotFound`. - also add checks such that we do not load unexpected disks while connecting automatically. - additionally humanize the values when printing the errors. Bonus: fixes handling of non-quorum situations in getLatestFileInfo(), that does not work when 2 drives are down, currently this function would return errors incorrectly.	2021-11-15 09:46:55 -08:00
Harshavardhana	94d587e6fc	fix: delete-markers without quorum were unreadable (#13351 ) DeleteMarkers were unreadable if they had quorum based guarantees, this PR tries to fix this behavior appropriately. DeleteMarkers with sufficient should be allowed and the return error should be accordingly with or without version-id. This also allows for overwrites which may not be possible in a multi-pool setup. fixes #12787	2021-10-04 08:53:38 -07:00
Anis Elleuch	1d9e91e00f	Fix wrong reporting of total disks after restart (#13326 ) A restart of the cluster and a failed disk will wrongly count the number of total disks.	2021-09-29 11:36:19 -07:00
Klaus Post	0e7fdcee30	Healing: Decide healing inlining based on metadata (#13178 ) Don't perform an independent evaluation of inlining, but mirror the decision made when uploading the object. Leads to some objects being inlined or not based on new metrics. Instead respect previous decision.	2021-09-09 08:55:43 -07:00
Harshavardhana	a19e3bc9d9	add more dangling heal related tests (#13140 ) also make sure that HealObject() never returns 'ObjectNotFound' or 'VersionNotFound' errors, as those are meaningless and not useful for the caller.	2021-09-02 20:56:13 -07:00
Krishnan Parthasarathi	db35bcf2ce	heal: Remove transitioned objects' parts from outdated disks (#13018 ) Bonus: check equality for replication and other metadata	2021-08-23 13:14:55 -07:00
Harshavardhana	035882d292	fix: remove parentIsObject() check (#12851 ) we will allow situations such as ``` a/b/1.txt a/b ``` and ``` a/b a/b/1.txt ``` we are going to document that this usecase is not supported and we will never support it, if any application does this users have to delete the top level parent to make sure namespace is accessible at lower level. rest of the situations where the prefixes get created across sets are supported as is.	2021-08-03 13:26:57 -07:00
Harshavardhana	a9d9b520ec	remove short circuited healing optimization (#12796 ) this healing optimization caused multiple regressions in healing - delete-markers incorrectly missing heal and returning incorrect healing results to client. - missing individual 'parts' such as for restored object or simply for all objects just missing few parts. This optimization is not necessary, we should proceed to verify all cases possible not just when metadata is inconsistent.	2021-07-26 16:51:09 -07:00
Harshavardhana	a3f7d575e0	improve delete-marker healing (#12794 ) delete-markers missing on drives were not healed due to a few things: disksWithAllParts() does not know how to deal with delete markers, add support for that. fixes #12787	2021-07-26 11:48:09 -07:00
Harshavardhana	f175ff8f66	add healing fixes for delete-marker (#12788 ) - delete-markers are incorrectly reported as corrupt with wrong data sent to client 'mc admin heal -r' on objects with delete marker will report as 'grey' incorrectly. - do not heal delete-markers during HeadObject() this can lead to inconsistent order of heals on the object, although this is not an issue in terms of order of versions it is rather simpler to keep the same order on all drives. - defaultHealResult() should handle 'err == nil' case such that valid cases should be handled as 'drive' status OK.	2021-07-26 08:01:41 -07:00
Harshavardhana	542fe4ea2e	fix: legacy objects with 10MiB blockSize should use right buffers (#12459 ) healing code was using incorrect buffers to heal older objects with 10MiB erasure blockSize, incorrect calculation of such buffers can lead to incorrect premature closure of io.Pipe() during healing. fixes #12410	2021-06-07 10:06:06 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	b5ebfd35b4	fix: always prefer DataBlocks present in FileInfo (#12386 )	2021-05-27 10:11:50 -07:00
Klaus Post	3fff50120b	Revert heal locks (#12365 ) A lot of healing is likely to be on non-existing objects and locks are very expensive and will slow down scanning significantly. In cases where all are valid or, all are broken allow rejection without locking. Keep the existing behavior, but move the check for dangling objects to after the lock has been acquired. ``` _, err = getLatestFileInfo(ctx, partsMetadata, errs) if err != nil { return er.purgeObjectDangling(ctx, bucket, object, versionID, partsMetadata, errs, []error{}, opts) } ``` Revert "heal: Hold lock when reading xl.meta from disks (#12362)" This reverts commit `abd32065aa`	2021-05-25 17:02:06 -07:00
Harshavardhana	4840974d7a	fix: inline data upon overwrites should be readable (#12369 ) This PR fixes two bugs - Remove fi.Data upon overwrite of objects from inlined-data to non-inlined-data - Workaround for an existing bug on disk with latest releases to ignore fi.Data and instead read from the disk for non-inlined-data - Additionally add a reserved metadata header to indicate data is inlined for a given version.	2021-05-25 16:33:06 -07:00
Harshavardhana	ed4941a5f3	fix: calculate dataBlocks properly in healing (#12364 )	2021-05-25 09:34:27 -07:00
Anis Elleuch	abd32065aa	heal: Hold lock when reading xl.meta from disks (#12362 ) Lock is held in healObject() after reading xl.meta from disks the first time. This commit will hold the lock from the beginning of HealObject() Co-authored-by: Anis Elleuch <anis@min.io>	2021-05-24 13:39:38 -07:00
Klaus Post	cde6469b88	Fix hanging erasure writes (#12253 ) However, this slice is also used for closing the writers, so close is never called on these. Furthermore when an error is returned from a write it is now reported to the reader. bonus: remove unused heal param from `newBitrotWriter`. * Remove copy, now that we don't mutate.	2021-05-17 08:32:28 -07:00
Harshavardhana	f1e479d274	remove more duplicate bloom filter trackers (#12302 ) At some places bloom filter tracker was getting updated for `.minio.sys/tmp` bucket, there is no reason to update bloom filters for those. And add a missing bloom filter update for MakeBucket() Bonus: purge unused function deleteEmptyDir()	2021-05-17 08:25:48 -07:00
Harshavardhana	d84261aa6d	fix: ensure proper usage of DataDir (#12300 ) - GetObject() should always use a common dataDir to read from when it starts reading, this allows the code in erasure decoding to have sane expectations. - Healing should always heal on the common dataDir, this allows the code in dangling object detection to purge dangling content. These both situations can happen under certain types of retries during PUT when server is restarting etc, some namespace entries might be left over.	2021-05-14 16:50:47 -07:00
Krishnan Parthasarathi	0bab1c1895	Heal restored object contents on disk (#12238 )	2021-05-06 16:06:57 -07:00
Harshavardhana	1aa5858543	move madmin to github.com/minio/madmin-go (#12239 )	2021-05-06 08:52:02 -07:00
Harshavardhana	ff36baeaa7	fix: attempt to drain the ReadFileStream for connection pooling (#12208 ) avoid time_wait build up with getObject requests if there are pending callers and they timeout, can lead to time_wait states Bonus share the same buffer pool with erasure healing logic, additionally also fixes a race where parallel readers were never cleaned up during Encode() phase, because pipe.Reader end was never closed(). Added closer right away upon an error during Encode to make sure to avoid racy Close() while stream was still being Read().	2021-05-04 10:12:08 -07:00
Krishnan Parthasarathi	860bf1bab2	Add IsRemote method on FileInfo, ObjectInfo (#12209 ) Provides a convenient method to know if an object's contents are in its remote tier.	2021-05-04 08:40:42 -07:00
Harshavardhana	f7a87b30bf	Revert "deprecate embedded browser (#12163 )" This reverts commit `736d8cbac4`. Bring contrib files for older contributions	2021-04-30 08:50:39 -07:00
Harshavardhana	64f6020854	fix: cleanup locking, cancel context upon lock timeout (#12183 ) upon errors to acquire lock context would still leak, since the cancel would never be called. since the lock is never acquired - proactively clear it before returning.	2021-04-29 20:55:21 -07:00
Anis Elleuch	9e797532dc	lock: Always cancel the returned Get(R)Lock context (#12162 ) * lock: Always cancel the returned Get(R)Lock context There is a leak with cancel created inside the locking mechanism. The cancel purpose was to cancel operations such as erasure get/put that are holding non-refreshable locks. This PR will ensure the created context.Cancel is passed to the unlock API so it will cleanup and avoid leaks. * locks: Avoid returning nil cancel in local lockers Since there is no Refresh mechanism in the local locking mechanism, we do not generate a new context or cancel. Currently, a nil cancel function is returned but this can cause a crash. Return a dummy function instead.	2021-04-27 16:12:50 -07:00
Harshavardhana	736d8cbac4	deprecate embedded browser (#12163 ) https://github.com/minio/console takes over the functionality for the future object browser development Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-27 10:52:12 -07:00
Poorna Krishnamoorthy	4be0f92067	Fix multipart restore to remove part match (#12161 ) Part ETags are not available after multipart finalizes, removing this check as not useful. Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-04-26 18:24:06 -07:00
Krishnan Parthasarathi	c829e3a13b	Support for remote tier management (#12090 ) With this change, MinIO's ILM supports transitioning objects to a remote tier. This change includes support for Azure Blob Storage, AWS S3 compatible object storage incl. MinIO and Google Cloud Storage as remote tier storage backends. Some new additions include: - Admin APIs remote tier configuration management - Simple journal to track remote objects to be 'collected' This is used by object API handlers which 'mutate' object versions by overwriting/replacing content (Put/CopyObject) or removing the version itself (e.g DeleteObjectVersion). - Rework of previous ILM transition to fit the new model In the new model, a storage class (a.k.a remote tier) is defined by the 'remote' object storage type (one of s3, azure, GCS), bucket name and a prefix. * Fixed bugs, review comments, and more unit-tests - Leverage inline small object feature - Migrate legacy objects to the latest object format before transitioning - Fix restore to particular version if specified - Extend SharedDataDirCount to handle transitioned and restored objects - Restore-object should accept version-id for version-suspended bucket (#12091) - Check if remote tier creds have sufficient permissions - Bonus minor fixes to existing error messages Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	a7acfa6158	fix: pick valid FileInfo additionally based on dataDir (#12116 ) * fix: pick valid FileInfo additionally based on dataDir historically we have always relied on modTime to be consistent and same, we can now add additional reference to look for the same dataDir value. A dataDir is the same for an object at a given point in time for a given version, let's say a `null` version is overwritten in quorum we do not by mistake pick up the fileInfo's incorrectly. * make sure to not preserve fi.Data Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-21 19:06:08 -07:00
Harshavardhana	2ef824bbb2	collapse two distinct calls into single RenameData() call (#12093 ) This is an optimization by reducing one extra system call, and many network operations. This reduction should increase the performance for small file workloads.	2021-04-20 10:44:39 -07:00
Klaus Post	d267d152ba	healing: re-read metadata after lock (#12004 ) Do not use potentially wrong metadata from before acquiring lock. Plus remove unused NoLock option.	2021-04-07 10:39:48 -07:00
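
The read-quorum fix in `79df2c7ce7` (#14000) describes deriving quorum from each object's own FileInfo rather than assuming a default parity of 4. A minimal Go sketch of that idea follows. The `fileInfo` struct, the `quorumFromInfo` helper, and the rule that write quorum gains one when data and parity counts are equal are illustrative assumptions, not MinIO's actual code.

```go
package main

import "fmt"

// fileInfo is a hypothetical stand-in for per-object erasure metadata.
type fileInfo struct {
	DataBlocks   int
	ParityBlocks int
}

// quorumFromInfo derives quorums from the object's own metadata instead of
// a fixed default parity.
// Assumption: read quorum equals the data block count, and write quorum
// adds one when data and parity counts are equal.
func quorumFromInfo(fi fileInfo) (readQuorum, writeQuorum int) {
	readQuorum = fi.DataBlocks
	writeQuorum = fi.DataBlocks
	if fi.DataBlocks == fi.ParityBlocks {
		writeQuorum++
	}
	return readQuorum, writeQuorum
}

func main() {
	// 16 drives with parity 4, then 16 drives with maximum parity (8).
	for _, fi := range []fileInfo{{12, 4}, {8, 8}} {
		r, w := quorumFromInfo(fi)
		fmt.Printf("data=%d parity=%d read=%d write=%d\n",
			fi.DataBlocks, fi.ParityBlocks, r, w)
	}
}
```

Under these assumptions, a 16-drive object written at maximum parity needs 8 readable shards, whereas assuming parity 4 would demand 12. That mismatch is the kind of incorrect quorum verification the commit message describes for objects in `.minio.sys`.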

83 Commits