minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Harshavardhana	03f8b25b50	disable connectDisks loop under testing (#14920 ) avoids races during tests, keeps tests predictable	2022-05-16 05:36:00 -07:00
Anis Elleuch	44a3b58e52	Add audit log for decommissioning (#14858 )	2022-05-04 00:45:27 -07:00
Harshavardhana	eda34423d7	update gofumpt -w - new changes	2022-04-13 12:00:11 -07:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Xuehan Xu	becec6cb6b	correct mrf.newSetReconnected invocation's param order (#14426 ) Signed-off-by: xuxuehan <xuxuehan@qianxin.com>	2022-02-28 09:13:19 -08:00
Harshavardhana	0cbdc458c5	fix: do not reload disk format.json on a reconnected disk (#14351 ) An onlineDisk means it's a valid disk, but it may be a re-connected disk; this PR verifies that based on LastConn() to only trigger MRF. Current code would again re-load the disk 'format.json', which is an unnecessary call. A potential side effect of this is closing perfectly online disks and having them replaced by reloading 'format.json'. This PR tries to avoid this situation by making sure MRF is triggered but not reloading 'format.json' because of MRF.	2022-02-21 15:51:54 -08:00
Anis Elleuch	1f92fc3fc0	Always check for root disks unless MINIO_CI_CD is set (#14232 ) The current code considers a pool with all root disks to be part of a testing environment even if there are other pools with mounted disks. This will result in illegitimate writes to root disks. Fix this by simplifying the logic: require MINIO_CI_CD in order to skip the root disk check.	2022-02-13 15:42:07 -08:00
Harshavardhana	fad3d66093	parallelize background cleanup on local disks across sets (#14290 )	2022-02-11 14:22:48 -08:00
Harshavardhana	6123377e66	speedup getFormatErasureInQuorum use driveCount (#14239 ) startup speed-up, currently getFormatErasureInQuorum() would spend up to 2-3secs when there are 3000+ drives for example in a setup, simplify this implementation to use drive counts.	2022-02-04 12:21:21 -08:00
Anis Elleuch	127e8bf3b6	heal: Avoid printing repetitive error to heal a root disk (#14220 ) The healing code repeatedly tries to heal a root disk when it is empty. The reason is that connectEndpoint() returns errUnformattedDisk even if the disk is a root disk. Changing that to return another error will avoid queueing the disk to the healing code in each connect disks iteration.	2022-01-31 17:28:20 -08:00
Harshavardhana	b68f0cbde4	ignore remote disks with diskID empty as offline (#14168 ) concurrent loading of erasure sets can now expose a situation in a distributed setup that might return diskID as empty, treat such disks as offline.	2022-01-24 19:40:02 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis once read during init is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json' only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
yinhen	d300e775a6	Avoid reconnect of disk during startup sequence (#14070 )	2022-01-10 23:33:58 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Klaus Post	0e31cff762	fix: DeleteMultipleObjects to finish even if cancelled + concurrent sets (#14038 ) * Process sets concurrently. * Disconnect context from request. * Insert context cancellation checks. * errFileNotFound and errFileVersionNotFound are ok, unless creating delete markers.	2022-01-06 10:47:49 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	4545ecad58	ignore swapped drives instead of throwing errors (#13655 ) - add checks such that swapped disks are detected and ignored - never used for normal operations. - implement `unrecognizedDisk` to be ignored with all operations returning `errDiskNotFound`. - also add checks such that we do not load unexpected disks while connecting automatically. - additionally humanize the values when printing the errors. Bonus: fixes handling of non-quorum situations in getLatestFileInfo(), that does not work when 2 drives are down, currently this function would return errors incorrectly.	2021-11-15 09:46:55 -08:00
Harshavardhana	8bb52c9c2a	fix: ignore disks that are available but not writable (#13585 ) This is to allow replacing drives while some drives while available are not writable.	2021-11-04 16:42:49 -07:00
Klaus Post	d9c1d79e30	Protect logger targets (#13529 ) Logger targets were not race protected against concurrent updates from for example `HTTPConsoleLoggerSys`. Restrict direct access to targets and make slices immutable so a returned slice can be processed safely without locks.	2021-10-28 07:35:28 -07:00
Harshavardhana	0c48b1d993	fix: benchmarking test initialization > go test -run=none -bench=Benchmark github.com/minio/minio/cmd Runs now without any crashes. fixes #13380	2021-10-08 11:38:30 -07:00
Klaus Post	421160631a	MakeBucket: Delete leftover buckets on error (#13368 ) In (erasureServerPools).MakeBucketWithLocation deletes the created buckets if any set returns an error. Add `NoRecreate` option, which will not recreate the bucket in `DeleteBucket`, if the operation fails. Additionally use context.Background() for operations we always want to be performed.	2021-10-06 10:24:40 -07:00
Harshavardhana	fabf60bc4c	fix: allow configuring cleanup of stale multipart uploads (#13354 ) allow dynamically changing cleanup of stale multipart uploads, their expiry and how frequently it is checked. Improves #13270	2021-10-04 10:52:28 -07:00
Anis Elleuch	1d9e91e00f	Fix wrong reporting of total disks after restart (#13326 ) A restart of the cluster and a failed disk will wrongly count the number of total disks.	2021-09-29 11:36:19 -07:00
Anis Elleuch	68a2d6fc40	xl: Avoid empty endpoints (#13299 ) An endpoint can be empty when a disk is offline or something wrong with it. Avoid it by filling erasureSets.endpointStrings with values from arguments.	2021-09-25 10:51:03 -07:00
Harshavardhana	1a884cd8e1	fix: deleting objects was not working after upgrades (#13242 ) DeleteObject() on objects that existed before the `xl.json` to `xl.meta` change was not working, not sure when this regression was added. This PR fixes this properly. Also this PR ensures that we perform rename of xl.json to xl.meta only during the "write" phase of the call, i.e. either during Healing or PutObject() overwrites. Also handles a few other scenarios during migration: where `backendEncryptedFile` was missing, deleteConfig() will fail with `configNotFound`; this case was not ignored, which can lead to failure during upgrades.	2021-09-17 19:34:48 -07:00
Harshavardhana	6d42569ade	remove ListBucketsMetadata instead add them to AccountInfo() (#13241 )	2021-09-17 15:02:21 -07:00
Harshavardhana	45bcf73185	feat: Add ListBucketsWithMetadata extension API (#13219 )	2021-09-16 09:52:41 -07:00
Harshavardhana	787a72a993	make sure to ignore the rootDisk when healing drives (#13209 ) fixes #13208	2021-09-14 15:10:00 -07:00
Harshavardhana	035882d292	fix: remove parentIsObject() check (#12851 ) we will allow situations such as ``` a/b/1.txt a/b ``` and ``` a/b a/b/1.txt ``` we are going to document that this usecase is not supported and we will never support it, if any application does this users have to delete the top level parent to make sure namespace is accessible at lower level. rest of the situations where the prefixes get created across sets are supported as is.	2021-08-03 13:26:57 -07:00
Anis Elleuch	b0b4696a64	heal: Add MRF metrics to background heal API response (#12398 ) This commit gathers MRF metrics from all nodes in a cluster and return it to the caller. This will show information about the number of objects in the MRF queues waiting to be healed.	2021-07-15 22:32:06 -07:00
Harshavardhana	4669d19f2a	fix: simplify diskMap usage to keep certain checks predictable (#12519 ) Bonus: also make sure that we Sanitize() the drives only during startup of the server, but not during disk reconnects.	2021-06-16 14:26:26 -07:00
Anis Elleuch	7722b91e1d	s3: Force a prefix removal using a special header (#12504 ) An S3 client can send `x-minio-force-delete: true` to remove a prefix.	2021-06-15 18:43:14 -07:00
Harshavardhana	a93aa2eac1	fix: upon failure attempt an undo for all calls in DeleteBucket() (#12480 ) it's possible that a version exists on the second pool while deleteBucket() has already deleted the bucket on pool1 successfully since it doesn't have any objects; undo such operations properly in any error scenario. Also delete bucket metadata from pool layer rather than sets layer.	2021-06-09 17:13:00 -07:00
Anis Elleuch	8e9e028c0c	fix: safe update of the audit objectErasureMap (#12477 ) objectErasureMap in the audit holds information about the objects involved in the current S3 operation such as pool index, set index, and disk endpoints. One user saw a crash due to a concurrent update of objectErasureMap information. Use sync.Map to prevent a crash.	2021-06-09 10:51:19 -07:00
Harshavardhana	542fe4ea2e	fix: legacy objects with 10MiB blockSize should use right buffers (#12459 ) healing code was using incorrect buffers to heal older objects with 10MiB erasure blockSize, incorrect calculation of such buffers can lead to incorrect premature closure of io.Pipe() during healing. fixes #12410	2021-06-07 10:06:06 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	81d5688d56	move the dependency to minio/pkg for common libraries (#12397 )	2021-05-28 15:17:01 -07:00
Anis Elleuch	56d4d7b8b1	MRF: Better detection of non stable disks (#12252 ) MRF does not detect when a node is disconnected and reconnected quickly. This change will ensure that MRF is alerted by comparing the last disk reconnection timestamp with the last MRF check time. Signed-off-by: Anis Elleuch <anis@min.io> Co-authored-by: Klaus Post <klauspost@gmail.com>	2021-05-11 09:19:15 -07:00
Harshavardhana	1aa5858543	move madmin to github.com/minio/madmin-go (#12239 )	2021-05-06 08:52:02 -07:00
Krishnan Parthasarathi	c829e3a13b	Support for remote tier management (#12090 ) With this change, MinIO's ILM supports transitioning objects to a remote tier. This change includes support for Azure Blob Storage, AWS S3 compatible object storage incl. MinIO and Google Cloud Storage as remote tier storage backends. Some new additions include: - Admin APIs remote tier configuration management - Simple journal to track remote objects to be 'collected' This is used by object API handlers which 'mutate' object versions by overwriting/replacing content (Put/CopyObject) or removing the version itself (e.g DeleteObjectVersion). - Rework of previous ILM transition to fit the new model In the new model, a storage class (a.k.a remote tier) is defined by the 'remote' object storage type (one of s3, azure, GCS), bucket name and a prefix. * Fixed bugs, review comments, and more unit-tests - Leverage inline small object feature - Migrate legacy objects to the latest object format before transitioning - Fix restore to particular version if specified - Extend SharedDataDirCount to handle transitioned and restored objects - Restore-object should accept version-id for version-suspended bucket (#12091) - Check if remote tier creds have sufficient permissions - Bonus minor fixes to existing error messages Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	e85b28398b	fix: pre-allocate certain slices with expected capacity (#12044 ) Avoids append() based tiny allocations on known allocated slices repeated access.	2021-04-12 13:45:06 -07:00
Klaus Post	111c02770e	Fix data race when connecting disks (#11983 ) Multiple disks from the same set would be writing concurrently. ``` WARNING: DATA RACE Write at 0x00c002100ce0 by goroutine 166: github.com/minio/minio/cmd.(erasureSets).connectDisks.func1() d:/minio/minio/cmd/erasure-sets.go:254 +0x82f Previous write at 0x00c002100ce0 by goroutine 129: github.com/minio/minio/cmd.(erasureSets).connectDisks.func1() d:/minio/minio/cmd/erasure-sets.go:254 +0x82f Goroutine 166 (running) created at: github.com/minio/minio/cmd.(erasureSets).connectDisks() d:/minio/minio/cmd/erasure-sets.go:210 +0x324 github.com/minio/minio/cmd.(erasureSets).monitorAndConnectEndpoints() d:/minio/minio/cmd/erasure-sets.go:288 +0x244 Goroutine 129 (finished) created at: github.com/minio/minio/cmd.(erasureSets).connectDisks() d:/minio/minio/cmd/erasure-sets.go:210 +0x324 github.com/minio/minio/cmd.(erasureSets).monitorAndConnectEndpoints() d:/minio/minio/cmd/erasure-sets.go:288 +0x244 ```	2021-04-06 11:33:10 -07:00
Harshavardhana	d46386246f	api: Introduce metadata update APIs to update only metadata (#11962 ) Current implementation heavily relies on readAllFileInfo but with the advent of xl.meta inlined with data, we cannot easily avoid reading data when we are only interested in updating metadata. This invariably leads to write amplification during metadata updates, repeatedly reading data when we are only interested in updating metadata. This PR ensures that we implement a metadata only update API at storage layer, that handles updates to metadata alone for any given version - given the version is valid and present. This helps reduce the chattiness for the following calls: - PutObjectTags - DeleteObjectTags - PutObjectLegalHold - PutObjectRetention - ReplicateObject (updates metadata on replication status)	2021-04-04 13:32:31 -07:00
Anis Elleuch	2c296652f7	Simplify access to local node name (#11907 ) The local node name is heavily used in tracing, create a new global variable to store it. Multiple goroutines can access it since it won't be changed later.	2021-03-26 11:37:58 -07:00
Harshavardhana	90d8ec6310	fix: reject duplicate keys in PostPolicyJSON document (#11902 ) fixes #11894	2021-03-25 13:57:57 -07:00
Anis Elleuch	14d89eaae4	mrf: Enhance behavior for better results (#11788 ) MRF was starting to heal when it receives a disk connection event, which is not good when a node having multiple disks reconnects to the cluster. Besides, MRF needs Remove healing option to remove stale files.	2021-03-18 11:19:02 -07:00
Harshavardhana	add3cd4e44	allow configuring delete cleanup interval from default 10minutes (#11818 )	2021-03-17 15:15:58 -07:00

134 Commits