minio

Commit Graph

Author	SHA1	Message	Date
Anis Elleuch	ed0cbfb31e	fix: rootdisk detection by not using cached value when GetDiskInfo() errors out (#15249 ) GetDiskInfo() uses timedValue to cache the disk info for one second. timedValue behavior was recently changed to return an old cached value when calculating a new value returns an error. When a mount point is empty, GetDiskInfo() will return errUnformattedDisk, timedValue will return cached disk info with unexpected IsRootDisk value, e.g. false if the mount point belongs to a root disk. Therefore, the mount point will be considered a valid disk and will be formatted as well. This commit will also add more defensive code when marking root disks: always mark a disk offline for any GetDiskInfo() error except errUnformattedDisk. The server will try anyway to reconnect to those disks every 10 seconds.	2022-07-07 17:05:23 -07:00
Harshavardhana	32b2f6117e	fix: do not pass around sync.Map (#15250 ) it is not safe to pass around sync.Map through pointers, as it may be concurrently updated by different callers. this PR simplifies by avoiding sync.Map altogether, we do not need sync.Map to keep object->erasureMap association. This PR fixes a crash when concurrently using this value when audit logs are configured. ``` fatal error: concurrent map iteration and map write goroutine 247651580 [running]: runtime.throw({0x277a6c1?, 0xc002381400?}) runtime/panic.go:992 +0x71 fp=0xc004d29b20 sp=0xc004d29af0 pc=0x438671 runtime.mapiternext(0xc0d6e87f18?) runtime/map.go:871 +0x4eb fp=0xc004d29b90 sp=0xc004d29b20 pc=0x41002b ```	2022-07-07 17:04:25 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
Harshavardhana	bd099f5e71	fix: change timedValue to return the previously cached value (#15169) The caller can interpret the underlying error and decide accordingly. In places where we do not interpret the errors from timedValue.Get(), we should simply use the previously cached value instead of returning "empty". Bonus: remove some unused code	2022-06-25 08:50:16 -07:00
Poorna	cb097e6b0a	CopyObject: fix read/write err on closed pipe (#15135 ) Fixes: #15128 Regression from PR#14971	2022-06-21 19:20:11 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935) It seems that in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello: https://go.dev/play/p/qoF71_D1oXD. This PR fixes all the usage comprehensively.	2022-05-17 22:42:59 -07:00
Harshavardhana	03f8b25b50	disable connectDisks loop under testing (#14920 ) avoids races during tests, keeps tests predictable	2022-05-16 05:36:00 -07:00
Anis Elleuch	44a3b58e52	Add audit log for decommissioning (#14858 )	2022-05-04 00:45:27 -07:00
Harshavardhana	eda34423d7	update gofumpt -w - new changes	2022-04-13 12:00:11 -07:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Xuehan Xu	becec6cb6b	correct mrf.newSetReconnected invocation's param order (#14426 ) Signed-off-by: xuxuehan <xuxuehan@qianxin.com>	2022-02-28 09:13:19 -08:00
Harshavardhana	0cbdc458c5	fix: do not reload disk format.json on a reconnected disk (#14351) An onlineDisk means it's a valid disk, but it may be a re-connected disk; this PR verifies that based on LastConn() to only trigger MRF. Current code would re-load the disk's 'format.json' again, which is unnecessary. A potential side effect of this is closing perfectly online disks and replacing them by reloading 'format.json'. This PR tries to avoid this situation by making sure MRF is triggered but 'format.json' is not reloaded because of MRF.	2022-02-21 15:51:54 -08:00
Anis Elleuch	1f92fc3fc0	Always check for root disks unless MINIO_CI_CD is set (#14232) The current code considers a pool with all root disks to be part of a testing environment even if there are other pools with mounted disks. This results in illegitimate writes to root disks. Fix this by simplifying the logic: require MINIO_CI_CD in order to skip the root disk check.	2022-02-13 15:42:07 -08:00
Harshavardhana	fad3d66093	parallelize background cleanup on local disks across sets (#14290 )	2022-02-11 14:22:48 -08:00
Harshavardhana	6123377e66	speedup getFormatErasureInQuorum use driveCount (#14239) Startup speed-up: currently getFormatErasureInQuorum() can spend up to 2-3 secs in a setup with, for example, 3000+ drives. Simplify this implementation to use drive counts.	2022-02-04 12:21:21 -08:00
Anis Elleuch	127e8bf3b6	heal: Avoid printing repetitive error to heal a root disk (#14220) The healing code repeatedly tries to heal a root disk when it is empty. The reason is that connectEndpoint() returns errUnformattedDisk even if the disk is a root disk. Returning another error instead avoids queueing the disk to the healing code in each connect disks iteration.	2022-01-31 17:28:20 -08:00
Harshavardhana	b68f0cbde4	ignore remote disks with diskID empty as offline (#14168 ) concurrent loading of erasure sets can now expose a situation in a distributed setup that might return diskID as empty, treat such disks as offline.	2022-01-24 19:40:02 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis; once read during init, it is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with an existing 'format.json'; only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3. - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
yinhen	d300e775a6	Avoid reconnect of disk during startup sequence (#14070 )	2022-01-10 23:33:58 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Klaus Post	0e31cff762	fix: DeleteMultipleObjects to finish even if cancelled + concurrent sets (#14038 ) * Process sets concurrently. * Disconnect context from request. * Insert context cancellation checks. * errFileNotFound and errFileVersionNotFound are ok, unless creating delete markers.	2022-01-06 10:47:49 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	4545ecad58	ignore swapped drives instead of throwing errors (#13655 ) - add checks such that swapped disks are detected and ignored - never used for normal operations. - implement `unrecognizedDisk` to be ignored with all operations returning `errDiskNotFound`. - also add checks such that we do not load unexpected disks while connecting automatically. - additionally humanize the values when printing the errors. Bonus: fixes handling of non-quorum situations in getLatestFileInfo(), that does not work when 2 drives are down, currently this function would return errors incorrectly.	2021-11-15 09:46:55 -08:00
Harshavardhana	8bb52c9c2a	fix: ignore disks that are available but not writable (#13585) This allows replacing drives when some drives, while available, are not writable.	2021-11-04 16:42:49 -07:00
Klaus Post	d9c1d79e30	Protect logger targets (#13529 ) Logger targets were not race protected against concurrent updates from for example `HTTPConsoleLoggerSys`. Restrict direct access to targets and make slices immutable so a returned slice can be processed safely without locks.	2021-10-28 07:35:28 -07:00
Harshavardhana	0c48b1d993	fix: benchmarking test initialization > go test -run=none -bench=Benchmark github.com/minio/minio/cmd Runs now without any crashes. fixes #13380	2021-10-08 11:38:30 -07:00
Klaus Post	421160631a	MakeBucket: Delete leftover buckets on error (#13368) (erasureServerPools).MakeBucketWithLocation now deletes the created buckets if any set returns an error. Add a `NoRecreate` option, which will not recreate the bucket in `DeleteBucket` if the operation fails. Additionally use context.Background() for operations we always want to be performed.	2021-10-06 10:24:40 -07:00
Harshavardhana	fabf60bc4c	fix: allow configuring cleanup of stale multipart uploads (#13354) Allow dynamically changing the cleanup of stale multipart uploads, their expiry, and how frequently it is checked. Improves #13270	2021-10-04 10:52:28 -07:00
Anis Elleuch	1d9e91e00f	Fix wrong reporting of total disks after restart (#13326 ) A restart of the cluster and a failed disk will wrongly count the number of total disks.	2021-09-29 11:36:19 -07:00
Anis Elleuch	68a2d6fc40	xl: Avoid empty endpoints (#13299) An endpoint can be empty when a disk is offline or something is wrong with it. Avoid it by filling erasureSets.endpointStrings with values from arguments.	2021-09-25 10:51:03 -07:00
Harshavardhana	1a884cd8e1	fix: deleting objects was not working after upgrades (#13242) DeleteObject() on existing objects from before the `xl.json` to `xl.meta` change was not working; not sure when this regression was added. This PR fixes this properly. Also this PR ensures that we perform the rename of xl.json to xl.meta only during the "write" phase of the call, i.e. either during Healing or PutObject() overwrites. Also handles a few other scenarios during migration: where `backendEncryptedFile` was missing, deleteConfig() would fail with `configNotFound`, and this case was not ignored, which can lead to failure during upgrades.	2021-09-17 19:34:48 -07:00
Harshavardhana	6d42569ade	remove ListBucketsMetadata instead add them to AccountInfo() (#13241 )	2021-09-17 15:02:21 -07:00
Harshavardhana	45bcf73185	feat: Add ListBucketsWithMetadata extension API (#13219 )	2021-09-16 09:52:41 -07:00
Harshavardhana	787a72a993	make sure to ignore the rootDisk when healing drives (#13209 ) fixes #13208	2021-09-14 15:10:00 -07:00
Harshavardhana	035882d292	fix: remove parentIsObject() check (#12851 ) we will allow situations such as ``` a/b/1.txt a/b ``` and ``` a/b a/b/1.txt ``` we are going to document that this usecase is not supported and we will never support it, if any application does this users have to delete the top level parent to make sure namespace is accessible at lower level. rest of the situations where the prefixes get created across sets are supported as is.	2021-08-03 13:26:57 -07:00
Anis Elleuch	b0b4696a64	heal: Add MRF metrics to background heal API response (#12398 ) This commit gathers MRF metrics from all nodes in a cluster and return it to the caller. This will show information about the number of objects in the MRF queues waiting to be healed.	2021-07-15 22:32:06 -07:00
Harshavardhana	4669d19f2a	fix: simplify diskMap usage to keep certain checks predictable (#12519 ) Bonus: also make sure that we Sanitize() the drives only during startup of the server, but not during disk reconnects.	2021-06-16 14:26:26 -07:00
Anis Elleuch	7722b91e1d	s3: Force a prefix removal using a special header (#12504 ) An S3 client can send `x-minio-force-delete: true` to remove a prefix.	2021-06-15 18:43:14 -07:00
Harshavardhana	a93aa2eac1	fix: upon failure attempt an undo for all calls in DeleteBucket() (#12480) It's possible that a version might exist on the second pool, such that deleteBucket() might have deleted the bucket on pool1 successfully since it doesn't have any objects; undo such operations properly in any error scenario. Also delete bucket metadata from the pool layer rather than the sets layer.	2021-06-09 17:13:00 -07:00
Anis Elleuch	8e9e028c0c	fix: safe update of the audit objectErasureMap (#12477) objectErasureMap in the audit holds information about the objects involved in the current S3 operation such as pool index, set index, and disk endpoints. One user saw a crash due to a concurrent update of objectErasureMap information. Use sync.Map to prevent a crash.	2021-06-09 10:51:19 -07:00
Harshavardhana	542fe4ea2e	fix: legacy objects with 10MiB blockSize should use right buffers (#12459 ) healing code was using incorrect buffers to heal older objects with 10MiB erasure blockSize, incorrect calculation of such buffers can lead to incorrect premature closure of io.Pipe() during healing. fixes #12410	2021-06-07 10:06:06 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	81d5688d56	move the dependency to minio/pkg for common libraries (#12397 )	2021-05-28 15:17:01 -07:00
Anis Elleuch	56d4d7b8b1	MRF: Better detection of non stable disks (#12252) MRF does not detect when a node is disconnected and reconnected quickly. This change ensures that MRF is alerted by comparing the last disk reconnection timestamp with the last MRF check time. Signed-off-by: Anis Elleuch <anis@min.io> Co-authored-by: Klaus Post <klauspost@gmail.com>	2021-05-11 09:19:15 -07:00
Harshavardhana	1aa5858543	move madmin to github.com/minio/madmin-go (#12239 )	2021-05-06 08:52:02 -07:00
Krishnan Parthasarathi	c829e3a13b	Support for remote tier management (#12090 ) With this change, MinIO's ILM supports transitioning objects to a remote tier. This change includes support for Azure Blob Storage, AWS S3 compatible object storage incl. MinIO and Google Cloud Storage as remote tier storage backends. Some new additions include: - Admin APIs remote tier configuration management - Simple journal to track remote objects to be 'collected' This is used by object API handlers which 'mutate' object versions by overwriting/replacing content (Put/CopyObject) or removing the version itself (e.g DeleteObjectVersion). - Rework of previous ILM transition to fit the new model In the new model, a storage class (a.k.a remote tier) is defined by the 'remote' object storage type (one of s3, azure, GCS), bucket name and a prefix. * Fixed bugs, review comments, and more unit-tests - Leverage inline small object feature - Migrate legacy objects to the latest object format before transitioning - Fix restore to particular version if specified - Extend SharedDataDirCount to handle transitioned and restored objects - Restore-object should accept version-id for version-suspended bucket (#12091) - Check if remote tier creds have sufficient permissions - Bonus minor fixes to existing error messages Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	e85b28398b	fix: pre-allocate certain slices with expected capacity (#12044 ) Avoids append() based tiny allocations on known allocated slices repeated access.	2021-04-12 13:45:06 -07:00
Klaus Post	111c02770e	Fix data race when connecting disks (#11983 ) Multiple disks from the same set would be writing concurrently. ``` WARNING: DATA RACE Write at 0x00c002100ce0 by goroutine 166: github.com/minio/minio/cmd.(erasureSets).connectDisks.func1() d:/minio/minio/cmd/erasure-sets.go:254 +0x82f Previous write at 0x00c002100ce0 by goroutine 129: github.com/minio/minio/cmd.(erasureSets).connectDisks.func1() d:/minio/minio/cmd/erasure-sets.go:254 +0x82f Goroutine 166 (running) created at: github.com/minio/minio/cmd.(erasureSets).connectDisks() d:/minio/minio/cmd/erasure-sets.go:210 +0x324 github.com/minio/minio/cmd.(erasureSets).monitorAndConnectEndpoints() d:/minio/minio/cmd/erasure-sets.go:288 +0x244 Goroutine 129 (finished) created at: github.com/minio/minio/cmd.(erasureSets).connectDisks() d:/minio/minio/cmd/erasure-sets.go:210 +0x324 github.com/minio/minio/cmd.(erasureSets).monitorAndConnectEndpoints() d:/minio/minio/cmd/erasure-sets.go:288 +0x244 ```	2021-04-06 11:33:10 -07:00

139 Commits