minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970) Main motivation is to move towards a common backend format for all different types of modes in MinIO, allowing for simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Anis Elleuch	ca69e54cb6	tests: Fix sporadic failure of TestXLStorageDeleteFile (#14911 ) The test expects from DeleteFile to return errDiskNotFound when the disk is not available. It calls os.RemoveAll() to remove one disk after XL storage initialization. However, this latter contains some goroutines which can race with os.RemoveAll() and then the test fails sporadically with returning random errors. The commit will tweak the initialization routine of the XL storage to only run deletion of temporary and metacache data in the background, so TestXLStorageDeleteFile won't fail anymore.	2022-05-12 15:24:58 -07:00
Harshavardhana	43eb5a001c	re-use transport for AdminInfo() call (#14571 ) avoids creating new transport for each `isServerResolvable` request, instead re-use the available global transport and do not try to forcibly close connections to avoid TIME_WAIT build upon large clusters. Never use httpClient.CloseIdleConnections() since that can have a drastic effect on existing connections on the transport pool. Remove it everywhere.	2022-03-17 16:20:10 -07:00
Harshavardhana	03a6e8aee2	fix: creating steep directory structure on trash folder (#14314 ) weird directory structures get created on the '.trash' folder upon server restarts, this PR fixes this.	2022-02-15 16:34:03 -08:00
Daniel	8ae46bce93	fix error logs being omitted because retryCount never exceeds 10 (#14268)	2022-02-09 03:14:22 -08:00
Harshavardhana	6123377e66	speedup getFormatErasureInQuorum use driveCount (#14239 ) startup speed-up, currently getFormatErasureInQuorum() would spend up to 2-3secs when there are 3000+ drives for example in a setup, simplify this implementation to use drive counts.	2022-02-04 12:21:21 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis; once read during init, it is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json'; only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Harshavardhana	7f214a0e46	use dnscache resolver for resolving command line endpoints (#14135) this helps in caching the resolved values early on, and avoids causing further resolution for individual nodes when the object layer comes online. this can speed up our startup time during upgrades etc. by an order of magnitude. additional changes in connectLoadInitFormats() to parallelize all calls that might be potentially blocking.	2022-01-20 13:03:15 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	a7c430355a	fix: throw appropriate errors when all disks fail (#13820 ) when all disks fail with same error, fail server startup anyways - we cannot proceed. fixes #13818	2021-12-03 09:25:17 -08:00
Klaus Post	47de1d2e0e	Fix diskinfo race (#12857) Fixes share info struct. ``` WARNING: DATA RACE Read at 0x00c011780618 by goroutine 419: github.com/minio/minio/cmd.(DiskMetrics).DecodeMsg() c:/gopath/src/github.com/minio/minio/cmd/storage-datatypes_gen.go:331 +0x247 github.com/minio/minio/cmd.(DiskInfo).DecodeMsg() c:/gopath/src/github.com/minio/minio/cmd/storage-datatypes_gen.go:76 +0x5ec github.com/tinylib/msgp/msgp.Decode() c:/gopath/pkg/mod/github.com/tinylib/msgp@v1.1.6-0.20210521143832-0becd170c402/msgp/read.go:105 +0x70 github.com/minio/minio/cmd.(storageRESTClient).DiskInfo.func1.1() c:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:288 +0x235 github.com/minio/minio/cmd.(timedValue).Get() c:/gopath/src/github.com/minio/minio/cmd/utils.go:886 +0x77 github.com/minio/minio/cmd.(storageRESTClient).DiskInfo() c:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:297 +0xf9 github.com/minio/minio/cmd.getDiskInfos() c:/gopath/src/github.com/minio/minio/cmd/object-api-utils.go:962 +0x1a8 github.com/minio/minio/cmd.(erasureServerPools).getServerPoolsAvailableSpace.func1() c:/gopath/src/github.com/minio/minio/cmd/erasure-server-pool.go:241 +0x27c github.com/minio/minio/internal/sync/errgroup.(Group).Go.func1() c:/gopath/src/github.com/minio/minio/internal/sync/errgroup/errgroup.go:123 +0xd7 Previous write at 0x00c011780618 by goroutine 423: github.com/minio/minio/cmd.(DiskMetrics).DecodeMsg() c:/gopath/src/github.com/minio/minio/cmd/storage-datatypes_gen.go:332 +0x6e4 github.com/minio/minio/cmd.(DiskInfo).DecodeMsg() c:/gopath/src/github.com/minio/minio/cmd/storage-datatypes_gen.go:76 +0x5ec github.com/tinylib/msgp/msgp.Decode() c:/gopath/pkg/mod/github.com/tinylib/msgp@v1.1.6-0.20210521143832-0becd170c402/msgp/read.go:105 +0x70 github.com/minio/minio/cmd.(storageRESTClient).DiskInfo.func1.1() c:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:288 +0x235 github.com/minio/minio/cmd.(timedValue).Get() c:/gopath/src/github.com/minio/minio/cmd/utils.go:886 +0x77 github.com/minio/minio/cmd.(storageRESTClient).DiskInfo() c:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:297 +0xf9 github.com/minio/minio/cmd.getDiskInfos() c:/gopath/src/github.com/minio/minio/cmd/object-api-utils.go:962 +0x1a8 github.com/minio/minio/cmd.(erasureServerPools).getServerPoolsAvailableSpace.func1() c:/gopath/src/github.com/minio/minio/cmd/erasure-server-pool.go:241 +0x27c github.com/minio/minio/internal/sync/errgroup.(Group).Go.func1() c:/gopath/src/github.com/minio/minio/internal/sync/errgroup/errgroup.go:123 +0xd7 ```	2021-08-23 01:13:47 -07:00
Harshavardhana	180eabaa8e	fix: rename(tmp, tmp-old) is necessary previous PR incorrectly changed rename() from tmp to -> tmp/.trash/uuid, since it is self referential - to clear this up make sure its renamed to a separate folder and deleted in background - just like before.	2021-06-16 16:19:26 -07:00
Harshavardhana	4669d19f2a	fix: simplify diskMap usage to keep certain checks predictable (#12519 ) Bonus: also make sure that we Sanitize() the drives only during startup of the server, but not during disk reconnects.	2021-06-16 14:26:26 -07:00
Anis Elleuch	810af07529	xl: Avoid multi-disks node to exit when one disk fails (#12423) It makes sense that a node that has multiple disks starts when one disk fails, returning an i/o error for example. This commit will make this fault tolerance available in this specific use case.	2021-06-05 09:10:32 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	4d876d03e8	fix: do not fail upon faulty/non-writable drives gracefully start the server, if there are other drives available - print enough information for administrator to notice the errors in console. Bonus: for really large streams use larger buffer for writes.	2021-05-15 12:57:18 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	2a79ea0332	isServerResolvable: it's sufficient to check server is reachable (#11609) using isServerResolvable for expiration can lead to chicken and egg problems, a lock might expire knowingly when server is booting up causing perpetual locks getting expired.	2021-02-22 16:29:53 -08:00
Harshavardhana	da676ac298	remove network calls for getLocalDisks (#11603 )	2021-02-22 13:19:44 -08:00
Harshavardhana	b3c56b53fb	fix: metacache should only rename entries during cleanup (#11503) To avoid large delays in metacache cleanup, use rename instead of recursive delete calls, renames are cheaper move the content to minioMetaTmpBucket and then cleanup this folder once in 24hrs instead. If the new cache can replace an existing one, we should let it replace since that is currently being saved anyways, this avoids pile up of 1000's of metacache entries for same listing calls that are not necessary to be stored on disk.	2021-02-11 10:22:03 -08:00
Harshavardhana	1e53bf2789	fix: allow expansion with newer constraints for older setups (#11372 ) currently we had a restriction where older setups would need to follow previous style of "stripe" count being same expansion, we can relax that instead newer pools can be expanded for older setups with newer constraints of common parity ratio.	2021-01-29 11:40:55 -08:00
Harshavardhana	9cdd981ce7	fix: expire locks only on participating lockers (#11335 ) additionally also add a new ForceUnlock API, to allow forcibly unlocking locks if possible.	2021-01-25 10:01:27 -08:00
Harshavardhana	1ad2b7b699	fix: add stricter validation for erasure server pools (#11299 ) During expansion we need to validate if - new deployment is expanded with newer constraints - existing deployment is expanded with older constraints - multiple server pools rejected if they have different deploymentID and distribution algo	2021-01-19 10:01:31 -08:00
Harshavardhana	f903cae6ff	Support variable server pools (#11256 ) Current implementation requires server pools to have same erasure stripe sizes, to facilitate same SLA and expectations. This PR allows server pools to be variadic, i.e they do not have to be same erasure stripe sizes - instead they should have SLA for parity ratio. If the parity ratio cannot be guaranteed by the new server pool, the deployment is rejected i.e server pool expansion is not allowed.	2021-01-16 12:08:02 -08:00
Harshavardhana	a4f6705874	expire stale locks when owner is down (#11247 ) fixes #11246	2021-01-07 19:16:18 -08:00
Harshavardhana	b5d291ea88	fix: rename remaining zone -> pool (#11231 )	2021-01-06 09:35:47 -08:00
Harshavardhana	5c451d1690	update x/net/http2 to address few bugs (#11144 ) additionally also configure http2 healthcheck values to quickly detect unstable connections and let them timeout. also use single transport for proxying requests	2020-12-21 21:42:38 -08:00
Harshavardhana	8368ab76aa	fix: remove the requirement for healing buckets in ListBucketsHeal (#11098) With the new refactor of bucket healing, healing a bucket happens automatically including its metadata, so there is no need to redundantly heal buckets in ListBucketsHeal as well; remove it.	2020-12-14 12:07:07 -08:00
Harshavardhana	df93102235	fix: unwrapping issues with os.Is* functions (#10949 ) reduces 3 stat calls, reducing the overall startup time significantly.	2020-11-23 08:36:49 -08:00
Harshavardhana	02cfa774be	allow requests to be proxied when server is booting up (#10790) when the server is booting up there is a possibility that users might see '503' because the object layer is not yet initialized; in that case the request is proxied to the first neighboring peer that is online.	2020-10-30 12:20:28 -07:00
Harshavardhana	6a8c62f9fd	make sure to preserve UUID from reference format (#10748 ) reference format should be source of truth for inconsistent drives which reconnect, add them back to their original position remove automatic fix for existing offline disk uuids	2020-10-24 13:23:08 -07:00
Harshavardhana	66174692a2	add '.healing.bin' for tracking currently healing disk (#10573 ) add a hint on the disk to allow for tracking fresh disk being healed, to allow for restartable heals, and also use this as a way to track and remove disks. There are more pending changes where we should move all the disk formatting logic to backend drives, this PR doesn't deal with this refactor instead makes it easier to track healing in the future.	2020-09-28 19:39:32 -07:00
Harshavardhana	c13afd56e8	Remove MaxConnsPerHost settings to avoid potential hangs (#10438 ) MaxConnsPerHost can potentially hang a call without any way to timeout, we do not need this setting for our proxy and gateway implementations instead IdleConn settings are good enough. Also ensure to use NewRequestWithContext and make sure to take the disks offline only for network errors. Fixes #10304	2020-09-08 14:22:04 -07:00
Harshavardhana	a359e36e35	tolerate listing with only readQuorum disks (#10357 ) We can reduce this further in the future, but this is a good value to keep around. With the advent of continuous healing, we can be assured that namespace will eventually be consistent so we are okay to avoid the necessity to a list across all drives on all sets. Bonus Pop()'s in parallel seem to have the potential to wait too on large drive setups and cause more slowness instead of gaining any performance remove it for now. Also, implement load balanced reply for local disks, ensuring that local disks have an affinity for - cleanupStaleMultipartUploads()	2020-08-26 19:29:35 -07:00
Harshavardhana	74116204ce	handle fresh setup with mixed drives (#10273 ) fresh drive setups when one of the drive is a root drive, we should ignore such a root drive and not proceed to format. This PR handles this properly by marking the disks which are root disk and they are taken offline.	2020-08-18 14:37:26 -07:00
Harshavardhana	1e2ebc9945	feat: time to bring back http2.0 support (#10230 ) Bonus move our CI/CD to go1.14	2020-08-10 09:02:29 -07:00
Harshavardhana	0b8255529a	fix: proxies set keep-alive timeouts to be system dependent (#10199 ) Split the DialContext's one for internode and another for all other external communications especially proxy forwarders, gateway transport etc.	2020-08-04 14:55:53 -07:00
Harshavardhana	b16781846e	allow server to start even with corrupted/faulty disks (#10175 )	2020-08-03 18:17:48 -07:00
Harshavardhana	187c3f62df	fix: heal replaced drives properly (#10069 ) healing was not working properly when drives were replaced, due to the error check in root disk calculation this PR fixes this behavior This PR also adds additional fix for missing metadata entries from .minio.sys as part of disk healing as well. Added code to ignore and print more context sensitive errors for better debugging. This PR is continuation of fix in `7b14e9b660`	2020-07-17 10:08:04 -07:00
Harshavardhana	e7d7d5232c	fix: admin info output and improve overall performance (#10015 ) - admin info node offline check is now quicker - admin info now doesn't duplicate the code across doing the same checks for disks - rely on StorageInfo to return appropriate errors instead of calling locally. - diskID checks now return proper errors when disk not found v/s format.json missing. - add more disk states for more clarity on the underlying disk errors.	2020-07-13 09:51:07 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Harshavardhana	d0ae69087c	fix: add proper errors for disks with preexisting content (#9703 )	2020-05-26 09:32:33 -07:00
Harshavardhana	9c85928740	add formatting message for zones in ordinals (#9596 ) Unlike the message > Formatting 2 zone, 1 set(s), 6 drives per set. It is more readable as ordinal > Formatting 2nd zone, 1 set(s), 6 drives per set.	2020-05-13 20:25:29 -07:00
Harshavardhana	6ac48a65cb	fix: use unused cacheMetrics code in prometheus (#9588) remove all other unused/deadcode	2020-05-13 08:15:26 -07:00
Klaus Post	c4464e36c8	fix: limit HTTP transport tunables to affordable values (#9383) Close connections pro-actively in transient calls	2020-04-17 11:20:56 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Harshavardhana	4714958e99	fix: possible connection leaks in sets init, heal (#9263 )	2020-04-03 18:06:31 -07:00
Harshavardhana	6f992134a2	fix: startup load time by reusing storageDisks (#9210 )	2020-03-27 14:48:30 -07:00
Harshavardhana	ff932ca2a0	fix: log only catastrophic errors in prepare storage (#9189 )	2020-03-23 07:32:18 -07:00
Harshavardhana	6a00eb10bf	fix: allow set drive count of proper divisible values (#9101) Currently the code assumed some orthogonal requirements, which led to situations where, for a setup with for example 168 drives, the final set_drive_count chosen was 14. Indeed 168 drives are divisible by 12, but this wasn't allowed due to an unexpected requirement for 12 to be a perfect modulo of 14, which is not possible. This assumption was incorrect. This PR fixes this old assumption properly, and also adds a few tests, including some negative tests. Error messages are improved as well.	2020-03-08 13:30:25 -07:00

126 Commits