minio

mirror of https://github.com/minio/minio.git synced 2025-11-24 03:27:44 -05:00

Author	SHA1	Message	Date
Klaus Post	b890bbfa63	Add local disk health checks (#14447) The main goal of this PR is to solve the situation where disks stop responding to operations. This generally causes an FD build-up and eventually will crash the server. This adds detection of hung disks, where calls on disk get stuck. We add functionality to `xlStorageDiskIDCheck` where it keeps track of the number of concurrent requests on a given disk. A total number of 100 operations are allowed. If this limit is reached we will block (but not reject) new requests, but we will monitor the state of the disk. If no requests have been completed or updated within a 15-second window, we mark the disk as offline. Requests that are blocked will be unblocked and return an error as "faulty disk". New requests will be rejected until the disk is marked OK again. Once a disk has been marked faulty, a check will run every 5 seconds that will attempt to write and read back a file. As long as this fails the disk will remain faulty. To prevent lots of long-running requests from marking the disk faulty we implement a callback feature that allows updating the status as parts of these operations are running. We add a reader and writer wrapper that will update the status of each successful read/write operation. This should allow fine enough granularity that a slow, but still operational disk will not reach 15 seconds where 50 operations have not progressed. Note that errors themselves are not enough to mark a disk faulty. A nil (or io.EOF) error will mark a disk as "good". * Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`. * de-couple IsOnline() from disk health tracker The purpose of IsOnline() is to ensure that we reconnect the drive only when the "drive" was - disconnected from network: we need to validate if the drive is "correct" and is the same drive which belongs to this server. - replaced: we have to format it, since we support hot swapping of the drives. IsOnline() is not meant for taking the drive offline when it is hung; that is not useful, since we can let the drive stay online and instead return errors for relevant calls. * return errFaultyDisk for DiskInfo() call Co-authored-by: Harshavardhana <harsha@minio.io> Possible future Improvements: * Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly. * Allow reads/writes to be aborted by the context. * Add usage stats, concurrent count, blocked operations, etc.	2022-03-09 11:38:54 -08:00
Klaus Post	7060c809c0	Add authorization header to HEAD requests (#14510 ) Add Authorization to network check requests. Fixes #14507	2022-03-09 10:48:56 -08:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Andreas Auernhammer	b48f719b8e	kes: remove unnecessary error conversion (#14459 ) This commit removes some duplicate code that converts KES API errors. This code was added since KES `0.18.0` changed some exported API errors. However, the KES SDK handles this error conversion itself. Therefore, it is not necessary to duplicate this behavior in MinIO. See: `21555fa624/error.go (L94)` Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-03 09:42:37 -08:00
Lenin Alevski	289fcbd08c	KES dependency upgrade (#14454 ) - Updating KES dependency to v.0.18.0 - Fixing incompatibility issue when checking for errors during KES key creation Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>	2022-03-02 23:03:40 -08:00
Harshavardhana	f6875bb893	fix: regression from refactor in AMQP notification (#14455) fixes a regression introduced in #14269, which refactored the notification registration logic: AMQP targets, even when online, were no longer available for use. fixes #14451	2022-03-02 21:35:48 -08:00
Klaus Post	b030ef1aca	tests: Clean up dsync package (#14415 ) Add non-constant timeouts to dsync package. Reduce test runtime by minutes. Hopefully not too aggressive.	2022-03-01 11:14:28 -08:00
Klaus Post	88fd1cba71	select: add MISSING operator support (#14406 ) Probably not full support, but for regular checks it should work. Fixes #14358	2022-02-25 12:31:19 -08:00
hellivan	5307e18085	use keycloak_realm properly for keycloak user lookups (#14401) In case a user defined a value for the MINIO_IDENTITY_OPENID_KEYCLOAK_REALM environment variable, construct the path properly.	2022-02-24 10:16:53 -08:00
Klaus Post	2cea944cdb	select: Allow lower case 'is' (#14405 ) Ref: #14358	2022-02-24 09:10:48 -08:00
Shireesh Anjal	3934700a08	Make audit webhook and kafka config dynamic (#14390 )	2022-02-24 09:05:33 -08:00
hellivan	0913eb6655	fix: openid config provider not initialized correctly (#14399) Up until now the `InitializeProvider` method of the `Config` struct was implemented on a value receiver, which is why changes to the `provider` field were never reflected to method callers. In order to fix this issue, the method was implemented on a pointer receiver.	2022-02-23 23:42:37 -08:00
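Note on 0913eb6655: the value-receiver pitfall behind this fix is general Go behavior and can be reproduced in a few lines. The `Config` type and method names below are illustrative stand-ins, not the MinIO definitions.

```go
package main

import "fmt"

// Config is a stand-in type with a single field to initialize.
type Config struct {
	provider string
}

// initByValue has a value receiver, so it only modifies a copy of Config.
func (c Config) initByValue() {
	c.provider = "keycloak"
}

// initByPointer has a pointer receiver, so it modifies the caller's Config.
func (c *Config) initByPointer() {
	c.provider = "keycloak"
}

func main() {
	var a, b Config
	a.initByValue()
	b.initByPointer()
	fmt.Printf("value receiver: %q, pointer receiver: %q\n", a.provider, b.provider)
}
```

With the value receiver the caller's `provider` stays empty; with the pointer receiver it is set.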
Harshavardhana	1bfbe354f5	fix: clientId must be unique for all servers (#14398) This is a regression from #14037; distributed setups with MQTT were not working anymore. According to the MQTT spec, the client ID is expected to be unique per server. We shall proceed to use the unix nano timestamp hex value instead here.	2022-02-23 20:19:59 -08:00
Shireesh Anjal	25144fedd5	Send deployment id and minio version in http header (#14378 )	2022-02-23 13:36:01 -08:00
Shireesh Anjal	94d37d05e5	Apply dynamic config at sub-system level (#14369 ) Currently, when applying any dynamic config, the system reloads and re-applies the config of all the dynamic sub-systems. This PR refactors the code in such a way that changing config of a given dynamic sub-system will work on only that sub-system.	2022-02-22 10:59:28 -08:00
Shireesh Anjal	c1437c7b46	allow `config reset api` to work by overloading default values (#14368 ) The `LookupConfig` code was not using `GetWithDefault`, because of which some of the config values were being returned as empty string, and calls like `strconv.Atoi` and `time.ParseDuration` on these were failing.	2022-02-21 15:50:45 -08:00
Aditya Manthramurthy	bc110d8055	fix: mysql notification target table creation (#14350 ) Add a generated hash column as the primary key for the key name as MySQL does not allow indexes on long VARCHAR columns.	2022-02-18 12:13:49 -08:00
Harshavardhana	65b1a4282e	fix: console logger regression with dynamic logger webhook registration (#14346 ) fixes a regression from #14289	2022-02-17 17:50:10 -08:00
Shireesh Anjal	28f188e3ef	Make logger webhook config dynamic (#14289 ) It should not be required to restart the server after setting the logger webhook config.	2022-02-17 11:11:15 -08:00
Shireesh Anjal	1a5496eced	Add `enable` key to logger webhook help (#14326 ) This key is supported by the logger webhook config - but is not returned in the help.	2022-02-16 11:59:50 -08:00
Shireesh Anjal	16939ca192	Mark SUBNET credentials as sensitive (#14320 ) So that they are redacted in the health report	2022-02-16 08:40:34 -08:00
Klaus Post	5ec57a9533	Add GetObject gzip option (#14226 ) Enabled with `mc admin config set alias/ api gzip_objects=on` Standard filtering applies (1K response minimum, not compressed content type, not range request, gzip accepted by client).	2022-02-14 09:19:01 -08:00
Shireesh Anjal	9890f579f8	Add subsystem level validation on `config set` (#14269 ) When setting a config of a particular sub-system, validate the existing config and notification targets of only that sub-system, so that existing errors related to one sub-system (e.g. notification target offline) do not result in errors for other sub-systems.	2022-02-08 10:36:41 -08:00
Shireesh Anjal	3882da6ac5	Add subnet proxy config (#14225 ) Will store the HTTP(S) proxy URL to use for connecting to SUBNET.	2022-02-01 09:52:38 -08:00
Harshavardhana	c39eb3bacd	fix: possible crash if private.key is empty (#14208 ) Before ``` panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x9f54f7] goroutine 1 [running]: crypto/x509.IsEncryptedPEMBlock(...) crypto/x509/pem_decrypt.go:105 github.com/minio/minio/internal/config.LoadX509KeyPair({0xc00061e270, 0x0}, {0xc00061e2d0, 0x25}) github.com/minio/minio/internal/config/certs.go:88 +0xf7 github.com/minio/pkg/certs.(*Manager).AddCertificate(0xc000576150, {0xc00061e270, 0x25}, {0xc00061e2d0, 0x25}) github.com/minio/pkg@v1.1.15/certs/certs.go:132 +0x368 github.com/minio/pkg/certs.NewManager({0x51f5910, 0xc00053e140}, {0xc00061e270, 0xc000580400}, {0xc00061e2d0, 0x25}, 0x4dc5880) github.com/minio/pkg@v1.1.15/certs/certs.go:97 +0x170 github.com/minio/minio/cmd.getTLSConfig() ``` After ``` ERROR Unable to load the TLS configuration: The private key is not readable > Please check your certificate ```	2022-01-30 12:55:21 -08:00
Poorna	a4be47d7ad	Validate config before saving changes after config reset (#14203 )	2022-01-27 18:28:16 -08:00
Aditya Manthramurthy	7dfa565d00	Identity LDAP: Allow multiple search base DNs (#14191 ) This change allows the MinIO server to lookup users in different directory sub-trees by allowing specification of multiple search bases separated by semicolons.	2022-01-26 15:05:59 -08:00
Poorna	295730408b	Disallow delete replication for tag based rules (#14167 )	2022-01-24 15:22:20 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis; once read during init, it is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with an existing 'format.json'; only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Anis Elleuch	3e9bd931ed	tests: Remove RPC wording from the code (#14142 ) The lock was using net/rpc in the past but it got replaced with a REST API. This commit will fix function names/comments to avoid confusion.	2022-01-20 09:36:09 -08:00
Harshavardhana	1a56ebea70	cleanup dsync tests and remove net/rpc references (#14118 )	2022-01-18 12:44:38 -08:00
Harshavardhana	70e1cbda21	allow disabling O_DIRECT in certain environments for reads (#14115) Repeated reads on single large objects, as in HPC-like workloads, need this option to disable O_DIRECT for more effective usage of the kernel page-cache. However, this option should be used in very specific situations only, and shouldn't be enabled on all servers. NVMe servers always benefit from keeping O_DIRECT on.	2022-01-17 08:34:14 -08:00
Anis Elleuch	b106b1c131	lock: Fix decision when a lock needs to be removed (#14095) The code was not properly deciding if a lock needs to be removed when it doesn't have quorum anymore. After this commit, a lock will be forcefully unlocked if the nodes reporting that they cannot find the lock internally break the quorum. Simplify the code as well.	2022-01-14 10:33:08 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Aditya Manthramurthy	1981fe2072	Add internal IDP and OIDC users support for site-replication (#14041 ) - This allows site-replication to be configured when using OpenID or the internal IDentity Provider. - Internal IDP IAM users and groups will now be replicated to all members of the set of replicated sites. - When using OpenID as the external identity provider, STS and service accounts are replicated. - Currently this change dis-allows root service accounts from being replicated (TODO: discuss security implications).	2022-01-06 15:52:43 -08:00
Minio Trusted	76877eb6fa	move gofumpt to golang-ci	2022-01-06 13:08:21 -08:00
Harshavardhana	0d3ae3810f	make sure to comply with MQTT spec (#14037 ) - keep-alive cannot be 0 by default anymore - client_id cannot be empty fixes #13993	2022-01-06 11:25:39 -08:00
Anis Elleuch	9d91d32d82	typo: Low capital in some JSON field names in log/audit output (#14020 ) Use a low capital in some fields in JSON log/audit output to follow other fields names.	2022-01-03 09:26:26 -08:00
Harshavardhana	a60ac7ca17	fix: audit log to support object names in multipleObjectNames() handler (#14017 )	2022-01-03 01:28:52 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	46fd9f4a53	fix: update storage-class properly fixes #14005	2021-12-28 22:49:06 -08:00
Harshavardhana	9ad6012782	simplify logger time and avoid possible crashes (#13986) Calling time.Format() ahead of JSON marshalling is not necessary, since JSON marshalling of a time value already defaults to RFC3339Nano. This also ensures the 'time' is remembered until it is logged and is the same time at which the 'caller' invoked the 'log' functions.	2021-12-23 15:33:54 -08:00
Harshavardhana	416977436e	rename MINIO_CACHE_.._MASTER_KEY to MINIO_CACHE_.._SECRET_KEY fixes #13975	2021-12-22 12:11:07 -08:00
Klaus Post	ebd78e983f	Limit key size to 3K (#13974 ) User is reporting `Error 1071 :Specified key was too long,max key length is 3072 bytes`. Regression caused by #13414	2021-12-22 11:41:51 -08:00
Harshavardhana	499872f31d	Add configurable channel queue_size for audit/logger webhook targets (#13819 ) Also log all the missed events and logs instead of silently swallowing the events. Bonus: Extend the logger webhook to support mTLS similar to audit webhook target.	2021-12-20 13:16:53 -08:00
Poorna K	111c6177d2	Deprecate caching for erasure/distributed mode (#13909 ) Fixes: #13907 Also removing default value of `writethrough` for cache commit which was interfering with cache_after setting	2021-12-15 16:48:34 -08:00
Klaus Post	91f72f25ab	select: Return early from bool AND, OR (#13914 ) Return as soon as an AND fails and whenever an OR succeeds. Faster and more flexible. For example makes `select * from S3object where _2 != '' AND _2 > 1` able to operate on empty fields. Followup to #13900	2021-12-15 16:47:21 -08:00
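Note on 91f72f25ab: short-circuit evaluation of AND and OR can be sketched as below. This is a generic illustration, not the S3 Select evaluator. The `expr` type and helper names are hypothetical, and the sketch assumes that the numeric comparison fails with an error on an empty field.

```go
package main

import (
	"fmt"
	"strconv"
)

// expr is a hypothetical predicate over a single field value.
type expr func(field string) (bool, error)

// and returns as soon as the left side is false (or fails), so the right
// side is never evaluated on input it cannot handle.
func and(l, r expr) expr {
	return func(f string) (bool, error) {
		ok, err := l(f)
		if err != nil || !ok {
			return false, err
		}
		return r(f)
	}
}

// or returns as soon as the left side is true (or fails).
func or(l, r expr) expr {
	return func(f string) (bool, error) {
		ok, err := l(f)
		if err != nil || ok {
			return ok, err
		}
		return r(f)
	}
}

func notEmpty(f string) (bool, error) { return f != "", nil }

// greaterThanOne fails on input that is not a number, including "".
func greaterThanOne(f string) (bool, error) {
	n, err := strconv.ParseFloat(f, 64)
	if err != nil {
		return false, err
	}
	return n > 1, nil
}

func main() {
	// Mirrors the shape of: _2 != '' AND _2 > 1
	where := and(notEmpty, greaterThanOne)
	for _, field := range []string{"", "0.5", "7"} {
		ok, err := where(field)
		fmt.Printf("%q -> %v, %v\n", field, ok, err)
	}
}
```

Without the early return, the empty field would reach `greaterThanOne` and the whole expression would fail with a parse error instead of evaluating to false.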
Klaus Post	a8d4042853	select: Add IS (NOT) operators (#13906 ) Add `IS` and `IS NOT` as comparison operators. This may be a bit wider than the S3 spec, but we can rather easily remove the forwarding.	2021-12-14 09:54:50 -08:00
Krishnan Parthasarathi	44a9339c0a	Newer noncurrent versions (#13815 ) - Rename MaxNoncurrentVersions tag to NewerNoncurrentVersions Note: We apply overlapping NewerNoncurrentVersions rules such that we honor the highest among applicable limits. e.g if 2 overlapping rules are configured with 2 and 3 noncurrent versions to be retained, we will retain 3. - Expire newer noncurrent versions after noncurrent days - MinIO extension: allow noncurrent days to be zero, allowing expiry of noncurrent version as soon as more than configured NewerNoncurrentVersions are present. - Allow NewerNoncurrentVersions rules on object-locked buckets - No x-amz-expiration when NewerNoncurrentVersions configured - ComputeAction should skip rules with NewerNoncurrentVersions > 0 - Add unit tests for lifecycle.ComputeAction - Support lifecycle rules with MaxNoncurrentVersions - Extend ExpectedExpiryTime to work with zero days - Fix all-time comparisons to be relative to UTC	2021-12-14 09:41:44 -08:00
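Note on 44a9339c0a: the rule "honor the highest among applicable limits" can be illustrated as below. The `rule` struct, the prefix matching, and `versionsToRetain` are simplifying assumptions for the example and do not reflect the actual lifecycle package.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical, reduced lifecycle rule: a prefix filter plus the
// number of newer noncurrent versions to retain.
type rule struct {
	Prefix                  string
	NewerNoncurrentVersions int
}

// versionsToRetain returns the highest limit among all rules that apply to
// the object, or 0 when no rule applies.
func versionsToRetain(object string, rules []rule) int {
	limit := 0
	for _, r := range rules {
		if strings.HasPrefix(object, r.Prefix) && r.NewerNoncurrentVersions > limit {
			limit = r.NewerNoncurrentVersions
		}
	}
	return limit
}

func main() {
	rules := []rule{
		{Prefix: "logs/", NewerNoncurrentVersions: 2},
		{Prefix: "logs/app/", NewerNoncurrentVersions: 3},
	}
	fmt.Println(versionsToRetain("logs/app/a.log", rules)) // both rules apply: 3
	fmt.Println(versionsToRetain("logs/sys.log", rules))   // only the first applies: 2
}
```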
Harshavardhana	113c7ff49a	add code to parse secrets natively instead of shell scripts (#13883 )	2021-12-13 18:23:31 -08:00

184 Commits