minio

mirror of https://github.com/minio/minio.git synced 2025-11-22 18:47:43 -05:00

Author	SHA1	Message	Date
Klaus Post	472c2d828c	Fix waitgroup add after wait on config reload (#14584 ) Fix `panic: "POST /minio/peer/v21/signalservice?signal=2": sync: WaitGroup is reused before previous Wait has returned` Log entries already on the channel would cause `logEntry` to increment the waitgroup when sending messages, after Cancel has been called. Instead of tracking every single message, just check the send goroutine. Faster and safe, since it will not decrement until the channel is closed. Regression from #14289	2022-03-19 09:15:45 -07:00
Anis Elleuch	b20ecc7b54	Add support of TLS session tickets with KES server (#14577 ) Reduce overhead for communication between MinIO server and KES server.	2022-03-18 15:14:10 -07:00
Harshavardhana	43eb5a001c	re-use transport for AdminInfo() call (#14571 ) Avoids creating a new transport for each `isServerResolvable` request; instead, re-use the available global transport and do not forcibly close connections, to avoid TIME_WAIT build-up on large clusters. Never use httpClient.CloseIdleConnections() since that can have a drastic effect on existing connections on the transport pool. Remove it everywhere.	2022-03-17 16:20:10 -07:00
Aditya Manthramurthy	ce97313fda	Add extra LDAP configuration validation (#14535 ) - The result now contains suggestions on fixing common configuration issues. - These suggestions will subsequently be exposed in console/mc	2022-03-16 19:57:36 -07:00
Harshavardhana	ae3b369fe1	logger webhook failure can overrun the queue_size (#14556 ) The change introduced in #13819 was incorrect: it did not handle the situation where a full buffer causes an incessant stream of logs that keeps the logger webhook overrun with requests. To avoid this, log failures only to the console logger instead of all targets, since logging to all targets can be self-referential and lead to an infinite loop.	2022-03-15 17:45:51 -07:00
Klaus Post	c07af89e48	select: Add ScanRange to CSV&JSON (#14546 ) Implements https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html#AmazonS3-SelectObjectContent-request-ScanRange Fixes #14539	2022-03-14 09:48:36 -07:00
Aditya Manthramurthy	b7ed3b77bd	Indicate required fields in LDAP configuration correctly (#14526 )	2022-03-10 19:03:38 -08:00
Poorna	75b925c326	Deprecate root disk for disk caching (#14527 ) This PR modifies #14513 to issue a deprecation warning rather than reject settings on startup.	2022-03-10 18:42:44 -08:00
Harshavardhana	91d419ee6c	warn about large block I/O performance issues on Linux older than 4.0.0 (#14524 ) This PR adds a warning message when it detects an older kernel version, alerting users to potential performance issues on that kernel. The issue is seen only with parallel I/O across all drives on denser setups, such as configurations with 45 or 90 drives per server.	2022-03-10 17:36:13 -08:00
Poorna	7ce91ea1a1	Disallow root disk to be used for cache drives (#14513 )	2022-03-10 02:45:31 -08:00
Klaus Post	b890bbfa63	Add local disk health checks (#14447 ) The main goal of this PR is to solve the situation where disks stop responding to operations. This generally causes an FD build-up and eventually will crash the server. This adds detection of hung disks, where calls on disk get stuck. We add functionality to `xlStorageDiskIDCheck` where it keeps track of the number of concurrent requests on a given disk. A total number of 100 operations are allowed. If this limit is reached we will block (but not reject) new requests, but we will monitor the state of the disk. If no requests have been completed or updated within a 15-second window, we mark the disk as offline. Requests that are blocked will be unblocked and return an error as "faulty disk". New requests will be rejected until the disk is marked OK again. Once a disk has been marked faulty, a check will run every 5 seconds that will attempt to write and read back a file. As long as this fails the disk will remain faulty. To prevent many long-running requests from marking the disk faulty, we implement a callback feature that allows updating the status as parts of these operations are running. We add a reader and writer wrapper that will update the status of each successful read/write operation. This should allow fine enough granularity that a slow, but still operational disk will not reach 15 seconds where 50 operations have not progressed. Note that errors themselves are not enough to mark a disk faulty. A nil (or io.EOF) error will mark a disk as "good". * Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`. * de-couple IsOnline() from disk health tracker The purpose of IsOnline() is to ensure that we reconnect the drive only when the drive was - disconnected from the network: we need to validate that the drive is "correct" and is the same drive that belongs to this server. - replaced: we have to format it, since we support hot swapping of drives. IsOnline() is not meant for taking the drive offline when it is hung; doing so is not useful. We can let the drive stay online and instead return errors for the relevant calls. * return errFaultyDisk for DiskInfo() call Co-authored-by: Harshavardhana <harsha@minio.io> Possible future improvements: * Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly. * Allow reads/writes to be aborted by the context. * Add usage stats, concurrent count, blocked operations, etc.	2022-03-09 11:38:54 -08:00
Klaus Post	7060c809c0	Add authorization header to HEAD requests (#14510 ) Add Authorization to network check requests. Fixes #14507	2022-03-09 10:48:56 -08:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Andreas Auernhammer	b48f719b8e	kes: remove unnecessary error conversion (#14459 ) This commit removes some duplicate code that converts KES API errors. This code was added since KES `0.18.0` changed some exported API errors. However, the KES SDK handles this error conversion itself. Therefore, it is not necessary to duplicate this behavior in MinIO. See: `21555fa624/error.go (L94)` Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-03 09:42:37 -08:00
Lenin Alevski	289fcbd08c	KES dependency upgrade (#14454 ) - Updating KES dependency to v.0.18.0 - Fixing incompatibility issue when checking for errors during KES key creation Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>	2022-03-02 23:03:40 -08:00
Harshavardhana	f6875bb893	fix: regression from refactor in AMQP notification (#14455 ) fixes a regression introduced in #14269, which refactored the notification registration logic: AMQP targets, even when online, were no longer available for use. fixes #14451	2022-03-02 21:35:48 -08:00
Klaus Post	b030ef1aca	tests: Clean up dsync package (#14415 ) Add non-constant timeouts to dsync package. Reduce test runtime by minutes. Hopefully not too aggressive.	2022-03-01 11:14:28 -08:00
Klaus Post	88fd1cba71	select: add MISSING operator support (#14406 ) Probably not full support, but for regular checks it should work. Fixes #14358	2022-02-25 12:31:19 -08:00
hellivan	5307e18085	use keycloak_realm properly for keycloak user lookups (#14401 ) In case a user defined a value for the MINIO_IDENTITY_OPENID_KEYCLOAK_REALM environment variable, construct the path properly.	2022-02-24 10:16:53 -08:00
Klaus Post	2cea944cdb	select: Allow lower case 'is' (#14405 ) Ref: #14358	2022-02-24 09:10:48 -08:00
Shireesh Anjal	3934700a08	Make audit webhook and kafka config dynamic (#14390 )	2022-02-24 09:05:33 -08:00
hellivan	0913eb6655	fix: openid config provider not initialized correctly (#14399 ) Up until now, the `InitializeProvider` method of the `Config` struct was implemented on a value receiver, which is why changes to the `provider` field were never visible to method callers. To fix this issue, the method is now implemented on a pointer receiver.	2022-02-23 23:42:37 -08:00
Harshavardhana	1bfbe354f5	fix: clientId must be unique for all servers (#14398 ) This is a regression from #14037: distributed setups with MQTT were not working anymore. According to the MQTT spec, the client ID is expected to be unique per server. Use the hex value of the Unix nanosecond timestamp here instead.	2022-02-23 20:19:59 -08:00
Shireesh Anjal	25144fedd5	Send deployment id and minio version in http header (#14378 )	2022-02-23 13:36:01 -08:00
Shireesh Anjal	94d37d05e5	Apply dynamic config at sub-system level (#14369 ) Currently, when applying any dynamic config, the system reloads and re-applies the config of all the dynamic sub-systems. This PR refactors the code in such a way that changing config of a given dynamic sub-system will work on only that sub-system.	2022-02-22 10:59:28 -08:00
Shireesh Anjal	c1437c7b46	allow `config reset api` to work by overloading default values (#14368 ) The `LookupConfig` code was not using `GetWithDefault`, because of which some of the config values were being returned as empty string, and calls like `strconv.Atoi` and `time.ParseDuration` on these were failing.	2022-02-21 15:50:45 -08:00
Aditya Manthramurthy	bc110d8055	fix: mysql notification target table creation (#14350 ) Add a generated hash column as the primary key for the key name as MySQL does not allow indexes on long VARCHAR columns.	2022-02-18 12:13:49 -08:00
Harshavardhana	65b1a4282e	fix: console logger regression with dynamic logger webhook registration (#14346 ) fixes a regression from #14289	2022-02-17 17:50:10 -08:00
Shireesh Anjal	28f188e3ef	Make logger webhook config dynamic (#14289 ) It should not be required to restart the server after setting the logger webhook config.	2022-02-17 11:11:15 -08:00
Shireesh Anjal	1a5496eced	Add `enable` key to logger webhook help (#14326 ) This key is supported by the logger webhook config - but is not returned in the help.	2022-02-16 11:59:50 -08:00
Shireesh Anjal	16939ca192	Mark SUBNET credentials as sensitive (#14320 ) So that they are redacted in the health report	2022-02-16 08:40:34 -08:00
Klaus Post	5ec57a9533	Add GetObject gzip option (#14226 ) Enabled with `mc admin config set alias/ api gzip_objects=on` Standard filtering applies (1K response minimum, not compressed content type, not range request, gzip accepted by client).	2022-02-14 09:19:01 -08:00
Shireesh Anjal	9890f579f8	Add subsystem level validation on `config set` (#14269 ) When setting a config of a particular sub-system, validate the existing config and notification targets of only that sub-system, so that existing errors related to one sub-system (e.g. notification target offline) do not result in errors for other sub-systems.	2022-02-08 10:36:41 -08:00
Shireesh Anjal	3882da6ac5	Add subnet proxy config (#14225 ) Will store the HTTP(S) proxy URL to use for connecting to SUBNET.	2022-02-01 09:52:38 -08:00
Harshavardhana	c39eb3bacd	fix: possible crash if private.key is empty (#14208 ) Before ``` panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x9f54f7] goroutine 1 [running]: crypto/x509.IsEncryptedPEMBlock(...) crypto/x509/pem_decrypt.go:105 github.com/minio/minio/internal/config.LoadX509KeyPair({0xc00061e270, 0x0}, {0xc00061e2d0, 0x25}) github.com/minio/minio/internal/config/certs.go:88 +0xf7 github.com/minio/pkg/certs.(*Manager).AddCertificate(0xc000576150, {0xc00061e270, 0x25}, {0xc00061e2d0, 0x25}) github.com/minio/pkg@v1.1.15/certs/certs.go:132 +0x368 github.com/minio/pkg/certs.NewManager({0x51f5910, 0xc00053e140}, {0xc00061e270, 0xc000580400}, {0xc00061e2d0, 0x25}, 0x4dc5880) github.com/minio/pkg@v1.1.15/certs/certs.go:97 +0x170 github.com/minio/minio/cmd.getTLSConfig() ``` After ``` ERROR Unable to load the TLS configuration: The private key is not readable > Please check your certificate ```	2022-01-30 12:55:21 -08:00
Poorna	a4be47d7ad	Validate config before saving changes after config reset (#14203 )	2022-01-27 18:28:16 -08:00
Aditya Manthramurthy	7dfa565d00	Identity LDAP: Allow multiple search base DNs (#14191 ) This change allows the MinIO server to lookup users in different directory sub-trees by allowing specification of multiple search bases separated by semicolons.	2022-01-26 15:05:59 -08:00
Poorna	295730408b	Disallow delete replication for tag based rules (#14167 )	2022-01-24 15:22:20 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis; the copy read during init is remembered and refreshed at 5-second intervals. - Do not do O_DIRECT tests on drives with an existing 'format.json'; only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading 'format.json' when migrating it from really old V1->V2->V3. - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Anis Elleuch	3e9bd931ed	tests: Remove RPC wording from the code (#14142 ) The lock was using net/rpc in the past but it got replaced with a REST API. This commit will fix function names/comments to avoid confusion.	2022-01-20 09:36:09 -08:00
Harshavardhana	1a56ebea70	cleanup dsync tests and remove net/rpc references (#14118 )	2022-01-18 12:44:38 -08:00
Harshavardhana	70e1cbda21	allow disabling O_DIRECT in certain environments for reads (#14115 ) Repeated reads of single large objects in HPC-like workloads need this option to disable O_DIRECT for more effective use of the kernel page cache. However, this option should be used in very specific situations only, and shouldn't be enabled on all servers. NVMe servers always benefit from keeping O_DIRECT on.	2022-01-17 08:34:14 -08:00
Anis Elleuch	b106b1c131	lock: Fix decision when a lock needs to be removed (#14095 ) The code was not properly deciding whether a lock needs to be removed when it no longer has quorum. After this commit, a lock will be forcefully unlocked if the nodes reporting that they cannot find the lock internally break the quorum. Simplify the code as well.	2022-01-14 10:33:08 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Aditya Manthramurthy	1981fe2072	Add internal IDP and OIDC users support for site-replication (#14041 ) - This allows site-replication to be configured when using OpenID or the internal IDentity Provider. - Internal IDP IAM users and groups will now be replicated to all members of the set of replicated sites. - When using OpenID as the external identity provider, STS and service accounts are replicated. - Currently this change disallows root service accounts from being replicated (TODO: discuss security implications).	2022-01-06 15:52:43 -08:00
Minio Trusted	76877eb6fa	move gofumpt to golang-ci	2022-01-06 13:08:21 -08:00
Harshavardhana	0d3ae3810f	make sure to comply with MQTT spec (#14037 ) - keep-alive cannot be 0 by default anymore - client_id cannot be empty fixes #13993	2022-01-06 11:25:39 -08:00
Anis Elleuch	9d91d32d82	typo: Lowercase initial in some JSON field names in log/audit output (#14020 ) Use a lowercase initial letter in some fields of the JSON log/audit output, to match the other field names.	2022-01-03 09:26:26 -08:00
Harshavardhana	a60ac7ca17	fix: audit log to support object names in multipleObjectNames() handler (#14017 )	2022-01-03 01:28:52 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
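
The receiver fix described in commit 0913eb6655 above can be illustrated with a minimal Go sketch. This is not the MinIO code: the `config` type, its `provider` field, and both method names are hypothetical stand-ins, chosen only to show why a mutation made through a value receiver is lost while the same mutation through a pointer receiver is visible to the caller.

```go
package main

import "fmt"

// config is a hypothetical stand-in for a configuration struct
// with a field that an initializer is supposed to populate.
type config struct {
	provider string
}

// initByValue uses a value receiver: c is a copy of the caller's
// struct, so the assignment is discarded when the method returns.
func (c config) initByValue() {
	c.provider = "keycloak"
}

// initByPointer uses a pointer receiver: c points at the caller's
// struct, so the assignment is visible after the call.
func (c *config) initByPointer() {
	c.provider = "keycloak"
}

func main() {
	var a, b config
	a.initByValue()
	b.initByPointer()
	fmt.Printf("value receiver:   %q\n", a.provider) // prints ""
	fmt.Printf("pointer receiver: %q\n", b.provider) // prints "keycloak"
}
```

Go passes the receiver like any other argument, so a method that needs to modify its receiver has to be declared on the pointer type.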

344 Commits