minio

mirror of https://github.com/minio/minio.git synced 2025-11-27 04:46:53 -05:00

Author	SHA1	Message	Date
Andreas Auernhammer	e955aa7f2a	kes: add support for encrypted private keys (#14650 ) This commit adds support for encrypted KES client private keys. Now, it is possible to encrypt the KES client private key (`MINIO_KMS_KES_KEY_FILE`) with a password. For example, KES CLI already supports the creation of encrypted private keys: ``` kes identity new --encrypt --key client.key --cert client.crt MinIO ``` To decrypt an encrypted private key, the password needs to be provided: ``` MINIO_KMS_KES_KEY_PASSWORD=<password> ``` Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-29 09:53:33 -07:00
Harshavardhana	7956ff0313	fix: multiple pool setup return incorrect DeleteMarker metadata (#14642 )	2022-03-27 23:39:50 -07:00
Aditya Manthramurthy	9ff25fb64b	Load IAM in-memory cache using only a single list call (#14640 ) - Increase global IAM refresh interval to 30 minutes - Also print a log after loading IAM subsystem	2022-03-27 18:48:01 -07:00
Andreas Auernhammer	04df69f633	listing: decrypt only SSE-S3 single-part ETags (#14638 ) This commit optimises the ETag decryption when listing objects. When MinIO lists objects, it has to decrypt the ETags of single-part SSE-S3 objects. It does not need to decrypt ETags of - plaintext objects => Their ETag is not encrypted - SSE-C objects => Their ETag is not the content MD5 - SSE-KMS objects => Their ETag is not the content MD5 - multipart objects => Their ETag is not encrypted Hence, MinIO only needs to make a call to the KMS when it needs to decrypt a single-part SSE-S3 object. It can resolve the ETags of all other object types locally. This commit implements the above semantics by processing an object listing in batches. If the batch contains no single-part SSE-S3 object, then no KMS calls will be made. If the batch contains at least one single-part SSE-S3 object we have to make at least one KMS call. Now we first filter all single-part SSE-S3 objects such that we only request the decryption keys for these objects. Once we know which objects (and therefore which ETags) require a decryption key, MinIO either uses the KES bulk decryption API (if supported) or decrypts each ETag serially. This commit is a significant improvement compared to the previous listing code. Before, a single non-SSE-S3 object caused MinIO to fall back to a serial ETag decryption. For example, if a batch consisted of 249 SSE-S3 objects and one single SSE-KMS object, MinIO would send 249 requests to the KMS. Now, MinIO will send a single request for exactly those 249 objects and skip the one SSE-KMS object since it can handle its ETag locally. Further, MinIO would request decryption keys for SSE-S3 multipart objects in the past - even though multipart ETags are not encrypted. So, if a bucket contained only multipart SSE-S3 objects, MinIO would make totally unnecessary requests to the KMS. Now, MinIO simply skips these multipart objects since it can handle the ETags locally. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-27 18:34:11 -07:00
Anis Elleuch	908eb57795	Always get the actual object size (#14637 ) In bulk ETag decryption, do not rely on the etag to check if it is encrypted or not to decide if we should set the actual object size in ObjectInfo. The reason is that multipart objects ETags are not encrypted. Always get the actual object size in that case.	2022-03-27 08:54:25 -07:00
Harshavardhana	5cfedcfe33	askDisks for strict quorum to be equal to read quorum (#14623 )	2022-03-25 16:29:45 -07:00
Andreas Auernhammer	4d2fc530d0	add support for SSE-S3 bulk ETag decryption (#14627 ) This commit adds support for bulk ETag decryption for SSE-S3 encrypted objects. If KES supports a bulk decryption API, then MinIO will check whether its policy grants access to this API. If so, MinIO will use a bulk API call instead of sending encrypted ETags serially to KES. Note that MinIO will not use the KES bulk API if its client certificate is an admin identity. MinIO will process object listings in batches. A batch has a configurable size that can be set via `MINIO_KMS_KES_BULK_API_BATCH_SIZE=N`. It defaults to `500`. This env. variable is experimental and may be renamed / removed in the future. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-25 15:01:41 -07:00
Harshavardhana	f046f557fa	request only 1 best version for latest version resolution (#14625 ) ListObjects, ListObjectsV2 calls are being heavily taxed when there are many versions on objects left over from a previous release or ILM was never setup to clean them up. Instead of being absolutely correct at resolving the exact latest version of an object, we simply rely on the top most 1 version and resolve the rest. Once we have obtained the top most "1" version for ListObject, ListObjectsV2 call we break out.	2022-03-25 08:50:07 -07:00
Harshavardhana	401958938d	add load balance properly restClientFromHash() bucket/prefix (#14621 ) spread out resuming further to other nodes	2022-03-25 03:41:31 -07:00
Poorna	566cffe53d	save format.json by default for inspect API (#14620 )	2022-03-25 02:02:17 -07:00
Minio Trusted	a42b576382	keep maximum concurrent operations to 512 (to sustain upto 1024 open fds)	2022-03-23 17:02:04 -07:00
Klaus Post	2ac54e5a7b	ListObjects: Filter lifecycle expired objects (#14606 ) For ListObjects and ListObjectsV2 perform lifecycle checks on all objects before returning. This will filter out objects that are pending lifecycle expiration. Bonus: Cheaper server pool conflict resolution by not converting to FileInfo.	2022-03-22 12:39:45 -07:00
Harshavardhana	8eecdc6d1f	odd stripe sizes should choose (odd+1)/2 to get correct quorum (#14610 )	2022-03-22 12:21:14 -07:00
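As a small worked example of the arithmetic named in the title above: with integer division, `n/2` rounds down for odd `n`, so an odd stripe size needs `(n+1)/2` to reach a majority. The function name `halfQuorum` is hypothetical, and the assumption that even sizes keep using `n/2` is ours, not taken from the commit.

```go
package main

import "fmt"

// halfQuorum is a hypothetical helper: it returns (n+1)/2 for odd stripe
// sizes, as the commit title suggests, and n/2 for even ones (an assumption).
func halfQuorum(stripeSize int) int {
	if stripeSize%2 != 0 {
		return (stripeSize + 1) / 2
	}
	return stripeSize / 2
}

func main() {
	for _, n := range []int{4, 5, 7, 16} {
		fmt.Printf("stripe=%d quorum=%d\n", n, halfQuorum(n))
	}
}
```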
Klaus Post	50577e2bd2	Allow adjusting request pool both ways (#14609 ) When reloading a dynamic config allow the request pool to scale both ways. Existing requests hold on to the previous pool, so they will pop the elements from that.	2022-03-22 11:28:54 -07:00
Klaus Post	7bc1f986e8	Do not wait for results when canceled (#14607 ) When canceled nobody may be listening for the results. Prevents memory buildup from cancelled requests.	2022-03-22 09:37:01 -07:00
Harshavardhana	d796621ccc	choose smaller default deadline for diagnostics without --full (#14599 )	2022-03-21 23:25:24 -07:00
Harshavardhana	f6113264f4	add detection for GOMAXPROCS < NumCPU	2022-03-21 19:05:10 -07:00
Harshavardhana	a3534a730b	fallback quorum should be "strict" globally if config is not loaded (#14589 )	2022-03-20 17:39:06 -07:00
Harshavardhana	bd6f7b6d83	fix: make decommission restart non-blocking (#14591 ) currently an on-going decommission, during a server restart might block the startup sequence for relatively longer periods, instead start the decommission in background lazily.	2022-03-20 14:46:43 -07:00
Andreas Auernhammer	b0a4beb66a	PutObjectPart: set SSE-KMS headers and truncate ETags. (#14578 ) This commit fixes two bugs in the `PutObjectPartHandler`. First, `PutObjectPart` should return SSE-KMS headers when the object is encrypted using SSE-KMS. Before, this was not the case. Second, the ETag should always be a 16 byte hex string, perhaps followed by a `-X` (where `X` is the number of parts). However, `PutObjectPart` used to return the encrypted ETag in case of SSE-KMS. This leaks MinIO internal etag details through the S3 API. The combination of both bugs causes clients that use SSE-KMS to fail when trying to validate the ETag. Since `PutObjectPart` did not send the SSE-KMS response headers, the response looked like a plaintext `PutObjectPart` response. Hence, the client tries to verify that the ETag is the content-md5 of the part. This could never be the case, since MinIO used to return the encrypted ETag. Therefore, clients behaving as specified by the S3 protocol tried to verify the ETag in a situation they should not. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-19 10:15:12 -07:00
Harshavardhana	01ee49045e	fix: handle race in server setup global CI/CD variable (#14579 )	2022-03-18 18:21:09 -07:00
Harshavardhana	7bd9f821dd	return correct context errors for locking operations (#14569 ) if a context is canceled do not need to return a timeout error instead, return the appropriate error for context canceled.	2022-03-18 15:32:45 -07:00
Klaus Post	61eb9d4e29	Fix listing fallback re-using disks (#14576 ) When more than 2 disks are unavailable for listing, the same disk will be used for fallback. This makes quorum calculations incorrect since the same disk will have multiple entries. This PR keeps track of which fallback disks have been handed out and only ever returns a disk once.	2022-03-18 11:35:27 -07:00
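One way to picture "hand out each fallback disk only once" is a pool with a cursor, sketched below. The type `fallbackPool` and its `take` method are hypothetical, and disks are represented as plain strings for brevity. The actual bookkeeping in the PR may differ.

```go
package main

import "fmt"

// fallbackPool is a hypothetical tracker of fallback disks. The cursor
// records how many disks have already been handed out.
type fallbackPool struct {
	disks []string
	next  int
}

// take returns the next unused fallback disk. It returns false once every
// disk has been handed out, so no disk is ever returned twice.
func (p *fallbackPool) take() (string, bool) {
	if p.next >= len(p.disks) {
		return "", false
	}
	d := p.disks[p.next]
	p.next++
	return d, true
}

func main() {
	pool := &fallbackPool{disks: []string{"disk-5", "disk-6"}}
	for i := 0; i < 3; i++ {
		d, ok := pool.take()
		fmt.Println(d, ok)
	}
}
```

Because a disk can appear at most once among the results, each entry counted toward quorum comes from a distinct disk.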
Harshavardhana	43eb5a001c	re-use transport for AdminInfo() call (#14571 ) avoids creating new transport for each `isServerResolvable` request, instead re-use the available global transport and do not try to forcibly close connections to avoid TIME_WAIT build-up on large clusters. Never use httpClient.CloseIdleConnections() since that can have a drastic effect on existing connections on the transport pool. Remove it everywhere.	2022-03-17 16:20:10 -07:00
Klaus Post	c1760fb764	Move apiCalls to front for field alignment (#14568 ) Fixes #14565	2022-03-17 10:57:52 -07:00
Minio Trusted	ffcadcd99e	Revert "Use S3 client for uplooads/downloads during perf test (#14553 )" This reverts commit `ff811f594b`. Speedtest is broken need to fix this more cleanly.	2022-03-16 23:34:49 -07:00
Krishnan Parthasarathi	7b81967a3c	Fix handling of object versions pending purge (#14555 ) - GetObject() with vid should return 405 - GetObject() without vid should return 404 - ListObjects() should ignore this object if this is the "latest" version of the object - ListObjectVersions() should list this object as "DELETE marker" - Remove data parts before sync'ing the version pending purge	2022-03-16 16:59:43 -07:00
Krishna Srinivas	ff811f594b	Use S3 client for uplooads/downloads during perf test (#14553 )	2022-03-16 16:58:46 -07:00
Harshavardhana	e3071157f0	allow MakeBucketLocation to work for metaBucket (#14548 ) decommission would fail to start due to failure in MakeBucketLocation() error on .minio.sys/ bucket creation. Allow these special buckets.	2022-03-14 11:25:24 -07:00
Klaus Post	c07af89e48	select: Add ScanRange to CSV&JSON (#14546 ) Implements https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html#AmazonS3-SelectObjectContent-request-ScanRange Fixes #14539	2022-03-14 09:48:36 -07:00
Harshavardhana	9c846106fa	decouple service accounts from root credentials (#14534 ) changing root credentials makes service accounts in-operable, this PR changes the way sessionToken is generated for service accounts. It changes service account behavior to generate sessionToken claims from its own secret instead of using global root credential. Existing credentials will be supported by falling back to verify using root credential. fixes #14530	2022-03-14 09:09:22 -07:00
Harshavardhana	cf94d1f1f1	do not crash readXLMetaNoData - if the `xl.meta` has incorrect content (#14538 ) ``` tmp = buf[want:] ``` Would potentially crash when `buf` is truncated for some reason and does not have the expected bytes, this is of course considered not normal and is an odd situation. But we do not need to crash here instead allow for errors to be returned and let callers handle the errors.	2022-03-14 09:07:46 -07:00
Poorna	f8d6eaaa96	fix: regression from range GET proxy on replicated buckets #14345 (#14532 ) Fixes: #14531	2022-03-11 15:56:49 -08:00
Poorna	75b925c326	Deprecate root disk for disk caching (#14527 ) This PR modifies #14513 to issue a deprecation warning rather than reject settings on startup.	2022-03-10 18:42:44 -08:00
Harshavardhana	91d419ee6c	warn issues about large block I/O performance for Linux older than 4.0.0 (#14524 ) This PR simply adds a warning message when it detects older kernel versions and warns about potential performance issues on such kernels. The issue can be seen only with parallel I/O across all drives on denser setups such as 90 drives or 45 drives per server configurations.	2022-03-10 17:36:13 -08:00
Harshavardhana	41079f1015	heal: remove blocking healDiskMeta upon startup (#14514 ) This type of code is not necessary, reads of all metadata content at `.minio.sys/config` automatically trigger healing when necessary in the GetObjectNInfo() call-path. Having this code is not useful and this also adds to the overall startup time of MinIO when there are lots of users and policies.	2022-03-10 02:45:14 -08:00
Poorna	712dfa40cd	Add missing site replication hook for clearing sse config (#14512 )	2022-03-10 00:04:34 -08:00
Klaus Post	b890bbfa63	Add local disk health checks (#14447 ) The main goal of this PR is to solve the situation where disks stop responding to operations. This generally causes an FD build-up and eventually will crash the server. This adds detection of hung disks, where calls on disk get stuck. We add functionality to `xlStorageDiskIDCheck` where it keeps track of the number of concurrent requests on a given disk. A total number of 100 operations are allowed. If this limit is reached we will block (but not reject) new requests, but we will monitor the state of the disk. If no requests have been completed or updated within a 15-second window, we mark the disk as offline. Requests that are blocked will be unblocked and return an error as "faulty disk". New requests will be rejected until the disk is marked OK again. Once a disk has been marked faulty, a check will run every 5 seconds that will attempt to write and read back a file. As long as this fails the disk will remain faulty. To prevent lots of long-running requests to mark the disk faulty we implement a callback feature that allows updating the status as parts of these operations are running. We add a reader and writer wrapper that will update the status of each successful read/write operation. This should allow fine enough granularity that a slow, but still operational disk will not reach 15 seconds where 50 operations have not progressed. Note that errors themselves are not enough to mark a disk faulty. A nil (or io.EOF) error will mark a disk as "good". * Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`. * de-couple IsOnline() from disk health tracker The purpose of IsOnline() is to ensure that we reconnect the drive only when the "drive" was - disconnected from network we need to validate if the drive is "correct" and is the same drive which belongs to this server. - drive was replaced we have to format it - we support hot swapping of the drives. IsOnline() is not meant for taking the drive offline when it is hung, it is not useful we can let the drive be online instead "return" errors for relevant calls. * return errFaultyDisk for DiskInfo() call Co-authored-by: Harshavardhana <harsha@minio.io> Possible future Improvements: * Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly. * Allow reads/writes to be aborted by the context. * Add usage stats, concurrent count, blocked operations, etc.	2022-03-09 11:38:54 -08:00
Poorna	46ba15ab03	Return MethodNotAllowed if force del on replicated bucket (#14505 )	2022-03-08 14:28:51 -08:00
Poorna	1e39ca39c3	fix: consistent replies for incorrect range requests on replicated buckets (#14345 ) Propagate error from replication proxy target correctly to the client if range GET is unsatisfiable.	2022-03-08 13:58:55 -08:00
Krishnan Parthasarathi	80ef1ae51c	Simplify assembling of tierStats from data-usage (#14504 )	2022-03-08 12:08:29 -08:00
Krishna Srinivas	4d0715d226	Implement netperf for "mc support perf net" (#14397 ) Co-authored-by: Klaus Post <klauspost@gmail.com>	2022-03-08 09:54:38 -08:00
Klaus Post	8a274169da	heal: Fix first entry on dangling (#14495 ) Instead of the first, the last entry was returned pointerizing the range value.	2022-03-08 09:04:20 -08:00
Harshavardhana	5d6f6d8d5b	create missing .minio.sys/config, .minio.sys/buckets during decommission (#14497 )	2022-03-07 16:18:57 -08:00
Anis Elleuch	bacf6156c1	metrics: Avoid crash when fetching tier metrics (#14493 ) Data usage does not always contain tiering info even if the data usage information is valid. Avoid a crash in that case. (e.g. the scanner scanned the namespace, the user enables tiering, prometheus scrapes the server before the scanner gets a chance to update the data usage with new tiering information)	2022-03-07 10:59:32 -08:00
Klaus Post	1d1b213f1f	scanner: Consider preselection bias when selecting for Healing (#14492 ) Healing decisions would align with skipped folder counters. This can lead to files never being selected for heal checks on "clean" paths. Use different hashing methods and take objectHealProbDiv into account when calculating the cycle. Found by @vadmeste	2022-03-07 09:25:53 -08:00
Harshavardhana	92a77cc78e	update pkg v1.1.20 to reload certs in k8s always (#14470 )	2022-03-04 20:34:39 -08:00
Harshavardhana	b0c84e3de7	fix: deleteVersions causing xl.meta to have empty Versions[] slice (#14483 ) This is a side effect of the optimization done in PR #13544: certain delete operations on given object versions can cause the lastVersion indication to be skipped, which leads to an `xl.meta` where the Versions[] slice is empty while the file itself is intact. This PR tries to ensure that such files are visible and deletable by regular means of listing as null 'delete-marker' and also avoids the situation where this potential issue might arise.	2022-03-04 20:01:26 -08:00
Anis Elleuch	bbc914e174	heal: Do not override heal scan mode if it is set (#14476 ) mc admin heal has --scan=deep flag which enforces bitrot checking when doing the healing. Do not force override an existing heal scan option.	2022-03-04 18:25:06 -08:00
Anis Elleuch	3fca4055d2	heal: Re-heal an object when a corruption is found during normal scan (#14482 ) When scanning using normal mode, HealObject() can report an error saying that it found a corrupted part. This doesn't happen when HealObject() is called with bitrot scan flag. However, when this happens, we can still restart HealObject() with the bitrot scan. This is also important because this means the scanner and the new disks healer will not be able to heal an object that doesn't exist in a specific disk and has corruption in another disk. Also without this PR, mc admin heal command without bitrot will report an error.	2022-03-04 18:24:34 -08:00

4491 Commits