minio

mirror of https://github.com/minio/minio.git synced 2025-04-16 17:00:07 -04:00

Author	SHA1	Message	Date
Anis Elleuch	d4e565e595	Add defensive check for one stream message size (#15029 ) In a streaming response, the client knows the size of a streamed message but never checks the message size. Add the check to error out if the response message is truncated.	2022-06-02 09:16:26 -07:00
Klaus Post	f7cecf0945	Make isIndexedMetaV2 return errors (#15012 ) Indexed streams would be decoded by the legacy loader if there was an error loading it. Return an error when the stream is indexed and it cannot be loaded. Fixes "unknown minor metadata version" on corrupted xl.meta files and returns an actual error.	2022-05-31 19:06:57 -07:00
Harshavardhana	52221db7ef	fix: for unexpected errors in reading versioning config panic (#14994 ) We need to make sure if we cannot read bucket metadata for some reason, and bucket metadata is not missing and returning corrupted information we should panic such handlers to disallow I/O to protect the overall state on the system. In-case of such corruption we have a mechanism now to force recreate the metadata on the bucket, using `x-minio-force-create` header with `PUT /bucket` API call. Additionally fix the versioning config updated state to be set properly for the site replication healing to trigger correctly.	2022-05-31 02:57:57 -07:00
Anis Elleuch	56a61bab56	test: Add GetObjectNInfo test with some outdated disks (#15004 ) Add a test reading an object which has some old data in some outdated disks, in a versioned and non-versioned bucket.	2022-05-30 17:52:59 -07:00
Harshavardhana	d480022711	fix: invalidate outdated disks appropriately during readAllXL (#15002 ) readAllXL would return inlined data for outdated disks causing "read" to return incorrect content to the client, this PR fixes this behavior by making sure we skip such outdated disks appropriately based on the latest ModTime on the disk.	2022-05-30 12:43:54 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	5792be71fa	fix: add timeouts to avoid goroutine leaks in net/http (#14995 ) The following code can reproduce an unending goroutine buildup, keeping connections established because the client never closes them. https://gist.github.com/harshavardhana/2d00e6f909054d2d2524c71485ad02e1 Without this PR all MinIO deployments can be subjected to denial of service attacks, causing the entire service to be unavailable. We bring in two timeouts at this stage to control such goroutine buildups, new change - IdleTimeout (to kill off idle connections) - ReadHeaderTimeout (to kill off connections that are too slow) This new change also brings two hidden options to make any additional relevant changes if desired in some setups.	2022-05-30 06:24:51 -07:00
Poorna	5e3010d455	Tighten enforcement of object retention (#14993 ) Ref issue#14991 - in the rare case that object in bucket under retention has null version, make sure to enforce retention rules.	2022-05-28 02:21:19 -07:00
Anis Elleuch	ccbf65c8e8	site-repl: Fix deadlock after an IAM loading error (#14990 ) Fix forgotten IAM cache lock releases when reading some data from disk/etcd Co-authored-by: Anis Elleuch <anis@min.io>	2022-05-27 10:26:38 -07:00
Harshavardhana	9d07cde385	use crypto/sha256 only for FIPS 140-2 compliance (#14983 ) It would seem like the PR #11623 had chewed more than it wanted to, non-fips build shouldn't really be forced to use slower crypto/sha256 even for presumed "non-performance" codepaths. In MinIO there are really no "non-performance" codepaths. This assumption seems to have had an adverse effect in certain areas of CPU usage. This PR ensures that we stick to sha256-simd on all non-FIPS builds, our most common build to ensure we get the best out of the CPU at any given point in time.	2022-05-27 06:00:19 -07:00
Aditya Manthramurthy	464b9d7c80	Add support for Identity Management Plugin (#14913 ) - Adds an STS API `AssumeRoleWithCustomToken` that can be used to authenticate via the Id. Mgmt. Plugin. - Adds a sample identity manager plugin implementation - Add doc for plugin and STS API - Add an example program using go SDK for AssumeRoleWithCustomToken	2022-05-26 17:58:09 -07:00
Poorna	5c81d0d89a	site replication: heal missing/invalid replication config (#14979 ) Validate remote target ARNs and heal any stale rules in the replication config	2022-05-26 17:57:23 -07:00
Klaus Post	c0bf02b8b2	Ignore disks with 0 total space (#14981 ) Ignore disks with 0 total space. Mainly defensive to ensure no `/0` in percent calculation.	2022-05-26 06:01:50 -07:00
Harshavardhana	fd46a1c3b3	fix: some races when accessing ldap/openid config globally (#14978 )	2022-05-25 18:32:53 -07:00
Aditya Manthramurthy	5aae7178ad	Fix listing of service and sts accounts (#14977 ) Now returns user does not exist error if the user is not known to the system	2022-05-25 15:28:54 -07:00
Harshavardhana	dea8220eee	do not heal outdated disks > parityBlocks (#14976 ) this PR also fixes a situation where incorrect partsMetadata slice was used where fi.Data was re-used from a single drive causing duplication of the shards across all drives. This happens for situations where shouldHeal() returns true for all drives > parityBlocks. To avoid this we should never attempt to heal on all drives > parityBlocks, unless we are doing metadata migration from xl.json -> xl.meta	2022-05-25 15:17:10 -07:00
Klaus Post	a4be0b88f6	Add server pool reserved space (#14974 ) If one or more pools reach 85% usage in a set, we will only use pools that have more free space. In case all pools are above 85% we allow all of them to be used with the regular distribution.	2022-05-25 13:20:20 -07:00
Poorna	d8101573be	Disallow deletion of ARN when under active replication (#14972 ) fixes a regression from #12880	2022-05-24 19:40:45 -07:00
Klaus Post	41cdb357bb	Compensate for different server pool sizes (#14968 ) When a server pool with a different number of sets is added they are not compensated when choosing a destination pool for new objects. This leads to the unbalanced placement of objects with smaller pools getting a bigger number of objects since we only compare the destination sets directly. This change will compensate for differences in set sizes when choosing the destination pool. Different set sizes are already compensated by fewer disks.	2022-05-24 18:57:14 -07:00
Harshavardhana	38caddffe7	fix: copyObject on versioned bucket when updating metadata (#14971 ) updating metadata with CopyObject on a versioned bucket causes the latest version to be not readable, this PR fixes this properly by handling the inline data bug fix introduced in PR #14780. This bug affects only inlined data.	2022-05-24 17:27:45 -07:00
Poorna	0e26f983d6	site replication: Allow replication rule edit (#14969 ) Revert commit b42cfcea60aedddb300d46a371662b8d16574d6b as too restrictive	2022-05-24 13:27:33 -07:00
Anis Elleuch	77dc99e71d	Do not use inline data size in xl.meta quorum calculation (#14831 ) * Do not use inline data size in xl.meta quorum calculation Data shards of one object can have different inline/not-inline decisions on multiple disks. This happens with outdated disks when the inline decision changes. For example, enabling bucket versioning configuration will change the small file threshold. When the parity of an object becomes low, GET object can return 503 because it is unable to calculate the xl.meta quorum, just because some xl.meta files have inline data and others do not. So this commit disables taking the size of the inline data into consideration when calculating the xl.meta quorum. * Add tests for simultaneous inline/notinline object Co-authored-by: Anis Elleuch <anis@min.io>	2022-05-24 06:26:38 -07:00
Anis Elleuch	5041bfcb5c	replication healing: Fix typo when healing bucket quota info (#14966 ) A typo was found in the replication healing code where an empty quota configuration is sent to peer sites instead of the correct one.	2022-05-24 06:26:13 -07:00
Harshavardhana	f8650a3493	fetch bucket replication stats across peers in single call (#14956 ) current implementation relied on recursively calling one bucket at a time across all peers, this would be very slow and chatty when there are 100's of buckets which would mean 100*peerCount amount of network operations. This PR attempts to reduce this entire call into `peerCount` amount of network calls only. This functionality addresses also a concern where the Prometheus metrics would significantly slow down when one of the peers is offline.	2022-05-23 09:15:30 -07:00
Klaus Post	90a52a29c5	Fix WalkDir fallback hot loop (#14961 ) Fix fallback hot loop. fd was never refreshed, leading to an infinite hot loop if a disk failed and the fallback disk fails as well. Fix & simplify retry loop. Fixes #14960	2022-05-23 06:28:46 -07:00
Poorna	8859c92f80	Relax site replication syncing of service accounts (#14955 ) Synchronous replication of service/sts accounts can be relaxed as site replication healing should catch up when peer clusters are back online.	2022-05-20 19:09:11 -07:00
Anis Elleuch	01e5632949	mrf: Fix stale MRF data showed in heal info (#14953 ) One user reported that the ETA in the mc admin heal status output kept increasing over time. It turned out that MRF was not clearing its data due to a bug in the code. pendingItems is increased when an object is queued to be healed but never decreased when there is a healing error. This commit will decrease pendingItems and pendingBytes even when there is an error to give accurate reporting.	2022-05-20 07:33:18 -07:00
Anis Elleuch	95a6b2c991	Merge LDAP STS policy evaluation with the generic STS code (#14944 ) If LDAP is enabled, STS security token policy is evaluated using a different code path and expects ldapUser claim to exist in the security token. This means other STS temporary accounts generated by any Assume Role function, such as AssumeRoleWithCertificate, won't be allowed to do any operation as these accounts do not have LDAP user claim. Since IsAllowedLDAPSTS() is similar to IsAllowedSTS(), this commit will merge both. Non harmful changes: - IsAllowed for LDAP will start supporting RoleARN claim - IsAllowed for LDAP will not check for parent claim anymore. This check doesn't seem to be useful since all STS login compare access/secret/security-token with the one saved in the disk. - LDAP will support $username condition in policy documents. Co-authored-by: Anis Elleuch <anis@min.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2022-05-19 11:06:55 -07:00
Harshavardhana	30c9e50701	make sure to ignore expected errors and dirname deletes (#14945 )	2022-05-18 17:58:19 -07:00
Aditya Manthramurthy	9aadd725d2	Avoid calling .Reset() on active timer (#14941 ) .Reset() documentation states: For a Timer created with NewTimer, Reset should be invoked only on stopped or expired timers with drained channels. This change is just to comply with this requirement as there might be some runtime dependent situation that might lead to unexpected behavior.	2022-05-18 15:37:58 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Harshavardhana	2dc8ac1e62	allow IAM cache load to be granular and capture missed state (#14930 ) anything that is stuck on the disk today can cause latency spikes for all incoming S3 I/O, we need to have this de-coupled so that we can make sure that latency in loading credentials are not reflected back to the S3 API calls. The approach this PR takes is by checking if the calls were updated just in case when the IAM load was in progress, so that we can use merge instead of "replacement" to avoid missing state.	2022-05-17 19:58:47 -07:00
Harshavardhana	040ac5cad8	fix: when logger queue is full exit quickly upon doneCh (#14928 ) Additionally only reload requested sub-system not everything	2022-05-16 16:10:51 -07:00
Harshavardhana	03f8b25b50	disable connectDisks loop under testing (#14920 ) avoids races during tests, keeps tests predictable	2022-05-16 05:36:00 -07:00
Aditya Manthramurthy	f28a8eca91	Add Access Management Plugin tests with OpenID (#14919 )	2022-05-13 12:48:02 -07:00
Anis Elleuch	ca69e54cb6	tests: Fix sporadic failure of TestXLStorageDeleteFile (#14911 ) The test expects from DeleteFile to return errDiskNotFound when the disk is not available. It calls os.RemoveAll() to remove one disk after XL storage initialization. However, this latter contains some goroutines which can race with os.RemoveAll() and then the test fails sporadically with returning random errors. The commit will tweak the initialization routine of the XL storage to only run deletion of temporary and metacache data in the background, so TestXLStorageDeleteFile won't fail anymore.	2022-05-12 15:24:58 -07:00
Aditya Manthramurthy	4629abd5a2	Add tests for Access Management Plugin (#14909 )	2022-05-12 15:24:19 -07:00
Harshavardhana	dc99f4a7a3	allow bucket to be listed when GetBucketLocation is enabled (#14903 ) currently, we allowed buckets to be listed from the API call if and when the user has ListObject() permission at the global level, this is okay to be extended to GetBucketLocation() as well since GetBucketLocation() is a "read" call and allowing "reads" on a bucket has an implicit assumption that ListBuckets() should be allowed. This makes discovering access easier for read-only users or users with specific restrictions on their policies.	2022-05-12 10:46:20 -07:00
Harshavardhana	9341201132	logger lock should be more granular (#14901 ) This PR simplifies few things by splitting the locks between audit, logger targets to avoid potential contention between them. any failures inside audit/logger HTTP targets must only log to console instead of other targets to avoid cyclical dependency. avoids unneeded atomic variables instead uses RWLock to differentiate a more common read phase v/s lock phase.	2022-05-12 07:20:58 -07:00
Krishnan Parthasarathi	88dd83a365	lifecycle: Set opts.VersionSuspended when expiring objects (#14902 )	2022-05-12 06:09:24 -07:00
Harshavardhana	60d0611ac2	use BadRequest HTTP status instead of Conflict for certain errors (#14900 ) PutBucketVersioning API should return BadRequest for errors instead of Conflict, Conflict is used for "AlreadyExists" resource situations.	2022-05-11 13:44:16 -07:00
Harshavardhana	f939222942	add support for extra prometheus labels (#14899 ) fixes #14353	2022-05-11 13:04:53 -07:00
Krishna Srinivas	e34ca9acd1	retry each object decom upto 3 times, in-case of failure (#14861 )	2022-05-11 11:37:32 -07:00
Aditya Manthramurthy	83071a3459	Add support for Access Management Plugin (#14875 ) - This change renames the OPA integration as Access Management Plugin - there is nothing specific to OPA in the integration, it is just a webhook. - OPA configuration is automatically migrated to Access Management Plugin and OPA specific configuration is marked as deprecated. - OPA doc is updated and moved.	2022-05-10 17:14:55 -07:00
Anis Elleuch	edf364bf21	tracing: Add disk path to storage tracing (#14883 ) Example: 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/2 / 227.834µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/4 / 236.042µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/3 / 130.958µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/1 / 102.875µs	2022-05-10 07:48:07 -07:00
Anis Elleuch	1e037883b0	pools: GetObjectNInfo should cover locking during object read (#14887 ) In case of multi-pools setup, GetObjectNInfo returns a GetObjectReader but it unlocks the read lock when quitting GetObjectNInfo. This should not happen, unlock should only happen when GetObjectReader is closed.	2022-05-10 07:47:40 -07:00
Klaus Post	d909f167ff	tests: Add localLocker RUnlock test (#14882 )	2022-05-09 09:55:52 -07:00
Harshavardhana	62aa42cccf	avoid replication proxy on version excluded paths (#14878 ) no need to attempt proxying objects that were never replicated, but do have local `null` versions on them.	2022-05-08 16:50:31 -07:00
Harshavardhana	5cffd3780a	fix: multiple fixes in prefix exclude implementation (#14877 ) - do not need to restrict prefix exclusions that do not have `/` as suffix, relax this requirement as spark may have staging folders with other autogenerated characters, so we are better off doing full prefix match and skip. - multiple delete objects was incorrectly creating a null delete marker on a versioned bucket instead of creating a proper versioned delete marker. - do not suspend paths on the excluded prefixes during delete operations to avoid creating `null` delete markers, honor suspension of versioning only at bucket level for delete markers.	2022-05-07 22:06:44 -07:00
Harshavardhana	def75ffcfe	allow versioning config changes under site replication (#14876 ) PR #14828 introduced prefix-level exclusion of versioning and replication - however our site replication implementation since it defaults versioning on all buckets did not allow changing versioning configuration once the bucket was created. This PR changes this and ensures that such changes are honored and also propagated/healed across sites appropriately.	2022-05-07 18:39:40 -07:00

4564 Commits