minio

mirror of https://github.com/minio/minio.git synced 2025-11-28 13:09:09 -05:00

Author	SHA1	Message	Date
Harshavardhana	53997ecc79	avoid excessive logging for objects that do not exist (#19030 ) in replicated setups, that have proxying enabled for replicated buckets.	2024-02-11 14:21:08 -08:00
Harshavardhana	997ba3a574	introduce reader deadlines for net.Conn (#19023 ) Bonus: set "retry-after" header for AWS SDKs if possible to honor them.	2024-02-09 13:25:16 -08:00
Harshavardhana	62761a23e6	remove unnecessary metrics in 'mc admin info' output (#19020 ) Reduce the amount of data transfer on large deployments	2024-02-08 19:28:46 -08:00
Harshavardhana	404d8b3084	fix: dangling objects honor parityBlocks instead of dataBlocks (#19019 ) Bonus: do not recreate buckets if NoRecreate is asked.	2024-02-08 15:22:16 -08:00
Klaus Post	6005ad3d48	Fix shared top locks client (#19018 ) `client` is shared across goroutines. Seen with `mc support top locks` on minio built with `-race`.	2024-02-08 12:28:05 -08:00
Harshavardhana	035a3ea4ae	optimize startup sequence performance (#19009 ) - bucket metadata does not need to look for legacy things anymore if b.Created is non-zero - stagger bucket metadata loads across lots of nodes to avoid the current thundering herd problem. - Remove deadlines for RenameData, RenameFile - these calls should not ever be timed out and should wait until completion or wait for client timeout. Do not choose timeouts for applications during the WRITE phase. - increase R/W buffer size, increase maxMergeMessages to 30	2024-02-08 11:21:21 -08:00
Aditya Manthramurthy	e104b183d8	fix: skip policy usage validation for cache update (#19008 ) When updating the policy cache, we do not need to validate policy usage as the policy has already been deleted by the node sending the notification.	2024-02-07 20:39:53 -08:00
Klaus Post	7e082f232e	Add GetBucketInfo toStorageErr conversion (#19005 ) Convert error to storageError since it is used for quorum calculations here: `ff80cfd83d/cmd/peer-s3-client.go (L339)`	2024-02-07 14:24:24 -08:00
Harshavardhana	d28bf71f25	listing must return WalkDir() errors first (#19006 )	2024-02-07 13:20:07 -08:00
Harshavardhana	5b1a74b6b2	do not block iam.store registration (#18999 ) current implementation would quite simply block the sys.store registration, making sys.Initialized() call to be blocked.	2024-02-07 12:41:58 -08:00
Klaus Post	ebc6c9b498	Fix tracing send on closed channel (#18982 ) Depending on when the context cancelation is picked up, the handler may return and close the channel before `SubscribeJSON` returns, causing: ``` Feb 05 17:12:00 s3-us-node11 minio[3973657]: panic: send on closed channel Feb 05 17:12:00 s3-us-node11 minio[3973657]: goroutine 378007076 [running]: Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub.(PubSub[...]).SubscribeJSON.func1() Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub/pubsub.go:139 +0x12d Feb 05 17:12:00 s3-us-node11 minio[3973657]: created by github.com/minio/minio/internal/pubsub.(PubSub[...]).SubscribeJSON in goroutine 378010884 Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub/pubsub.go:124 +0x352 ``` Wait explicitly for the goroutine to exit. Bonus: Listen for doneCh when sending, so as not to risk getting blocked if the channel isn't being emptied.	2024-02-06 08:57:30 -08:00
Harshavardhana	630963fa6b	protect tracker copy properly to avoid race (#18984 ) ``` WARNING: DATA RACE Write at 0x00c000aac1e0 by goroutine 1133: github.com/minio/minio/cmd.(healingTracker).updateProgress() github.com/minio/minio/cmd/background-newdisks-heal-ops.go:183 +0x117 github.com/minio/minio/cmd.(erasureObjects).healErasureSet.func5() github.com/minio/minio/cmd/global-heal.go:292 +0x1d3 Previous read at 0x00c000aac1e0 by goroutine 1003: github.com/minio/minio/cmd.(allHealState).updateHealStatus() github.com/minio/minio/cmd/admin-heal-ops.go:136 +0xcb github.com/minio/minio/cmd.(healingTracker).save() github.com/minio/minio/cmd/background-newdisks-heal-ops.go:223 +0x424 ```	2024-02-06 08:56:59 -08:00
Harshavardhana	f674168b8b	Add missing gob register for map[string]string{} (#18974 ) ``` minio[1303918]: API: SYSTEM() minio[1303918]: Time: 02:04:28 UTC 02/05/2024 minio[1303918]: DeploymentID: 0972de33-2d17-4499-8967-aff6437dd9da minio[1303918]: Error: gob: type not registered for interface: map[string]string (errors.errorString) minio[1303918]: 4: internal/logger/logonce.go:118:logger.(logOnceType).logOnceIf() minio[1303918]: 3: internal/logger/logonce.go:149:logger.LogOnceIf() minio[1303918]: 2: cmd/peer-rest-server.go:533:cmd.(*peerRESTServer).GetSysConfigHandler() minio[1303918]: 1: net/http/server.go:2136:http.HandlerFunc.ServeHTTP() ```	2024-02-06 08:23:23 -08:00
Poorna	27d02ea6f7	metrics: add replication metrics on proxied requests (#18957 )	2024-02-05 22:00:45 -08:00
Harshavardhana	794a7993cb	calculate correct quorum check for metadata updates on object (#18979 ) this fixes rare bugs we have seen but never really found a reproducer for - PutObjectRetention() returning 503s - PutObjectTags() returning 503s - PutObjectMetadata() updates during replication returning 503s These calls return errors, and this perpetuates with no apparent fix. This PR fixes with correct quorum requirement.	2024-02-05 21:44:40 -08:00
Harshavardhana	6f16d1cb2c	do not count context canceled as timeout errors (#18975 )	2024-02-05 18:16:13 -08:00
Anis Eleuch	7aa00bff89	sts: Add support of AssumeRoleWithWebIdentity and DurationSeconds (#18835 ) To force limit the duration of STS accounts, the user can create a new policy, like the following: { "Version": "2012-10-17", "Statement": [{ "Effect": "Allow", "Action": ["sts:AssumeRoleWithWebIdentity"], "Condition": {"NumericLessThanEquals": {"sts:DurationSeconds": "300"}} }] } And force binding the policy to all OpenID users, whether using a claim name or role ARN.	2024-02-05 11:44:23 -08:00
Klaus Post	e046eb1d17	Disable Rename2 metrics on non-linux (#18970 ) Logging a call that always fails is pointless.	2024-02-05 10:48:14 -08:00
Anis Eleuch	ba975ca320	Add defensive code to ignore checking parts with transitioned objects (#18973 ) Though dataErrs are nil with transitioned objects, add more defensive code to ignore counting missing parts in that case.	2024-02-05 10:48:03 -08:00
Harshavardhana	fec13b0ec1	remove unused DiskMTime (#18965 )	2024-02-05 01:04:26 -08:00
Harshavardhana	100c35c281	avoid excessive logs when peer is down (#18969 )	2024-02-04 23:25:42 -08:00
Harshavardhana	f225ca3312	Add more advanced cases for dangling (#18968 )	2024-02-04 14:36:13 -08:00
Frank Wessels	8b68e0bfdc	Fix typo in api-router.go (#18955 )	2024-02-03 14:03:51 -08:00
Anis Eleuch	6ae97aedc9	xl: Disable rename2 in decommissioning/rebalance (#18964 ) Always disable rename2 optimization in decom/rebalance	2024-02-03 14:03:30 -08:00
Harshavardhana	960d604013	disconnected returns, an unexpected error to List() returning 500s (#18959 ) provide the error string appropriately so that the matching of error types works. Also add a string based fallback for the said error.	2024-02-03 01:04:33 -08:00
Harshavardhana	ff80cfd83d	move Make,Delete,Head,Heal bucket calls to websockets (#18951 )	2024-02-02 14:54:54 -08:00
Harshavardhana	99fde2ba85	deprecate disk tokens, instead rely on deadlines and active monitoring (#18947 ) disk tokens usage is not necessary anymore with the implementation of deadlines for storage calls and active monitoring of the drive for I/O timeouts. Functionality kicking off a bad drive is still supported, it's just that we do not have to serialize I/O in the manner tokens would do.	2024-02-02 10:10:54 -08:00
Frank Wessels	31743789dc	Fix some leftover issues from PR 18936 (#18946 )	2024-02-01 19:42:56 -08:00
Anis Eleuch	6fd63e920a	log: Use error log type instead of Application/MinIO type (#18930 ) * log: Use error log type instead of Application/MinIO type Also bump github.com/shirou/gopsutil version to address cross compilation issues. * Apply suggestions from code review Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com> --------- Co-authored-by: Anis Eleuch <anis@min.io> Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2024-02-01 16:13:57 -08:00
Aditya Manthramurthy	59cc3e93d6	fix: `null` inline policy handling for access keys (#18945 ) Interpret `null` inline policy for access keys as inheriting parent policy. Since MinIO Console currently sends this value, we need to honor it for now. A larger fix in Console and in the server are required. Fixes #18939.	2024-02-01 14:45:03 -08:00
Anis Eleuch	61a4bb38cd	batch: Fix a typo while validating smallerThan field (#18942 )	2024-02-01 13:53:26 -08:00
Klaus Post	b192bc348c	Improve object reuse for grid messages (#18940 ) Allow internal types to support a `Recycler` interface, which will allow for sharing of common types across handlers. This means that all `grid.MSS` (and similar) objects are shared across in a common pool instead of a per-handler pool. Add internal request reuse of internal types. Add for safe (pointerless) types explicitly. Only log params for internal types. Doing Sprint(obj) is just a bit too messy.	2024-02-01 12:41:20 -08:00
Harshavardhana	6440d0fbf3	move a collection of peer APIs to websockets (#18936 )	2024-02-01 10:47:20 -08:00
Anis Eleuch	24ecc44bac	Keep ServiceV1 admin stop/restart API and mark as deprecated (#18932 )	2024-01-31 12:20:33 -08:00
Aditya Manthramurthy	0ae4915a93	fix: permission checks for editing access keys (#18928 ) With this change, only a user with `UpdateServiceAccountAdminAction` permission is able to edit access keys. We would like to let a user edit their own access keys, however the feature needs to be re-designed for better security and integration with external systems like AD/LDAP and OpenID. This change prevents privilege escalation via service accounts.	2024-01-31 10:56:45 -08:00
Harshavardhana	caac9d216e	remove all the frivolous logs, that may or may not be actionable (#18922 ) for actionable inspections we have `mc support inspect`; we do not need double logging, since healing will report relevant errors, if any, in terms of quorum lost etc.	2024-01-30 18:11:45 -08:00
Harshavardhana	057192913c	add total usable capacity, free and used to DataUsageInfo() (#18921 )	2024-01-30 17:49:37 -08:00
Harshavardhana	f25cbdf43c	use all the available nr_requests for NVMe (#18920 )	2024-01-30 14:10:06 -08:00
Klaus Post	6da4a9c7bb	Improve tracing & notification scalability (#18903 ) * Perform JSON encoding on remote machines and only forward byte slices. * Migrate tracing & notification to WebSockets.	2024-01-30 12:49:02 -08:00
Harshavardhana	80ca120088	remove checkBucketExist check entirely to avoid fan-out calls (#18917 ) Put, List, and Multipart operations each rely heavily on regular GetBucketInfo() calls to verify whether the bucket exists. This has a large performance cost when there are tons of servers involved. We did optimize this part by vectorizing the bucket calls; however, it's not enough: beyond 100 nodes this becomes fairly visible in terms of performance.	2024-01-30 12:43:25 -08:00
Anis Eleuch	a669946357	Add cgroup v2 support for memory limit (#18905 )	2024-01-30 11:13:27 -08:00
Poorna	7ffc162ea8	exclude veeam virtual objects from replication (#18918 ) Fixes: #18916	2024-01-30 10:43:58 -08:00
Poorna	bcfd7fbbcf	reuse transports for callhome and remote tgt validation (#18912 )	2024-01-29 23:05:39 -08:00
Harshavardhana	486e2e48ea	enable xattr capture by default (#18911 ) - healing must not set the write xattr because that is the job of active healing to update. what we need to preserve is permanent deletes. - remove older env for drive monitoring and enable it accordingly, as a global value.	2024-01-29 23:03:58 -08:00
Harshavardhana	2ddf2ca934	allow configuring maximum idle connections per host (#18908 )	2024-01-29 16:50:37 -08:00
Poorna	29b1a29044	fix metrics panic in node metrics endpoint (#18894 )	2024-01-29 12:32:44 -08:00
jiuker	b4ab8e095a	fix: preserve bucket metric of data usage for replication info (#18895 )	2024-01-29 08:54:20 -08:00
Harshavardhana	cff8235068	remove getReplicationNodeMetrics() from peer metrics groups	2024-01-28 18:45:20 -08:00
Harshavardhana	944f3c1477	remove local disk metrics from cluster metrics (#18886 ) local disk metrics were polluting cluster metrics. Please remove them instead of adding relevant ones. - batch job metrics were incorrectly kept at bucket metrics endpoint, move it to cluster metrics. - add tier metrics to cluster peer metrics from the node. - fix missing set level cluster health metrics	2024-01-28 12:53:59 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace for 'nil' channel closes(), if not safely closes it.	2024-01-28 10:04:17 -08:00

5806 Commits