minio

mirror of https://github.com/minio/minio.git synced 2025-12-02 06:07:51 -05:00

Author	SHA1	Message	Date
Harshavardhana	fd37418da2	fix: allow server not initialized error to be retried (#18300 ) Since relaxing quorum the error across pools for ListBuckets(), GetBucketInfo() we hit a situation where loading IAM could potentially return an error for second pool that server is not initialized. We need to handle this, let the pool come online and retry transparently - this PR fixes that.	2023-10-23 12:30:20 -07:00
Harshavardhana	bbfea29c2b	use object modTime for the event sequencer ID (#18285 ) always set modTime after lock is acquired in completemultipart stage to make sure that the modTime is not racy.	2023-10-20 19:28:05 -07:00
Harshavardhana	aa703dc903	relax write quorum requirement for ListBuckets()/HeadBucket() (#18288 ) Also fix error handling for HeadBucket() to be pool specific	2023-10-20 17:50:21 -07:00
Harshavardhana	780882efcf	do not check for query params to be signed headers (#18283 ) x-amz-signed-headers is meant for HTTP headers only not for query params, using that to verify things further can lead to failure. The generated presigned URL with custom metadata is already kosher (tamper proof). fixes #18281	2023-10-19 21:32:49 -07:00
Klaus Post	ba6218b354	fix: resource metrics "concurrent map iteration and map write" (#18273 ) `resourceMetricsMap` has no protection against concurrent reads and writes. Add a mutex and don't use maps from the last iteration. Bug introduced in #18057 Fixes #18271	2023-10-18 13:28:50 -07:00
Harshavardhana	8e32de3ba9	cache DiskInfo() metrics call separately (#18270 )	2023-10-18 11:17:32 -07:00
Klaus Post	e37508fb8f	fix: linter errors in Windows specific code (#18276 )	2023-10-18 11:08:15 -07:00
Klaus Post	b46a717425	Remove unused config migration (#18277 ) None of the migration is called. Remove dead code.	2023-10-18 11:05:24 -07:00
Klaus Post	7926df0b80	Fix globalDeploymentID race (#18275 ) globalDeploymentID was being read while it was being set. Fixes race: ``` WARNING: DATA RACE Write at 0x0000079605a0 by main goroutine: github.com/minio/minio/cmd.connectLoadInitFormats() github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0 github.com/minio/minio/cmd.waitForFormatErasure() github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d ... Previous read at 0x0000079605a0 by goroutine 105: github.com/minio/minio/cmd.newContext() github.com/minio/minio/cmd/utils.go:817 +0x31e github.com/minio/minio/cmd.adminMiddleware.func1() github.com/minio/minio/cmd/admin-router.go:110 +0x96 net/http.HandlerFunc.ServeHTTP() net/http/server.go:2136 +0x47 github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1() github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a net/http.HandlerFunc.ServeHTTP() net/http/server.go:2136 +0x47 ... ```	2023-10-18 08:06:57 -07:00
Harshavardhana	f91b257f50	choose different max_concurrent requests per drive based on HDD/NVMe (#18254 ) currently the default for all drives is 512, which is a lot for HDDs the recent testing has revealed moving this to 32 for HDDs seems like a fair value.	2023-10-16 17:18:13 -07:00
Harshavardhana	edfb310a59	fix: always load ENVs from files first as soon as server starts (#18247 ) This is a regression from #18231, however reading from ENV files must happen well before any parsing logic is invoked.	2023-10-15 21:13:43 -07:00
Poorna	78f1f69d57	fix site replication resync status (#18245 ) To persist status changes on disk upon completion. Adds new tests to handle this functionality.	2023-10-13 22:17:22 -07:00
Harshavardhana	e1e33077e8	fix: tests and resync replication status (#18244 )	2023-10-13 17:03:34 -07:00
Aditya Manthramurthy	b3e7de010d	Remove usage of errors.Join for go1.19 compat (#18243 )	2023-10-13 15:14:16 -07:00
Shireesh Anjal	bf1c6edb76	Revert "Capture network device info in health report" (#18241 ) Introducing a new version of healthinfo struct for adding this info is not correct. It needs to be implemented differently without adding a new version. This reverts commit 8737025d940f80360ed4b3686b332db5156f6659.	2023-10-13 07:46:36 -07:00
jiuker	2ac7fee017	fix: missing fileName will upload failed when PostPolicyBucketHandler (#18240 )	2023-10-13 07:31:23 -07:00
Klaus Post	128256e3ab	Add event counters (#18232 ) Export metric for global events sent and skipped for the lifetime of the server.	2023-10-12 15:39:22 -07:00
Shireesh Anjal	a66a7f3e97	Capture network device info in health report (#18213 )	2023-10-12 15:33:31 -07:00
jiuker	20b79f8945	fix: env depend on the flag (#18231 )	2023-10-12 15:32:38 -07:00
Klaus Post	9a877734b2	Fix various poolmeta races (#18230 ) There is a fundamental race condition in `newErasureServerPools`, where setObjectLayer is called before the poolMeta has been loaded/populated. We add a placeholder value to this field but disable all saving of the value, so we don't risk overwriting the value on disk. Once the value has been loaded or created, it is replaced with the proper value, which will also be saved. Also fixes various accesses of `poolMeta` that were done without locks. We make the `poolMeta.IsSuspended` return false, even if we shouldn't risk out-of-bounds reads anymore.	2023-10-12 15:30:42 -07:00
Harshavardhana	409c391850	implement helpers to get relevant info instead of FileInfo() (#18228 )	2023-10-12 15:29:59 -07:00
jiuker	000928d34e	fix: should call func globalOSMetrics.time(s)() when updateOSMetrics (#18209 )	2023-10-12 00:08:13 -07:00
Harshavardhana	6829ae5b13	completely remove drive caching layer from gateway days (#18217 ) This has already been deprecated for close to a year now.	2023-10-11 21:18:17 -07:00
jiuker	f09756443d	fix: a dynamic config will make a panic for addOrUpdateIDP (#18208 )	2023-10-11 09:06:40 -07:00
jiuker	5512016885	fix: siteResyncMetrics init will make a deadlock when len(siteReplication) >= 3 (#18206 )	2023-10-10 23:27:27 -07:00
Harshavardhana	21ecb941fe	fix: avoid counting out of band deletes during disk heal (#18205 )	2023-10-10 14:39:48 -07:00
Harshavardhana	77e94087cf	fix: calling statfs() call moves the disk head (#18203 ) if erasure upgrade is needed rely on the in-memory values, instead of performing a "DiskInfo()" call. https://brendangregg.com/blog/2016-09-03/sudden-disk-busy.html for HDDs these are problematic, lets avoid this because there is no value in "being" absolutely strict here in terms of parity. We are okay to increase parity as we see based on the in-memory online/offline ratio.	2023-10-10 13:47:35 -07:00
Klaus Post	9ab1f25a47	fix : PutObjectExtract data races (#18199 ) Several callers to putObjectTar may be fighting to set sc. Move the write out of the loop. Use static resp, and request elements. Fixes tests with -race: ``` WARNING: DATA RACE Read at 0x00c01cd680e0 by goroutine 691354: github.com/minio/minio/cmd.objectAPIHandlers.PutObjectExtractHandler.func1() e:/gopath/src/github.com/minio/minio/cmd/object-handlers.go:2130 +0x149 github.com/minio/minio/cmd.untar.func1() e:/gopath/src/github.com/minio/minio/cmd/untar.go:250 +0x2b6 github.com/minio/minio/cmd.untar.func8() e:/gopath/src/github.com/minio/minio/cmd/untar.go:261 +0xa4 Previous write at 0x00c01cd680e0 by goroutine 691352: github.com/minio/minio/cmd.objectAPIHandlers.PutObjectExtractHandler.func1() e:/gopath/src/github.com/minio/minio/cmd/object-handlers.go:2131 +0x15d github.com/minio/minio/cmd.untar.func1() e:/gopath/src/github.com/minio/minio/cmd/untar.go:250 +0x2b6 github.com/minio/minio/cmd.untar.func8() e:/gopath/src/github.com/minio/minio/cmd/untar.go:261 +0xa4 ```	2023-10-10 08:36:44 -07:00
jiuker	aaab7aefbe	fix: avoid nil panic upon error in GetObjectNInfo via InnerGetObjectNInfoFn (#18198 )	2023-10-10 08:35:33 -07:00
Klaus Post	5b8599e52d	Do not log invalid tag errors (#18200 ) Eliminate logging on invalid tags: ``` API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325) Time: 07:40:33 UTC 10/10/2023 DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71 RequestID: 178CAF0DE0A67480 RemoteHost: 127.0.0.1 Host: 127.0.0.1:9001 UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64) Error: Tags cannot be more than 10 (tags.errTag) 5: internal\logger\logger.go:259:logger.LogIf() 4: cmd\api-errors.go:2350:cmd.toAPIErrorCode() 3: cmd\api-errors.go:2375:cmd.toAPIError() 2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler() 1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP() API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325) Time: 07:40:33 UTC 10/10/2023 DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71 RequestID: 178CAF0DE0BEA514 RemoteHost: 127.0.0.1 Host: 127.0.0.1:9001 UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64) Error: Cannot provide multiple Tags with the same key (tags.errTag) 5: internal\logger\logger.go:259:logger.LogIf() 4: cmd\api-errors.go:2350:cmd.toAPIErrorCode() 3: cmd\api-errors.go:2375:cmd.toAPIError() 2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler() 1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP() API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325) Time: 07:40:33 UTC 10/10/2023 DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71 RequestID: 178CAF0DE0E78970 RemoteHost: 127.0.0.1 Host: 127.0.0.1:9001 UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64) Error: The TagKey you have provided is invalid (tags.errTag) 5: internal\logger\logger.go:259:logger.LogIf() 4: cmd\api-errors.go:2350:cmd.toAPIErrorCode() 3: cmd\api-errors.go:2375:cmd.toAPIError() 2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler() 1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP() API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325) Time: 07:40:33 UTC 10/10/2023 DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71 RequestID: 178CAF0DE1002AE8 RemoteHost: 127.0.0.1 Host: 127.0.0.1:9001 UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64) Error: The TagValue you have provided is invalid (tags.errTag) 5: internal\logger\logger.go:259:logger.LogIf() 4: cmd\api-errors.go:2350:cmd.toAPIErrorCode() 3: cmd\api-errors.go:2375:cmd.toAPIError() 2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler() 1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP() ```	2023-10-10 08:35:03 -07:00
Harshavardhana	74e0c9ab9b	reduce unnecessary logging, simplify certain error handling (#18196 ) remove a bunch of unnecessary logs	2023-10-10 00:33:42 -07:00
Harshavardhana	dcce83b288	avoid rebalance state for getObjectTags if any (#18197 ) fixes #18190	2023-10-09 23:56:26 -07:00
Matthew Toohey	f731e7ea36	Fix current_send_in_progress metric always being zero (#18160 )	2023-10-09 17:28:17 -07:00
Maxim Tkachenko	ec30bb89a4	simplify channel send() in WalkDir() (#18186 )	2023-10-09 17:27:55 -07:00
Klaus Post	7cd08594f6	Use better host names for metric errors (#18188 ) Typically hosts would end up like this: ``` "hosts": [ ":9000", ":9000", ":9000", ... ``` Also add host name to errors.	2023-10-09 17:27:11 -07:00
Aditya Manthramurthy	2b4531f069	fix: O_DIRECT is on only for multi-disk setups (#18194 ) Disable it for single disk/unsupported platforms	2023-10-09 17:08:40 -07:00
Harshavardhana	11544a62aa	fix: upon write failure on disk journal close the file properly (#18183 ) close the file properly before dereferencing *os.File, this can silently leak fd's in rare cases. This PR fixes this properly.	2023-10-08 12:17:08 -07:00
Taran Pelkey	18550387d5	fix: DeleteServiceAccount API behavior (#18163 )	2023-10-08 12:13:18 -07:00
Klaus Post	0de2b9a1b2	Fix panic on double unfreezeServices (#18177 ) Calling unfreezeServices twice results in panic: ``` panic: "POST /minio/peer/v32/signalservice?signal=4&sub-sys=": close of nil channel goroutine 14703 [running]: runtime/debug.Stack() runtime/debug/stack.go:24 +0x65 github.com/minio/minio/cmd.setCriticalErrorHandler.func1.1() github.com/minio/minio/cmd/generic-handlers.go:549 +0x8e panic({0x27c3020, 0x4c9b370}) runtime/panic.go:884 +0x212 github.com/minio/minio/cmd.unfreezeServices() github.com/minio/minio/cmd/service.go:112 +0xc7 github.com/minio/minio/cmd.(*peerRESTServer).SignalServiceHandler(0x0?, {0x4cb6af0, 0xc010b96420}, 0xc01affab00) github.com/minio/minio/cmd/peer-rest-server.go:837 +0x13a net/http.HandlerFunc.ServeHTTP(...) ``` If the function was called a second time `val` would not be nil, but the returned channel `ch` would be, causing the panic. Check the channel isn't nil and also use Swap for an atomic swap instead of 2 separate operations (though we are in a mutex).	2023-10-06 07:51:50 -06:00
Poorna	9dc29d7687	Avoid ILM expiry on deleted versions that are yet to replicate (#18175 ) Fixes #18167	2023-10-06 06:55:15 -06:00
Poorna	72871dbb9a	delete replication: avoid overwriting replication decision (#18174 ) from ObjectInfo unless version purge status is present. Otherwise there is potential to make incorrect replication decision if Stat returned an error	2023-10-05 21:09:45 -06:00
Aditya Manthramurthy	4bda4e4e2b	fix: check for disk-level O_DIRECT support (#18173 ) Disk level O_DIRECT support checking at xl storage initialization was conditional on a config setting being enabled. (This never took effect because config initialization happens after ObjectLayer is ready.) This is not necessary as the config setting is dynamic - O_DIRECT should be enabled via runtime config. So we need to do the disk level support check regardless of the config setting.	2023-10-05 20:54:49 -06:00
Harshavardhana	1971c54a50	update buffer channels for both trace and listen events (#18171 ) - Trace needs higher buffered channels than 4000 to ensure when we run `mc admin trace -a` it captures all information sufficiently. - Listen event notification needs the event channel to be `apiRequestsMaxPerNode` * number of nodes	2023-10-05 18:16:04 -06:00
Anis Eleuch	b336e9a79f	fix: loading usage cache to not fail early when reading the backup fails (#18158 ) Currently, the retry is not fully used when there is no backup copy of the data usage; use 5 retry attempts when we don't have any valid data, new or backup, unless we have seen an un-recognized error.	2023-10-02 19:22:35 -07:00
Harshavardhana	a2ab21e91c	add max-keys=2 optimization for spark workloads (#18154 ) comment in the code provides more detailed explanation on what this PR entails and its assumptions. this PR reduces the amount of listing() by an order of magnitude, however there are other such calls that still needs further optimization that shall be done in subsequent PRs.	2023-10-02 07:52:59 -06:00
Sveinn	603437e70f	Fix startup formatting (#18156 ) Percentages in root user names are used for formatting. Before: ``` S3-API: http://192.168.50.21:9000 http://172.31.96.1:9000 http://127.0.0.1:9000 RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb" RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb" Console: http://192.168.50.21:51915 http://172.31.96.1:51915 http://127.0.0.1:51915 RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb" RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb" Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart FORMAT: %117s MESSAGE: $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb" $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%!a(MISSING)vZb" "Q4#Q6y8G%Px#npP4dudUobU#NBcGB7RMKV4ajYb" ``` After: ``` Status: 1 Online, 0 Offline. S3-API: http://192.168.50.21:9000 http://172.31.96.1:9000 http://127.0.0.1:9000 RootUser: "U4B6Zi!b75DXSPm%avZb" RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb" Console: http://192.168.50.21:52421 http://172.31.96.1:52421 http://127.0.0.1:52421 RootUser: "U4B6Zi!b75DXSPm%avZb" RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb" Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb" ``` No need for special Windows case. `mc` works just fine.	2023-10-02 07:39:47 -06:00
Shireesh Anjal	6d20ec3bea	Add support for resource metrics (#18057 ) Add a new endpoint for "resource" metrics `/v2/metrics/resource` This should return system metrics related to drives, network, CPU and memory. Except for drives, other metrics should have corresponding "avg" and "max" values also. Reuse the real-time feature to capture the required data, introducing CPU and memory metrics in it. Collect the data every minute and keep updating the average and max values accordingly, returning the latest values when the API is called.	2023-09-30 13:40:20 -07:00
Anis Eleuch	22d2dbc4e6	decom: Fix infinite retry when the decom is canceled (#18143 ) Also, use rand.Float64() since it is thread-safe; otherwise go race will complain.	2023-09-30 00:02:29 -07:00
Harshavardhana	d6446cb096	do not return an error in AbortMultipartUpload() (#18135 ) returning an error is a bit undefined in AWS S3 as it may return an error or not depending on the time from AbortMultipartUpload().	2023-09-29 10:28:19 -07:00
Harshavardhana	c34bdc33fb	make sure to set Versioned field to ensure rename2 is not called (#18141 ) without this the rename2() can rename the previous dataDir causing issues for different versions of the object, only latest version is preserved due to this bug. Added healing code to ensure recovery of such content.	2023-09-29 09:08:24 -07:00
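
Note on ba6218b354 (resource metrics "concurrent map iteration and map write"): the commit message says the fix was to add a mutex and to stop using maps from the last iteration. The sketch below illustrates that general pattern only, not the actual MinIO code. The type `metricsStore` and its methods are hypothetical names chosen for this example; the real `resourceMetricsMap` may be structured quite differently.

```go
package main

import (
	"fmt"
	"sync"
)

// metricsStore is a hypothetical stand-in for a shared metrics map.
// It is not the MinIO type; it only illustrates the locking pattern.
type metricsStore struct {
	mu   sync.RWMutex
	data map[string]float64
}

func newMetricsStore() *metricsStore {
	return &metricsStore{data: make(map[string]float64)}
}

// Set writes one value while holding the write lock.
func (s *metricsStore) Set(name string, v float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[name] = v
}

// Snapshot returns a copy of the map, so callers can iterate over it
// without holding the lock and without seeing later writes.
func (s *metricsStore) Snapshot() map[string]float64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[string]float64, len(s.data))
	for k, v := range s.data {
		out[k] = v
	}
	return out
}

func main() {
	s := newMetricsStore()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Set(fmt.Sprintf("cpu%d", i), float64(i))
			// Iterating a snapshot is safe even while other goroutines write.
			for range s.Snapshot() {
			}
		}(i)
	}
	wg.Wait()
	fmt.Println(len(s.Snapshot())) // prints 8
}
```

Running a program like this with `go run -race` is one way to check that no unsynchronized map access remains; copying on read trades some allocation for never handing a live map to a reader.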

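
Note on 0de2b9a1b2 (panic on double unfreezeServices): the commit message describes a second call closing a nil channel, and a fix that checks the channel for nil and uses a single atomic Swap instead of two separate operations. The sketch below shows that idea in isolation. The `services` type and its `freeze`/`unfreeze` methods are hypothetical names for this example, and it uses `atomic.Pointer` (Go 1.19+), which may not be what the MinIO code uses.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// services is a hypothetical holder for a "frozen" channel.
// A nil pointer means the services are not frozen.
type services struct {
	frozen atomic.Pointer[chan struct{}]
}

// freeze installs a new channel unless one is already present.
func (s *services) freeze() {
	ch := make(chan struct{})
	s.frozen.CompareAndSwap(nil, &ch)
}

// unfreeze atomically takes the stored channel and closes it.
// It reports whether this call actually released anything, so a
// second call is a harmless no-op instead of a panic.
func (s *services) unfreeze() bool {
	p := s.frozen.Swap(nil) // read and clear in one atomic step
	if p == nil || *p == nil {
		return false
	}
	close(*p)
	return true
}

func main() {
	var s services
	s.freeze()
	fmt.Println(s.unfreeze()) // prints true
	fmt.Println(s.unfreeze()) // prints false, no panic
}
```

Because Swap both reads and clears the value, only one caller can ever obtain the channel, which is what makes the repeated call safe.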

5616 Commits