minio

mirror of https://github.com/minio/minio.git synced 2025-12-01 13:52:34 -05:00

Author	SHA1	Message	Date
Harshavardhana	72cff79c8a	add missing STS accounts loading (#20279 ) PR #20268 missed loading STS accounts map properly	2024-08-16 18:24:54 -07:00
Harshavardhana	a5702f978e	remove requests deadline, instead just reject the requests (#20272 ) Additionally set - x-ratelimit-limit - x-ratelimit-remaining To indicate the request rates.	2024-08-16 01:43:49 -07:00
Poorna	4687c4616f	try loading temp account if not in cache (#20266 )	2024-08-15 23:12:42 -07:00
Harshavardhana	cc0c41d216	remove region locks and make them simpler (#20268 ) - single flight approach is now optional, instead of default. - parallelize the loaders up to 32 items per assets (more room for improvement possible)	2024-08-15 08:41:03 -07:00
Klaus Post	f1302c40fe	Fix uninitialized replication stats (#20260 ) Services are unfrozen before `initBackgroundReplication` is finished. This means that the globalReplicationStats write is racy. Switch to an atomic pointer. Provide the `ReplicationPool` with the stats, so it doesn't have to be grabbed from the atomic pointer on every use. All other loads and checks are nil, and calls return empty values when stats still haven't been initialized.	2024-08-15 05:04:40 -07:00
Klaus Post	d96798ae7b	Add support profile deadlines and concurrent operations (#20244 ) * Allow a maximum of 10 seconds to start profiling operations. * Download up to 16 profiles concurrently, but only allow 10 seconds for each (does not include write time). * Add cluster info as the first operation. * Ignore remote download errors. * Stop remote profiles if the request is terminated.	2024-08-15 03:36:00 -07:00
Anis Eleuch	b508264ac4	sr: Avoid recursion when loading site replicator credentials (#20262 ) If site replication is enabled and the code tries to extract JWT claims while the site replication service account credentials are not yet loaded, the code will enter an infinite loop, causing high CPU usage. Another possible cause of the infinite loop is having some service accounts created by an old deployment version where the service account JWT was signed by the root credentials, but not anymore. This commit removes the possibility of the infinite loop in the code and adds a root credential fallback to extract claims from old service accounts.	2024-08-14 18:29:20 -07:00
Harshavardhana	db78431b1d	avoid crash when initializing bucket quota cache (#20258 )	2024-08-14 17:34:56 -07:00
Klaus Post	3ffeabdfcb	Fix govet+staticcheck issues (#20263 ) This is better: https://github.com/golang/go/issues/60529	2024-08-14 10:11:51 -07:00
Anis Eleuch	51b1f41518	heal: Persist MRF queue in the disk during shutdown (#19410 )	2024-08-13 15:26:05 -07:00
Harshavardhana	e7a56f35b9	flatten out audit tags, do not send as free-form (#20256 ) move away from map[string]interface{} to map[string]string to simplify the audit, and also provide concise information. avoids large allocations under load(), reduces the amount of audit information generated, as the current implementation was a bit free-form. instead all datastructures must be flattened.	2024-08-13 15:22:04 -07:00
rubyisrust	516af01a12	chore: fix some function names (#20243 ) Signed-off-by: rubyisrust <rustrover@icloud.com>	2024-08-13 11:23:33 -07:00
Harshavardhana	acdb355070	update deps and update azure WARM tier implementation (#20247 )	2024-08-13 11:21:34 -07:00
Mark Theunissen	37c02a5f7b	Add dummy DeleteBucketCors for safety (#20253 )	2024-08-13 08:25:16 -07:00
Krishnan Parthasarathi	04be352ae9	Relax quorum agreement on DataDir values (#20232 ) Previously, we checked if we had a quorum on the DataDir value. We are removing this check, which allows reading objects with different DataDir values in a few drives (due to a rebalance-stop race bug) provided their eTags or ModTimes match.	2024-08-12 12:02:21 -07:00
Klaus Post	53eb7656de	Add admin info timeouts (#20249 ) Since a lot of operations load from storage, do remote calls, add a 10 second timeout to each operation. This should make `mc admin info` return values even under extreme conditions.	2024-08-12 10:24:29 -07:00
Harshavardhana	2e0fd2cba9	implement a safer completeMultipart implementation (#20227 ) - optimize writing part.N.meta by writing both part.N and its meta in sequence without network component. - remove part.N.meta, part.N which were partially successful, in quorum loss situations during renamePart() - allow for strict read quorum check arbitrated via ETag for the given part number, this makes it doubly safe upon final commit. - return an appropriate error when read quorum is missing, instead of returning InvalidPart{}, which is a non-retryable error. This kind of situation can happen when many nodes are going offline in rotation, an example of such a restart() behavior is statefulset updates in k8s. fixes #20091	2024-08-12 01:38:15 -07:00
Harshavardhana	909b169593	avoid source index to be same as destination index (#20238 ) during rebalance stop, it is possible for Put() to race by overwriting the same object again. If this overwrite is done "successfully", it can potentially proceed to delete the object from the pool, causing data loss. This PR enhances #20233 to handle more scenarios such as these.	2024-08-09 19:30:44 -07:00
Krishnan Parthasarathi	4e67a4027e	Prevent overwrites due to rebalance-stop race (#20233 ) Rebalance-stop can race with ongoing rebalance operations. This change prevents these operations from overwriting objects by checking the source and destination pool indices are different.	2024-08-08 19:05:14 -07:00
Klaus Post	49055658a9	Fix missing hash in GetObjectAttributes (#20231 ) SHA256/SHA1 were mixed up. Simplify code as well.	2024-08-08 13:19:41 -07:00
Harshavardhana	89c58ce87d	enhance getActualSize() to return valid values for most situations (#20228 )	2024-08-08 08:29:58 -07:00
Mark Theunissen	2681219039	Add dummy PutBucketCors for functional test compatibility (#20220 )	2024-08-06 08:41:38 -07:00
Harshavardhana	dea9abed29	use singleflight when bucket metadata is reloaded() (#20216 ) this allows for de-duplicating the callers when called concurrently, allowing for bucketmetadata reads to be single call. All concurrent callers will get the same data as the first one.	2024-08-05 09:50:11 -07:00
Harshavardhana	e3eb5c1328	batch-exp: Remove 1000 maximum objects per call (#20212 ) It seems ObjectAPI.DeleteObjects() is clogging up when it is removing 10k versions of a single object. Authored-by: Anis Eleuch <anis@min.io>	2024-08-04 21:55:25 -07:00
Poorna	74c047cb03	fix replication last hour metric (#20199 ) also adding missing recent_backlog_count metric to v3 metrics	2024-08-01 17:55:27 -07:00
jiuker	50a5ad48fc	feat: support batch replication prefix slice (#20033 )	2024-08-01 05:53:30 -07:00
Harshavardhana	a9dc061d84	count metrics properly for any failures during drive heal (#20193 ) or via `mc admin heal --set 1 --pool 1`	2024-07-30 22:46:26 -07:00
Krishnan Parthasarathi	01a8c09920	Add fmt-gen subcommand (#20192 ) fmt-gen subcommand is only available when built with build tag `fmtgen`.	2024-07-30 15:59:48 -07:00
Aditya Manthramurthy	4c8562bcec	Fix v2 metrics: Send all ttfb api labels (#20191 ) Fix a regression in #19733 where TTFB metrics for all APIs except GetObject were removed in v2 and v3 metrics. This causes breakage for existing v2 metrics users. Instead we continue to send TTFB for all APIs in V2 but only send for GetObject in V3.	2024-07-30 15:28:46 -07:00
Harshavardhana	f13c04629b	allow multipart uploads expiration to be dynamic (#20190 ) It would seem that new values for multipart_expiration take effect only after a restart. This PR fixes this by making it dynamic as it should have been.	2024-07-30 12:01:06 -07:00
Harshavardhana	80ff907d08	add DeleteBulk support, add sufficient deadlines per rename() (#20185 ) deadlines per moveToTrash() allows for a more granular timeout approach for syscalls, instead of an aggregate timeout. This PR also enhances multipart state cleanup to be optimal by removing 100's of multipart network rename() calls into single network call.	2024-07-29 18:56:40 -07:00
Poorna	2d40433bc1	remove replication throttle deadline for objects > 128MiB (#20184 ) context deadline was introduced to avoid a slow transfer from blocking replication queue(s) shared by other buckets that may not be under throttling. This PR removes this context deadline for larger objects since they are anyway restricted to a limited set of workers. Otherwise, objects would get dequeued when the throttle limit is exceeded and cannot proceed within the deadline.	2024-07-29 15:14:52 -07:00
Harshavardhana	a17f14f73a	separate lock from common grid to avoid epoll contention (#20180 ) epoll contention on TCP causes latency build-up when we have high volume ingress. This PR is an attempt to relieve this pressure. upstream issue https://github.com/golang/go/issues/65064 It seems to be a deeper problem; haven't yet tried the fix provided in this issue; however, this change helps without changing the compiler. Of course, this is a workaround for now, hoping for a more comprehensive fix from Go runtime.	2024-07-29 11:10:04 -07:00
Poorna	6651c655cb	fix replication of checksum when encryption is enabled (#20161 ) - Adding functional tests - Return checksum header on GET/HEAD, previously this was returning InvalidPartNumber error	2024-07-29 01:02:16 -07:00
Harshavardhana	3ae104edae	change Read* calls over net/http to move to http.MethodGet (#20173 ) - ReadVersion - ReadFile - ReadXL Further changes include - Compact internode resource RPC paths - Compact internode query params This optimizes parsing by gorilla/mux, since latency in gorilla/mux increases with the length of this string - reduce to a meaningful string.	2024-07-29 01:00:12 -07:00
jiuker	c87a489514	fix: support prefix when batchJob replicate enable the snowball (#20178 )	2024-07-29 00:59:50 -07:00
Poorna	641a56da0d	fix panic in replication queuing (#20169 ) Regression from #20077 ``` Jul 26 19:08:29 minio-dr-0101a minio[275423]: Error: grid handler (NSScanner) panic: runtime error: index out of range [4] with length 1 (errors.errorString) Jul 26 19:08:29 minio-dr-0101a minio[275423]: 33: internal/logger/logger.go:268:logger.LogIf() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 32: internal/grid/connection.go:50:grid.gridLogIf() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 31: internal/grid/muxserver.go:234:grid.(muxServer).handleRequests.func1() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 30: cmd/bucket-replication.go:2165:cmd.(ReplicationPool).queueReplicaTask() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 29: cmd/bucket-replication.go:3440:cmd.queueReplicationHeal() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 28: cmd/data-scanner.go:1396:cmd.(scannerItem).healReplication() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 27: cmd/data-scanner.go:1220:cmd.(scannerItem).applyActions() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 26: cmd/xl-storage.go:627:cmd.(xlStorage).NSScanner.func2() ```	2024-07-26 13:48:21 -07:00
Harshavardhana	a16193bb50	remove fdatasync() discard, we write with O_SYNC (#20168 ) fdatasync() discard for page-cached READs is not needed, it would seem like this can cause latencies in situations when things are loaded.	2024-07-26 10:27:56 -07:00
jiuker	132e7413ba	fix: check once ready for site-replication (#20149 )	2024-07-26 10:27:42 -07:00
Klaus Post	1966668066	Avoid Batch Replication Job log spam (#20158 ) Only print once per job and error location. Set default retry to default 1 second wait, and use as minimum.	2024-07-26 05:55:50 -07:00
Harshavardhana	064f36ca5a	move to GET for internal stream READs instead of POST (#20160 ) the main reason is to let Go net/http perform necessary bookkeeping properly, and in essence, from a consistency point of view, it's GETs all the way. Deprecate sendFile() as it's buggy inside Go runtime.	2024-07-26 05:55:01 -07:00
Krishnan Parthasarathi	4a1edfd9aa	Different read quorum for tiered objects (#20115 ) For a non-tiered object, MinIO requires that EcM (# of data blocks) of xl.meta agree, corresponding to the number of data blocks needed to read this object. OTOH, tiered objects have metadata in the hot tier and data in the warm tier. The data and its integrity are offloaded to the warm tier. This allows us to reduce the read quorum from EcM (typically > N/2, where N - erasure stripe width) to N/2 + 1. The simple majority of metadata ensures consensus on what the object is and where it is located.	2024-07-25 14:02:50 -07:00
Anis Eleuch	b7f319b62a	properly reload a fresh drive when found in a failed state during startup (#20145 ) When a drive is in a failed state when a single node multiple drives deployment is started, a replacement of a fresh disk will not be properly healed unless the user restarts the node. Fix this by always adding the new fresh disk to globalLocalDrivesMap. Also remove globalLocalDrives for simplification, a map to store local node drives can still be used since the order of local drives of a node is not defined.	2024-07-24 16:30:33 -07:00
Anis Eleuch	33c101544d	kms: Expose API when bucket federation is enabled (#20143 ) When the bucket federation feature is enabled, KMS API calls such as `mc admin kms key list` do not work. This commit fixes the issue by disabling bucket forwarding when this is a KMS request.	2024-07-24 15:44:29 -07:00
Harshavardhana	3b21bb5be8	use unixNanoTime instead of time.Time in lockRequestorInfo (#20140 ) Bonus: Skip Source, Quorum fields in lockArgs that are never sent during Unlock() phase.	2024-07-24 03:24:01 -07:00
Harshavardhana	6fe2b3f901	avoid sendFile() for ranges or object lengths < 4MiB (#20141 )	2024-07-24 03:22:50 -07:00
Taran Pelkey	b368d4cc13	Fix `updateGroupMembershipsForLDAP` behavior with unicode (#20137 )	2024-07-23 19:10:03 -07:00
Klaus Post	0680af7414	Set O_NONBLOCK for reads and writes on unix (#20133 ) Tracing syscalls, opening and reading an `xl.meta` looks like this: ``` openat(AT_FDCWD, "/mnt/drive1/ss8-old/testbucket/ObjSize4MiBThreads72/(554O51H/peTb(0iztdbTKw59.csv/xl.meta", O_RDONLY|O_NOATIME|O_CLOEXEC) = 34 <0.000> fcntl(34, F_GETFL) = 0x48000 (flags O_RDONLY|O_LARGEFILE|O_NOATIME) <0.000> fcntl(34, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) = 0 <0.000> epoll_ctl(4, EPOLL_CTL_ADD, 34, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=3172471557, u64=8145488475984499461}}) = -1 EPERM (Operation not permitted) <0.000> fcntl(34, F_GETFL) = 0x48800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) <0.000> fcntl(34, F_SETFL, O_RDONLY|O_LARGEFILE|O_NOATIME) = 0 <0.000> fstat(34, {st_mode=S_IFREG|0644, st_size=354, ...}) = 0 <0.000> read(34, "XL2 \1\0\3\0\306\0\0\1P\2\2\1\304$\225\304\20\0\0\0\0\0\0\0\0\0\0\0"..., 354) = 354 <0.000> close(34) = 0 <0.000> ``` Everything until `fstat` is the `os.Open` call. Looking at the code: https://github.com/golang/go/blob/master/src/os/file_unix.go#L212-L243 It seems for every file it "tries" to see if it is pollable. This causes `syscall.SetNonblock(fd, true)` to be called. This is the first `F_SETFL`. It then calls `f.pfd.Init("file", true)`. This will attempt to set it as pollable using `epoll_ctl`. This will always fail for files. It therefore calls `syscall.SetNonblock(fd, false)` resulting in the second `F_SETFL`. If we set the `O_NONBLOCK` call on the initial open, we should avoid the 4 `fcntl` syscalls per file. I don't see any way to avoid the `epoll_ctl` call, since kind is either `kindOpenFile` or `kindNonBlock`, so "pollable" will always be true. However avoiding 4 of 6 syscalls still seems worth it. This should not have any effect, since files will end up with "nonblock" anyway.	2024-07-23 09:36:24 -07:00
Harshavardhana	91805bcab6	add optimizations to bring performance on unversioned READS (#20128 ) allow non-inlined data on disk to be inlined via an unversioned ReadVersion() call; ReadXL() is needed only to resolve objects with multiple versions. The choice of this block size is dynamic and can be set by the user via `mc admin config set` Other bonus things - Start measuring internode TTFB performance. - Set TCP_NODELAY, TCP_CORK for low latency	2024-07-23 03:53:03 -07:00
jiuker	b3a94c4e85	fix: Use xtime duration to parse batch job (#20117 )	2024-07-23 00:05:53 -07:00
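
The read quorum rule described in commit 4a1edfd9aa (#20115) can be sketched as below. This is only an illustration of the arithmetic stated in that commit message, not MinIO's actual code: the function name `readQuorum` and its parameters are hypothetical, and the 16-drive, 12-data-block layout is an assumed example.

```go
package main

import "fmt"

// readQuorum is a hypothetical helper (not a MinIO function) that returns how
// many xl.meta copies must agree before an object is read, following the rule
// in the commit message:
//   - non-tiered objects need EcM (the number of data blocks) to agree;
//   - tiered objects need only a simple majority, N/2 + 1, where N is the
//     erasure stripe width.
func readQuorum(stripeWidth, dataBlocks int, tiered bool) int {
	if tiered {
		return stripeWidth/2 + 1
	}
	return dataBlocks
}

func main() {
	// Assumed example layout: 16 drives per stripe, 12 data + 4 parity blocks.
	const stripeWidth, dataBlocks = 16, 12

	fmt.Println("non-tiered:", readQuorum(stripeWidth, dataBlocks, false))
	fmt.Println("tiered:", readQuorum(stripeWidth, dataBlocks, true))
}
```

With the assumed layout, a non-tiered object needs 12 agreeing copies of xl.meta while a tiered object needs 9, which is the reduction the commit describes for the typical case where EcM is greater than N/2.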


6275 Commits