minio

mirror of https://github.com/minio/minio.git synced 2024-12-27 15:45:55 -05:00

Author	SHA1	Message	Date
shandongzhejiang	a8ff12bc72	chore: fix some comments (#20294 ) Signed-off-by: shandongzhejiang <shandongzhejiang@icloud.com>	2024-08-21 13:14:24 -07:00
jiuker	1e1bd3afd9	use io.NopCloser replace closeWrapper (#20287 )	2024-08-21 05:20:54 -07:00
Anis Eleuch	7b239ae154	sftp: Fix operations with an internal service account (#20293 ) sftp sends local requests to the S3 port while passing the session token header when the account corresponds to a service account. However, this is not permitted and will throw an error: "The security token included in the request is invalid" This commit will avoid passing the session token to the upper layer that initializes MinIO client to avoid this error.	2024-08-20 13:00:29 -07:00
Anis Eleuch	85c3db3a93	heal: Add finished flag to .healing.bin to avoid removing this latter (#20250 ) Sometimes, we need historical information in .healing.bin, such as the number of expired objects that healing skips, which can create a drive usage disparity within the same erasure set. For that reason, this commit no longer removes .healing.bin and adds a new field called Finished so we know healing has finished on that drive.	2024-08-20 08:42:49 -07:00
Mark Theunissen	6378ca10a4	kms.ListKeys returns CreatedBy/CreatedAt when information is available (#20223 )	2024-08-17 23:43:03 -07:00
Harshavardhana	72cff79c8a	add missing STS accounts loading (#20279 ) PR #20268 missed loading STS accounts map properly	2024-08-16 18:24:54 -07:00
Harshavardhana	a5702f978e	remove requests deadline, instead just reject the requests (#20272 ) Additionally set - x-ratelimit-limit - x-ratelimit-remaining To indicate the request rates.	2024-08-16 01:43:49 -07:00
Poorna	4687c4616f	try loading temp account if not in cache (#20266 )	2024-08-15 23:12:42 -07:00
Harshavardhana	cc0c41d216	remove region locks and make them simpler (#20268 ) - single flight approach is now optional, instead of default. - parallelize the loaders up to 32 items per asset (more room for improvement possible)	2024-08-15 08:41:03 -07:00
Klaus Post	f1302c40fe	Fix uninitialized replication stats (#20260 ) Services are unfrozen before `initBackgroundReplication` is finished. This means that the globalReplicationStats write is racy. Switch to an atomic pointer. Provide the `ReplicationPool` with the stats, so it doesn't have to be grabbed from the atomic pointer on every use. All other loads and checks are nil, and calls return empty values when stats still haven't been initialized.	2024-08-15 05:04:40 -07:00
Klaus Post	d96798ae7b	Add support profile deadlines and concurrent operations (#20244 ) * Allow a maximum of 10 seconds to start profiling operations. * Download up to 16 profiles concurrently, but only allow 10 seconds for each (does not include write time). * Add cluster info as the first operation. * Ignore remote download errors. * Stop remote profiles if the request is terminated.	2024-08-15 03:36:00 -07:00
Anis Eleuch	b508264ac4	sr: Avoid recursion when loading site replicator credentials (#20262 ) If site replication is enabled and the code tries to extract JWT claims while the site replication service account credentials are not loaded yet, the code will enter an infinite loop, causing high CPU usage. Another possible cause of the infinite loop is service accounts created by an old deployment version, where the service account JWT was signed by the root credentials but no longer is. This commit removes the possibility of the infinite loop in the code and adds a root credential fallback to extract claims from old service accounts.	2024-08-14 18:29:20 -07:00
Harshavardhana	db78431b1d	avoid crash when initializing bucket quota cache (#20258 )	2024-08-14 17:34:56 -07:00
Klaus Post	3ffeabdfcb	Fix govet+staticcheck issues (#20263 ) This is better: https://github.com/golang/go/issues/60529	2024-08-14 10:11:51 -07:00
Anis Eleuch	51b1f41518	heal: Persist MRF queue in the disk during shutdown (#19410 )	2024-08-13 15:26:05 -07:00
Harshavardhana	e7a56f35b9	flatten out audit tags, do not send as free-form (#20256 ) move away from map[string]interface{} to map[string]string to simplify the audit, and also provide concise information. avoids large allocations under load(), reduces the amount of audit information generated, as the current implementation was a bit free-form. instead all datastructures must be flattened.	2024-08-13 15:22:04 -07:00
rubyisrust	516af01a12	chore: fix some function names (#20243 ) Signed-off-by: rubyisrust <rustrover@icloud.com>	2024-08-13 11:23:33 -07:00
Harshavardhana	acdb355070	update deps and update azure WARM tier implementation (#20247 )	2024-08-13 11:21:34 -07:00
Mark Theunissen	37c02a5f7b	Add dummy DeleteBucketCors for safety (#20253 )	2024-08-13 08:25:16 -07:00
Krishnan Parthasarathi	04be352ae9	Relax quorum agreement on DataDir values (#20232 ) Previously, we checked if we had a quorum on the DataDir value. We are removing this check, which allows reading objects with different DataDir values in a few drives (due to a rebalance-stop race bug) provided their eTags or ModTimes match.	2024-08-12 12:02:21 -07:00
Klaus Post	53eb7656de	Add admin info timeouts (#20249 ) Since a lot of operations load from storage, do remote calls, add a 10 second timeout to each operation. This should make `mc admin info` return values even under extreme conditions.	2024-08-12 10:24:29 -07:00
Harshavardhana	2e0fd2cba9	implement a safer completeMultipart implementation (#20227 ) - optimize writing part.N.meta by writing both part.N and its meta in sequence without network component. - remove part.N.meta, part.N which were partially successful, in quorum loss situations during renamePart() - allow for strict read quorum check arbitrated via ETag for the given part number, this makes it doubly safe upon final commit. - return an appropriate error when read quorum is missing, instead of returning InvalidPart{}, which is a non-retryable error. This kind of situation can happen when many nodes are going offline in rotation, an example of such a restart() behavior is statefulset updates in k8s. fixes #20091	2024-08-12 01:38:15 -07:00
Harshavardhana	909b169593	avoid source index to be same as destination index (#20238 ) during rebalance stop, it can possibly happen that Put() would race by overwriting the same object again. If done "successfully", this can potentially proceed to delete the object from the pool, causing data loss. This PR enhances #20233 to handle more scenarios such as these.	2024-08-09 19:30:44 -07:00
Krishnan Parthasarathi	4e67a4027e	Prevent overwrites due to rebalance-stop race (#20233 ) Rebalance-stop can race with ongoing rebalance operations. This change prevents these operations from overwriting objects by checking the source and destination pool indices are different.	2024-08-08 19:05:14 -07:00
Klaus Post	49055658a9	Fix missing hash in GetObjectAttributes (#20231 ) SHA256/SHA1 were mixed up. Simplify code as well.	2024-08-08 13:19:41 -07:00
Harshavardhana	89c58ce87d	enhance getActualSize() to return valid values for most situations (#20228 )	2024-08-08 08:29:58 -07:00
Mark Theunissen	2681219039	Add dummy PutBucketCors for functional test compatibility (#20220 )	2024-08-06 08:41:38 -07:00
Harshavardhana	dea9abed29	use singleflight when bucket metadata is reloaded() (#20216 ) this allows for de-duplicating the callers when called concurrently, allowing for bucketmetadata reads to be single call. All concurrent callers will get the same data as the first one.	2024-08-05 09:50:11 -07:00
Harshavardhana	e3eb5c1328	batch-exp: Remove 1000 maximum objects per call (#20212 ) It seems ObjectAPI.DeleteObjects() is clogging up when it is removing 10k versions of a single object. Authored-by: Anis Eleuch <anis@min.io>	2024-08-04 21:55:25 -07:00
Poorna	74c047cb03	fix replication last hour metric (#20199 ) also adding missing recent_backlog_count metric to v3 metrics	2024-08-01 17:55:27 -07:00
jiuker	50a5ad48fc	feat: support batch replication prefix slice (#20033 )	2024-08-01 05:53:30 -07:00
Harshavardhana	a9dc061d84	count metrics properly for any failures during drive heal (#20193 ) or via `mc admin heal --set 1 --pool 1`	2024-07-30 22:46:26 -07:00
Krishnan Parthasarathi	01a8c09920	Add fmt-gen subcommand (#20192 ) fmt-gen subcommand is only available when built with build tag `fmtgen`.	2024-07-30 15:59:48 -07:00
Aditya Manthramurthy	4c8562bcec	Fix v2 metrics: Send all ttfb api labels (#20191 ) Fix a regression in #19733 where TTFB metrics for all APIs except GetObject were removed in v2 and v3 metrics. This causes breakage for existing v2 metrics users. Instead we continue to send TTFB for all APIs in V2 but only send for GetObject in V3.	2024-07-30 15:28:46 -07:00
Harshavardhana	f13c04629b	allow multipart uploads expiration to be dynamic (#20190 ) It would seem that new values for multipart_expiration take effect only after a restart. This PR fixes this by making it dynamic, as it should have been.	2024-07-30 12:01:06 -07:00
Harshavardhana	80ff907d08	add DeleteBulk support, add sufficient deadlines per rename() (#20185 ) deadlines per moveToTrash() allows for a more granular timeout approach for syscalls, instead of an aggregate timeout. This PR also enhances multipart state cleanup to be optimal by removing 100's of multipart network rename() calls into single network call.	2024-07-29 18:56:40 -07:00
Poorna	2d40433bc1	remove replication throttle deadline for objects > 128MiB (#20184 ) context deadline was introduced to avoid a slow transfer from blocking replication queue(s) shared by other buckets that may not be under throttling. This PR removes this context deadline for larger objects since they are anyway restricted to a limited set of workers. Otherwise, objects would get dequeued when the throttle limit is exceeded and cannot proceed within the deadline.	2024-07-29 15:14:52 -07:00
Harshavardhana	a17f14f73a	separate lock from common grid to avoid epoll contention (#20180 ) epoll contention on TCP causes latency build-up when we have high volume ingress. This PR is an attempt to relieve this pressure. upstream issue https://github.com/golang/go/issues/65064 It seems to be a deeper problem; haven't yet tried the fix provided in this issue; however, this change helps without changing the compiler. Of course, this is a workaround for now, hoping for a more comprehensive fix from Go runtime.	2024-07-29 11:10:04 -07:00
Poorna	6651c655cb	fix replication of checksum when encryption is enabled (#20161 ) - Adding functional tests - Return checksum header on GET/HEAD, previously this was returning InvalidPartNumber error	2024-07-29 01:02:16 -07:00
Harshavardhana	3ae104edae	change Read* calls over net/http to move to http.MethodGet (#20173 ) - ReadVersion - ReadFile - ReadXL Further changes include to - Compact internode resource RPC paths - Compact internode query params To optimize on parsing by gorilla/mux as the length of this string increases latency in gorilla/mux - reduce to a meaningful string.	2024-07-29 01:00:12 -07:00
jiuker	c87a489514	fix: support prefix when batchJob replicate enable the snowball (#20178 )	2024-07-29 00:59:50 -07:00
Poorna	641a56da0d	fix panic in replication queuing (#20169 ) Regression from #20077 ``` Jul 26 19:08:29 minio-dr-0101a minio[275423]: Error: grid handler (NSScanner) panic: runtime error: index out of range [4] with length 1 (errors.errorString) Jul 26 19:08:29 minio-dr-0101a minio[275423]: 33: internal/logger/logger.go:268:logger.LogIf() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 32: internal/grid/connection.go:50:grid.gridLogIf() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 31: internal/grid/muxserver.go:234:grid.(muxServer).handleRequests.func1() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 30: cmd/bucket-replication.go:2165:cmd.(ReplicationPool).queueReplicaTask() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 29: cmd/bucket-replication.go:3440:cmd.queueReplicationHeal() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 28: cmd/data-scanner.go:1396:cmd.(scannerItem).healReplication() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 27: cmd/data-scanner.go:1220:cmd.(scannerItem).applyActions() Jul 26 19:08:29 minio-dr-0101a minio[275423]: 26: cmd/xl-storage.go:627:cmd.(xlStorage).NSScanner.func2() ```	2024-07-26 13:48:21 -07:00
Harshavardhana	a16193bb50	remove fdatasync() discard, we write with O_SYNC (#20168 ) fdatasync() discard for page-cached READs is not needed, it would seem like this can cause latencies in situations when things are loaded.	2024-07-26 10:27:56 -07:00
jiuker	132e7413ba	fix: check once ready for site-replication (#20149 )	2024-07-26 10:27:42 -07:00
Klaus Post	1966668066	Avoid Batch Replication Job log spam (#20158 ) Only print once per job and error location. Set default retry to default 1 second wait, and use as minimum.	2024-07-26 05:55:50 -07:00
Harshavardhana	064f36ca5a	move to GET for internal stream READs instead of POST (#20160 ) the main reason is to let Go net/http perform necessary bookkeeping properly, and, from a consistency point of view, it is essentially GETs all the way. Deprecate sendFile() as it is buggy inside the Go runtime.	2024-07-26 05:55:01 -07:00
Krishnan Parthasarathi	4a1edfd9aa	Different read quorum for tiered objects (#20115 ) For a non-tiered object, MinIO requires that EcM (# of data blocks) of xl.meta agree, corresponding to the number of data blocks needed to read this object. OTOH, tiered objects have metadata in the hot tier and data in the warm tier. The data and its integrity are offloaded to the warm tier. This allows us to reduce the read quorum from EcM (typically > N/2, where N - erasure stripe width) to N/2 + 1. The simple majority of metadata ensures consensus on what the object is and where it is located.	2024-07-25 14:02:50 -07:00
Anis Eleuch	b7f319b62a	properly reload a fresh drive when found in a failed state during startup (#20145 ) When a drive is in a failed state when a single node multiple drives deployment is started, a replacement of a fresh disk will not be properly healed unless the user restarts the node. Fix this by always adding the new fresh disk to globalLocalDrivesMap. Also remove globalLocalDrives for simplification, a map to store local node drives can still be used since the order of local drives of a node is not defined.	2024-07-24 16:30:33 -07:00
Anis Eleuch	33c101544d	kms: Expose API when bucket federation is enabled (#20143 ) kms: Expose API available when bucket federation is enabled When bucket federation feature is enabled, KMS API will not work, such as `mc admin kms key list` The commit will fix the issue by disabling bucket forwarding when this is a KMS request.	2024-07-24 15:44:29 -07:00
Harshavardhana	3b21bb5be8	use unixNanoTime instead of time.Time in lockRequestorInfo (#20140 ) Bonus: Skip Source, Quorum fields in lockArgs that are never sent during Unlock() phase.	2024-07-24 03:24:01 -07:00
Harshavardhana	6fe2b3f901	avoid sendFile() for ranges or object lengths < 4MiB (#20141 )	2024-07-24 03:22:50 -07:00
Taran Pelkey	b368d4cc13	Fix `updateGroupMembershipsForLDAP` behavior with unicode (#20137 )	2024-07-23 19:10:03 -07:00
Klaus Post	0680af7414	Set O_NONBLOCK for reads and writes on unix (#20133 ) Tracing syscalls, opening and reading an `xl.meta` looks like this: ``` openat(AT_FDCWD, "/mnt/drive1/ss8-old/testbucket/ObjSize4MiBThreads72/(554O51H/peTb(0iztdbTKw59.csv/xl.meta", O_RDONLY|O_NOATIME|O_CLOEXEC) = 34 <0.000> fcntl(34, F_GETFL) = 0x48000 (flags O_RDONLY|O_LARGEFILE|O_NOATIME) <0.000> fcntl(34, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) = 0 <0.000> epoll_ctl(4, EPOLL_CTL_ADD, 34, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=3172471557, u64=8145488475984499461}}) = -1 EPERM (Operation not permitted) <0.000> fcntl(34, F_GETFL) = 0x48800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) <0.000> fcntl(34, F_SETFL, O_RDONLY|O_LARGEFILE|O_NOATIME) = 0 <0.000> fstat(34, {st_mode=S_IFREG|0644, st_size=354, ...}) = 0 <0.000> read(34, "XL2 \1\0\3\0\306\0\0\1P\2\2\1\304$\225\304\20\0\0\0\0\0\0\0\0\0\0\0"..., 354) = 354 <0.000> close(34) = 0 <0.000> ``` Everything until `fstat` is the `os.Open` call. Looking at the code: https://github.com/golang/go/blob/master/src/os/file_unix.go#L212-L243 It seems for every file it "tries" to see if it is pollable. This causes `syscall.SetNonblock(fd, true)` to be called. This is the first `F_SETFL`. It then calls `f.pfd.Init("file", true)`. This will attempt to set it as pollable using `epoll_ctl`. This will always fail for files. It therefore calls `syscall.SetNonblock(fd, false)` resulting in the second `F_SETFL`. If we set the `O_NONBLOCK` call on the initial open, we should avoid the 4 `fcntl` syscalls per file. I don't see any way to avoid the `epoll_ctl` call, since kind is either `kindOpenFile` or `kindNonBlock`, so "pollable" will always be true. However avoiding 4 of 6 syscalls still seems worth it. This should not have any effect, since files will end up with "nonblock" anyway.	2024-07-23 09:36:24 -07:00
Harshavardhana	91805bcab6	add optimizations to bring performance on unversioned READS (#20128 ) allow non-inlined on disk to be inlined via an unversioned ReadVersion() call, we only need ReadXL() to resolve objects with multiple versions only. The choice of this block makes it to be dynamic and chosen by the user via `mc admin config set` Other bonus things - Start measuring internode TTFB performance. - Set TCP_NODELAY, TCP_CORK for low latency	2024-07-23 03:53:03 -07:00
jiuker	b3a94c4e85	fix: Use xtime duration to parse batch job (#20117 )	2024-07-23 00:05:53 -07:00
Harshavardhana	8e618d45fc	remove unnecessary LRU for internode auth token (#20119 ) removes contentious usage of mutexes in LRU, which were never really reused in any manner; we do not need it. To trust hosts, the correct way is TLS certs; this PR completely removes this dependency, which has never been useful. ``` 0 0% 100% 25.83s 26.76% github.com/hashicorp/golang-lru/v2/expirable.(LRU[...]) 0 0% 100% 28.03s 29.04% github.com/hashicorp/golang-lru/v2/expirable.(LRU[...]) ``` Bonus: use `x-minio-time` as a nanosecond to avoid unnecessary parsing logic of time strings instead of using a more straightforward mechanism.	2024-07-22 00:04:48 -07:00
Harshavardhana	3ef59d2821	do not set KMSSecretKey env from KMSSecretKeyFile (#20122 ) fixes #20121	2024-07-21 14:39:15 -07:00
Anis Eleuch	d9ee668b6d	s3: Fix wrong continuation token during listing with ILM enabled bucket (#20113 )	2024-07-18 13:37:34 -07:00
Anis Eleuch	2e5d792f0c	batch-expiry: Save progress regularly in the drives and at the end (#20098 ) - Also, fix failure reporting at the end. - Also, avoid parsing report objects when listing or resuming jobs, this does not cause any bugs, it is only printing, not useful errors.	2024-07-17 09:42:32 -07:00
Poorna	3535197f99	replication: proxy only on missing object or read quorum err (#20101 )	2024-07-16 16:46:41 -07:00
Mark Theunissen	698bb93a46	Allow a KMS Action to specify keys in the Resources of a policy (#20079 )	2024-07-16 07:03:03 -07:00
Harshavardhana	e8c54c3d6c	add validation test for v3 metrics for all its endpoints (#20094 ) add unit test for v3 metrics for all its exposed endpoints Bonus: - support OpenMetrics encoding - adds boot time for prometheus - continueOnError is better to serve as much metrics as possible.	2024-07-15 09:28:02 -07:00
Shubhendu	f944a42886	Removed user and group details from logs (#20072 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2024-07-14 11:12:07 -07:00
Harshavardhana	eff0ea43aa	fix: typo in BucketUsageMetrics group registration in v3 metrics (#20090 ) ``` curl http://localhost:9000/minio/metrics/v3/cluster/usage/buckets ``` Did not work as documented, due to the fact that there was a typo in the bucket usage metrics registration group. This endpoint is a cluster endpoint and does not require any `buckets` argument.	2024-07-14 11:11:42 -07:00
Harshavardhana	7fcb428622	do not print unexpected logs (#20083 )	2024-07-12 13:51:54 -07:00
Klaus Post	83adc2eebf	Fix ListObjects aborting after 3 minute on async request (#20074 ) When creating the async listing, if the first request does not return within 3 minutes, it is stopped, since it isn't being kept alive. Keep updating `lastHandout` while we are waiting for the initial request to be fulfilled.	2024-07-12 09:23:16 -07:00
Poorna	989c318a28	replication: make large workers configurable (#20077 ) This PR also improves throttling by reducing tokens requested from rate limiter based on available tokens to avoid exceeding throttle wait deadlines	2024-07-12 07:57:31 -07:00
Taran Pelkey	f5d2fbc84c	Add DecodeDN and QuickNormalizeDN functions to LDAP config (#20076 )	2024-07-11 18:04:53 -07:00
Allan Roger Reid	e139673969	Audit failure in batch job key rotate (#20073 )	2024-07-11 16:13:15 -07:00
Harshavardhana	a8c6465f22	hide some deprecated fields from 'get' output (#20069 ) also update wording on `subnet license="" api_key=""`	2024-07-10 13:16:44 -07:00
Taran Pelkey	6c6f0987dc	Add groups to policy entities (#20052 ) * Add groups to policy entities * update comment --------- Co-authored-by: Harshavardhana <harsha@minio.io>	2024-07-10 11:41:49 -07:00
Austin Chang	5f64658faa	clarify error message for root user credential (#20043 ) Signed-off-by: Austin Chang <austin880625@gmail.com>	2024-07-10 09:57:01 -07:00
Anis Eleuch	ce183cb2b4	heal: List and heal again for any listing error (#19999 ) When a fresh drive healing is finished, add more checks for the drive listing errors. If any, re-list and heal again. Although this is an infrequent use case to have listPathRaw() returning nil when minDisks is set to 1, we still need to handle all possible use cases to avoid missing healing any object. Also, check the HealObject result to decide if an object is healed in the fresh disk, since HealObject returns nil if an object is healed in any disk, and not necessarily in the new fresh drive.	2024-07-10 09:55:36 -07:00
Klaus Post	b3bac73c0f	Clarify post policy error message (#20067 ) It is not really clear that the listed keys are missing. Clarify the error	2024-07-10 07:18:44 -07:00
Anis Eleuch	e726d8ff0f	list: Hide objects/versions with pending/failed replicated deletion (#20047 ) In regular listing, this commit will avoid showing an object when its latest version has a pending or failed deletion in a replicated setup. It will also prevent showing older versions in the same case.	2024-07-09 15:26:42 -07:00
Shubhendu	f4230777b3	Log replication errors once (#20063 ) Also, sort the error map for multiple sites in ascending order of deployment IDs, so that the error message generated is always definitive order and same. Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2024-07-09 10:10:31 -07:00
Krishnan Parthasarathi	380233d646	batch: Update job info object on success (#20053 )	2024-07-08 18:45:54 -07:00
Klaus Post	0d0b0aa599	Abstract grid connections (#20038 ) Add `ConnDialer` to abstract connection creation. - `IncomingConn(ctx context.Context, conn net.Conn)` is provided as an entry point for incoming custom connections. - `ConnectWS` is provided to create web socket connections.	2024-07-08 14:44:00 -07:00
Anis Eleuch	b433bf14ba	Add typos check to Makefile (#20051 )	2024-07-08 14:39:49 -07:00
Klaus Post	107d951893	Log ILM failed object name (#20040 ) Log so we know which object we are dealing with. Log each object once.	2024-07-04 07:25:45 -07:00
Shireesh Anjal	22c53b1c70	Remove license update job (#20037 )	2024-07-03 11:49:48 -07:00
Mark Theunissen	88926ad8e9	return appropriate error upon tier update for incorrect credentials (#20034 )	2024-07-03 00:17:20 -07:00
Harshavardhana	32d04091a2	resume any batch jobs in a goroutine (#20035 ) Bonus: move batch job initialization to the last item after all other initialization, allowing for faster startup time for different subsystems.	2024-07-03 00:16:05 -07:00
Harshavardhana	be84a4fd68	do not proxy invalid object names (#20031 )	2024-07-02 14:28:55 -07:00
Anis Eleuch	2ec1f404ac	info: Always refresh the root disk status (#20023 ) Add root drive status in the disk info cache function, so unmounting a drive without restarting a local node reflects the correct value.	2024-07-02 13:41:29 -07:00
Klaus Post	2040559f71	Fix SkipReader performance with small initial read (#20030 ) If `SkipReader` is called with a small initial buffer it may be doing a huge number of Reads to skip the requested number of bytes. If a small buffer is provided grab a 32K buffer and use that. Fixes slow execution of `testAPIGetObjectWithMPHandler`. Bonuses: * Use `-short` with `-race` test. * Do all suite test types with `-short`. * Enable compressed+encrypted in `testAPIGetObjectWithMPHandler`. * Disable big file tests in `testAPIGetObjectWithMPHandler` when using `-short`.	2024-07-02 08:13:05 -07:00
Anis Eleuch	ca0ce4c6ef	tests: Fix setting max openfds as memory limit (#20029 ) The code was inadvertently passing max openfds to debug.SetMemoryLimit(); fixing this speeds up go test on my machine. This is only a testing bug, since the server context always has a valid MaxMem, so the buggy code was never called in user environments.	2024-07-02 08:09:36 -07:00
Anis Eleuch	757cf413cb	Add batch status API (#19679 ) Currently the status of a completed or failed batch is held in the memory, a simple restart will lose the status and the user will not have any visibility of the job that was long running. In addition to the metrics, add a new API that reads the batch status from the drives. A batch job will be cleaned up three days after completion. Also add the batch type in the batch id, the reason is that the batch job request is removed immediately when the job is finished, then we do not know the type of batch job anymore, hence a difficulty to locate the job report	2024-07-02 01:17:52 -07:00
Anis Eleuch	b35acb3dbc	heal: Add support of healing particular pool/set (#20024 )	2024-07-01 15:02:25 -07:00
Sveinn	e404abf103	Letting password enable auth bypass caPublicKey (only if passauth is … (#20022 )	2024-07-01 15:02:01 -07:00
jiuker	f7ff19cb18	fix: warning for decommissioned pool while start (#20019 )	2024-07-01 07:38:46 -07:00
Poorna	91faaa1387	fix panic in batch replicate (#20014 ) Fixes: ``` panic: send on closed channel panic: close of closed channel goroutine 878 [running]: github.com/minio/minio/internal/ioutil.SafeClose[...](...) /Users/kp/code/src/github.com/minio/minio/internal/ioutil/ioutil.go:407 github.com/minio/minio/cmd.(erasureServerPools).Walk.func2.2() /Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2229 +0xc0 panic({0x108c25e60?, 0x1090b28d0?}) /usr/local/go/src/runtime/panic.go:770 +0x124 github.com/minio/minio/cmd.(erasureServerPools).Walk.func2.3({{0x1400e397316, 0x5}, {0x1400d88b8a8, 0x8}, {0x1f99d80, 0xede101c42, 0x0}, 0x3bc, 0x0, 0x0, ...}) /Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2235 +0xb4 github.com/minio/minio/cmd.(erasureServerPools).Walk.func2() /Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2277 +0xabc created by github.com/minio/minio/cmd.(erasureServerPools).Walk in goroutine 575 /Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2210 +0x33c ```	2024-06-28 18:20:47 -07:00
Harshavardhana	f365a98029	fix: hot-reloading STS credential policy documents (#20012 ) * fix: hot-reloading STS credential policy documents * Support Role ARNs hot load policies (#28) --------- Co-authored-by: Anis Eleuch <vadmeste@users.noreply.github.com>	2024-06-28 16:17:22 -07:00
Taran Pelkey	7ca4ba77c4	Update tests to use AttachPolicy(LDAP) instead of deprecated SetPolicy (#19972 )	2024-06-28 02:06:25 -07:00
Poorna	13512170b5	list: Do not decrypt SSE-S3 Etags in a non encrypted format (#20008 )	2024-06-27 19:44:56 -07:00
Krishnan Parthasarathi	154fcaeb56	Allow rebalance start when it's stopped/completed (#20009 )	2024-06-27 17:22:30 -07:00
Anis Eleuch	722118386d	iam: Hot load of the policy during request authorization (#20007 ) Hot load a policy document during account authorization evaluation to avoid returning 403 during server startup, when not all policies are loaded yet. Add this support for group policies as well.	2024-06-27 17:03:07 -07:00
Harshavardhana	709612cb37	fix: rebalance upon pool expansion would crash when in progress (#20004 ) you can attempt a rebalance first i.e, start with 2 pools. ``` mc admin rebalance start alias/ ``` and after that you can add a new pool, this would potentially crash. ``` Jun 27 09:22:19 xxx minio[7828]: panic: runtime error: invalid memory address or nil pointer dereference Jun 27 09:22:19 xxx minio[7828]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x22cc225] Jun 27 09:22:19 xxx minio[7828]: goroutine 1 [running]: Jun 27 09:22:19 xxx minio[7828]: github.com/minio/minio/cmd.(*erasureServerPools).findIndex(...) ```	2024-06-27 11:35:34 -07:00
Harshavardhana	b35d083872	fix: change retry-after 60sec for 503s and 10s for 429s (#19996 )	2024-06-26 01:32:06 -07:00
Harshavardhana	5e7b243bde	extend cluster health to return errors for IAM, and Bucket metadata (#19995 ) Bonus: make API freeze to be opt-in instead of default	2024-06-26 00:44:34 -07:00
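
Commit f1302c40fe in the log above describes replacing a racy global write with an atomic pointer, with readers checking for nil and returning empty values until initialization completes. A minimal Go sketch of that general pattern follows. The names `stats`, `globalStats`, and `loadCount` are hypothetical and chosen for illustration only; they are not taken from the MinIO source.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// stats is a hypothetical stand-in for a statistics struct.
type stats struct {
	count int64
}

// globalStats is published by an initializer; readers may run before that
// happens, so every load has to tolerate a nil pointer.
var globalStats atomic.Pointer[stats]

// loadCount returns an empty (zero) value while stats are uninitialized.
func loadCount() int64 {
	s := globalStats.Load()
	if s == nil {
		return 0
	}
	return s.count
}

func main() {
	fmt.Println(loadCount()) // stats not published yet
	globalStats.Store(&stats{count: 42})
	fmt.Println(loadCount()) // stats published
}
```

`atomic.Pointer` requires Go 1.19 or later. A component that needs the value on every call can be handed the pointer's target once at construction instead of loading it atomically each time, which is the approach the commit message describes for `ReplicationPool`.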

6330 Commits