Commit Graph

6385 Commits

Author SHA1 Message Date
Krishnan Parthasarathi
154fcaeb56 Allow rebalance start when it's stopped/completed (#20009) 2024-06-27 17:22:30 -07:00
Anis Eleuch
722118386d iam: Hot load of the policy during request authorization (#20007)
Hot load a policy document when during account authorization evaluation
to avoid returning 403 during server startup, when not all policies are
already loaded.

Add this support for group policies as well.
2024-06-27 17:03:07 -07:00
Harshavardhana
709612cb37 fix: rebalance upon pool expansion would crash when in progress (#20004)
you can attempt a rebalance first i.e, start with 2 pools.

```
mc admin rebalance start alias/
```

and after that you can add a new pool, this would
potentially crash.

```
Jun 27 09:22:19 xxx minio[7828]: panic: runtime error: invalid memory address or nil pointer dereference
Jun 27 09:22:19 xxx minio[7828]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x22cc225]
Jun 27 09:22:19 xxx minio[7828]: goroutine 1 [running]:
Jun 27 09:22:19 xxx minio[7828]: github.com/minio/minio/cmd.(*erasureServerPools).findIndex(...)
```
2024-06-27 11:35:34 -07:00
Harshavardhana
b35d083872 fix; change retry-after 60sec for 503s and 10s for 429s (#19996) 2024-06-26 01:32:06 -07:00
Harshavardhana
5e7b243bde extend cluster health to return errors for IAM, and Bucket metadata (#19995)
Bonus: make API freeze to be opt-in instead of default
2024-06-26 00:44:34 -07:00
Taran Pelkey
3c2141513f add ListAccessKeysLDAPBulk API to list accessKeys for multiple/all LDAP users (#19835) 2024-06-25 14:21:28 -07:00
Aditya Manthramurthy
602f6a9ad0 Add IAM (re)load timing logs (#19984)
This is useful to debug large IAM load times - the usual cause is when
there are a large amount of temporary accounts.
2024-06-25 10:33:10 -07:00
Harshavardhana
22c5a5b91b add healing retries when there are failed heal attempts (#19986)
transient errors for long running tasks are normal, allow for
drive to retry again upto 3 times before giving up on healing
the drive.
2024-06-25 10:32:56 -07:00
jiuker
41f508765d fix: format the scanner object error (#19991) 2024-06-25 08:54:24 -07:00
Aditya Manthramurthy
7dccd1f589 fix: bootstrap msgs should only be sent at startup (#19985) 2024-06-24 19:30:28 -07:00
Harshavardhana
be97ae4c5d fix: gcs tier going offline due to customer HTTPclient (#19973)
specifying customer HTTP client makes the gcs SDK
ignore the passed credentials, instead let the GCS
SDK manage the transport.

this PR fixes #19922 a regression from #19565
2024-06-21 22:26:45 -07:00
Anis Eleuch
4d7d008741 bootstrap: Speed up bucket metadata loading (#19969)
Currently, bucket metadata is being loaded serially inside ListBuckets
Objet API. Fix that by loading the bucket metadata as the number of
erasure sets * 10, which is a good approximation.
2024-06-21 15:22:24 -07:00
Klaus Post
2d7a3d1516 Return error from mergeEntryChannels (#19970)
- Add error from mergeEntryChannels to `results.`
- Make sure we check the context error before we close the channel.
2024-06-21 12:06:51 -07:00
Harshavardhana
dfab400d43 reject bootup, if binaries are different in a cluster (#19968) 2024-06-21 07:49:49 -07:00
Shireesh Anjal
e200808ab7 fix errors in metrics code on macos (#19965)
- do not load proc fs metrics in case of macos
- null-check TimeStat before accessing
2024-06-20 10:55:03 -07:00
Klaus Post
fae563b85d Add fixed timed restarts to updates (#19960) 2024-06-20 07:49:22 -07:00
Anis Eleuch
95e4cbbfde Do not ping event targets during cluster initialization (#19959)
S3 operations are frozen during startup, therefore we should avoid pinging
event targets during the initialization since it can stall.
2024-06-20 07:46:02 -07:00
Harshavardhana
2825294b7b allow server startup to come online with READ success (#19957) 2024-06-19 22:21:31 -07:00
Sveinn
bce93b5cfa Removing timeout on shutdown (#19956) 2024-06-19 11:42:47 -07:00
Harshavardhana
7a4b250c8b avoid waiting for quorum health while debugging (#19955) 2024-06-19 10:12:20 -07:00
Harshavardhana
69e41f87ef compute localIPs only once per server startup() (#19951)
repeatedly calling this function is not necessary,
on systems with lots of interfaces, including virtual
ones can make this reasonably delayed.
2024-06-19 07:34:00 -07:00
Harshavardhana
ee48f9f206 perform healthchecks before initializing everything fully (#19953)
adds more informative logs that provide details on which
erasure set is losing quorum etc.
2024-06-19 07:33:40 -07:00
Sveinn
9ba39d7fad Removing a channel that was not being used (#19948) 2024-06-19 01:59:39 -07:00
Harshavardhana
d2fb371f80 do not need response record body (#19949)
since the connection is active, the
response recorder body can grow endlessly
causing leak, as this bytes buffer is
never given back to GC due to an goroutine.
2024-06-19 01:59:21 -07:00
Klaus Post
2f9018f03b Do regular checks for healing status while scanning (#19946) 2024-06-18 09:11:04 -07:00
Harshavardhana
bbb64eaade skip healing properly in the scanner when a drive is hotplugged (#19939)
skip healing properly in scanner when drive is hotplugged

due to how the state is passed around the SkipHealing
might not be the true state() of the system always, causing
a situation where we might healing from the scanner on the
same drive which is being. Due to this competing heals get
triggered that slow each other down.
2024-06-17 16:39:11 -07:00
Harshavardhana
7bd1d899bc remove overzealous check during HEAD() (#19940)
due to a historic bug in CopyObject() where
an inlined object loses its metadata, the
check causes an incorrect fallback verifying
data-dir.

CopyObject() bug was fixed in ffa91f9794 however
the occurrence of this problem is historic, so
the aforementioned check is stretching too much.

Bonus: simplify fileInfoRaw() to read xl.json as well,
also recreate buckets properly.
2024-06-17 07:29:18 -07:00
Harshavardhana
c91d1ec2e3 fix: avoid metadata cache without data for all callers (#19935) 2024-06-14 06:28:35 -07:00
Shubhendu
3bd3470d0b Corrected names of node replication metrics (#19932)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-13 15:26:54 -07:00
Harshavardhana
ba39ed9af7 loadUser() if not able to load() credential return error (#19931) 2024-06-13 15:26:38 -07:00
jiuker
62e6dc950d fix: do not update metadata cache upon headObject() (#19929) 2024-06-13 08:42:02 -07:00
Klaus Post
ad04afe381 Fix SSEC multipart checksum replication (#19915)
* Multipart SSEC checksums were not transferred.
* Remove key mismatch logging. This key is user-controlled with SSEC.
* If the source is SSEC and the destination reports ErrSSEEncryptedObject, 
  assume replication is good.
2024-06-12 23:56:12 -07:00
Harshavardhana
d06b63d056 load credential for in-flights requests as singleflight (#19920)
avoid concurrent callers for LoadUser() to even initiate
object read() requests, if an on-going operation is in progress.

this avoids many callers hitting the drives causing I/O
spikes, also allows for loading credentials faster.
2024-06-12 13:47:56 -07:00
Harshavardhana
e3ac4035b9 decrement requests inqueue correctly after the request is processed (#19918) 2024-06-12 01:13:12 -07:00
Harshavardhana
d21b6daa49 fix: avoid crash when delete() returns an error in batch expiration (#19909) 2024-06-11 06:50:53 -07:00
Harshavardhana
55aa431578 fix: on windows avoid ':' as part of the object name (#19907)
fixes #18865
avoid-colon
2024-06-10 20:13:30 -07:00
Harshavardhana
614981e566 allow purge expired STS while loading credentials (#19905)
the reason for this is to avoid STS mappings to be
purged without a successful load of other policies,
and all the credentials only loaded successfully
are properly handled.

This also avoids unnecessary cache store which was
implemented earlier for optimization.
2024-06-10 11:45:50 -07:00
Klaus Post
d2eed44c78 Fix replication checksum transfer (#19906)
Compression will be disabled by default if SSE-C is specified. So we can still honor SSE-C.
2024-06-10 10:40:33 -07:00
Anis Eleuch
789cbc6fb2 heal: Dangling check to evaluate object parts separately (#19797) 2024-06-10 08:51:27 -07:00
jiuker
0662c90b5c fix: copyObject restore with a specific version, update test cases (#19895) 2024-06-10 08:50:49 -07:00
Klaus Post
a2cab02554 Fix SSE-C checksums (#19896)
Compression will be disabled by default if SSE-C is specified. So we can still honor SSE-C.
2024-06-10 08:31:51 -07:00
Harshavardhana
6c7a21df6b turn-off unexpected debug logging in List() calls (#19903) 2024-06-09 21:34:26 -07:00
Harshavardhana
29a25a538f fix: make sure we list freeVersions like DEL marker with --versions (#19878)
freeVersions() was being incorrectly skipped; list it as
valid objects properly.

Co-authored-by: Krishnan Parthasarathi <Krishnan Parthasarathi>
2024-06-07 15:18:44 -07:00
Harshavardhana
2dd8faaedc remove unnecessary log in Listing() 2024-06-07 14:52:55 -07:00
Krishnan Parthasarathi
069c4015cd Don't tier directory objects (#19891)
Directory objects are used by applications that simulate the folder
structure of an on-disk filesystem. These are zero-byte objects with names
ending with '/'. They are only used to check whether a 'folder' exists in
the namespace.
2024-06-07 08:43:17 -07:00
Shubhendu
2f6e03fb60 Calculate correct object size while replication (#19888)
It was missing in case of `replicateObject` but was present for
`replicateAll` already

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-06 12:31:01 -07:00
Klaus Post
0fbb945e13 Disable caching of encrypted objects (#19890)
Don't write encrypted objects to cache, if configured.
2024-06-06 11:39:18 -07:00
Anis Eleuch
b94dd835c9 decom: Fix CurrentSize output when generating the status (#19883)
StartSize starts with the raw free space of all disks in the given pool,
however during the status, CurrentSize is not showing the current free
raw space, as expected at least by `mc admin decom status` since it was
written.
2024-06-06 07:30:43 -07:00
Poorna
5aaef9790f replication: pass checksum headers to replica (#19834) 2024-06-06 02:36:42 -07:00
Bala FA
7edc352d23 Add ILM metrics in metrics-v3 (#19539)
Signed-off-by: Bala.FA <bala@minio.io>
2024-06-06 02:36:25 -07:00