Commit Graph

12066 Commits

Author SHA1 Message Date
Poorna 91faaa1387
fix panic in batch replicate (#20014)
Fixes:

```
panic: send on closed channel
	panic: close of closed channel

goroutine 878 [running]:
github.com/minio/minio/internal/ioutil.SafeClose[...](...)
	/Users/kp/code/src/github.com/minio/minio/internal/ioutil/ioutil.go:407
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2.2()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2229 +0xc0
panic({0x108c25e60?, 0x1090b28d0?})
	/usr/local/go/src/runtime/panic.go:770 +0x124
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2.3({{0x1400e397316, 0x5}, {0x1400d88b8a8, 0x8}, {0x1f99d80, 0xede101c42, 0x0}, 0x3bc, 0x0, 0x0, ...})
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2235 +0xb4
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2277 +0xabc
created by github.com/minio/minio/cmd.(*erasureServerPools).Walk in goroutine 575
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2210 +0x33c
```
2024-06-28 18:20:47 -07:00
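The "close of closed channel" half of the panic above is a general Go hazard whenever more than one code path may close the same channel. The following is a minimal sketch of one common guard, assuming a hypothetical `onceCloser` wrapper; it is not the change made in #20014 and not the implementation of `ioutil.SafeClose`.

```go
package main

import (
	"fmt"
	"sync"
)

// onceCloser is a hypothetical wrapper that makes Close idempotent,
// so several code paths can call it without a double-close panic.
type onceCloser[T any] struct {
	ch   chan T
	once sync.Once
}

func newOnceCloser[T any](size int) *onceCloser[T] {
	return &onceCloser[T]{ch: make(chan T, size)}
}

// Close closes the underlying channel exactly once; later calls are no-ops.
func (c *onceCloser[T]) Close() {
	c.once.Do(func() { close(c.ch) })
}

func main() {
	c := newOnceCloser[int](1)
	c.ch <- 1
	c.Close()
	c.Close() // no panic on the second call

	v, ok := <-c.ch
	fmt.Println(v, ok) // buffered value is still delivered
	_, ok = <-c.ch
	fmt.Println(ok) // channel is now closed and drained
}
```

This guard only covers the double close. A "send on closed channel" panic still requires that all senders stop before `Close` is called.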
Poorna 68a9f521d5
fix object lock metadata filter (#20011) 2024-06-28 18:20:27 -07:00
Harshavardhana f365a98029
fix: hot-reloading STS credential policy documents (#20012)
* fix: hot-reloading STS credential policy documents
* Support Role ARNs hot load policies (#28)

---------

Co-authored-by: Anis Eleuch <vadmeste@users.noreply.github.com>
2024-06-28 16:17:22 -07:00
Minio Trusted 47bbc272df Update yaml files to latest version RELEASE.2024-06-28T09-06-49Z 2024-06-28 14:11:36 +00:00
Anis Eleuch aebac90013
tests: Fix minor issue in the config yaml file testing (#20005)
Convert x86_64 to amd64 in the test script to correctly download mc binary.
2024-06-28 02:06:49 -07:00
Taran Pelkey 7ca4ba77c4
Update tests to use AttachPolicy(LDAP) instead of deprecated SetPolicy (#19972) 2024-06-28 02:06:25 -07:00
Poorna 13512170b5
list: Do not decrypt SSE-S3 Etags in a non encrypted format (#20008) 2024-06-27 19:44:56 -07:00
Krishnan Parthasarathi 154fcaeb56
Allow rebalance start when it's stopped/completed (#20009) 2024-06-27 17:22:30 -07:00
Anis Eleuch 722118386d
iam: Hot load of the policy during request authorization (#20007)
Hot load a policy document during account authorization evaluation
to avoid returning 403 during server startup, when not all policies
have been loaded yet.

Add this support for group policies as well.
2024-06-27 17:03:07 -07:00
Harshavardhana 709612cb37
fix: rebalance upon pool expansion would crash when in progress (#20004)
you can attempt a rebalance first, i.e., start with 2 pools.

```
mc admin rebalance start alias/
```

and after that you can add a new pool; this could
potentially crash the server.

```
Jun 27 09:22:19 xxx minio[7828]: panic: runtime error: invalid memory address or nil pointer dereference
Jun 27 09:22:19 xxx minio[7828]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x22cc225]
Jun 27 09:22:19 xxx minio[7828]: goroutine 1 [running]:
Jun 27 09:22:19 xxx minio[7828]: github.com/minio/minio/cmd.(*erasureServerPools).findIndex(...)
```
2024-06-27 11:35:34 -07:00
Harshavardhana b35d083872
fix; change retry-after 60sec for 503s and 10s for 429s (#19996) 2024-06-26 01:32:06 -07:00
Harshavardhana 5e7b243bde
extend cluster health to return errors for IAM, and Bucket metadata (#19995)
Bonus: make API freeze opt-in instead of the default
2024-06-26 00:44:34 -07:00
Minio Trusted f8f9fc77ac Update yaml files to latest version RELEASE.2024-06-26T01-06-18Z 2024-06-26 02:12:12 +00:00
Harshavardhana 499531f0b5 update minio/console v1.6.1
Signed-off-by: Harshavardhana <harsha@minio.io>
2024-06-25 18:06:18 -07:00
Taran Pelkey 3c2141513f
add `ListAccessKeysLDAPBulk` API to list accessKeys for multiple/all LDAP users (#19835) 2024-06-25 14:21:28 -07:00
Aditya Manthramurthy 602f6a9ad0
Add IAM (re)load timing logs (#19984)
This is useful to debug large IAM load times - the usual cause is
a large number of temporary accounts.
2024-06-25 10:33:10 -07:00
Harshavardhana 22c5a5b91b
add healing retries when there are failed heal attempts (#19986)
transient errors for long-running tasks are normal; allow the
drive to retry up to 3 times before giving up on healing
the drive.
2024-06-25 10:32:56 -07:00
jiuker 41f508765d
fix: format the scanner object error (#19991) 2024-06-25 08:54:24 -07:00
Aditya Manthramurthy 7dccd1f589
fix: bootstrap msgs should only be sent at startup (#19985) 2024-06-24 19:30:28 -07:00
Allan Roger Reid 55ff598b23
Refactor the documentation on minio server config notation (#19987)
Refactor minio server config notation to add bracket notation to the TODO list
2024-06-24 19:30:18 -07:00
Harshavardhana a22ce4550c protect workers and simplify use of atomics (#19982)
without an atomic load() it is possible that, with
a slow receiver, we would get into a hot loop when
logCh is full and there are many incoming callers.

as a workaround, set BATCH_SIZE to a value
greater than 100 so that your slow receiver
receives data in bulk and avoids being throttled
in some manner.

this PR, however, fixes the unprotected access to
the current workers value.
2024-06-24 18:15:27 -07:00
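Protecting a shared worker count with atomics, as the message above describes, can be sketched with a compare-and-swap loop. The `pool` type and `tryAddWorker` method are hypothetical names for illustration and are not the code changed in #19982.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool is a hypothetical holder for a worker count shared by many goroutines.
type pool struct {
	workers    atomic.Int64
	maxWorkers int64
}

// tryAddWorker reserves one worker slot. The load and the update are
// combined through compare-and-swap, so concurrent callers can never
// push the count above maxWorkers.
func (p *pool) tryAddWorker() bool {
	for {
		cur := p.workers.Load()
		if cur >= p.maxWorkers {
			return false
		}
		if p.workers.CompareAndSwap(cur, cur+1) {
			return true
		}
	}
}

func main() {
	p := &pool{maxWorkers: 4}
	var wg sync.WaitGroup
	var granted atomic.Int64
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if p.tryAddWorker() {
				granted.Add(1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(granted.Load(), p.workers.Load())
}
```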
Taran Pelkey 168ae81b1f
Fix error when validating DN that is not under base DN (#19971) 2024-06-21 23:35:35 -07:00
Minio Trusted 5f6a25cdd0 Update yaml files to latest version RELEASE.2024-06-22T05-26-45Z 2024-06-22 06:20:13 +00:00
Harshavardhana be97ae4c5d
fix: gcs tier going offline due to custom HTTP client (#19973)
specifying a custom HTTP client makes the GCS SDK
ignore the passed credentials; instead, let the GCS
SDK manage the transport.

this PR fixes #19922 a regression from #19565
2024-06-21 22:26:45 -07:00
Anis Eleuch 4d7d008741
bootstrap: Speed up bucket metadata loading (#19969)
Currently, bucket metadata is being loaded serially inside the ListBuckets
Object API. Fix that by loading the bucket metadata concurrently, with the
concurrency set to the number of erasure sets * 10, which is a good approximation.
2024-06-21 15:22:24 -07:00
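Loading items with a concurrency limit, as described above, is commonly done with a buffered channel used as a semaphore. The sketch below assumes a hypothetical `loadAll` helper and a made-up `erasureSets` value; it illustrates the pattern only and is not the code from #19969.

```go
package main

import (
	"fmt"
	"sync"
)

// loadAll calls load for every bucket with at most `limit` calls in flight
// and returns the results in input order.
func loadAll(buckets []string, limit int, load func(string) string) []string {
	if limit < 1 {
		limit = 1
	}
	out := make([]string, len(buckets))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, b := range buckets {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` loads are running
		go func(i int, b string) {
			defer wg.Done()
			defer func() { <-sem }()
			out[i] = load(b) // each goroutine writes its own slot
		}(i, b)
	}
	wg.Wait()
	return out
}

func main() {
	erasureSets := 2 // hypothetical value for illustration
	res := loadAll([]string{"a", "b", "c"}, erasureSets*10, func(b string) string {
		return "meta:" + b
	})
	fmt.Println(res)
}
```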
Klaus Post 2d7a3d1516
Return error from mergeEntryChannels (#19970)
- Add error from mergeEntryChannels to `results`.
- Make sure we check the context error before we close the channel.
2024-06-21 12:06:51 -07:00
Harshavardhana dfab400d43
reject bootup, if binaries are different in a cluster (#19968) 2024-06-21 07:49:49 -07:00
Pedro Juarez 70078eab10
Fix browser UI animation (#19966)
The browser UI is not showing the animation because the default
content-security-policy does not trust the file https://unpkg.com/detect-gpu@5.0.38/dist/benchmarks/d-apple.json,
which the GPU library needs to identify whether the web browser can play it.
2024-06-20 17:58:58 -07:00
Klaus Post 3415c4dd1e
Fix reconnected deadlock with full queue (#19964)
When a reconnection happens, `handleMessages` must be able to complete and exit.

A full queue can prevent this.

Deadlock chain (May 10th release)

```
1 @ 0x44110e 0x453125 0x109f88c 0x109f7d5 0x10a472c 0x10a3f72 0x10a34ed 0x4795e1
#	0x109f88b	github.com/minio/minio/internal/grid.(*Connection).send+0x3eb			github.com/minio/minio/internal/grid/connection.go:548
#	0x109f7d4	github.com/minio/minio/internal/grid.(*Connection).queueMsg+0x334		github.com/minio/minio/internal/grid/connection.go:586
#	0x10a472b	github.com/minio/minio/internal/grid.(*Connection).handleAckMux+0xab		github.com/minio/minio/internal/grid/connection.go:1284
#	0x10a3f71	github.com/minio/minio/internal/grid.(*Connection).handleMsg+0x231		github.com/minio/minio/internal/grid/connection.go:1211
#	0x10a34ec	github.com/minio/minio/internal/grid.(*Connection).handleMessages.func1+0x6cc	github.com/minio/minio/internal/grid/connection.go:1019

---> blocks ---> via (Connection).handleMsgWg

1 @ 0x44110e 0x454165 0x454134 0x475325 0x486b08 0x10a161a 0x10a1465 0x2470e67 0x7395a9 0x20e61af 0x20e5f1f 0x7395a9 0x22f781c 0x7395a9 0x22f89a5 0x7395a9 0x22f6e82 0x7395a9 0x22f49a2 0x7395a9 0x2206e45 0x7395a9 0x22f4d9c 0x7395a9 0x210ba06 0x7395a9 0x23089c2 0x7395a9 0x22f86e9 0x7395a9 0xd42582 0x2106c04
#	0x475324	sync.runtime_Semacquire+0x24								runtime/sema.go:62
#	0x486b07	sync.(*WaitGroup).Wait+0x47								sync/waitgroup.go:116
#	0x10a1619	github.com/minio/minio/internal/grid.(*Connection).reconnected+0xb9			github.com/minio/minio/internal/grid/connection.go:857
#	0x10a1464	github.com/minio/minio/internal/grid.(*Connection).handleIncoming+0x384			github.com/minio/minio/internal/grid/connection.go:825
```

Add a queue cleaner in reconnected that will pop old messages so `handleMessages` can 
send messages without blocking and exit appropriately for the connection to be re-established.

Messages are likely dropped by the remote, but we may have some that can succeed, 
so we only drop when running out of space.
2024-06-20 16:11:40 -07:00
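The "only drop when running out of space" idea above can be illustrated with a non-blocking send that discards the oldest queued item when the queue is full. This is a sketch with a hypothetical `enqueueDropOldest` helper; the real change adds a queue cleaner inside `reconnected` and is structured differently.

```go
package main

import "fmt"

// enqueueDropOldest sends v without blocking. If the queue is full, it
// discards the oldest queued item and tries again. It returns how many
// items were dropped. The channel must be buffered.
func enqueueDropOldest[T any](q chan T, v T) (dropped int) {
	for {
		select {
		case q <- v:
			return dropped
		default: // queue full, fall through to make room
		}
		select {
		case <-q:
			dropped++
		default: // a consumer drained the queue in the meantime
		}
	}
}

func main() {
	q := make(chan int, 2)
	for i := 1; i <= 4; i++ {
		enqueueDropOldest(q, i)
	}
	fmt.Println(<-q, <-q)
}
```

The sender never blocks, so a goroutine that has to finish before a reconnect can complete is not held up by a full queue.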
Shireesh Anjal e200808ab7
fix errors in metrics code on macos (#19965)
- do not load proc fs metrics in case of macos
- null-check TimeStat before accessing
2024-06-20 10:55:03 -07:00
Klaus Post fae563b85d
Add fixed timed restarts to updates (#19960) 2024-06-20 07:49:22 -07:00
Klaus Post 3e6dc02f8f
Add actual inline data to JSON output in xl-meta (#19958)
Add the inlined data as base64 encoded field and try to add a string version if feasible.

Example:

```
λ xl-meta -data xl.meta
{
  "8e03504e-1123-4957-b272-7bc53eda0d55": {
    "bitrot_valid": true,
    "bytes": 58,
    "data_base64": "Z29sYW5nLm9yZy94L3N5cyB2MC4xNS4wIC8=",
    "data_string": "golang.org/x/sys v0.15.0 /"
  }
}
```

The string will have quotes and newlines escaped to produce valid JSON.

If the content isn't valid UTF-8 or the encoding otherwise fails, only the base64 data will be added.

`-export` can still be used separately to extract the data as files (including bitrot).
2024-06-20 07:46:44 -07:00
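The behavior described above (always emit base64, add a string form only when the content is valid UTF-8) can be sketched as follows. The `inlineData` type and `describe` function are hypothetical; only the field names `data_base64` and `data_string` come from the example output, and this is not the xl-meta source.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// inlineData mirrors two fields of the example output above.
type inlineData struct {
	DataBase64 string `json:"data_base64"`
	DataString string `json:"data_string,omitempty"`
}

// describe always fills the base64 field and adds the string form only
// when the bytes are valid UTF-8.
func describe(data []byte) inlineData {
	d := inlineData{DataBase64: base64.StdEncoding.EncodeToString(data)}
	if utf8.Valid(data) {
		d.DataString = string(data)
	}
	return d
}

func main() {
	// json.Marshal escapes quotes and newlines in the string field.
	out, err := json.MarshalIndent(describe([]byte("golang.org/x/sys v0.15.0 /")), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```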
Anis Eleuch 95e4cbbfde
Do not ping event targets during cluster initialization (#19959)
S3 operations are frozen during startup, therefore we should avoid pinging
event targets during the initialization since it can stall.
2024-06-20 07:46:02 -07:00
Harshavardhana 2825294b7b
allow server startup to come online with READ success (#19957) 2024-06-19 22:21:31 -07:00
Sveinn bce93b5cfa
Removing timeout on shutdown (#19956) 2024-06-19 11:42:47 -07:00
Harshavardhana 7a4b250c8b
avoid waiting for quorum health while debugging (#19955) 2024-06-19 10:12:20 -07:00
Anis Eleuch e5335450a4
test: Healing test to avoid infinite waiting for servers to be up (#19954)
Quit after 15 minutes and print server logs instead
2024-06-19 09:00:38 -07:00
Klaus Post a6ffdf1dd4
Do not block on distributed unlocks (#19952)
* Prevents blocking when losing quorum (standard on cluster restarts).
* Time out to prevent endless buildup. Timed-out remote locks will be canceled because they miss the refresh anyway.
* Reduces latency for all calls since the wall time for the roundtrip to remotes no longer adds to the requests.
2024-06-19 07:35:19 -07:00
Harshavardhana 69e41f87ef
compute localIPs only once per server startup() (#19951)
repeatedly calling this function is not necessary;
on systems with lots of interfaces, including virtual
ones, it can cause noticeable delays.
2024-06-19 07:34:00 -07:00
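Computing a value once per process and reusing it, as described above, is typically done with `sync.Once`. The sketch below uses hypothetical names (`getLocalIPs`, `lookups`) and a simplified interface scan; it is not the MinIO implementation.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

var (
	localIPsOnce sync.Once
	localIPs     []string
	lookups      int // counts how often the interfaces were actually scanned
)

// getLocalIPs scans the network interfaces on the first call only and
// returns the cached result on every later call.
func getLocalIPs() []string {
	localIPsOnce.Do(func() {
		lookups++
		addrs, err := net.InterfaceAddrs()
		if err != nil {
			return // leave the cache empty on error
		}
		for _, a := range addrs {
			if ipnet, ok := a.(*net.IPNet); ok {
				localIPs = append(localIPs, ipnet.IP.String())
			}
		}
	})
	return localIPs
}

func main() {
	first := getLocalIPs()
	second := getLocalIPs()
	fmt.Println(len(first) == len(second), lookups)
}
```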
Harshavardhana ee48f9f206
perform healthchecks before initializing everything fully (#19953)
adds more informative logs that provide details on which
erasure set is losing quorum etc.
2024-06-19 07:33:40 -07:00
Sveinn 9ba39d7fad
Removing a channel that was not being used (#19948) 2024-06-19 01:59:39 -07:00
Harshavardhana d2fb371f80
do not need response record body (#19949)
since the connection is active, the
response recorder body can grow endlessly,
causing a leak, as this bytes buffer is
never given back to the GC due to a goroutine.
2024-06-19 01:59:21 -07:00
Klaus Post 2f9018f03b
Do regular checks for healing status while scanning (#19946) 2024-06-18 09:11:04 -07:00
Harshavardhana eb990f64a9 update pkger to use v2.3.1 2024-06-17 22:48:41 -07:00
Harshavardhana bbb64eaade
skip healing properly in the scanner when a drive is hotplugged (#19939)
skip healing properly in scanner when drive is hotplugged

due to how the state is passed around, SkipHealing
might not always be the true state() of the system, causing
a situation where we might start healing from the scanner on the
same drive that is already being healed. Because of this, competing
heals get triggered that slow each other down.
2024-06-17 16:39:11 -07:00
Harshavardhana 7bd1d899bc
remove overzealous check during HEAD() (#19940)
due to a historic bug in CopyObject() where
an inlined object loses its metadata, the
check causes an incorrect fallback verifying
data-dir.

CopyObject() bug was fixed in ffa91f9794 however
the occurrence of this problem is historic, so
the aforementioned check is stretching too much.

Bonus: simplify fileInfoRaw() to read xl.json as well,
also recreate buckets properly.
2024-06-17 07:29:18 -07:00
Harshavardhana c91d1ec2e3
fix: avoid metadata cache without data for all callers (#19935) 2024-06-14 06:28:35 -07:00
Minio Trusted c50b64027d Update yaml files to latest version RELEASE.2024-06-13T22-53-53Z 2024-06-14 05:40:03 +00:00
Cesar N 20960b6a2d
Update console to v1.6.0 (#19933) 2024-06-13 15:53:53 -07:00
Shubhendu 3bd3470d0b
Corrected names of node replication metrics (#19932)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-13 15:26:54 -07:00