Commit Graph

11614 Commits

Author SHA1 Message Date
Klaus Post e06168596f
Convert more peer <--> peer REST calls (#19004)
* Convert more peer <--> peer REST calls
* Clean up in general.
* Add JSON wrapper.
* Add slice wrapper.
* Add option to make handler return nil error if no connection is given, `IgnoreNilConn`.

Converts the following:

```
+	HandlerGetMetrics
+	HandlerGetResourceMetrics
+	HandlerGetMemInfo
+	HandlerGetProcInfo
+	HandlerGetOSInfo
+	HandlerGetPartitions
+	HandlerGetNetInfo
+	HandlerGetCPUs
+	HandlerServerInfo
+	HandlerGetSysConfig
+	HandlerGetSysServices
+	HandlerGetSysErrors
+	HandlerGetAllBucketStats
+	HandlerGetBucketStats
+	HandlerGetSRMetrics
+	HandlerGetPeerMetrics
+	HandlerGetMetacacheListing
+	HandlerUpdateMetacacheListing
+	HandlerGetPeerBucketMetrics
+	HandlerStorageInfo
+	HandlerGetLocks
+	HandlerBackgroundHealStatus
+	HandlerGetLastDayTierStats
+	HandlerSignalService
+	HandlerGetBandwidth
```
2024-02-19 14:54:46 -08:00
Harshavardhana 4c8197a119
reject expired STS credentials early without decoding sessionToken (#19072) 2024-02-19 07:34:10 -08:00
Minio Trusted 23c10350f3 Update yaml files to latest version RELEASE.2024-02-17T01-15-57Z 2024-02-17 01:37:10 +00:00
Harshavardhana b6e98aed01
fix: found races in accessing globalLocalDrives (#19069)
make a copy before accessing globalLocalDrives

Bonus: update console v0.46.0

Signed-off-by: Harshavardhana <harsha@minio.io>
2024-02-16 17:15:57 -08:00
Anis Eleuch 00dcba9ddd
Fix typo in jwt skewed date/time error (#19066) 2024-02-16 10:48:30 -08:00
Harshavardhana 607cafadbc
converge clusterRead health into cluster health (#19063) 2024-02-15 16:48:36 -08:00
Anis Eleuch 68dde2359f
log: Add logger.Event to send to console and other logger targets (#19060)
Add a new function logger.Event() to send the log to Console and
http/kafka log webhooks. This will include some internal events such as
disk healing and rebalance/decommissioning
2024-02-15 15:13:30 -08:00
Poorna f9dbf41e27
sr: add validation to disallow updating bandwidth limit on self (#19062) 2024-02-15 13:03:40 -08:00
Krishnan Parthasarathi 7405760f44
Refresh tier config periodically (#19049)
- Increase the parity for tier-config.bin object
- Refresh globalTierConfigMgr cached value once every 15 mins
2024-02-15 11:52:44 -08:00
Harshavardhana 7e4a6b4bcd
remove rename2 entirely, avoids the risk of moving data (#19058) 2024-02-14 17:09:38 -08:00
Minio Trusted b5791e6f28 Update yaml files to latest version RELEASE.2024-02-14T21-36-02Z 2024-02-14 21:51:12 +00:00
Harshavardhana 00cb58eaf3
add customer specific hotfixes to 'registry.min.dev' (#19057)
```
REPO="registry.min.dev/<customer>" CRED_DIR=/media/builder/minio make docker-hotfix-push
```
2024-02-14 13:36:02 -08:00
Harshavardhana f961ec4aaf
fix: revert allow offline disks on fresh start (#19052)
the PR in #16541 was incorrect and hand wrong assumptions
about the overall setup, revert this since this expectation
to have offline servers is wrong and we can end up with a
bigger chicken and egg problem.

This reverts commit 5996c8c4d5.

Bonus:

- preserve disk in globalLocalDrives properly upon connectDisks()
- do not return 'nil' from newXLStorage(), getting it ready for
  the next set of changes for 'format.json' loading.
2024-02-14 10:37:34 -08:00
Harshavardhana 134db72bb7
fix: reject service account access key same as root credentials (#19055) 2024-02-14 10:37:12 -08:00
Harshavardhana 6fd0b434e2
upgrade all deps (#19041) 2024-02-14 09:51:34 -08:00
Harshavardhana effe21f3eb
send correct objectname in audit events for DeleteAll ILM (#19053) 2024-02-14 08:07:58 -08:00
Praveen raj Mani 1118b285d3
fix: race in deleting objects during batch expiry (#19054) 2024-02-14 08:07:44 -08:00
Poorna 912a0031b7
fix sr tests to capture all server logs (#19051) 2024-02-13 20:51:23 -08:00
Aditya Manthramurthy a14e192376
fix: remove unnecessary panic in iam-store (#19050) 2024-02-13 19:29:36 -08:00
Minio Trusted f8e15e7d09 Update yaml files to latest version RELEASE.2024-02-13T15-35-11Z 2024-02-13 16:01:38 +00:00
Shireesh Anjal 7b9f9e0628
fix incorrect disk io stats in k8s environment (#19016)
The previous logic of calculating per second values for disk io stats
divides the stats by the host uptime. This doesn't work in k8s
environment as the uptime is of the pod, but the stats (from
/proc/diskstats) are from the host.

Fix this by storing the initial values of uptime and the stats at the
timme of server startup, and using the difference between current and
initial values when calculating the per second values.
2024-02-13 07:35:11 -08:00
Praveen raj Mani ac8e9ce04f
Send a bucket notification event on DeleteObject() for non-existing object (#19037)
Send a bucket notification event on DeleteObject for non-existing objects
2024-02-13 07:34:17 -08:00
Praveen raj Mani cfd8645843
fix: update batch replication stats for snowball uploads (#19045) 2024-02-13 07:33:27 -08:00
Harshavardhana 0c068b15c7
add missing handler for reloading site replication config on peers (#19042) 2024-02-13 06:55:54 -08:00
Anis Eleuch 30a466aa71
sts: Add test for DurationSeconds condition (#19044) 2024-02-13 06:55:37 -08:00
Taran Pelkey 4d94609c44
FIx unexpected behavior when creating service account (#19036) 2024-02-13 02:31:43 -08:00
Minio Trusted 6b63123ca9 Update yaml files to latest version RELEASE.2024-02-12T21-02-27Z 2024-02-12 21:40:49 +00:00
Poorna 0cc9fb73e1
metrics: fix typo in namespace for proxy tagging metric (#19039)
Relevant PR introducing this metric: #18957
2024-02-12 13:02:27 -08:00
Harshavardhana eac4e4b279
honor replaced disk properly by updating globalLocalDrives (#19038)
globalLocalDrives seem to be not updated during the
HealFormat() leads to a requirement where the server
needs to be restarted for the healing to continue.
2024-02-12 13:00:20 -08:00
Harshavardhana 6d381f7c0a
relax pre-emptive GetBucketInfo() for multi-object delete (#19035) 2024-02-12 08:46:46 -08:00
Anis Eleuch 4fa06aefc6
Convert service account add/update expiration to cond values (#19024)
In order to force some users allowed to create or update a service
account to provide an expiration satifying the user policy conditions.
2024-02-12 08:36:16 -08:00
Harshavardhana 0e177a44e0
preserve conflicting objects when parent object is being deleted (#19034)
a/prefix
a/prefix/1.txt

where `a/prefix` is an object which does not have `/` at the end,
we do not have to aggressively recursively delete all the sub-folders
as well. Instead convert the call into self contained to deleting
'xl.meta' and then subsequently attempting to Remove the parent.
2024-02-12 08:30:40 -08:00
Harshavardhana afd19de5a9
fix: allow configuring excess versions alerting (#19028)
Bonus: enable audit alerts for object versions
beyond the configured value, default is '100'
versions per object beyond which scanner will
alert for each such objects.
2024-02-11 23:41:53 -08:00
Harshavardhana e3fbac9e24
do not have to use the same distributionAlgo as first pool (#19031)
when we expand via pools, there is no reason to stick
with the same distributionAlgo as the rest. Since the
algo only makes sense with-in a pool not across pools.

This allows for newer pools to use newer codepaths to
avoid legacy file lookups when they have a pre-existing
deployment from 2019, they can expand their new pool
to be of a newer distribution format, allowing the
pool to be more performant.
2024-02-11 23:21:56 -08:00
Poorna a9cf32811c
Fix panic in tagging request proxying (#19032) 2024-02-11 18:18:43 -08:00
Harshavardhana 53997ecc79
avoid excessive logging for objects that do not exist (#19030)
in replicated setups, that have proxying enabled for
replicated buckets.
2024-02-11 14:21:08 -08:00
Minio Trusted 8e69f3cb89 Update yaml files to latest version RELEASE.2024-02-09T21-25-16Z 2024-02-09 22:48:07 +00:00
Harshavardhana 997ba3a574
introduce reader deadlines for net.Conn (#19023)
Bonus: set "retry-after" header for AWS SDKs if possible to honor them.
2024-02-09 13:25:16 -08:00
Klaus Post 8e68ff9321
Add extra disconnect safety (#19022)
Fix reported races that are actually synchronized by network calls.

But this should add some extra safety for untimely disconnects.

Race reported:

```
WARNING: DATA RACE
Read at 0x00c00171c9c0 by goroutine 214:
  github.com/minio/minio/internal/grid.(*muxClient).addResponse()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:519 +0x111
  github.com/minio/minio/internal/grid.(*muxClient).error()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:470 +0x21d
  github.com/minio/minio/internal/grid.(*Connection).handleDisconnectClientMux()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1391 +0x15b
  github.com/minio/minio/internal/grid.(*Connection).handleMsg()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1190 +0x1ab
  github.com/minio/minio/internal/grid.(*Connection).handleMessages.func1()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:981 +0x610

Previous write at 0x00c00171c9c0 by goroutine 1081:
  github.com/minio/minio/internal/grid.(*muxClient).roundtrip()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:94 +0x324
  github.com/minio/minio/internal/grid.(*muxClient).traceRoundtrip()
      e:/gopath/src/github.com/minio/minio/internal/grid/trace.go:74 +0x10e4
  github.com/minio/minio/internal/grid.(*Subroute).Request()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:366 +0x230
  github.com/minio/minio/internal/grid.(*SingleHandler[go.shape.*github.com/minio/minio/cmd.DiskInfoOptions,go.shape.*github.com/minio/minio/cmd.DiskInfo]).Call()
      e:/gopath/src/github.com/minio/minio/internal/grid/handlers.go:554 +0x3fd
  github.com/minio/minio/cmd.(*storageRESTClient).DiskInfo()
      e:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:314 +0x270
  github.com/minio/minio/cmd.erasureObjects.getOnlineDisksWithHealingAndInfo.func1()
      e:/gopath/src/github.com/minio/minio/cmd/erasure.go:293 +0x171
```

This read will always happen after the write, since there is a network call in between.

However a disconnect could come in while we are setting up the call, so we protect against that with extra checks.
2024-02-09 08:43:38 -08:00
Harshavardhana 62761a23e6
remove unnecessary metrics in 'mc admin info' output (#19020)
Reduce the amount of data transfer on large deployments
2024-02-08 19:28:46 -08:00
Harshavardhana 404d8b3084
fix: dangling objects honor parityBlocks instead of dataBlocks (#19019)
Bonus: do not recreate buckets if NoRecreate is asked.
2024-02-08 15:22:16 -08:00
Klaus Post 6005ad3d48
Fix shared top locks client (#19018)
`client` is shared across goroutines.

Seen with `mc support top locks` on minio built with `-race`.
2024-02-08 12:28:05 -08:00
Harshavardhana 035a3ea4ae
optimize startup sequence performance (#19009)
- bucket metadata does not need to look for legacy things
  anymore if b.Created is non-zero

- stagger bucket metadata loads across lots of nodes to
  avoid the current thundering herd problem.

- Remove deadlines for RenameData, RenameFile - these
  calls should not ever be timed out and should wait
  until completion or wait for client timeout. Do not
  choose timeouts for applications during the WRITE phase.

- increase R/W buffer size, increase maxMergeMessages to 30
2024-02-08 11:21:21 -08:00
Klaus Post 7ec43bd177
Fix blocked streams blocking reconnects (#19017)
We have observed cases where a blocked stream will block for cancellations.

This happens when response channel is blocked and we want to push an error.
This will have the response mutex locked, which will prevent all other operations until upstream is unblocked.

Make this behavior non-blocking and if blocked spawn a goroutine that will send the response and close the output.

Still a lot of "dancing". Added a test for this and reviewed.
2024-02-08 10:15:27 -08:00
Aditya Manthramurthy a29c66ed74
Update IAM access manager plugin demo (#19007)
Now prints the JSON payload for easier debugging.
2024-02-08 09:15:20 -08:00
Aditya Manthramurthy e104b183d8
fix: skip policy usage validation for cache update (#19008)
When updating the policy cache, we do not need to validate policy usage
as the policy has already been deleted by the node sending the
notification.
2024-02-07 20:39:53 -08:00
Klaus Post 7e082f232e
Add GetBucketInfo toStorageErr conversion (#19005)
Convert error to storageError since it is used for quorum calculations here: ff80cfd83d/cmd/peer-s3-client.go (L339)
2024-02-07 14:24:24 -08:00
Harshavardhana d28bf71f25
listing must return WalkDir() errors first (#19006) 2024-02-07 13:20:07 -08:00
Harshavardhana 5b1a74b6b2
do not block iam.store registration (#18999)
current implementation would quite simply
block the sys.store registration, making
sys.Initialized() call to be blocked.
2024-02-07 12:41:58 -08:00
Minio Trusted eead4db1d2 Update yaml files to latest version RELEASE.2024-02-06T21-36-22Z 2024-02-07 12:11:36 +00:00