Commit Graph

5932 Commits

Author SHA1 Message Date
Harshavardhana 74ccee6619
avoid too much auditing during decom/rebalance make it more robust (#19174)
there can be a sudden spike in tiny allocations,
due to too much auditing being done, also don't hang
on the

```
h.logCh <- entry
```

after initializing workers if you do not have a way to
dequeue for some reason.
2024-03-06 03:43:16 -08:00
Poorna 89f759566c
bucket import: avoid overwriting bucket creation date (#19207) 2024-03-05 16:05:28 -08:00
Harshavardhana cd7551031b
fix: a regression in loading replication creds (#19204)
fixes #19200

generating STS credentials fail with site-replicated
setup, with this error on a fresh environment.
2024-03-05 11:06:17 -08:00
Praveen raj Mani df57bfcd6c
fix: cluster read health check to return proper values (#19203)
Fixes #19202
2024-03-05 10:25:49 -08:00
Justin Griffin dfb1f39b57
Support custom endpoint for Azure remote storage tier (#19188)
This commits adds support for using the `--endpoint` arg when creating a
tier of type `azure`. This is needed to connect to Azure's Gov Cloud
instance.  For example,

```
mc ilm tier add azure TARGET TIER_NAME \
   --account-name ACCOUNT \
   --account-key KEY \
   --bucket CONTAINER \
   --endpoint https://ACCOUNT.blob.core.usgovcloudapi.net
   --prefix PREFIX \
   --storage-class STORAGE_CLASS
```

Prior to this, the endpoint was hardcoded to `https://ACCOUNT.blob.core.windows.net`.
The docs were even explicit about this, stating that `--endpoint` is:

  "Required for `s3` or `minio` tier types. This option has no effect for any
  other value of `TIER_TYPE`."

Now, if the endpoint arg is present it will be used.  If not, it will
fall back to the same default behavior of `ACCOUNT.blob.core.windows.net`.
2024-03-05 08:44:08 -08:00
Harshavardhana 1b5f28e99b
fix: skip local disks properly in cluster health maintenance check (#19184) 2024-03-04 20:48:44 -08:00
Krishnan Parthasarathi b69bcdcdc4
Fix ilm config at startup (#19189)
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment 

https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
2024-03-04 18:50:24 -08:00
Harshavardhana e385f54185
fix: nLink is unreliable on all filesystems (#19187)
ext4, xfs support this behavior however
btrfs, nfs may not support it properly.

in-case when we see Nlink < 2 then we know
that we need to fallback on readdir()

fixes a regression from #19100

fixes #19181
2024-03-04 15:58:35 -08:00
Aditya Manthramurthy 9a4d003ac7
Add common middleware to S3 API handlers (#19171)
The middleware sets up tracing, throttling, gzipped responses and
collecting API stats.

Additionally, this change updates the names of handler functions in
metric labels to be the same as the name derived from Go lang reflection
on the handler name.

The metric api labels are now stored in memory the same as the handler
name - they will be camelcased, e.g. `GetObject` instead of `getobject`.

For compatibility, we lowercase the metric api label values when emitting the metrics.
2024-03-04 10:05:56 -08:00
Praveen raj Mani d5656eeb65
fix: healthcheck to fail even if one erasure set doesn't have quorum (#19180)
fix: healthcheck to return false even if one erasure set doesn't have quorum
2024-03-04 08:34:14 -08:00
Harshavardhana 6d08af61a0
for root disks add additional information in the error log (#19177) 2024-03-02 23:45:39 -08:00
Krishnan Parthasarathi a7577da768
Improve expiration of tiered objects (#18926)
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
  the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
  @klauspost
- Add error logs when expiryState and transition state are not
  initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
2024-03-01 21:11:03 -08:00
Harshavardhana 325fd80687
add retry logic upto 3 times for policy map and policy (#19173) 2024-03-01 16:21:34 -08:00
Andreas Auernhammer 09626d78ff
automatically generate root credentials with KMS (#19025)
With this commit, MinIO generates root credentials automatically
and deterministically if:

 - No root credentials have been set.
 - A KMS (KES) is configured.
 - API access for the root credentials is disabled (lockdown mode).

Before, MinIO defaults to `minioadmin` for both the access and
secret keys. Now, MinIO generates unique root credentials
automatically on startup using the KMS.

Therefore, it uses the KMS HMAC function to generate pseudo-random
values. These values never change as long as the KMS key remains
the same, and the KMS key must continue to exist since all IAM data
is encrypted with it.

Backward compatibility:

This commit should not cause existing deployments to break. It only
changes the root credentials of deployments that have a KMS configured
(KES, not a static key) but have not set any admin credentials. Such
implementations should be rare or not exist at all.

Even if the worst case would be updating root credentials in mc
or other clients used to administer the cluster. Root credentials
are anyway not intended for regular S3 operations.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-03-01 13:09:42 -08:00
Anis Eleuch 8f03c6e0db
xl: Avoid called getdents for folders in listing (#19100) 2024-03-01 08:01:28 -08:00
Harshavardhana 2c2f5d871c
debug: introduce support for configuring client connect WRITE deadline (#19170)
just like client-conn-read-deadline, added a new flag that does
client-conn-write-deadline as well.

Both are not configured by default, since we do not yet know
what is the right value. Allow this to be configurable if needed.
2024-03-01 08:00:42 -08:00
Harshavardhana c599c11e70
fix: relax metadata checks for healing (#19165)
we should do this to ensure that we focus on
data healing as primary focus, fixing metadata
as part of healing must be done but making
data available is the main focus.

the main reason is metadata inconsistencies can
cause data availability issues, which must be
avoided at all cost.

will be bringing in an additional healing mechanism
that involves "metadata-only" heal, for now we do
not expect to have these checks.

continuation of #19154

Bonus: add a pro-active healthcheck to perform a connection
2024-02-29 22:49:01 -08:00
Aditya Manthramurthy 6769d4dd54
Update API label names for metrics (#19162)
This change makes the label names consistent with the handler names.
This is in preparation to use reflection based API handler function
names for the api labels so they will be the same as tracing, auditing
and logging names for these API calls.
2024-02-29 16:14:27 -08:00
Harshavardhana d7520f0ae6
fix: make sure maintenance=true is honored properly (#19156)
fixes a regression from #18700
2024-02-29 08:37:57 -08:00
Harshavardhana 44b70eb646
allow creating missing parent folders during moveToTrash() (#19155) 2024-02-29 08:28:33 -08:00
Harshavardhana 467714f33b
ignore x-amz-storage-class when its set to STANDARD (#19154)
fixes #19135
2024-02-28 17:44:30 -08:00
Harshavardhana f8696cc8f6 fallback to globalLocalDrives for non-distributed setups 2024-02-28 14:56:08 -08:00
Anis Eleuch 9a7c7ab2d0
fix: parsing v2 and v1 cgroup memory limit (#19153)
Trim the newline at the end of the sysfs memory limit.
2024-02-28 14:52:20 -08:00
Harshavardhana 51874a5776
fix: allow DNS disconnection events to happen in k8s (#19145)
in k8s things really do come online very asynchronously,
we need to use implementation that allows this randomness.

To facilitate this move WriteAll() as part of the
websocket layer instead.

Bonus: avoid instances of dnscache usage on k8s
2024-02-28 09:54:52 -08:00
Aditya Manthramurthy 62ce52c8fd
cachevalue: simplify exported interface (#19137)
- Also add cache options type
2024-02-28 09:09:09 -08:00
Anis Eleuch 2bdb9511bd
heal: Add skipped objects to the heal summary (#19142)
New disk healing code skips/expires objects that ILM supposed to expire.
Add more visibility to the user about this activity by calculating those
objects and print it at the end of healing activity.
2024-02-28 09:05:40 -08:00
Harshavardhana 9a012a53ef
initialize the disk healer early on (#19143)
This PR fixes a bug that perhaps has been long introduced,
with no visible workarounds. In any deployment, if an entire
erasure set is deleted, there is no way the cluster recovers.
2024-02-27 23:02:14 -08:00
Harshavardhana 1dd8ef09a6
remove unnecessary 'recreate' code (#19136) 2024-02-27 01:47:58 -08:00
Poorna b1351e2dee sr: use site replicator svcacct to sign STS session tokens (#19111)
This change is to decouple need for root credentials to match between
 site replication deployments.

 Also ensuring site replication config initialization is re-tried until
 it succeeds, this deoendency is critical to STS flow in site replication
 scenario.
2024-02-26 13:30:28 -08:00
Praveen raj Mani 30c2596512
Read drive IO stats from sysfs instead of procfs (#19131)
Currently, we read from `/proc/diskstats` which is found to be
un-reliable in k8s environments. We can read from `sysfs` instead.

Also, cache the latest drive io stats to find the diff and update
the metrics.
2024-02-26 11:34:50 -08:00
Klaus Post 2b5e4b853c
Improve caching (#19130)
* Remove lock for cached operations.
* Rename "Relax" to `ReturnLastGood`.
* Add `CacheError` to allow caching values even on errors.
* Add NoWait that will return current value with async fetching if within 2xTTL.
* Make benchmark somewhat representative.

```
Before: BenchmarkCache-12       16408370                63.12 ns/op            0 B/op
After:  BenchmarkCache-12       428282187                2.789 ns/op           0 B/op
```

* Remove `storageRESTClient.scanning`. Nonsensical - RPC clients will not have any idea about scanning.
* Always fetch remote diskinfo metrics and cache them. Seems most calls are requesting metrics.
* Do async fetching of usage caches.
2024-02-26 10:49:19 -08:00
Harshavardhana 92788e4cf4
fix: re-arrange console-sys to log properly in k8s/docker (#19129)
fixes #19125
2024-02-26 01:33:48 -08:00
Harshavardhana 8a698fef71
fix: crash in ResourceMetrics RPC handling concurrent writers (#19123)
Continuation of #19103 that had fixed the crash in peer metrics for cluster endpoint.
2024-02-25 00:51:38 -08:00
Harshavardhana c2b54d92f6
allow all disk full errors to be handled (#19117) 2024-02-24 09:11:14 -08:00
Harshavardhana f965434022
fix: re-use endpoint strings to avoid allocation during audit (#19116) 2024-02-23 16:19:13 -08:00
Harshavardhana a3ac62596c
move timedValue -> cachevalue package (#19114) 2024-02-23 13:28:14 -08:00
Harshavardhana 2faba02d6b
fix: allow diskInfo at storageRPC to be cached (#19112)
Bonus: convert timedValue into a typed implementation
2024-02-23 09:21:38 -08:00
Krishnan Parthasarathi ee158e1610
ilm: Update action count only on success (#19093)
It also fixes a long-standing bug in expiring transitioned objects.
The expiration action was deleting the current version in the case'
of tiered objects instead of adding a delete marker.
2024-02-22 15:00:32 -08:00
Anis Eleuch fa68efb1e7
s3: CopyObject to disallow invalid dest object names (#19110)
By not doing so, objects can risk being in a wrong erasure set if the
destination object name contains e.g. '//'
2024-02-22 10:05:17 -08:00
Anis Eleuch 8c53a4405a
Add audit for folder excess (#19109)
Also replace ilm:expiry with scanner to avoid user confusion
2024-02-22 08:18:13 -08:00
Harshavardhana c32f699105
turn-off md5sum for SSE-KMS/SSE-C as optimization for multipart (#19106)
only enable md5sum if explicitly asked by the client, otherwise
its not necessary to compute md5sum when SSE-KMS/SSE-C is enabled.

this is continuation of #17958
2024-02-22 04:24:11 -08:00
Harshavardhana 53aa8f5650
use typos instead of codespell (#19088) 2024-02-21 22:26:06 -08:00
Klaus Post 92180bc793
Add array recycling safety (#19103)
Nil entries when recycling arrays.
2024-02-21 12:27:35 -08:00
Poorna 526b829a09
site replication: Disallow removal of site-replicator account (#19092) 2024-02-21 02:09:33 -08:00
Anis Eleuch 9ea5d08ecd
site-repl: Fix endpoint in the error with unexpected deployment-id (#19086) 2024-02-20 15:02:35 -08:00
Harshavardhana 35deb1a8e2
do not block on send channels under high load (#19090)
all send channels must compete with `ctx` if not
they will perpetually stay alive.
2024-02-20 15:00:35 -08:00
Harshavardhana c7f7c47388
allow renames() for inlined writes without data-dir (#18801)
data-dir not being present is okay, however we can still
rely on the `rename()` atomic call instead of relying on
write xl.meta write which may truncate the io.EOF.
2024-02-20 07:05:57 -08:00
Klaus Post e06168596f
Convert more peer <--> peer REST calls (#19004)
* Convert more peer <--> peer REST calls
* Clean up in general.
* Add JSON wrapper.
* Add slice wrapper.
* Add option to make handler return nil error if no connection is given, `IgnoreNilConn`.

Converts the following:

```
+	HandlerGetMetrics
+	HandlerGetResourceMetrics
+	HandlerGetMemInfo
+	HandlerGetProcInfo
+	HandlerGetOSInfo
+	HandlerGetPartitions
+	HandlerGetNetInfo
+	HandlerGetCPUs
+	HandlerServerInfo
+	HandlerGetSysConfig
+	HandlerGetSysServices
+	HandlerGetSysErrors
+	HandlerGetAllBucketStats
+	HandlerGetBucketStats
+	HandlerGetSRMetrics
+	HandlerGetPeerMetrics
+	HandlerGetMetacacheListing
+	HandlerUpdateMetacacheListing
+	HandlerGetPeerBucketMetrics
+	HandlerStorageInfo
+	HandlerGetLocks
+	HandlerBackgroundHealStatus
+	HandlerGetLastDayTierStats
+	HandlerSignalService
+	HandlerGetBandwidth
```
2024-02-19 14:54:46 -08:00
Harshavardhana 4c8197a119
reject expired STS credentials early without decoding sessionToken (#19072) 2024-02-19 07:34:10 -08:00
Harshavardhana b6e98aed01
fix: found races in accessing globalLocalDrives (#19069)
make a copy before accessing globalLocalDrives

Bonus: update console v0.46.0

Signed-off-by: Harshavardhana <harsha@minio.io>
2024-02-16 17:15:57 -08:00
Anis Eleuch 00dcba9ddd
Fix typo in jwt skewed date/time error (#19066) 2024-02-16 10:48:30 -08:00
Harshavardhana 607cafadbc
converge clusterRead health into cluster health (#19063) 2024-02-15 16:48:36 -08:00
Anis Eleuch 68dde2359f
log: Add logger.Event to send to console and other logger targets (#19060)
Add a new function logger.Event() to send the log to Console and
http/kafka log webhooks. This will include some internal events such as
disk healing and rebalance/decommissioning
2024-02-15 15:13:30 -08:00
Poorna f9dbf41e27
sr: add validation to disallow updating bandwidth limit on self (#19062) 2024-02-15 13:03:40 -08:00
Krishnan Parthasarathi 7405760f44
Refresh tier config periodically (#19049)
- Increase the parity for tier-config.bin object
- Refresh globalTierConfigMgr cached value once every 15 mins
2024-02-15 11:52:44 -08:00
Harshavardhana 7e4a6b4bcd
remove rename2 entirely, avoids the risk of moving data (#19058) 2024-02-14 17:09:38 -08:00
Harshavardhana f961ec4aaf
fix: revert allow offline disks on fresh start (#19052)
the PR in #16541 was incorrect and hand wrong assumptions
about the overall setup, revert this since this expectation
to have offline servers is wrong and we can end up with a
bigger chicken and egg problem.

This reverts commit 5996c8c4d5.

Bonus:

- preserve disk in globalLocalDrives properly upon connectDisks()
- do not return 'nil' from newXLStorage(), getting it ready for
  the next set of changes for 'format.json' loading.
2024-02-14 10:37:34 -08:00
Harshavardhana 134db72bb7
fix: reject service account access key same as root credentials (#19055) 2024-02-14 10:37:12 -08:00
Harshavardhana effe21f3eb
send correct objectname in audit events for DeleteAll ILM (#19053) 2024-02-14 08:07:58 -08:00
Praveen raj Mani 1118b285d3
fix: race in deleting objects during batch expiry (#19054) 2024-02-14 08:07:44 -08:00
Aditya Manthramurthy a14e192376
fix: remove unnecessary panic in iam-store (#19050) 2024-02-13 19:29:36 -08:00
Minio Trusted f8e15e7d09 Update yaml files to latest version RELEASE.2024-02-13T15-35-11Z 2024-02-13 16:01:38 +00:00
Shireesh Anjal 7b9f9e0628
fix incorrect disk io stats in k8s environment (#19016)
The previous logic of calculating per second values for disk io stats
divides the stats by the host uptime. This doesn't work in k8s
environment as the uptime is of the pod, but the stats (from
/proc/diskstats) are from the host.

Fix this by storing the initial values of uptime and the stats at the
timme of server startup, and using the difference between current and
initial values when calculating the per second values.
2024-02-13 07:35:11 -08:00
Praveen raj Mani ac8e9ce04f
Send a bucket notification event on DeleteObject() for non-existing object (#19037)
Send a bucket notification event on DeleteObject for non-existing objects
2024-02-13 07:34:17 -08:00
Praveen raj Mani cfd8645843
fix: update batch replication stats for snowball uploads (#19045) 2024-02-13 07:33:27 -08:00
Harshavardhana 0c068b15c7
add missing handler for reloading site replication config on peers (#19042) 2024-02-13 06:55:54 -08:00
Anis Eleuch 30a466aa71
sts: Add test for DurationSeconds condition (#19044) 2024-02-13 06:55:37 -08:00
Taran Pelkey 4d94609c44
FIx unexpected behavior when creating service account (#19036) 2024-02-13 02:31:43 -08:00
Poorna 0cc9fb73e1
metrics: fix typo in namespace for proxy tagging metric (#19039)
Relevant PR introducing this metric: #18957
2024-02-12 13:02:27 -08:00
Harshavardhana eac4e4b279
honor replaced disk properly by updating globalLocalDrives (#19038)
globalLocalDrives seem to be not updated during the
HealFormat() leads to a requirement where the server
needs to be restarted for the healing to continue.
2024-02-12 13:00:20 -08:00
Harshavardhana 6d381f7c0a
relax pre-emptive GetBucketInfo() for multi-object delete (#19035) 2024-02-12 08:46:46 -08:00
Anis Eleuch 4fa06aefc6
Convert service account add/update expiration to cond values (#19024)
In order to force some users allowed to create or update a service
account to provide an expiration satifying the user policy conditions.
2024-02-12 08:36:16 -08:00
Harshavardhana 0e177a44e0
preserve conflicting objects when parent object is being deleted (#19034)
a/prefix
a/prefix/1.txt

where `a/prefix` is an object which does not have `/` at the end,
we do not have to aggressively recursively delete all the sub-folders
as well. Instead convert the call into self contained to deleting
'xl.meta' and then subsequently attempting to Remove the parent.
2024-02-12 08:30:40 -08:00
Harshavardhana afd19de5a9
fix: allow configuring excess versions alerting (#19028)
Bonus: enable audit alerts for object versions
beyond the configured value, default is '100'
versions per object beyond which scanner will
alert for each such objects.
2024-02-11 23:41:53 -08:00
Harshavardhana e3fbac9e24
do not have to use the same distributionAlgo as first pool (#19031)
when we expand via pools, there is no reason to stick
with the same distributionAlgo as the rest. Since the
algo only makes sense with-in a pool not across pools.

This allows for newer pools to use newer codepaths to
avoid legacy file lookups when they have a pre-existing
deployment from 2019, they can expand their new pool
to be of a newer distribution format, allowing the
pool to be more performant.
2024-02-11 23:21:56 -08:00
Poorna a9cf32811c
Fix panic in tagging request proxying (#19032) 2024-02-11 18:18:43 -08:00
Harshavardhana 53997ecc79
avoid excessive logging for objects that do not exist (#19030)
in replicated setups, that have proxying enabled for
replicated buckets.
2024-02-11 14:21:08 -08:00
Harshavardhana 997ba3a574
introduce reader deadlines for net.Conn (#19023)
Bonus: set "retry-after" header for AWS SDKs if possible to honor them.
2024-02-09 13:25:16 -08:00
Harshavardhana 62761a23e6
remove unnecessary metrics in 'mc admin info' output (#19020)
Reduce the amount of data transfer on large deployments
2024-02-08 19:28:46 -08:00
Harshavardhana 404d8b3084
fix: dangling objects honor parityBlocks instead of dataBlocks (#19019)
Bonus: do not recreate buckets if NoRecreate is asked.
2024-02-08 15:22:16 -08:00
Klaus Post 6005ad3d48
Fix shared top locks client (#19018)
`client` is shared across goroutines.

Seen with `mc support top locks` on minio built with `-race`.
2024-02-08 12:28:05 -08:00
Harshavardhana 035a3ea4ae
optimize startup sequence performance (#19009)
- bucket metadata does not need to look for legacy things
  anymore if b.Created is non-zero

- stagger bucket metadata loads across lots of nodes to
  avoid the current thundering herd problem.

- Remove deadlines for RenameData, RenameFile - these
  calls should not ever be timed out and should wait
  until completion or wait for client timeout. Do not
  choose timeouts for applications during the WRITE phase.

- increase R/W buffer size, increase maxMergeMessages to 30
2024-02-08 11:21:21 -08:00
Aditya Manthramurthy e104b183d8
fix: skip policy usage validation for cache update (#19008)
When updating the policy cache, we do not need to validate policy usage
as the policy has already been deleted by the node sending the
notification.
2024-02-07 20:39:53 -08:00
Klaus Post 7e082f232e
Add GetBucketInfo toStorageErr conversion (#19005)
Convert error to storageError since it is used for quorum calculations here: ff80cfd83d/cmd/peer-s3-client.go (L339)
2024-02-07 14:24:24 -08:00
Harshavardhana d28bf71f25
listing must return WalkDir() errors first (#19006) 2024-02-07 13:20:07 -08:00
Harshavardhana 5b1a74b6b2
do not block iam.store registration (#18999)
current implementation would quite simply
block the sys.store registration, making
sys.Initialized() call to be blocked.
2024-02-07 12:41:58 -08:00
Klaus Post ebc6c9b498
Fix tracing send on closed channel (#18982)
Depending on when the context cancelation is picked up the handler may return and close the channel before `SubscribeJSON` returns, causing:

```
Feb 05 17:12:00 s3-us-node11 minio[3973657]: panic: send on closed channel
Feb 05 17:12:00 s3-us-node11 minio[3973657]: goroutine 378007076 [running]:
Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON.func1()
Feb 05 17:12:00 s3-us-node11 minio[3973657]:         github.com/minio/minio/internal/pubsub/pubsub.go:139 +0x12d
Feb 05 17:12:00 s3-us-node11 minio[3973657]: created by github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON in goroutine 378010884
Feb 05 17:12:00 s3-us-node11 minio[3973657]:         github.com/minio/minio/internal/pubsub/pubsub.go:124 +0x352
```

Wait explicitly for the goroutine to exit.

Bonus: Listen for doneCh when sending to not risk getting blocked there is channel isn't being emptied.
2024-02-06 08:57:30 -08:00
Harshavardhana 630963fa6b
protect tracker copy properly to avoid race (#18984)
```
WARNING: DATA RACE
Write at 0x00c000aac1e0 by goroutine 1133:
  github.com/minio/minio/cmd.(*healingTracker).updateProgress()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:183 +0x117
  github.com/minio/minio/cmd.(*erasureObjects).healErasureSet.func5()
      github.com/minio/minio/cmd/global-heal.go:292 +0x1d3

Previous read at 0x00c000aac1e0 by goroutine 1003:
  github.com/minio/minio/cmd.(*allHealState).updateHealStatus()
      github.com/minio/minio/cmd/admin-heal-ops.go:136 +0xcb
  github.com/minio/minio/cmd.(*healingTracker).save()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:223 +0x424
```
2024-02-06 08:56:59 -08:00
Harshavardhana f674168b8b
Add missing gob register for map[string]string{} (#18974)
```
minio[1303918]: API: SYSTEM()
minio[1303918]: Time: 02:04:28 UTC 02/05/2024
minio[1303918]: DeploymentID: 0972de33-2d17-4499-8967-aff6437dd9da
minio[1303918]: Error: gob: type not registered for interface: map[string]string (*errors.errorString)
minio[1303918]:        4: internal/logger/logonce.go:118:logger.(*logOnceType).logOnceIf()
minio[1303918]:        3: internal/logger/logonce.go:149:logger.LogOnceIf()
minio[1303918]:        2: cmd/peer-rest-server.go:533:cmd.(*peerRESTServer).GetSysConfigHandler()
minio[1303918]:        1: net/http/server.go:2136:http.HandlerFunc.ServeHTTP()
```
2024-02-06 08:23:23 -08:00
Poorna 27d02ea6f7
metrics: add replication metrics on proxied requests (#18957) 2024-02-05 22:00:45 -08:00
Harshavardhana 794a7993cb
calculate correct quorum check for metadata updates on object (#18979)
this fixes rare bugs we have seen but never really found a
reproducer for

- PutObjectRetention() returning 503s
- PutObjectTags() returning 503s
- PutObjectMetadata() updates during replication returning 503s

These calls return errors, and this perpetuates with
no apparent fix.

This PR fixes with correct quorum requirement.
2024-02-05 21:44:40 -08:00
Harshavardhana 6f16d1cb2c
do not count context canceled as timeout errors (#18975) 2024-02-05 18:16:13 -08:00
Anis Eleuch 7aa00bff89
sts: Add support of AssumeRoleWithWebIdentity and DurationSeconds (#18835)
To force limit the duration of STS accounts, the user can create a new
policy, like the following:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["sts:AssumeRoleWithWebIdentity"],
    "Condition": {"NumericLessThanEquals": {"sts:DurationSeconds": "300"}}
  }]
}

And force binding the policy to all OpenID users, whether using a claim name or role
ARN.
2024-02-05 11:44:23 -08:00
Klaus Post e046eb1d17
Disable Rename2 metrics on non-linux (#18970)
Logging a call that always fails is pointless.
2024-02-05 10:48:14 -08:00
Anis Eleuch ba975ca320
Add defensive code to ignore checking parts with transitioned objects (#18973)
Though dataErrs are nil with transitioned objects, add a more defensive
code to ignore counting missing parts in that case
2024-02-05 10:48:03 -08:00
Harshavardhana fec13b0ec1
remove unused DiskMTime (#18965) 2024-02-05 01:04:26 -08:00
Harshavardhana 100c35c281
avoid excessive logs when peer is down (#18969) 2024-02-04 23:25:42 -08:00
Harshavardhana f225ca3312
Add more advanced cases for dangling (#18968) 2024-02-04 14:36:13 -08:00
Frank Wessels 8b68e0bfdc
Fix typo in api-router.go (#18955) 2024-02-03 14:03:51 -08:00
Anis Eleuch 6ae97aedc9
xl: Disable rename2 in decommissioning/rebalance (#18964)
Always disable rename2 optimization in decom/rebalance
2024-02-03 14:03:30 -08:00