Commit Graph

11840 Commits

Author SHA1 Message Date
Klaus Post
7ff4164d65
Fix races in IAM cache lazy loading (#19346)
Fix races in IAM cache

Fixes #19344

On the top level we only grab a read lock, but we write to the cache if we manage to fetch it.

a03dac41eb/cmd/iam-store.go (L446) is also flipped to what it should be AFAICT.

Change the internal cache structure to a concurrency safe implementation.

Bonus: Also switch grid implementation.
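
The race described above (writing to the cache while holding only a read lock) goes away when the cache itself is safe for concurrent use. A minimal sketch of that idea, assuming a hypothetical `policyCache` type built on `sync.Map`; the type, method, and key names are illustrative and are not taken from the actual change:

```go
package main

import (
	"fmt"
	"sync"
)

// policyCache is a hypothetical lazily populated cache. sync.Map makes
// Load and Store safe without relying on any outer lock.
type policyCache struct {
	m sync.Map // string -> string
}

// get returns the cached value for name. On a miss it calls fetch and
// stores the result so later lookups are served from memory.
func (c *policyCache) get(name string, fetch func(string) (string, error)) (string, error) {
	if v, ok := c.m.Load(name); ok {
		return v.(string), nil
	}
	v, err := fetch(name)
	if err != nil {
		return "", err
	}
	// LoadOrStore keeps the first stored value if another goroutine won the race.
	actual, _ := c.m.LoadOrStore(name, v)
	return actual.(string), nil
}

func main() {
	var c policyCache
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = c.get("readwrite", func(string) (string, error) { return "policy-doc", nil })
		}()
	}
	wg.Wait()
	v, _ := c.get("readwrite", nil) // already cached, fetch is not called
	fmt.Println(v)
}
```

Concurrent callers may each run `fetch` on a cold key, but only one result is retained, which is usually acceptable for a lazy loader.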
2024-03-26 11:12:57 -07:00
Shubhendu
53a14c7301
Adding dashboard for MinIO node metrics (#19329)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-26 08:01:28 -07:00
Harshavardhana
dc45a5010d
bring back minor DNS cache for k8s setups (#19341)
k8s as it stands is flaky in DNS lookups,
bring this change back such that we can
cache DNS for at least a 30-second TTL.
2024-03-26 08:00:38 -07:00
jiuker
4b9192034c
fix: should return when error happened (#19342) 2024-03-26 07:51:56 -07:00
Harshavardhana
deeadd1a37
fix: convert multiple callers to use toStorageErr(err) correctly (#19339)
we must attempt to convert all errors at storage-rest-client
into StorageErr() regardless of what functionality is being
called in, this PR fixes this for multiple callers including
some internally used functions.
2024-03-25 23:24:59 -07:00
Sveinn
1fc4203c19
Webhook targets refactor and bug fixes (#19275)
- old version was unable to retain messages during config reload
- old version could not go from memory to disk during reload
- new version can batch disk queue entries together to reduce I/O load
- error logging has been improved, previous version would miss certain errors.
- logic for spawning/despawning additional workers has been adjusted to trigger when half capacity is reached, instead of when the log queue becomes full.
- old version would JSON-marshal twice and unmarshal once for every log item. Now we marshal only once, then GetRaw from the store and send it without having to re-marshal.
2024-03-25 09:44:20 -07:00
Minio Trusted
15b930be1f Update yaml files to latest version RELEASE.2024-03-21T23-13-43Z 2024-03-22 20:08:28 +00:00
Poorna
7fd76dbbb7
fix batch snowball to close channel after listing finishes (#19316)
A panic was seen due to premature closing of the slow channel while listing is still sending, or
when the list has already closed it on the sender's side:
```
panic: close of closed channel

goroutine 13666 [running]:
github.com/minio/minio/internal/ioutil.SafeClose[...](0x101ff51e4?)
	/Users/kp/code/src/github.com/minio/minio/internal/ioutil/ioutil.go:425 +0x24
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2142 +0x170
created by github.com/minio/minio/cmd.(*erasureServerPools).Walk in goroutine 1189
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:1985 +0x228
```
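
A "close of closed channel" panic is commonly avoided by letting only the sender close the channel, and only after its last send. A minimal sketch, assuming hypothetical `listEntries` and `closeOnce` helpers that are not part of the MinIO code base:

```go
package main

import (
	"fmt"
	"sync"
)

// listEntries is a hypothetical lister: it sends every entry and only then
// closes the channel, so the close happens exactly once, on the sender side.
func listEntries(entries []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out) // runs after the last send, never before
		for _, e := range entries {
			out <- e
		}
	}()
	return out
}

// closeOnce guards a channel that more than one code path may try to close.
type closeOnce struct {
	once sync.Once
	ch   chan struct{}
}

// Close closes the channel on the first call and does nothing afterwards.
func (c *closeOnce) Close() { c.once.Do(func() { close(c.ch) }) }

func main() {
	n := 0
	for range listEntries([]string{"a", "b", "c"}) {
		n++
	}
	fmt.Println(n)

	c := &closeOnce{ch: make(chan struct{})}
	c.Close()
	c.Close() // second call is a no-op instead of a panic
}
```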
2024-03-21 16:13:43 -07:00
Krishnan Parthasarathi
da81c6cc27
Encode dir obj names before expiration (#19305)
Object names of directory objects qualified for ExpiredObjectAllVersions
must be encoded appropriately before calling deletePrefix on their
erasure set.

e.g., a directory object and regular objects with overlapping prefixes
could lead to the expiration of regular objects, which is not the 
intention of ILM. 

```
bucket/dir/ ---> directory object
bucket/dir/obj-1
```

When `bucket/dir/` qualifies for expiration, the current implementation would
remove regular objects under the prefix `bucket/dir/`, in this case,
`bucket/dir/obj-1`.
2024-03-21 10:21:35 -07:00
Harshavardhana
a03dac41eb
use retry during policy reload from drives (#19307) 2024-03-21 10:19:50 -07:00
Anis Eleuch
b657ffa496
fix: Fix crash when logging events and anonymous is enabled (#19313)
Events log does not have a stacktrace. So Trace is nil. Fix a crash in
this case when an event is printed while anonymous logging is enabled.
2024-03-21 10:19:36 -07:00
Shireesh Anjal
55778ae278
fix: peer addr returned as empty string (#19308)
In handlers related to health diagnostics e.g. CPU, Network, Partitions,
etc, globalMinioHost was being passed as the addr, resulting in empty
value for the same in the health report.

Using globalLocalNodeName instead fixes the issue.
2024-03-21 10:19:14 -07:00
Poorna
d990661d1f
replication: enforce precondition for multipart (#19306) 2024-03-20 18:12:37 -07:00
Harshavardhana
280526caf7
add IAM policyDB lookup fallbacks to drives (#19302)
IAM loading is a lazy operation, allow these
fallbacks to be in place when we cannot find
in-memory state().

this allows us to honor the request even if we pay
a small price for the lookup and for populating the data.
2024-03-20 09:24:04 -07:00
Harshavardhana
1173b26fc8
avoid triggering heals on metacache files if any (#19299) 2024-03-19 20:21:15 -07:00
Krishnan Parthasarathi
383489d5d9
Handle zero versions qualified for expiration (#19301)
Objects can have more versions than their ILM policy expects to retain
via NewerNoncurrentVersions, yet not qualify for expiry because
NoncurrentDays is also configured in that rule.

In this case, the applyNewerNoncurrentVersionsLimit method was enqueuing empty
tasks, which led to a panic (panic: runtime error: index out of range [0] with
length 0) in the newerNoncurrentTask.OpHash method, which assumes the task
contains at least one version to expire.
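
The failure mode described above can be illustrated with a small guard at enqueue time. This is a sketch under stated assumptions: `version`, `task`, `opHash`, and `enqueue` are hypothetical stand-ins, not the real MinIO types or the actual fix.

```go
package main

import "fmt"

// version and task are hypothetical stand-ins for the real types.
type version struct{ ID string }

type task struct{ versions []version }

// opHash mirrors the assumption described above: it indexes the first
// version and would panic on an empty task.
func (t task) opHash() string { return t.versions[0].ID }

// enqueue skips tasks with no versions, so opHash is never called on an
// empty slice.
func enqueue(queue []task, versions []version) []task {
	if len(versions) == 0 {
		return queue
	}
	return append(queue, task{versions: versions})
}

func main() {
	var q []task
	q = enqueue(q, nil) // nothing qualifies for expiry: no task is queued
	q = enqueue(q, []version{{ID: "v1"}})
	for _, t := range q {
		fmt.Println(t.opHash())
	}
}
```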
2024-03-19 20:10:58 -07:00
Anis Eleuch
9370b11684
decom: Fix failed status after a failed decommission (#19300)
When returning the status of a decommissioned pool, a pool with a zero
StartedTime will be considered an active pool, which is unexpected.
This commit ensures that a pool's canceled/failed/completed
status is always returned.
2024-03-19 20:09:59 -07:00
Andreas Auernhammer
999bbd3a14
crypto: generate OEK using HMAC-SHA256 instead of SHA256 (#19297)
This commit changes how MinIO generates the object encryption key (OEK)
when encrypting an object using server-side encryption.

This change is fully backwards compatible. Now, MinIO generates
the OEK as follows:
```
Nonce = RANDOM(32)        // generate 256 bit random value
OEK = HMAC-SHA256(EK, Context || Nonce)
```

Before, the OEK was computed as follows:
```
Nonce = RANDOM(32)        // generate 256 bit random value
OEK = SHA256(EK || Nonce)
```

The new scheme does not technically fix a security issue but
uses a more familiar scheme. The only requirement for the
OEK generation function is that it produces a (pseudo)random value
for every pair (`EK`,`Nonce`) as long as no `EK`-`Nonce` combination
is repeated. This prevents a faulty PRNG from repeating or generating
a "bad" key.

The previous scheme guarantees that the `OEK` is a (pseudo)random
value given that no pair (`EK`,`Nonce`) repeats under the assumption
that SHA256 is indistinguishable from a random oracle.

The new scheme guarantees that the `OEK` is a (pseudo)random value
given that no pair (`EK`, `Nonce`) repeats under the assumption that
SHA256's underlying compression function is a PRF/PRP.

While the latter is a weaker assumption, and therefore, less likely
to be false, both are considered true. SHA256 is believed to be
indistinguishable from a random oracle AND its compression function
is assumed to be a PRF/PRP.

As far as OEK generation is concerned, the OS random number
generator is not required to be pseudo-random but just non-repeating.

Apart from being more compatible with standard definitions and
descriptions of how to generate cryptographic keys, this change does not
have any impact on the actual security of the OEK generation.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-03-19 13:28:10 -07:00
Anis Eleuch
235edd88aa
xl: Purge instead of moving to trash with near filled disks (#19294)
Immediately remove objects from the trash when the disk is 95% full
2024-03-19 13:26:24 -07:00
Anis Eleuch
b5e074e54c
list: Fix IsTruncated and NextMarker when encountering expired objects (#19290) 2024-03-19 13:23:12 -07:00
Harshavardhana
4d7068931a
change the notification queue full message (#19293) 2024-03-19 00:30:10 -07:00
jiuker
d7fb6fddf6
feat: add user specific redis auth (#19285) 2024-03-18 21:37:54 -07:00
Harshavardhana
7213bd7131
add additional logs for the decom during metadata save (#19288) 2024-03-18 15:25:45 -07:00
Harshavardhana
d4aac7cd72
add deprecated expiry_workers to be ignored (#19289)
avoids error during upgrades such as
```
API: SYSTEM()
Time: 19:19:22 UTC 03/18/2024
DeploymentID: 24e4b574-b28d-4e94-9bfa-03c363a600c2
Error: Invalid api configuration: found invalid keys (expiry_workers=100 ) for 'api' sub-system, use 'mc admin config reset myminio api' to fix invalid keys (*fmt.wrapError)
      11: internal/logger/logger.go:260:logger.LogIf()
...
```
2024-03-18 15:25:32 -07:00
Harshavardhana
741de4cf94
fix: add a default requests deadline when deadline is 0 (#19287) 2024-03-18 12:30:41 -07:00
Harshavardhana
f168ef9989
implement a flag to specify custom crossdomain.xml (#19262)
fixes #16909
2024-03-17 23:42:40 -07:00
alingse
a0de56abb6
fix: wrong time.Parse params order for replication timestamp (#19279) 2024-03-17 21:19:43 -07:00
Harshavardhana
c201d8bda9
write anything beyond 4k to be written in 4k pages (#19269)
we were needlessly skipping 4k-page writes even when
they were possible: most buffers are a multiple of
4k up to some number, followed by some
remainder.

We only need to write the remainder without O_DIRECT.
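
The split described above is simple arithmetic on the buffer length. A sketch, assuming a 4096-byte page size and a hypothetical `splitAligned` helper; the actual write path and its O_DIRECT handling are not shown:

```go
package main

import "fmt"

const pageSize = 4096

// splitAligned returns the largest prefix of buf whose length is a multiple
// of pageSize, plus the remainder that would be written without O_DIRECT.
func splitAligned(buf []byte) (aligned, remainder []byte) {
	n := len(buf) / pageSize * pageSize
	return buf[:n], buf[n:]
}

func main() {
	buf := make([]byte, 3*pageSize+100)
	aligned, rest := splitAligned(buf)
	fmt.Println(len(aligned), len(rest)) // 12288 100
}
```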
2024-03-15 12:27:59 -07:00
Minio Trusted
d2373d5d6c Update yaml files to latest version RELEASE.2024-03-15T01-07-19Z 2024-03-15 02:47:20 +00:00
Harshavardhana
93fb7d62d8
allow dynamically changing max_object_versions per object (#19265) 2024-03-14 18:07:19 -07:00
Harshavardhana
485298b680
update all dependencies (#19235) 2024-03-14 17:41:26 -07:00
Harshavardhana
062f0cffad
fix: do not look for non-existent bucket in decom tests (#19261) 2024-03-14 08:54:11 -07:00
Harshavardhana
ce1c640ce0
feat: allow retaining parity SLA to be configurable (#19260)
at scale customers might start with failed drives,
causing skew in the overall usage ratio per EC set.

make this configurable such that customers can turn
this off as needed depending on how comfortable they
are.
2024-03-14 03:38:33 -07:00
Klaus Post
5c32058ff3
cosmetic: Move request goroutines to methods (#19241)
Cosmetic change, but it breaks up a big code block and makes goroutine
dumps of streams more readable, so it is clearer what each goroutine is doing.
2024-03-13 11:43:58 -07:00
Anis Eleuch
24b4f9d748
Fix quorum calculation with zero parity objects (#19250)
Currently, the code relies on object parity to decide whether it is a
delete marker or a regular object. In the case of a delete marker, the
returned quorum is half of the disks in the erasure set. However, this
calculation must be corrected for objects with EC = 0, mainly
because EC is not a one-time fixed configuration.

Though all data are correct, the symptom is a 503 for an
EC=0 object. This bug surfaced after we introduced the
fast Get Object feature, which does not read all data from all disks in
the case of inlined objects.
2024-03-12 12:59:11 -07:00
Harshavardhana
81d7531f1f
only look for valid buckets (#19244)
fixes #19239
2024-03-12 04:33:30 -07:00
Poorna
b4a23f720e
update build constants (#19243) 2024-03-11 17:54:37 -07:00
Klaus Post
a2f6252b2f
xl-meta: Add inline data bitrot check (#19240)
When using `-data` also perform a bitrot check on the data.

Example:

```
λ xl-meta -data net.zip
{
        "minio-1.com:9000/data/minio1/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-1.com:9000/data/minio2/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-1.com:9000/data/minio3/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-1.com:9000/data/minio4/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-2.com:9000/data/minio1/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-2.com:9000/data/minio2/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-2.com:9000/data/minio3/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-2.com:9000/data/minio4/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-3.com:9000/data/minio1/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-3.com:9000/data/minio2/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-3.com:9000/data/minio3/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-3.com:9000/data/minio4/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-4.com:9000/data/minio1/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-4.com:9000/data/minio2/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-4.com:9000/data/minio3/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}},
        "minio-4.com:9000/data/minio4/p40/44b6/44b612e9a7294856bd2b5fe6f6cdeb0d.pdf/xl.meta": {"null":{"bitrot_valid":true,"bytes":11710}}
}
```
2024-03-11 10:57:11 -07:00
Dennis Marttinen
6c964fede5
Improve handling of compression inclusion for objects (#19234) 2024-03-11 04:55:34 -07:00
huajin tong
a25a8312d8
fix: some flyby typos in the code (#19212)
Signed-off-by: thirdkeyword <fliterdashen@gmail.com>
2024-03-10 14:09:36 -07:00
Aditya Manthramurthy
b2c5b75efa
feat: Add Metrics V3 API (#19068)
Metrics v3 is mainly a reorganization of metrics into smaller groups of
metrics and the removal of internal aggregation of metrics received from
peer nodes in a MinIO cluster.

This change adds the endpoint `/minio/metrics/v3` as the top-level metrics
endpoint and under this, various sub-endpoints are implemented. These
are currently documented in `docs/metrics/v3.md`

The handler will serve metrics at any path
`/minio/metrics/v3/PATH`, as follows:

- when PATH is a sub-endpoint listed above, serves the group of
  metrics under that path;
- when PATH is a (non-empty) parent directory of the sub-endpoints
  listed above, serves metrics from each child sub-endpoint of PATH;
- otherwise, returns a "no resource found" error.
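
The path rules above can be sketched as a small lookup. The `resolve` function and the endpoint names used in `main` are hypothetical and chosen for illustration; see `docs/metrics/v3.md` for the real sub-endpoints.

```go
package main

import (
	"fmt"
	"strings"
)

// resolve returns the sub-endpoints to serve for path, or nil when nothing
// matches (the "no resource found" case).
func resolve(endpoints []string, path string) []string {
	path = strings.TrimSuffix(path, "/")
	if path == "" {
		return nil // an empty PATH is not a valid parent
	}
	var out []string
	for _, e := range endpoints {
		// Exact match, or path is a parent directory of e.
		if e == path || strings.HasPrefix(e, path+"/") {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Hypothetical sub-endpoints.
	eps := []string{"/api/requests", "/api/bucket", "/system/drive"}
	fmt.Println(resolve(eps, "/api/requests"))
	fmt.Println(resolve(eps, "/api"))
	fmt.Println(resolve(eps, "/unknown") == nil)
}
```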

All available metrics are listed in the `docs/metrics/v3.md`. More will
be added subsequently.
2024-03-10 01:15:15 -08:00
Minio Trusted
2dfa9adc5d Update yaml files to latest version RELEASE.2024-03-10T02-53-48Z 2024-03-10 08:42:35 +00:00
Harshavardhana
88a89213ff
make immediate purge non-blocking up to 100,000 entries per drive (#19231)
make immediate purge non-blocking up to 100000 entries per drive

Bonus: turn-off O_DIRECT verification when FSType is 'XFS'
2024-03-09 18:53:48 -08:00
Poorna
8e2238ea09
some more cleanup for startup message (#19229) 2024-03-08 22:42:32 -08:00
Krishnan Parthasarathi
2007dd26ae
ilm: Expire if object past expected expiry date (#19230)
When an object qualifies for both tiering and expiration rules and is
past its expiration date, it should be expired without first being tiered,
even when the tiering event occurs before expiration.
2024-03-08 22:41:22 -08:00
Poorna
31e8f7c525
Small reformatting of startup message (#19228)
Also changing User-Agent format
2024-03-08 19:07:08 -08:00
Klaus Post
51f62a8da3
Port ListBuckets to websockets layer & some cleanup (#19199) 2024-03-08 11:08:18 -08:00
Klaus Post
650efc2e96
Fix listing in objects split across pools (#19227)
Merging multiple versions of the same object from different pools would not always result in correct ordering.

When merging, keep the inputs separate.

```
λ mc ls --versions local/testbucket
------ before ------

[2024-03-05 20:17:19 CET]   228B STANDARD 1f163718-9bc5-4b01-bff7-5d8cf09caf10 v3 PUT hosts
[2024-03-05 20:19:56 CET]  19KiB STANDARD null v2 PUT hosts
[2024-03-05 20:17:15 CET]   228B STANDARD 73c9f651-f023-4566-b012-cc537fdb7ce2 v1 PUT hosts

------ after ------
λ mc ls --versions local/testbucket
[2024-03-05 20:19:56 CET]  19KiB STANDARD null v3 PUT hosts
[2024-03-05 20:17:19 CET]   228B STANDARD 1f163718-9bc5-4b01-bff7-5d8cf09caf10 v2 PUT hosts
[2024-03-05 20:17:15 CET]   228B STANDARD 73c9f651-f023-4566-b012-cc537fdb7ce2 v1 PUT hosts
```
2024-03-08 09:50:48 -08:00
dependabot[bot]
1787bcfc91
build(deps): bump github.com/lestrrat-go/jwx from 1.2.28 to 1.2.29 (#19226)
Bumps [github.com/lestrrat-go/jwx](https://github.com/lestrrat-go/jwx) from 1.2.28 to 1.2.29.
- [Release notes](https://github.com/lestrrat-go/jwx/releases)
- [Changelog](https://github.com/lestrrat-go/jwx/blob/v1.2.29/Changes)
- [Commits](https://github.com/lestrrat-go/jwx/compare/v1.2.28...v1.2.29)

---
updated-dependencies:
- dependency-name: github.com/lestrrat-go/jwx
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-08 08:42:47 -08:00
Harshavardhana
2cc4997d24
fix: crash on 32bit systems during pre-allocation (#19225) 2024-03-08 05:55:28 -08:00