Commit Graph

6272 Commits

Author SHA1 Message Date
Shireesh Anjal
074d70112d Consolidate drive health related metrics into single metric (#19706)
Instead of having "online" and "healing" as two metrics, replace with a
single metric "health" which can have following values:

0 = offline
1 = healthy
2 = healing
2024-05-12 10:23:50 -07:00
Harshavardhana
e8d14c0d90 verify preconditions during CompleteMultipart (#19713)
Bonus: hold the write lock properly to apply
optimistic concurrency during NewMultipartUpload()
2024-05-10 17:31:22 -07:00
Shireesh Anjal
60d7e8143a Move /cluster/audit to /audit (#19708)
As the audit metrics are server level and not 
overall cluster level.
2024-05-10 07:50:39 -07:00
Klaus Post
9667a170de Add usage cache cleanup and lower forced top compaction (#19719)
Lower forced compaction to 250K entries.

If there is more than 250K entries on the top level force compact it and log an error.
2024-05-10 07:49:50 -07:00
Harshavardhana
b598402738 fix: unexpected credentials missing while passing 2024-05-09 18:41:38 -07:00
Harshavardhana
72ff69d9bb add log-prefix name for specifying custom log-name (#19712) 2024-05-09 14:29:37 -07:00
Harshavardhana
f30417d9a8 Revert "Fix incorrect merging of slash-suffixed objects (#19699)"
This reverts commit 2f7a10ab31.
2024-05-09 12:32:05 -07:00
jiuker
47a4ad3cd7 fix: truncate Expiration to second when Add ServiceAccount (#19674)
Truncate Expiration at the second when Add ServiceAccount
2024-05-09 11:08:04 -07:00
Klaus Post
2f7a10ab31 Fix incorrect merging of slash-suffixed objects (#19699)
If two objects share everything but one object has a slash prefix, those would be merged in listings, 
with secondary properties used for a tiebreak.

Example: An object with the key `prefix/obj` would be merged with an object named `prefix/obj/`. 
While this violates the [no object can be a prefix of another](https://min.io/docs/minio/linux/operations/concepts/thresholds.html#conflicting-objects), let's resolve these.

If we have an object with 'name' and a directory named 'name/' discard the directory only - but allow objects 
of 'name' and 'name/' (xldir) to be uniquely returned.

Regression from #15772
2024-05-09 11:05:45 -07:00
Harshavardhana
b534dc69ab deprecate unexpected healing failed counters (#19705)
simplify this to avoid verbose metrics, and make
room for valid metrics to be reported for alerting
etc.
2024-05-09 11:04:41 -07:00
Harshavardhana
7b7d2ea7d4 pass around correct endpoint while registering remote storage (#19710) 2024-05-09 11:03:54 -07:00
Aditya Manthramurthy
e00de1c302 ldap-import: Add additional logs (#19691)
These logs are being added to provide better debugging of LDAP
normalization on IAM import.
2024-05-09 10:52:53 -07:00
Harshavardhana
3549e583a6 results must be a single channel to avoid overwriting healing.bin (#19702) 2024-05-09 10:15:03 -07:00
Andi
f5e3eedf34 chore: use errors.New to replace fmt.Errorf with no parameters (#19568)
Signed-off-by: ChengenH <hce19970702@gmail.com>
2024-05-09 01:44:07 -07:00
Harshavardhana
9a267f9270 allow caller context during reloads() to cancel (#19687)
canceled callers might linger around longer,
can potentially overwhelm the system. Instead
provider a caller context and canceled callers
don't hold on to them.

Bonus: we have no reason to cache errors, we should
never cache errors otherwise we can potentially have
quorum errors creeping in unexpectedly. We should
let the cache when invalidating hit the actual resources
instead.
2024-05-08 17:51:34 -07:00
Klaus Post
ec49fff583 Accept multipart checksums with part count (#19680)
Accept multipart uploads where the combined checksum provides the expected part count.

It seems this was added by AWS to make the API more consistent, even if the 
data is entirely superfluous on multiple levels.

Improves AWS S3 compatibility.
2024-05-08 09:18:34 -07:00
Andreas Auernhammer
8b660e18f2 kms: add support for MinKMS and remove some unused/broken code (#19368)
This commit adds support for MinKMS. Now, there are three KMS
implementations in `internal/kms`: Builtin, MinIO KES and MinIO KMS.

Adding another KMS integration required some cleanup. In particular:
 - Various KMS APIs that haven't been and are not used have been
   removed. A lot of the code was broken anyway.
 - Metrics are now monitored by the `kms.KMS` itself. For basic
   metrics this is simpler than collecting metrics for external
   servers. In particular, each KES server returns its own metrics
   and no cluster-level view.
 - The builtin KMS now uses the same en/decryption implemented by
   MinKMS and KES. It still supports decryption of the previous
   ciphertext format. It's backwards compatible.
 - Data encryption keys now include a master key version since MinKMS
   supports multiple versions (~4 billion in total and 10000 concurrent)
   per key name.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-05-07 16:55:37 -07:00
Harshavardhana
981497799a return appropriate error upon reaching maxClients() (#19669) 2024-05-07 13:41:56 -07:00
Olli Janatuinen
b413ff9fdb Support user certificate based authentication on SFTP (#19650) 2024-05-06 23:41:25 -07:00
Harshavardhana
6a15580817 fix: collect quorum errors for deletePrefix() (#19685)
do not return error for single drive being offline.
2024-05-06 22:44:46 -07:00
Cesar N
39633a5581 Set Console Redirect URL env variable (#19683) 2024-05-06 19:47:59 -07:00
Harshavardhana
888d2bb1d8 support ETag value to be '*' (#19682)
This supports '*' as per behavior to
comply with AWS S3 behavior for

- 'If-Match: *'
- 'If-None-Match: *'
2024-05-06 17:08:42 -07:00
Klaus Post
847ee5ac45 Make WalkDir return errors (#19677)
If used, 'opts.Marker` will cause many missed entries since results are returned 
unsorted, and pools are serialized.

Switch to fully concurrent listing and merging across pools to return sorted entries.
2024-05-06 13:27:52 -07:00
jiuker
9a9a49aa84 fix: Ignore AWSAccessKeyId check for SignV2 policy condition (#19673) 2024-05-06 03:52:41 -07:00
Harshavardhana
a03ca80269 support 'mc support perf object' with root login disabled (#19672)
It is expected that whoever is using the credentials which has
the proper set of permissions must be able to run.

`mc support perf object`

While the root login is disabled.
2024-05-06 02:45:10 -07:00
Harshavardhana
523bd769f1 add support for specific error response for InvalidRange (#19668)
fixes #19648

AWS S3 returns the actual object size as part of XML
response for InvalidRange error, this is used apparently
by SDKs to retry the request without the range.
2024-05-05 09:56:21 -07:00
Harshavardhana
8ff70ea5a9 turn-off coloring if we have std{err,out} dumb terminals (#19667) 2024-05-03 17:17:57 -07:00
Harshavardhana
da3e7747ca avoid using 10MiB EC buffers in maxAPI calculations (#19665)
max requests per node is more conservative in its value
causing premature serialization of the calls, avoid it
for newer deployments.
2024-05-03 13:08:20 -07:00
Klaus Post
4afb59e63f fix: walk missing entries with opts.Marker set (#19661)
'opts.Marker` is causing many missed entries if used since results are returned unsorted. Also since pools are serialized.

Switch to do fully concurrent listing and merging across pools to return sorted entries.

Returning errors on listings is impossible with the current API, so document that.

Return an error at once if no drives are found instead of just returning an empty listing and no error.
2024-05-03 10:26:51 -07:00
Harshavardhana
1526e7ece3 extend server config.yaml to support per pool set drive count (#19663)
This is to support deployments migrating from a multi-pooled
wider stripe to lower stripe. MINIO_STORAGE_CLASS_STANDARD
is still expected to be same for all pools. So you can satisfy
adding custom drive count based pools by adjusting the storage
class value.

```
version: v2
address: ':9000'
rootUser: 'minioadmin'
rootPassword: 'minioadmin'
console-address: ':9001'
pools: # Specify the nodes and drives with pools
  -
    args:
        - 'node{11...14}.example.net/data{1...4}'
  -
    args:
        - 'node{15...18}.example.net/data{1...4}'
  -
    args:
        - 'node{19...22}.example.net/data{1...4}'
  -
    args:
        - 'node{23...34}.example.net/data{1...10}'
    set-drive-count: 6
```
2024-05-03 08:54:03 -07:00
Krishnan Parthasarathi
6c07bfee8a With retention, skip actions expiring all versions (#19657)
ILM actions due to ExpiredObjectDeleteAllVersions and
DelMarkerExpiration are ignored when object locking is enabled on a
bucket.
Note: This applies to object versions which may not have retention
configured on them. This applies to all object versions in this bucket,
including those created before the retention config was applied.
2024-05-03 04:18:58 -07:00
Poorna
446c760820 replication: Avoid proxying if requested object is a deletemarker (#19656)
Fixes: #19654
2024-05-02 13:15:54 -07:00
Shireesh Anjal
04f92f1291 Change endpoint format for per-bucket metrics (#19655)
Per-bucket metrics endpoints always start with /bucket and the bucket
name is appended to the path. e.g. if the collector path is /bucket/api,
the endpoint for the bucket "mybucket" would be
/minio/metrics/v3/bucket/api/mybucket

Change the existing bucket api endpoint accordingly from /api/bucket to
/bucket/api
2024-05-02 10:37:57 -07:00
Bala FA
e5b16adb1c Add cluster IAM metrics in metrics-v3 (#19595)
Signed-off-by: Bala.FA <bala@minio.io>
2024-05-02 01:20:42 -07:00
Harshavardhana
402a3ac719 support compression after rotation of logs (#19647) 2024-05-01 15:38:07 -07:00
Aditya Manthramurthy
f3d61c51fc fix: Filter out cust. AssumeRole Token for audit (#19646)
The `Token` parameter is a sensitive value that should not be output in the Audit log for STS AssumeRoleWithCustomToken API.

Bonus: Add a simple tool that echoes audit logs to the console.
2024-05-01 14:31:13 -07:00
Klaus Post
0cde17ae5d Return listing when exceeding min disk errors (#19644)
When listing, with drives returning `errFileNotFound,` `errVolumeNotFound`, or `errUnformattedDisk,`, 
we could get below `minDisks` drives being left.

This would result in a quorum never being reachable for any object. Therefore, the listing 
would continue, but no results would ever be produced.

Include `fnf` in the mindisk check since it is incremented on these errors. This will stop 
listing when minDisks are left.

Allow `opts.minDisks` to not return errVolumeNotFound or errFileNotFound and return that. 
That will allow for good results even if disks return something else.

We switch `errUnformattedDisk` to a regular error. If we have enough of those, we should just fail.
2024-05-01 10:59:08 -07:00
Harshavardhana
8c1bba681b add logrotate support for MinIO logs (#19641) 2024-05-01 10:57:52 -07:00
Klaus Post
dbfb5e797b Wait one minute after startup to restart decommissioning (#19645)
Typically not all drives are connected, so we delay 3 minutes before resuming.
This greatly reduces risk of starting to list unconnected drives, or drives we risk being disconnected soon.

This delay is not applied when starting with an admin call.
2024-05-01 08:18:21 -07:00
Harshavardhana
08ff702434 enhance ListSVCs() API to return more info to avoid InfoSvc() (#19642)
ConsoleUI like applications rely on combination of

ListServiceAccounts() and InfoServiceAccount() to populate
UI elements, however individually these calls can be slow
causing the entire UI to load sluggishly.
2024-05-01 05:41:13 -07:00
Klaus Post
0e2148264a Fix --stfp "mac-algos=..." overwrites cipher algorithms (#19643)
Setting MAC algorithms overwrites cipher algorithms.

Followup to #19636
2024-05-01 04:07:40 -07:00
Krishnan Parthasarathi
7926401cbd ilm: Handle DeleteAllVersions action differently for DEL markers (#19481)
i.e., this rule element doesn't apply to DEL markers.

This is a breaking change to how ExpiredObejctDeleteAllVersions
functions today. This is necessary to avoid the following highly probable
footgun scenario in the future.

Scenario:
The user uses tags-based filtering to select an object's time to live(TTL). 
The application sometimes deletes objects, too, making its latest
version a DEL marker. The previous implementation skipped tag-based filters
if the newest version was DEL marker, voiding the tag-based TTL. The user is
surprised to find objects that have expired sooner than expected.

* Add DelMarkerExpiration action

This ILM action removes all versions of an object if its
the latest version is a DEL marker.

```xml
<DelMarkerObjectExpiration>
    <Days> 10 </Days>
</DelMarkerObjectExpiration>
```

1. Applies only to objects whose,
  • The latest version is a DEL marker.
  • satisfies the number of days criteria
2. Deletes all versions of this object
3. Associated rule can't have tag-based filtering

Includes,
- New bucket event type for deletion due to DelMarkerExpiration
2024-04-30 18:11:10 -07:00
Harshavardhana
8161411c5d fix: a crash in RemoveReplication target (#19640)
calling a remote target remove with a perfectly
well constructed ARN can lead to a crash for a bucket
with no replication configured.

This PR fixes, and adds a crash check for ImportMetadata
as well.
2024-04-30 18:09:56 -07:00
Klaus Post
f64dea2aac Allow custom SFTP algorithm selection (#19636)
Algorithms are comma separated.
Note that valid values does not in all cases represent default values.

`--sftp=pub-key-algos=...` specifies the supported client public key
authentication algorithms. Note that this doesn't include certificate types
since those use the underlying algorithm. This list is sent to the client if
it supports the server-sig-algs extension. Order is irrelevant.

Valid values
```
ssh-ed25519
sk-ssh-ed25519@openssh.com
sk-ecdsa-sha2-nistp256@openssh.com
ecdsa-sha2-nistp256
ecdsa-sha2-nistp384
ecdsa-sha2-nistp521
rsa-sha2-256
rsa-sha2-512
ssh-rsa
ssh-dss
```

`--sftp=kex-algos=...` specifies the supported key-exchange algorithms in preference order.

Valid values:

```
curve25519-sha256
curve25519-sha256@libssh.org
ecdh-sha2-nistp256
ecdh-sha2-nistp384
ecdh-sha2-nistp521
diffie-hellman-group14-sha256
diffie-hellman-group16-sha512
diffie-hellman-group14-sha1
diffie-hellman-group1-sha1
```

`--sftp=cipher-algos=...` specifies the allowed cipher algorithms.
If unspecified then a sensible default is used.

Valid values:
```
aes128-ctr
aes192-ctr
aes256-ctr
aes128-gcm@openssh.com
aes256-gcm@openssh.com
chacha20-poly1305@openssh.com
arcfour256
arcfour128
arcfour
aes128-cbc
3des-cbc
```

`--sftp=mac-algos=...` specifies a default set of MAC algorithms in preference order.
This is based on RFC 4253, section 6.4, but with hmac-md5 variants removed because they have
reached the end of their useful life.

Valid values:

```
hmac-sha2-256-etm@openssh.com
hmac-sha2-512-etm@openssh.com
hmac-sha2-256
hmac-sha2-512
hmac-sha1
hmac-sha1-96
```
2024-04-30 08:15:45 -07:00
Shubhendu
6579304d8c Suppress metrics with zero values (#19638)
This would reduce the size of data in response of metrics
listing. While graphing we can default these metrics with
a zero value if not found.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-04-30 08:05:22 -07:00
Klaus Post
3cf8a7c888 Always unfreeze when connection dies (#19634)
Unfreeze as soon as the incoming connection is terminated and don't wait for everything to complete.

We don't want to keep the services frozen if something becomes stuck.
2024-04-29 10:39:04 -07:00
Harshavardhana
a372c6a377 a bunch of fixes for error handling (#19627)
- handle errFileCorrupt properly
- micro-optimization of sending done() response quicker
  to close the goroutine.
- fix logger.Event() usage in a couple of places
- handle the rest of the client to return a different error other than
  lastErr() when the client is closed.
2024-04-28 10:53:50 -07:00
Poorna
9e95703efc iam reload policy mapping of STS users properly (#19626) 2024-04-27 03:04:10 -07:00
Anis Eleuch
d8e05aca81 heal/list: Fix rare incomplete listing with flaky internode connections (#19625)
listPathRaw() counts errDiskNotFound as a valid error to indicate a
listing stream end. However, storage.WalkDir() is allowed to return
errDiskNotFound anytime since grid.ErrDisconnected is converted to
errDiskNotFound.

This affects fresh disk healing and should affect S3 listing as well.
2024-04-26 12:52:52 -07:00
Praveen raj Mani
410a1ac040 Handle failures in pool rebalancing (#19623) 2024-04-26 12:29:28 -07:00