Commit Graph

6124 Commits

Author SHA1 Message Date
Anis Eleuch
d8e05aca81
heal/list: Fix rare incomplete listing with flaky internode connections (#19625)
listPathRaw() counts errDiskNotFound as a valid error to indicate a
listing stream end. However, storage.WalkDir() is allowed to return
errDiskNotFound anytime since grid.ErrDisconnected is converted to
errDiskNotFound.

This affects fresh disk healing and should affect S3 listing as well.
2024-04-26 12:52:52 -07:00
Praveen raj Mani
410a1ac040
Handle failures in pool rebalancing (#19623) 2024-04-26 12:29:28 -07:00
Shireesh Anjal
4caa3422bd
Add process metrics in metrics-v3 (#19612)
endpoint: /minio/metrics/v3/system/process
metrics:
- locks_read_total
- locks_write_total
- cpu_total_seconds
- go_routine_total
- io_rchar_bytes
- io_read_bytes
- io_wchar_bytes
- io_write_bytes
- start_time_seconds
- uptime_seconds
- file_descriptor_limit_total
- file_descriptor_open_total
- syscall_read_total
- syscall_write_total
- resident_memory_bytes
- virtual_memory_bytes
- virtual_memory_max_bytes

Since the standard process collector implements only a subset of these
metrics, remove it and implement our own custom process collector that
captures all the process metrics we need.
2024-04-26 09:07:23 -07:00
Anis Eleuch
135874ebdc
heal: Avoid marking a bucket as done when remote drives are offline (#19587) 2024-04-25 23:32:14 -07:00
Poorna
e7aa26dc29
fix: allow DeleteObject unversioned objects with insufficient read quorum (#19581)
Since the object is being permanently deleted, the lack of read quorum should not
matter as long as sufficient disks are online to complete the deletion with parity
requirements.

If several pools have the same object with insufficient read quorum, attempt to
delete object from all the pools where it exists
2024-04-25 17:31:12 -07:00
Harshavardhana
c54ffde568
add metrics ioerror counter for alerts on I/O errors (#19618) 2024-04-25 15:01:31 -07:00
Anis Eleuch
9a3c992d7a
heal: Fix regression in healing a new fresh drive (#19615) 2024-04-25 14:55:41 -07:00
Aditya Manthramurthy
0c855638de
fix: LDAP init. issue when LDAP server is down (#19619)
At server startup, LDAP configuration is validated against the LDAP
server. If the LDAP server is down at that point, we need to cleanly
disable LDAP configuration. Previously, LDAP would remain configured but
error out in strange ways because initialization did not complete
without errors.
2024-04-25 14:28:16 -07:00
Ramon de Klein
4c0acba62d
Fixes an internal error while force-deleting a bucket (#19614) 2024-04-25 09:27:27 -07:00
Aditya Manthramurthy
62c3cdee75
fix: IAM LDAP access key import bug (#19608)
When importing access keys (i.e. service accounts) for LDAP accounts,
we are requiring groups to exist under one of the configured group base
DNs. This is not correct. This change fixes this by only checking for
existence and storing the normalized form of the group DN - we do not
return an error if the group is not under a base DN.

Test is updated to illustrate an import failure that would happen
without this change.
2024-04-25 08:50:16 -07:00
Aditya Manthramurthy
3212d0c8cd
fix: IAM import for LDAP should replace mappings (#19607)
Existing IAM import logic for LDAP creates new mappings when the
normalized form of the mapping key differs from the existing mapping key
in storage. This change effectively replaces the existing mapping key by
first deleting it and then recreating with the normalized form of the
mapping key.

For e.g. if an older deployment had a policy mapped to a user DN -

`UID=alice1,OU=people,OU=hwengg,DC=min,DC=io`

instead of adding a mapping for the normalized form -

`uid=alice1,ou=people,ou=hwengg,dc=min,dc=io`

we should replace the existing mapping.

This ensures that duplicates mappings won't remain after the import.

Some additional cleanup cases are also covered. If there are multiple
mappings for the name normalized key such as:

`UID=alice1,OU=people,OU=hwengg,DC=min,DC=io`
`uid=alice1,ou=people,ou=hwengg,DC=min,DC=io`
`uid=alice1,ou=people,ou=hwengg,dc=min,dc=io`

we check if the list of policies mapped to all these keys are exactly
the same, and if so remove all of them and create a single mapping with
the normalized key. However, if the policies mapped to such keys differ,
the import operation returns an error as the server cannot automatically
pick the "right" list of policies to map.
2024-04-25 08:49:53 -07:00
Harshavardhana
1d03bea965
support preserving renameData() on inlined content during overwrites (#19609)
extending #19548 to inlined-data as well.
2024-04-24 18:14:08 -07:00
jiuker
df93ff92ba
fix: site-replication will reset group status when add user (#19594) 2024-04-24 08:54:24 -07:00
Shireesh Anjal
77d5331e85
Fix few wrongly defined metric types (#19586)
`minio_cluster_webhook_queue_length` was wrongly defined as `counter`
where-as it should be `gauge`

Following were wrongly defined as `gauge` when they should actually be
`counter`:

- minio_bucket_replication_sent_bytes
- minio_bucket_replication_received_bytes
- minio_bucket_replication_total_failed_bytes
- minio_bucket_replication_total_failed_count
2024-04-23 23:19:40 -07:00
Bala FA
14cdadfb56
Add cluster notification metrics in metrics-v3 (#19533)
Signed-off-by: Bala.FA <bala@minio.io>
2024-04-23 21:10:35 -07:00
Harshavardhana
f3a52cc195
simplify listener implementation setup customizations in right place (#19589) 2024-04-23 21:08:47 -07:00
Aditya Manthramurthy
7640cd24c9
fix: avoid some IAM import errors if LDAP enabled (#19591)
When LDAP is enabled, previously we were:

- rejecting creation of users and groups via the IAM import functionality

- throwing a `not a valid DN` error when non-LDAP group mappings are present

This change allows for these cases as we need to support situations
where the MinIO server contains users, groups and policy mappings
created before LDAP was enabled.
2024-04-23 18:23:08 -07:00
Shireesh Anjal
f7b665347e
Add system CPU metrics to metrics-v3 (#19560)
endpoint: /minio/metrics/v3/system/cpu

metrics:
- minio_system_cpu_avg_idle
- minio_system_cpu_avg_iowait
- minio_system_cpu_load
- minio_system_cpu_load_perc
- minio_system_cpu_nice
- minio_system_cpu_steal
- minio_system_cpu_system
- minio_system_cpu_user
2024-04-23 16:56:12 -07:00
Harshavardhana
9693c382a8
make renameData() more defensive during overwrites (#19548)
instead upon any error in renameData(), we still
preserve the existing dataDir in some form for
recoverability in strange situations such as out
of disk space type errors.

Bonus: avoid running list and heal() instead allow
versions disparity to return the actual versions,
uuid to heal. Currently limit this to 100 versions
and lesser disparate objects.

an undo now reverts back the xl.meta from xl.meta.bkp
during overwrites on such flaky setups.

Bonus: Save N depth syscalls via skipping the parents
upon overwrites and versioned updates.

Flaky setup examples are stretch clusters with regular
packet drops etc, we need to add some defensive code
around to avoid dangling objects.
2024-04-23 10:15:52 -07:00
jiuker
ee1047bd52
fix: can't get total disksize for decom status (#19585) 2024-04-23 04:33:28 -07:00
Seiya
5ea5ab162b
Remove leading zero strings in return value of (*xlMetaV2)getDataDirs() (#19567)
remove leading zero strings in return value of getDataDirs()
2024-04-22 22:07:37 -07:00
Klaus Post
b5a09ff96b
Fix RenameData data race (#19579)
RenameData could start operating on inline data after timing out 
and the call returned due to WithDeadline.

This could cause a buffer to write to the inline data being written.

Since no writes are in `RenameData` and the call is canceled, 
this doesn't present a corruption issue. But a race is a race and 
should be fixed.

Copy inline data to a fresh buffer.
2024-04-22 22:07:19 -07:00
Harshavardhana
95c65f4e8f
do not panic on rebalance during server restarts (#19563)
This PR makes a feasible approach to handle all the scenarios
that we must face to avoid returning "panic."

Instead, we must return "errServerNotInitialized" when a
bucketMetadataSys.Get() is called, allowing the caller to
retry their operation and wait.

Bonus fix the way data-usage-cache stores the object.
Instead of storing usage-cache.bin with the bucket as
`.minio.sys/buckets`, the `buckets` must be relative
to the bucket `.minio.sys` as part of the object name.

Otherwise, there is no way to decommission entries at
`.minio.sys/buckets` and their final erasure set positions.

A bucket must never have a `/` in it. Adds code to read()
from existing data-usage.bin upon upgrade.
2024-04-22 10:49:30 -07:00
Harshavardhana
6bfff7532e
re-use transport and set stronger backwards compatible Ciphers (#19565)
This PR fixes a few things

- FIPS support for missing for remote transports, causing
  MinIO could end up using non-FIPS Ciphers in FIPS mode

- Avoids too many transports, they all do the same thing
  to make connection pooling work properly re-use them.

- globalTCPOptions must be set before setting transport
  to make sure the client conn deadlines are honored properly.

- GCS warm tier must re-use our transport

- Re-enable trailing headers support.
2024-04-21 04:43:18 -07:00
Harshavardhana
1aa8896ad6 Revert "cleanup: Simplify usage of MinIOSourceProxyRequest (#19553)"
This reverts commit 928c0181bf.

This change was not correct, reverting.

We track 3 states with the ProxyRequest header - if replication process wants
to know if object is already replicated with a HEAD, it shouldn't proxy back
   - Poorna
2024-04-20 02:05:54 -07:00
Krishnan Parthasarathi
3e32ceb39f
Disable trailing header support for MinIO tiers (#19561)
AWS S3 trailing header support was recently enabled on the warm tier
client connection to MinIO type remote tiers. With this enabled, we are
seeing the following error message at http transport layer.

> Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 400 Bad Request\r\nContent-Type: text/plain; charset=utf-8\r\nConnection: close\r\n\r\n400 Bad Request"; err=<nil>

This is an interim fix until we identify the root cause for this behaviour in the
minio-go client package.
2024-04-19 19:32:25 -07:00
jiuker
9205434ed3
fix: ignore signaturev2 for policy header check (#19551) 2024-04-19 09:45:54 -07:00
Harshavardhana
cd50e9b4bc
make LRU cache global for internode tokens (#19555) 2024-04-19 09:45:14 -07:00
Klaus Post
ec816f3840
Reduce parallelReader allocs (#19558) 2024-04-19 09:44:59 -07:00
Klaus Post
5f774951b1
Store object EC in metadata header (#19534)
Keep the EC in header, so it can be retrieved easily for dynamic quorum calculations.

To not force a full metadata decode on every read the value will be 0/0 for data written in previous versions.

Size is expected to increase by 2 bytes per version, since all valid values can be represented with 1 byte each.

Example:
```
λ xl-meta xl.meta
{
  "Versions": [
    {
      "Header": {
        "EcM": 4,
        "EcN": 8,
        "Flags": 6,
        "ModTime": "2024-04-17T11:46:25.325613+02:00",
        "Signature": "0a409875",
        "Type": 1,
        "VersionID": "8e03504e11234957b2727bc53eda0d55"
      },
...
```

Not used for operations yet.
2024-04-19 09:43:43 -07:00
Harshavardhana
72f5cb577e
optimize ftp/sftp upload() implementations to avoid CPU load (#19552) 2024-04-19 05:23:42 -07:00
Robert Lützner
928c0181bf
cleanup: Simplify usage of MinIOSourceProxyRequest (#19553)
This replaces a convoluted condition that ultimately evaluated to

"is this HTTP header present in the request or not?"
2024-04-19 05:23:31 -07:00
Harshavardhana
03767d26da
fix: get rid of large buffers (#19549)
these lead to run-away usage of memory
beyond which the Go's GC can handle, we
have to re-visit this differently, remove
this for now.
2024-04-19 04:26:59 -07:00
Sveinn
108e6f92d4
updating tests to use new mc --enc flags (#19508) 2024-04-19 01:43:09 -07:00
Harshavardhana
d653a59fc0 fix: flaky getHostIP test 2024-04-18 19:09:56 -07:00
Aditya Manthramurthy
98f7821eb3
fix: ldap: avoid unnecessary import errors (#19547)
Follow up for #19528

If there are multiple existing DN mappings for the same normalized DN,
if they all have the same policy mapping value, we pick one of them of
them instead of returning an import error.
2024-04-18 12:09:19 -07:00
Aditya Manthramurthy
ae46ce9937
ldap: Normalize DNs when importing (#19528)
This is a change to IAM export/import functionality. For LDAP enabled
setups, it performs additional validations:

- for policy mappings on LDAP users and groups, it ensures that the
corresponding user or group DN exists and if so uses a normalized form
of these DNs for storage

- for access keys (service accounts), it updates (i.e. validates
existence and normalizes) the internally stored parent user DN and group
DNs.

This allows for a migration path for setups in which LDAP mappings have
been stored in previous versions of the server, where the name of the
mapping file stored on drives is not in a normalized form.

An administrator needs to execute:

`mc admin iam export ALIAS`

followed by

`mc admin iam import ALIAS /path/to/export/file`

The validations are more strict and returns errors when multiple
mappings are found for the same user/group DN. This is to ensure the
mappings stored by the server are unambiguous and to reduce the
potential for confusion.

Bonus **bug fix**: IAM export of access keys (service accounts) did not
export key name, description and expiration. This is fixed in this
change too.
2024-04-18 08:15:02 -07:00
Anis Eleuch
dfc112c06b
list: Fix rare listing continuation freeze (#19524)
Reading the list metacache is not protected by a lock; the code retries when it fails
to read the metacache object, however, it forgot to re-read the metacache object
from the drives, which is necessary, especially if the metacache object is inlined.

This commit will ensure that we always re-read the metacache object from the drives
when it is retrying.
2024-04-17 21:42:11 -07:00
Shireesh Anjal
ca5fab8656
Add cluster audit metrics in metrics-v3 (#19514)
endpoint: /minio/metrics/v3/cluster/audit
metrics:
- failed_messages (counter)
- total_messages (counter)
- target_queue_length (gauge)
2024-04-17 02:18:02 -07:00
Shireesh Anjal
6df76ca73c
Add system memory metrics in v3 (#19486)
Following memory metrics will be added under /system/memory

- available
- buffers
- cache
- free
- shared
- total
- used
- used_perc
2024-04-16 22:10:25 -07:00
Harshavardhana
f65dd3e5a2
reload from drive tier-config when in-memory cache is not found (#19527)
avoid probing tier target while reloading() tier config
2024-04-16 22:09:58 -07:00
Harshavardhana
a8d601b64a
allow detaching any non-normalized DN (#19525) 2024-04-16 17:36:43 -07:00
Klaus Post
e2709ea129
ftp: Return current time for prefixes/directories (#19519) 2024-04-16 17:35:55 -07:00
Allan Roger Reid
740ec80819
At server init, use the correct context when creating the KMS Master Key (#19526) 2024-04-16 17:34:45 -07:00
Allan Roger Reid
7c1f9667d1
Use GetDuration() helper for MINIO_KMS_KEY_CACHE_INTERVAL as time.Duration (#19512)
Bonus: Use default duration of 10 seconds if invalid input < time.Second is specified
2024-04-16 08:43:39 -07:00
Klaus Post
9246990496
fix: ListObjectVersions returning duplicates when resuming with null version id (#19518)
When resuming a versioned listing where `version-id-marker=null`, the `null` object would 
always be returned, causing duplicate entries to be returned.

Add check against empty version
2024-04-16 08:41:27 -07:00
Harshavardhana
cb06aee5ac
convert multipart-cleanup from a blocking unlink() to a rename to trash (#19495)
unlinking() at two different locations on a disk when there
are lots to purge, this can lead to huge IOwaits, instead
rely on rename() to .trash to avoid running multiple unlinks()
in parallel.
2024-04-15 03:02:39 -07:00
Shubhendu
1c70e9ed1b
ILM expiry replication status only if enabled (#19503)
Report ILM expiry replication status only if atleast one site has the
feature enabled.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-04-15 02:40:39 -07:00
jiuker
f3d6a2dd37
code clean for dynamicSleeper (#19499) 2024-04-15 02:40:19 -07:00
Harshavardhana
d1c58fc2eb
remove older deploymentID fix behavior to speed up startup (#19497)
since mid 2018 we do not have any deployments
without deployment-id, it is time to put this
code to rest, this PR removes this old code as
its no longer valuable.

on setups with 1000's of drives these are all
quite expensive operations.
2024-04-15 01:25:46 -07:00
Allan Roger Reid
b8f05b1471
Keep an up-to-date copy of the KMS master key (#19492) 2024-04-15 00:42:50 -07:00
Klaus Post
e7baf78ee8
fix: list operations resuming when hitting different node (#19494)
The rest of the peer clients were not consistent across nodes. So, meta cache requests 
would not go to the same server if a continuation happens on a different node.
2024-04-12 11:13:36 -07:00
Harshavardhana
7e3166475d
simplify common functions in replication (#19480) 2024-04-11 17:27:32 -07:00
Klaus Post
5206c0e883
Inspect: Add error if no results (#19476)
When no results match or another error occurs, add an error to the stream. Keep the "inspect-input.txt" as the only thing in the zip for reference.

Example:

```
λ mc support inspect --airgap myminio/testbucket/fjghfjh/**
mc: Using public key from C:\Users\klaus\mc\support_public.pem
File data successfully downloaded as inspect-data.enc

λ inspect inspect-data.enc
Using private key from support_private.pem
output written to inspect-data.zip
2024/04/11 14:10:51 next stream: GetRawData: No files matched the given pattern

λ unzip -l inspect-data.zip
Archive:  inspect-data.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
      222  2024-04-11 14:10   inspect-input.txt
---------                     -------
      222                     1 file

λ
```

Modifies inspect to read until end of stream to report the error.

Bonus: Add legacy commandline params
2024-04-11 14:22:47 -07:00
Harshavardhana
41ec038523
remove permission denied error for being drive error (#19478) 2024-04-11 14:22:15 -07:00
Shireesh Anjal
08d3d06a06
Add drive metrics in metrics-v3 (#19452)
Add following metrics:

- used_inodes
- total_inodes
- healing
- online
- reads_per_sec
- reads_kb_per_sec
- reads_await
- writes_per_sec
- writes_kb_per_sec
- writes_await
- perc_util

To be able to calculate the `per_sec` values, we capture the IOStats-related 
data in the beginning (along with the time at which they were captured), 
and compare them against the current values subsequently. This is because 
dividing by "time since server uptime." doesn't work in k8s environments.
2024-04-11 10:46:34 -07:00
Harshavardhana
074febd9e1
remove SetDiskLoc() rely on the endpoint values instead (#19475)
the disk location never changes in the lifetime of a
MinIO cluster, even if it did validate this close to the
disk instead at the higher layer.

Return appropriate errors indicating an invalid drive, so
that the drive is not recognized as part of a valid
drive.
2024-04-11 10:45:28 -07:00
Poorna
ffa91f9794
fix CopyObject with replace overwriting inline status (#19468)
Fixes #19450 - internal inline-data header can get overwritten
during copy with replace before this fix.
2024-04-10 23:42:51 -07:00
Harshavardhana
0c31e61343
allow protection from invalid config values (#19460)
we have had numerous reports on some config
values not having default values, causing
features misbehaving and not having default
values set properly.

This PR tries to address all these concerns
once and for all.

Each new sub-system that gets added

- must check for invalid keys
- must have default values set
- must not "return err" when being saved into
  a global state() instead collate as part of
  other subsystem errors allow other sub-systems
  to independently initialize.
2024-04-10 18:10:30 -07:00
Harshavardhana
9b926f7dbe
avoid busy loops in bad path component (#19466)
use it in places where we are looking
for such bad path components.
2024-04-10 18:08:52 -07:00
Harshavardhana
35d8728990
handle missing LDAP normalization in SetPolicy() API (#19465) 2024-04-10 15:37:42 -07:00
Allan Roger Reid
f7ed9a75ba
Allow specifying the local server with env variable _MINIO_SERVER_LOCAL (#19453)
* Allow specifying the local server, with env variable _MINIO_SERVER_LOCAL, in systems where the hostname cannot be resolved to local IP

* Limit scope of the _MINIO_SERVER_LOCAL solution to only containerized implementations
2024-04-10 09:34:59 -07:00
jiuker
ed64e91f06
fix: noHost for collectLocalMetric (#19457) 2024-04-10 09:28:08 -07:00
jiuker
a481825ae1
fix: unknow contentType for ArchiveFileHandler (#19451) 2024-04-09 03:41:25 -07:00
Harshavardhana
7bb0f32332
make if-none-match PUT/POST RFC compliant (#19448)
fixes #19442
2024-04-09 01:17:49 -07:00
Anis Eleuch
c6f8dc431e
Add a warning when the total size of an object versions exceeds 1 TiB (#19435) 2024-04-08 10:45:03 -07:00
Anis Eleuch
787c44c39d
batch-repl: Do not allow both source/target to be remote (#19434)
Return an error when the user specifies endpoints for both source
and target. This can generate many type of errors as the code considers
a deployment remote if its endpoint is specified.
2024-04-08 07:11:38 -07:00
Anis Eleuch
f06fee0364
heal: Add more per disk healing result in the audit (#19427)
HealObject() does not return an error in some cases, for example, when
an object is successfully reconstructed in one disk but fails with other
disks, another case is when a disk does not have the object is temporarily
disconnected

Add the After heal drives result in the audit output for better
analysis.
2024-04-08 02:26:14 -07:00
Harshavardhana
c957e0d426
fix: increase the tiering part size to 128MiB (#19424)
also introduce 8MiB buffer to read from for
bigger parts
2024-04-08 02:22:27 -07:00
Harshavardhana
04101d472f
fix: add fallbackDisks for disk healing (#19425) 2024-04-08 02:22:13 -07:00
Minio Trusted
51fc145161 Update yaml files to latest version RELEASE.2024-04-06T05-26-02Z 2024-04-06 06:44:30 +00:00
Taran Pelkey
9d63bb1b41
Added new API errors for LDAP (#19415)
* change internal errors to named errors

* Change names
2024-04-05 22:26:02 -07:00
Aditya Manthramurthy
8ff2a7a2b9
fix: IAM import/export: remove sts group handling (#19422)
There are no separate STS group mappings to be handled.

Also add tests for basic import/export sanity.
2024-04-05 20:13:35 -07:00
Harshavardhana
91f91d8f47
fix: a regression in IAM policy reload routine() (#19421)
all policy reloading is broken since last release since

48deccdc40

fixes #19417
2024-04-05 14:26:41 -07:00
Harshavardhana
a207bd6790
turn-off Nlink readdir() optimization for NFS/CIFS (#19420)
fixes #19418
fixes #19416
2024-04-05 08:17:08 -07:00
Harshavardhana
96d226c0b1
remove frivolous log about abort-multipart failure in replication (#19413) 2024-04-05 04:39:55 -07:00
Krishnan Parthasarathi
a86d98826d
Set object's original modTime when being restored (#19414)
Set object's modTime when being restored

restored here refers to making a temporary local copy in the hot tier
for a tiered object using the RestoreObject API
2024-04-05 04:39:31 -07:00
Harshavardhana
1bb670ecba
use new generics based LRU from hashicorp (#19409)
we have been using an LRU caching for internode
auth tokens, migrate to using a typed implementation
and also do not cache auth tokens when its an error.
2024-04-04 11:58:48 -07:00
Aditya Manthramurthy
c9e9a8e2b9
fix: ldap: use validated base DNs (#19406)
This fixes a regression from #19358 which prevents policy mappings
created in the latest release from being displayed in policy entity
listing APIs.

This is due to the possibility that the base DNs in the LDAP config are
not in a normalized form and #19358 introduced normalized of mapping
keys (user DNs and group DNs). When listing, we check if the policy
mappings are on entities that parse as valid DNs that are descendants of
the base DNs in the config.

Test added that demonstrates a failure without this fix.
2024-04-04 11:36:18 -07:00
jiuker
272367ccd2
feat: add memlimit flags for setMaxResources (#19400) 2024-04-04 05:06:57 -07:00
Anis Eleuch
95bf4a57b6
logging: Add subsystem to log API (#19002)
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.

Also introduce bugLogIf() for errors that should not happen in the first
place.
2024-04-04 05:04:40 -07:00
Andreas Auernhammer
faeb2b7e79
use GenerateKey as more reliable KMS health-check (#19404)
This commit replaces the `KMS.Stat` API call with a
`KMS.GenerateKey` call. This approach is more reliable
since data key generation also works when the KMS backend
is unavailable (temp. offline), but KES has cached the
key. Ref: KES offline caching.

With this change, it is less likely that MinIO readiness
checks fail in cases where the KMS backend is offline.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-04-03 14:13:20 -07:00
Anis Eleuch
97ce11cb6b
Avoid using a nil transport when the config is not initialized (#19405)
Make sure to pass a nil pointer as a Transport to minio-go  when the API config
is not initialized, this will make sure that we do not pass an interface
with a known type but a nil value.

This will also fix the update of the API remote_transport_deadline
configuration without requiring the cluster restart.
2024-04-03 11:27:05 -07:00
Harshavardhana
4f660a8eb7
fix: missing metrics for healed objects (#19392)
all healed successful objects via queueHealTask
in a non-blocking heal weren't being reported
correctly, this PR fixes this comprehensively.
2024-04-01 23:48:36 -07:00
Praveen raj Mani
ae4fb1b72e
Prioritize the bucket configs first during the decommissioning (#19393) 2024-04-01 23:48:26 -07:00
Klaus Post
b435806d91
Reduce big message RPC allocations (#19390)
Use `ODirectPoolSmall` buffers for inline data in PutObject.

Add a separate call for inline data that will fetch a buffer for the inline data before unmarshal.
2024-04-01 16:42:09 -07:00
Klaus Post
3d6194e93c
Remove empty replication stats (#19385)
When sending final stats upstream also trim empty ReplicationStats.
2024-03-29 11:57:52 -07:00
Harshavardhana
feb9d8480b
add auditing for healing objects (#19379) 2024-03-28 16:46:19 -07:00
Aditya Manthramurthy
48deccdc40
fix: sts accounts map refresh and fewer list calls (#19376)
This fixes a bug where STS Accounts map accumulates accounts in memory
and never removes expired accounts and the STS Policy mappings were not
being refreshed.

The STS purge routine now runs with every IAM credentials load instead
of every 4th time.

The listing of IAM files is now cached on every IAM load operation to
prevent re-listing for STS accounts purging/reload.

Additionally this change makes each server pick a time for IAM loading
that is randomly distributed from a 10 minute interval - this is to
prevent server from thundering while performing the IAM load.

On average, IAM loading will happen between every 5-15min after the
previous IAM load operation completes.
2024-03-28 16:43:50 -07:00
Kaan Kabalak
3f72439b8a
Suppress error log for force-deleting object in locked bucket (#19378) 2024-03-28 14:37:42 -07:00
Shubhendu
468a9fae83
Enable replication of SSE-C objects (#19107)
If site replication enabled across sites, replicate the SSE-C
objects as well. These objects could be read from target sites
using the same client encryption keys.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-28 10:44:56 -07:00
Klaus Post
aa0eec16ab
Remove empty replication stats when sending update (#19375)
When sending update and there is no replication stats - remove the struct.

Will remove an unneeded alloc on the receiver.
2024-03-28 10:13:07 -07:00
jiuker
8222a640ac
fix: slice append lose the data for NSScanner (#19373) 2024-03-28 08:13:36 -07:00
Aditya Manthramurthy
7e45d84ace
ldap: improve normalization of DN values (#19358)
Instead of relying on user input values, we use the DN value returned by
the LDAP server.

This handles cases like when a mapping is set on a DN value
`uid=svc.algorithm,OU=swengg,DC=min,DC=io` with a user input value (with
unicode variation) of `uid=svc﹒algorithm,OU=swengg,DC=min,DC=io`. The
LDAP server on lookup of this DN returns the normalized value where the
unicode dot character `SMALL FULL STOP` (in the user input), gets
replaced with regular full stop.
2024-03-27 23:45:26 -07:00
Harshavardhana
139a606f0a
use bigger partSize per part for tiering to MinIO (#19361)
Bonus: remove persistent md5sum calculation, turn-off
sha256 as well. Instead we always enable crc32c which
is enough for payload verification also support for
trailing headers checksum.
2024-03-27 23:45:08 -07:00
Harshavardhana
289223b6de
expire ILM all versions verify quorum on action (#19359) 2024-03-27 23:44:52 -07:00
Harshavardhana
c61dd16a1e
fix: avoid fan-out DeletePrefix calls for batch-expire and ILM (#19365) 2024-03-27 20:18:15 -07:00
Harshavardhana
3e38fa54a5
set max versions to be IntMax to avoid premature failures (#19360)
let users/customers set relevant values make default value
to be non-applicable.
2024-03-27 18:08:07 -07:00
jiuker
4a02189ba0
feat: add env to choose which node to start decom (#19310)
add a temporary env  _MINIO_DECOM_ENDPOINT to choose 
the node to start decom from, in situations when first node
first pool is not available.
2024-03-27 16:18:40 -07:00
jiuker
ec3a3bb10d
fix: Remove unnecessary loops for searchParent (#19353) 2024-03-27 08:12:14 -07:00
Harshavardhana
364d3a0ac9
fix: new staticheck and linter issues reported (#19340) 2024-03-27 08:10:40 -07:00
Poorna
8bce123bba
fix: precondition check for multipart with existing object replication (#19349) 2024-03-26 15:10:45 -07:00
Harshavardhana
0a56dbde2f
allow configuring inline shard size value (#19336) 2024-03-26 15:06:19 -07:00
Klaus Post
7ff4164d65
Fix races in IAM cache lazy loading (#19346)
Fix races in IAM cache

Fixes #19344

On the top level we only grab a read lock, but we write to the cache if we manage to fetch it.

a03dac41eb/cmd/iam-store.go (L446) is also flipped to what it should be AFAICT.

Change the internal cache structure to a concurrency safe implementation.

Bonus: Also switch grid implementation.
2024-03-26 11:12:57 -07:00
Harshavardhana
dc45a5010d
bring back minor DNS cache for k8s setups (#19341)
k8s as it stands is flaky in DNS lookups,
bring this change back such that we can
cache DNS atleast for 30secs TTL.
2024-03-26 08:00:38 -07:00
jiuker
4b9192034c
fix: should return when error happend (#19342) 2024-03-26 07:51:56 -07:00
Harshavardhana
deeadd1a37
fix: convert multiple callers to use toStorageErr(err) correctly (#19339)
we must attempt to convert all errors at storage-rest-client
into StorageErr() regardless of what functionality is being
called in, this PR fixes this for multiple callers including
some internally used functions.
2024-03-25 23:24:59 -07:00
Sveinn
1fc4203c19
Webhook targets refactor and bug fixes (#19275)
- old version was unable to retain messages during config reload
- old version could not go from memory to disk during reload
- new version can batch disk queue entries to single for to reduce I/O load
- error logging has been improved, previous version would miss certain errors.
- logic for spawning/despawning additional workers has been adjusted to trigger when half capacity is reached, instead of when the log queue becomes full.
- old version would json marshall x2 and unmarshal 1x for every log item. Now we only do marshal x1 and then we GetRaw from the store and send it without having to re-marshal.
2024-03-25 09:44:20 -07:00
Poorna
7fd76dbbb7
fix batch snowball to close channel after listing finishes (#19316)
panic seen due to premature closing of slow channel while listing is still sending or
list has already closed on the sender's side:
```
panic: close of closed channel

goroutine 13666 [running]:
github.com/minio/minio/internal/ioutil.SafeClose[...](0x101ff51e4?)
	/Users/kp/code/src/github.com/minio/minio/internal/ioutil/ioutil.go:425 +0x24
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2142 +0x170
created by github.com/minio/minio/cmd.(*erasureServerPools).Walk in goroutine 1189
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:1985 +0x228
```
2024-03-21 16:13:43 -07:00
Krishnan Parthasarathi
da81c6cc27
Encode dir obj names before expiration (#19305)
Object names of directory objects qualified for ExpiredObjectAllVersions
must be encoded appropriately before calling on deletePrefix on their
erasure set.

e.g., a directory object and regular objects with overlapping prefixes
could lead to the expiration of regular objects, which is not the 
intention of ILM. 

```
bucket/dir/ ---> directory object
bucket/dir/obj-1
```

When `bucket/dir/` qualifies for expiration, the current implementation would
remove regular objects under the prefix `bucket/dir/`, in this case,
`bucket/dir/obj-1`.
2024-03-21 10:21:35 -07:00
Harshavardhana
a03dac41eb
use retry during policy reload from drives (#19307) 2024-03-21 10:19:50 -07:00
Shireesh Anjal
55778ae278
fix: peer addr returned as empty string (#19308)
In handlers related to health diagnostics e.g. CPU, Network, Partitions,
etc, globalMinioHost was being passed as the addr, resulting in empty
value for the same in the health report.

Using globalLocalNodeName instead fixes the issue.
2024-03-21 10:19:14 -07:00
Poorna
d990661d1f
replication: enforce precondition for multipart (#19306) 2024-03-20 18:12:37 -07:00
Harshavardhana
280526caf7
add IAM policyDB lookup fallbacks to drives (#19302)
IAM loading is a lazy operation, allow these
fallbacks to be in place when we cannot find
in-memory state().

this allows us to honor the request even if pay
a small price for lookup and populating the data.
2024-03-20 09:24:04 -07:00
Harshavardhana
1173b26fc8
avoid triggering heals on metacache files if any (#19299) 2024-03-19 20:21:15 -07:00
Krishnan Parthasarathi
383489d5d9
Handle zero versions qualified for expiration (#19301)
When objects have more versions than their ILM policy expects to retain
via NewerNoncurrentVersions, but they don't qualify for expiry due to
NoncurrentDays are configured in that rule. 

In this case, applyNewerNoncurrentVersionsLimit method was enqueuing empty 
tasks, which lead to a panic (panic: runtime error: index out of range [0] with
length 0) in newerNoncurrentTask.OpHash method, which assumes the task
to contain at least one version to expire.
2024-03-19 20:10:58 -07:00
Anis Eleuch
9370b11684
decom: Fix failed status after a failed decommission (#19300)
When returning the status of a decommissioned pool, a pool with zero
time StartedTime will be considered an active pool, which is unexpected. 
This commit will always ensure that a pool's canceled/failed/completed
status is returned.
2024-03-19 20:09:59 -07:00
Anis Eleuch
235edd88aa
xl: Purge instead of moving to trash with near filled disks (#19294)
Immediately remove objects from the trash when the disk is 95% full
2024-03-19 13:26:24 -07:00
Anis Eleuch
b5e074e54c
list: Fix IsTruncated and NextMarker when encountering expired objects (#19290) 2024-03-19 13:23:12 -07:00
Harshavardhana
7213bd7131
add additional logs for the decom during metadata save (#19288) 2024-03-18 15:25:45 -07:00
Harshavardhana
741de4cf94
fix: add a default requests deadline when deadline is 0 (#19287) 2024-03-18 12:30:41 -07:00
Harshavardhana
f168ef9989
implement a flag to specify custom crossdomain.xml (#19262)
fixes #16909
2024-03-17 23:42:40 -07:00
alingse
a0de56abb6
fix: wrong time.Parse params order for replication timestamp (#19279) 2024-03-17 21:19:43 -07:00
Harshavardhana
c201d8bda9
write anything beyond 4k to be written in 4k pages (#19269)
we were prematurely not writing 4k pages while we
could have due to the fact that most buffers would
be multiples of 4k upto some number and there shall
be some remainder.

We only need to write the remainder without O_DIRECT.
2024-03-15 12:27:59 -07:00
Harshavardhana
93fb7d62d8
allow dynamically changing max_object_versions per object (#19265) 2024-03-14 18:07:19 -07:00
Harshavardhana
ce1c640ce0
feat: allow retaining parity SLA to be configurable (#19260)
at scale customers might start with failed drives,
causing skew in the overall usage ratio per EC set.

make this configurable such that customers can turn
this off as needed depending on how comfortable they
are.
2024-03-14 03:38:33 -07:00
Anis Eleuch
24b4f9d748
Fix quorum calculation with zero parity objects (#19250)
Currently, the code relies on object parity to decide whether it is a
delete marker or a regular object. In the case of a delete marker, the
return quorum is half of the disks in the erasure set. However, this
calculation must be corrected with objects with EC = 0, mainly 
because EC is not a one-time fixed configuration.

Though all data are correct, the manifested symptom is a 503 with an 
EC=0 object. This bug was manifested after we introduced the 
fast Get Object feature that does not read all data from all disks in 
case of inlined objects
2024-03-12 12:59:11 -07:00
Harshavardhana
81d7531f1f
only look for valid buckets (#19244)
fixes #19239
2024-03-12 04:33:30 -07:00
Poorna
b4a23f720e
update build constants (#19243) 2024-03-11 17:54:37 -07:00
Dennis Marttinen
6c964fede5
Improve handling of compression inclusion for objects (#19234) 2024-03-11 04:55:34 -07:00
huajin tong
a25a8312d8
fix: some flyby typos in the code (#19212)
Signed-off-by: thirdkeyword <fliterdashen@gmail.com>
2024-03-10 14:09:36 -07:00
Aditya Manthramurthy
b2c5b75efa
feat: Add Metrics V3 API (#19068)
Metrics v3 is mainly a reorganization of metrics into smaller groups of
metrics and the removal of internal aggregation of metrics received from
peer nodes in a MinIO cluster.

This change adds the endpoint `/minio/metrics/v3` as the top-level metrics
endpoint and under this, various sub-endpoints are implemented. These
are currently documented in `docs/metrics/v3.md`

The handler will serve metrics at any path
`/minio/metrics/v3/PATH`, as follows:

when PATH is a sub-endpoint listed above => serves the group of
metrics under that path; or when PATH is a (non-empty) parent 
directory of the sub-endpoints listed above => serves metrics
from each child sub-endpoint of PATH. otherwise, returns a no 
resource found error

All available metrics are listed in the `docs/metrics/v3.md`. More will
be added subsequently.
2024-03-10 01:15:15 -08:00
Harshavardhana
88a89213ff
make immediate purge non-blocking up to 100,000 entries per drive (#19231)
make immediate purge non-blocking upto 100000 entries per drive

Bonus: turn-off O_DIRECT verification when FSType is 'XFS'
2024-03-09 18:53:48 -08:00
Poorna
8e2238ea09
some more cleanup for startup message (#19229) 2024-03-08 22:42:32 -08:00
Poorna
31e8f7c525
Small reformatting of startup message (#19228)
Also changing User-Agent format
2024-03-08 19:07:08 -08:00
Klaus Post
51f62a8da3
Port ListBuckets to websockets layer & some cleanup (#19199) 2024-03-08 11:08:18 -08:00
Klaus Post
650efc2e96
Fix listing in objects split across pools (#19227)
Merging same-object - multiple versions from different pools would not always result in correct ordering.

When merging keep inputs separate.

```
λ mc ls --versions local/testbucket
------ before ------

[2024-03-05 20:17:19 CET]   228B STANDARD 1f163718-9bc5-4b01-bff7-5d8cf09caf10 v3 PUT hosts
[2024-03-05 20:19:56 CET]  19KiB STANDARD null v2 PUT hosts
[2024-03-05 20:17:15 CET]   228B STANDARD 73c9f651-f023-4566-b012-cc537fdb7ce2 v1 PUT hosts

------ after ------
λ mc ls --versions local/testbucket
[2024-03-05 20:19:56 CET]  19KiB STANDARD null v3 PUT hosts
[2024-03-05 20:17:19 CET]   228B STANDARD 1f163718-9bc5-4b01-bff7-5d8cf09caf10 v2 PUT hosts
[2024-03-05 20:17:15 CET]   228B STANDARD 73c9f651-f023-4566-b012-cc537fdb7ce2 v1 PUT hosts
```
2024-03-08 09:50:48 -08:00
Harshavardhana
2cc4997d24
fix: crash on 32bit systems during pre-allocation (#19225) 2024-03-08 05:55:28 -08:00
Poorna
934f6cabf6
sr: use site replicator creds to verify temp user claims (#19224)
This PR continues #19209 which did not handle claims verification of
temporary users created by root in site replication scenario.

Fixes: #19217
2024-03-07 14:30:00 -08:00
Anis Eleuch
68dd74c5ab
batch: Separate batch job request and batch job stats (#19205)
Currently, the progress of the batch job is saved in inside the job
request object, which is normally not supported by MinIO. Though there
is no apparent bug, it is better to fix this now.

Batch progress is saved in .minio.sys/batch-jobs/reports/

Co-authored-by: Anis Eleuch <anis@min.io>
2024-03-07 10:58:22 -08:00
Harshavardhana
48b590e14b
fix: same server to be part of multiple pools (#19216)
our PoolNumber calculation was costly,
while we already had this information per
endpoint, we needed to deduce it appropriately.

This PR addresses this by assigning PoolNumbers
field that carries all the pool numbers that
belong to a server.

properties.PoolNumber still carries a valid value
only when len(properties.PoolNumbers) == 1, otherwise
properties.PoolNumber is set to math.MaxInt (indicating
that this value is undefined) and then one must rely
on properties.PoolNumbers for server participation
in multiple pools.

addresses the issue originating from #11327
2024-03-07 10:24:07 -08:00
Poorna
837a2a3d4b
sr: use service account cred for claims check (#19209)
PR #19111 overlaid service account secret with site replicator secret
during token claims check.

Fixes : #19206
2024-03-06 16:19:24 -08:00
Harshavardhana
74ccee6619
avoid too much auditing during decom/rebalance make it more robust (#19174)
there can be a sudden spike in tiny allocations,
due to too much auditing being done, also don't hang
on the

```
h.logCh <- entry
```

after initializing workers if you do not have a way to
dequeue for some reason.
2024-03-06 03:43:16 -08:00
Poorna
89f759566c
bucket import: avoid overwriting bucket creation date (#19207) 2024-03-05 16:05:28 -08:00
Harshavardhana
cd7551031b
fix: a regression in loading replication creds (#19204)
fixes #19200

generating STS credentials fail with site-replicated
setup, with this error on a fresh environment.
2024-03-05 11:06:17 -08:00
Praveen raj Mani
df57bfcd6c
fix: cluster read health check to return proper values (#19203)
Fixes #19202
2024-03-05 10:25:49 -08:00
Justin Griffin
dfb1f39b57
Support custom endpoint for Azure remote storage tier (#19188)
This commits adds support for using the `--endpoint` arg when creating a
tier of type `azure`. This is needed to connect to Azure's Gov Cloud
instance.  For example,

```
mc ilm tier add azure TARGET TIER_NAME \
   --account-name ACCOUNT \
   --account-key KEY \
   --bucket CONTAINER \
   --endpoint https://ACCOUNT.blob.core.usgovcloudapi.net
   --prefix PREFIX \
   --storage-class STORAGE_CLASS
```

Prior to this, the endpoint was hardcoded to `https://ACCOUNT.blob.core.windows.net`.
The docs were even explicit about this, stating that `--endpoint` is:

  "Required for `s3` or `minio` tier types. This option has no effect for any
  other value of `TIER_TYPE`."

Now, if the endpoint arg is present it will be used.  If not, it will
fall back to the same default behavior of `ACCOUNT.blob.core.windows.net`.
2024-03-05 08:44:08 -08:00
Harshavardhana
1b5f28e99b
fix: skip local disks properly in cluster health maintenance check (#19184) 2024-03-04 20:48:44 -08:00
Krishnan Parthasarathi
b69bcdcdc4
Fix ilm config at startup (#19189)
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment 

https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
2024-03-04 18:50:24 -08:00
Harshavardhana
e385f54185
fix: nLink is unreliable on all filesystems (#19187)
ext4, xfs support this behavior however
btrfs, nfs may not support it properly.

in-case when we see Nlink < 2 then we know
that we need to fallback on readdir()

fixes a regression from #19100

fixes #19181
2024-03-04 15:58:35 -08:00