Commit Graph

6378 Commits

Author SHA1 Message Date
Klaus Post
0680af7414
Set O_NONBLOCK for reads and writes on unix (#20133)
Tracing syscalls, opening and reading an `xl.meta` looks like this:

```
openat(AT_FDCWD, "/mnt/drive1/ss8-old/testbucket/ObjSize4MiBThreads72/(554O51H/peTb(0iztdbTKw59.csv/xl.meta", O_RDONLY|O_NOATIME|O_CLOEXEC) = 34 <0.000>
fcntl(34, F_GETFL)                      = 0x48000 (flags O_RDONLY|O_LARGEFILE|O_NOATIME) <0.000>
fcntl(34, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) = 0 <0.000>
epoll_ctl(4, EPOLL_CTL_ADD, 34, {events=EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, data={u32=3172471557, u64=8145488475984499461}}) = -1 EPERM (Operation not permitted) <0.000>
fcntl(34, F_GETFL)                      = 0x48800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_NOATIME) <0.000>
fcntl(34, F_SETFL, O_RDONLY|O_LARGEFILE|O_NOATIME) = 0 <0.000>
fstat(34, {st_mode=S_IFREG|0644, st_size=354, ...}) = 0 <0.000>
read(34, "XL2 \1\0\3\0\306\0\0\1P\2\2\1\304$\225\304\20\0\0\0\0\0\0\0\0\0\0\0"..., 354) = 354 <0.000>
close(34)                               = 0 <0.000>
```

Everything until `fstat` is the `os.Open` call.

Looking at the code: https://github.com/golang/go/blob/master/src/os/file_unix.go#L212-L243

It seems for every file it "tries" to see if it is pollable. This causes `syscall.SetNonblock(fd, true)` to be called. This is the first `F_SETFL`.

It then calls `f.pfd.Init("file", true)`. This will attempt to set it as pollable using `epoll_ctl`. This will always fail for files. It therefore calls `syscall.SetNonblock(fd, false)` resulting in the second `F_SETFL`.

If we set the `O_NONBLOCK` call on the initial open, we should avoid the 4 `fcntl` syscalls per file.

I don't see any way to avoid the `epoll_ctl` call, since kind is either `kindOpenFile` or `kindNonBlock`, so "pollable" will always be true. However avoiding 4 of 6 syscalls still seems worth it.

This should not have any effect, since files will end up with "nonblock" anyway.
2024-07-23 09:36:24 -07:00
Harshavardhana
91805bcab6
add optimizations to bring performance on unversioned READS (#20128)
allow non-inlined on disk to be inlined via
an unversioned ReadVersion() call, we only
need ReadXL() to resolve objects with multiple
versions only.

The choice of this block makes it to be dynamic
and chosen by the user via `mc admin config set`

Other bonus things

- Start measuring internode TTFB performance.
- Set TCP_NODELAY, TCP_CORK for low latency
2024-07-23 03:53:03 -07:00
jiuker
b3a94c4e85
fix: Use xtime duration to parse batch job (#20117) 2024-07-23 00:05:53 -07:00
Harshavardhana
8e618d45fc
remove unnecessary LRU for internode auth token (#20119)
removes contentious usage of mutexes in LRU, which
were never really reused in any manner; we do not
need it.

To trust hosts, the correct way is TLS certs; this PR completely
removes this dependency, which has never been useful.

```
0  0%  100%  25.83s 26.76%  github.com/hashicorp/golang-lru/v2/expirable.(*LRU[...])
0  0%  100%  28.03s 29.04%  github.com/hashicorp/golang-lru/v2/expirable.(*LRU[...])
```

Bonus: use `x-minio-time` as a nanosecond to avoid unnecessary
parsing logic of time strings instead of using a more
straightforward mechanism.
2024-07-22 00:04:48 -07:00
Harshavardhana
3ef59d2821
do not set KMSSecretKey env from KMSSecretKeyFile (#20122)
fixes #20121
2024-07-21 14:39:15 -07:00
Anis Eleuch
d9ee668b6d
s3: Fix wrong continuation token during listing with ILM enabled bucket (#20113) 2024-07-18 13:37:34 -07:00
Anis Eleuch
2e5d792f0c
batch-expiry: Save progress regularly in the drives and at the end (#20098)
- Also, fix failure reporting at the end.
- Also, avoid parsing report objects when listing or resuming jobs, this
does not cause any bugs, it is only printing, not useful errors.
2024-07-17 09:42:32 -07:00
Poorna
3535197f99
replication: proxy only on missing object or read quorum err (#20101) 2024-07-16 16:46:41 -07:00
Mark Theunissen
698bb93a46
Allow a KMS Action to specify keys in the Resources of a policy (#20079) 2024-07-16 07:03:03 -07:00
Harshavardhana
e8c54c3d6c
add validation test for v3 metrics for all its endpoints (#20094)
add unit test for v3 metrics for all its exposed endpoints

Bonus:
  - support OpenMetrics encoding
  - adds boot time for prometheus
  - continueOnError is better to serve as
    much metrics as possible.
2024-07-15 09:28:02 -07:00
Shubhendu
f944a42886
Removed user and group details from logs (#20072)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-07-14 11:12:07 -07:00
Harshavardhana
eff0ea43aa
fix: typo in BucketUsageMetrics group registration in v3 metrics (#20090)
```
curl http://localhost:9000/minio/metrics/v3/cluster/usage/buckets
```

Did not work as documented, due to the fact that there was a typo
in the bucket usage metrics registration group. This endpoint is
a cluster endpoint and does not require any `buckets` argument.
2024-07-14 11:11:42 -07:00
Harshavardhana
7fcb428622
do not print unexpected logs (#20083) 2024-07-12 13:51:54 -07:00
Klaus Post
83adc2eebf
Fix ListObjects aborting after 3 minute on async request (#20074)
When creating the async listing, if the first request does not return within 3 
minutes, it is stopped, since it isn't being kept alive.

Keep updating `lastHandout` while we are waiting for the initial request to be fulfilled.
2024-07-12 09:23:16 -07:00
Poorna
989c318a28
replication: make large workers configurable (#20077)
This PR also improves throttling by reducing tokens requested
from rate limiter based on available tokens to avoid exceeding
throttle wait deadlines
2024-07-12 07:57:31 -07:00
Taran Pelkey
f5d2fbc84c
Add DecodeDN and QuickNormalizeDN functions to LDAP config (#20076) 2024-07-11 18:04:53 -07:00
Allan Roger Reid
e139673969
Audit failure in batch job key rotate (#20073) 2024-07-11 16:13:15 -07:00
Harshavardhana
a8c6465f22
hide some deprecated fields from 'get' output (#20069)
also update wording on `subnet license="" api_key=""`
2024-07-10 13:16:44 -07:00
Taran Pelkey
6c6f0987dc
Add groups to policy entities (#20052)
* Add groups to policy entities

* update comment

---------

Co-authored-by: Harshavardhana <harsha@minio.io>
2024-07-10 11:41:49 -07:00
Austin Chang
5f64658faa
clarify error message for root user credential (#20043)
Signed-off-by: Austin Chang <austin880625@gmail.com>
2024-07-10 09:57:01 -07:00
Anis Eleuch
ce183cb2b4
heal: List and heal again for any listing error (#19999)
When a fresh drive healing is finished, add more checks for the drive listing
errors. If any, re-list and heal again. Although this is an infrequent use
case to have listPathRaw() returning nil when minDisks is set to 1, we
still need to handle all possible use cases to avoid missing healing
any object.

Also, check for HealObject result to decide of an object is healed in the
fresh disk since HealObject returns nil if an object is healed in any
disk, and not in the new fresh drive.
2024-07-10 09:55:36 -07:00
Klaus Post
b3bac73c0f
Clarify post policy error message (#20067)
It is not really clear that the listed keys are missing.

Clarify the error
2024-07-10 07:18:44 -07:00
Anis Eleuch
e726d8ff0f
list: Hide objects/versions with pending/failed replicated deletion (#20047)
In regular listing, this commit will avoid showing an object when its
latest version has a pending or failed deletion. In replicated setup.
It will also prevent showing older versions in the same case.
2024-07-09 15:26:42 -07:00
Shubhendu
f4230777b3
Log replication errors once (#20063)
Also, sort the error map for multiple sites in ascending order
of deployment IDs, so that the error message generated is always
definitive order and same.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-07-09 10:10:31 -07:00
Krishnan Parthasarathi
380233d646
batch: Update job info object on success (#20053) 2024-07-08 18:45:54 -07:00
Klaus Post
0d0b0aa599
Abstract grid connections (#20038)
Add `ConnDialer` to abstract connection creation.

- `IncomingConn(ctx context.Context, conn net.Conn)` is provided as an entry point for 
   incoming custom connections.

- `ConnectWS` is provided to create web socket connections.
2024-07-08 14:44:00 -07:00
Anis Eleuch
b433bf14ba
Add typos check to Makefile (#20051) 2024-07-08 14:39:49 -07:00
Klaus Post
107d951893
Log ILM failed object name (#20040)
Log so we know which object we are dealing with.

Log each object once.
2024-07-04 07:25:45 -07:00
Shireesh Anjal
22c53b1c70
Remove license update job (#20037) 2024-07-03 11:49:48 -07:00
Mark Theunissen
88926ad8e9
return appropriate error upon tier update for incorrect credentials (#20034) 2024-07-03 00:17:20 -07:00
Harshavardhana
32d04091a2
resume any batch jobs in a goroutine (#20035)
Bonus: move batch job initialization to the last item after all other initialization, 
            allowing for faster startup time for different subsystems.
2024-07-03 00:16:05 -07:00
Harshavardhana
be84a4fd68
do not proxy invalid object names (#20031) 2024-07-02 14:28:55 -07:00
Anis Eleuch
2ec1f404ac
info: Always refresh the root disk status (#20023)
Add root drive status in the disk info cache function, so unmounting a
drive without restarting a local node reflects the correct value.
2024-07-02 13:41:29 -07:00
Klaus Post
2040559f71
Fix SkipReader performance with small initial read (#20030)
If `SkipReader` is called with a small initial buffer it may be doing a huge number if Reads to skip the requested number of bytes. If a small buffer is provided grab a 32K buffer and use that.

Fixes slow execution of `testAPIGetObjectWithMPHandler`.

Bonuses:

* Use `-short` with `-race` test.
* Do all suite test types with `-short`.
* Enable compressed+encrypted in `testAPIGetObjectWithMPHandler`.
* Disable big file tests in `testAPIGetObjectWithMPHandler` when using `-short`.
2024-07-02 08:13:05 -07:00
Anis Eleuch
ca0ce4c6ef
tests: Fix setting max openfds as memory limit (#20029)
The code was advertenly passing max openfds to debug.SetMemoryLimit(),
fixing this accelerate go test in my machine.

This is only a testing bug, since the server context has always a valid
MaxMem, so the buggy code was never called in users environments.
2024-07-02 08:09:36 -07:00
Anis Eleuch
757cf413cb
Add batch status API (#19679)
Currently the status of a completed or failed batch is held in the
memory, a simple restart will lose the status and the user will not
have any visibility of the job that was long running.

In addition to the metrics, add a new API that reads the batch status
from the drives. A batch job will be cleaned up three days after
completion.

Also add the batch type in the batch id, the reason is that the batch
job request is removed immediately when the job is finished, then we
do not know the type of batch job anymore, hence a difficulty to locate
the job report
2024-07-02 01:17:52 -07:00
Anis Eleuch
b35acb3dbc
heal: Add support of healing particular pool/set (#20024) 2024-07-01 15:02:25 -07:00
Sveinn
e404abf103
Letting password enable auth bypass caPublicKey (only if passauth is … (#20022) 2024-07-01 15:02:01 -07:00
jiuker
f7ff19cb18
fix: warning for decommissioned pool while start (#20019) 2024-07-01 07:38:46 -07:00
Poorna
91faaa1387
fix panic in batch replicate (#20014)
Fixes:

```
panic: send on closed channel
	panic: close of closed channel

goroutine 878 [running]:
github.com/minio/minio/internal/ioutil.SafeClose[...](...)
	/Users/kp/code/src/github.com/minio/minio/internal/ioutil/ioutil.go:407
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2.2()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2229 +0xc0
panic({0x108c25e60?, 0x1090b28d0?})
	/usr/local/go/src/runtime/panic.go:770 +0x124
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2.3({{0x1400e397316, 0x5}, {0x1400d88b8a8, 0x8}, {0x1f99d80, 0xede101c42, 0x0}, 0x3bc, 0x0, 0x0, ...})
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2235 +0xb4
github.com/minio/minio/cmd.(*erasureServerPools).Walk.func2()
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2277 +0xabc
created by github.com/minio/minio/cmd.(*erasureServerPools).Walk in goroutine 575
	/Users/kp/code/src/github.com/minio/minio/cmd/erasure-server-pool.go:2210 +0x33c
```
2024-06-28 18:20:47 -07:00
Harshavardhana
f365a98029
fix: hot-reloading STS credential policy documents (#20012)
* fix: hot-reloading STS credential policy documents
* Support Role ARNs hot load policies (#28)

---------

Co-authored-by: Anis Eleuch <vadmeste@users.noreply.github.com>
2024-06-28 16:17:22 -07:00
Taran Pelkey
7ca4ba77c4
Update tests to use AttachPolicy(LDAP) instead of deprecated SetPolicy (#19972) 2024-06-28 02:06:25 -07:00
Poorna
13512170b5
list: Do not decrypt SSE-S3 Etags in a non encrypted format (#20008) 2024-06-27 19:44:56 -07:00
Krishnan Parthasarathi
154fcaeb56
Allow rebalance start when it's stopped/completed (#20009) 2024-06-27 17:22:30 -07:00
Anis Eleuch
722118386d
iam: Hot load of the policy during request authorization (#20007)
Hot load a policy document when during account authorization evaluation
to avoid returning 403 during server startup, when not all policies are
already loaded.

Add this support for group policies as well.
2024-06-27 17:03:07 -07:00
Harshavardhana
709612cb37
fix: rebalance upon pool expansion would crash when in progress (#20004)
you can attempt a rebalance first i.e, start with 2 pools.

```
mc admin rebalance start alias/
```

and after that you can add a new pool, this would
potentially crash.

```
Jun 27 09:22:19 xxx minio[7828]: panic: runtime error: invalid memory address or nil pointer dereference
Jun 27 09:22:19 xxx minio[7828]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x22cc225]
Jun 27 09:22:19 xxx minio[7828]: goroutine 1 [running]:
Jun 27 09:22:19 xxx minio[7828]: github.com/minio/minio/cmd.(*erasureServerPools).findIndex(...)
```
2024-06-27 11:35:34 -07:00
Harshavardhana
b35d083872
fix; change retry-after 60sec for 503s and 10s for 429s (#19996) 2024-06-26 01:32:06 -07:00
Harshavardhana
5e7b243bde
extend cluster health to return errors for IAM, and Bucket metadata (#19995)
Bonus: make API freeze to be opt-in instead of default
2024-06-26 00:44:34 -07:00
Taran Pelkey
3c2141513f
add ListAccessKeysLDAPBulk API to list accessKeys for multiple/all LDAP users (#19835) 2024-06-25 14:21:28 -07:00
Aditya Manthramurthy
602f6a9ad0
Add IAM (re)load timing logs (#19984)
This is useful to debug large IAM load times - the usual cause is when
there are a large amount of temporary accounts.
2024-06-25 10:33:10 -07:00
Harshavardhana
22c5a5b91b
add healing retries when there are failed heal attempts (#19986)
transient errors for long running tasks are normal, allow for
drive to retry again upto 3 times before giving up on healing
the drive.
2024-06-25 10:32:56 -07:00
jiuker
41f508765d
fix: format the scanner object error (#19991) 2024-06-25 08:54:24 -07:00
Aditya Manthramurthy
7dccd1f589
fix: bootstrap msgs should only be sent at startup (#19985) 2024-06-24 19:30:28 -07:00
Harshavardhana
be97ae4c5d
fix: gcs tier going offline due to customer HTTPclient (#19973)
specifying customer HTTP client makes the gcs SDK
ignore the passed credentials, instead let the GCS
SDK manage the transport.

this PR fixes #19922 a regression from #19565
2024-06-21 22:26:45 -07:00
Anis Eleuch
4d7d008741
bootstrap: Speed up bucket metadata loading (#19969)
Currently, bucket metadata is being loaded serially inside ListBuckets
Objet API. Fix that by loading the bucket metadata as the number of
erasure sets * 10, which is a good approximation.
2024-06-21 15:22:24 -07:00
Klaus Post
2d7a3d1516
Return error from mergeEntryChannels (#19970)
- Add error from mergeEntryChannels to `results.`
- Make sure we check the context error before we close the channel.
2024-06-21 12:06:51 -07:00
Harshavardhana
dfab400d43
reject bootup, if binaries are different in a cluster (#19968) 2024-06-21 07:49:49 -07:00
Shireesh Anjal
e200808ab7
fix errors in metrics code on macos (#19965)
- do not load proc fs metrics in case of macos
- null-check TimeStat before accessing
2024-06-20 10:55:03 -07:00
Klaus Post
fae563b85d
Add fixed timed restarts to updates (#19960) 2024-06-20 07:49:22 -07:00
Anis Eleuch
95e4cbbfde
Do not ping event targets during cluster initialization (#19959)
S3 operations are frozen during startup, therefore we should avoid pinging
event targets during the initialization since it can stall.
2024-06-20 07:46:02 -07:00
Harshavardhana
2825294b7b
allow server startup to come online with READ success (#19957) 2024-06-19 22:21:31 -07:00
Sveinn
bce93b5cfa
Removing timeout on shutdown (#19956) 2024-06-19 11:42:47 -07:00
Harshavardhana
7a4b250c8b
avoid waiting for quorum health while debugging (#19955) 2024-06-19 10:12:20 -07:00
Harshavardhana
69e41f87ef
compute localIPs only once per server startup() (#19951)
repeatedly calling this function is not necessary,
on systems with lots of interfaces, including virtual
ones can make this reasonably delayed.
2024-06-19 07:34:00 -07:00
Harshavardhana
ee48f9f206
perform healthchecks before initializing everything fully (#19953)
adds more informative logs that provide details on which
erasure set is losing quorum etc.
2024-06-19 07:33:40 -07:00
Sveinn
9ba39d7fad
Removing a channel that was not being used (#19948) 2024-06-19 01:59:39 -07:00
Harshavardhana
d2fb371f80
do not need response record body (#19949)
since the connection is active, the
response recorder body can grow endlessly
causing leak, as this bytes buffer is
never given back to GC due to an goroutine.
2024-06-19 01:59:21 -07:00
Klaus Post
2f9018f03b
Do regular checks for healing status while scanning (#19946) 2024-06-18 09:11:04 -07:00
Harshavardhana
bbb64eaade
skip healing properly in the scanner when a drive is hotplugged (#19939)
skip healing properly in scanner when drive is hotplugged

due to how the state is passed around the SkipHealing
might not be the true state() of the system always, causing
a situation where we might healing from the scanner on the
same drive which is being. Due to this competing heals get
triggered that slow each other down.
2024-06-17 16:39:11 -07:00
Harshavardhana
7bd1d899bc
remove overzealous check during HEAD() (#19940)
due to a historic bug in CopyObject() where
an inlined object loses its metadata, the
check causes an incorrect fallback verifying
data-dir.

CopyObject() bug was fixed in ffa91f9794 however
the occurrence of this problem is historic, so
the aforementioned check is stretching too much.

Bonus: simplify fileInfoRaw() to read xl.json as well,
also recreate buckets properly.
2024-06-17 07:29:18 -07:00
Harshavardhana
c91d1ec2e3
fix: avoid metadata cache without data for all callers (#19935) 2024-06-14 06:28:35 -07:00
Shubhendu
3bd3470d0b
Corrected names of node replication metrics (#19932)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-13 15:26:54 -07:00
Harshavardhana
ba39ed9af7
loadUser() if not able to load() credential return error (#19931) 2024-06-13 15:26:38 -07:00
jiuker
62e6dc950d
fix: do not update metadata cache upon headObject() (#19929) 2024-06-13 08:42:02 -07:00
Klaus Post
ad04afe381
Fix SSEC multipart checksum replication (#19915)
* Multipart SSEC checksums were not transferred.
* Remove key mismatch logging. This key is user-controlled with SSEC.
* If the source is SSEC and the destination reports ErrSSEEncryptedObject, 
  assume replication is good.
2024-06-12 23:56:12 -07:00
Harshavardhana
d06b63d056
load credential for in-flights requests as singleflight (#19920)
avoid concurrent callers for LoadUser() to even initiate
object read() requests, if an on-going operation is in progress.

this avoids many callers hitting the drives causing I/O
spikes, also allows for loading credentials faster.
2024-06-12 13:47:56 -07:00
Harshavardhana
e3ac4035b9
decrement requests inqueue correctly after the request is processed (#19918) 2024-06-12 01:13:12 -07:00
Harshavardhana
d21b6daa49
fix: avoid crash when delete() returns an error in batch expiration (#19909) 2024-06-11 06:50:53 -07:00
Harshavardhana
55aa431578
fix: on windows avoid ':' as part of the object name (#19907)
fixes #18865
avoid-colon
2024-06-10 20:13:30 -07:00
Harshavardhana
614981e566
allow purge expired STS while loading credentials (#19905)
the reason for this is to avoid STS mappings to be
purged without a successful load of other policies,
and all the credentials only loaded successfully
are properly handled.

This also avoids unnecessary cache store which was
implemented earlier for optimization.
2024-06-10 11:45:50 -07:00
Klaus Post
d2eed44c78
Fix replication checksum transfer (#19906)
Compression will be disabled by default if SSE-C is specified. So we can still honor SSE-C.
2024-06-10 10:40:33 -07:00
Anis Eleuch
789cbc6fb2
heal: Dangling check to evaluate object parts separately (#19797) 2024-06-10 08:51:27 -07:00
jiuker
0662c90b5c
fix: copyObject restore with a specific version, update test cases (#19895) 2024-06-10 08:50:49 -07:00
Klaus Post
a2cab02554
Fix SSE-C checksums (#19896)
Compression will be disabled by default if SSE-C is specified. So we can still honor SSE-C.
2024-06-10 08:31:51 -07:00
Harshavardhana
6c7a21df6b
turn-off unexpected debug logging in List() calls (#19903) 2024-06-09 21:34:26 -07:00
Harshavardhana
29a25a538f
fix: make sure we list freeVersions like DEL marker with --versions (#19878)
freeVersions() was being incorrectly skipped; list it as
valid objects properly.

Co-authored-by: Krishnan Parthasarathi <Krishnan Parthasarathi>
2024-06-07 15:18:44 -07:00
Harshavardhana
2dd8faaedc remove unnecessary log in Listing() 2024-06-07 14:52:55 -07:00
Krishnan Parthasarathi
069c4015cd
Don't tier directory objects (#19891)
Directory objects are used by applications that simulate the folder
structure of an on-disk filesystem. These are zero-byte objects with names
ending with '/'. They are only used to check whether a 'folder' exists in
the namespace.
2024-06-07 08:43:17 -07:00
Shubhendu
2f6e03fb60
Calculate correct object size while replication (#19888)
It was missing in case of `replicateObject` but was present for
`replicateAll` already

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-06 12:31:01 -07:00
Klaus Post
0fbb945e13
Disable caching of encrypted objects (#19890)
Don't write encrypted objects to cache, if configured.
2024-06-06 11:39:18 -07:00
Anis Eleuch
b94dd835c9
decom: Fix CurrentSize output when generating the status (#19883)
StartSize starts with the raw free space of all disks in the given pool,
however during the status, CurrentSize is not showing the current free
raw space, as expected at least by `mc admin decom status` since it was
written.
2024-06-06 07:30:43 -07:00
Poorna
5aaef9790f
replication: pass checksum headers to replica (#19834) 2024-06-06 02:36:42 -07:00
Bala FA
7edc352d23
Add ILM metrics in metrics-v3 (#19539)
Signed-off-by: Bala.FA <bala@minio.io>
2024-06-06 02:36:25 -07:00
Poorna
850a84b08a
simplify site replication multipart proxying (#19885) 2024-06-05 18:01:15 -07:00
Taran Pelkey
4148754ce0
Check both given and normalized group DN on LDAP policy detach requests (#19876) 2024-06-05 15:42:40 -07:00
Harshavardhana
2107722829
upgrade go-oidc to fix GO-2024-2631 (#19884) 2024-06-05 15:00:34 -07:00
jiuker
d326ba52e9
feat: support batchJob for windows (#19877) 2024-06-05 08:44:53 -07:00
Sveinn
91e1487de4
Add LDAP public key authentication to SFTP (#19833) 2024-06-05 00:51:13 -07:00
jiuker
90a9f2dd70
fix: log diskerror when detect the disk space failed (#19861) 2024-06-04 09:42:03 -07:00
Harshavardhana
d5e48cfd65
fix: remove DriveOPTimeout for REST callers as they don't work properly (#19873)
Go's net/http is notoriously difficult to have a streaming
deadlines per READ/WRITE on the net.Conn if we add them they
interfere with the Go's internal requirements for a HTTP
connection.

Remove this support for now

fixes #19853
2024-06-04 08:12:57 -07:00
Anis Eleuch
d274566463
race: Fix rare race detected by testing (#19872)
Below is the race warning:

```
WARNING: DATA RACE
Write at 0x00c02d3d27c0 by goroutine 1210:
  github.com/minio/minio/cmd.(*healingTracker).bucketDone()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:273 +0x13a
  github.com/minio/minio/cmd.(*erasureObjects).healErasureSet()
      github.com/minio/minio/cmd/global-heal.go:525 +0x2158
  github.com/minio/minio/cmd.healFreshDisk()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:450 +0x107e
  github.com/minio/minio/cmd.monitorLocalDisksAndHeal.func1()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:528 +0x150
  github.com/minio/minio/cmd.monitorLocalDisksAndHeal.gowrap2()
      github.com/minio/minio/cmd/background-newdisks-heal-ops.go:538 +0x82

Previous read at 0x00c02d3d27c0 by goroutine 1446:
  github.com/minio/minio/cmd.(*erasureObjects).healErasureSet.func5()
      github.com/minio/minio/cmd/global-heal.go:232 +0xfd
```
2024-06-04 08:12:32 -07:00
Shubhendu
21b6204692
Test proxying of DEL marker for bucket replication (#19870)
Make sure to avoid proxying for DEL markers

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-04 04:38:26 -07:00
Taran Pelkey
d98faeb26a
Check if LDAP User has attached policy before creating Service Account (#19843)
Check if ldap user has policy before creating
2024-06-03 12:58:48 -07:00
Klaus Post
0a63dc199c
Add trace sizes to more trace types (#19864)
Add trace sizes to

* ILM traces
* Replication traces
* Healing traces
* Decommission traces
* Rebalance traces
* (s)ftp traces
* http traces.
2024-06-03 08:45:54 -07:00
Klaus Post
e72429c79c
Add sizes to traces (#19851)
added to storage and grid traces. Can provide more context for traces that aren't HTTP. Others may apply.
2024-05-31 22:17:37 -07:00
Klaus Post
c5b3f5553f
Add per connection RPC metrics (#19852)
Provides individual and aggregate stats for each RPC connection.

Example:

```
  "rpc": {
   "collectedAt": "2024-05-31T14:33:29.1373103+02:00",
   "connected": 30,
   "disconnected": 0,
   "outgoingStreams": 69,
   "incomingStreams": 0,
   "outgoingBytes": 174822796,
   "incomingBytes": 175821566,
   "outgoingMessages": 768595,
   "incomingMessages": 768589,
   "outQueue": 0,
   "lastPongTime": "2024-05-31T12:33:28Z",
   "byDestination": {
    "http://127.0.0.1:9001": {
     "collectedAt": "2024-05-31T14:33:29.1373103+02:00",
     "connected": 5,
     "disconnected": 0,
     "outgoingStreams": 2,
     "incomingStreams": 0,
     "outgoingBytes": 38432543,
     "incomingBytes": 66604052,
     "outgoingMessages": 229496,
     "incomingMessages": 229575,
     "outQueue": 0,
     "lastPongTime": "2024-05-31T12:33:27Z"
    },
    "http://127.0.0.1:9002": {
     "collectedAt": "2024-05-31T14:33:29.1373103+02:00",
     "connected": 5,
     "disconnected": 0,
     "outgoingStreams": 6,
     "incomingStreams": 0,
     "outgoingBytes": 38215680,
     "incomingBytes": 66121283,
     "outgoingMessages": 228525,
     "incomingMessages": 228510,
     "outQueue": 0,
     "lastPongTime": "2024-05-31T12:33:27Z"
    },
...
```
2024-05-31 22:16:24 -07:00
Klaus Post
d3ae0aaad3
Add max buffering to SFTP (#19848)
Prevent OOM by adversarial use of SFTP upload by setting a 100MB max upload buffer.
2024-05-31 14:28:07 -07:00
Anis Eleuch
1277ad69a6
heal: Remove .healing.bin when all ES drives are healing (#19846)
In the very rare case when all drives in a erasure set need to be healed,
remove .healing.bin from all drives, otherwise it will be stuck in a
loop

Also, fix a unit test that fails sometimes due to wrong test.
2024-05-31 07:48:50 -07:00
Harshavardhana
8f93e81afb
change service account embedded policy size limit (#19840)
Bonus: trim-off all the unnecessary spaces to allow
for real 2048 characters in policies for STS handlers
and re-use the code in all STS handlers.
2024-05-30 11:10:41 -07:00
Harshavardhana
4af31e654b
avoid pre-populating buffers for deployments < 32GiB memory (#19839) 2024-05-30 04:58:12 -07:00
Harshavardhana
aad50579ba
fix: wire up ILM sub-system properly for help (#19836) 2024-05-30 01:14:58 -07:00
Harshavardhana
38d059b0ae
fix: single node multi-drive must register local drives properly (#19832)
since #19688 there was a regression introduced during drive
lookups for single node multi-drive setups, drive replacement
would not work correctly without this PR.
2024-05-29 13:12:44 -07:00
Klaus Post
bd4eeb4522
Fix flipped EcM, EcN in metadata header (#19831)
Since this is a tuple encoded field we can just flip the struct members.
2024-05-29 12:14:09 -07:00
jiuker
03e3493288
fix: correct parse the tagging error for PostPolicyBucketHandler (#19825) 2024-05-29 11:50:46 -07:00
Harshavardhana
64baedf5a4
fix: hide prefixes for Hadoop properly (#19821) 2024-05-28 15:53:15 -07:00
Anis Eleuch
f79a4ef4d0
policy: More defensive code validating svc:DurationSeconds (#19820)
This does not fix any current issue, but merging https://github.com/minio/madmin-go/pull/282
can lose the validation of the service account expiration time.

Add more defensive code for now. In the future, we should avoid doing
validation in another library.
2024-05-28 10:19:04 -07:00
Taran Pelkey
2d53854b19
Restrict access keys for users and groups to not allow '=' or ',' (#19749)
* initial commit

* Add UTF check

---------

Co-authored-by: Harshavardhana <harsha@minio.io>
2024-05-28 10:14:16 -07:00
jiuker
c904ef966e
feat: support tags for PostPolicy upload (#19816) 2024-05-27 21:44:00 -07:00
Harshavardhana
e0fe7cc391
fix: information disclosure bug in preconditions GET (#19810)
precondition check was being honored before, validating
if anonymous access is allowed on the metadata of an
object, leading to metadata disclosure of the following
headers.

```
Last-Modified
Etag
x-amz-version-id
Expires:
Cache-Control:
```

although the information presented is minimal in nature,
and of opaque nature. It still simply discloses that an
object by a specific name exists or not without even having
enough permissions.
2024-05-27 12:17:46 -07:00
Harshavardhana
9d20dec56a Revert "remove dataErrs from er.deleteIfDangling code"
This reverts commit 7d75b1e758.

This fails multipart tests we need this code to handle
existing challenges, so wait for the comprehensive fix.
2024-05-26 11:13:29 -07:00
Harshavardhana
597a785253
fix: authenticate LDAP via actual DN instead of normalized DN (#19805)
fix: authenticate LDAP via actual DN instead of normalized DN

Normalized DN is only for internal representation, not for
external communication, any communication to LDAP must be
based on actual user DN. LDAP servers do not understand
normalized DN.

fixes #19757
2024-05-25 06:43:06 -07:00
Harshavardhana
7d75b1e758 remove dataErrs from er.deleteIfDangling code
avoid this until a comprehensive change is
merged such as https://github.com/minio/minio/pull/19797
2024-05-24 18:20:04 -07:00
Aditya Manthramurthy
5f78691fcf
ldap: Add user DN attributes list config param (#19758)
This change uses the updated ldap library in minio/pkg (bumped
up to v3). A new config parameter is added for LDAP configuration to
specify extra user attributes to load from the LDAP server and to store
them as additional claims for the user.

A test is added in sts_handlers.go that shows how to access the LDAP
attributes as a claim.

This is in preparation for adding SSH pubkey authentication to MinIO's SFTP
integration.
2024-05-24 16:05:23 -07:00
Shireesh Anjal
a591e06ae5
Add cluster scanner metrics in metrics-v3 (#19517)
endpoint: /minio/metrics/v3/cluster/scanner
metrics:
 - bucket_scans_finished (counter)
 - bucket_scans_started (counter)
 - directories_scanned (counter)
 - last_activity_nano_seconds (gauge)
 - objects_scanned (counter)
 - versions_scanned (counter)
2024-05-24 12:29:25 -07:00
Harshavardhana
443c93c634
compute time spent in ILM properly (#19806) 2024-05-24 12:28:51 -07:00
Shireesh Anjal
5659cddc84
Add cluster config metrics in metrics-v3 (#19507)
endpoint: /minio/metrics/v3/cluster/config
metrics:
- write_quorum
- rrs_parity
- standard_parity
2024-05-24 05:50:46 -07:00
Shireesh Anjal
673a521711
Change endpoint of v3 notification metrics (#19804)
from /cluster/notification to /notification
2024-05-24 04:10:24 -07:00
Shireesh Anjal
7981509cc8
Add cluster and bucket replication metrics in metrics-v3 (#19546)
endpoint: /minio/metrics/v3/cluster/replication
metrics:
- average_active_workers
- average_queued_bytes
- average_queued_count
- average_transfer_rate
- current_active_workers
- current_transfer_rate
- last_minute_queued_bytes
- last_minute_queued_count
- max_active_workers
- max_queued_bytes
- max_queued_count
- max_transfer_rate
- recent_backlog_count

endpoint: /minio/metrics/v3/api/bucket/replication
metrics:
- last_hour_failed_bytes
- last_hour_failed_count
- last_minute_failed_bytes
- last_minute_failed_count
- latency_ms
- proxied_delete_tagging_requests_total
- proxied_get_requests_failures
- proxied_get_requests_total
- proxied_get_tagging_requests_failures
- proxied_get_tagging_requests_total
- proxied_head_requests_failures
- proxied_head_requests_total
- proxied_put_tagging_requests_failures
- proxied_put_tagging_requests_total
- sent_bytes
- sent_count
- total_failed_bytes
- total_failed_count
- proxied_delete_tagging_requests_failures
2024-05-23 00:41:18 -07:00
Harshavardhana
d38e020b29
remove errant logs for disconnected remote (#19793)
Signed-off-by: Harshavardhana <harsha@minio.io>
2024-05-22 18:12:23 -07:00
Poorna
7d29030292
fix list results returned for spark max-keys=2 listing (#19791)
This PR continues fix #19725 for some unhandled cases
2024-05-22 16:16:34 -07:00
Shubhendu
7c7650b7c3
Add sufficient deadlines and countermeasures to handle hung node scenario (#19688)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
Signed-off-by: Harshavardhana <harsha@minio.io>
2024-05-22 16:07:14 -07:00
Harshavardhana
ca80eced24
usage of deadline conn at Accept() breaks websocket (#19789)
fortunately not wired up to use, however if anyone
enables deadlines for conn then sporadically MinIO
startups fail.
2024-05-22 10:49:27 -07:00
jiuker
9906b3ade9
fix: reject ilm rule when bucket LockEnabled (#19785) 2024-05-21 23:50:03 -07:00
Anis Eleuch
bf1769d3e0
xl: Avoid marking a drive offline after one part read failure (#19779)
This commit will fix one rare case of a multipart object that
can be read in theory but GetObject API returned an error.

It turned out that a six years old code was marking a drive offline
when the bitrot streaming fails to read a part in a disk with any error.
This can affect reading a subsequent part, though having enough shards,
but unable to construct because one drive was marked offline earlier.

This commit will remove the drive marking offline code. It will also
close the bitrotstreaming reader before marking it as nil.
2024-05-21 07:36:21 -07:00
Harshavardhana
63e1ad9f29 fix: the user-agent for Veeam 2024-05-20 11:54:52 -07:00
Harshavardhana
1fd90c93ff
re-use StorageAPI while loading drive formats (#19770)
Bonus: safe settings for deployment ID to avoid races
2024-05-19 01:06:49 -07:00
Krishnan Parthasarathi
1228d6bf1a
Return NumVersions in quorum when available (#19766)
Similar to https://github.com/minio/minio/pull/17925
2024-05-17 13:57:37 -07:00
Shireesh Anjal
fc4561c64c
Start callhome immediately after enabling (#19764)
Currently, on enabling callhome (or restarting the server), the callhome
job gets scheduled. This means that one has to wait for 24hrs (the
default frequency duration) to see it in action and to figure out if it
is working as expected.

It will be a better user experience to perform the first callhome
execution immediately after enabling it (or on server start if already
enabled).

Also, generate audit event on callhome execution, setting the error
field in case the execution has failed.
2024-05-17 09:53:34 -07:00
Klaus Post
3b7747b42b
Tweak multipart uploads (#19756)
* Store ModTime in the upload ID; return it when listing instead of the current time.
* Use this ModTime to expire and skip reading the file info.
* Consistent upload sorting in listing (since it now has the ModTime).
* Exclude healing disks to avoid returning an empty list.
2024-05-17 09:40:09 -07:00
Harshavardhana
e432e79324
avoid calling 'admin info' for disk, cpu, net metrics collection (#19762)
resource metrics collection was incorrectly making fan-out
liveness peer calls where it's not needed.
2024-05-17 08:15:13 -07:00
Harshavardhana
08d74819b6
handle racy updates to globalSite config (#19750)
```
==================
WARNING: DATA RACE
Read at 0x0000082be990 by goroutine 205:
  github.com/minio/minio/cmd.setCommonHeaders()

Previous write at 0x0000082be990 by main goroutine:
  github.com/minio/minio/cmd.lookupConfigs()
```
2024-05-16 16:13:47 -07:00
Poorna
aa3fde1784
Add ListObjectsV2 unit test (#19753)
for PR: #19725
2024-05-15 20:40:51 -07:00
Harshavardhana
0b3eb7f218
add more deadlines and pass around context under most situations (#19752) 2024-05-15 15:19:00 -07:00
Klaus Post
b792b36495
Add Veeam storage class override (#19748)
Recent Veeam is very picky about storage class names. Add `_MINIO_VEEAM_FORCE_SC` env var.

It will override the storage class returned by the storage backend if it is non-standard
and we detect a Veeam client by checking the User Agent.

Applies to HeadObject/GetObject/ListObject*
2024-05-15 11:04:16 -07:00
Harshavardhana
d3db7d31a3
fix: add deadlines for all synchronous REST callers (#19741)
add deadlines that can be dynamically changed via
the drive max timeout values.

Bonus: optimize "file not found" case and hung drives/network - circuit break the check and return right
away instead of waiting.
2024-05-15 09:52:29 -07:00
Shireesh Anjal
c05ca63158
Fix crash on /minio/metrics/v3?list (#19745)
An unchecked map access was causing panic.
2024-05-15 09:06:35 -07:00
Shireesh Anjal
0e59e50b39
Capture ttfb api metrics only for GetObject (#19733)
as that is the only API where the TTFB metric is beneficial, and
capturing this for all APIs exponentially increases the response size in
large clusters.
2024-05-14 23:25:13 -07:00
Klaus Post
d4b391de1b
Add PutObject Ring Buffer (#19605)
Replace the `io.Pipe` from streamingBitrotWriter -> CreateFile with a fixed size ring buffer.

This will add an output buffer for encoded shards to be written to disk - potentially via RPC.

This will remove blocking when `(*streamingBitrotWriter).Write` is called, and it writes hashes and data.

With current settings, the write looks like this:

```
Outbound
┌───────────────────┐             ┌────────────────┐               ┌───────────────┐                      ┌────────────────┐
│                   │   Parr.     │                │  (http body)  │               │                      │                │
│ Bitrot Hash       │     Write   │      Pipe      │      Read     │  HTTP buffer  │    Write (syscall)   │  TCP Buffer    │
│ Erasure Shard     │ ──────────► │  (unbuffered)  │ ────────────► │   (64K Max)   │ ───────────────────► │    (4MB)       │
│                   │             │                │               │  (io.Copy)    │                      │                │
└───────────────────┘             └────────────────┘               └───────────────┘                      └────────────────┘
```

We write a Hash (32 bytes). Since the pipe is unbuffered, it will block until the 32 bytes have 
been delivered to the TCP buffer, and the next Read hits the Pipe.

Then we write the shard data. This will typically be bigger than 64KB, so it will block until two blocks 
have been read from the pipe.

When we insert a ring buffer:

```
Outbound
┌───────────────────┐             ┌────────────────┐               ┌───────────────┐                      ┌────────────────┐
│                   │             │                │  (http body)  │               │                      │                │
│ Bitrot Hash       │     Write   │  Ring Buffer   │      Read     │  HTTP buffer  │    Write (syscall)   │  TCP Buffer    │
│ Erasure Shard     │ ──────────► │    (2MB)       │ ────────────► │   (64K Max)   │ ───────────────────► │    (4MB)       │
│                   │             │                │               │  (io.Copy)    │                      │                │
└───────────────────┘             └────────────────┘               └───────────────┘                      └────────────────┘
```

The hash+shard will fit within the ring buffer, so writes will not block - but will complete after a 
memcopy. Reads can fill the 64KB buffer if there is data for it.

If the network is congested, the ring buffer will become filled, and all syscalls will be on full buffers.
Only when the ring buffer is filled will erasure coding start blocking.

Since there is always "space" to write output data, we remove the parallel writing since we are 
always writing to memory now, and the goroutine synchronization overhead probably not worth taking. 

If the output were blocked in the existing, we would still wait for it to unblock in parallel write, so it would 
make no difference there - except now the ring buffer smoothes out the load.

There are some micro-optimizations we could look at later. The biggest is that, in most cases, 
we could encode directly to the ring buffer - if we are not at a boundary. Also, "force filling" the 
Read requests (i.e., blocking until a full read can be completed) could be investigated and maybe 
allow concurrent memory on read and write.
2024-05-14 17:11:04 -07:00
Olli Janatuinen
534e7161df
SFTP: Correctly inform client about unsupported commands (#19735) 2024-05-14 03:29:30 -07:00
Harshavardhana
9b219cd646
fix: return quorum based error, temporary failures must be ignored (#19732) 2024-05-14 03:29:17 -07:00
Shireesh Anjal
3bab4822f3
Add logger webhook metrics in metrics-v3 (#19515)
endpoint: /minio/metrics/v3/cluster/webhook
metrics:
- failed_messages (counter)
- online (gauge)
- queue_length (gauge)
- total_messages (counter)
2024-05-14 00:27:33 -07:00
coderwander
3c5f2d8916
fix some typo in struct name comments (#19513)
Signed-off-by: coderwander <770732124@qq.com>
2024-05-14 00:26:50 -07:00
Shireesh Anjal
5808190398
Add more metrics to v3/cluster/erasure-set (#19714)
Metrics being added:

- read_tolerance: No of drive failures that can be tolerated without
  disrupting read operations
- write_tolerance: No of drive failures that can be tolerated without
  disrupting write operations
- read_health: Health of the erasure set in a pool for read operations
  (1=healthy, 0=unhealthy)
- write_health: Health of the erasure set in a pool for write operations
  (1=healthy, 0=unhealthy)
2024-05-14 00:25:56 -07:00
Shireesh Anjal
b2a82248b1
Move /system/go to /debug/go (#19707) 2024-05-14 00:25:37 -07:00
Klaus Post
c36eaedb93
Re-add "Fix incorrect merging of slash-suffixed objects (#19729)
Adds regression test for #19699

Failures are a bit luck based, since it requires objects to be placed on different sets.

However this generates a failure prior to #19699

* Revert "Revert "Fix incorrect merging of slash-suffixed objects (#19699)""

This reverts commit f30417d9a8.

* Don't override when suffix doesn't match. Instead rely on quorum for each.
2024-05-13 09:30:24 -07:00
Poorna
7752b03add
optimize max-keys=2 listing for spark workloads (#19725)
to return results appropriately for versioned buckets, especially
when underlying prefixes have been deleted
2024-05-13 07:57:42 -07:00
Shireesh Anjal
074d70112d
Consolidate drive health related metrics into single metric (#19706)
Instead of having "online" and "healing" as two metrics, replace with a
single metric "health" which can have following values:

0 = offline
1 = healthy
2 = healing
2024-05-12 10:23:50 -07:00
Harshavardhana
e8d14c0d90
verify preconditions during CompleteMultipart (#19713)
Bonus: hold the write lock properly to apply
optimistic concurrency during NewMultipartUpload()
2024-05-10 17:31:22 -07:00
Shireesh Anjal
60d7e8143a
Move /cluster/audit to /audit (#19708)
As the audit metrics are server level and not 
overall cluster level.
2024-05-10 07:50:39 -07:00
Klaus Post
9667a170de
Add usage cache cleanup and lower forced top compaction (#19719)
Lower forced compaction to 250K entries.

If there is more than 250K entries on the top level force compact it and log an error.
2024-05-10 07:49:50 -07:00
Harshavardhana
b598402738 fix: unexpected credentials missing while passing 2024-05-09 18:41:38 -07:00
Harshavardhana
72ff69d9bb
add log-prefix name for specifying custom log-name (#19712) 2024-05-09 14:29:37 -07:00
Harshavardhana
f30417d9a8 Revert "Fix incorrect merging of slash-suffixed objects (#19699)"
This reverts commit 2f7a10ab31.
2024-05-09 12:32:05 -07:00
jiuker
47a4ad3cd7
fix: truncate Expiration to second when Add ServiceAccount (#19674)
Truncate Expiration at the second when Add ServiceAccount
2024-05-09 11:08:04 -07:00
Klaus Post
2f7a10ab31
Fix incorrect merging of slash-suffixed objects (#19699)
If two objects share everything but one object has a slash prefix, those would be merged in listings, 
with secondary properties used for a tiebreak.

Example: An object with the key `prefix/obj` would be merged with an object named `prefix/obj/`. 
While this violates the [no object can be a prefix of another](https://min.io/docs/minio/linux/operations/concepts/thresholds.html#conflicting-objects), let's resolve these.

If we have an object with 'name' and a directory named 'name/' discard the directory only - but allow objects 
of 'name' and 'name/' (xldir) to be uniquely returned.

Regression from #15772
2024-05-09 11:05:45 -07:00
Harshavardhana
b534dc69ab
deprecate unexpected healing failed counters (#19705)
simplify this to avoid verbose metrics, and make
room for valid metrics to be reported for alerting
etc.
2024-05-09 11:04:41 -07:00
Harshavardhana
7b7d2ea7d4
pass around correct endpoint while registering remote storage (#19710) 2024-05-09 11:03:54 -07:00
Aditya Manthramurthy
e00de1c302
ldap-import: Add additional logs (#19691)
These logs are being added to provide better debugging of LDAP
normalization on IAM import.
2024-05-09 10:52:53 -07:00
Harshavardhana
3549e583a6
results must be a single channel to avoid overwriting healing.bin (#19702) 2024-05-09 10:15:03 -07:00
Andi
f5e3eedf34
chore: use errors.New to replace fmt.Errorf with no parameters (#19568)
Signed-off-by: ChengenH <hce19970702@gmail.com>
2024-05-09 01:44:07 -07:00
Harshavardhana
9a267f9270
allow caller context during reloads() to cancel (#19687)
canceled callers might linger around longer,
can potentially overwhelm the system. Instead
provider a caller context and canceled callers
don't hold on to them.

Bonus: we have no reason to cache errors, we should
never cache errors otherwise we can potentially have
quorum errors creeping in unexpectedly. We should
let the cache when invalidating hit the actual resources
instead.
2024-05-08 17:51:34 -07:00
Klaus Post
ec49fff583
Accept multipart checksums with part count (#19680)
Accept multipart uploads where the combined checksum provides the expected part count.

It seems this was added by AWS to make the API more consistent, even if the 
data is entirely superfluous on multiple levels.

Improves AWS S3 compatibility.
2024-05-08 09:18:34 -07:00
Andreas Auernhammer
8b660e18f2
kms: add support for MinKMS and remove some unused/broken code (#19368)
This commit adds support for MinKMS. Now, there are three KMS
implementations in `internal/kms`: Builtin, MinIO KES and MinIO KMS.

Adding another KMS integration required some cleanup. In particular:
 - Various KMS APIs that haven't been and are not used have been
   removed. A lot of the code was broken anyway.
 - Metrics are now monitored by the `kms.KMS` itself. For basic
   metrics this is simpler than collecting metrics for external
   servers. In particular, each KES server returns its own metrics
   and no cluster-level view.
 - The builtin KMS now uses the same en/decryption implemented by
   MinKMS and KES. It still supports decryption of the previous
   ciphertext format. It's backwards compatible.
 - Data encryption keys now include a master key version since MinKMS
   supports multiple versions (~4 billion in total and 10000 concurrent)
   per key name.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-05-07 16:55:37 -07:00
Harshavardhana
981497799a
return appropriate error upon reaching maxClients() (#19669) 2024-05-07 13:41:56 -07:00
Olli Janatuinen
b413ff9fdb
Support user certificate based authentication on SFTP (#19650) 2024-05-06 23:41:25 -07:00
Harshavardhana
6a15580817
fix: collect quorum errors for deletePrefix() (#19685)
do not return error for single drive being offline.
2024-05-06 22:44:46 -07:00
Cesar N
39633a5581
Set Console Redirect URL env variable (#19683) 2024-05-06 19:47:59 -07:00
Harshavardhana
888d2bb1d8
support ETag value to be '*' (#19682)
This supports '*' as per behavior to
comply with AWS S3 behavior for

- 'If-Match: *'
- 'If-None-Match: *'
2024-05-06 17:08:42 -07:00
Klaus Post
847ee5ac45
Make WalkDir return errors (#19677)
If used, 'opts.Marker` will cause many missed entries since results are returned 
unsorted, and pools are serialized.

Switch to fully concurrent listing and merging across pools to return sorted entries.
2024-05-06 13:27:52 -07:00
jiuker
9a9a49aa84
fix: Ignore AWSAccessKeyId check for SignV2 policy condition (#19673) 2024-05-06 03:52:41 -07:00
Harshavardhana
a03ca80269
support 'mc support perf object' with root login disabled (#19672)
It is expected that whoever is using the credentials which has
the proper set of permissions must be able to run.

`mc support perf object`

While the root login is disabled.
2024-05-06 02:45:10 -07:00
Harshavardhana
523bd769f1
add support for specific error response for InvalidRange (#19668)
fixes #19648

AWS S3 returns the actual object size as part of XML
response for InvalidRange error, this is used apparently
by SDKs to retry the request without the range.
2024-05-05 09:56:21 -07:00
Harshavardhana
8ff70ea5a9
turn-off coloring if we have std{err,out} dumb terminals (#19667) 2024-05-03 17:17:57 -07:00
Harshavardhana
da3e7747ca
avoid using 10MiB EC buffers in maxAPI calculations (#19665)
max requests per node is more conservative in its value
causing premature serialization of the calls, avoid it
for newer deployments.
2024-05-03 13:08:20 -07:00
Klaus Post
4afb59e63f
fix: walk missing entries with opts.Marker set (#19661)
'opts.Marker` is causing many missed entries if used since results are returned unsorted. Also since pools are serialized.

Switch to do fully concurrent listing and merging across pools to return sorted entries.

Returning errors on listings is impossible with the current API, so document that.

Return an error at once if no drives are found instead of just returning an empty listing and no error.
2024-05-03 10:26:51 -07:00
Harshavardhana
1526e7ece3
extend server config.yaml to support per pool set drive count (#19663)
This is to support deployments migrating from a multi-pooled
wider stripe to lower stripe. MINIO_STORAGE_CLASS_STANDARD
is still expected to be same for all pools. So you can satisfy
adding custom drive count based pools by adjusting the storage
class value.

```
version: v2
address: ':9000'
rootUser: 'minioadmin'
rootPassword: 'minioadmin'
console-address: ':9001'
pools: # Specify the nodes and drives with pools
  -
    args:
        - 'node{11...14}.example.net/data{1...4}'
  -
    args:
        - 'node{15...18}.example.net/data{1...4}'
  -
    args:
        - 'node{19...22}.example.net/data{1...4}'
  -
    args:
        - 'node{23...34}.example.net/data{1...10}'
    set-drive-count: 6
```
2024-05-03 08:54:03 -07:00
Krishnan Parthasarathi
6c07bfee8a
With retention, skip actions expiring all versions (#19657)
ILM actions due to ExpiredObjectDeleteAllVersions and
DelMarkerExpiration are ignored when object locking is enabled on a
bucket.
Note: This applies to object versions which may not have retention
configured on them. This applies to all object versions in this bucket,
including those created before the retention config was applied.
2024-05-03 04:18:58 -07:00
Poorna
446c760820
replication: Avoid proxying if requested object is a deletemarker (#19656)
Fixes: #19654
2024-05-02 13:15:54 -07:00
Shireesh Anjal
04f92f1291
Change endpoint format for per-bucket metrics (#19655)
Per-bucket metrics endpoints always start with /bucket and the bucket
name is appended to the path. e.g. if the collector path is /bucket/api,
the endpoint for the bucket "mybucket" would be
/minio/metrics/v3/bucket/api/mybucket

Change the existing bucket api endpoint accordingly from /api/bucket to
/bucket/api
2024-05-02 10:37:57 -07:00
Bala FA
e5b16adb1c
Add cluster IAM metrics in metrics-v3 (#19595)
Signed-off-by: Bala.FA <bala@minio.io>
2024-05-02 01:20:42 -07:00
Harshavardhana
402a3ac719
support compression after rotation of logs (#19647) 2024-05-01 15:38:07 -07:00
Aditya Manthramurthy
f3d61c51fc
fix: Filter out cust. AssumeRole Token for audit (#19646)
The `Token` parameter is a sensitive value that should not be output in the Audit log for STS AssumeRoleWithCustomToken API.

Bonus: Add a simple tool that echoes audit logs to the console.
2024-05-01 14:31:13 -07:00
Klaus Post
0cde17ae5d
Return listing when exceeding min disk errors (#19644)
When listing, with drives returning `errFileNotFound,` `errVolumeNotFound`, or `errUnformattedDisk,`, 
we could get below `minDisks` drives being left.

This would result in a quorum never being reachable for any object. Therefore, the listing 
would continue, but no results would ever be produced.

Include `fnf` in the mindisk check since it is incremented on these errors. This will stop 
listing when minDisks are left.

Allow `opts.minDisks` to not return errVolumeNotFound or errFileNotFound and return that. 
That will allow for good results even if disks return something else.

We switch `errUnformattedDisk` to a regular error. If we have enough of those, we should just fail.
2024-05-01 10:59:08 -07:00
Harshavardhana
8c1bba681b
add logrotate support for MinIO logs (#19641) 2024-05-01 10:57:52 -07:00
Klaus Post
dbfb5e797b
Wait one minute after startup to restart decommissioning (#19645)
Typically not all drives are connected, so we delay 3 minutes before resuming.
This greatly reduces risk of starting to list unconnected drives, or drives we risk being disconnected soon.

This delay is not applied when starting with an admin call.
2024-05-01 08:18:21 -07:00
Harshavardhana
08ff702434
enhance ListSVCs() API to return more info to avoid InfoSvc() (#19642)
ConsoleUI like applications rely on combination of

ListServiceAccounts() and InfoServiceAccount() to populate
UI elements, however individually these calls can be slow
causing the entire UI to load sluggishly.
2024-05-01 05:41:13 -07:00
Klaus Post
0e2148264a
Fix --stfp "mac-algos=..." overwrites cipher algorithms (#19643)
Setting MAC algorithms overwrites cipher algorithms.

Followup to #19636
2024-05-01 04:07:40 -07:00
Krishnan Parthasarathi
7926401cbd
ilm: Handle DeleteAllVersions action differently for DEL markers (#19481)
i.e., this rule element doesn't apply to DEL markers.

This is a breaking change to how ExpiredObejctDeleteAllVersions
functions today. This is necessary to avoid the following highly probable
footgun scenario in the future.

Scenario:
The user uses tags-based filtering to select an object's time to live(TTL). 
The application sometimes deletes objects, too, making its latest
version a DEL marker. The previous implementation skipped tag-based filters
if the newest version was DEL marker, voiding the tag-based TTL. The user is
surprised to find objects that have expired sooner than expected.

* Add DelMarkerExpiration action

This ILM action removes all versions of an object if its
the latest version is a DEL marker.

```xml
<DelMarkerObjectExpiration>
    <Days> 10 </Days>
</DelMarkerObjectExpiration>
```

1. Applies only to objects whose,
  • The latest version is a DEL marker.
  • satisfies the number of days criteria
2. Deletes all versions of this object
3. Associated rule can't have tag-based filtering

Includes,
- New bucket event type for deletion due to DelMarkerExpiration
2024-04-30 18:11:10 -07:00
Harshavardhana
8161411c5d
fix: a crash in RemoveReplication target (#19640)
calling a remote target remove with a perfectly
well constructed ARN can lead to a crash for a bucket
with no replication configured.

This PR fixes, and adds a crash check for ImportMetadata
as well.
2024-04-30 18:09:56 -07:00
Klaus Post
f64dea2aac
Allow custom SFTP algorithm selection (#19636)
Algorithms are comma separated.
Note that valid values does not in all cases represent default values.

`--sftp=pub-key-algos=...` specifies the supported client public key
authentication algorithms. Note that this doesn't include certificate types
since those use the underlying algorithm. This list is sent to the client if
it supports the server-sig-algs extension. Order is irrelevant.

Valid values
```
ssh-ed25519
sk-ssh-ed25519@openssh.com
sk-ecdsa-sha2-nistp256@openssh.com
ecdsa-sha2-nistp256
ecdsa-sha2-nistp384
ecdsa-sha2-nistp521
rsa-sha2-256
rsa-sha2-512
ssh-rsa
ssh-dss
```

`--sftp=kex-algos=...` specifies the supported key-exchange algorithms in preference order.

Valid values:

```
curve25519-sha256
curve25519-sha256@libssh.org
ecdh-sha2-nistp256
ecdh-sha2-nistp384
ecdh-sha2-nistp521
diffie-hellman-group14-sha256
diffie-hellman-group16-sha512
diffie-hellman-group14-sha1
diffie-hellman-group1-sha1
```

`--sftp=cipher-algos=...` specifies the allowed cipher algorithms.
If unspecified then a sensible default is used.

Valid values:
```
aes128-ctr
aes192-ctr
aes256-ctr
aes128-gcm@openssh.com
aes256-gcm@openssh.com
chacha20-poly1305@openssh.com
arcfour256
arcfour128
arcfour
aes128-cbc
3des-cbc
```

`--sftp=mac-algos=...` specifies a default set of MAC algorithms in preference order.
This is based on RFC 4253, section 6.4, but with hmac-md5 variants removed because they have
reached the end of their useful life.

Valid values:

```
hmac-sha2-256-etm@openssh.com
hmac-sha2-512-etm@openssh.com
hmac-sha2-256
hmac-sha2-512
hmac-sha1
hmac-sha1-96
```
2024-04-30 08:15:45 -07:00