Commit Graph

845 Commits

Author SHA1 Message Date
Andreas Auernhammer 999bbd3a14
crypto: generate OEK using HMAC-SHA256 instead of SHA256 (#19297)
This commit changes how MinIO generates the object encryption key (OEK)
when encrypting an object using server-side encryption.

This change is fully backwards compatible. Now, MinIO generates
the OEK as following:
```
Nonce = RANDOM(32)        // generate 256 bit random value
OEK = HMAC-SHA256(EK, Context || Nonce)
```

Before, the OEK was computed as following:
```
Nonce = RANDOM(32)        // generate 256 bit random value
OEK = SHA256(EK || Nonce)
```

The new scheme does not technically fix a security issue but
uses a more familiar scheme. The only requirement for the
OEK generation function is that it produces a (pseudo)random value
for every pair (`EK`,`Nonce`) as long as no `EK`-`Nonce` combination
is repeated. This prevents a faulty PRNG from repeating or generating
a "bad" key.

The previous scheme guarantees that the `OEK` is a (pseudo)random
value given that no pair (`EK`,`Nonce`) repeats under the assumption
that SHA256 is indistinguable from a random oracle.

The new scheme guarantees that the `OEK` is a (pseudo)random value
given that no pair (`EK`, `Nonce`) repeats under the assumption that
SHA256's underlying compression function is a PRF/PRP.

While the later is a weaker assumption, and therefore, less likely
to be false, both are considered true. SHA256 is believed to be
indistinguable from a random oracle AND its compression function
is assumed to be a PRF/PRP.

As far as the OEK generating is concerned, the OS random number
generator is not required to be pseudo-random but just non-repeating.

Apart from being more compatible to standard definitions and
descriptions for how to generate crypto. keys, this change does not
have any impact of the actual security of the OEK key generation.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-03-19 13:28:10 -07:00
Harshavardhana 4d7068931a
change the notification queue full message (#19293) 2024-03-19 00:30:10 -07:00
jiuker d7fb6fddf6
feat: add user specific redis auth (#19285) 2024-03-18 21:37:54 -07:00
Harshavardhana d4aac7cd72
add deprecated expiry_workers to be ignored (#19289)
avoids error during upgrades such as
```
API: SYSTEM()
Time: 19:19:22 UTC 03/18/2024
DeploymentID: 24e4b574-b28d-4e94-9bfa-03c363a600c2
Error: Invalid api configuration: found invalid keys (expiry_workers=100 ) for 'api' sub-system, use 'mc admin config reset myminio api' to fix invalid keys (*fmt.wrapError)
      11: internal/logger/logger.go:260:logger.LogIf()
...
```
2024-03-18 15:25:32 -07:00
Harshavardhana c201d8bda9
write anything beyond 4k to be written in 4k pages (#19269)
we were prematurely not writing 4k pages while we
could have due to the fact that most buffers would
be multiples of 4k upto some number and there shall
be some remainder.

We only need to write the remainder without O_DIRECT.
2024-03-15 12:27:59 -07:00
Harshavardhana 93fb7d62d8
allow dynamically changing max_object_versions per object (#19265) 2024-03-14 18:07:19 -07:00
Harshavardhana ce1c640ce0
feat: allow retaining parity SLA to be configurable (#19260)
at scale customers might start with failed drives,
causing skew in the overall usage ratio per EC set.

make this configurable such that customers can turn
this off as needed depending on how comfortable they
are.
2024-03-14 03:38:33 -07:00
Klaus Post 5c32058ff3
cosmetic: Move request goroutines to methods (#19241)
Cosmetic change, but breaks up a big code block and will make a goroutine 
dumps of streams are more readable, so it is clearer what each goroutine is doing.
2024-03-13 11:43:58 -07:00
huajin tong a25a8312d8
fix: some flyby typos in the code (#19212)
Signed-off-by: thirdkeyword <fliterdashen@gmail.com>
2024-03-10 14:09:36 -07:00
Krishnan Parthasarathi 2007dd26ae
ilm: Expire if object past expected expiry date (#19230)
When an object qualifies for both tiering and expiration rules and is
past its expiration date, it should be expired without requiring to tier
it, even when tiering event occurs before expiration.
2024-03-08 22:41:22 -08:00
Klaus Post 51f62a8da3
Port ListBuckets to websockets layer & some cleanup (#19199) 2024-03-08 11:08:18 -08:00
Harshavardhana 233cc3905a
add batchSize support for webhook endpoints (#19214)
configure batch size to send audit/logger events
in batches instead of sending one event per connection.

this is mainly to optimize the number of requests
we make to webhook endpoint.
2024-03-07 12:17:46 -08:00
Harshavardhana e91a4a414c
merge startHTTPLogger() many callers into a simpler pattern (#19211)
simplify audit webhook worker model

fixes couple of bugs like

- ping(ctx) was creating a logger without updating
  number of workers leading to incorrect nWorkers
  scaling, causing an additional worker that is not
  tracked properly.

- h.logCh <- entry could potentially hang for when
  the queue is full on heavily loaded systems.
2024-03-06 08:09:46 -08:00
Harshavardhana 74ccee6619
avoid too much auditing during decom/rebalance make it more robust (#19174)
there can be a sudden spike in tiny allocations,
due to too much auditing being done, also don't hang
on the

```
h.logCh <- entry
```

after initializing workers if you do not have a way to
dequeue for some reason.
2024-03-06 03:43:16 -08:00
Krishnan Parthasarathi c26b8d4eb8
Set expected expiry date for ExpiredObjectAllVersions (#19210) 2024-03-05 22:28:57 -08:00
Krishnan Parthasarathi b69bcdcdc4
Fix ilm config at startup (#19189)
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment 

https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
2024-03-04 18:50:24 -08:00
Krishnan Parthasarathi a7577da768
Improve expiration of tiered objects (#18926)
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
  the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
  @klauspost
- Add error logs when expiryState and transition state are not
  initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
2024-03-01 21:11:03 -08:00
Andreas Auernhammer 09626d78ff
automatically generate root credentials with KMS (#19025)
With this commit, MinIO generates root credentials automatically
and deterministically if:

 - No root credentials have been set.
 - A KMS (KES) is configured.
 - API access for the root credentials is disabled (lockdown mode).

Before, MinIO defaults to `minioadmin` for both the access and
secret keys. Now, MinIO generates unique root credentials
automatically on startup using the KMS.

Therefore, it uses the KMS HMAC function to generate pseudo-random
values. These values never change as long as the KMS key remains
the same, and the KMS key must continue to exist since all IAM data
is encrypted with it.

Backward compatibility:

This commit should not cause existing deployments to break. It only
changes the root credentials of deployments that have a KMS configured
(KES, not a static key) but have not set any admin credentials. Such
implementations should be rare or not exist at all.

Even if the worst case would be updating root credentials in mc
or other clients used to administer the cluster. Root credentials
are anyway not intended for regular S3 operations.

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-03-01 13:09:42 -08:00
Harshavardhana 2c2f5d871c
debug: introduce support for configuring client connect WRITE deadline (#19170)
just like client-conn-read-deadline, added a new flag that does
client-conn-write-deadline as well.

Both are not configured by default, since we do not yet know
what is the right value. Allow this to be configurable if needed.
2024-03-01 08:00:42 -08:00
Harshavardhana c599c11e70
fix: relax metadata checks for healing (#19165)
we should do this to ensure that we focus on
data healing as primary focus, fixing metadata
as part of healing must be done but making
data available is the main focus.

the main reason is metadata inconsistencies can
cause data availability issues, which must be
avoided at all cost.

will be bringing in an additional healing mechanism
that involves "metadata-only" heal, for now we do
not expect to have these checks.

continuation of #19154

Bonus: add a pro-active healthcheck to perform a connection
2024-02-29 22:49:01 -08:00
Klaus Post 40fb3371fa
Mux: Send async mux ack and fix stream error responses (#19149)
Streams can return errors if the cancelation is picked up before the response 
stream close is picked up. Under extreme load, this could lead to missing 
responses.

Send server mux ack async so a blocked send cannot block newMuxStream 
call. Stream will not progress until mux has been acked.
2024-02-28 10:05:18 -08:00
Harshavardhana 51874a5776
fix: allow DNS disconnection events to happen in k8s (#19145)
in k8s things really do come online very asynchronously,
we need to use implementation that allows this randomness.

To facilitate this move WriteAll() as part of the
websocket layer instead.

Bonus: avoid instances of dnscache usage on k8s
2024-02-28 09:54:52 -08:00
Aditya Manthramurthy 62ce52c8fd
cachevalue: simplify exported interface (#19137)
- Also add cache options type
2024-02-28 09:09:09 -08:00
jiuker 0aae0180fb
feat: add userCredentials for nats (#19139) 2024-02-27 10:11:55 -08:00
Anis Eleuch 95032e4710
ilm: Select an object when all AND tags are satisfied (#19134)
Currently, if one object tag matches with one lifecycle tag filter, ILM
will select it, however, this is wrong. All the Tag filters in the
lifecycle document should be satisfied.
2024-02-26 16:01:20 -08:00
Praveen raj Mani 30c2596512
Read drive IO stats from sysfs instead of procfs (#19131)
Currently, we read from `/proc/diskstats` which is found to be
un-reliable in k8s environments. We can read from `sysfs` instead.

Also, cache the latest drive io stats to find the diff and update
the metrics.
2024-02-26 11:34:50 -08:00
Klaus Post 2b5e4b853c
Improve caching (#19130)
* Remove lock for cached operations.
* Rename "Relax" to `ReturnLastGood`.
* Add `CacheError` to allow caching values even on errors.
* Add NoWait that will return current value with async fetching if within 2xTTL.
* Make benchmark somewhat representative.

```
Before: BenchmarkCache-12       16408370                63.12 ns/op            0 B/op
After:  BenchmarkCache-12       428282187                2.789 ns/op           0 B/op
```

* Remove `storageRESTClient.scanning`. Nonsensical - RPC clients will not have any idea about scanning.
* Always fetch remote diskinfo metrics and cache them. Seems most calls are requesting metrics.
* Do async fetching of usage caches.
2024-02-26 10:49:19 -08:00
Harshavardhana a3ac62596c
move timedValue -> cachevalue package (#19114) 2024-02-23 13:28:14 -08:00
Harshavardhana 53aa8f5650
use typos instead of codespell (#19088) 2024-02-21 22:26:06 -08:00
Shubhendu 56887f3208
Add DeleteAll with expiry days non zero value only (#19095)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-02-21 12:28:34 -08:00
Klaus Post 92180bc793
Add array recycling safety (#19103)
Nil entries when recycling arrays.
2024-02-21 12:27:35 -08:00
Klaus Post 22aa16ab12
Fix grid reconnection deadlock (#19101)
If network conditions have filled the output queue before a reconnect happens blocked sends could stop reconnects from happening. In short `respMu` would be held for a mux client while sending - if the queue is full this will never get released and closing the mux client will hang.

A) Use the mux client context instead of connection context for sends, so sends are unblocked when the mux client is canceled.

B) Use a `TryLock` on "close" and cancel the request if we cannot get the lock at once. This will unblock any attempts to send.
2024-02-21 07:49:34 -08:00
Harshavardhana cd419a35fe
simplify broker healthcheck by following kafka guidelines (#19082)
fixes #19081
2024-02-20 00:16:35 -08:00
Klaus Post e06168596f
Convert more peer <--> peer REST calls (#19004)
* Convert more peer <--> peer REST calls
* Clean up in general.
* Add JSON wrapper.
* Add slice wrapper.
* Add option to make handler return nil error if no connection is given, `IgnoreNilConn`.

Converts the following:

```
+	HandlerGetMetrics
+	HandlerGetResourceMetrics
+	HandlerGetMemInfo
+	HandlerGetProcInfo
+	HandlerGetOSInfo
+	HandlerGetPartitions
+	HandlerGetNetInfo
+	HandlerGetCPUs
+	HandlerServerInfo
+	HandlerGetSysConfig
+	HandlerGetSysServices
+	HandlerGetSysErrors
+	HandlerGetAllBucketStats
+	HandlerGetBucketStats
+	HandlerGetSRMetrics
+	HandlerGetPeerMetrics
+	HandlerGetMetacacheListing
+	HandlerUpdateMetacacheListing
+	HandlerGetPeerBucketMetrics
+	HandlerStorageInfo
+	HandlerGetLocks
+	HandlerBackgroundHealStatus
+	HandlerGetLastDayTierStats
+	HandlerSignalService
+	HandlerGetBandwidth
```
2024-02-19 14:54:46 -08:00
Harshavardhana 607cafadbc
converge clusterRead health into cluster health (#19063) 2024-02-15 16:48:36 -08:00
Anis Eleuch 68dde2359f
log: Add logger.Event to send to console and other logger targets (#19060)
Add a new function logger.Event() to send the log to Console and
http/kafka log webhooks. This will include some internal events such as
disk healing and rebalance/decommissioning
2024-02-15 15:13:30 -08:00
Praveen raj Mani ac8e9ce04f
Send a bucket notification event on DeleteObject() for non-existing object (#19037)
Send a bucket notification event on DeleteObject for non-existing objects
2024-02-13 07:34:17 -08:00
Taran Pelkey 4d94609c44
FIx unexpected behavior when creating service account (#19036) 2024-02-13 02:31:43 -08:00
Harshavardhana afd19de5a9
fix: allow configuring excess versions alerting (#19028)
Bonus: enable audit alerts for object versions
beyond the configured value, default is '100'
versions per object beyond which scanner will
alert for each such objects.
2024-02-11 23:41:53 -08:00
Harshavardhana 997ba3a574
introduce reader deadlines for net.Conn (#19023)
Bonus: set "retry-after" header for AWS SDKs if possible to honor them.
2024-02-09 13:25:16 -08:00
Klaus Post 8e68ff9321
Add extra disconnect safety (#19022)
Fix reported races that are actually synchronized by network calls.

But this should add some extra safety for untimely disconnects.

Race reported:

```
WARNING: DATA RACE
Read at 0x00c00171c9c0 by goroutine 214:
  github.com/minio/minio/internal/grid.(*muxClient).addResponse()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:519 +0x111
  github.com/minio/minio/internal/grid.(*muxClient).error()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:470 +0x21d
  github.com/minio/minio/internal/grid.(*Connection).handleDisconnectClientMux()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1391 +0x15b
  github.com/minio/minio/internal/grid.(*Connection).handleMsg()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:1190 +0x1ab
  github.com/minio/minio/internal/grid.(*Connection).handleMessages.func1()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:981 +0x610

Previous write at 0x00c00171c9c0 by goroutine 1081:
  github.com/minio/minio/internal/grid.(*muxClient).roundtrip()
      e:/gopath/src/github.com/minio/minio/internal/grid/muxclient.go:94 +0x324
  github.com/minio/minio/internal/grid.(*muxClient).traceRoundtrip()
      e:/gopath/src/github.com/minio/minio/internal/grid/trace.go:74 +0x10e4
  github.com/minio/minio/internal/grid.(*Subroute).Request()
      e:/gopath/src/github.com/minio/minio/internal/grid/connection.go:366 +0x230
  github.com/minio/minio/internal/grid.(*SingleHandler[go.shape.*github.com/minio/minio/cmd.DiskInfoOptions,go.shape.*github.com/minio/minio/cmd.DiskInfo]).Call()
      e:/gopath/src/github.com/minio/minio/internal/grid/handlers.go:554 +0x3fd
  github.com/minio/minio/cmd.(*storageRESTClient).DiskInfo()
      e:/gopath/src/github.com/minio/minio/cmd/storage-rest-client.go:314 +0x270
  github.com/minio/minio/cmd.erasureObjects.getOnlineDisksWithHealingAndInfo.func1()
      e:/gopath/src/github.com/minio/minio/cmd/erasure.go:293 +0x171
```

This read will always happen after the write, since there is a network call in between.

However a disconnect could come in while we are setting up the call, so we protect against that with extra checks.
2024-02-09 08:43:38 -08:00
Harshavardhana 035a3ea4ae
optimize startup sequence performance (#19009)
- bucket metadata does not need to look for legacy things
  anymore if b.Created is non-zero

- stagger bucket metadata loads across lots of nodes to
  avoid the current thundering herd problem.

- Remove deadlines for RenameData, RenameFile - these
  calls should not ever be timed out and should wait
  until completion or wait for client timeout. Do not
  choose timeouts for applications during the WRITE phase.

- increase R/W buffer size, increase maxMergeMessages to 30
2024-02-08 11:21:21 -08:00
Klaus Post 7ec43bd177
Fix blocked streams blocking reconnects (#19017)
We have observed cases where a blocked stream will block for cancellations.

This happens when response channel is blocked and we want to push an error.
This will have the response mutex locked, which will prevent all other operations until upstream is unblocked.

Make this behavior non-blocking and if blocked spawn a goroutine that will send the response and close the output.

Still a lot of "dancing". Added a test for this and reviewed.
2024-02-08 10:15:27 -08:00
Shubhendu 980fb5e2ab
Enable expired-object-all-versions (#18954)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-02-06 13:36:22 -08:00
Klaus Post 9bcc46d93d
Fix second muxclient context leak (#18987)
Subrouted requests were also leaking contexts in mux clients.

Similar to #18956
2024-02-06 13:35:16 -08:00
Klaus Post 22687c1f50
Add websocket TCP write timeouts (#18988)
Add 3 second write timeout to writes.

This will make dead TCP connections terminate in a reasonable time.

Fixes writes blocking for reconnection.
2024-02-06 13:34:46 -08:00
Klaus Post ebc6c9b498
Fix tracing send on closed channel (#18982)
Depending on when the context cancelation is picked up the handler may return and close the channel before `SubscribeJSON` returns, causing:

```
Feb 05 17:12:00 s3-us-node11 minio[3973657]: panic: send on closed channel
Feb 05 17:12:00 s3-us-node11 minio[3973657]: goroutine 378007076 [running]:
Feb 05 17:12:00 s3-us-node11 minio[3973657]: github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON.func1()
Feb 05 17:12:00 s3-us-node11 minio[3973657]:         github.com/minio/minio/internal/pubsub/pubsub.go:139 +0x12d
Feb 05 17:12:00 s3-us-node11 minio[3973657]: created by github.com/minio/minio/internal/pubsub.(*PubSub[...]).SubscribeJSON in goroutine 378010884
Feb 05 17:12:00 s3-us-node11 minio[3973657]:         github.com/minio/minio/internal/pubsub/pubsub.go:124 +0x352
```

Wait explicitly for the goroutine to exit.

Bonus: Listen for doneCh when sending to not risk getting blocked there is channel isn't being emptied.
2024-02-06 08:57:30 -08:00
Harshavardhana 100c35c281
avoid excessive logs when peer is down (#18969) 2024-02-04 23:25:42 -08:00
Harshavardhana 960d604013
disconnected returns, an unexpected error to List() returning 500s (#18959)
provide the error string appropriately so that the
matching of error types works.

Also add a string based fallback for the said error.
2024-02-03 01:04:33 -08:00
Klaus Post 63bf5f42a1
Fix mux client memory leak (#18956)
Add missing client cancellation, resulting in memory buildup tracing back to context.WithCancelCause/context.WithCancelDeadlineCause
2024-02-02 15:31:06 -08:00
Harshavardhana ff80cfd83d
move Make,Delete,Head,Heal bucket calls to websockets (#18951) 2024-02-02 14:54:54 -08:00
Harshavardhana 99fde2ba85
deprecate disk tokens, instead rely on deadlines and active monitoring (#18947)
disk tokens usage is not necessary anymore with the implementation
of deadlines for storage calls and active monitoring of the drive
for I/O timeouts.

Functionality kicking off a bad drive is still supported, it's just that 
we do not have to serialize I/O in the manner tokens would do.
2024-02-02 10:10:54 -08:00
Klaus Post ce0cb913bc
Fix ineffective recycling (#18952)
Recycle would always be called on the dummy value `any(newRT())` instead of the actual value given to the recycle function.

Caught by race tests, but mostly harmless, except for reduced perf.

Other minor cleanups. Introduced in #18940 (unreleased)
2024-02-02 08:48:12 -08:00
Harshavardhana d99d16e8c3
simplify deadlineWriter, re-use WithDeadline (#18948) 2024-02-02 03:02:31 -08:00
Anis Eleuch 6fd63e920a
log: Use error log type instead of Application/MinIO type (#18930)
* log: Use error log type instead of Application/MinIO type

Also bump github.com/shirou/gopsutil version to address cross
compilation issues.

* Apply suggestions from code review

Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>

---------

Co-authored-by: Anis Eleuch <anis@min.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>
2024-02-01 16:13:57 -08:00
Klaus Post b192bc348c
Improve object reuse for grid messages (#18940)
Allow internal types to support a `Recycler` interface, which will allow for sharing of common types across handlers.

This means that all `grid.MSS` (and similar) objects are shared across in a common pool instead of a per-handler pool.

Add internal request reuse of internal types. Add for safe (pointerless) types explicitly.

Only log params for internal types. Doing Sprint(obj) is just a bit too messy.
2024-02-01 12:41:20 -08:00
Harshavardhana 6440d0fbf3
move a collection of peer APIs to websockets (#18936) 2024-02-01 10:47:20 -08:00
Frank Wessels 4cd777a5e0
Correct small typo in pubsub (#18923) 2024-01-31 01:01:53 -08:00
Klaus Post 6da4a9c7bb
Improve tracing & notification scalability (#18903)
* Perform JSON encoding on remote machines and only forward byte slices.
* Migrate tracing & notification to WebSockets.
2024-01-30 12:49:02 -08:00
Anis Eleuch a669946357
Add cgroup v2 support for memory limit (#18905) 2024-01-30 11:13:27 -08:00
Harshavardhana 2ddf2ca934
allow configuring maximum idle connections per host (#18908) 2024-01-29 16:50:37 -08:00
Harshavardhana 9987ff570b avoid calling close for nil inbound/outblock channels 2024-01-28 19:56:32 -08:00
Harshavardhana 9ef132c33b remove excessive logging due to runtime.debugStack 2024-01-28 18:10:42 -08:00
Harshavardhana 7743d952dc
fix: incomingBytes() to update via handleMessages() (#18891)
previous change #18880 was incomplete
2024-01-28 14:35:53 -08:00
Harshavardhana 1d3bd02089
avoid close 'nil' panics if any (#18890)
brings a generic implementation that
prints a stack trace for 'nil' channel
closes(), if not safely closes it.
2024-01-28 10:04:17 -08:00
Klaus Post 38de8e6936
grid: Simpler reconnect logic (#18889)
Do not rely on `connChange` to do reconnects.

Instead, you can block while the connection is running and reconnect 
when handleMessages returns.

Add fully async monitoring instead of monitoring on the main goroutine 
and keep this to avoid full network lockup.
2024-01-28 08:46:15 -08:00
Harshavardhana c51f9ef940
fix: regression in internode bytes counting (#18880)
wire up missing metrics since #18461

Bonus: fix trace output inconsistency
2024-01-27 00:25:49 -08:00
Harshavardhana 74851834c0
further bootstrap/startup optimization for reading 'format.json' (#18868)
- Move RenameFile to websockets
- Move ReadAll that is primarily is used
  for reading 'format.json' to to websockets
- Optimize DiskInfo calls, and provide a way
  to make a NoOp DiskInfo call.
2024-01-25 12:45:46 -08:00
Harshavardhana e377bb949a
migrate bootstrap logic directly to websockets (#18855)
improve performance for startup sequences by 2x for 300+ nodes.
2024-01-24 13:36:44 -08:00
Praveen raj Mani c905d3fe21
fix: Re-use TCP connections for Kafka dials (#18860)
Fixes #18857
2024-01-24 13:10:52 -08:00
Klaus Post 6968f7237a
Add separate grid reconnection mutex (#18862)
Add separate reconnection mutex

Give more safety around reconnects and make sure a state change isn't missed.

Tested with several runs of `λ go test -race -v -count=500`

Adds separate mutex and doesn't mix in the testing mutex.
2024-01-24 11:49:39 -08:00
Klaus Post 4a6c97463f
Fix all racy use of NewDeadlineWorker (#18861)
AlmosAll uses of NewDeadlineWorker, which relied on secondary values, were used in a racy fashion,
which could lead to inconsistent errors/data being returned. It also propagates the deadline downstream.

Rewrite all these to use a generic WithDeadline caller that can return an error alongside a value.

Remove the stateful aspect of DeadlineWorker - it was racy if used - but it wasn't AFAICT.

Fixes races like:

```
WARNING: DATA RACE
Read at 0x00c130b29d10 by goroutine 470237:
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).ReadVersion()
      github.com/minio/minio/cmd/xl-storage-disk-id-check.go:702 +0x611
  github.com/minio/minio/cmd.readFileInfo()
      github.com/minio/minio/cmd/erasure-metadata-utils.go:160 +0x122
  github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.1()
      github.com/minio/minio/cmd/erasure-object.go:809 +0x27a
  github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.2()
      github.com/minio/minio/cmd/erasure-object.go:828 +0x61

Previous write at 0x00c130b29d10 by goroutine 470298:
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).ReadVersion.func1()
      github.com/minio/minio/cmd/xl-storage-disk-id-check.go:698 +0x244
  github.com/minio/minio/internal/ioutil.(*DeadlineWorker).Run.func1()
      github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33

WARNING: DATA RACE
Write at 0x00c0ba6e6c00 by goroutine 94507:
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).StatVol.func1()
      github.com/minio/minio/cmd/xl-storage-disk-id-check.go:419 +0x104
  github.com/minio/minio/internal/ioutil.(*DeadlineWorker).Run.func1()
      github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33

Previous read at 0x00c0ba6e6c00 by goroutine 94463:
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).StatVol()
      github.com/minio/minio/cmd/xl-storage-disk-id-check.go:422 +0x47e
  github.com/minio/minio/cmd.getBucketInfoLocal.func1()
      github.com/minio/minio/cmd/peer-s3-server.go:275 +0x122
  github.com/minio/pkg/v2/sync/errgroup.(*Group).Go.func1()
```

Probably back from #17701
2024-01-24 10:08:31 -08:00
Klaus Post feeeef71f1
Add extra protection for grid reconnects (#18840)
Race checks would occasionally show race on handleMsgWg WaitGroup by debug messages (used in test only).

Use the `connMu` mutex to protect this against concurrent Wait/Add.

Fixes #18827
2024-01-22 09:39:06 -08:00
Klaus Post 83bf15a703
grid: Return rejection reason (#18834)
When rejecting incoming grid requests fill out the rejection reason and log it once.

This will give more context when startup is failing. Already logged after a retry on caller.
2024-01-19 10:35:24 -08:00
Harshavardhana dd2542e96c
add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Frank Wessels 4d2320ba8b
fix: a small typo in dsync (#18816) 2024-01-17 20:34:26 -08:00
Klaus Post 479940b7d0
Deallocate huge read buffers (#18813)
If a message buffer is excessively huge, release it back so it isn't kept around forever.
2024-01-17 11:47:42 -08:00
Poorna b2b26d9c95
support proxying of tagging requests in replication (#18649)
support proxying of tagging requests in active-active replication

Note: even if proxying is successful, PutObjectTagging/DeleteObjectTagging
will continue to report a 404 since the object is not present locally.
2024-01-12 23:51:33 -08:00
jiuker a89e0bab7d
fix: s3 sql parse error for colums as with quotes (#18765) 2024-01-09 09:19:11 -08:00
Sveinn 9b8ba97f9f
feat: add support for GetObjectAttributes API (#18732) 2024-01-05 10:43:06 -08:00
Anis Eleuch 7705605b5a
scanner: Add a config to disable short sleep between objects scan (#18734)
Add a hidden configuration under the scanner sub section to configure if
the scanner should sleep between two objects scan. The configuration has
only effect when there is no drive activity related to s3 requests or
healing.

By default, the code will keep the current behavior which is doing
sleep between objects.

To forcefully enable the full scan speed in idle mode, you can do this:

   `mc admin config set myminio scanner idle_speed=full`
2024-01-04 15:07:17 -08:00
Pedro Juarez 8f13c8c3bf
Support to store browser config settings (#18631)
* csp_policy
* hsts_seconds
* hsts_include_subdomains
* hsts_preload
* referrer_policy
2024-01-01 08:36:33 -08:00
Harshavardhana a50ea92c64
feat: introduce list_quorum="auto" to prefer quorum drives (#18084)
NOTE: This feature is not retro-active; it will not cater to previous transactions
on existing setups. 

To enable this feature, please set ` _MINIO_DRIVE_QUORUM=on` environment
variable as part of systemd service or k8s configmap. 

Once this has been enabled, you need to also set `list_quorum`. 

```
~ mc admin config set alias/ api list_quorum=auto` 
```

A new debugging tool is available to check for any missing counters.
2023-12-29 15:52:41 -08:00
Aditya Manthramurthy 496027b589
Fix precendence bug in S3Select SQL IN clauses (#18708)
Fixes a precendence issue in SQL Select where `a in b and c = 3` was parsed as `a
in (b and c = 3)`.

Fixes #18682
2023-12-22 23:19:11 -08:00
Anis Eleuch 8bd4f6568b
server-info: Avoid initializing audit/log http/kafka targets (#18703)
This can cause unnecessary ServerInfo() call delay.
2023-12-22 10:25:08 -08:00
Harshavardhana da55499db0
fix: reject clients that do not send proper payload (#18701) 2023-12-22 01:26:17 -08:00
Harshavardhana eba23bbac4
rename object_size -> block_size for cache subsystem (#18694) 2023-12-21 16:57:13 -08:00
Harshavardhana 4550535cbb
send proper IPv6 names avoid bracketing notation (#18699)
Following policies if present

```
       "Condition": {
         "IpAddress": {
            "aws:SourceIp": [
              "54.240.143.0/24",
               "2001:DB8:1234:5678::/64"
             ]
          }
        }
```

And client is making a request to MinIO via IPv6 can
potentially crash the server.

Workarounds are turn-off IPv6 and use only IPv4
2023-12-21 16:56:55 -08:00
Harshavardhana 7c948adf88
allow pre-allocating buffers to reduce frequent GCs during growth (#18686)
This PR also increases per node bpool memory from 1024 entries
to 2048 entries; along with that, it also moves the byte pool
centrally instead of being per pool.
2023-12-21 08:59:38 -08:00
Klaus Post 6c89a81af4
Fix CreateFile shared buffer corruption. (#18652)
`(*xlStorageDiskIDCheck).CreateFile` wraps the incoming reader in `xioutil.NewDeadlineReader`.

The wrapped reader is handed to `(*xlStorage).CreateFile`. This performs a Read call via `writeAllDirect`, 
which reads into an `ODirectPool` buffer.

`(*DeadlineReader).Read` spawns an async read into the buffer. If a timeout is hit while reading, 
the read operation returns to `writeAllDirect`. The operation returns an error and the buffer is reused.

However, if the async `Read` call unblocks, it will write to the now recycled buffer.

Fix: Remove the `DeadlineReader` - it is inherently unsafe. Instead, rely on the network timeouts. 
This is not a disk timeout, anyway.

Regression in https://github.com/minio/minio/pull/17745
2023-12-14 10:51:57 -08:00
Praveen raj Mani 10ca0a6936
Label the notification target metrics by their target IDs (#18633)
This patch adds the targetID to the existing notification target metrics
and deprecates the current target metrics which points to the overall
event notification subsystem
2023-12-14 09:09:26 -08:00
Harshavardhana 65f34cd823
fix: remove ODirectReader entirely since we do not need it anymore (#18619) 2023-12-09 10:17:51 -08:00
Anis Eleuch 6f97663174
yml-config: Add support of rootUser and rootPassword (#18615)
Users can define the root user and password in the yaml configuration
file; Root credentials defined in the environment variable still take
precedence
2023-12-08 12:04:54 -08:00
Poorna 6b06da76cb
add configuration to limit replication workers (#18601) 2023-12-07 16:22:00 -08:00
jiuker 6ca6788bb7
feat: add events_errors_total metric (#18610) 2023-12-07 16:21:17 -08:00
Anis Eleuch 2e23e61a45
Add support of conf file to pass arguments and options (#18592) 2023-12-07 01:33:56 -08:00
Harshavardhana 53ce92b9ca
fix: use the right channel to feed the data in (#18605)
this PR fixes a regression in batch replication
where we weren't sending any data from the Walk()
results due to incorrect channels being used.
2023-12-06 18:17:03 -08:00
Harshavardhana 73dde66dbe
stick to go1.19 go.mod (#18600) 2023-12-06 01:09:22 -08:00
Klaus Post 8fc200c0cc
Truncate long traces for internode communication (#18593)
Prevent excessively long request traces.
2023-12-05 12:16:48 -08:00
Harshavardhana fbb5e75e01
avoid run-away goroutine build-up in notification send, use channels (#18533)
use memory for async events when necessary and dequeue them as
needed, for all synchronous events customers must enable

```
MINIO_API_SYNC_EVENTS=on
```

Async events can be lost but is upto to the admin to
decide what they want, we will not create run-away number
of goroutines per event instead we will queue them properly.

Currently the max async workers is set to runtime.GOMAXPROCS(0)
which is more than sufficient in general, but it can be made
configurable in future but may not be needed.
2023-12-05 02:16:33 -08:00
Krishnan Parthasarathi a50f26b7f5
Implement batch-expiration for objects (#17946)
Based on an initial PR from -
https://github.com/minio/minio/pull/17792

But fully completes it with newer finalized YAML spec.
2023-12-02 02:51:33 -08:00
Klaus Post 69294cf98a
Disable DMA optimization on windows (#18575)
It appears that Windows can lock up when errors occur. Use regular copy here.
2023-12-01 16:13:19 -08:00
Klaus Post 860fc200b0
Local and Remote hosts swapped in grid traces (#18574)
Local and Remote hosts swapped in grid trace

A bit counter-intuitive, but simple fix.
2023-12-01 08:04:08 -08:00
Klaus Post 5f971fea6e
Fix Mux Connect Error (#18567)
`OpMuxConnectError` was not handled correctly.

Remove local checks for single request handlers so they can 
run before being registered locally.

Bonus: Only log IAM bootstrap on startup.
2023-12-01 00:18:04 -08:00
jiuker 34187e047d
feat: support elasticsearch notification endpoint compression codec (#18562) 2023-11-30 00:25:03 -08:00
Klaus Post 0bb81f2e9c
Always remove subroute when queuing message on the connection. (#18550) 2023-11-28 11:22:29 -08:00
jiuker be02333529
feat: drive sub-sys to max timeout reload (#18501) 2023-11-27 09:15:06 -08:00
Klaus Post ca488cce87
Add detailed parameter tracing + custom prefix (#18518)
* Allow per handler custom prefix.
* Add automatic parameter extraction
2023-11-26 01:32:59 -08:00
Shireesh Anjal 11dc723324
Pass SUBNET URL to console (#18503)
When minio runs with MINIO_CI_CD=on, it is expected to communicate
with the locally running SUBNET. This is happening in the case of MinIO
via call home functionality. However, the subnet-related functionality inside the
console continues to talk to the SUBNET production URL. Because of this,
the console cannot be tested with a locally running SUBNET.

Set the env variable CONSOLE_SUBNET_URL correctly in such cases. 
(The console already has code to use the value of this variable
as the subnet URL)
2023-11-24 09:59:35 -08:00
Praveen raj Mani 3369eeb920
Relax batch size limit for kafka events (#18513)
Fixes #18490
2023-11-24 09:07:38 -08:00
Harshavardhana fba883839d
feat: bring new HDD related performance enhancements (#18239)
Optionally allows customers to enable 

- Enable an external cache to catch GET/HEAD responses 
- Enable skipping disks that are slow to respond in GET/HEAD 
  when we have already achieved a quorum
2023-11-22 13:46:17 -08:00
Krishnan Parthasarathi a93214ea63
ilm: ObjectSizeLessThan and ObjectSizeGreaterThan (#18500) 2023-11-22 13:42:39 -08:00
Shubhendu 58306a9d34
Replicate Expiry ILM configs while site replication (#18130)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-21 09:48:06 -08:00
jiuker 41091d9472
fix: close http body for es action (#18491) 2023-11-20 22:22:10 -08:00
Klaus Post 51aa59a737
perf: websocket grid connectivity for all internode communication (#18461)
This PR adds a WebSocket grid feature that allows servers to communicate via 
a single two-way connection.

There are two request types:

* Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small
  roundtrips with small payloads.

* Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`,
  which allows for different combinations of full two-way streams with an initial payload.

Only a single stream is created between two machines - and there is, as such, no
server/client relation since both sides can initiate and handle requests. Which server
initiates the request is decided deterministically on the server names.

Requests are made through a mux client and server, which handles message
passing, congestion, cancelation, timeouts, etc.

If a connection is lost, all requests are canceled, and the calling server will try
to reconnect. Registered handlers can operate directly on byte 
slices or use a higher-level generics abstraction.

There is no versioning of handlers/clients, and incompatible changes should
be handled by adding new handlers.

The request path can be changed to a new one for any protocol changes.

First, all servers create a "Manager." The manager must know its address 
as well as all remote addresses. This will manage all connections.
To get a connection to any remote, ask the manager to provide it given
the remote address using.

```
func (m *Manager) Connection(host string) *Connection
```

All serverside handlers must also be registered on the manager. This will
make sure that all incoming requests are served. The number of in-flight 
requests and responses must also be given for streaming requests.

The "Connection" returned manages the mux-clients. Requests issued
to the connection will be sent to the remote.

* `func (c *Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)`
   performs a single request and returns the result. Any deadline provided on the request is
   forwarded to the server, and canceling the context will make the function return at once.

* `func (c *Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st *Stream, err error)`
   will initiate a remote call and send the initial payload.

```Go
// A Stream is a two-way stream.
// All responses *must* be read by the caller.
// If the call is canceled through the context,
//The appropriate error will be returned.
type Stream struct {
	// Responses from the remote server.
	// Channel will be closed after an error or when the remote closes.
	// All responses *must* be read by the caller until either an error is returned or the channel is closed.
	// Canceling the context will cause the context cancellation error to be returned.
	Responses <-chan Response

	// Requests sent to the server.
	// If the handler is defined with 0 incoming capacity this will be nil.
	// Channel *must* be closed to signal the end of the stream.
	// If the request context is canceled, the stream will no longer process requests.
	Requests chan<- []byte
}

type Response struct {
	Msg []byte
	Err error
}
```

There are generic versions of the server/client handlers that allow the use of type
safe implementations for data types that support msgpack marshal/unmarshal.
2023-11-20 17:09:35 -08:00
jiuker f56a182b71
fix: close http body when webhook send (#18487) 2023-11-20 14:40:07 -08:00
jiuker 215ca58d6a
fix: close the http.Body when WebhookTarget isActive (#18467) 2023-11-17 12:02:26 -08:00
Anis Eleuch 12f570a307
audit: Try to send audit even if the status is offline (#18458)
Currently, once the audit becomes offline, there is no code that tries
to reconnect to the audit, at the same time Send() quickly returns with
an error without really trying to send a message the audit endpoint; so
the audit endpoint will never be online again.

Fixing this behavior; the current downside is that we miss printing some
logs when the audit becomes offline; however this information is
available in prometheus

Later, we can refactor internal/logger so the http endpoint can send errors to
console target.
2023-11-17 10:40:28 -08:00
Adrian Najera 96c2304ae8
allow MINIO_STS_DURATION to increase the IDP token expiration (#18396)
Share link duration is based on the IDP token expiration,
for the share link to last longer, you may now use
MINIO_STS_DURATION environment variable.
2023-11-15 20:42:31 -08:00
Harshavardhana 91d8bddbd1
use sendfile/splice implementation to perform DMA (#18411)
sendfile implementation to perform DMA on all platforms

Go stdlib already supports sendfile/splice implementations
for

- Linux
- Windows
- *BSD
- Solaris

Along with this change however O_DIRECT for reads() must be
removed as well since we need to use sendfile() implementation

The main reason to add O_DIRECT for reads was to reduce the
chances of page-cache causing OOMs for MinIO, however it would
seem that avoiding buffer copies from user-space to kernel space
this issue is not a problem anymore.

There is no Go based memory allocation required, and neither
the page-cache is referenced back to MinIO. This page-
cache reference is fully owned by kernel at this point, this
essentially should solve the problem of page-cache build up.

With this now we also support SG - when NIC supports Scatter/Gather
https://en.wikipedia.org/wiki/Gather/scatter_(vector_addressing)
2023-11-10 10:10:14 -08:00
Anis Eleuch 6ef8e87492
Support case insensitive kafka SASL mechanism config values (#18398) 2023-11-08 20:04:01 -08:00
Harshavardhana 754f7a8a39
replace io.Discard usage to fix some NUMA copy() latencies (#18394)
replace io.Discard usage to fix NUMA copy() latencies

On NUMA systems copying from 8K buffer allocated via
io.Discard leads to large latency build-up for every

```
copy(new8kbuf, largebuf)
```

can in-cur upto 1ms worth of latencies on NUMA systems
due to memory sharding across NUMA nodes.
2023-11-06 14:26:08 -08:00
Adrian Najera 06f59ad631
fix: expiration time for share link when using OpenID (#18297) 2023-10-30 10:21:34 -07:00
jiuker dbc2368a7b
fix: parse the subsys env error (#18319) 2023-10-26 08:12:57 -07:00
Klaus Post 74253e1ddc
Fix BackendInfo() race (#18305)
`GetParityForSC` has a value receiver, so Config is copied before the lock is obtained.

Make it pointer receiver.

Fixes:

```
WARNING: DATA RACE
Read at 0x0000079cdd10 by goroutine 190:
  github.com/minio/minio/cmd.(*erasureServerPools).BackendInfo()
      github.com/minio/minio/cmd/erasure-server-pool.go:579 +0x6f
  github.com/minio/minio/cmd.(*erasureServerPools).LocalStorageInfo()
      github.com/minio/minio/cmd/erasure-server-pool.go:614 +0x3c6
  github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler()
      github.com/minio/minio/cmd/peer-rest-server.go:347 +0x4ea
  github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler-fm()
...

WARNING: DATA RACE
Read at 0x0000079cdd10 by goroutine 190:
  github.com/minio/minio/cmd.(*erasureServerPools).BackendInfo()
      github.com/minio/minio/cmd/erasure-server-pool.go:579 +0x6f
  github.com/minio/minio/cmd.(*erasureServerPools).LocalStorageInfo()
      github.com/minio/minio/cmd/erasure-server-pool.go:614 +0x3c6
  github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler()
      github.com/minio/minio/cmd/peer-rest-server.go:347 +0x4ea
  github.com/minio/minio/cmd.(*peerRESTServer).LocalStorageInfoHandler-fm()
```
2023-10-24 08:15:41 -07:00
Krishnan Parthasarathi 8cd80fec8c
Add unit test for lifecycle.FilterRules (#18284) 2023-10-19 21:33:28 -07:00
Klaus Post e37508fb8f
fix: linter errors in Windows specific code (#18276) 2023-10-18 11:08:15 -07:00
Krishnan Parthasarathi 557df666fd
Don't skip rules with ExpiredObjectDeleteMarker (#18256) 2023-10-16 22:46:46 -07:00
Aditya Manthramurthy 28a2d1eb3d
Allow OpenID ARN resource ID to start with a `-` (#18255) 2023-10-16 13:50:51 -07:00
Klaus Post 128256e3ab
Add event counters (#18232)
Export metric for global events sent and skipped for the lifetime of the server.
2023-10-12 15:39:22 -07:00
Shubhendu 5b9656374c
Error if target went offline (#18221)
If target went offline while MinIO was down, error once
while trying to send message. If target goes offline during
MinIO server running, it already comes through ping() call
and errors out if target offline.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-10-12 06:13:57 -07:00
Harshavardhana 6829ae5b13
completely remove drive caching layer from gateway days (#18217)
This has already been deprecated for close to a year now.
2023-10-11 21:18:17 -07:00
Praveen raj Mani c27d0583d4
Send kafka notification messages in batches when queue_dir is enabled (#18164)
Fixes #18124
2023-10-07 08:07:38 -07:00
Sveinn 603437e70f
Fix startup formatting (#18156)
Percentages in root user names are used for formatting.

Before:
```
S3-API: http://192.168.50.21:9000  http://172.31.96.1:9000  http://127.0.0.1:9000
RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb"
RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb"

Console: http://192.168.50.21:51915 http://172.31.96.1:51915 http://127.0.0.1:51915
RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb"
RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb"

Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
FORMAT: %117s MESSAGE: $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
   $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%!a(MISSING)vZb" "Q4#Q6y8G%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
```

After:

```
Status:         1 Online, 0 Offline.
S3-API: http://192.168.50.21:9000  http://172.31.96.1:9000  http://127.0.0.1:9000
RootUser: "U4B6Zi!b75DXSPm%avZb"
RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"

Console: http://192.168.50.21:52421 http://172.31.96.1:52421 http://127.0.0.1:52421
RootUser: "U4B6Zi!b75DXSPm%avZb"
RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"

Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
   $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
```

No need for special Windows case. `mc` works just fine.
2023-10-02 07:39:47 -06:00
Shubhendu 10d5dd3a67
fix: a regression with audit log sending (#18112)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-09-26 12:23:02 -07:00
Harshavardhana d9f1df01eb
return an error in CopyAligned upon premature EOF (#18110)
add a unit-test to capture this corner case
2023-09-26 11:20:06 -07:00
Anis Eleuch 4eeb48f8e0
Return cached online/offline status for audit/http loggers (#18083)
To avoid having delays in prometheus scrape and in 'mc admin info' command.
2023-09-21 16:58:24 -07:00
Harshavardhana 1472875670
fix: failed messages counting in audit_http metrics (#18075)
all retries must not be counted as failed messages,
a failed message is a single counter not for all
retries, this PR fixes this.

Also we do not need to retry 10-times, instead we should
retry at max 3 times with some jitter to deliver the
messages.
2023-09-21 11:24:56 -07:00
Harshavardhana 2add57cfed
apply healing per object at 1024 cycles (#18050)
- we already have MRF for most recent failures
- we trigger healing during HEAD/GET operation

These are enough, also change the default max wait
from 5sec to 1sec for default scanner speed.
2023-09-19 09:24:22 -07:00
jiuker 9947c01c8e
feat: SSE-KMS use uuid instead of read all data to md5. (#17958) 2023-09-18 10:00:54 -07:00
Anis Eleuch 419e5baf16
fix: webhook notify endpoint with standard ports (#18016) 2023-09-14 20:10:44 -07:00
Alex dc48cd841a
Added MINIO_PROMETHEUS_AUTH_TOKEN env support (#18028)
Signed-off-by: Benjamin Perez <benjamin@bexsoft.net>
2023-09-14 17:28:21 -07:00
Aditya Manthramurthy cbc0ef459b
Fix policy package import name (#18031)
We do not need to rename the import of minio/pkg/v2/policy as iampolicy
any more.
2023-09-14 14:50:16 -07:00
Harshavardhana 32890342ce
introduce MINIO_BROWSER_REDIRECT env to enable/disable auto-redirect (#18025) 2023-09-13 18:43:57 -07:00
Anis Eleuch 41de53996b
heal: calculate the number of workers based on NRRequests (#17945) 2023-09-11 14:48:54 -07:00
Harshavardhana 9878031cfd
fix: change DISK_ to DRIVE_ for some drive related envs (#18005) 2023-09-11 12:19:22 -07:00
Harshavardhana e3fbcaeb72
allow scanner key cycle to be empty (#18001)
configs from 2020 server throws an
error due to deprecation of the keys
however an attempt is made to parse
them, we should have chosen existing
defaults - this PR fixes that.
2023-09-09 08:53:32 -07:00
Anis Eleuch b9269151a4
fix: drive rotational calculation status for partitions (#17986)
Fix drive rotational calculation status

If a MinIO drive path is mounted to a partition and not a real disk,
getting the rotational status would fail because Linux does not expose
that status to partition; In other words,
/sys/block/drive-partition-name/queue/rotational does not exist;

To fix the issue, the code will search for the rotational status of the
disk that hosts the partition, and this can be calculated from the
real path of /sys/class/block/<drive-partition-name>
2023-09-06 12:37:57 -07:00
Harshavardhana 5b114b43f7
refactor bandwidth throttling for replication target (#17980)
This refactor is to allow using the bandwidth throttling
for other purposes.
2023-09-05 20:21:59 -07:00
Aditya Manthramurthy 1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Anis Eleuch 6a8d8f34a5
kafka: Do not require key when sending a message (#17962)
Keys are helpful to ensure the strict ordering of messages, however currently the
code uses a random request id for every log, hence using the request-id
as a Kafka key is not serve any purpose;

This commit removes the usage of the key, to also fix the audit issue from
internal subsystem that does not have a request ID.
2023-09-01 08:37:22 -07:00
Harshavardhana 0d1fbef751
fix: a possible crash in event target Close() (#17948)
these are possible crashes when the configured
target is still in init() state and never finished
- however a delete config was initiated.
2023-08-30 07:27:45 -07:00
Harshavardhana 07b1281046 add queue_dir to help message for logger/audit targets 2023-08-29 16:07:35 -07:00
Harshavardhana adb8be069e
tune-kafka targets to ensure timeout triggers on hung brokers (#17898)
hung brokers can cause slowness to the entire system
when many callers are hung, leading to large goroutine
build-up.
2023-08-22 20:26:35 -07:00
Harshavardhana 3a0125fa1f
remove unexpected logging from peer calls (#17888)
also make sure RequestID is set for system logs
2023-08-21 14:25:24 -07:00
Daniel Valdivia 328cb0a076
Pass environment variable to control session length to console (#17885)
Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>
2023-08-21 11:55:43 -07:00
jiuker fa2a8d7209
fix: drain the req.body into io.Discard correctly (#17881) 2023-08-21 01:09:07 -07:00
Andreas Auernhammer 8f8f8854f0
update `minio/kes-go` dep to v0.2.0 (#17850)
This commit updates the minio/kes-go dependency
to v0.2.0 and updates the existing code to work
with the new KES APIs.

The `SetPolicy` handler got removed since it
may not get implemented by KES at all and could
not have been used in the past since stateless KES
is read-only w.r.t. policies and identities.

Signed-off-by: Andreas Auernhammer <hi@aead.dev>
2023-08-19 07:37:53 -07:00
Harshavardhana bc7c0d8624
if object is a delete marker it must skip tags filter in ILM (#17861) 2023-08-18 09:36:23 -07:00
Harshavardhana 11dfc817f3
do not log client canceled events (#17838) 2023-08-17 14:53:43 -07:00
Harshavardhana 8a9b886011
update grafana dashboard with disk -> drive rename (#17857) 2023-08-15 16:04:20 -07:00
Klaus Post 406ea4f281
Fix distributed listing not able to resume (#17855)
Two fields in lifecycles made GOB encoding consistently fail with `gob: type lifecycle.Prefix has no exported fields`.

This meant that in distributed systems listings would never be able to continue and would restart on every call.

Fix issues and be sure to log these errors at least once per bucket. We may see some connectivity errors here, but we shouldn't hide them.
2023-08-15 07:45:25 -07:00
Harshavardhana c45bc32d98
skip disks under scanning when healing disks (#17822)
Bonus:

- avoid calling DiskInfo() calls when missing blocks
  instead heal the object using MRF operation.

- change the max_sleep to 250ms beyond that we will
  not stop healing.
2023-08-09 12:51:47 -07:00
Praveen raj Mani 0285df5a02
fix: prioritize audit_webhook and logger_webhook ENVs over the config KVS (#17783) 2023-08-03 02:47:07 -07:00
Harshavardhana 81be718674
fix: optimize DiskInfo() call avoid metrics when not needed (#17763) 2023-07-31 15:20:48 -07:00
Harshavardhana 73edd5b8fd
introduce 'mc admin config set alias/ api odirect=on' (#17753)
change disable_odirect=off -> odirect=on to make it
easier to understand, instead of making it double
negative.
2023-07-31 00:12:53 -07:00
Harshavardhana f13cfcb83e
allow disabling O_DIRECT for write ops (#17751)
on really slow systems, O_DIRECT simply kills the drives
allow for a way to disable them.
2023-07-29 15:17:56 -07:00
Anis Eleuch 9c0e8cd15b
logger: Avoid slow calls in http logger Send() function (#17747)
Send() is synchronous and can affect the latency of S3 requests when the
logger buffer is full.

Avoid checking if the HTTP target is online or not and increase the
workers anyway since the buffer is already full.

Also, avoid logs flooding when the audit target is down.
2023-07-29 12:49:18 -07:00
Harshavardhana 731e03fe5a
add ReadFileStream deadline for disk call (#17745)
timeout the reader side if hung via disk max timeout
2023-07-28 15:37:53 -07:00
jiuker f9d029c8fa
fix: prevCertificate save not correct in loop (#17744)
fix certs save not correct

Co-authored-by: guozhi.li <guozhi.li@daocloud.io>
2023-07-28 12:57:49 -07:00
Harshavardhana 114fab4c70
export cluster health as prometheus metrics (#17741) 2023-07-28 01:16:53 -07:00
Harshavardhana e7b60c4d65
Add slow drive timeouts to match with active disk monitoring (#17701)
allow active disk-monitoring to be configurable, and use
these add deadlines in various call layers for various
syscalls.
2023-07-25 16:58:31 -07:00
Harshavardhana 14e1ace552
remove serializing WalkDir() across all buckets/prefixes on SSDs (#17707)
slower drives get knocked off because they are too slow via 
active monitoring, we do not need to block calls arbitrarily.

Serializing adds latencies for already slow calls, remove
it for SSDs/NVMEs

Also, add a selection with context when writing to `out <-`
channel, to avoid any potential blocks.
2023-07-24 09:30:19 -07:00
Harshavardhana e12ab486a2
avoid using os.Getenv for internal code, use env.Get() instead (#17688) 2023-07-20 07:52:49 -07:00
drivebyer 9b5c2c386a
fix: return error when requested interface has no stats available (#17666) 2023-07-17 01:14:01 -07:00
jiuker d118031ed6
fix: when Origin: null is set return back '*' for allow origins (#17651) 2023-07-15 12:15:06 -07:00
Shireesh Anjal bb63375f1b
Do not consider subnet api key as secret (#17643)
As it is required by mc and console to communicate with subnet
2023-07-13 12:24:47 -07:00
jiuker 183428db03
fear: Implement 'mc support top net' (#17598) 2023-07-13 11:41:19 -07:00
Shubhendu 9b9871cfbb
Added `endpoint` and `versions` attributes to KMS details (#17350)
Now it would list details of all KMS instances with additional
attributes `endpoint` and `version`. In the case of k8s-based
deployment the list would consist of a single entry.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-07-12 23:50:38 -07:00
Harshavardhana 2d1cda2061
fix: do not os.Exit(1) while writing goroutines during shutdown (#17640)
Also shutdown poll add jitter, to verify if the shutdown
sequence can finish before 500ms, this reduces the overall
time taken during "restart" of the service.

Provides speedup for `mc admin service restart` during
active I/O, also ensures that systemd doesn't treat the
returned 'error' as a failure, certain configurations in
systemd can cause it to 'auto-restart' the process by-itself
which can interfere with `mc admin service restart`.

It can be observed how now restarting the service is
much snappier.
2023-07-12 07:18:30 -07:00
Poorna fb49aead9b
replication: add validation API (#17520)
To check if replication is set up properly on a bucket.
2023-07-10 20:09:20 -07:00
Harshavardhana f6186965c3
honor DeleteAllVersions in list(), head() calls (#17604) 2023-07-08 15:42:10 -07:00
Harshavardhana 28a01f0320
update missing license header in files (#17603) 2023-07-08 10:42:05 -07:00
Klaus Post 45a717a142
Avoid per request URL parsing (#17593)
Every request does a `url.Parse(c.url.String())` to clone a URL.

The host will also be static, so we rewrite that on creation.
2023-07-07 22:07:30 -07:00
Harshavardhana 0bc34952eb
fix: under FanOut API avoid repeated md5sum calculation (#17572)
md5sum calculation has a high CPU overhead, avoid calculating
it repeatedly for similar fanOut calls.

To fix following CPU profiler result
```
(pprof) top10
Showing nodes accounting for 678.68s, 84.67% of 801.54s total
Dropped 1072 nodes (cum <= 4.01s)
Showing top 10 nodes out of 156
      flat  flat%   sum%        cum   cum%
   332.54s 41.49% 41.49%    332.54s 41.49%  runtime/internal/syscall.Syscall6
   228.39s 28.49% 69.98%    228.39s 28.49%  crypto/md5.block
    48.07s  6.00% 75.98%     48.07s  6.00%  runtime.memmove
    28.91s  3.61% 79.59%     28.91s  3.61%  github.com/minio/highwayhash.updateAVX2
     8.25s  1.03% 80.61%      8.25s  1.03%  runtime.futex
     8.25s  1.03% 81.64%     10.81s  1.35%  runtime.step
     6.99s  0.87% 82.52%     22.35s  2.79%  runtime.pcvalue
     6.67s  0.83% 83.35%     38.90s  4.85%  runtime.mallocgc
     5.77s  0.72% 84.07%     32.61s  4.07%  runtime.gentraceback
     4.84s   0.6% 84.67%     10.49s  1.31%  runtime.lock2
```
2023-07-05 03:16:05 -07:00
Harshavardhana e37c4efc6e
fix: upon DNS refresh() failure use previous values (#17561)
DNS refresh() in-case of MinIO can safely re-use
the previous values on bare-metal setups, since
bare-metal arrangements do not change DNS in any 
manner commonly.

This PR simplifies that, we only ever need DNS caching
on bare-metal setups.

- On containerized setups do not enable DNS
  caching at all, as it may have adverse effects on
  the overall effectiveness of k8s DNS systems.

  k8s DNS systems are dynamic and expect applications
  to avoid managing DNS caching themselves, instead
  provide a cleaner container native caching
  implementations that must be used.

- update IsDocker() detection, including podman runtime

- move to minio/dnscache fork for a simpler package
2023-07-03 12:30:51 -07:00
Harshavardhana 22f5bc643c
fix: honor older scanner settings only if newer has not changed (#17564) 2023-07-03 12:28:36 -07:00
Harshavardhana 7f782983ca
fix: for FTP server driver allow implicit trust of TLS (#17541)
fixes #17535
2023-06-30 08:04:13 -07:00
Aditya Manthramurthy bde533a9c7
fix: OpenID config initialization (#17544)
This is due to a regression in the handling of the enable key in OpenID
configuration.
2023-06-29 23:38:26 -07:00
Harshavardhana aae6846413
feat: allow expiration of all versions via ILM Expiration action (#17521)
Following extension allows users to specify immediate purge of
all versions as soon as the latest version of this object has
expired.

```
<LifecycleConfiguration>
    <Rule>
        <ID>ClassADocRule</ID>
        <Filter>
           <Prefix>classA/</Prefix>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
             <Days>3650</Days>
	     <ExpiredObjectAllVersions>true</ExpiredObjectAllVersions>
        </Expiration>
    </Rule>
    ...
```
2023-06-28 22:12:28 -07:00
guangwu 87b6fb37d6
chore: pkg imported more than once (#17444) 2023-06-26 09:21:29 -07:00
Aditya Manthramurthy f3248a4b37
Redact all secrets from config viewing APIs (#17380)
This change adds a `Secret` property to `HelpKV` to identify secrets
like passwords and auth tokens that should not be revealed by the server
in its configuration fetching APIs. Configuration reporting APIs now do
not return secrets.
2023-06-23 07:45:27 -07:00
Harshavardhana 74759b05a5
make sure to set relevant config entries correctly (#17485)
Bonus: also allow skipping keys properly.
2023-06-22 10:04:02 -07:00
Praveen raj Mani 7c72b25ef0
Add an option to make bucket notifications synchronous (#17406)
With the current asynchronous behaviour in sending notification events
to the targets, we can't provide guaranteed delivery as the systems
might go for restarts.

For such event-driven use-cases, we can provide an option to enable
synchronous events where the APIs wait until the event is successfully
sent or persisted.

This commit adds 'MINIO_API_SYNC_EVENTS' env which when set to 'on'
will enable sending/persisting events to targets synchronously.
2023-06-20 17:38:59 -07:00
Aditya Manthramurthy 5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Harshavardhana 22b7c8cd8a
upgrade pkg and dperf to latest packages (#17448)
- dperf improvements in benchmarking read and write tests
- upgrade mimedb to use latest content-types
2023-06-17 07:31:36 -07:00
jiuker 0474791cf8
fix: set time format right (#17402) 2023-06-14 07:49:13 -07:00
Harshavardhana f32efd5429
more compliance related fixes (#17408)
- lifecycle must return InvalidArgument for rule errors
- do not return `null` versionId in HTTP header
- reject mixed SSE uploads with correct error message
2023-06-13 13:52:33 -07:00
Anis Eleuch 0f0dcf0c5e
tar: Avoid storing snowball extraction header in extract objects (#17389) 2023-06-12 09:42:06 -07:00
Anis Eleuch bb24346e04
listen: Only error out if not able to bind any interface (#17353) 2023-06-12 09:09:28 -07:00