Commit Graph

845 Commits

Author SHA1 Message Date
Klaus Post b5177993b3
Make DeadlineConn http.Listener compatible (#20635)
HTTP likes to slap an infinite read deadline on a connection and 
do a blocking read while the response is being written.

This effectively means that a reading deadline becomes the 
request-response deadline.

Instead of enforcing our timeout, we pass it through and keep the
"infinite deadline" sticky on connections.

However, we still "record" when reads are aborted, so we never overwrite that.

The HTTP server should have `ReadTimeout` and `IdleTimeout` set for the deadline to be effective.

Use --idle-timeout for incoming connections.
2024-11-12 12:41:41 -08:00
Klaus Post 55f5c18fd9
Harden internode DeadlineConn (#20631)
Since DeadlineConn would send deadline updates directly upstream,
it would race with Read/Write operations. The stdlib will perform a read, 
but do an async SetReadDeadline(unix(1)) to cancel the Read in 
`abortPendingRead`. In this case, the Read may override the 
deadline intended to cancel the read.

Stop updating deadlines if a deadline in the past is seen and when Close is called. 
A mutex now protects all upstream deadline calls to avoid races. 

This should fix the short-term buildup of...

```
365 @ 0x44112e 0x4756b9 0x475699 0x483525 0x732286 0x737407 0x73816b 0x479601
#	0x475698	sync.runtime_notifyListWait+0x138		runtime/sema.go:569
#	0x483524	sync.(*Cond).Wait+0x84				sync/cond.go:70
#	0x732285	net/http.(*connReader).abortPendingRead+0xa5	net/http/server.go:729
#	0x737406	net/http.(*response).finishRequest+0x86		net/http/server.go:1676
#	0x73816a	net/http.(*conn).serve+0x62a			net/http/server.go:2050
```

AFAICT Only affects internode calls that create a connection (non-grid).
2024-11-11 09:15:17 -08:00
Klaus Post 4972735507
Fix lint issues from v1.62.0 upgrade (#20633)
* Fix lint issues from v1.62.0 upgrade

* Fix xlMetaV2TrimData version checks.
2024-11-11 06:51:43 -08:00
Harshavardhana cefc43e4da simplify the Get()/GetMultiple() re-use GetRaw() for both (#179)
Remember GetMultiple() must be used if your target is calling
PutMultiple(), without that the multiple events will not be
replayed.
2024-11-06 16:52:20 -08:00
Ramon de Klein 25e34fda5f
decompress audit log properly before sending to remote target (#20619) 2024-11-06 13:25:24 -08:00
Klaus Post 8d42f37e4b
Fix msgUnPath crash (#20614)
These are needed checks for the functions to be un-crashable with any input 
given to `msgUnPath` (tested with fuzzing).

Either condition would otherwise result in a crash; these checks prevent that. Some 
additional upstream checks are needed.

Fixes #20610
2024-11-05 04:37:59 -08:00
Harshavardhana 1615920f48 fix typos reported in CI/CD 2024-11-04 11:06:02 -08:00
Harshavardhana 7f1e1713ab
use absolute path for binary checksum verification (#20487) 2024-09-26 08:03:08 -07:00
Anis Eleuch 2b0156b1fc Add TTFB to all APIs and enable for responses without body (#20479)
Add TTFB for all requests in metrics-v3 in addition to the existing
GetObject. Also, for requests that do not return a body in the
response, calculate TTFB as the time at which the HTTP status code and
the headers are sent.
2024-09-24 10:13:00 -07:00
Klaus Post 974cbb3bb7
Limit jstream parse depth (#20474)
Add https://github.com/bcicen/jstream/pull/15 by vendoring the package.

Sets JSON depth limit to 100 entries in S3 Select.
2024-09-23 12:35:41 -07:00
Harshavardhana 03e996320e
upgrade deps pkg/v3, madmin-go/v3 and lz4/v4 (#20467) 2024-09-21 17:33:43 -07:00
Ramon de Klein 3d152015eb
Use MinIO console v1.7.1 (#20465) 2024-09-20 18:18:54 -07:00
jiuker ade8925155
fix: add default http timeout for audit webhook (#20460) 2024-09-20 09:02:50 -07:00
Klaus Post 05a6c170bf
Fix PutObject Trailing checksum (#20456)
PutObject would verify trailing checksums, but not store them.

Fixes #20455
2024-09-19 05:59:07 -07:00
Shubhendu 5bd27346ac
Added iam import tests for openid (#20432)
Tests if imported service accounts have 
required access to buckets and objects.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>

Co-authored-by: Harshavardhana <harsha@minio.io>
2024-09-17 09:45:46 -07:00
Klaus Post 8a30967542
Limit S3 Select JSON documents to 10MB (#20439)
Closes #20430

Limit allocations from badly formed documents.
2024-09-16 09:59:03 -07:00
Harshavardhana c28a4beeb7
multipart support etag and pre-read small objects (#20423) 2024-09-12 05:24:04 -07:00
Sveinn 3bae73fb42
Add http_timeout to audit webhook configurations (#20421) 2024-09-11 15:20:42 -07:00
Harshavardhana 8c9ab85cfa
Add multipart uploads cache for ListMultipartUploads() (#20407)
this cache will be honored only when `prefix=""` while
performing ListMultipartUploads() operation.

This is mainly to satisfy applications like alluxio
for their underfs implementation and tests.

replaces https://github.com/minio/minio/pull/20181
2024-09-09 09:58:30 -07:00
Klaus Post b1c849bedc
Don't send a canceled context to Unlock (#20409)
AFAICT we send a canceled context to unlock (and thereby releaseAll). This will cause network calls to fail.

Instead use background and add 30s timeout.
2024-09-09 08:49:49 -07:00
Harshavardhana fb24bcfee0
fix: set audit/logger webhook retry interval to maximum 1m (#20404) 2024-09-09 02:36:47 -07:00
Harshavardhana 8268c12cfb
Add support for audit/logger max retry and retry interval (#20402)
Current implementation retries forever until our
log buffer is full, and we start dropping events.

This PR allows you to set a limit after which we give
up on existing audit/logger batches and proceed to
process the new ones.

Bonus:
 - do not blow up buffers beyond batchSize value
 - do not leak the ticker if the worker returns
2024-09-08 05:15:09 -07:00
Sveinn 3f39da48ea
fix: retries and failed message counter (#20401) 2024-09-07 17:13:57 -07:00
Klaus Post 9d5cdaa2e3
Limit Response Recorder memory (#20399)
Disable body recording for...

* admin inspect
* admin metrics
* profiling download

Also, if the recorded body is > 10MB, drop it.
2024-09-07 12:16:04 -07:00
Praveen raj Mani 261111e728
Kafka notify: support batched commits for queue store (#20377)
The items will be saved per target batch and will
be committed to the queue store when the batch is full

Also, periodically commit the batched items to the queue store
based on configured commit_timeout; default is 30s;

Bonus: compress queue store multi writes
2024-09-06 16:06:30 -07:00
Harshavardhana 0f1e8db4c5
all 2xx status codes to be success for audit (#20394) 2024-09-06 15:53:34 -07:00
jiuker 241be9709c
fix: jwt error overwritten by nil public key (#20387) 2024-09-05 19:46:36 -07:00
Anis Eleuch 9b79eec29e
site-repl: Fix ILM document replication in some cases (#20380)
S3 spec does not accept an ILM XML document containing both <Filter>
and <Prefix> XML tags, even if both are empty. That is why we added
a 'set' field in some lifecycle structures to decide when and when not to
show a tag. However, we forgot to disallow marshaling of Filter when
'set' is set to false.

This will fix ILM document replication in a site replication
configuration in some cases.
2024-09-04 10:01:26 -07:00
Harshavardhana c2e318dd40
remove mincache EOS related feature from upstream (#20375) 2024-09-03 11:23:41 -07:00
Harshavardhana 504e52b45e
protect bpool from buffer pollution by invalid buffers (#20342) 2024-08-28 18:40:52 -07:00
Harshavardhana c65e67c357
add more details on the payload sent to webhook audit (#20335) 2024-08-28 08:31:56 -07:00
Mark Theunissen 9511056f44
fix: simplify error logged when logger target is unreachable (#20304) 2024-08-22 02:43:48 -07:00
Aditya Manthramurthy 8a11282522
[fix] S3Select: Add some missing input validation (#20278)
Prevents server panic when some CSV parameters are empty.
2024-08-20 11:31:45 -07:00
Mark Theunissen 6378ca10a4
kms.ListKeys returns CreatedBy/CreatedAt when information is available (#20223) 2024-08-17 23:43:03 -07:00
Harshavardhana a5702f978e
remove requests deadline, instead just reject the requests (#20272)
Additionally set

 - x-ratelimit-limit
 - x-ratelimit-remaining

To indicate the request rates.
2024-08-16 01:43:49 -07:00
Klaus Post f1302c40fe
Fix uninitialized replication stats (#20260)
Services are unfrozen before `initBackgroundReplication` is finished. This means that 
the globalReplicationStats write is racy. Switch to an atomic pointer.

Provide the `ReplicationPool` with the stats, so it doesn't have to be grabbed 
from the atomic pointer on every use.

All other loads are nil-checked, and calls return empty values when stats 
still haven't been initialized.
2024-08-15 05:04:40 -07:00
Harshavardhana 3b1aa40372
support relative paths for KMS_SECRET_KEY_FILE (#20264)
fixes #20251
2024-08-15 04:46:39 -07:00
Sveinn 743ddb196a
Removing the audit log retry mechanism (#20259) 2024-08-14 15:25:08 -07:00
Klaus Post 3ffeabdfcb
Fix govet+staticcheck issues (#20263)
This is better: https://github.com/golang/go/issues/60529
2024-08-14 10:11:51 -07:00
Harshavardhana e7a56f35b9
flatten out audit tags, do not send as free-form (#20256)
move away from map[string]interface{} to map[string]string
to simplify the audit, and also provide concise information.

avoids large allocations under load(), reduces the amount
of audit information generated, as the current implementation
was a bit free-form. instead all datastructures must be
flattened.
2024-08-13 15:22:04 -07:00
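Flattening free-form tags into `map[string]string`, as #20256 describes, might look like the sketch below. The `flatten` helper and the dot-joined key scheme are assumptions made for illustration.

```go
package main

import "fmt"

// flatten copies in into out, turning nested maps into dot-joined keys
// and rendering every leaf value as a string.
func flatten(prefix string, in map[string]interface{}, out map[string]string) {
	for k, v := range in {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if m, ok := v.(map[string]interface{}); ok {
			flatten(key, m, out)
			continue
		}
		out[key] = fmt.Sprint(v)
	}
}

func main() {
	tags := map[string]interface{}{
		"objectSize": 1024,
		"replication": map[string]interface{}{
			"status": "PENDING",
		},
	}
	flat := map[string]string{}
	flatten("", tags, flat)
	fmt.Println(flat["objectSize"], flat["replication.status"])
}
```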
Harshavardhana acdb355070
update deps and update azure WARM tier implementation (#20247) 2024-08-13 11:21:34 -07:00
Klaus Post d8f0e0ea6e
Simplify error logging on event send (#20246)
Overly verbose, hard to read and can leak data.

Print event as JSON and simplify target & error printing.
2024-08-12 08:55:28 -07:00
Harshavardhana 2e0fd2cba9
implement a safer completeMultipart implementation (#20227)
- optimize writing part.N.meta by writing both part.N
  and its meta in sequence without network component.

- remove part.N.meta, part.N which were partially successful,
  in quorum loss situations during renamePart()

- allow for strict read quorum check arbitrated via ETag
  for the given part number, this makes it doubly safe
  upon final commit.

- return an appropriate error when read quorum is missing,
  instead of returning InvalidPart{}, which is a non-retryable
  error. This kind of situation can happen when many
  nodes are going offline in rotation, an example of such
  a restart() behavior is statefulset updates in k8s.

fixes #20091
2024-08-12 01:38:15 -07:00
Andreas Auernhammer 14876a4df1
ldap: use custom TLS cipher suites (#20221)
This commit replaces the LDAP client TLS config and
adds a custom list of TLS cipher suites which support
RSA key exchange (RSA kex).

Some LDAP server connections experience a significant slowdown
when these cipher suites are not available. The Go TLS stack
disables them by default. (Can be enabled via GODEBUG=tlsrsakex=1).

fixes https://github.com/minio/minio/issues/20214

With a custom list of TLS ciphers, Go can pick the TLS RSA key-exchange
cipher. Ref:
```
	if c.CipherSuites != nil {
		return c.CipherSuites
	}
	if tlsrsakex.Value() == "1" {
		return defaultCipherSuitesWithRSAKex
	}
```
Ref: https://cs.opensource.google/go/go/+/refs/tags/go1.22.5:src/crypto/tls/common.go;l=1017

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2024-08-07 05:59:47 -07:00
Harshavardhana a17f14f73a
separate lock from common grid to avoid epoll contention (#20180)
epoll contention on TCP causes latency build-up when
we have high volume ingress. This PR is an attempt to
relieve this pressure.

upstream issue https://github.com/golang/go/issues/65064
It seems to be a deeper problem; we haven't yet tried the fix
provided in that issue. However, this change helps even
without changing the compiler. 

Of course, this is a workaround for now, hoping for a
more comprehensive fix from Go runtime.
2024-07-29 11:10:04 -07:00
Klaus Post 59788e25c7
Update connection deadlines less frequently (#20166)
Only set write deadline on connections every second. Combine the 2 write locations into 1.
2024-07-26 10:40:11 -07:00
Harshavardhana 064f36ca5a
move to GET for internal stream READs instead of POST (#20160)
the main reason is to let Go net/http perform necessary
bookkeeping properly, and in essence, from a consistency
point of view, it's GETs all the way.

Deprecate sendFile() as it's buggy inside the Go runtime.
2024-07-26 05:55:01 -07:00
Klaus Post 15b609ecea
Expose RPC reconnections and ping time (#20157)
- Keeps track of reconnection count.
- Keeps track of connection ping roundtrip times. 
  Sends timestamp in ping message.
- Allow ping without payload.
2024-07-25 14:07:21 -07:00
Harshavardhana 3b21bb5be8
use unixNanoTime instead of time.Time in lockRequestorInfo (#20140)
Bonus: Skip Source, Quorum fields in lockArgs that are never
sent during Unlock() phase.
2024-07-24 03:24:01 -07:00
Harshavardhana 6fe2b3f901
avoid sendFile() for ranges or object lengths < 4MiB (#20141) 2024-07-24 03:22:50 -07:00