Commit Graph

11124 Commits

Author SHA1 Message Date
Krishnan Parthasarathi 71c32e9b48
Return successorModTime in quorum when available (#17925) 2023-09-04 08:24:17 -07:00
Harshavardhana 380a59520b add missing testdata for benchmarking 2023-09-02 14:40:38 -07:00
Harshavardhana 3995355150
avoid repeated large allocations for large parts (#17968)
many objects with 10,000 parts each can cause
a large memory spike, which can potentially
lead to OOM due to lack of GC.

the previous PR (#17963) reduced memory usage
significantly; this PR reduces it by a further 80% under
repeated calls.

The scanner sub-system has no use for the slice of Parts(),
so it is better left empty.

```
benchmark                            old ns/op     new ns/op     delta
BenchmarkToFileInfo/ToFileInfo-8     295658        188143        -36.36%

benchmark                            old allocs     new allocs     delta
BenchmarkToFileInfo/ToFileInfo-8     61             60             -1.64%

benchmark                            old bytes     new bytes     delta
BenchmarkToFileInfo/ToFileInfo-8     1097210       227255        -79.29%
```
2023-09-02 07:49:24 -07:00
Harshavardhana 8208bcb896
remove all unnecessary logging, logOnce when absolutely needed (#17965) 2023-09-01 16:19:18 -07:00
Poorna d665e855de
replication: remove check for empty version id (#17964) 2023-09-01 13:46:10 -07:00
Harshavardhana 18b3655c99
with xlv2 format we never had to fill in checksumInfo() (#17963)
- this PR avoids sending a large ChecksumInfo slice
  when it's not needed

- also for a file with XLV2 format there is no reason
  to allocate Checksum slice while reading
2023-09-01 13:45:58 -07:00
Anis Eleuch 6a8d8f34a5
kafka: Do not require key when sending a message (#17962)
Keys are helpful to ensure the strict ordering of messages. However, the
code currently uses a random request ID for every log, so using the request ID
as a Kafka key does not serve any purpose.

This commit removes the usage of the key, which also fixes the audit issue for
internal subsystems that do not have a request ID.
2023-09-01 08:37:22 -07:00
Harshavardhana b1c1f02132
use buffers for pathJoin, to re-use buffers. (#17960)
```
benchmark                        old ns/op     new ns/op     delta
BenchmarkPathJoin/PathJoin-8     79.6          55.3          -30.53%

benchmark                        old allocs     new allocs     delta
BenchmarkPathJoin/PathJoin-8     2              1              -50.00%

benchmark                        old bytes     new bytes     delta
BenchmarkPathJoin/PathJoin-8     48            24            -50.00%
```
2023-08-31 17:58:48 -07:00
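As an illustration of the buffer re-use idea in the pathJoin commit above, here is a minimal Go sketch. It assumes a `sync.Pool` of `bytes.Buffer` values; `joinPath` and `bufPool` are hypothetical names, and this is not the code from #17960.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"strings"
	"sync"
)

// bufPool hands out reusable buffers so that repeated joins do not
// allocate a fresh scratch buffer on every call.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// joinPath is a hypothetical helper (an assumption for illustration).
// It joins the elements with "/", cleans the result, and keeps a
// trailing slash if the last element had one.
func joinPath(elem ...string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	for i, e := range elem {
		if i > 0 {
			buf.WriteByte('/')
		}
		buf.WriteString(e)
	}

	trailing := len(elem) > 0 && strings.HasSuffix(elem[len(elem)-1], "/")
	// buf.String() copies the bytes, so returning the buffer to the
	// pool afterwards is safe.
	s := path.Clean(buf.String())
	if trailing && s != "/" {
		s += "/"
	}
	return s
}

func main() {
	fmt.Println(joinPath("bucket", "prefix/", "object"))
	fmt.Println(joinPath("bucket", "dir/"))
}
```

The pool only removes the scratch-buffer allocation; the returned string is still allocated on each call.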
Minio Trusted ea93643e6a Update yaml files to latest version RELEASE.2023-08-31T15-31-16Z 2023-09-01 00:15:57 +00:00
Shubhendu e47e625f73
Added replication graphs for site replication metrics (#17951)
This dashboard graphs the metrics when site replication is enabled
across MinIO instances.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-31 08:31:16 -07:00
yangw b13fcaf666
fix: read atomic variable in clientDevNull round trip time (#17955) 2023-08-31 08:31:01 -07:00
Harshavardhana 9458485e43
avoid double logging from healing (#17950) 2023-08-30 18:46:04 -07:00
Shubhendu 0ce9e00ffa
Added node scanner and node drives graphs (#17949)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-30 14:01:51 -07:00
Shubhendu c778c381b5
Added new bucket replication graphs (#17947)
This PR adds new bucket replication graphs for better and granular
monitoring of bucket replication. Also arranged all replication graphs
together.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-30 11:57:41 -07:00
Harshavardhana 0d1fbef751
fix: a possible crash in event target Close() (#17948)
these crashes are possible when the configured
target is still in init() state and never finished,
but a delete config was initiated.
2023-08-30 07:27:45 -07:00
Poorna b48bbe08b2
Add additional info for replication metrics API (#17293)
to track the replication transfer rate across different nodes,
number of active workers in use and in-queue stats to get
an idea of the current workload.

This PR also adds replication metrics to the site replication
status API. For site replication, prometheus metrics are
no longer at the bucket level - but at the cluster level.

Add prometheus metric to track credential errors since uptime
2023-08-30 01:00:59 -07:00
Minio Trusted cce90cb2b7 Update yaml files to latest version RELEASE.2023-08-29T23-07-35Z 2023-08-30 02:17:56 +00:00
Harshavardhana 07b1281046 add queue_dir to help message for logger/audit targets 2023-08-29 16:07:35 -07:00
Ravind Kumar 3515b99671
Clarify Community Helm Chart (#17944)
There is some consistent confusion between the Community Helm Chart in this repo and the MinIO Kubernetes Operator Helm Chart.

This change seeks to clarify the differences between the two charts and which ones are community maintained vs MinIO maintained.
2023-08-29 11:27:48 -07:00
Krishnan Parthasarathi 6a67c277eb
Reuse types for key-value, notification and retry (#17936) 2023-08-29 11:27:23 -07:00
Harshavardhana 1067dd3011
update minio-go v7.0.63 (#17937)
Signed-off-by: Harshavardhana <harsha@minio.io>
2023-08-28 20:02:14 -07:00
Harshavardhana 7cafdc0512
fix: skip access checks further for known buckets (#17934) 2023-08-28 15:16:41 -07:00
Harshavardhana 8a57b6bced
use renameat2 Linux extension syscall (#17757)
this is a faster and safer alternative
on newer kernel versions.
2023-08-27 09:57:11 -07:00
Andrea Longo 6f0ed2a091
Add equivalent mc ilm rule commands for some JSON rule examples (#17923) 2023-08-26 00:33:25 -07:00
Krishnan Parthasarathi 53abd25116
Don't log when object to be tiered is not found (#17924) 2023-08-25 23:34:16 -07:00
Harshavardhana 1ea7826c0e
do not have to consider replicationTimestamp for healing and quorum (#17922)
replicationTimestamp might differ if there were retries
in replication and the retried attempt overwrote in
quorum, leaving enough shards with a newer timestamp to
make the existing timestamps on xl.meta invalid. We
do not rely on this value for anything external.

this is purely a hint for debugging purposes, and there
is no real value in it. Since the object itself
is intact, we do not have to spend time healing this
situation.

we may consider healing this situation in future, but
that needs to be decoupled to make sure that we do not
overcalculate how much we have to heal.
2023-08-25 15:31:15 -07:00
Harshavardhana 97f4cf48f8 update wording on PR template 2023-08-25 11:57:37 -07:00
Anis Eleuch 0cde37be50
Reduce the number of calls to import bucket metadata (#17899)
For each bucket, save the bucket metadata
once and call the site replication hook once.
2023-08-25 07:59:16 -07:00
jiuker 6aeca54ece
fix: replace context by timeout-context from parent-context when `selfSpeedTest` (#17906) 2023-08-25 07:58:38 -07:00
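The idea of deriving a timeout context from a parent context, as in the commit title above, can be sketched in Go as follows. `runWithTimeout`, `waitForDone` and `demoTimeout` are hypothetical names used for illustration only; the actual `selfSpeedTest` change is not reproduced here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithTimeout derives a timeout context from parent, so the work
// stops when either the timeout fires or the parent is canceled.
func runWithTimeout(parent context.Context, d time.Duration, work func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, d)
	defer cancel()
	return work(ctx)
}

// waitForDone blocks until ctx ends and reports the reason.
func waitForDone(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// demoTimeout exercises both paths: a timeout under a live parent,
// and an already-canceled parent.
func demoTimeout() (timedOut, parentCanceled bool) {
	err := runWithTimeout(context.Background(), 10*time.Millisecond, waitForDone)
	timedOut = errors.Is(err, context.DeadlineExceeded)

	parent, cancel := context.WithCancel(context.Background())
	cancel()
	err = runWithTimeout(parent, time.Hour, waitForDone)
	parentCanceled = errors.Is(err, context.Canceled)
	return timedOut, parentCanceled
}

func main() {
	fmt.Println(demoTimeout())
}
```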
Harshavardhana 124e28578c
remove strict persistence requirements for List() .metacache objects (#17917)
.metacache objects are transient in nature, and are better left to
use page-cache effectively to avoid using more IOPs on the disks.

this keeps incoming calls from being taxed heavily by
multiple large batch listings.
2023-08-25 07:58:11 -07:00
Harshavardhana 62c9e500de
remove mTime requirement from pre-condition checks (#17916)
given a versionId the mtime is always the same, it
can never be different than its original value.

versionIds also do not conflict, since they are UUIDs
and practically unique forever.
2023-08-24 14:33:58 -07:00
jiuker 02cc18ff29 refactor the perf client for TTFB and TotalResponseTime (#17901) 2023-08-24 10:21:08 -07:00
Harshavardhana ba4566e86d
add missing IAM node metrics to cluster and node endpoint (#17908) 2023-08-24 09:26:37 -07:00
Krishnan Parthasarathi 87cb0081ec
Retain current and up to NewerNoncurrentVersions versions (#17909)
applyNewerNoncurrentVersionLimit method should pass along versions
unaffected by NewerNoncurrentVersions rule for further ILM evaluation.
2023-08-24 09:26:29 -07:00
Poorna 4a6af93c83
mark replication target offline if network timeouts seen (#17907)
the regular target liveness check, run every 5 secs, will toggle the
state back once the target returns online.
2023-08-24 09:24:26 -07:00
Minio Trusted a2f0771fd3 Update yaml files to latest version RELEASE.2023-08-23T10-07-06Z 2023-08-23 23:17:59 +00:00
Harshavardhana af564b8ba0
allow bootstrap to capture time-spent for each initializer (#17900) 2023-08-23 03:07:06 -07:00
Harshavardhana adb8be069e
tune-kafka targets to ensure timeout triggers on hung brokers (#17898)
hung brokers can cause slowness to the entire system
when many callers are hung, leading to large goroutine
build-up.
2023-08-22 20:26:35 -07:00
Klaus Post 7c8746732b
Return cancelled storage calls as 499 (#17895)
Make upstream cancels more visible - right now they are just reported as "forbidden".
2023-08-22 11:10:41 -07:00
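A minimal Go sketch of mapping a canceled call to status 499, as the commit above describes. The function names and the fallback to 500 are assumptions for illustration; 499 is a non-standard code for which `net/http` defines no constant.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// statusClientClosedRequest is the non-standard 499 code.
const statusClientClosedRequest = 499

// statusForError is a hypothetical mapping from an error to an HTTP
// status. Wrapped cancellations are detected through errors.Is.
func statusForError(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, context.Canceled):
		return statusClientClosedRequest
	default:
		// Assumption: anything else is treated as an internal error here.
		return http.StatusInternalServerError
	}
}

// demoCanceled returns the error of an already-canceled context,
// wrapped the way a storage layer might wrap it.
func demoCanceled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return fmt.Errorf("read failed: %w", ctx.Err())
}

func main() {
	fmt.Println(statusForError(demoCanceled()))
	fmt.Println(statusForError(nil))
}
```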
Klaus Post f506117edb
Reduce memory profiling rate (#17894)
Change profiling from every 4KB to every 128KB, reducing the lock contention by a factor of 32.
2023-08-22 07:21:49 -07:00
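The factor of 32 in the commit above follows from 128KB / 4KB. A small Go sketch, assuming the rate is applied through the standard `runtime.MemProfileRate` variable (the constant and function names are hypothetical):

```go
package main

import (
	"fmt"
	"runtime"
)

const (
	oldProfileRate = 4 << 10   // sample roughly every 4 KiB allocated
	newProfileRate = 128 << 10 // sample roughly every 128 KiB allocated
)

// reductionFactor reports how many times fewer samples the new rate
// takes on average compared to the old one.
func reductionFactor() int {
	return newProfileRate / oldProfileRate
}

func main() {
	// The runtime documentation recommends changing the rate once,
	// as early as possible in the program.
	runtime.MemProfileRate = newProfileRate
	fmt.Println(reductionFactor())
}
```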
Harshavardhana 1c5af7c31a
serialize queueMRFHeal(), add timeouts and avoid normal build-ups (#17886)
we expect a certain level of IOPs and latency so this is okay.

fixes other miscellaneous bugs

- such as hanging on mrfCh <- when the context is canceled
- queuing MRF heal when the context is canceled
- remove unused saveStateCh channel
2023-08-21 16:44:50 -07:00
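The two context-related fixes listed above (not hanging on a channel send, and not queuing after cancellation) can be sketched in Go as below. `healOp`, `enqueue` and `demoEnqueue` are hypothetical names, and this is not the actual queueMRFHeal() code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// healOp is a hypothetical queue entry.
type healOp struct {
	bucket, object string
}

// enqueue tries to queue op without hanging: it gives up when the
// context is canceled or when the wait timeout expires.
func enqueue(ctx context.Context, ch chan<- healOp, op healOp, wait time.Duration) bool {
	// Do not queue at all once the context is canceled.
	if ctx.Err() != nil {
		return false
	}

	t := time.NewTimer(wait)
	defer t.Stop()

	select {
	case ch <- op:
		return true
	case <-ctx.Done():
		return false
	case <-t.C:
		return false
	}
}

// demoEnqueue queues once with a live context, then tries again with
// a canceled one.
func demoEnqueue() (queued, queuedAfterCancel bool) {
	ch := make(chan healOp, 2)
	queued = enqueue(context.Background(), ch, healOp{bucket: "bucket", object: "a"}, time.Second)

	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	queuedAfterCancel = enqueue(ctx, ch, healOp{bucket: "bucket", object: "b"}, time.Second)
	return queued, queuedAfterCancel
}

func main() {
	fmt.Println(demoEnqueue())
}
```

The explicit `ctx.Err()` check matters because `select` picks randomly among ready cases, so a canceled context could otherwise still queue when the channel has room.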
Harshavardhana 3a0125fa1f
remove unexpected logging from peer calls (#17888)
also make sure RequestID is set for system logs
2023-08-21 14:25:24 -07:00
Daniel Valdivia 328cb0a076
Pass environment variable to control session length to console (#17885)
Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>
2023-08-21 11:55:43 -07:00
Shubhendu c3c8441a1d
Corrected the count of buckets and objects graphs (#17883)
In a distributed setup with a load balancer, any server may
randomly report the metrics `minio_cluster_bucket_total` and
`minio_cluster_usage_object_total`, so while graphing them we should
take the max of the reported values.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-21 09:04:38 -07:00
jiuker fa2a8d7209
fix: drain the req.body into io.Discard correctly (#17881) 2023-08-21 01:09:07 -07:00
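Draining a body into `io.Discard`, as named in the commit title above, commonly looks like the following Go sketch. `drainBody` and `demoDrain` are hypothetical helpers, not the code from #17881.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// drainBody reads the body to the end, discards the bytes, and closes
// it. It returns how many bytes were discarded.
func drainBody(body io.ReadCloser) (int64, error) {
	if body == nil {
		return 0, nil
	}
	defer body.Close()
	return io.Copy(io.Discard, body)
}

// demoDrain drains a five-byte in-memory body.
func demoDrain() int64 {
	n, _ := drainBody(io.NopCloser(strings.NewReader("hello")))
	return n
}

func main() {
	fmt.Println(demoDrain())
}
```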
jiuker e3ea97c964
fix: replace req context by locker context (#17880) 2023-08-19 22:09:07 -07:00
Mathieu Parent 7219ae530e
helm: allow to configure statement policy effect (#17700)
Signed-off-by: Mathieu Parent <mathieu.parent@insee.fr>
2023-08-19 07:39:11 -07:00
Andreas Auernhammer 8f8f8854f0
update `minio/kes-go` dep to v0.2.0 (#17850)
This commit updates the minio/kes-go dependency
to v0.2.0 and updates the existing code to work
with the new KES APIs.

The `SetPolicy` handler got removed since it
may not get implemented by KES at all and could
not have been used in the past since stateless KES
is read-only w.r.t. policies and identities.

Signed-off-by: Andreas Auernhammer <hi@aead.dev>
2023-08-19 07:37:53 -07:00
Anis Eleuch 4c6869cd9a
ilm: Fix cleaning non current null versions (#17876) 2023-08-18 12:55:47 -07:00
Harshavardhana bc7c0d8624
if object is a delete marker it must skip tags filter in ILM (#17861) 2023-08-18 09:36:23 -07:00