Commit Graph

259 Commits

Author SHA1 Message Date
Harshavardhana
77e94087cf
fix: calling statfs() call moves the disk head (#18203)
if erasure upgrade is needed rely on the in-memory
values, instead of performing a "DiskInfo()" call.

https://brendangregg.com/blog/2016-09-03/sudden-disk-busy.html

for HDDs these are problematic, lets avoid this because
there is no value in "being" absolutely strict here
in terms of parity. We are okay to increase parity
as we see based on the in-memory online/offline ratio.
2023-10-10 13:47:35 -07:00
Poorna
9dc29d7687
Avoid ILM expiry on deleted versions that are yet to replicate (#18175)
Fixes #18167
2023-10-06 06:55:15 -06:00
Harshavardhana
c34bdc33fb
make sure to set Versioned field to ensure rename2 is not called (#18141)
without this the rename2() can rename the previous dataDir
causing issues for different versions of the object, only
latest version is preserved due to this bug.

Added healing code to ensure recovery of such content.
2023-09-29 09:08:24 -07:00
Harshavardhana
3c470a6b8b
fix: the inspect script to use scheme per deployment (#18118) 2023-09-27 08:22:50 -07:00
jiuker
9947c01c8e
feat: SSE-KMS use uuid instead of read all data to md5. (#17958) 2023-09-18 10:00:54 -07:00
Harshavardhana
9f7044aed0
fix: ignore transient errors in read path (#18006)
Errors such as

```
returned an error (context deadline exceeded) (*fmt.wrapError)
```

```
(msgp: too few bytes left to read object) (*fmt.wrapError)
```
2023-09-11 15:29:59 -07:00
Aditya Manthramurthy
1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Harshavardhana
3995355150
avoid repeated large allocations for large parts (#17968)
objects with 10,000 parts and many of them can
cause a large memory spike which can potentially
lead to OOM due to lack of GC.

with previous PR reducing the memory usage significantly
in #17963, this PR reduces this further by 80% under
repeated calls.

Scanner sub-system has no use for the slice of Parts(),
it is better left empty.

```
benchmark                            old ns/op     new ns/op     delta
BenchmarkToFileInfo/ToFileInfo-8     295658        188143        -36.36%

benchmark                            old allocs     new allocs     delta
BenchmarkToFileInfo/ToFileInfo-8     61             60             -1.64%

benchmark                            old bytes     new bytes     delta
BenchmarkToFileInfo/ToFileInfo-8     1097210       227255        -79.29%
```
2023-09-02 07:49:24 -07:00
Harshavardhana
18b3655c99
with xlv2 format we never had to fill in checksumInfo() (#17963)
- this PR avoids sending a large ChecksumInfo slice
  when its not needed

- also for a file with XLV2 format there is no reason
  to allocate Checksum slice while reading
2023-09-01 13:45:58 -07:00
Harshavardhana
8a57b6bced
use renameat2 Linux extension syscall (#17757)
this is a faster and safer alternative
on newer kernel versions.
2023-08-27 09:57:11 -07:00
Harshavardhana
124e28578c
remove strict persistence requirements for List() .metacache objects (#17917)
.metacache objects are transient in nature, and are better left to
use page-cache effectively to avoid using more IOPs on the disks.

this allows for incoming calls to be not taxed heavily due to
multiple large batch listings.
2023-08-25 07:58:11 -07:00
Harshavardhana
c45bc32d98
skip disks under scanning when healing disks (#17822)
Bonus:

- avoid calling DiskInfo() calls when missing blocks
  instead heal the object using MRF operation.

- change the max_sleep to 250ms beyond that we will
  not stop healing.
2023-08-09 12:51:47 -07:00
Harshavardhana
81be718674
fix: optimize DiskInfo() call avoid metrics when not needed (#17763) 2023-07-31 15:20:48 -07:00
Harshavardhana
e7b60c4d65
Add slow drive timeouts to match with active disk monitoring (#17701)
allow active disk-monitoring to be configurable, and use
these add deadlines in various call layers for various
syscalls.
2023-07-25 16:58:31 -07:00
Harshavardhana
a566bcf613
treat 0-byte objects to honor same quorum as delete marker (#17633)
on unversioned buckets its possible that 0-byte objects
might lose quorum on flaky systems, allow them to be same
as DELETE markers. Since practically speak they have no
content.
2023-07-11 21:53:49 -07:00
Kaan Kabalak
f64d62b01d
Fix style of logOnceIf calls w/unique identifiers (#17631) 2023-07-11 13:17:45 -07:00
Poorna
e8c98c3246
Avoid extra GetObjectInfo call in DeleteObject API (#17599)
Optimize DeleteObject API to avoid extra 
GetObjectInfo call on the replicating side.

For receiving side, it is just a regular
DeleteObject call.

Bonus: Fix a corner case where version purged is 
absent on target (either due to replication not yet
complete or target version already deleted in a
one-way replication or when replication was disabled). 

In such cases, mark version purge complete.
2023-07-10 07:57:56 -07:00
Klaus Post
ff5988f4e0
Reduce allocations (#17584)
* Reduce allocations

* Add stringsHasPrefixFold which can compare string prefixes, while ignoring case and not allocating.
* Reuse all msgp.Readers
* Reuse metadata buffers when not reading data.

* Make type safe. Make buffer 4K instead of 8.

* Unslice
2023-07-06 16:02:08 -07:00
Kaan Kabalak
21fbe88e1f
Print certain log messages once per error (#17484) 2023-06-24 20:29:13 -07:00
Harshavardhana
1f8b9b4bd5
fix: do not listAndHeal() inline with PutObject() (#17499)
there is a possibility that slow drives can actually add latency
to the overall call, leading to a large spike in latency.

this can happen if there are other parallel listObjects()
calls to the same drive, in-turn causing each other to sort
of serialize.

this potentially improves performance and makes PutObject()
also non-blocking.
2023-06-24 19:31:04 -07:00
Harshavardhana
02c2ec3027
skip onlineDisks with parity mismatch (#17478) 2023-06-20 13:18:24 -07:00
Aditya Manthramurthy
5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Harshavardhana
1443b5927a
allow quorum fileInfo to pick same parityBlocks (#17454)
Bonus: allow replication to proceed for 503 errors such as
with error code SlowDownRead
2023-06-18 18:20:15 -07:00
Harshavardhana
64de61d15d
fallback on etags if they match when mtime is not same (#17424)
on "unversioned" buckets there are situations
when successive concurrent I/O can lead to
an inconsistent state() with mtime while the
etag might be the same for the object on disk.

in such a scenario it is possible for us to
allow reading of the object since etag matches
and if etag matches we are guaranteed that we
have enough copies the object will be readable
and same.

This PR allows fallback in such scenarios.
2023-06-17 19:18:20 -07:00
Harshavardhana
ad4e511026
do not save plain-text ETag when encryption is requested (#17427)
fixes an issue under bucket replication could cause
ETags for replicated SSE-S3 single part PUT objects,
to fail as we would attempt a decryption while listing,
or stat() operation.
2023-06-15 12:43:26 -07:00
Harshavardhana
54e544e03e
allow lookup()/head() operations on Veeam SOS objects (#17331) 2023-06-01 15:26:26 -07:00
Harshavardhana
ef54200db7
offline drives more than 50% of total drives return error (#17252) 2023-05-23 07:57:57 -07:00
Krishnan Parthasarathi
3e128c116e
Add lifecycle event source to audit log tags (#17248) 2023-05-22 15:28:56 -07:00
Klaus Post
aaf1abc993
simplify HardLimitReader by using LimitReader for internal usage (#17218) 2023-05-16 13:14:37 -07:00
Poorna
e07c2ab868
Use hash.NewLimitReader for internal multipart calls (#17191) 2023-05-12 11:19:08 -07:00
Klaus Post
7fad0c8b41
Remove checksums from HTTP range request, add part checksums (#17105) 2023-04-28 08:26:32 -07:00
Krishnan Parthasarathi
e7cac8acef
Add tags to auditLogLifecycle (#17081) 2023-04-26 17:49:00 -07:00
Praveen raj Mani
72802a5972
Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069) 2023-04-25 22:57:40 -07:00
Harshavardhana
b1f3935c5b
allow ListObjects() when a prefix is an object (#17074) 2023-04-25 22:41:54 -07:00
Krishnan Parthasarathi
fae9000304
heal: Pick maximally occuring modTime in quorum (#17071) 2023-04-25 10:13:57 -07:00
Harshavardhana
84f31ed45d
simplify MRF, converge it to regular healing (#17026) 2023-04-19 07:47:42 -07:00
Harshavardhana
6825bd7e75
fix: inlined objects don't need to honor long locks (#17039) 2023-04-17 12:16:37 -07:00
Klaus Post
c133979b8e
Add part count to checksum (#17035) 2023-04-14 09:44:45 -07:00
Poorna
cd6dec49c0
Add trace support for ilm activity (#16993) 2023-04-11 19:22:32 -07:00
Harshavardhana
c06e0bfef9
set correct Host: value for replication event notification (#16984) 2023-04-06 10:20:53 -07:00
Poorna
699a24f7e5
batch: validate versioning on src/tgt buckets (#16955) 2023-04-04 10:50:11 -07:00
Harshavardhana
216a471bbb
on quorum DeleteObject() errors attempt an MRF (#16932) 2023-03-31 08:15:41 -07:00
Harshavardhana
6c11dbffd5
add crash protection from backend modifications (#16846) 2023-03-20 09:08:42 -07:00
Poorna
d1e775313d
support decommissioning of tiered objects (#16751) 2023-03-16 07:48:05 -07:00
Harshavardhana
e0f4dd6027
remove unncessary logs from WalkDir(), PutObject() (#16818) 2023-03-15 11:52:23 -07:00
ferhat elmas
714283fae2
cleanup ignored static analysis (#16767) 2023-03-06 08:56:10 -08:00
Klaus Post
9acf1024e4
Remove bloom filter (#16682)
Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner.

Also removes a tiny bit of CPU by each write operation.
2023-02-24 09:03:31 +05:30
Harshavardhana
a0f06eac2a
add Veeam SOS API first implementation (#16688) 2023-02-22 19:54:57 +05:30
Krishnan Parthasarathi
d136ac0596
Don't close transition task channel on server exit (#16627) 2023-02-15 22:09:25 -08:00
Krishnan Parthasarathi
cea2ca8c8e
Add restore-status header for multipart objects (#16508) 2023-01-31 07:53:45 +05:30