Commit Graph

75 Commits

Author SHA1 Message Date
Klaus Post
f0b91e5504 Run modernize (#21546)
`go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...` executed.

`go generate ./...` ran afterwards to keep generated.
2025-08-28 19:39:48 -07:00
Harshavardhana
2b34e5b9ae move to go1.24 (#21114) 2025-04-09 07:28:39 -07:00
Anis Eleuch
734d1e320a heal: Single object heal to look for older versions as well (#203) (#20723)
`mc admin heal ALIAS/bucket/object` does not have any flag to heal
object noncurrent versions, this commit will make healing of the object
noncurrent versions implicitly asked.

This also fixes the 'mc admin heal ALIAS/bucket/object' that does not work 
correctly when the bucket is versioned. This has been broken since Apr 2023.
2024-12-04 04:42:04 +05:30
Anis Eleuch
789cbc6fb2 heal: Dangling check to evaluate object parts separately (#19797) 2024-06-10 08:51:27 -07:00
Harshavardhana
3549e583a6 results must be a single channel to avoid overwriting healing.bin (#19702) 2024-05-09 10:15:03 -07:00
Harshavardhana
364d3a0ac9 fix: new staticheck and linter issues reported (#19340) 2024-03-27 08:10:40 -07:00
Harshavardhana
53aa8f5650 use typos instead of codespell (#19088) 2024-02-21 22:26:06 -08:00
Harshavardhana
5b1a74b6b2 do not block iam.store registration (#18999)
current implementation would quite simply
block the sys.store registration, making
sys.Initialized() call to be blocked.
2024-02-07 12:41:58 -08:00
Harshavardhana
fec13b0ec1 remove unused DiskMTime (#18965) 2024-02-05 01:04:26 -08:00
Harshavardhana
f225ca3312 Add more advanced cases for dangling (#18968) 2024-02-04 14:36:13 -08:00
Harshavardhana
80ca120088 remove checkBucketExist check entirely to avoid fan-out calls (#18917)
Each Put, List, Multipart operations heavily rely on making
GetBucketInfo() call to verify if bucket exists or not on
a regular basis. This has a large performance cost when there
are tons of servers involved.

We did optimize this part by vectorizing the bucket calls,
however its not enough, beyond 100 nodes and this becomes
fairly visible in terms of performance.
2024-01-30 12:43:25 -08:00
Harshavardhana
dd2542e96c add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Shubhendu
e31081d79d Heal buckets at node level (#18612)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-01-09 20:34:04 -08:00
Harshavardhana
196e7e072b allow bitrot files to be healed in MRF (#18618)
bitrot scanMode was ignored in MRF,
allow it to heal relevant content if
needed when seen as an error.
2023-12-08 12:26:01 -08:00
Harshavardhana
e30c0e7ca3 Revert "Heal buckets at node level (#18504)"
This reverts commit 708296ae1b.
2023-12-05 22:34:46 -08:00
Shubhendu
708296ae1b Heal buckets at node level (#18504) 2023-12-05 02:17:35 -08:00
Anis Eleuch
b7d11141e1 rename Force to Immediate for clarity (#18540) 2023-11-28 22:35:16 -08:00
Harshavardhana
a4cfb5e1ed return errors if dataDir is missing during HeadObject() (#18477)
Bonus: allow replication to attempt Deletes/Puts when
the remote returns quorum errors of some kind, this is
to ensure that MinIO can rewrite the namespace with the
latest version that exists on the source.
2023-11-20 21:33:47 -08:00
Harshavardhana
b28bcad11b avoid Access() calls on known bucket paths (#17719) 2023-07-26 11:31:40 -07:00
Anis Eleuch
15fd5ce2fa fix: A typo in per pool make/delete bucket errs calculation (#17553) 2023-07-03 09:47:40 -07:00
Aditya Manthramurthy
5a1612fe32 Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Harshavardhana
5569acd95c disallow EC:0 if not set during server startup (#17141) 2023-05-04 14:44:30 -07:00
Harshavardhana
6825bd7e75 fix: inlined objects don't need to honor long locks (#17039) 2023-04-17 12:16:37 -07:00
Krishnan Parthasarathi
f92450d8b3 commonParity should pick readable FileInfo (#17032) 2023-04-14 16:23:28 -07:00
ferhat elmas
3423028713 cleanup Go linter settings (#16736) 2023-03-04 20:57:35 -08:00
Harshavardhana
b882310e2b avoid locks for internal and invalid buckets in MakeBucket() (#16302) 2022-12-23 07:46:00 -08:00
Aditya Manthramurthy
a30cfdd88f Bump up madmin-go to v2 (#16162) 2022-12-06 13:46:50 -08:00
Klaus Post
cc1d8f0057 Check for abandoned data when healing (#16122) 2022-11-28 10:20:55 -08:00
Harshavardhana
94dbb4a427 fix: generalize SC config and also skip healing sub-sys under SD (#15757) 2022-09-26 09:04:54 -07:00
Klaus Post
a9f1ad7924 Add extended checksum support (#15433) 2022-08-29 16:57:16 -07:00
Poorna
426c902b87 site replication: fix healing of bucket deletes. (#15377)
This PR changes the handling of bucket deletes for site 
replicated setups to hold on to deleted bucket state until 
it syncs to all the clusters participating in site replication.
2022-07-25 17:51:32 -07:00
Harshavardhana
65166e4ce4 fix: readQuorum calculation when defaultParityCount is 0 (#15363)
when parity is '0' the readQuorum must be equal
to the number of data disks.
2022-07-21 07:25:54 -07:00
Klaus Post
f939d1c183 Independent Multipart Uploads (#15346)
Do completely independent multipart uploads.

In distributed mode, a lock was held to merge each multipart 
upload as it was added. This lock was highly contested and 
retries are expensive (timewise) in distributed mode.

Instead, each part adds its metadata information uniquely. 
This eliminates the per object lock required for each to merge.
The metadata is read back and merged by "CompleteMultipartUpload" 
without locks when constructing final object.

Co-authored-by: Harshavardhana <harsha@minio.io>
2022-07-19 08:35:29 -07:00
Praveen raj Mani
b49fc33cb3 purge objects immediately with x-minio-force-delete in DeleteObject and DeleteBucket API (#15148) 2022-07-11 09:15:54 -07:00
Anis Elleuch
73733a8fb9 heal: Report correctly in multip-pools setup (#15117)
`mc admin heal -r <alias>` in a multi setup pools returns incorrectly
grey objects. The reason is that erasure-server-pools.HealObject() runs
HealObject in all pools and returns the result of the first nil
error. However, in the lower erasureObject level, HealObject() returns
nil if an object does not exist + missing error in each disk of the object
in that pool, therefore confusing mc.

Make erasureObject.HealObject() to return not found error in the lower
level, so at least erasureServerPools will know what pools to ignore.
2022-06-20 08:07:45 -07:00
Harshavardhana
507f993075 attempt to real resolve when there is a quorum failure on reads (#14613) 2022-04-20 12:49:05 -07:00
Klaus Post
8a274169da heal: Fix first entry on dangling (#14495)
Instead of the first, the last entry was returned
pointerizing the range value.
2022-03-08 09:04:20 -08:00
Anis Elleuch
3fca4055d2 heal: Re-heal an object when a corruption is found during normal scan (#14482)
When scanning using normal mode, HealObject() can report an 
error saying that it found a corrupted part. This doesn't have 
when HealObject() is called with bitrot scan flag. However, when 
this happens, we can still restart HealObject() with the bitrot scan.

This is also important because this means the scanner and the 
new disks healer will not be able to heal an object that doesn't 
exist in a specific disk and has corruption in another disk.

Also without this PR, mc admin heal command without bitrot will report
an error.
2022-03-04 18:24:34 -08:00
Harshavardhana
f19a414e09 fix: allow danging objects to be purged properly deleteMultipleObjects() (#14273)
Deleting bulk objects had an issue since the relevant versionID
is not passed through the layers to ensure that the dangling
object purge actually works cleanly.

This is a continuation of quorum related error returned by
multi-object delete API from #14248

This PR ensures that we pass down correct information as
well as extend the scope of dangling object detection.
2022-02-08 20:08:23 -08:00
Harshavardhana
74faed166a Add quota usage as part of prometheus metrics (#14222)
Bonus: pass caller context when needed to all bucket metadata handling calls.
2022-01-31 17:27:43 -08:00
Harshavardhana
b7c5e45fff heal: isObjectDangling should return false when it cannot decide (#14053)
In a multi-pool setup when disks are coming up, or in a single pool
setup let's say with 100's of erasure sets with a slow network.

It's possible when healing is attempted on `.minio.sys/config`
folder, it can lead to healing unexpectedly deleting some policy
files as dangling due to a mistake in understanding when `isObjectDangling`
is considered to be 'true'.

This issue happened in commit 30135eed86
when we assumed the validMeta with empty ErasureInfo is considered
to be fully dangling. This implementation issue gets exposed when
the server is starting up.

This is most easily seen with multiple-pool setups because of the
disconnected fashion pools that come up. The decision to purge the
object as dangling is taken incorrectly prior to the correct state
being achieved on each pool, when the corresponding drive let's say
returns 'errDiskNotFound', a 'delete' is triggered. At this point,
the 'drive' comes online because this is part of the startup sequence
as drives can come online lazily.

This kind of situation exists because we allow (totalDisks/2) number
of drives to be online when the server is being restarted.

Implementation made an incorrect assumption here leading to policies
getting deleted.

Added tests to capture the implementation requirements.
2022-01-07 19:11:54 -08:00
Harshavardhana
001b77e7e1 use readConfig/saveConfig to simplify I/O on usage/tracker info (#14019) 2022-01-03 10:22:58 -08:00
Harshavardhana
79df2c7ce7 correctly calculate read quorum based on the available fileInfo (#14000)
The current usage of assuming `default` parity of `4` is not correct
for all objects stored on MinIO, objects in .minio.sys have maximum
parity, healing won't trigger on these objects due to incorrect
verification of quorum.
2021-12-28 15:33:03 -08:00
Harshavardhana
b883803b21 fix: healing across pools removing dangling objects (#13990)
adds other simplifications to the code when running
namespace heals across pools.
2021-12-25 09:01:44 -08:00
Harshavardhana
0e3037631f skip inconsistent shards if possible (#13945)
data shards were wrong due to a healing bug
reported in #13803 mainly with unaligned object
sizes.

This PR is an attempt to automatically avoid
these shards, with available information about
the `xl.meta` and actually disk mtime.
2021-12-21 10:08:26 -08:00
jiangfucheng
88c0d0120c update heal object unit test (#13886) 2021-12-11 09:04:07 -08:00
jiangfucheng
7460fb8349 fix padding error and compatible with uploaded objects (#13803) 2021-12-03 09:26:30 -08:00
Klaus Post
3db931dc0e Improve listing consistency with version merging (#13723) 2021-12-02 11:29:16 -08:00
Harshavardhana
28f95f1fbe quorum calculation getLatestFileInfo should be itself (#13717)
FileInfo quorum shouldn't be passed down, instead
inferred after obtaining a maximally occurring FileInfo.

This PR also changes other functions that rely on
wrong quorum calculation.

Update tests as well to handle the proper requirement. All
these changes are needed when migrating from older deployments
where we used to set N/2 quorum for reads to EC:4 parity in
newer releases.
2021-11-22 09:36:29 -08:00
Harshavardhana
661b263e77 add gocritic/ruleguard checks back again, cleanup code. (#13665)
- remove some duplicated code
- reported a bug, separately fixed in #13664
- using strings.ReplaceAll() when needed
- using filepath.ToSlash() use when needed
- remove all non-Go style comments from the codebase

Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>
2021-11-16 09:28:29 -08:00