Commit Graph

104 Commits

Author SHA1 Message Date
Klaus Post
b8631cf531 Use new gofumpt (#21613)
Update tinylib. Should fix CI.

`gofumpt -w .&&go generate ./...`
2025-09-28 13:59:21 -07:00
Klaus Post
f0b91e5504 Run modernize (#21546)
`go run golang.org/x/tools/gopls/internal/analysis/modernize/cmd/modernize@latest -fix -test ./...` executed.

`go generate ./...` ran afterwards to keep generated.
2025-08-28 19:39:48 -07:00
Harshavardhana
a6f1e727fb add tests for ILM transition and healing (#166) (#20601)
This PR fixes a regression introduced in https://github.com/minio/minio/pull/19797
by restoring the healing ability of transitioned objects

Bonus: support for transitioned objects to carry original
The object name is for future reverse lookups if necessary.

Also fix parity calculation for tiered objects to n/2 for n/2 == (parity)
2024-10-31 15:10:24 -07:00
Harshavardhana
b6b7cddc9c make sure listParts returns parts that are valid (#20390) 2024-09-06 02:42:21 -07:00
Krishnan Parthasarathi
04be352ae9 Relax quorum agreement on DataDir values (#20232)
Previously, we checked if we had a quorum on the DataDir value. 
We are removing this check, which allows reading objects with different 
DataDir values in a few drives (due to a rebalance-stop race bug) 
provided their eTags or ModTimes match.
2024-08-12 12:02:21 -07:00
Harshavardhana
2e0fd2cba9 implement a safer completeMultipart implementation (#20227)
- optimize writing part.N.meta by writing both part.N
  and its meta in sequence without network component.

- remove part.N.meta, part.N which were partially success
  ful, in quorum loss situations during renamePart()

- allow for strict read quorum check arbitrated via ETag
  for the given part number, this makes it double safer
  upon final commit.

- return an appropriate error when read quorum is missing,
  instead of returning InvalidPart{}, which is non-retryable
  error. This kind of situation can happen when many
  nodes are going offline in rotation, an example of such
  a restart() behavior is statefulset updates in k8s.

fixes #20091
2024-08-12 01:38:15 -07:00
Poorna
6651c655cb fix replication of checksum when encryption is enabled (#20161)
- Adding functional tests
- Return checksum header on GET/HEAD, previously this was returning
  InvalidPartNumber error
2024-07-29 01:02:16 -07:00
Krishnan Parthasarathi
4a1edfd9aa Different read quorum for tiered objects (#20115)
For a non-tiered object, MinIO requires that EcM (# of data blocks) of
xl.meta agree, corresponding to the number of data blocks needed to 
read this object.

OTOH, tiered objects have metadata in the hot tier and data in the 
warm tier. The data and its integrity are offloaded to the warm tier. This
allows us to reduce the read quorum from EcM (typically > N/2, where N -
erasure stripe width) to N/2 + 1. The simple majority of metadata
ensures consensus on what the object is and where it is
located.
2024-07-25 14:02:50 -07:00
Aditya Manthramurthy
5f78691fcf ldap: Add user DN attributes list config param (#19758)
This change uses the updated ldap library in minio/pkg (bumped
up to v3). A new config parameter is added for LDAP configuration to
specify extra user attributes to load from the LDAP server and to store
them as additional claims for the user.

A test is added in sts_handlers.go that shows how to access the LDAP
attributes as a claim.

This is in preparation for adding SSH pubkey authentication to MinIO's SFTP
integration.
2024-05-24 16:05:23 -07:00
Krishnan Parthasarathi
1228d6bf1a Return NumVersions in quorum when available (#19766)
Similar to https://github.com/minio/minio/pull/17925
2024-05-17 13:57:37 -07:00
Harshavardhana
a372c6a377 a bunch of fixes for error handling (#19627)
- handle errFileCorrupt properly
- micro-optimization of sending done() response quicker
  to close the goroutine.
- fix logger.Event() usage in a couple of places
- handle the rest of the client to return a different error other than
  lastErr() when the client is closed.
2024-04-28 10:53:50 -07:00
Poorna
e7aa26dc29 fix: allow DeleteObject unversioned objects with insufficient read quorum (#19581)
Since the object is being permanently deleted, the lack of read quorum should not
matter as long as sufficient disks are online to complete the deletion with parity
requirements.

If several pools have the same object with insufficient read quorum, attempt to
delete object from all the pools where it exists
2024-04-25 17:31:12 -07:00
Anis Eleuch
95bf4a57b6 logging: Add subsystem to log API (#19002)
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.

Also introduce bugLogIf() for errors that should not happen in the first
place.
2024-04-04 05:04:40 -07:00
Anis Eleuch
24b4f9d748 Fix quorum calculation with zero parity objects (#19250)
Currently, the code relies on object parity to decide whether it is a
delete marker or a regular object. In the case of a delete marker, the
return quorum is half of the disks in the erasure set. However, this
calculation must be corrected with objects with EC = 0, mainly 
because EC is not a one-time fixed configuration.

Though all data are correct, the manifested symptom is a 503 with an 
EC=0 object. This bug was manifested after we introduced the 
fast Get Object feature that does not read all data from all disks in 
case of inlined objects
2024-03-12 12:59:11 -07:00
Krishnan Parthasarathi
a7577da768 Improve expiration of tiered objects (#18926)
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
  the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
  @klauspost
- Add error logs when expiryState and transition state are not
  initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
2024-03-01 21:11:03 -08:00
Harshavardhana
c599c11e70 fix: relax metadata checks for healing (#19165)
we should do this to ensure that we focus on
data healing as primary focus, fixing metadata
as part of healing must be done but making
data available is the main focus.

the main reason is metadata inconsistencies can
cause data availability issues, which must be
avoided at all cost.

will be bringing in an additional healing mechanism
that involves "metadata-only" heal, for now we do
not expect to have these checks.

continuation of #19154

Bonus: add a pro-active healthcheck to perform a connection
2024-02-29 22:49:01 -08:00
Harshavardhana
80ca120088 remove checkBucketExist check entirely to avoid fan-out calls (#18917)
Each Put, List, Multipart operations heavily rely on making
GetBucketInfo() call to verify if bucket exists or not on
a regular basis. This has a large performance cost when there
are tons of servers involved.

We did optimize this part by vectorizing the bucket calls,
however its not enough, beyond 100 nodes and this becomes
fairly visible in terms of performance.
2024-01-30 12:43:25 -08:00
Harshavardhana
708cebe7f0 add necessary protection err, fileInfo slice reads and writes (#18854)
protection was in place. However, it covered only some
areas, so we re-arranged the code to ensure we could hold
locks properly.

Along with this, remove the DataShardFix code altogether,
in deployments with many drive replacements, this can affect
and lead to quorum loss.
2024-01-24 01:08:23 -08:00
Harshavardhana
dd2542e96c add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Harshavardhana
38637897ba fix: listing SSE encrypted multipart objects (#18786)
GetActualSize() was heavily relying on o.Parts()
to be non-empty to figure out if the object is multipart or not, 
However, we have many indicators of whether an object is multipart 
or not.

Blindly assuming that o.Parts == nil is not a multipart, is an 
incorrect expectation instead, multipart must be obtained via

- Stored metadata value indicating this is a multipart encrypted object.

- Rely on <meta>-actual-size metadata to get the object's actual size.
  This value is preserved for additional reasons such as these.

- ETag != 32 length
2024-01-15 00:57:49 -08:00
Harshavardhana
fba883839d feat: bring new HDD related performance enhancements (#18239)
Optionally allows customers to enable 

- Enable an external cache to catch GET/HEAD responses 
- Enable skipping disks that are slow to respond in GET/HEAD 
  when we have already achieved a quorum
2023-11-22 13:46:17 -08:00
Poorna
96ec8fcba1 Preserve replica timestamps in multipart (#18318)
Also a backward compatibility fix to use x-amz-replica-status
if present as replication status.
2023-10-25 21:24:10 -07:00
Aditya Manthramurthy
1c99fb106c Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Krishnan Parthasarathi
71c32e9b48 Return successorModTime in quorum when available (#17925) 2023-09-04 08:24:17 -07:00
Harshavardhana
18b3655c99 with xlv2 format we never had to fill in checksumInfo() (#17963)
- this PR avoids sending a large ChecksumInfo slice
  when its not needed

- also for a file with XLV2 format there is no reason
  to allocate Checksum slice while reading
2023-09-01 13:45:58 -07:00
Harshavardhana
1ea7826c0e do not have to consider replicationTimestamp for healing and quorum (#17922)
replicationTimestamp might differ if there were retries
in replication and the retried attempt overwrote in
quorum but enough shards with newer timestamp causing
the existing timestamps on xl.meta to be invalid, we
do not rely on this value for anything external.

this is purely a hint for debugging purposes, but there
is no real value in it considering the object itself
is in-tact we do not have to spend time healing this
situation.

we may consider healing this situation in future but
that needs to be decoupled to make sure that we do not
over calculate how much we have to heal.
2023-08-25 15:31:15 -07:00
Harshavardhana
9ebd10d3f4 Revert "Include SuccessorModTime for FileInfo quorum (#17732)" (#17860)
This reverts commit bf3901342c.

This is to fix a regression caused when there are inconsistent
versions, but one version is in quorum. SuccessorModTime issue
must be fixed differently.
2023-08-16 07:51:33 -07:00
Krishnan Parthasarathi
bf3901342c Include SuccessorModTime for FileInfo quorum (#17732) 2023-07-26 17:04:16 -07:00
Harshavardhana
a566bcf613 treat 0-byte objects to honor same quorum as delete marker (#17633)
on unversioned buckets its possible that 0-byte objects
might lose quorum on flaky systems, allow them to be same
as DELETE markers. Since practically speak they have no
content.
2023-07-11 21:53:49 -07:00
Harshavardhana
1443b5927a allow quorum fileInfo to pick same parityBlocks (#17454)
Bonus: allow replication to proceed for 503 errors such as
with error code SlowDownRead
2023-06-18 18:20:15 -07:00
Harshavardhana
64de61d15d fallback on etags if they match when mtime is not same (#17424)
on "unversioned" buckets there are situations
when successive concurrent I/O can lead to
an inconsistent state() with mtime while the
etag might be the same for the object on disk.

in such a scenario it is possible for us to
allow reading of the object since etag matches
and if etag matches we are guaranteed that we
have enough copies the object will be readable
and same.

This PR allows fallback in such scenarios.
2023-06-17 19:18:20 -07:00
jiuker
0474791cf8 fix: set time format right (#17402) 2023-06-14 07:49:13 -07:00
Harshavardhana
d5059840ef fix: for delete marked objects choose appropriate parity (#17287) 2023-05-26 09:57:44 -07:00
Anis Eleuch
a30a55f3b1 Add object parity in listing V2M and listing versions M (#17238) 2023-05-19 09:42:45 -07:00
Praveen raj Mani
72802a5972 Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069) 2023-04-25 22:57:40 -07:00
Harshavardhana
8fd07bcd51 simplify sort.Sort by using sort.Slice (#17066) 2023-04-24 13:28:18 -07:00
Harshavardhana
6825bd7e75 fix: inlined objects don't need to honor long locks (#17039) 2023-04-17 12:16:37 -07:00
Krishnan Parthasarathi
f92450d8b3 commonParity should pick readable FileInfo (#17032) 2023-04-14 16:23:28 -07:00
Harshavardhana
72daccd468 fix: scanner in healing cycle must use actual size (#16589) 2023-02-10 06:53:03 -08:00
Harshavardhana
c242e6c391 fix: calculate common parity properly (#16406) 2023-01-13 03:28:16 +05:30
Harshavardhana
2937711390 fix: DeleteObject() API with versionId under replication (#16325) 2022-12-28 22:48:33 -08:00
Krishnan Parthasarathi
40a2c6b882 Return remote tier as StorageClass for transitioned objects (#16035) 2022-11-09 15:57:34 -08:00
Anis Elleuch
783dd875f7 refactor objectQuorumFromMeta() to search for parity quorum (#15844) 2022-10-12 16:42:45 -07:00
Harshavardhana
228c6686f8 allow non-standards fallback for all http.TimeFormats (#15662)
fixes #15645
2022-09-07 07:24:54 -07:00
Harshavardhana
7776d064cf allow non-standards fallback for Expires header (#15655)
fixes #15645
2022-09-05 19:18:18 -07:00
Klaus Post
a9f1ad7924 Add extended checksum support (#15433) 2022-08-29 16:57:16 -07:00
Harshavardhana
65166e4ce4 fix: readQuorum calculation when defaultParityCount is 0 (#15363)
when parity is '0' the readQuorum must be equal
to the number of data disks.
2022-07-21 07:25:54 -07:00
Harshavardhana
ce8397f7d9 use partInfo only for intermediate part.x.meta (#15353) 2022-07-19 18:56:24 -07:00
Klaus Post
911a17b149 Add compressed file index (#15247) 2022-07-11 17:30:56 -07:00
Harshavardhana
9c605ad153 allow support for parity '0', '1' enabling support for 2,3 drive setups (#15171)
allows for further granular setups

- 2 drives (1 parity, 1 data)
- 3 drives (1 parity, 2 data)

Bonus: allows '0' parity as well.
2022-06-27 20:22:18 -07:00