Commit Graph

87 Commits

Author SHA1 Message Date
Harshavardhana
6186d11761
handle the locks properly for multi-pool callers (#20495)
- PutObjectMetadata()
- PutObjectTags()
- DeleteObjectTags()
- TransitionObject()
- RestoreTransitionObject()

Also improve the behavior of multipart code across
pool locks, hold locks only once per upload ID for

- CompleteMultipartUpload()
- AbortMultipartUpload()
- ListObjectParts() (read-lock)
- GetMultipartInfo() (read-lock)
- PutObjectPart() (read-lock)

This avoids lock attempts across pools for no
reason, this increases O(n) when there are n-pools.
2024-09-29 15:40:36 -07:00
Poorna
e8b457e8a6
Change delete marker proxy test to use distributed setup (#20494) 2024-09-27 18:02:26 -07:00
Harshavardhana
0c53d86017
remove the list from 'mc stat' from testing via '--no-list' (#20468)
avoid 'listing' as it may get incorrect results for
these tests, we are only interested in 'mc stat' as
in HEAD object here.
2024-09-24 01:03:58 -07:00
Harshavardhana
b6b7cddc9c
make sure listParts returns parts that are valid (#20390) 2024-09-06 02:42:21 -07:00
Anis Eleuch
e726d8ff0f
list: Hide objects/versions with pending/failed replicated deletion (#20047)
In regular listing, this commit will avoid showing an object when its
latest version has a pending or failed deletion. In replicated setup.
It will also prevent showing older versions in the same case.
2024-07-09 15:26:42 -07:00
Harshavardhana
a22ce4550c protect workers and simplify use of atomics (#19982)
without atomic load() it is possible that for
a slow receiver we would get into a hot-loop, when
logCh is full and there are many incoming callers.

to avoid this as a workaround enable BATCH_SIZE
greater than 100 to ensure that your slow receiver
receives data in bulk to avoid being throttled in
some manner.

this PR however fixes the unprotected access to
the current workers value.
2024-06-24 18:15:27 -07:00
Shubhendu
21b6204692
Test proxying of DEL marker for bucket replication (#19870)
Make sure to avoid proxying for DEL markers

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-04 04:38:26 -07:00
Harshavardhana
e0fe7cc391
fix: information disclosure bug in preconditions GET (#19810)
precondition check was being honored before, validating
if anonymous access is allowed on the metadata of an
object, leading to metadata disclosure of the following
headers.

```
Last-Modified
Etag
x-amz-version-id
Expires:
Cache-Control:
```

although the information presented is minimal in nature,
and of opaque nature. It still simply discloses that an
object by a specific name exists or not without even having
enough permissions.
2024-05-27 12:17:46 -07:00
Poorna
e947a844c9
Fix test scripts to use mc ready (#19768) 2024-05-18 11:19:01 -07:00
Poorna
e7aa26dc29
fix: allow DeleteObject unversioned objects with insufficient read quorum (#19581)
Since the object is being permanently deleted, the lack of read quorum should not
matter as long as sufficient disks are online to complete the deletion with parity
requirements.

If several pools have the same object with insufficient read quorum, attempt to
delete object from all the pools where it exists
2024-04-25 17:31:12 -07:00
Harshavardhana
9693c382a8
make renameData() more defensive during overwrites (#19548)
instead upon any error in renameData(), we still
preserve the existing dataDir in some form for
recoverability in strange situations such as out
of disk space type errors.

Bonus: avoid running list and heal() instead allow
versions disparity to return the actual versions,
uuid to heal. Currently limit this to 100 versions
and lesser disparate objects.

an undo now reverts back the xl.meta from xl.meta.bkp
during overwrites on such flaky setups.

Bonus: Save N depth syscalls via skipping the parents
upon overwrites and versioned updates.

Flaky setup examples are stretch clusters with regular
packet drops etc, we need to add some defensive code
around to avoid dangling objects.
2024-04-23 10:15:52 -07:00
Harshavardhana
2ca9befd2a
add ILM + site-replication tests (#19554) 2024-04-19 05:48:19 -07:00
Sveinn
108e6f92d4
updating tests to use new mc --enc flags (#19508) 2024-04-19 01:43:09 -07:00
Aditya Manthramurthy
9a4d003ac7
Add common middleware to S3 API handlers (#19171)
The middleware sets up tracing, throttling, gzipped responses and
collecting API stats.

Additionally, this change updates the names of handler functions in
metric labels to be the same as the name derived from Go lang reflection
on the handler name.

The metric api labels are now stored in memory the same as the handler
name - they will be camelcased, e.g. `GetObject` instead of `getobject`.

For compatibility, we lowercase the metric api label values when emitting the metrics.
2024-03-04 10:05:56 -08:00
Harshavardhana
b6e98aed01
fix: found races in accessing globalLocalDrives (#19069)
make a copy before accessing globalLocalDrives

Bonus: update console v0.46.0

Signed-off-by: Harshavardhana <harsha@minio.io>
2024-02-16 17:15:57 -08:00
Harshavardhana
dd2542e96c
add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Harshavardhana
38637897ba
fix: listing SSE encrypted multipart objects (#18786)
GetActualSize() was heavily relying on o.Parts()
to be non-empty to figure out if the object is multipart or not, 
However, we have many indicators of whether an object is multipart 
or not.

Blindly assuming that o.Parts == nil is not a multipart, is an 
incorrect expectation instead, multipart must be obtained via

- Stored metadata value indicating this is a multipart encrypted object.

- Rely on <meta>-actual-size metadata to get the object's actual size.
  This value is preserved for additional reasons such as these.

- ETag != 32 length
2024-01-15 00:57:49 -08:00
Shubhendu
ce62980d4e
Fixed transition rules getting overwritten while healing (#18542)
While healing the latest changes of expiry rules across sites
if target had pre existing transition rules, they were getting
overwritten as cloned latest expiry rules from remote site were
getting written as is. Fixed the same and added test cases as
well.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-28 10:38:35 -08:00
Harshavardhana
fba883839d
feat: bring new HDD related performance enhancements (#18239)
Optionally allows customers to enable 

- Enable an external cache to catch GET/HEAD responses 
- Enable skipping disks that are slow to respond in GET/HEAD 
  when we have already achieved a quorum
2023-11-22 13:46:17 -08:00
Shubhendu
58306a9d34
Replicate Expiry ILM configs while site replication (#18130)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-21 09:48:06 -08:00
Harshavardhana
c0f2f84285
avoid racy replicationCount checks (#18311)
resync status may not be upto-date by
the time the resync is over due to how
the timer is triggered.

diff is sufficient to know if replication
happened or not.
2023-10-24 15:30:42 -07:00
Harshavardhana
e1e33077e8
fix: tests and resync replication status (#18244) 2023-10-13 17:03:34 -07:00
Poorna
311380f8cb
replication resync: fix queueing (#17775)
Assign resync of all versions of object to the same worker to avoid locking
contention. Fixes parallel resync implementation in #16707
2023-08-01 11:51:15 -07:00
Harshavardhana
dfd7cca0d2
fix: allow cancel of decom only when its in progress (#17607) 2023-07-10 07:55:38 -07:00
Harshavardhana
1818764840
fix: bug in passing Versioned field set for getHealReplicationInfo() (#17498)
Bonus: rejects prefix deletes on object-locked buckets earlier
2023-06-27 09:45:50 -07:00
Harshavardhana
f9b8d1c699
fix: sio-error test to fail if commands fail (#17466) 2023-06-19 12:52:50 -07:00
Harshavardhana
ad4e511026
do not save plain-text ETag when encryption is requested (#17427)
fixes an issue under bucket replication could cause
ETags for replicated SSE-S3 single part PUT objects,
to fail as we would attempt a decryption while listing,
or stat() operation.
2023-06-15 12:43:26 -07:00
Harshavardhana
4a425cbac1
cleanup scripts and apply shfmt (#17284) 2023-05-25 22:07:25 -07:00
Harshavardhana
1d0211d395
allow deletes on directory objects to perform permanent deletes (#17132) 2023-05-04 14:43:52 -07:00
Harshavardhana
b53376a3a4
change directory objects to never create new versions (#17109) 2023-05-02 16:09:33 -07:00
Harshavardhana
fb1492f531
check for quorum errors for DeleteBucket() (#16859) 2023-03-20 23:38:06 -07:00
Harshavardhana
e64b9f6751
fix: disallow SSE-C encrypted objects on replicated buckets (#16467) 2023-01-24 15:46:33 -08:00
Poorna
b29e159604
docs: Update replication setup commands (#16361) 2023-01-04 13:39:37 -08:00
Harshavardhana
14d29b77ae
update replication tests with latest 'mc' (#16348) 2023-01-03 22:54:39 -08:00
Harshavardhana
2937711390
fix: DeleteObject() API with versionId under replication (#16325) 2022-12-28 22:48:33 -08:00
Daryl White
d44f3526dc
Update links to documentation site (#15750) 2022-09-28 21:28:45 -07:00
Harshavardhana
d350b666ff
feat: add idempotent delete marker support (#15521)
The bottom line is delete markers are a nuisance,
most applications are not version aware and this
has simply complicated the version management.

AWS S3 gave an unnecessary complication overhead
for customers, they need to now manage these
markers by applying ILM settings and clean
them up on a regular basis.

To make matters worse all these delete markers
get replicated as well in a replicated setup,
requiring two ILM settings on each site.

This PR is an attempt to address this inferior
implementation by deviating MinIO towards an
idempotent delete marker implementation i.e
MinIO will never create any more than single
consecutive delete markers.

This significantly reduces operational overhead
by making versioning more useful for real data.

This is an S3 spec deviation for pragmatic reasons.
2022-08-18 16:41:59 -07:00
Harshavardhana
b6eb8dff64
Add decommission compression+encryption enabled tests (#15322)
update compression environment variables to follow
the expected sub-system style, however support fallback
mode.
2022-07-17 08:43:14 -07:00
Harshavardhana
48e367ff7d
reject resync start on misconfigured replication rules (#15041)
we expect resync to start on buckets with replication
rule ExistingObjects enabled, if not we reject such
calls.
2022-06-06 02:54:39 -07:00
Krishnan Parthasarathi
ad8e611098
feat: implement prefix-level versioning exclusion (#14828)
Spark/Hadoop workloads which use Hadoop MR 
Committer v1/v2 algorithm upload objects to a 
temporary prefix in a bucket. These objects are 
'renamed' to a different prefix on Job commit. 
Object storage admins are forced to configure 
separate ILM policies to expire these objects 
and their versions to reclaim space.

Our solution:

This can be avoided by simply marking objects 
under these prefixes to be excluded from versioning, 
as shown below. Consequently, these objects are 
excluded from replication, and don't require ILM 
policies to prune unnecessary versions.

-  MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> 
        <Status>Enabled</Status>
        <ExcludeFolders>true</ExcludeFolders>
        <ExcludedPrefixes>
          <Prefix>app1-jobs/*/_temporary/</Prefix>
        </ExcludedPrefixes>
        <ExcludedPrefixes>
          <Prefix>app2-jobs/*/__magic/</Prefix>
        </ExcludedPrefixes>

        <!-- .. up to 10 prefixes in all -->     
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket 
from versioning. This is required to prevent the parent 
folders from accumulating delete markers, especially
those which are shared across spark workloads 
spanning projects/teams.

- To enable version exclusion on a list of prefixes

```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/_magic," --exclude-prefix-marker myminio/test
```
2022-05-06 19:05:28 -07:00
Anis Elleuch
1f92fc3fc0
Always check for root disks unless MINIO_CI_CD is set (#14232)
The current code considers a pool with all root disks to be as part
of a testing environment even if there are other pools with mounted
disks. This will result to illegitimate writing in root disks.

Fix this by simplifing the logic: require MINIO_CI_CD in order to skip
root disk check.
2022-02-13 15:42:07 -08:00
Harshavardhana
e3e0532613
cleanup markdown docs across multiple files (#14296)
enable markdown-linter
2022-02-11 16:51:25 -08:00
Poorna
ed3418c046
Refactor replication resync to be an active process (#14266)
When resync is triggered, walk the bucket namespace and
resync objects that are unreplicated. This PR also adds
an API to report resync progress.
2022-02-10 10:16:52 -08:00
Poorna
295730408b
Disallow delete replication for tag based rules (#14167) 2022-01-24 15:22:20 -08:00
Harshavardhana
0e3037631f
skip inconsistent shards if possible (#13945)
data shards were wrong due to a healing bug
reported in #13803 mainly with unaligned object
sizes.

This PR is an attempt to automatically avoid
these shards, with available information about
the `xl.meta` and actually disk mtime.
2021-12-21 10:08:26 -08:00
Harshavardhana
92fdcafb66
add verification tests for ETag on replicated content (#13857) 2021-12-07 10:08:26 -08:00
Harshavardhana
5acc8c0134
add multi-site replication tests (#13631) 2021-11-10 18:18:09 -08:00
Harshavardhana
3c1220adca add tests for default governance replication 2021-10-30 08:57:59 -07:00
Harshavardhana
2af5445309 update 3-site replication tests 2021-10-29 22:09:55 -07:00
Poorna Krishnamoorthy
7f6ed35347
Allow null versions to be replicated (#13310)
for pre-existing objects present in a bucket
prior to enabling existing object replication.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-09-28 10:26:12 -07:00