Commit Graph

48 Commits

Author SHA1 Message Date
Krishnan Parthasarathi
ad8e611098
feat: implement prefix-level versioning exclusion (#14828)
Spark/Hadoop workloads that use the Hadoop MR 
Committer v1/v2 algorithms upload objects to a 
temporary prefix in a bucket. These objects are 
'renamed' to a different prefix on Job commit. 
Object storage admins are forced to configure 
separate ILM policies to expire these objects 
and their versions to reclaim space.

Our solution:

This can be avoided by simply marking objects 
under these prefixes to be excluded from versioning, 
as shown below. Consequently, these objects are 
excluded from replication, and don't require ILM 
policies to prune unnecessary versions.

-  MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> 
        <Status>Enabled</Status>
        <ExcludeFolders>true</ExcludeFolders>
        <ExcludedPrefixes>
          <Prefix>app1-jobs/*/_temporary/</Prefix>
        </ExcludedPrefixes>
        <ExcludedPrefixes>
          <Prefix>app2-jobs/*/__magic/</Prefix>
        </ExcludedPrefixes>

        <!-- .. up to 10 prefixes in all -->     
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket 
from versioning. This is required to prevent the parent 
folders from accumulating delete markers, especially
those which are shared across Spark workloads 
spanning projects/teams.

- To enable version exclusion on a list of prefixes

```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/__magic/" --exclude-prefix-marker myminio/test
```
2022-05-06 19:05:28 -07:00
Anis Elleuch
1f92fc3fc0
Always check for root disks unless MINIO_CI_CD is set (#14232)
The current code considers a pool with all root disks to be part
of a testing environment even if there are other pools with mounted
disks. This can result in illegitimate writes to root disks.

Fix this by simplifying the logic: require MINIO_CI_CD in order to skip
the root disk check.
2022-02-13 15:42:07 -08:00
Harshavardhana
e3e0532613
cleanup markdown docs across multiple files (#14296)
enable markdown-linter
2022-02-11 16:51:25 -08:00
Poorna
ed3418c046
Refactor replication resync to be an active process (#14266)
When resync is triggered, walk the bucket namespace and
resync objects that are unreplicated. This PR also adds
an API to report resync progress.
2022-02-10 10:16:52 -08:00
Poorna
295730408b
Disallow delete replication for tag based rules (#14167) 2022-01-24 15:22:20 -08:00
Harshavardhana
0e3037631f
skip inconsistent shards if possible (#13945)
Data shards were wrong due to a healing bug
reported in #13803, mainly with unaligned object
sizes.

This PR is an attempt to automatically avoid
these shards, using the available information about
the `xl.meta` and the actual disk mtime.
2021-12-21 10:08:26 -08:00
Harshavardhana
92fdcafb66
add verification tests for ETag on replicated content (#13857) 2021-12-07 10:08:26 -08:00
Harshavardhana
5acc8c0134
add multi-site replication tests (#13631) 2021-11-10 18:18:09 -08:00
Harshavardhana
3c1220adca add tests for default governance replication 2021-10-30 08:57:59 -07:00
Harshavardhana
2af5445309 update 3-site replication tests 2021-10-29 22:09:55 -07:00
Poorna Krishnamoorthy
7f6ed35347
Allow null versions to be replicated (#13310)
for pre-existing objects present in a bucket
prior to enabling existing object replication.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-09-28 10:26:12 -07:00
Poorna Krishnamoorthy
0b55a0423e
fix: cache usage deserialization from v5 to v6 (#13258) 2021-09-21 09:01:51 -07:00
Harshavardhana
f492f72154
add 3site replication script (#13256) 2021-09-20 18:24:24 -07:00
Poorna Krishnamoorthy
c4373ef290
Add support for multi site replication (#12880) 2021-09-18 13:31:35 -07:00
Harshavardhana
c13cbc64d1 fix multiple typos in documentation 2021-08-04 14:15:45 -07:00
Poorna Krishnamoorthy
92e4c8be10
Update replication docs to fix mc reference (#12490)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-06-13 11:37:22 -07:00
Poorna Krishnamoorthy
f2a3872301
Update design.md for replication (#12486)
Fixes #12483
2021-06-10 16:05:39 -07:00
Poorna Krishnamoorthy
dbea8d2ee0
Add support for existing object replication. (#12109)
Also adding an API to allow resyncing replication when
existing object replication is enabled and the remote target
is entirely lost. With the `mc replicate reset` command, the
objects that are eligible for replication as per the replication
config will be resynced to target if existing object replication
is enabled on the rule.
2021-06-01 19:59:11 -07:00
Harshavardhana
41e9c6572f fix: docs links use non-raw links for markdown 2021-05-22 10:52:47 -07:00
Harshavardhana
df4914b6f3 fix: update docs, fix wording and links 2021-05-21 12:36:03 -07:00
Poorna Krishnamoorthy
a27339826c
Fix replication README.md docs (#12330)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-05-20 08:17:14 -07:00
Poorna Krishnamoorthy
951acf561c
Add support for syncing replica modifications (#11104)
when bidirectional replication is set up.

If ReplicaModifications is enabled in the replication
configuration, sync metadata updates to source if
replication rules are met. By default, if this
configuration is unset, MinIO automatically syncs
metadata updates on replica back to the source.
2021-05-13 19:20:45 -07:00
Poorna Krishnamoorthy
28f0ded1a4
Update replication design.md for sync mode (#12100) 2021-04-20 17:31:36 -07:00
Ravind Kumar
ca9b48b3b4
Update Replication README to point at new docs (#12069)
This is a minor change to call out the new documentation and warn 
users to change their bookmarks. Once we are ready to set up 
some redirects, we can remove this page from Gluegun TOC.
2021-04-15 16:32:44 -07:00
Poorna Krishnamoorthy
2899cc92b4
Update replication docs for required permission (#12010) 2021-04-07 15:56:02 -07:00
Harshavardhana
9171d6ef65
rename all references from crawl -> scanner (#11621) 2021-02-26 15:11:42 -08:00
Poorna Krishnamoorthy
8e8a792d9d
Allow delete marker replication from replica (#11566)
in the case of active-active replication.

This PR also has the following changes:

- add docs on replication design
- fix corner case of completing versioned delete on a delete marker
  when the target is down and `mc rm --vid` is performed repeatedly. Instead
  the version should still be retained in the `PENDING|FAILED` state until
  replication sync completes.
- remove `s3:Replication:OperationCompletedReplication` and
  `s3:Replication:OperationFailedReplication` from ObjectCreated
  event types
2021-02-18 00:33:51 -08:00
Poorna Krishnamoorthy
7090bcc8e0
fix: doc links and delete replication permissions enforcement (#11285) 2021-01-15 15:22:55 -08:00
Harshavardhana
cc2d887e0e fix: whitespace and formatting in replication docs 2021-01-14 22:58:53 -08:00
Poorna Krishnamoorthy
c1b4b24236
Update replication docs (#11279) 2021-01-15 10:22:57 +05:30
Poorna Krishnamoorthy
feaf8dfb9a
Fix replication status reported on completion (#11273)
Fixes: #11272
2021-01-13 11:52:28 -08:00
Poorna Krishnamoorthy
b97d53b29c
fix remote target healthcheck (#11267) 2021-01-12 20:48:04 -08:00
Poorna Krishnamoorthy
7824e19d20
Allow synchronous replication if enabled. (#11165)
Synchronous replication can be enabled by setting the --sync
flag while adding a remote replication target.

This PR also adds proxying on GET/HEAD to another node in an
active-active replication setup in the event of a 404 on the current node.
2021-01-11 22:36:51 -08:00
Poorna Krishnamoorthy
39f3d5493b
Show Delete replication status header (#10946)
The X-Minio-Replication-Delete-Status header shows the
status of the replication of a permanent delete of a version.
All GETs are disallowed and return 405 on this object version.

In the case of replicating delete markers, the
X-Minio-Replication-DeleteMarker-Status header shows the status
of replication, and GETs would similarly return 405.

Additionally, this PR adds reporting of delete marker event completion
and updates the documentation.
2020-11-21 23:48:50 -08:00
Harshavardhana
9a34fd5c4a Revert "Revert "Add delete marker replication support (#10396)""
This reverts commit 267d7bf0a9.
2020-11-19 18:43:58 -08:00
Harshavardhana
267d7bf0a9 Revert "Add delete marker replication support (#10396)"
This reverts commit 50c10a5087.

PR is moved to origin/dev branch
2020-11-12 11:43:14 -08:00
Anton Melser
2c1e37197b
fix: bad example json for policy in replication docs (#10869) 2020-11-10 17:49:49 -08:00
Poorna Krishnamoorthy
50c10a5087
Add delete marker replication support (#10396)
Delete marker replication is implemented for V2
configuration specified in AWS spec (though AWS
allows it only in the V1 configuration).

This PR also brings in a MinIO-only extension of
replicating permanent deletes, i.e. deletes specifying
a version ID are replicated to the target cluster.
2020-11-10 15:24:14 -08:00
poornas
e6ab4db6b8
Fix minimum replication workers started (#10560)
This PR also fixes GetReplicationConfiguration permission
in web-handlers.go to use bucket as resource
2020-09-24 12:25:41 -07:00
poornas
a4006e23a0
Update replication docs to clarify permissions (#10536)
Co-authored-by: Klaus Post <klauspost@gmail.com>
2020-09-22 11:58:04 -07:00
poornas
73a6b4ea11
fix typo in replication docs (#10366) 2020-08-27 12:54:23 -07:00
Ritesh H Shukla
8049184dcc
fix: documentation changes in replication docs (#10209) 2020-08-07 13:30:52 -07:00
poornas
adcaa6f9de
fix: Change ListBucketTargets handler (#10217)
to list all targets across a tenant.
Also fixing some validations.
2020-08-06 17:10:21 -07:00
poornas
121164db56
fix: relax some replication validations (#10210)
Also inherit storage class from source object
if replication configuration does not have a storage
class specified for destination bucket.
2020-08-05 20:01:20 -07:00
poornas
3acc0ebb81
fix: Change service name in Arn for replication (#10205) 2020-08-05 00:43:18 -07:00
poornas
a8dd7b3eda
Refactor replication target management. (#10154)
Generalize replication target management so
that remote targets for a bucket can be
managed with ARNs. `mc admin bucket remote`
command will be used to manage targets.
2020-07-30 19:55:22 -07:00
Harshavardhana
0b5d1bc91d
fix: bucket replication docs (#10104)
* fix: bucket replication docs

* Update docs/bucket/replication/README.md

Co-authored-by: kannappanr <30541348+kannappanr@users.noreply.github.com>
2020-07-21 22:19:30 -07:00
poornas
c43da3005a
Add support for server side bucket replication (#9882) 2020-07-21 17:49:56 -07:00