fix: multiple fixes in prefix exclude implementation (#14877)

- do not need to restrict prefix exclusions that do not
  have `/` as suffix, relax this requirement as spark may
  have staging folders with other autogenerated characters
  , so we are better off doing full prefix March and skip. 

- multiple delete objects was incorrectly creating a
  null delete marker on a versioned bucket instead of
  creating a proper versioned delete marker.

- do not suspend paths on the excluded prefixes during
  delete operations to avoid creating `null` delete markers,
  honor suspension of versioning only at bucket level for
  delete markers.
This commit is contained in:
Harshavardhana
2022-05-07 22:06:44 -07:00
committed by GitHub
parent def75ffcfe
commit 5cffd3780a
9 changed files with 25 additions and 48 deletions

View File

@@ -65,6 +65,8 @@ Similarly to suspend versioning set the configuration with Status set to `Suspen
## MinIO extension to Bucket Versioning
### Motivation
**PLEASE READ: This feature is meant for advanced usecases only where the setup is using bucket versioning or with replicated buckets, use this feature to optimize versioning behavior for some specific applications. MinIO experts will evaluate and guide on the benefits for your application, please reach out to us on https://subnet.min.io.**
Spark/Hadoop workloads which use Hadoop MR Committer v1/v2 algorithm upload objects to a temporary prefix in a bucket. These objects are 'renamed' to a different prefix on Job commit. Object storage admins are forced to configure separate ILM policies to expire these objects and their versions to reclaim space.
### Solution
@@ -76,19 +78,23 @@ To exclude objects under a list of prefix (glob) patterns from being versioned,
<ExcludeFolders>true</ExcludeFolders>
<ExcludedPrefixes>
<Prefix>app1-jobs/*/_temporary/</Prefix>
<Prefix>*/_temporary</Prefix>
</ExcludedPrefixes>
<ExcludedPrefixes>
<Prefix>app2-jobs/*/_magic/</Prefix>
<Prefix>*/__magic</Prefix>
</ExcludedPrefixes>
<ExcludedPrefixes>
<Prefix>*/_staging</Prefix>
</ExcludedPrefixes>
<!-- .. up to 10 prefixes in all -->
</VersioningConfiguration>
```
Note: Objects matching these prefixes will behave as though versioning were suspended. These objects **will not** be replicated if replication is configured.
Only users with explicit permissions or the root credential can configure the versioning state of any bucket.
### Features
- Objects matching these prefixes will behave as though versioning were suspended. These objects **will not** be replicated if bucket has replication configured.
- Objects matching these prefixes will also not leave delete markers, dramatically reduces namespace pollution while keeping the benefits of replication.
- Users with explicit permissions or the root credential can configure the versioning state of any bucket.
## Examples of enabling bucket versioning using MinIO Java SDK