minio/docs
Krishnan Parthasarathi ad8e611098
feat: implement prefix-level versioning exclusion (#14828)
Spark/Hadoop workloads that use the Hadoop MR 
Committer v1/v2 algorithms upload objects to a 
temporary prefix in a bucket. These objects are 
'renamed' to a different prefix on job commit. 
On a versioned bucket, object storage admins are 
forced to configure separate ILM policies to expire 
these objects and their versions to reclaim space.

Our solution:

This can be avoided by simply marking objects 
under these prefixes to be excluded from versioning, 
as shown below. Consequently, these objects are 
excluded from replication, and don't require ILM 
policies to prune unnecessary versions.

- MinIO extension to the bucket versioning configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> 
        <Status>Enabled</Status>
        <ExcludeFolders>true</ExcludeFolders>
        <ExcludedPrefixes>
          <Prefix>app1-jobs/*/_temporary/</Prefix>
        </ExcludedPrefixes>
        <ExcludedPrefixes>
          <Prefix>app2-jobs/*/__magic/</Prefix>
        </ExcludedPrefixes>

        <!-- .. up to 10 prefixes in all -->     
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket 
from versioning. This is required to prevent parent 
folders from accumulating delete markers, especially 
those that are shared across Spark workloads 
spanning projects/teams.
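
To illustrate which keys a configuration like the one above would skip, here is a minimal Python sketch. It rests on two assumptions that the text does not spell out: each excluded prefix is treated as a wildcard pattern with an implicit trailing `*`, and a "folder" is an object whose key ends in `/`. The names `is_versioned`, `EXCLUDED_PREFIXES` and `EXCLUDE_FOLDERS` are hypothetical and exist only for this illustration; this is not MinIO's implementation.

```python
from fnmatch import fnmatchcase

# Prefix patterns copied from the XML configuration above.
EXCLUDED_PREFIXES = ["app1-jobs/*/_temporary/", "app2-jobs/*/__magic/"]
EXCLUDE_FOLDERS = True


def is_versioned(key: str) -> bool:
    """Return False when the key would be excluded from versioning (assumed semantics)."""
    # Assumption: a "folder" is an object whose key ends with "/".
    if EXCLUDE_FOLDERS and key.endswith("/"):
        return False
    # Assumption: a pattern covers every key that starts with it,
    # with "*" matching any run of characters (including "/").
    return not any(fnmatchcase(key, pattern + "*") for pattern in EXCLUDED_PREFIXES)


# A committer's temporary object, a committed object, and a folder object.
print(is_versioned("app1-jobs/job-42/_temporary/part-0000"))
print(is_versioned("app1-jobs/job-42/part-0000"))
print(is_versioned("app1-jobs/job-42/"))
```

Under these assumptions only the committed object `app1-jobs/job-42/part-0000` keeps its versions; the temporary object and the folder object are skipped.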

- To enable version exclusion on a list of prefixes

```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/__magic/" --exclude-prefix-marker myminio/test
```
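
As a rough illustration of how a comma-separated prefix list could correspond to the XML extension shown earlier, the Python sketch below builds such a document with the standard library. The one-to-one mapping from the list to `ExcludedPrefixes` elements is an assumption, and `build_versioning_config` and `MAX_PREFIXES` are hypothetical names; only the element names and the limit of 10 prefixes come from the XML above.

```python
import xml.etree.ElementTree as ET

S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"
MAX_PREFIXES = 10  # limit taken from the comment in the XML example


def build_versioning_config(prefix_list: str, exclude_folders: bool) -> str:
    """Build a VersioningConfiguration document from a comma-separated prefix list."""
    # Split on commas and drop empty entries (for example from a trailing comma).
    prefixes = [p.strip() for p in prefix_list.split(",") if p.strip()]
    if len(prefixes) > MAX_PREFIXES:
        raise ValueError(f"at most {MAX_PREFIXES} excluded prefixes are allowed")

    root = ET.Element("VersioningConfiguration", xmlns=S3_NS)
    ET.SubElement(root, "Status").text = "Enabled"
    ET.SubElement(root, "ExcludeFolders").text = "true" if exclude_folders else "false"
    # One ExcludedPrefixes wrapper per prefix, mirroring the XML example.
    for prefix in prefixes:
        wrapper = ET.SubElement(root, "ExcludedPrefixes")
        ET.SubElement(wrapper, "Prefix").text = prefix
    return ET.tostring(root, encoding="unicode")


print(build_versioning_config("app1-jobs/*/_temporary/,app2-jobs/*/__magic/", True))
```

Rejecting oversized lists before building the document keeps the limit check in one place, rather than relying on the server to refuse the request.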
2022-05-06 19:05:28 -07:00
..
bigdata cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
bucket feat: implement prefix-level versioning exclusion (#14828) 2022-05-06 19:05:28 -07:00
chroot cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
compression Remove Azure gateway implementation (#14418) 2022-04-29 12:51:23 -07:00
config cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
debugging rename 'mc admin inspect' to 'mc support inspect' 2022-02-24 17:17:53 -08:00
deployment/kernel-tuning fix sysctl.sh quotes which are incompatible with sysctl (#10446) 2020-09-09 17:29:23 -07:00
disk-caching Remove Azure gateway implementation (#14418) 2022-04-29 12:51:23 -07:00
distributed docs: turn-on more markdown rules and fix them (#14301) 2022-02-14 08:50:42 -08:00
docker doc: add console-address on all example (#14307) 2022-02-15 09:26:04 -08:00
erasure cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
extensions/s3zip cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
federation/lookup docs: turn-on more markdown rules and fix them (#14301) 2022-02-14 08:50:42 -08:00
gateway Remove Azure gateway implementation (#14418) 2022-04-29 12:51:23 -07:00
integrations/veeam cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
kms kes: add support for encrypted private keys (#14650) 2022-03-29 09:53:33 -07:00
logging cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
metrics doc: typo fix for ttfb entry in table (#14647) 2022-03-29 09:42:02 -07:00
multi-tenancy cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
multi-user cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
orchestration Update yaml files to latest version RELEASE.2022-05-04T07-45-27Z 2022-05-04 08:54:16 +00:00
screenshots feat: Deprecate embedded browser and import console (#12460) 2021-06-17 20:27:04 -07:00
security cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
select docs: turn-on more markdown rules and fix them (#14301) 2022-02-14 08:50:42 -08:00
shared-backend docs: turn-on more markdown rules and fix them (#14301) 2022-02-14 08:50:42 -08:00
site-replication Add support for multiple OpenID providers with role policies (#14223) 2022-04-28 18:27:09 -07:00
sts Add OPA doc and remove deprecation marking (#14863) 2022-05-04 23:53:42 -07:00
throttle cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00
tls docs: turn-on more markdown rules and fix them (#14301) 2022-02-14 08:50:42 -08:00
LICENSE purge deprecate docker swarm documentation 2021-05-10 09:50:06 -07:00
hotfixes.md update hotfixes instructions and fix some typo 2022-03-25 23:49:28 -07:00
minio-limits.md cleanup markdown docs across multiple files (#14296) 2022-02-11 16:51:25 -08:00