We need to make sure if we cannot read bucket metadata
for some reason, and bucket metadata is not missing and
returning corrupted information we should panic such
handlers to disallow I/O to protect the overall state
on the system.
In-case of such corruption we have a mechanism now
to force recreate the metadata on the bucket, using
`x-minio-force-create` header with `PUT /bucket` API
call.
Additionally fix the versioning config updated state
to be set properly for the site replication healing
to trigger correctly.
Spark/Hadoop workloads which use Hadoop MR
Committer v1/v2 algorithm upload objects to a
temporary prefix in a bucket. These objects are
'renamed' to a different prefix on Job commit.
Object storage admins are forced to configure
separate ILM policies to expire these objects
and their versions to reclaim space.
Our solution:
This can be avoided by simply marking objects
under these prefixes to be excluded from versioning,
as shown below. Consequently, these objects are
excluded from replication, and don't require ILM
policies to prune unnecessary versions.
- MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Status>Enabled</Status>
<ExcludeFolders>true</ExcludeFolders>
<ExcludedPrefixes>
<Prefix>app1-jobs/*/_temporary/</Prefix>
</ExcludedPrefixes>
<ExcludedPrefixes>
<Prefix>app2-jobs/*/__magic/</Prefix>
</ExcludedPrefixes>
<!-- .. up to 10 prefixes in all -->
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket
from versioning. This is required to prevent the parent
folders from accumulating delete markers, especially
those which are shared across spark workloads
spanning projects/teams.
- To enable version exclusion on a list of prefixes
```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/_magic," --exclude-prefix-marker myminio/test
```
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`
- Implement a new xl.json 2.0.0 format to support,
this moves the entire marshaling logic to POSIX
layer, top layer always consumes a common FileInfo
construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes#2111