Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner.
Also removes a tiny bit of CPU by each write operation.
Move to using `xl.meta` data structure to keep temporary partInfo,
this allows for a future change where we move to different parts to
different drives.
PUT shall only proceed if pre-conditions are met, the new
code uses
- x-minio-source-mtime
- x-minio-source-etag
to verify if the object indeed needs to be replicated
or not, allowing us to avoid StatObject() call.
This PR is a continuation of the previous change instead
of returning an error, instead trigger a spot heal on the
'xl.meta' and return only after the healing is complete.
This allows for future GETs on the same resource to be
consistent for any version of the object.
xl.meta gets written and never rolled back, however
we definitely need to validate the state that is
persisted on the disk, if there are inconsistencies
- more than write quorum we should return an error
to the client
- if write quorum was achieved however there are
inconsistent xl.meta's we should simply trigger
an MRF on them
The bottom line is delete markers are a nuisance,
most applications are not version aware and this
has simply complicated the version management.
AWS S3 gave an unnecessary complication overhead
for customers, they need to now manage these
markers by applying ILM settings and clean
them up on a regular basis.
To make matters worse all these delete markers
get replicated as well in a replicated setup,
requiring two ILM settings on each site.
This PR is an attempt to address this inferior
implementation by deviating MinIO towards an
idempotent delete marker implementation i.e
MinIO will never create any more than single
consecutive delete markers.
This significantly reduces operational overhead
by making versioning more useful for real data.
This is an S3 spec deviation for pragmatic reasons.
The path is marked dirty automatically when healObject() is called, which is
wrong. HealObject() is called during self-healing and this will lead to
an increase in the false positive result of the bloom filter.
Also move NSUpdated() from renameData() and call it directly in
CompleteMultipart and PutObject, this is not a functional change but
it will make it less prone to errors in the future.
Add up to 256 bytes of padding for compressed+encrypted files.
This will obscure the obvious cases of extremely compressible content
and leave a similar output size for a very wide variety of inputs.
This does *not* mean the compression ratio doesn't leak information
about the content, but the outcome space is much smaller,
so often *less* information is leaked.
We need to make sure if we cannot read bucket metadata
for some reason, and bucket metadata is not missing and
returning corrupted information we should panic such
handlers to disallow I/O to protect the overall state
on the system.
In-case of such corruption we have a mechanism now
to force recreate the metadata on the bucket, using
`x-minio-force-create` header with `PUT /bucket` API
call.
Additionally fix the versioning config updated state
to be set properly for the site replication healing
to trigger correctly.
readAllXL would return inlined data for outdated disks
causing "read" to return incorrect content to the client,
this PR fixes this behavior by making sure we skip such
outdated disks appropriately based on the latest ModTime
on the disk.