Fixes#7458Fixes#7573Fixes#7938Fixes#6934Fixes#6265Fixes#6630
This will allow the cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
- All cached content is automatically bitrot protected.
- Avoid ETag verification if a cache-control header
is set and the cached content is still valid.
- This PR changes the cache backend format, and all existing
content will be migrated to the new format. Until the data is
migrated completely, all content will be served from the backend.
When MinIO is behind a proxy, proxies end up killing
clients when no data is seen on the connection, adding
the right content-type ensures that proxies do not come
in the way.
This allows for canonicalization of the strings
throughout our code and provides a common space
for all these constants to reside.
This list is rather non-exhaustive but captures
all the headers used in AWS S3 API operations
This commit relaxes the restriction that the MinIO gateway
does not accept SSE-KMS headers. Now, the S3 gateway allows
SSE-KMS headers for PUT and MULTIPART PUT requests and forwards them
to the S3 gateway backend (AWS). This is considered SSE pass-through
mode.
Fixes#7753
This will allow cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
Fixes: #7458, #7573, #6265, #6630
When size is unknown and auto encryption is enabled,
and compression is set to true, putobject API is failing.
Moving adding the SSE-S3 header as part of the request to before
checking if compression can be done, otherwise the size is set to -1
and that seems to cause problems.
Most hadoop distributions hortonworks, cloudera all
depend on aws-sdk-java 1.7.x to 1.10.x - the releases
which have bugs related case sensitive check for
ETag header. Go changes the case of the headers set
to be canonical but only preserves them when set
through a direct map.
This fixes most compatibility issues we have had
in the past supporting older hadoop distributions.
CopyObject precondition checks into GetObjectReader
in order to perform SSE-C pre-condition checks using the
last 32 bytes of encrypted ETag rather than the decrypted
ETag
This also necessitates moving precondition checks for
gateways to gateway layer rather than object handler check
Prevents deferred close functions from being called while still
attempting to copy reader to snappyWriter.
Reduces code duplication when compressing objects.
Clients like AWS SDK Java and AWS cli XML parsers are
unable to handle on `\r\n` characters to avoid these
errors send XML header first and write white space characters
instead.
Also handle cases to avoid double WriteHeader calls
Currently, we were sending errors in Select binary format,
which is incompatible with AWS S3 behavior, errors in binary
are sent after HTTP status code is already 200 OK - i.e it
happens during the evaluation of the record reader.
Different gateway implementations due to different backend
API errors, might return different unsupported errors at
our handler layer. Current code posed a problem for us because
this information was lost and we would convert it to InternalError
in this situation all S3 clients end up retrying the request.
To avoid this unexpected situation implement a way to support
this cleanly such that the underlying information is not lost
which is returned by gateway.
This PR also adds some comments and simplifies
the code. Primary handling is done to ensure
that we make sure to honor cached buffer.
Added unit tests as well
Fixes#7141
- New parser written from scratch, allows easier and complete parsing
of the full S3 Select SQL syntax. Parser definition is directly
provided by the AST defined for the SQL grammar.
- Bring support to parse and interpret SQL involving JSON path
expressions; evaluation of JSON path expressions will be
subsequently added.
- Bring automatic type inference and conversion for untyped
values (e.g. CSV data).
This PR supports iam and bucket policies to have
policy variable replacements in resource and
condition key values.
For example
- ${aws:username}
- ${aws:userid}
When auto-encryption is turned on, we pro-actively add SSEHeader
for all PUT, POST operations. This is unusual for V2 signature
calculation because V2 signature doesn't have a pre-defined set
of signed headers in the request like V4 signature. According to
V2 we should canonicalize all incoming supported HTTP headers.
Make sure to validate signatures before we mutate http headers
Before this change the CopyObjectHandler and the CopyObjectPartHandler
both looked for a `versionId` parameter on the `X-Amz-Copy-Source` URL
for the version of the object to be copied on the URL unescaped version
of the header. This meant that files that had question marks in were
truncated after the question mark so that files with `?` in their
names could not be server side copied.
After this change the URL unescaping is done during the parsing of the
`versionId` parameter which fixes the problem.
This change also introduces the same logic for the
`X-Amz-Copy-Source-Version-Id` header field which was previously
ignored, namely returning an error if it is present and not `null`
since minio does not currently support versions.
S3 Docs:
- https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html
- https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html
When source is encrypted multipart object and the parts are not
evenly divisible by DARE package block size, target encrypted size
will not necessarily be the same as encrypted source object.
This PR adds pass-through, single encryption at gateway and double
encryption support (gateway encryption with pass through of SSE
headers to backend).
If KMS is set up (either with Vault as KMS or using
MINIO_SSE_MASTER_KEY),gateway will automatically perform
single encryption. If MINIO_GATEWAY_SSE is set up in addition to
Vault KMS, double encryption is performed.When neither KMS nor
MINIO_GATEWAY_SSE is set, do a pass through to backend.
When double encryption is specified, MINIO_GATEWAY_SSE can be set to
"C" for SSE-C encryption at gateway and backend, "S3" for SSE-S3
encryption at gateway/backend or both to support more than one option.
Fixes#6323, #6696
minio-java tests were failing under multiple places when
auto encryption was turned on, handle all the cases properly
This PR fixes
- CopyObject should decrypt ETag before it does if-match
- CopyObject should not try to preserve metadata of source
when rotating keys, unless explicitly asked by the user.
- We should not try to decrypt Compressed object etag, the
potential case was if user sets encryption headers along
with compression enabled.
This commit adds an auto-encryption feature which allows
the Minio operator to ensure that uploaded objects are
always encrypted.
This change adds the `autoEncryption` configuration option
as part of the KMS conifguration and the ENV. variable
`MINIO_SSE_AUTO_ENCRYPTION:{on,off}`.
It also updates the KMS documentation according to the
changes.
Fixes#6502
This PR implements one of the pending items in issue #6286
in S3 API a user can request CSV output for a JSON document
and a JSON output for a CSV document. This PR refactors
the code a little bit to bring this feature.