This commit allows the MinIO server to parse the metadata if:
- either the `X-Minio-Internal-Server-Side-Encryption-S3-Key-Id`
and the `X-Minio-Internal-Server-Side-Encryption-S3-Kms-Sealed-Key`
entries are present.
- or *both* headers are not present.
This is in service to support a K/V data key store.
In commit a8296445ad we changed the code to handle
some corner cases on ARM and other platforms, this
PR just avoids the return for unknown filetypes
prematurely and let the name be populated appropriately.
This fixes bug for older XFS implementations such as
in Ubuntu 14.04
This commit fixes a subtle bug that (probably)
caused an issue affecting encrypted multipart objects.
When a cluster has no quorum this bug causes `ListObjectParts`
to return nil as error instead of a quorum error.
Thanks to @harshavardhana for detecting this.
This PR changes cache on PUT behavior to background fill the cache
after PutObject completes. This will avoid concurrency issues as in #8219.
Added cleanup of partially filled cache to prevent cache corruption
- Fixes#8208
VerifyFile in the distributed setup does not work with
the non streaming highway hash. The reason is that the
internode mux router did not expect `storageRESTBitrotHash`
parameter.
endpoint.IsLocal will not have .Host entries so
using them to skip double entries will never work.
change the code such that we look for endpoint.Host
outside of endpoint.IsLocal logic to skip double
hosts appropriately.
Move these functions to their appropriate file.
- ListObjectsHeal should list only objects
which need healing, not the entire namespace.
- DeleteObjects() to be used to delete 1000s of
objects in bulk instead of serially.
Add LDAP based users-groups system
This change adds support to integrate an LDAP server for user
authentication. This works via a custom STS API for LDAP. Each user
accessing the MinIO who can be authenticated via LDAP receives
temporary credentials to access the MinIO server.
LDAP is enabled only over TLS.
User groups are also supported via LDAP. The administrator may
configure an LDAP search query to find the group attribute of a user -
this may correspond to any attribute in the LDAP tree (that the user
has access to view). One or more groups may be returned by such a
query.
A group is mapped to an IAM policy in the usual way, and the server
enforces a policy corresponding to all the groups and the user's own
mapped policy.
When LDAP is configured, the internal MinIO users system is disabled.
It looks like from implementation point of view fastjson
parser pool doesn't behave the same way as expected
when dealing many `xl.json` from multiple disks.
The fastjson parser pool usage ends up returning incorrect
xl.json entries for checksums, with references pointing
to older entries. This led to the subtle bug where checksum
info is duplicated from a previous xl.json read of a different
file from different disk.
With this PR, liveness check responds with 200 OK with "server-not-
initialized" header while objectLayer gets initialized. The header
is removed as objectLayer is initialized. This is to allow
MinIO distributed cluster to get started when running on an
orchestration platforms like Docker Swarm.
This PR also updates sample Swarm yaml files to use correct values
for healthcheck fields.
Fixes#8140
This commit adds an admin API route and handler for
requesting status information about a KMS key.
Therefore, the client specifies the KMS key ID (when
empty / not set the server takes the currently configured
default key-ID) and the server tries to perform a dummy encryption,
re-wrap and decryption operation. If all three succeed we know that
the server can access the KMS and has permissions to generate, re-wrap
and decrypt data keys (policy is set correctly).
This is a defensive change to avoid any future issues,
from this part of the code. New change also ensures
to populate UUID if present for the right disk.
This commit fixes the web ZIP download handler for
encrypted objects. The decryption logic has moved into
`getObjectNInfo`. So trying to decrypt the (already decrypted)
content again in the ZIP handler obviously causes an error.
This commit fixes this by removing the decryption logic from the
the handler.
Fixes#7965
The underlying errors are important, for IAM
requirements and should wait appropriately at
the caller level, this allows for distributed
setups to run properly and not fail prematurely
during startup.
Also additionally fix the onlineDisk counting
The change now is to ensure that we take custom URL as
well for updating the deployment, this is required for
hotfix deliveries for certain deployments - other than
the community release.
This commit changes the previous work d65a2c6725
with newer set of requirements.
Also deprecates PeerUptime()
net.Error is very unreliable in providing better error
handling, we need to ensure that we always have a fallback
option in case of network failures.
This fixes an important issue in our distributed server
setups when one of the servers is down, all deployments
out there are recommended to upgrade after this fix is
merged to ensure that availability is not lost.
Fixes#8127Fixes#8016Fixes#7964
This API implementation simply behaves like listObjects()
but returns back single version for each object, this
implementation should be considered dummy it is only
meant for some applications which rely on this.
This is to avoid using unsafe.Pointer type
code dependency for MinIO, this causes
crashes on ARM64 platforms
Refer #8005 collection of runtime crashes due
to unsafe.Pointer usage incorrectly. We have
seen issues like this before when using
jsoniter library in the past.
This PR hopes to fix this using fastjson
Add better dynamic timeouts for locks, also
add jitters before launching daily sweep to ensure
that not all the servers in distributed setup
are not trying to hold locks to begin the sweep
round.
Also, add enough delay for incoming requests based
on totalSetCount*totalDriveCount.
A possible fix for #8071
It is observed that when `mc admin trace` is being
used due to ResponseWriter wrapper, we loose information
about statusCode,statusText for audit logging.
This PR fixes this behavior
This avoids a network call, also fixes an issue
when empty paths are passed the underlying call
fails with "405 Method Not Allowed".
This is reproducible when you are deleting a
non-existent object.
Fixes#8083
Add API to set policy mapping for a user or group
Contains a breaking Admin APIs change.
- Also enforce all applicable policies
- Removes the previous /set-user-policy API
Bump up peerRESTVersion
Add get user info API to show groups of a user
* Cleanup ui-errors and print proper error messages
Change HELP to HINT instead, handle more error
cases when starting up MinIO. One such is related
to #8048
* Apply suggestions from code review
When a peer client which higher version sends a request to a peer
server with lower version, the returned status code is 200 OK instead
of 405 code. The reason is that the peer client request reaches the
browser handler, which registers itself by '/minio' route but without
any other constraints. Adding filtering by user agent header to the
browser route so internal requests to old endpoints versions return
405 error code.
This is a behavior change from AWS S3, but it is done with
better judgment on our end to allow the listing of buckets only
which user has access to.
The advantage is this declutters the UI for users and only
lists bucket which they have access to.
Precursor for this feature to be applicable is a policy
must have the following actions
```
s3:ListAllMyBuckets
```
and
```
s3:ListBucket
```
enabled in the policy.
Fixes#7458Fixes#7573Fixes#7938Fixes#6934Fixes#6265Fixes#6630
This will allow the cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
- All cached content is automatically bitrot protected.
- Avoid ETag verification if a cache-control header
is set and the cached content is still valid.
- This PR changes the cache backend format, and all existing
content will be migrated to the new format. Until the data is
migrated completely, all content will be served from the backend.
Refactor the Dirent parsing code such that when we
calculate offsets are correct based on the platform
This PR fixes a silent potential crash on ARM
architecture.
This commit fixes a DoS issue that is caused by an incorrect
SHA-256 content verification during STS requests.
Before that fix clients could write arbitrary many bytes
to the server memory. This commit fixes this by limiting the
request body size.
This change adds admin APIs and IAM subsystem APIs to:
- add or remove members to a group (group addition and deletion is
implicit on add and remove)
- enable/disable a group
- list and fetch group info
When checking if federation is necessary, the code compares
the SRV record stored in etcd against the list of endpoints
that the MinIO server is exposing. If there is an intersection
in this list the request is forwarded.
The SRV record includes both the host and the port, but the
intersection check previously only looked at the IP address. This
would prevent federation from working in situations where the endpoint
IP is the same for multiple MinIO servers. Some examples of where this
can occur are:
- running mulitiple copies of MinIO on the same host
- using multiple MinIO servers behind a NAT with port-forwarding
This commit adds a new method `UpdateKey` to the KMS
interface.
The purpose of `UpdateKey` is to re-wrap an encrypted
data key (the key generated & encrypted with a master key by e.g.
Vault).
For example, consider Vault with a master key ID: `master-key-1`
and an encrypted data key `E(dk)` for a particular object. The
data key `dk` has been generated randomly when the object was created.
Now, the KMS operator may "rotate" the master key `master-key-1`.
However, the KMS cannot forget the "old" value of that master key
since there is still an object that requires `dk`, and therefore,
the `D(E(dk))`.
With the `UpdateKey` method call MinIO can ask the KMS to decrypt
`E(dk)` with the old key (internally) and re-encrypted `dk` with
the new master key value: `E'(dk)`.
However, this operation only works for the same master key ID.
When rotating the data key (replacing it with a new one) then
we perform a `UnsealKey` operation with the 1st master key ID
and then a `GenerateKey` operation with the 2nd master key ID.
This commit also updates the KMS documentation and removes
the `encrypt` policy entry (we don't use `encrypt`) and
add a policy entry for `rewarp`.
After some extensive refactors, it turned out empty directories
are not healed and heal status is also not reported correctly.
This commit fixes it and adds the appropriate unit tests
This PR fixes relying on r.Context().Done()
by setting
```
Connection: "close"
```
HTTP Header, this has detrimental issues for
client side connection pooling. Since this
header explicitly tells clients to turn-off
connection pooling. This causing pro-active
connections to be closed leaving many conn's
in TIME_WAIT state. This can be observed with
`mc admin trace -a` when running distributed
setup.
This PR also fixes tracing filtering issue
when bucket names have `minio` as prefixes,
trace was erroneously ignoring them.
Golang proactively prints this error
`http: proxy error: context canceled`
when a request arrived to the current deployment and
redirected to another deployment in a federated setup.
Since this error can confuse users, this commit will
just hide it.
Allow renaming/editing a notification config. By replying with
a successful GetBucketNotification response, without checking
for any missing config ARN in targetList.
Fixes#7650
Related to #7982, this PR refactors the code
such that we validate the OPA or JWKS in a
common place.
This is also a refactor which is already done
in the new config migration change. Attempt
to avoid any network I/O during Unmarshal of
JSON from disk, instead do it later when
updating the in-memory data structure.
In situations such as when client uploading data,
prematurely disconnects from server such as pressing
ctrl-c before uploading all the data. Under this
situation in distributed setup we prematurely
disconnect disks causing a reconnect loop. This has
an adverse affect we end up leaving a lot of files
in temporary location which ideally should have been
cleaned up when Put() prematurely fails.
This is also a regression which got introduced in #7610
- Policy mapping is now at `config/iam/policydb/users/myuser1.json`
and includes version.
- User identity file is now versioned.
- Migrate old data to the new format.
Calling ListMultipartUploads fails if an upload is aborted while a
part is being uploaded because the directory for the upload exists
(since fsRenameFile ends up calling os.MkdirAll) but the meta JSON file
doesn't. To fix this we make sure an upload hasn't been aborted during
PutObjectPart by checking the existence of the directory for the upload
while moving the temporary part file into it.
This PR is based off @sinhaashish's PR for object lifecycle
management, which includes support only for,
- Expiration of object
- Filter using object prefix (_not_ object tags)
N B the code for actual expiration of objects will be included in a
subsequent PR.
Current code already allows users to GetPolicy/SetPolicy
there was a missing code in ListAllBucketPolicies to allow
access, this fixes this behavior.
Fixes#7913
The fix in #7646 introduced a regression which
was left unnoticed, the fix didn't work for
sub-commands unfortunately. This fixes it
by moving v1.21.0 version of the minio/cli
package.
Fixes#7924
posix.VerifyFile() doesn't know how to check if a file
is corrupted if that file is empty. We do have the part
size in xl.json so we pass it to VerifyFile to return
an error so healing empty parts can work properly.
When MinIO is behind a proxy, proxies end up killing
clients when no data is seen on the connection, adding
the right content-type ensures that proxies do not come
in the way.
Update prompt shows some weird characters under Windows, the reason
is that go-prompt is used to show a yes/no prompt, since go-prompt
does not seem to have a way to support color/fatih, this PR will
implements its own yes/no prompt with the correct text coloration.
This allows for canonicalization of the strings
throughout our code and provides a common space
for all these constants to reside.
This list is rather non-exhaustive but captures
all the headers used in AWS S3 API operations
r.URL.Path is empty when HEAD bucket with virtual
DNS requests come in since bucket is now part of
r.Host, we should use our domain names and fetch
the right bucket/object names.
This fixes an really old issue in our federation
setups.
This API returns the information related to the self healing routine.
For the moment, it returns:
- The total number of objects that are scanned
- The last time when an item was scanned
- return all objects in web-handlers listObjects response
- added local pagination to object list ui
- also fixed infinite loader and removed unused fields
While checking for port availability, ip address should be included.
When a machine has multiple ip addresses, multiple minio instances
or some other applications can be run on same port but different
ip address.
Fixes#7685
This PR adds support for adding session policies
for further restrictions on STS credentials, useful
in situations when applications want to generate
creds for multiple interested parties with different
set of policy restrictions.
This session policy is not mandatory, but optional.
Fixes#7732
This commit relaxes the restriction that the MinIO gateway
does not accept SSE-KMS headers. Now, the S3 gateway allows
SSE-KMS headers for PUT and MULTIPART PUT requests and forwards them
to the S3 gateway backend (AWS). This is considered SSE pass-through
mode.
Fixes#7753
Previously the read/write lock applied both for gateway use cases as
well the object store use case. Nothing from sys is touched or looked
at in the gateway usecase though, so we don't need to lock. Don't lock
to make the gateway policy getting a little more efficient, particularly
as where this is called from (checkRequestAuthType) is quite common.
etcd when used in federated setups, currently
mandates that all clusters should have same
config.json, which is too restrictive and makes
federation a restrictive environment.
This change makes it apparent that each cluster
needs to be independently managed if necessary
from `mc admin info` command line.
Each cluster with in federation can have their
own root credentials and as well as separate
regions. This way buckets get further restrictions
and allows for root creds to be not common
across clusters/data centers.
Existing data in etcd gets migrated to backend
on each clusters, upon start. Once done
users can change their config entries
independently.
- Background Heal routine receives heal requests from a channel, either to
heal format, buckets or objects
- Daily sweeper lists all objects in all buckets, these objects
don't necessarly have read quorum so they can be removed if
these objects are unhealable
- Heal daily ops receives objects from the daily sweeper
and send them to the heal routine.
Inconsistencies can arise after applying bucket policies in
gateway mode, since all gateway instances do not share a
common shared state. This is by design to keep gateway as
shared nothing architecture.
This PR fixes such inconsistencies by reloading policy
if any from the backend.
Fixes#7723
Consider errors returned by httpClient.Do() as network errors. This is because
the http clients returns different types of errors and it is hard to catch
all the error types.
With these changes we are now able to peak performances
for all Write() operations across disks HDD and NVMe.
Also adds readahead for disk reads, which also increases
performance for reads by 3x.
IsTruncated should not be set to true if there is no further
possible entries beyond maxKeys.
This commit will also move wide testing on object API from xl
to xl sets.
The problem in current code was we were removing
an entry from a lock lockerMap without considering
the fact that different entry for same resource is
a possibility due the nature of locks that can be
acquired in parallel before we decide if the lock
is considered stale
A sequence of events is as follows
- Lock("resource")
- lockMaintenance(finds a long lived lock in this "resource")
- Owner node rebooted which now retruns Expired() as true for
this "resource"
- Unlock("resource") which succeeded in quorum
- Now by this time application retried and acquired a new
Lock() on the same "resource"
- Now that we have Expired() true from the previous call,
we proceed to purge the entry from the local lockMap()
local lockMap reports a different entry for the expired
UID which results in a spurious log entry.
This PR removes this logging as this situation is an
expected scenario.
This will allow cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
Fixes: #7458, #7573, #6265, #6630
This is necessary to avoid connection build up between servers
unexpectedly for example in a situation where 16 servers are
talking to each other and one server now allows a maximum of
15*4096 = 61440 idle connections
Will be kept in pool. Such a large pool is perhaps inefficient for
many reasons and also affects overall system resources.
This PR also reduces idleConnection timeout from 120 secs to 60 secs.
errs was passed to many goroutines but they are all allowed
to update errs if any error happens during deletion, which
can cause a data race.
This commit will avoid issuing bulk delete operations in parallel
to avoid the warning race.
We broke into parts previously as we had checksum for the entire file
to tax less on memory and to have better TTFB. We dont need to now,
after the streaming-bitrot change.
Bulk delete at storage level in Multiple Delete Objects API
In order to accelerate bulk delete in Multiple Delete objects API,
a new bulk delete is introduced in storage layer, which will accept
a list of objects to delete rather than only one. Consequently,
a new API is also need to be added to Object API.
PR #7595 fixed part of the regression, but did not handle the
scenario, where in docker, the internal port is different from
the port on the host.
This PR modifies the regular expression such that all the
scenarios are handled.
Fixes#7619
After recent listing refactor, recursive list doesn't return empty
directories, this commit will fix the behavior and add unit tests
so it won't happen again.
When size is unknown and auto encryption is enabled,
and compression is set to true, putobject API is failing.
Moving adding the SSE-S3 header as part of the request to before
checking if compression can be done, otherwise the size is set to -1
and that seems to cause problems.
Currently we used to reload users every five minutes,
regardless of etcd is configured or not. But with etcd
configured we can do this more asynchronously to trigger
a refresh by using the watch API
Fixes#7515
One user has seen this following error log:
API: CompleteMultipartUpload(bucket=vertica, object=perf-dss-v03/cc2/02596813aecd4e476d810148586c2a3300d00000013557ef_0.gt)
Time: 15:44:07 UTC 04/11/2019
RequestID: 159475EFF4DEDFFB
RemoteHost: 172.26.87.184
UserAgent: vertica-v9.1.1-5
Error: open /data/.minio.sys/tmp/100bb3ec-6c0d-4a37-8b36-65241050eb02/xl.json: file exists
1: cmd/xl-v1-metadata.go:448:cmd.writeXLMetadata()
2: cmd/xl-v1-metadata.go:501:cmd.writeUniqueXLMetadata.func1()
This can happen when CompleteMultipartUpload fails with write quorum,
the S3 client will retry (since write quorum is 500 http response),
however the second call of CompleteMultipartUpload will fail because
this latter doesn't truly use a random uuid under .minio.sys/tmp/
directory but pick the upload id.
This commit fixes the behavior to choose a random uuid for generating
xl.json
Since AssumeRole API was introduced we have a wrong route
match which results in certain clients failing to upload objects
using multipart because, multipart POST conflicts with STS POST
AssumeRole API.
Write a proper matcher function which verifies the route more
appropriately such that both can co-exist.
Other listing optimizations include
- remove double sorting while filtering object entries
- improve error message when upload-id is not in quorum
- use jsoniter for full unmarshal json, instead of gjson
- remove unused code
Allow server to start if one of the local nodes in docker/kubernetes setup is successfully resolved
- The rule is that we need atleast one local node to work. We dont need to resolve the
rest at that point.
- In a non-orchestrational setup, we fail if we do not have atleast one local node up
and running.
- In an orchestrational setup (docker-swarm and kubernetes), We retry with a sleep of 5
seconds until any one local node shows up.
Fixes#6995
In distributed mode, use REST API to acquire and manage locks instead
of RPC.
RPC has been completely removed from MinIO source.
Since we are moving from RPC to REST, we cannot use rolling upgrades as the
nodes that have not yet been upgraded cannot talk to the ones that have
been upgraded.
We expect all minio processes on all nodes to be stopped and then the
upgrade process to be completed.
Also force http1.1 for inter-node communication
common prefixes in bucket name if already created
are disallowed when etcd is configured due to the
prefix matching issue. Make sure that when we look
for bucket we are only interested in exact bucket
name not the prefix.
- [x] Support bucket and regular object operations
- [x] Supports Select API on HDFS
- [x] Implement multipart API support
- [x] Completion of ListObjects support
There is no written specification about how to encode key names
when url encoding type is passed.
However, this change will encode URLs as url.QueryEscape() does
while considering AWS S3 exceptions.
This commit adds a unit test for the vault
config verification (which covers also `IsEmpty()`).
Vault-related code is hard to test with unit tests
since a Vault service would be necessary. Therefore
this commit only adds tests for a fraction of the code.
Fixes#7409
Most hadoop distributions hortonworks, cloudera all
depend on aws-sdk-java 1.7.x to 1.10.x - the releases
which have bugs related case sensitive check for
ETag header. Go changes the case of the headers set
to be canonical but only preserves them when set
through a direct map.
This fixes most compatibility issues we have had
in the past supporting older hadoop distributions.
This commit fixes a privilege escalation issue against
the S3 and web handlers. An authenticated IAM user
can:
- Read from or write to the internal '.minio.sys'
bucket by simply sending a properly signed
S3 GET or PUT request. Further, the user can
- Read from or write to the internal '.minio.sys'
bucket using the 'Upload'/'Download'/'DownloadZIP'
API by sending a "browser" request authenticated
with its JWT token.
This commit fixes another privilege escalation issue
abusing the inter-node communication of distributed
servers to obtain/modify the server configuration.
The inter-node communication is authenticated using
JWT-Tokens. Further, IAM users accessing the cluster
via the web UI also get a JWT token and the browser
will add this "user" JWT token to each the request.
Now, a user can extract that JWT token an can craft
HTTP POST requests for the inter-node communication
API endpoint. Since the server accepts ANY valid
JWT token it also accepts inter-node commands from
an authenticated user such that the user can execute
arbitrary commands bypassing the IAM policy engine
and impersonate other users, change its own IAM policy
or extract the admin access/secret key.
This is fixed by only accepting "admin" JWT tokens
(tokens containing the admin access key - and therefore
were generated with the admin secret key). Consequently,
only the admin user can execute such inter-node commands.
Simplify the cmd/http package overall by removing
custom plain text v/s tls connection detection, by
migrating to go1.12 and choose minimum version
to be go1.12
Also remove all the vendored deps, since they
are not useful anymore.
A race is detected between a bytes.Buffer generated with cmd/rpc.Pool
and http2 module. An issue is raised in golang (https://github.com/golang/go/issues/31192).
Meanwhile, this commit disables Pool in RPC code and it generates a
new 1kb of bytes.Buffer for each RPC call.
Before this commit, nodes wait indefinitely without showing any
indicate error message when a node is started with different access
and secret keys.
This PR will show '401 Unauthorized' in this case.
Currently message is set to error type value.
Message field is not used in error logs. it is used only in the case of info logs.
This PR sets error message field to store error type correctly.
Copying an encrypted SSEC object when this latter is uploaded using
multipart mechanism was failing because ETag in case of encrypted
multipart upload is not encrypted.
This PR fixes the behavior.
This fixes varying pids for server-respawns. And avoids duplicate process
creating multiple pids when the server restart signal is triggered with
service restart enabled.
Fixes#7350
It is required to set the environment variable in the case of distributed
minio. LoadCredentials is used to notify peers of the change and will not work if
environment variable is set. so, this function will never be called.
In scenario 1
```
- bucket/object-prefix
- bucket/object-prefix/object
```
Server responds with `XMinioParentIsObject`
In scenario 2
```
- bucket/object-prefix/object
- bucket/object-prefix
```
Server responds with `XMinioObjectExistsAsDirectory`
Fixes#6566
Healing scan used to read all objects parts to check for bitrot
checksum. This commit will add a quicker way of healing scan
by only checking if parts are actually present in disks or not.
We should internally handle when http2 input stream has smaller
content than its content-length header
Upstream issue reported https://github.com/golang/go/issues/30648
This a change which we need to handle internally until Go fixes it
correctly, till now our code doesn't expect a custom error to be returned.
CopyObject precondition checks into GetObjectReader
in order to perform SSE-C pre-condition checks using the
last 32 bytes of encrypted ETag rather than the decrypted
ETag
This also necessitates moving precondition checks for
gateways to gateway layer rather than object handler check
if a bucket with `Captialized letters` is created, `InvalidBucketName` error
will be returned.
In the case of pre-existing buckets, it will be listed.
Fixes#6938
Prevents deferred close functions from being called while still
attempting to copy reader to snappyWriter.
Reduces code duplication when compressing objects.
This change allows indefinitely running go-routines to cleanup
gracefully.
This channel is now closed at the beginning of each test so that
long-running go-routines quit and a new one is assigned.
The side affect of this change memory
increase, but this is a trade-off between
performance and actual memory usage.
For all practical scenarios this should be
an adequate change.
- The events will be persisted in queueStore if `queueDir` is set.
- Else, if queueDir is not set events persist in memory.
The events are replayed back when the mqtt broker is back online.
Clients like AWS SDK Java and AWS cli XML parsers are
unable to handle on `\r\n` characters to avoid these
errors send XML header first and write white space characters
instead.
Also handle cases to avoid double WriteHeader calls
- Current implementation was spawning renewer goroutines
without waiting for the lease duration to end. Remove vault renewer
and call vault.RenewToken directly and manage reauthentication if
lease expired.
We should change the logic for both isObject()
and isObjectDir() leaf detection to be done
with quorum, due to how our directory navigation
works - this allows for properly deleting all
the dangling directories or objects if any.
This commit fixes a nil pointer dereference issue
that can occur when the Vault KMS returns e.g. a 404
with an empty HTTP response. The Vault client SDK
does not treat that as error and returns nil for
the error and the secret.
Further it simplifies the token renewal and
re-authentication mechanism by using a single
background go-routine.
The control-flow of Vault authentications looks
like this:
1. `authenticate()`: Initial login and start of background job
2. Background job starts a `vault.Renewer` to renew the token
3. a) If this succeeds the token gets updated
b) If this fails the background job tries to login again
4. If the login in 3b. succeeded goto 2. If it fails
goto 3b.
Currently, we were sending errors in Select binary format,
which is incompatible with AWS S3 behavior, errors in binary
are sent after HTTP status code is already 200 OK - i.e it
happens during the evaluation of the record reader.
This commit increases storage REST requests to 5 minutes, this includes
the opening TCP connection, and sending/receiving data. This will reduce
clients receiving errors when the server is under high load.
Different gateway implementations due to different backend
API errors, might return different unsupported errors at
our handler layer. Current code posed a problem for us because
this information was lost and we would convert it to InternalError
in this situation all S3 clients end up retrying the request.
To avoid this unexpected situation implement a way to support
this cleanly such that the underlying information is not lost
which is returned by gateway.
Bucket metadata healing in the current code was executed multiple
times each time for a given set. Bucket metadata just like
objects are hashed in accordance with its name on any given set,
to allow hashing to play a role we should let the top level
code decide where to navigate.
Current code also had 3 bucket metadata files hardcoded, whereas
we should make it generic by listing and navigating the .minio.sys
to heal such objects.
We also had another bug where due to isObjectDangling changes
without pre-existing bucket metadata files, we were erroneously
reporting it as grey/corrupted objects.
This PR fixes all of the above items.
This PR also adds some comments and simplifies
the code. Primary handling is done to ensure
that we make sure to honor cached buffer.
Added unit tests as well
Fixes#7141
foo.CORRUPTED should never be created because when
multiple sets are involved we would hash the file
to wrong a location, this PR removes the code.
But allows DeleteBucket() to work properly to delete
dangling buckets/objects. Also adds another option
to Healing where a user needs to specify `--remove`
such that all dangling objects will be deleted with
user confirmation.
ListObjectParts is using xl.readXLMetaParts which picks the first
xl meta found in any disk, which is an inconsistent information.
E.g.: In a middle of a multipart upload, one node can go offline
and get back later with an outdated multipart information.
This commit fixes the computation of Before/After healing state
for empty directories.
Issues before the commit:
- Before state doesn't reflect the real status (no StatVol() called)
- For any MakeVol() error, healObjectDir is exited directly, which is
wrong.
Currently during a heal of a bucket, if one disk is offline an empty endpoint entry is added.
Then another entry with the missing endpoint is also added.
This results in more entries than disks being added.
Code that adds empty endpoint has been removed.
Collect historic cpu and mem stats. Also, use actual values
instead of formatted strings while returning to the client. The string
formatting prevents values from being processed by the server or
by the client without parsing it.
This change will allow the values to be processed (eg.
compute rolling-average over the lifetime of the minio server)
and offloads the formatting to the client.
We made a change previously in #7111 which moved support
for AWS envs only for AWS S3 endpoint. Some users requested
that this be added back to Non-AWS endpoints as well as
they require separate credentials for backend authentication
from security point of view.
More than one client can't use the same clientID for MQTT connection.
This causes problem in distributed deployments where config is shared
across nodes, as each Minio instance tries to connect to MQTT using the
same clientID.
This commit removes the clientID field in config, and allows
MQTT client to create random clientID for each node.
- New parser written from scratch, allows easier and complete parsing
of the full S3 Select SQL syntax. Parser definition is directly
provided by the AST defined for the SQL grammar.
- Bring support to parse and interpret SQL involving JSON path
expressions; evaluation of JSON path expressions will be
subsequently added.
- Bring automatic type inference and conversion for untyped
values (e.g. CSV data).
This situation happens only in gateway nas which supports
etcd based `config.json` to support all FS mode features.
The issue was we would try to migrate something which doesn't
exist when etcd is configured which leads to inconsistent
server configs in memory.
This PR fixes this situation by properly loading config after
initialization, avoiding backend disk config migration to be
done only if etcd is not configured.
If it does happen that we have a lot files in '.minio.sys/tmp',
minio startup might block deleting this folder. Rename and
delete in background instead to allow Minio to start serving
requests.
To avoid a large number of concurrent connections between minio
servers and to reduce CPU pressure, it is better to limit the number
of objects healed in parallel to number_of_CPUs.
Requirements like being able to run minio gateway in ec2
pointing to a Minio deployment wouldn't work properly
because IAM creds take precendence on ec2.
Add checks such that we only enable AWS specific features
if our backend URL points to actual AWS S3 not S3 compatible
endpoints.
Returning unexpected errors can cause problems for config handling,
which is what led gateway deployments with etcd to misbehave and
had stopped working properly
Deployment ID is not copied into new formats after healing format. Although,
this is not critical since a new deployment ID will be generated and set in the
next cluster restart, it is still much better if we don't change the deployment
id of a cluster for a better tracking.
Fix regexp matcher for special assets for the browser to clash with
less of the object namespace.
Assets should now be loaded with the /minio/ prefix. Previously,
favicon.ico (and others) could be loaded at any path matching
/minio/*/favicon.ico. This clashes with a large part of the object
namespace. With this change, /minio/favicon.ico will serve the favicon
but not /minio/mybucket/favicon.ico
Fixes#7077
This changes causes `getRootCAs` to always load system-wide CAs.
Any additional custom CAs (at `certs/CA/`) are added to the certificate pool
of system CAs.
The previous behavior was incorrect since all no system-wide CAs were
loaded if either there were CAs under `certs/CA` or the `certs/CA`
directory didn't exist at all.
Also add a cross compile script to test always cross
compilation for some well known platforms and architectures
, we support out of box compilation of these platforms even
if we don't make an official release build.
This script is to avoid regressions in this area when we
add platform dependent code.
This PR supports iam and bucket policies to have
policy variable replacements in resource and
condition key values.
For example
- ${aws:username}
- ${aws:userid}
Health checking programs very frequently use /minio/health/live
to check health, hence we can avoid doing StorageInfo() and
ListBuckets() for FS/Erasure backend.
* Use 0-byte file for bitrot verification of whole-file-bitrot files
Also pass the right checksum information for bitrot verification
* Copy xlMeta info from latest meta except []checksums and []Parts while healing
Simplify parallelReader.Read() which also fixes previous
implementation where it was returning before all the parallel
reading go-routines had terminated which caused race conditions.
When auto-encryption is turned on, we pro-actively add SSEHeader
for all PUT, POST operations. This is unusual for V2 signature
calculation because V2 signature doesn't have a pre-defined set
of signed headers in the request like V4 signature. According to
V2 we should canonicalize all incoming supported HTTP headers.
Make sure to validate signatures before we mutate http headers
Deprecate the use of Admin Peers concept and migrate all peer
communication to Notification subsystem. This finally allows
for a common subsystem for all peer notification in case of
distributed server deployments.
Before this change the CopyObjectHandler and the CopyObjectPartHandler
both looked for a `versionId` parameter on the `X-Amz-Copy-Source` URL
for the version of the object to be copied on the URL unescaped version
of the header. This meant that files that had question marks in were
truncated after the question mark so that files with `?` in their
names could not be server side copied.
After this change the URL unescaping is done during the parsing of the
`versionId` parameter which fixes the problem.
This change also introduces the same logic for the
`X-Amz-Copy-Source-Version-Id` header field which was previously
ignored, namely returning an error if it is present and not `null`
since minio does not currently support versions.
S3 Docs:
- https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html
- https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html
When source is encrypted multipart object and the parts are not
evenly divisible by DARE package block size, target encrypted size
will not necessarily be the same as encrypted source object.
This commit removes old code preventing PUT requests with '/' as a path,
because this is not needed anymore after the introduction of the virtual
host style in Minio server code.
'PUT /' when global domain is not configured already returns 405 Method
Not Allowed http error.
This PR adds pass-through, single encryption at gateway and double
encryption support (gateway encryption with pass through of SSE
headers to backend).
If KMS is set up (either with Vault as KMS or using
MINIO_SSE_MASTER_KEY),gateway will automatically perform
single encryption. If MINIO_GATEWAY_SSE is set up in addition to
Vault KMS, double encryption is performed.When neither KMS nor
MINIO_GATEWAY_SSE is set, do a pass through to backend.
When double encryption is specified, MINIO_GATEWAY_SSE can be set to
"C" for SSE-C encryption at gateway and backend, "S3" for SSE-S3
encryption at gateway/backend or both to support more than one option.
Fixes#6323, #6696
This is part of implementation for mc admin health command. The
ServerDrivesPerfInfo() admin API returns read and write speed
information for all the drives (local and remote) in a given Minio
server deployment.
Part of minio/mc#2606
It can happen with erroneous clients which do not send `Host:`
header until 4k worth of header bytes have been read. This can lead
to Peek() method of bufio to fail with ErrBufferFull.
To avoid this we should make sure that Peek buffer is as large as
our maxHeaderBytes count.
minio-java tests were failing under multiple places when
auto encryption was turned on, handle all the cases properly
This PR fixes
- CopyObject should decrypt ETag before it does if-match
- CopyObject should not try to preserve metadata of source
when rotating keys, unless explicitly asked by the user.
- We should not try to decrypt Compressed object etag, the
potential case was if user sets encryption headers along
with compression enabled.
Especially in gateway IAM admin APIs are not enabled
if etcd is not enabled, we should enable admin API though
but only enable IAM and Config APIs with etcd configured.
By default when we listen on all interfaces, we print all the
endpoints that at local to all interfaces including IPv6
addresses. Remove IPv6 addresses in endpoint list to be
printed in endpoints unless explicitly specified with '--address'
This commit adds an auto-encryption feature which allows
the Minio operator to ensure that uploaded objects are
always encrypted.
This change adds the `autoEncryption` configuration option
as part of the KMS conifguration and the ENV. variable
`MINIO_SSE_AUTO_ENCRYPTION:{on,off}`.
It also updates the KMS documentation according to the
changes.
Fixes#6502
Currently we use GetObject to check if we are allowed to list,
this might be a security problem since there are many users now
who actively disable a publicly readable listing, anyone who
can guess the browser URL can list the objects.
This PR turns off this behavior and provides a more expected way
based on the policies.
This PR also additionally improves the Download() object
implementation to use a more streamlined code.
These are precursor changes to facilitate federation and web
identity support in browser.
One user reported having discovered the following error:
API: SYSTEM()
Time: 20:06:17 UTC 12/06/2018
Error: xml: encoding "US-ASCII" declared but Decoder.CharsetReader is nil
1: cmd/handler-utils.go:43:cmd.parseLocationConstraint()
2: cmd/auth-handler.go:250:cmd.checkRequestAuthType()
3: cmd/bucket-handlers.go:411:cmd.objectAPIHandlers.PutBucketHandler()
4: cmd/api-router.go100cmd.(objectAPIHandlers).PutBucketHandler-fm()
5: net/http/server.go:1947:http.HandlerFunc.ServeHTTP()
Hence, adding support of different xml encoding. Although there
is no clear specification about it, even setting "GARBAGE" as an xml
encoding won't change the behavior of AWS, hence the encoding seems
to be ignored.
This commit will follow that behavior and will ignore encoding field
and consider all xml as utf8 encoded.
This refactors the vault configuration by moving the
vault-related environment variables to `environment.go`
(Other ENV should follow in the future to have a central
place for adding / handling ENV instead of magic constants
and handling across different files)
Further this commit adds master-key SSE-S3 support.
The operator can specify a SSE-S3 master key using
`MINIO_SSE_MASTER_KEY` which will be used as master key
to derive and encrypt per-object keys for SSE-S3
requests.
This commit is also a pre-condition for SSE-S3
auto-encyption support.
Fixes#6329
This PR implements one of the pending items in issue #6286
in S3 API a user can request CSV output for a JSON document
and a JSON output for a CSV document. This PR refactors
the code a little bit to bring this feature.
guessIsRPCReq() considers all POST requests as RPC but doesn't
check if this is an object operation API or not, which is actually
confusing bucket forwarder handler when it receives a new multipart
upload API which is a POST http request.
Due to this bug, users having a federated setup are not able to
upload a multipart object using an endpoint which doesn't actually
contain the specified bucket that will store the object.
Hence this commit will fix the described issue.
registering notFound handler more than once causes
gorilla mux to return error for all registered paths
greater than > 8. This might be a bug in the gorilla/mux
but we shouldn't be using it this way. NotFound handler
should be only registered once per root router.
Fixes#6915
When MINIO_PUBLIC_IPS is not specified and no endpoints are passed
as arguments, fallback to the address of non loop-back interfaces.
This is useful so users can avoid setting MINIO_PUBLIC_IPS in docker
or orchestration scripts, ince users naturally setup an internal
network that connects all instances.
This commit renames the env variable for vault namespaces
such that it begins with `MINIO_SSE_`. This is the prefix
for all Minio SSE related env. variables (like KMS).
clientID must be a unique `UUID` for each connections. Now, the
server generates it, rather considering the config.
Removing it as it is non-beneficial right now.
Fixes#6364
When migrating configs it happens often that some
servers fail to start due to version mismatch etc.
Hold a transaction lock such that all servers get
serialized.
This can create inconsistencies i.e Parts might have
lesser number of parts than ChecksumInfos. This will
result in object to be not readable.
This PR also allows for deleting previously created
corrupted objects.
globalMinioPort is used in federation which stores the address
and the port number of the server hosting the specified bucket,
this latter uses globalMinioPort but this latter is not set in
startup of the gateway mode.
This commit fixes the behavior.