instead perform a liveness check call to
verify if server is online and print relevant
errors.
Also introduce a StorageErr string error type
instead of errors.New() deprecate usage of
VerifyFileError, DeleteFileError for gob,
change in datastructure also requires bump in
storage REST version to v13.
Fixes#8811
Enabling the memory profiling has a significant impact on performance.
Reduce the profiling rate by 2 orders of magnitude. It is still 128x smaller than default so it should be plenty.
object lock config is enabled for a bucket.
Creating a bucket with object lock configuration
enabled does not automatically cause WORM protection
to be applied. PUT operation needs to specifically
request object locking or bucket has to have default
retention settings configured.
Fixes regression introduced in #8657
When formatting a set validate if a host failure will likely lead to data loss.
While we don't know what config will be set in the future
evaluate to our best knowledge, assuming default settings.
X-Cache sets cache status of HIT if object is
served from the disk cache, or MISS otherwise.
X-Cache-Lookup is set to HIT if object was found
in the cache even if not served (for e.g. if cache
entry was invalidated by ETag verification)
A new key was added in identity_openid recently
required explicitly for client to set the optional
value without that it would be empty, handle this
appropriately.
Fixes#8787
Use reference format to initialize lockers
during startup, also handle `nil` for NetLocker
in dsync and remove *errorLocker* implementation
Add further tuning parameters such as
- DialTimeout is now 15 seconds from 30 seconds
- KeepAliveTimeout is not 20 seconds, 5 seconds
more than default 15 seconds
- ResponseHeaderTimeout to 10 seconds
- ExpectContinueTimeout is reduced to 3 seconds
- DualStack is enabled by default remove setting
it to `true`
- Reduce IdleConnTimeout to 30 seconds from
1 minute to avoid idleConn build up
Fixes#8773
- Stop spawning store replay routines when testing the notification targets
- Properly honor the target.Close() to clean the resources used
Fixes#8707
Co-authored-by: Harshavardhana <harsha@minio.io>
In certain organizations policy claim names
can be not just 'policy' but also things like
'roles', the value of this field might also
be *string* or *[]string* support this as well
In this PR we are still not supporting multiple
policies per STS account which will require a
more comprehensive change.
Exponential backoff does not seem like a good fit for
this function since we can expect a few roundtrips on
initial startup.
This retry loop get slow pretty quickly with initial
wait being 1 second and each try being double the
wait until 30 seconds is reached.
Instead simply try 2 times per second.
This PR adds jsoniter package to replace encoding/json
in places where faster json unmarshal is necessary
whenever input JSON is large enough.
Some benchmarking comparison between jsoniter and enconding/json
benchmark old MB/s new MB/s speedup
BenchmarkParseUnmarshal/N10-4 110.02 331.17 3.01x
BenchmarkParseUnmarshal/N100-4 125.74 524.09 4.17x
BenchmarkParseUnmarshal/N500-4 131.68 542.60 4.12x
BenchmarkParseUnmarshal/N1000-4 133.93 514.88 3.84x
BenchmarkParseUnmarshal/N5000-4 122.10 415.36 3.40x
BenchmarkParseUnmarshal/N10000-4 132.13 403.90 3.06x
Fixes scenario where zones are appropriately
handled, along with supporting overriding set
count. The new fix also ensures that we handle
the various setup types properly.
Update documentation to properly indicate the
behavior.
Fixes#8750
Co-authored-by: Nitish Tiwari <nitish@minio.io>
Currently when connections to vault fail, client
perpetually retries this leads to assumptions that
the server has issues and masks the problem.
Re-purpose *crypto.Error* type to send appropriate
errors back to the client.
In existing functionality we simply return a generic
error such as "MalformedPolicy" which indicates just
a generic string "invalid resource" which is not very
meaningful when there might be multiple types of errors
during policy parsing. This PR ensures that we send
these errors back to client to indicate the actual
error, brings in two concrete types such as
- iampolicy.Error
- policy.Error
Refer #8202
minio server /data{1..4} shows an error about inability to bind a port, though
the real problem is /data{1..4} cannot be created because of the lack of
permissions.
This commit fix the behavior.
- We should declare a cluster ready even if read quorum is achieved (atleast n/2 disks are online).
- Such that, all the zones should have enough read quorum. Thus making the cluster ready for reads.
We had messy cyclical dependency problem with `mc`
due to dependencies in pkg/console, moved the pkg/console
to minio for more control and also to avoid any further
cyclical dependencies of `mc` clobbering up the
dependencies on server.
Fixes#8659
If two distinct clusters are started with different domains
along with single common domain, this situation was leading
to conflicting buckets getting created on different clusters
To avoid this do not prematurely error out if the key has no
entries, let the caller decide on which entry matches and
which entry is valid. This allows support for MINIO_DOMAIN
with one common domain, but each cluster may have their own
domains.
Fixes#8705
This is to ensure that when we have multiple tenants
deployed all sharing the same etcd for global bucket
should avoid listing each others buckets, this leads
to information leak which should be avoided unless
etcd is not namespaced for IAM assets in which case
it can be assumed that its a federated setup.
Federated setup and namespaced IAM assets on etcd
is not supported since namespacing is only useful
when you wish to separate the tenants as isolated
instances of MinIO.
This PR allows a new type of behavior, primarily
driven by the usecase of m3(mkube) multi-tenant
deployments with global bucket support.
Currently all bucket events are sent to all watchers
with matching prefix and event names, this becomes
problematic and prone to performance issues, fix this
situation by filtering based on buckets as well.
This PR fixes the issue where we might allow policy changes
for temporary credentials out of band, this situation allows
privilege escalation for those temporary credentials. We
should disallow any external actions on temporary creds
as a practice and we should clearly differentiate which
are static and which are temporary credentials.
Refer #8667
Changes in IP underneath are dynamic in replica sets
with multiple tenants, so deploying in that fashion
will not work until we wait for atleast one participatory
server to be local.
This PR also ensures that multi-tenant zone expansion also
works in replica set k8s deployments.
Introduces a new ENV `KUBERNETES_REPLICA_SET` check to call
appropriate code paths.
Continuation from #8629 which basically broke
zone deployments on k8s statefulset environment
due to incorrect assumptions which made it work
on replicated set.
Fix this properly such that this container works
for both replicated set and stateful set deployment
This commit removes github.com/minio/kes as
a dependency and implements the necessary
client-side functionality without relying
on the KES project.
This resolves the licensing issue since
KES is licensed under AGPL while MinIO
is licensed under Apache.
With this PR,cache eviction will continue until
no LRU entries older than an hour can be cache
evicted or sufficient percentage of disk space
has been reclaimed.
This ensures that we can update the
- .minio.sys is updated for accounting/data usage purposes
- .minio.sys is updated to indicate if backend is encrypted
or not.
The approach is that now safe mode is only invoked when
we cannot read the config or under some catastrophic
situations, but not under situations when config entries
are invalid or unreachable. This allows for maximum
availability for MinIO and not fail on our users unlike
most of our historical releases.
This commit adds support for the minio/kes KMS.
See: https://github.com/minio/kes
In particular you can configure it as KMS by:
- `export MINIO_KMS_KES_ENDPOINT=` // Server URL
- `export MINIO_KMS_KES_KEY_FILE=` // TLS client private key
- `export MINIO_KMS_KES_CERT_FILE=` // TLS client certificate
- `export MINIO_KMS_KES_CA_PATH=` // Root CAs issuing server cert
- `export MINIO_KMS_KES_KEY_NAME=` // The name of the (default)
master key
Admin data usage info API returns the following
(Only FS & XL, for now)
- Number of buckets
- Number of objects
- The total size of objects
- Objects histogram
- Bucket sizes
In replica sets, hosts resolve to localhost
IP automatically until the deployment fully
comes up. To avoid this issue we need to
wait for such resolution.
Final update to all messages across sub-systems
after final review, the only change here is that
NATS now has TLS and TLSSkipVerify to be consistent
for all other notification targets.
This PR adds support below metrics
- Cache Hit Count
- Cache Miss Count
- Data served from Cache (in Bytes)
- Bytes received from AWS S3
- Bytes sent to AWS S3
- Number of requests sent to AWS S3
Fixes#8549
Fixes an issue reported by @klauspost and @vadmeste
This PR also allows users to expand their clusters
from single node XL deployment to distributed mode.
Currently, we use the top-level prefix "config/"
for all our IAM assets, instead of to provide
tenant-level separation bring 'path_prefix'
to namespace the access properly.
Fixes#8567
StorageInfo() call is supposed to give each
server/disk information independently, rely
on this appropriately so that `mc admin info server`
gets correct information all the time.
Regression from #8509 which changes objectsToDelete entry
from a list to map. This will cause index out of range
panic if object is not selected for delete.
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIS
This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
- added ability to specify CA for self-signed certificates
- added option to authenticate using client certificates
- added unit tests for nats connections
We should only read ahead if we are reading big files. We enable it for files >= 16MB.
Benchmark on 64KB objects.
Before:
```
Operation: GET
Errors: 0
Average: 59.976s, 87.13 MB/s, 1394.07 ops ended/s.
Fastest: 1s, 90.99 MB/s, 1455.00 ops ended/s.
50% Median: 1s, 87.53 MB/s, 1401.00 ops ended/s.
Slowest: 1s, 81.39 MB/s, 1301.00 ops ended/s.
```
After:
```
Operation: GET
Errors: 0
Average: 59.992s, 207.99 MB/s, 3327.85 ops ended/s.
Fastest: 1s, 219.20 MB/s, 3507.00 ops ended/s.
50% Median: 1s, 210.54 MB/s, 3368.00 ops ended/s.
Slowest: 1s, 179.14 MB/s, 2865.00 ops ended/s.
```
The 64KB buffer is actually a small disadvantage for this case, but I believe it will be better in general than no buffer.
- Migrate and save only settings which are enabled
- Rename logger_http to logger_webhook and
logger_http_audit to audit_webhook
- No more pretty printing comments, comment
is a key=value pair now.
- Avoid quotes on values which do not have space in them
- `state="on"` is implicit for all SetConfigKV unless
specified explicitly as `state="off"`
- Disabled IAM users should be disabled always
The code does not properly set a new deployemnt ID when not present
in format.json: it loops twice without releasing write lock on format.json
causing an infinite locking error on the same file.
This commit fixes and simplifies a little the code.
This PR implements locking from a global entity into
a more localized set level entity, allowing for locks
to be held only on the resources which are writing
to a collection of disks rather than a global level.
In this process this PR also removes the top-level
limit of 32 nodes to an unlimited number of nodes. This
is a precursor change before bring in bucket expansion.
This PR also fixes issues related to
- Add proper newline for `mc admin config get` output
for more than one targets
- Fixes issue of temporary user credentials to have
consistent output
- Fixes a crash when setting a key with empty values
- Fixes a parsing issue with `mc admin config history`
- Fixes gateway ENV handling for etcd server and gateway
This PR fixes issues found in config migration
- StorageClass migration error when rrs is empty
- Plain-text migration of older config
- Do not run in safe mode with incorrect credentials
- Update logger_http documentation for _STATE env
Refer more reported issues at #8434
This PR refactors object layer handling such
that upon failure in sub-system initialization
server reaches a stage of safe-mode operation
wherein only certain API operations are enabled
and available.
This allows for fixing many scenarios such as
- incorrect configuration in vault, etcd,
notification targets
- missing files, incomplete config migrations
unable to read encrypted content etc
- any other issues related to notification,
policies, lifecycle etc
This PR brings support for `history` list to
list in the following agreed format
```
~ mc admin config history list -n 2 myminio
RestoreId: df0ebb1e-69b0-4043-b9dd-ab54508f2897
Date: Mon, 04 Nov 2019 17:27:27 GMT
region name="us-east-1" state="on"
region name="us-east-1" state="on"
region name="us-east-1" state="on"
region name="us-east-1" state="on"
RestoreId: ecc6873a-0ed3-41f9-b03e-a2a1bab48b5f
Date: Mon, 04 Nov 2019 17:28:23 GMT
region name=us-east-1 state=off
```
This PR also moves the help templating and coloring to
fully `mc` side instead than `madmin` API.
This PR adds code to appropriately handle versioning issues
that come up quite constantly across our API changes. Currently
we were also routing our requests wrong which sort of made it
harder to write a consistent error handling code to appropriately
reject or honor requests.
This PR potentially fixes issues
- old mc is used against new minio release which is incompatible
returns an appropriate for client action.
- any older servers talking to each other, report appropriate error
- incompatible peer servers should report error and reject the calls
with appropriate error
This is to avoid making calls to backend and requiring
gateways to allow permissions for ListBuckets() operation
just for Liveness checks, we can avoid this and make
our liveness checks to be more performant.
- Supports migrating only when the credential ENVs are set,
so any FS mode deployments which do not have ENVs set will
continue to remain as is.
- Credential ENVs can be rotated using MINIO_ACCESS_KEY_OLD
and MINIO_SECRET_KEY_OLD envs, in such scenarios it allowed
to rotate the encrypted content to a new admin key.
This commit replaces the returned error message by
the hardware info handler from `Method-Not-Allowed`
to `Bad-Request` since the current HTTP error is not
correct according to the HTTP spec.
In particular:
```
The origin server MUST generate an Allow header field
in a 405 response containing a list of the target
resource's currently supported methods.
```
From: https://tools.ietf.org/html/rfc7231#section-6.5.5
- This PR allows config KVS to be validated properly
without being affected by ENV overrides, rejects
invalid values during set operation
- Expands unit tests and refactors the error handling
for notification targets, returns error instead of
ignoring targets for invalid KVS
- Does all the prep-work for implementing safe-mode
style operation for MinIO server, introduces a new
global variable to toggle safe mode based operations
NOTE: this PR itself doesn't provide safe mode operations
The new auto healing model selects one node always responsible
for auto-healing the whole cluster, erasure set by erasure set.
If that node dies, another node will be elected as a leading
operator to perform healing.
This code also adds a goroutine which checks each 10 minutes
if there are any new unformatted disks and performs its healing
in that case, only the erasure set which has the new disk will
be healed.
While Tracing requests on server, type assertion on logger.ResponseWriter
caused nil pointer exception because of recordAPIStats{} being
used as ResponseWriter. This PR avoids the type assertion and
initializes a new logger.ResponseWriter.
Fixes regression introduced in #8003
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support MinIO browser to support proper claims, using STS tokens
- env support for all config parameters, legacy envs are also
supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
- ClearHistory
- RestoreHistory
- ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
fixes encountered during testing.
The measures are consolidated to the following metrics
- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.
And this PR also modifies the current StorageInfo queries
- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.
NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR
Fixes#7873
xl.isObject() returns 'nil' for not found disks when
calculating the existance of xl.json for a given object,
which what StatFile() is also doing (setting nil) if
xl.json exists.
This commit avoids this confusion by setting errDiskNotFound
error when the storage disk is not found.
When `mc admin user add` is attempted in gateway mode without
etcd setup, NoSuchBucket error is returned instead of MethodNotAllowed.
Regression from commit - e48005ddc7
Background append creates a temporary file which appends
uploaded parts as long as they are available, but when a
client stops the upload, the temporary file is not removed
by any way.
This commit removes the temporary file when the server does
its regular removing stale multipart uploads.
specific errors, `application` errors or `all` by default.
console logging on server by default lists all logs -
enhance admin console API to accept `type` as query parameter to
subscribe to application/minio logs.
- This PR fixes situation to avoid underflow, this is possible
because of disconnected operations in replay/sendEvents
- Hold right locks if Del() operation is performed in Get()
- Remove panic in the code and use loggerOnce
- Remove Timer and instead use Ticker instead for proper ticks
On Kubernetes/Docker setups DNS resolves inappropriately
sometimes where there are situations same endpoints with
multiple disks come online indicating either one of them
is local and some of them are not local. This situation
can never happen and its only a possibility in orchestrated
deployments with dynamic DNS. Following code ensures that we
treat if one of the endpoint says its local for a given host
it is true for all endpoints for the same host. Following code
ensures that this assumption is true and it works in all
scenarios and it is safe to assume for a given host.
This PR also adds validation such that we do not crash the
server if there are bugs in the endpoints list in dsync
initialization.
Thanks to Daniel Valdivia <hola@danielvaldivia.com> for
reproducing this, this fix is needed as part of the
https://github.com/minio/m3 project.
This change is related to larger config migration PR
change, this is a first stage change to move our
configs to `cmd/config/` - divided into its subsystems
Current master repeatedly calls ListBuckets() during
initialization of multiple sub-systems
Use single ListBuckets() call for each sub-system as
follows
- LifeCycle
- Policy
- Notification
- Heal if the part.1 is truncated from its original size
- Heal if the part.1 fails while being verified in between
- Heal if the part.1 fails while being at a certain offset
Other cleanups include make sure to flush the HTTP responses
properly from storage-rest-server, avoid using 'defer' to
improve call latency. 'defer' incurs latency avoid them
in our hot-paths such as storage-rest handlers.
Fixes#8319
Minio V2 listing uses object names/prefixes as continuation tokens. This
is problematic when object names contain some characters that are forbidden
in XML documents. This PR will use base64 encoded form of continuation
and next continuation tokens to address that corner case.
errors.errorString() cannot be marshalled by gob
encoder, so using a slice of []error would fail
to be encoded. This leads to no errors being
generated instead gob.Decoder on the storage-client
would see an io.EOF
To avoid such bugs introduce a typed error for
handling such translations and register this type
for gob encoding support.
On objects bigger than 100MiB can have a corrupted object
stored due to partial blockListing attempted right after
each blocks uploaded. Simplify this code to ensure that
all the blocks successfully uploaded are committed right
away.
This PR also updates the azure-sdk-go to latest release.
- Choose a unique uuid such that under situations of duplicate
mounts we do not append to an existing json entry.
- Avoid AppendFile instead use WriteAll() to write the entire
byte array atomically.
When url encoding is passed in v2 listing handler, continuationToken
and nextContinuationToken needs to be encoded. The reason is that
both represents an object name/prefix in Minio server and it could
contain a character unsupported by XML specification.
This PR additionally also adds support for missing
- Session policy support for AD/LDAP
- Add API request/response parameters detail
- Update example to take ldap username,
password input from the command line
- Fixes session policy handling for
ClientGrants and WebIdentity
This commit removes the SSE-S3 key rotation functionality
from CopyObject since there will be a dedicated Admin-API
for this purpose.
Also update the security documentation to link to mc and
the admin documentation.
This commit allows the MinIO server to parse the metadata if:
- either the `X-Minio-Internal-Server-Side-Encryption-S3-Key-Id`
and the `X-Minio-Internal-Server-Side-Encryption-S3-Kms-Sealed-Key`
entries are present.
- or *both* headers are not present.
This is in service to support a K/V data key store.
In commit a8296445ad we changed the code to handle
some corner cases on ARM and other platforms, this
PR just avoids the return for unknown filetypes
prematurely and let the name be populated appropriately.
This fixes bug for older XFS implementations such as
in Ubuntu 14.04
This commit fixes a subtle bug that (probably)
caused an issue affecting encrypted multipart objects.
When a cluster has no quorum this bug causes `ListObjectParts`
to return nil as error instead of a quorum error.
Thanks to @harshavardhana for detecting this.
This PR changes cache on PUT behavior to background fill the cache
after PutObject completes. This will avoid concurrency issues as in #8219.
Added cleanup of partially filled cache to prevent cache corruption
- Fixes#8208
VerifyFile in the distributed setup does not work with
the non streaming highway hash. The reason is that the
internode mux router did not expect `storageRESTBitrotHash`
parameter.
endpoint.IsLocal will not have .Host entries so
using them to skip double entries will never work.
change the code such that we look for endpoint.Host
outside of endpoint.IsLocal logic to skip double
hosts appropriately.
Move these functions to their appropriate file.
- ListObjectsHeal should list only objects
which need healing, not the entire namespace.
- DeleteObjects() to be used to delete 1000s of
objects in bulk instead of serially.
Add LDAP based users-groups system
This change adds support to integrate an LDAP server for user
authentication. This works via a custom STS API for LDAP. Each user
accessing the MinIO who can be authenticated via LDAP receives
temporary credentials to access the MinIO server.
LDAP is enabled only over TLS.
User groups are also supported via LDAP. The administrator may
configure an LDAP search query to find the group attribute of a user -
this may correspond to any attribute in the LDAP tree (that the user
has access to view). One or more groups may be returned by such a
query.
A group is mapped to an IAM policy in the usual way, and the server
enforces a policy corresponding to all the groups and the user's own
mapped policy.
When LDAP is configured, the internal MinIO users system is disabled.
It looks like from implementation point of view fastjson
parser pool doesn't behave the same way as expected
when dealing many `xl.json` from multiple disks.
The fastjson parser pool usage ends up returning incorrect
xl.json entries for checksums, with references pointing
to older entries. This led to the subtle bug where checksum
info is duplicated from a previous xl.json read of a different
file from different disk.
With this PR, liveness check responds with 200 OK with "server-not-
initialized" header while objectLayer gets initialized. The header
is removed as objectLayer is initialized. This is to allow
MinIO distributed cluster to get started when running on an
orchestration platforms like Docker Swarm.
This PR also updates sample Swarm yaml files to use correct values
for healthcheck fields.
Fixes#8140
This commit adds an admin API route and handler for
requesting status information about a KMS key.
Therefore, the client specifies the KMS key ID (when
empty / not set the server takes the currently configured
default key-ID) and the server tries to perform a dummy encryption,
re-wrap and decryption operation. If all three succeed we know that
the server can access the KMS and has permissions to generate, re-wrap
and decrypt data keys (policy is set correctly).
This is a defensive change to avoid any future issues,
from this part of the code. New change also ensures
to populate UUID if present for the right disk.
This commit fixes the web ZIP download handler for
encrypted objects. The decryption logic has moved into
`getObjectNInfo`. So trying to decrypt the (already decrypted)
content again in the ZIP handler obviously causes an error.
This commit fixes this by removing the decryption logic from the
the handler.
Fixes#7965
The underlying errors are important, for IAM
requirements and should wait appropriately at
the caller level, this allows for distributed
setups to run properly and not fail prematurely
during startup.
Also additionally fix the onlineDisk counting