- Implement a new xl.json 2.0.0 format to support,
this moves the entire marshaling logic to POSIX
layer, top layer always consumes a common FileInfo
construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes#2111
At a customer setup with lots of concurrent calls
it can be observed that in newRetryTimer there
were lots of tiny alloations which are not
relinquished upon retries, in this codepath
we were only interested in re-using the timer
and use it wisely for each locker.
```
(pprof) top
Showing nodes accounting for 8.68TB, 97.02% of 8.95TB total
Dropped 1198 nodes (cum <= 0.04TB)
Showing top 10 nodes out of 79
flat flat% sum% cum cum%
5.95TB 66.50% 66.50% 5.95TB 66.50% time.NewTimer
1.16TB 13.02% 79.51% 1.16TB 13.02% github.com/ncw/directio.AlignedBlock
0.67TB 7.53% 87.04% 0.70TB 7.78% github.com/minio/minio/cmd.xlObjects.putObject
0.21TB 2.36% 89.40% 0.21TB 2.36% github.com/minio/minio/cmd.(*posix).Walk
0.19TB 2.08% 91.49% 0.27TB 2.99% os.statNolog
0.14TB 1.59% 93.08% 0.14TB 1.60% os.(*File).readdirnames
0.10TB 1.09% 94.17% 0.11TB 1.25% github.com/minio/minio/cmd.readDirN
0.10TB 1.07% 95.23% 0.10TB 1.07% syscall.ByteSliceFromString
0.09TB 1.03% 96.27% 0.09TB 1.03% strings.(*Builder).grow
0.07TB 0.75% 97.02% 0.07TB 0.75% path.(*lazybuf).append
```
this is a major overhaul by migrating off all
bucket metadata related configs into a single
object '.metadata.bin' this allows us for faster
bootups across 1000's of buckets and as well
as keeps the code simple enough for future
work and additions.
Additionally also fixes#9396, #9394
This PR is to ensure that we call the relevant object
layer APIs for necessary S3 API level functionalities
allowing gateway implementations to return proper
errors as NotImplemented{}
This allows for all our tests in mint to behave
appropriately and can be handled appropriately as
well.
This PR allows setting a "hard" or "fifo" quota
restriction at the bucket level. Buckets that
have reached the FIFO quota configured, will
automatically be cleaned up in FIFO manner until
bucket usage drops to configured quota.
If a bucket is configured with a "hard" quota
ceiling, all further writes are disallowed.
OSS go sdk lacks licensing terms in their
repository, and there has been no activity
On the issue here https://github.com/aliyun/aliyun-oss-go-sdk/issues/245
This PR is to ensure we remove any dependency code which
lacks explicit license file in their repo.
This PR fixes couple of behaviors with service accounts
- not need to have session token for service accounts
- service accounts can be generated by any user for themselves
implicitly, with a valid signature.
- policy input for AddNewServiceAccount API is not fully typed
allowing for validation before it is sent to the server.
- also bring in additional context for admin API errors if any
when replying back to client.
- deprecate GetServiceAccount API as we do not need to reply
back session tokens
- total number of S3 API calls per server
- maximum wait duration for any S3 API call
This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.
This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
OperationTimedout error occurs when locking
timesout, trying to acquire a lock. This
error should be returned appropriately to
the client with http status "408" (request timedout)
This translation was broken, fix it.
Add dummy calls which respond success when ACL's
are set to be private and fails, if user tries
to change them from their default 'private'
Some applications such as nuxeo may have an
unnecessary requirement for this operation,
we support this anyways such that don't have
to fully implement the functionality just that
we can respond with success for default ACLs
First step is to ensure that Path component is not decoded
by gorilla/mux to avoid routing issues while handling
certain characters while uploading through PutObject()
Delay the decoding and use PathUnescape() to escape
the `object` path component.
Thanks to @buengese and @ncw for neat test cases for us
to test with.
Fixes#8950Fixes#8647
- pkg/bucket/encryption provides support for handling bucket
encryption configuration
- changes under cmd/ provide support for AES256 algorithm only
Co-Authored-By: Poorna <poornas@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
Currently when connections to vault fail, client
perpetually retries this leads to assumptions that
the server has issues and masks the problem.
Re-purpose *crypto.Error* type to send appropriate
errors back to the client.
In existing functionality we simply return a generic
error such as "MalformedPolicy" which indicates just
a generic string "invalid resource" which is not very
meaningful when there might be multiple types of errors
during policy parsing. This PR ensures that we send
these errors back to client to indicate the actual
error, brings in two concrete types such as
- iampolicy.Error
- policy.Error
Refer #8202
Currently, we use the top-level prefix "config/"
for all our IAM assets, instead of to provide
tenant-level separation bring 'path_prefix'
to namespace the access properly.
Fixes#8567
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIS
This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
This PR adds code to appropriately handle versioning issues
that come up quite constantly across our API changes. Currently
we were also routing our requests wrong which sort of made it
harder to write a consistent error handling code to appropriately
reject or honor requests.
This PR potentially fixes issues
- old mc is used against new minio release which is incompatible
returns an appropriate for client action.
- any older servers talking to each other, report appropriate error
- incompatible peer servers should report error and reject the calls
with appropriate error
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support MinIO browser to support proper claims, using STS tokens
- env support for all config parameters, legacy envs are also
supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
- ClearHistory
- RestoreHistory
- ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
fixes encountered during testing.
This change adds admin APIs and IAM subsystem APIs to:
- add or remove members to a group (group addition and deletion is
implicit on add and remove)
- enable/disable a group
- list and fetch group info
This PR is based off @sinhaashish's PR for object lifecycle
management, which includes support only for,
- Expiration of object
- Filter using object prefix (_not_ object tags)
N B the code for actual expiration of objects will be included in a
subsequent PR.
Consider errors returned by httpClient.Do() as network errors. This is because
the http clients returns different types of errors and it is hard to catch
all the error types.
In scenario 1
```
- bucket/object-prefix
- bucket/object-prefix/object
```
Server responds with `XMinioParentIsObject`
In scenario 2
```
- bucket/object-prefix/object
- bucket/object-prefix
```
Server responds with `XMinioObjectExistsAsDirectory`
Fixes#6566
We should internally handle when http2 input stream has smaller
content than its content-length header
Upstream issue reported https://github.com/golang/go/issues/30648
This a change which we need to handle internally until Go fixes it
correctly, till now our code doesn't expect a custom error to be returned.
Different gateway implementations due to different backend
API errors, might return different unsupported errors at
our handler layer. Current code posed a problem for us because
this information was lost and we would convert it to InternalError
in this situation all S3 clients end up retrying the request.
To avoid this unexpected situation implement a way to support
this cleanly such that the underlying information is not lost
which is returned by gateway.
This improves the performance of certain queries dramatically,
such as 'count(*)' etc.
Without this PR
```
~ time mc select --query "select count(*) from S3Object" myminio/sjm-airlines/star2000.csv.gz
2173762
real 0m42.464s
user 0m0.071s
sys 0m0.010s
```
With this PR
```
~ time mc select --query "select count(*) from S3Object" myminio/sjm-airlines/star2000.csv.gz
2173762
real 0m17.603s
user 0m0.093s
sys 0m0.008s
```
Almost a 250% improvement in performance. This PR avoids a lot of type
conversions and instead relies on raw sequences of data and interprets
them lazily.
```
benchcmp old new
benchmark old ns/op new ns/op delta
BenchmarkSQLAggregate_100K-4 551213 259782 -52.87%
BenchmarkSQLAggregate_1M-4 6981901985 2432413729 -65.16%
BenchmarkSQLAggregate_2M-4 13511978488 4536903552 -66.42%
BenchmarkSQLAggregate_10M-4 68427084908 23266283336 -66.00%
benchmark old allocs new allocs delta
BenchmarkSQLAggregate_100K-4 2366 485 -79.50%
BenchmarkSQLAggregate_1M-4 47455492 21462860 -54.77%
BenchmarkSQLAggregate_2M-4 95163637 43110771 -54.70%
BenchmarkSQLAggregate_10M-4 476959550 216906510 -54.52%
benchmark old bytes new bytes delta
BenchmarkSQLAggregate_100K-4 1233079 1086024 -11.93%
BenchmarkSQLAggregate_1M-4 2607984120 557038536 -78.64%
BenchmarkSQLAggregate_2M-4 5254103616 1128149168 -78.53%
BenchmarkSQLAggregate_10M-4 26443524872 5722715992 -78.36%
```
In many situations, while testing we encounter
ErrInternalError, to reduce logging we have
removed logging from quite a few places which
is acceptable but when ErrInternalError occurs
we should have a facility to log the corresponding
error, this helps to debug Minio server.
This commit moves the check that SSE-C requests
must be made over TLS into a generic HTTP handler.
Since the HTTP server uses custom TCP connection handling
it is not possible to use `http.Request.TLS` to check
for TLS connections. So using `globalIsSSL` is the only
option to detect whether the request is made over TLS.
By extracting this check into a separate handler it's possible
to refactor other parts of the SSE handling code further.
This PR introduces two new features
- AWS STS compatible STS API named AssumeRoleWithClientGrants
```
POST /?Action=AssumeRoleWithClientGrants&Token=<jwt>
```
This API endpoint returns temporary access credentials, access
tokens signature types supported by this API
- RSA keys
- ECDSA keys
Fetches the required public key from the JWKS endpoints, provides
them as rsa or ecdsa public keys.
- External policy engine support, in this case OPA policy engine
- Credentials are stored on disks
When download profiling data API fails to gather profiling data
from all nodes for any reason (including profiler not enabled),
return 400 http code with the appropriate json message.
Two handlers are added to admin API to enable profiling and disable
profiling of a server in a standalone mode, or all nodes in the
distributed mode.
/minio/admin/profiling/start/{cpu,block,mem}:
- Start profiling and return starting JSON results, e.g. one
node is offline.
/minio/admin/profiling/download:
- Stop the on-going profiling task
- Stream a zip file which contains all profiling files that can
be later inspected by go tool pprof
This commit fixes are regression in the server regarding
handling SSE requests with wrong SSE-C keys.
The server now returns an AWS S3 compatable API error (access denied)
in case of the SSE key does not match the secret key used during upload.
Fixes#6431
This PR adds two new admin APIs in Minio server and madmin package:
- GetConfigKeys(keys []string) ([]byte, error)
- SetConfigKeys(params map[string]string) (err error)
A key is a path in Minio configuration file, (e.g. notify.webhook.1)
The user will always send a string value when setting it in the config file,
the API will know how to convert the value to the appropriate type. The user
is also able to set a raw json.
Before setting a new config, Minio will validate all fields and try to connect
to notification targets if available.
This commit adds error handling for SSE-KMS requests to
HEAD, GET, PUT and COPY operations. The server responds
with `not implemented` if a client sends a SSE-KMS
request.
Add support for sse-s3 encryption with vault as KMS.
Also refactoring code to make use of headers and functions defined in
crypto package and clean up duplicated code.
This commit adds the crypto.* errors to the
`toAPIErrorCode` switch. Further this commit adds an S3
API error code returned whenever the client specifes a
SSE-S3 request with an invalid algorithm parameter.
Since implementing `pwrite` like implementation would
require a more complex code than background append
implementation, it is better to keep the current code
as is and not implement `pwrite` based functionality.
Closes#4881
Currently, requestid field in logEntry is not populated, as the
requestid field gets set at the very end.
It is now set before regular handler functions. This is also
useful in setting it as part of the XML error response.
Travis build for ppc64le has been quite inconsistent and stays queued
for most of the time. Removing this build as part of Travis.yml for
the time being.
With CoreDNS now supporting etcdv3 as the DNS backend, we
can update our federation target to etcdv3. Users will now be
able to use etcdv3 server as the federation backbone.
Minio will update bucket data to etcdv3 and CoreDNS can pick
that data up and serve it as bucket style DNS path.
- remove old bucket policy handling
- add new policy handling
- add new policy handling unit tests
This patch brings support to bucket policy to have more control not
limiting to anonymous. Bucket owner controls to allow/deny any rest
API.
For example server side encryption can be controlled by allowing
PUT/GET objects with encryptions including bucket owner.
This change let the server return the S3 error for a key rotation
if the source key is not valid but equal to the destination key.
This change also fixes the SSE-C error messages since AWS returns error messages
ending with a '.'.
Fixes#5625
This PR adds disk based edge caching support for minio server.
Cache settings can be configured in config.json to take list of disk drives,
cache expiry in days and file patterns to exclude from cache or via environment
variables MINIO_CACHE_DRIVES, MINIO_CACHE_EXCLUDE and MINIO_CACHE_EXPIRY
Design assumes that Atime support is enabled and the list of cache drives is
fixed.
- Objects are cached on both GET and PUT/POST operations.
- Expiry is used as hint to evict older entries from cache, or if 80% of cache
capacity is filled.
- When object storage backend is down, GET, LIST and HEAD operations fetch
object seamlessly from cache.
Current Limitations
- Bucket policies are not cached, so anonymous operations are not supported in
offline mode.
- Objects are distributed using deterministic hashing among list of cache
drives specified.If one or more drives go offline, or cache drive
configuration is altered - performance could degrade to linear lookup.
Fixes#4026
This is a trival fix to support server level WORM. The feature comes
with an environment variable `MINIO_WORM`.
Usage:
```
$ export MINIO_WORM=on
$ minio server endpoint
```
*) Add Put/Get support of multipart in encryption
*) Add GET Range support for encryption
*) Add CopyPart encrypted support
*) Support decrypting of large single PUT object
- Changes related to moving admin APIs
- admin APIs now have an endpoint under /minio/admin
- admin APIs are now versioned - a new API to server the version is
added at "GET /minio/admin/version" and all API operations have the
path prefix /minio/admin/v1/<operation>
- new service stop API added
- credentials change API is moved to /minio/admin/v1/config/credential
- credentials change API and configuration get/set API now require TLS
so that credentials are protected
- all API requests now receive JSON
- heal APIs are disabled as they will be changed substantially
- Heal API changes
Heal API is now provided at a single endpoint with the ability for a
client to start a heal sequence on all the data in the server, a
single bucket, or under a prefix within a bucket.
When a heal sequence is started, the server returns a unique token
that needs to be used for subsequent 'status' requests to fetch heal
results.
On each status request from the client, the server returns heal result
records that it has accumulated since the previous status request. The
server accumulates upto 1000 records and pauses healing further
objects until the client requests for status. If the client does not
request any further records for a long time, the server aborts the
heal sequence automatically.
A heal result record is returned for each entity healed on the server,
such as system metadata, object metadata, buckets and objects, and has
information about the before and after states on each disk.
A client may request to force restart a heal sequence - this causes
the running heal sequence to be aborted at the next safe spot and
starts a new heal sequence.
- Return error when the config JSON has duplicate keys (fixes#5286)
- Limit size of configuration file provided to 256KiB - this prevents
another form of DoS
- Add storage class metadata validation for request header
- Change storage class header values to be consistent with AWS S3
- Refactor internal method to take only the reqd argument