Add two new configuration entries, api.requests-max and
api.requests-deadline which have the same role of
MINIO_API_REQUESTS_MAX and MINIO_API_REQUESTS_DEADLINE.
This PR fixes couple of behaviors with service accounts
- not need to have session token for service accounts
- service accounts can be generated by any user for themselves
implicitly, with a valid signature.
- policy input for AddNewServiceAccount API is not fully typed
allowing for validation before it is sent to the server.
- also bring in additional context for admin API errors if any
when replying back to client.
- deprecate GetServiceAccount API as we do not need to reply
back session tokens
- Introduced a function `FetchRegisteredTargets` which will return
a complete set of registered targets irrespective to their states,
if the `returnOnTargetError` flag is set to `False`
- Refactor NewTarget functions to return non-nil targets
- Refactor GetARNList() to return a complete list of configured targets
- Removes PerfInfo admin API as its not OBDInfo
- Keep the drive path without the metaBucket in OBD
global latency map.
- Remove all the unused code related to PerfInfo API
- Do not redefined global mib,gib constants use
humanize.MiByte and humanize.GiByte instead always
if needed use --no-compat to disable md5sum while
verifying any performance numbers.
bring back --compat behavior as default to avoid
additional documentation and confusing behavior,
as we are working towards improving md5sum to
be faster on AVX instructions, enabling this
should be hardly a problem in future versions
of MinIO.
fixes#8012fixes#7859fixes#7642
Continuing from previous PR #9304, comment
is a special key is not present in the
default KV list. Add it explicitly when
tokenizing fields as it may be possible that
some clients might try to set comments.
This PR adds context-based `k=v` splits based
on the sub-system which was obtained, if the
keys are not provided an error will be thrown
during parsing, if keys are provided with wrong
values an error will be thrown. Keys can now
have values which are of a much more complex
form such as `k="v=v"` or `k=" v = v"`
and other variations.
additionally, deprecate unnecessary postgres/mysql
configuration styles, support only
- connection_string for Postgres
- dsn_string for MySQL
All other parameters are removed.
This commit fixes a performance issue caused
by too many calls to the external KMS - i.e.
for single-part PUT requests.
In general, the issue is caused by a sub-optimal
code structure. In particular, when the server
encrypts an object it requests a new data encryption
key from the KMS. With this key it does some key
derivation and encrypts the object content and
ETag.
However, to behave S3-compatible the MinIO server
has to return the plaintext ETag to the client
in case SSE-S3.
Therefore, the server code used to decrypt the
(previously encrypted) ETag again by requesting
the data encryption key (KMS decrypt API) from
the KMS.
This leads to 2 KMS API calls (1 generate key and
1 decrypt key) per PUT operation - while only
one KMS call is necessary.
This commit fixes this by fetching a data key only
once from the KMS and keeping the derived object
encryption key around (for the lifetime of the request).
This leads to a significant performance improvement
w.r.t. to PUT workloads:
```
Operation: PUT
Operations: 161 -> 239
Duration: 28s -> 29s
* Average: +47.56% (+25.8 MiB/s) throughput, +47.56% (+2.6) obj/s
* Fastest: +55.49% (+34.5 MiB/s) throughput, +55.49% (+3.5) obj/s
* 50% Median: +58.24% (+32.8 MiB/s) throughput, +58.24% (+3.3) obj/s
* Slowest: +1.83% (+0.6 MiB/s) throughput, +1.83% (+0.1) obj/s
```
Fixes#8667
In addition to the above, if the user is mapped to a policy or
belongs in a group, the user-info API returns this information,
but otherwise, the API will now return a non-existent user error.
make rest of the Walk() function more predictable,
it was observed that in nominal deployments even
without much workload the drives are generally
slow for respond for readdir operations, for the
sleepDuration factor of 10 this can cause
unexpected slowness in the Listing calls, while
it is good for all other I/O, it may simply slow
down Listing immensely which is not useful.
fixes#9261
In FS mode under Windows, removing an object will not automatically.
remove parent empty prefixes.
The reason is that path.Dir() was used, however filepath.Dir() is
more appropriate since filepath is physical (meaning it operates
on OS filesystem paths)
This is not caught because failure for Windows CI is not caught.
fs-v1 in server mode only checks to see if the path exist, so that it
returns ready before it is indeed ready.
This change adds a check to ensure that the global object api is
available too before reporting ready.
Fixes#9283
It is some times common and convenient to use
just local IPs for testing purposes, 127.0.0.x
are special IPs regardless of being available on
an interface they can be bound to on all operating
systems.
Allow this behavior to work for minio server
fixes#9274
also, bring in an additional policy to ensure that
force delete bucket is only allowed with the right
policy for the user, just DeleteBucketAction
policy action is not enough.
This PR also tries to simplify the approach taken in
object-locking implementation by preferential treatment
given towards full validation.
This in-turn has fixed couple of bugs related to
how policy should have been honored when ByPassGovernance
is provided.
Simplifies code a bit, but also duplicates code intentionally
for clarity due to complex nature of object locking
implementation.
Too many deployments come up with an odd number
of hosts or drives, to facilitate even distribution
among those setups allow for odd and prime numbers
based packs.
- B2 does actually return an MD5 hash for newly uploaded objects
so we can use it to provide better compatibility with S3 client
libraries that assume the ETag is the MD5 hash such as boto.
- depends on change in blazer library.
- new behaviour is only enabled if MinIO's --compat mode is active.
- behaviour for multipart uploads is unchanged (works fine as is).
- Implement a graph algorithm to test network bandwidth from every
node to every other node
- Saturate any network bandwidth adaptively, accounting for slow
and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos
- total number of S3 API calls per server
- maximum wait duration for any S3 API call
This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.
This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
- acquire since leader lock for all background operations
- healing, crawling and applying lifecycle policies.
- simplify lifecyle to avoid network calls, which was a
bug in implementation - we should hold a leader and
do everything from there, we have access to entire
name space.
- make listing, walking not interfere by slowing itself
down like the crawler.
- effectively use global context everywhere to ensure
proper shutdown, in cache, lifecycle, healing
- don't read `format.json` for prometheus metrics in
StorageInfo() call.
- Add conservative timeouts upto 3 minutes
for internode communication
- Add aggressive timeouts of 30 seconds
for gateway communication
Fixes#9105Fixes#8732Fixes#8881Fixes#8376Fixes#9028
This is to improve responsiveness for all
admin API operations and allowing callers
to cancel any on-going admin operations,
if they happen to be waiting too long.
canonicalize the ENVs such that we can bring these ENVs
as part of the config values, as a subsequent change.
- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json`
(avoid conflicts with a bucket named `usage.json` )
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin`
(avoid conflicts with a bucket named `usage.bin`
As an optimization of the healing, HealObjects() avoid sending an
object to the background healing subsystem when the object is
present in all disks.
However, HealObjects() should have checked the scan type, if this
deep, always pass the object to the healing subsystem.
Currently, a tree walking, needed to a list objects in a specific
set quits listing as long as it finds no entries in a disk, which
is wrong.
This affected background healing, because the latter is using
tree walk directly. If one object does not exist in the first
disk for example, it will be seemed like the object does not
exist at all and no healing work is needed.
This commit fixes the behavior.
The staleness of a lock should be determined by
the quorum number of entries returning stale,
this allows for situations when locks are held
when nodes are down - we don't accidentally
clear locks unintentionally when they are valid
and correct.
Also lock maintenance should be run by all servers,
not one server, stale locks need to be run outside
the requirement for holding distributed locks.
Thanks @klauspost for reproducing this issue
Some AWS SDKs latently rely on this value some times
to calculate the right number of parts during a parallel
GetObject request, this is feature used along with
content-range - we should support this as well.
- avoid setting last heal activity when starting self-healing
This can be confusing to users thinking that the self healing
cycle was already performed.
- add info about the next background healing round
OperationTimedout error occurs when locking
timesout, trying to acquire a lock. This
error should be returned appropriately to
the client with http status "408" (request timedout)
This translation was broken, fix it.
Bulk delete API was using cleanupObjectsBulk() which calls posix
listing and delete API to remove objects internal files in the
backend (xl.json and parts) one by one.
Add DeletePrefixes in the storage API to remove the content
of a directory in a single call.
Also use a remove goroutine for each disk to accelerate removal.
Currently the code assumed some orthogonal requirements
which led situations where when we have a setup where
we have let's say for example 168 drives, the final
set_drive_count chosen was 14. Indeed 168 drives are
divisible by 12 but this wasn't allowed due to an
unexpected requirement to have 12 to be a perfect modulo
of 14 which is not possible. This assumption was incorrect.
This PR fixes this old assumption properly, also adds
few tests and some negative tests as well. Improvements
are seen in error messages as well.
- Remove the requirement to honor storage class for deletes
- Improve `posix.DeleteFileBulk` code to Stat the volumeDir
only once per call, rather than for all object paths.