Currently, cache purges are triggered as soon as the low watermark is exceeded.
To reduce IO, this should only be done when reaching the high watermark.
This simplifies checks and routes all GC requests through
`dcache.diskSpaceAvailable(size)`. While a comment claims that
`dcache.triggerGC <- struct{}{}` was non-blocking, I don't see how
that was possible. Instead, we give the queue channel a buffer size of 1
and use channel semantics to avoid blocking when a GC has
already been requested.
`bytesToClear` now takes the high watermark into account, so it will
not request any bytes to be cleared until that is reached.
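The non-blocking trigger described above can be sketched as follows. This is a minimal illustration, not the actual cache code: `requestGC` is a hypothetical helper, and the only assumption taken from the text is a trigger channel with a buffer size of 1.

```go
package main

import "fmt"

// requestGC is a hypothetical helper. It queues a GC request on a channel
// with buffer size 1 and never blocks. It reports whether a new request was
// queued; false means a request was already pending.
func requestGC(trigger chan struct{}) bool {
	select {
	case trigger <- struct{}{}:
		return true
	default:
		// Buffer is full: a GC has already been requested, so drop this one.
		return false
	}
}

func main() {
	trigger := make(chan struct{}, 1)
	fmt.Println(requestGC(trigger)) // true: first request is queued
	fmt.Println(requestGC(trigger)) // false: a request is already pending
	<-trigger                       // the GC worker picks up the request
	fmt.Println(requestGC(trigger)) // true: the queue is free again
}
```

A plain send on an unbuffered channel blocks until a receiver is ready, which is why the `select` with a `default` branch is needed here.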
I have built a fuzz test; the server crashes heavily within seconds and runs out of memory shortly after.
It seems that supporting Parquet is basically a completely open way to crash the
server for anyone who can upload a file and run S3 Select on it.
Until Parquet is more hardened it is DISABLED by default since hostile
crafted input can easily crash the server.
If you are in a controlled environment where it is safe to assume no hostile
content can be uploaded to your cluster you can safely enable Parquet.
To enable Parquet set the environment variable `MINIO_API_SELECT_PARQUET=on`
while starting the MinIO server.
Furthermore, we guard the Parquet code paths with `recover` functions.
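A recover guard of this kind might look like the sketch below. `guard` and the panicking `decode` function are hypothetical names used only for illustration; the real reader code is not shown in this description.

```go
package main

import "fmt"

// guard is a hypothetical helper: it runs fn and converts a panic raised
// inside it into an ordinary error instead of crashing the process.
func guard(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("parquet: recovered from panic: %v", r)
		}
	}()
	return fn()
}

func main() {
	// Simulate a decoder that indexes past the end of a crafted input.
	decode := func(b []byte) error {
		_ = b[8] // panics: index out of range
		return nil
	}
	err := guard(func() error { return decode([]byte{0x50, 0x41}) })
	fmt.Println("request failed cleanly:", err != nil)
}
```

Note that `recover` only catches panics. It cannot stop a runaway allocation from exhausting memory, which is one reason to keep the feature disabled by default.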
Generalize replication target management so
that remote targets for a bucket can be
managed with ARNs. `mc admin bucket remote`
command will be used to manage targets.
Readiness has no reason to be cluster-scoped,
because that is not how k8s networking works
for pods: the pods of a deployment do not
share the network as a singleton. Instead, they
run in scopes local to themselves, and on
readiness failure a pod is potentially taken
out of the network and stops being resolvable. This
affects the distributed setup in a myriad of
different ways.
Instead, readiness should behave like liveness,
with local scope alone, and should be a dummy
implementation.
With this PR, startup times, and overall k8s
startup time in particular, improve dramatically.
Added another handler, `/minio/health/cluster`,
to report cluster-scope health.
The default behavior is to cache each requested range
on the cache drive. Add an environment variable,
`MINIO_RANGE_CACHE`; when set to off, it disables
range caching and instead downloads the entire object
in the background.
Fixes #9870
This PR has the following changes
- Removing duplicate lookupConfigs() calls.
- Deprecate admin config APIs for NAS gateways. This will avoid repeated reloads of the config from the disk.
- WatchConfigNASDisk will be removed
- Migration guide for NAS gateways users to migrate to ENV settings.
NOTE: THIS PR HAS A BREAKING CHANGE
Fixes #9875
Co-authored-by: Harshavardhana <harsha@minio.io>
* Just read files from args (more than 1 now supported)
* Pretty print by default. `-ndjson` will disable.
* Check header.
* Support stdin as '-'
* Don't just ignore errors.
Bonus change to use channel to serialize triggers,
instead of using atomic variables. More efficient
mechanism for synchronization.
Co-authored-by: Nitish Tiwari <nitish@minio.io>
- Implement a new xl.json 2.0.0 format. This moves
the entire marshaling logic to the POSIX
layer; the top layer always consumes a common FileInfo
construct, which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes #2111
Bonus fixes in quota enforcement to use the
new data structure and to use timedValue to cache
a value and reload it automatically, which avoids
one more global variable.
No one really uses FS for large-scale usage
accounting, nor do we crawl in NAS gateway mode. It is
worthwhile to simply disable this feature, as it is
not useful for anyone.
Bonus: disable bucket quota ops as well in FS
and gateway mode.
This commit fixes a layout issue w.r.t. the KMS
Quickstart guide. The problem seems to be caused
by docs server not converting the markdown into html
as expected.
This commit fixes this by converting the ordered list
into subsections.
This commit simplifies the KMS configuration guide by
adding a get started section that uses our KES play instance
at `https://play.min.io:7373`.
Further, it removes sections that we don't recommend for production
anyways (MASTER_KEY).
This commit updates the two client env. variables:
```
KES_CLIENT_TLS_KEY_FILE
KES_CLIENT_TLS_CERT_FILE
```
The KES CLI client expects the client key and certificate
as `KES_CLIENT_KEY` and `KES_CLIENT_CERT`, respectively.
S3 is now natively supported by the B2 cloud storage provider, so
there is no reason to use a specialized gateway for B2 anymore.
Our current S3 gateway with caching works with B2.
Resolves #8584
This commit updates the KMS guide to reflect the
latest changes in KES. Based on internal design
meetings we made some adjustments to the overall
KES configuration.
This commit ensures that the KMS guide contains
a working KES demo-setup with Vault.
Global WORM mode is a complex piece for which
the time has passed. With the advent of the S3-compatible
object locking and retention implementation, global
WORM is effectively deprecated. This has been mentioned
in our documentation for some time; now the time
has come for it to go.
The OSS Go SDK lacks licensing terms in its
repository, and there has been no activity
on the issue at https://github.com/aliyun/aliyun-oss-go-sdk/issues/245
This PR ensures we remove any dependency code that
lacks an explicit license file in its repo.
The new value defaults to 100K events,
but users can tune it up to any value
they deem necessary.
* Increase the limit to maxint64 while validating
Add two new configuration entries, api.requests-max and
api.requests-deadline, which have the same role as
MINIO_API_REQUESTS_MAX and MINIO_API_REQUESTS_DEADLINE.
- Removes PerfInfo admin API as its not OBDInfo
- Keep the drive path without the metaBucket in OBD
global latency map.
- Remove all the unused code related to PerfInfo API
- Do not redefine global mib, gib constants; always use
humanize.MiByte and humanize.GiByte instead
This PR adds context-based `k=v` splits based
on the sub-system which was obtained. If the
keys are not provided, an error will be thrown
during parsing; if keys are provided with wrong
values, an error will be thrown as well. Keys can now
have values of a much more complex
form, such as `k="v=v"` or `k=" v = v"`
and other variations.
Additionally, deprecate unnecessary postgres/mysql
configuration styles and support only
- connection_string for Postgres
- dsn_string for MySQL
All other parameters are removed.
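To illustrate the `k=v` splitting described above, here is a minimal sketch of a parser that tolerates `=` and spaces inside quoted values. `parseKVs` and the `validKeys` map (standing in for the keys a sub-system accepts) are hypothetical, and the `table` key in the usage is illustrative only. The real parser may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// parseKVs splits s into key=value pairs. Whitespace separates pairs unless
// it appears inside double quotes. Keys must be present in validKeys.
func parseKVs(s string, validKeys map[string]bool) (map[string]string, error) {
	var fields []string
	var cur strings.Builder
	inQuote := false
	for _, r := range s {
		switch {
		case r == '"':
			inQuote = !inQuote
			cur.WriteRune(r)
		case unicode.IsSpace(r) && !inQuote:
			if cur.Len() > 0 {
				fields = append(fields, cur.String())
				cur.Reset()
			}
		default:
			cur.WriteRune(r)
		}
	}
	if inQuote {
		return nil, errors.New("unterminated quote")
	}
	if cur.Len() > 0 {
		fields = append(fields, cur.String())
	}

	kvs := make(map[string]string)
	for _, f := range fields {
		// Split on the first '=' only, so values may contain '='.
		k, v, ok := strings.Cut(f, "=")
		if !ok || k == "" {
			return nil, fmt.Errorf("missing key in %q", f)
		}
		if !validKeys[k] {
			return nil, fmt.Errorf("unknown key %q", k)
		}
		kvs[k] = strings.Trim(v, `"`)
	}
	return kvs, nil
}

func main() {
	valid := map[string]bool{"connection_string": true, "table": true}
	kvs, err := parseKVs(`connection_string="host=localhost port=5432" table=events`, valid)
	fmt.Println(kvs["connection_string"], err) // host=localhost port=5432 <nil>
}
```

Splitting on the first `=` only is what lets a single `connection_string` carry an entire set of driver parameters.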
Too many deployments come up with an odd number
of hosts or drives. To facilitate even distribution
among those setups, allow packs based on odd and
prime numbers.
- B2 does actually return an MD5 hash for newly uploaded objects
so we can use it to provide better compatibility with S3 client
libraries that assume the ETag is the MD5 hash such as boto.
- depends on change in blazer library.
- new behaviour is only enabled if MinIO's --compat mode is active.
- behaviour for multipart uploads is unchanged (works fine as is).
- Implement a graph algorithm to test network bandwidth from every
node to every other node
- Saturate any network bandwidth adaptively, accounting for slow
and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos
- total number of S3 API calls per server
- maximum wait duration for any S3 API call
This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.
This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
Canonicalize the ENVs so that we can bring them in
as part of the config values in a subsequent change.
- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json`
(avoid conflicts with a bucket named `usage.json` )
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin`
(avoid conflicts with a bucket named `usage.bin`)
This commit fixes the env. variable in the
KMS guide used to specify the CA certificates
for the KES server.
Previously, the env. variable `MINIO_KMS_KES_CAPATH` was
used, which works in non-containerized environments
due to how MinIO merges the config file and environment
variables. In containerized environments (e.g. docker)
this does not work, and specifying `MINIO_KMS_KES_CAPATH`
instead of `MINIO_KMS_KES_CA_PATH` eventually leads to MinIO not
trusting the certificate presented by the KES server.
See: cfd12914e1/cmd/crypto/config.go (L186)
To allow better control of the cache eviction process,
introduce the MINIO_CACHE_WATERMARK_LOW and
MINIO_CACHE_WATERMARK_HIGH env. variables to specify
when to stop/start the cache eviction process.
Deprecate MINIO_CACHE_EXPIRY environment variable. Cache
gc sweeps at 30 minute intervals whenever high watermark is
reached to clear least recently accessed entries in the cache
until sufficient space is cleared to reach the low watermark.
Garbage collection uses an adaptive file scoring approach based
on last access time, with greater weights assigned to larger
objects and those with more hits to find the candidates for eviction.
Thanks to @klauspost for this file scoring algorithm
Co-authored-by: Klaus Post <klauspost@minio.io>
We added support for caching and S3 related metrics in #8591. As
a continuation, it would be helpful to add support for Azure & GCS
gateway related metrics as well.
This commit updates the KMS getting started guide
and replaces the legacy MinIO<-->Vault setup with a
MinIO<-->KES<-->Vault setup.
To that end, add some architecture ASCII diagrams and
provide a step-by-step guide to set up Vault, KES and
MinIO such that MinIO can encrypt objects with KES +
Vault.
The legacy Vault guide has been moved to `./vault-legacy.md`.
Co-authored-by: Harshavardhana <harsha@minio.io>
Fixes scenario where zones are appropriately
handled, along with supporting overriding set
count. The new fix also ensures that we handle
the various setup types properly.
Update documentation to properly indicate the
behavior.
Fixes #8750
Co-authored-by: Nitish Tiwari <nitish@minio.io>
This is to ensure that when multiple tenants are
deployed, all sharing the same etcd for global buckets,
they avoid listing each other's buckets. Listing them leads
to an information leak, which should be avoided unless
etcd is not namespaced for IAM assets, in which case
it can be assumed that it is a federated setup.
A federated setup with namespaced IAM assets on etcd
is not supported, since namespacing is only useful
when you wish to separate the tenants as isolated
instances of MinIO.
This PR allows a new type of behavior, primarily
driven by the usecase of m3(mkube) multi-tenant
deployments with global bucket support.
Final update to all messages across sub-systems
after final review. The only change here is that
NATS now has TLS and TLSSkipVerify, to be consistent
with all other notification targets.
This PR adds support for the metrics below:
- Cache Hit Count
- Cache Miss Count
- Data served from Cache (in Bytes)
- Bytes received from AWS S3
- Bytes sent to AWS S3
- Number of requests sent to AWS S3
Fixes #8549
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIs.
This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
- Migrate and save only settings which are enabled
- Rename logger_http to logger_webhook and
logger_http_audit to audit_webhook
- No more pretty printing comments, comment
is a key=value pair now.
- Avoid quotes on values which do not have space in them
- `state="on"` is implicit for all SetConfigKV unless
specified explicitly as `state="off"`
- Disabled IAM users should be disabled always
This PR moves locking from a global entity to
a more localized set-level entity, allowing
locks to be held only on the resources which are writing
to a collection of disks rather than at a global level.
In the process, this PR also lifts the top-level
limit of 32 nodes, allowing an unlimited number of nodes. This
is a precursor change before bringing in bucket expansion.
This PR fixes issues found in config migration
- StorageClass migration error when rrs is empty
- Plain-text migration of older config
- Do not run in safe mode with incorrect credentials
- Update logger_http documentation for _STATE env
Refer more reported issues at #8434
This PR refactors object layer handling such
that upon failure in sub-system initialization
the server enters a safe-mode operation stage
wherein only certain API operations are enabled
and available.
This allows for fixing many scenarios such as
- incorrect configuration in vault, etcd,
notification targets
- missing files, incomplete config migrations,
unable to read encrypted content etc
- any other issues related to notification,
policies, lifecycle etc