- acquire single leader lock for all background operations:
  healing, crawling and applying lifecycle policies.
- simplify lifecycle to avoid network calls, which was a
  bug in the implementation - we should hold the leader lock
  and do everything from there, since we have access to the
  entire namespace.
- make listing and walking avoid interfering with other
  operations by slowing themselves down, like the crawler does.
- effectively use global context everywhere to ensure
proper shutdown, in cache, lifecycle, healing
- don't read `format.json` for prometheus metrics in
StorageInfo() call.
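For illustration, a minimal Go sketch of the single-leader pattern described above; `tryAcquireLeader` and the loop names are hypothetical stand-ins, not the server's actual internals:

```go
package main

import (
	"context"
	"time"
)

// tryAcquireLeader stands in for acquiring a cluster-wide leader lock
// (hypothetical; the real server uses its internal distributed locker).
func tryAcquireLeader() bool { return true }

func healLoop(ctx context.Context)      { <-ctx.Done() }
func crawlLoop(ctx context.Context)     { <-ctx.Done() }
func lifecycleLoop(ctx context.Context) { <-ctx.Done() }

func main() {
	// One global context drives shutdown of every background service.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	for !tryAcquireLeader() {
		time.Sleep(10 * time.Second) // retry until this node is leader
	}

	// Only the leader runs these; it sees the entire namespace, so
	// applying lifecycle policies needs no cross-node network calls.
	go healLoop(ctx)
	go crawlLoop(ctx)
	go lifecycleLoop(ctx)

	<-ctx.Done() // blocks until shutdown cancels the global context
}
```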
To allow better control of the cache eviction process,
introduce the MINIO_CACHE_WATERMARK_LOW and
MINIO_CACHE_WATERMARK_HIGH environment variables to specify
when to stop/start the cache eviction process.
Deprecate the MINIO_CACHE_EXPIRY environment variable. Cache
GC sweeps at 30-minute intervals whenever the high watermark is
reached, clearing the least recently accessed entries in the cache
until sufficient space is freed to reach the low watermark.
Garbage collection uses an adaptive file scoring approach based
on last access time, with greater weights assigned to larger
objects and those with more hits, to find the candidates for
eviction (a sketch follows below).
Thanks to @klauspost for this file scoring algorithm
Co-authored-by: Klaus Post <klauspost@minio.io>
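A hedged sketch of a watermark-driven sweep with such a scoring function; the exact weights and type names here are assumptions for illustration, not the shipped formula:

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

type cacheEntry struct {
	name       string
	size       int64
	hits       int64
	lastAccess time.Time
}

// evictionScore grows with time since last access, and is weighted up
// for larger objects and objects with more hits, per the description
// above; the exact formula here is an assumption.
func evictionScore(e cacheEntry, now time.Time) float64 {
	age := now.Sub(e.lastAccess).Hours()
	return age * math.Log1p(float64(e.size)) * float64(1+e.hits)
}

// evict drops the highest-scoring entries until usage falls to the
// low watermark, mirroring the high/low watermark sweep.
func evict(entries []cacheEntry, usage, capacity int64, lowWM float64) []string {
	now := time.Now()
	sort.Slice(entries, func(i, j int) bool {
		return evictionScore(entries[i], now) > evictionScore(entries[j], now)
	})
	var victims []string
	for _, e := range entries {
		if float64(usage) <= lowWM*float64(capacity) {
			break
		}
		usage -= e.size
		victims = append(victims, e.name)
	}
	return victims
}

func main() {
	entries := []cacheEntry{
		{"a", 1 << 20, 1, time.Now().Add(-48 * time.Hour)},
		{"b", 4 << 20, 9, time.Now().Add(-1 * time.Hour)},
	}
	fmt.Println(evict(entries, 90, 100, 0.70)) // sweep down to 70%
}
```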
X-Cache is set to a cache status of HIT if the object is
served from the disk cache, or MISS otherwise.
X-Cache-Lookup is set to HIT if the object was found
in the cache even if not served from it (e.g. if the cache
entry was invalidated by ETag verification).
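A minimal sketch of how a handler could set these two headers; the handler shape and helper are illustrative:

```go
package main

import "net/http"

func setCacheHeaders(w http.ResponseWriter, foundInCache, servedFromCache bool) {
	lookup, status := "MISS", "MISS"
	if foundInCache {
		lookup = "HIT" // present in cache, even if ETag check invalidated it
	}
	if servedFromCache {
		status = "HIT" // actually served from the disk cache
	}
	w.Header().Set("X-Cache-Lookup", lookup)
	w.Header().Set("X-Cache", status)
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Found in cache but revalidated and served from the backend.
		setCacheHeaders(w, true, false)
		w.Write([]byte("object body"))
	})
	http.ListenAndServe(":8080", nil)
}
```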
This PR adds support for the below metrics
- Cache Hit Count
- Cache Miss Count
- Data served from Cache (in Bytes)
- Bytes received from AWS S3
- Bytes sent to AWS S3
- Number of requests sent to AWS S3
Fixes #8549
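A sketch of exposing such counters with prometheus/client_golang; the metric names below are illustrative, not necessarily the exact names MinIO exports:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	cacheHits   = prometheus.NewCounter(prometheus.CounterOpts{Name: "cache_hits_total", Help: "Cache hit count"})
	cacheMisses = prometheus.NewCounter(prometheus.CounterOpts{Name: "cache_misses_total", Help: "Cache miss count"})
	cacheServed = prometheus.NewCounter(prometheus.CounterOpts{Name: "cache_data_served_bytes", Help: "Data served from cache in bytes"})
	s3Received  = prometheus.NewCounter(prometheus.CounterOpts{Name: "s3_received_bytes", Help: "Bytes received from AWS S3"})
	s3Sent      = prometheus.NewCounter(prometheus.CounterOpts{Name: "s3_sent_bytes", Help: "Bytes sent to AWS S3"})
	s3Requests  = prometheus.NewCounter(prometheus.CounterOpts{Name: "s3_requests_total", Help: "Requests sent to AWS S3"})
)

func main() {
	prometheus.MustRegister(cacheHits, cacheMisses, cacheServed, s3Received, s3Sent, s3Requests)
	http.Handle("/minio/prometheus/metrics", promhttp.Handler())
	http.ListenAndServe(":9000", nil)
}
```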
Object retention at the object level - this PR builds on
#8120, which added the PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIs.
This PR implements the PutObjectRetention and
GetObjectRetention APIs and enhances the PUT and GET API
operations to display governance metadata if permissions allow.
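For reference, the retention document PutObjectRetention accepts follows the AWS S3 object-lock XML shape; a small Go sketch that builds it (field names per the S3 spec, the rest illustrative):

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// Retention mirrors the S3 object-lock retention body.
type Retention struct {
	XMLName         xml.Name  `xml:"Retention"`
	Mode            string    `xml:"Mode"` // GOVERNANCE or COMPLIANCE
	RetainUntilDate time.Time `xml:"RetainUntilDate"`
}

func main() {
	b, _ := xml.MarshalIndent(Retention{
		Mode:            "GOVERNANCE",
		RetainUntilDate: time.Now().AddDate(0, 0, 30).UTC(),
	}, "", "  ")
	fmt.Println(string(b))
}
```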
This PR moves locking from a global entity to
a more localized set-level entity, allowing locks
to be held only on the resources which are being written
to on a collection of disks rather than at a global level.
In the process, this PR also raises the top-level
limit of 32 nodes to an unlimited number of nodes. This
is a precursor change before bringing in bucket expansion.
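An illustrative sketch of the idea: a key deterministically maps to one erasure set, and only that set's lock is taken (crc32 placement here is an assumption, not the actual hash used):

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sync"
)

type setLockers struct{ locks []sync.RWMutex }

// forKey maps an object deterministically to one set's lock, so
// writers on different sets never contend with each other.
func (s *setLockers) forKey(object string) *sync.RWMutex {
	idx := crc32.ChecksumIEEE([]byte(object)) % uint32(len(s.locks))
	return &s.locks[idx]
}

func main() {
	sets := &setLockers{locks: make([]sync.RWMutex, 16)}
	lk := sets.forKey("bucket/object.txt")
	lk.Lock() // held only for this set's disks, not the whole cluster
	defer lk.Unlock()
	fmt.Println("writing under a set-level lock")
}
```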
This PR refactors object layer handling such
that upon failure in sub-system initialization the
server reaches a stage of safe-mode operation
wherein only certain API operations are enabled
and available.
This allows for fixing many scenarios such as
- incorrect configuration in vault, etcd,
notification targets
- missing files, incomplete config migrations,
inability to read encrypted content, etc.
- any other issues related to notification,
policies, lifecycle etc
- This PR allows config KVS to be validated properly
without being affected by ENV overrides, rejects
invalid values during set operation
- Expands unit tests and refactors the error handling
for notification targets, returns error instead of
ignoring targets for invalid KVS
- Does all the prep-work for implementing safe-mode
style operation for MinIO server, introduces a new
global variable to toggle safe mode based operations
NOTE: this PR itself doesn't provide safe mode operations
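A minimal sketch of how such a global safe-mode toggle could gate handlers; the variable name and gating policy are illustrative, not this PR's actual code:

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// globalSafeMode mirrors the "new global variable" mentioned above;
// the name and policy here are assumptions.
var globalSafeMode atomic.Bool

func guard(h http.HandlerFunc, allowedInSafeMode bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if globalSafeMode.Load() && !allowedInSafeMode {
			http.Error(w, "server is in safe mode; fix configuration first",
				http.StatusServiceUnavailable)
			return
		}
		h(w, r)
	}
}

func main() {
	globalSafeMode.Store(true) // e.g. vault/etcd config failed to load
	http.HandleFunc("/admin/config", guard(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "config APIs stay available to repair the setup")
	}, true))
	http.ListenAndServe(":9000", nil)
}
```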
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support proper claims in the MinIO browser, using STS tokens
- env support for all config parameters, legacy envs are also
supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
- ClearHistory
- RestoreHistory
- ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
fixes encountered during testing.
This change is related to the larger config migration PR;
it is a first-stage change to move our configs to
`cmd/config/` - divided into subsystems
This PR changes the cache-on-PUT behavior to fill the cache in the
background after PutObject completes. This avoids concurrency issues
as in #8219. Also adds cleanup of partially filled cache entries to
prevent cache corruption.
- Fixes #8208
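A minimal sketch of the background-fill-after-PUT behavior under these assumptions (the cache type and helper names are hypothetical):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

type diskCache struct {
	mu      sync.Mutex
	objects map[string][]byte
}

func (c *diskCache) fill(key string, r io.Reader) {
	data, err := io.ReadAll(r)
	if err != nil {
		return // partial fill: drop it rather than corrupt the cache
	}
	c.mu.Lock()
	c.objects[key] = data
	c.mu.Unlock()
}

func putObject(c *diskCache, key string, body []byte) {
	// 1. Complete the backend PUT first (elided here).
	// 2. Then fill the cache asynchronously, so concurrent readers
	//    never observe a half-written entry. Real code would also
	//    track completion of this goroutine.
	go c.fill(key, bytes.NewReader(body))
}

func main() {
	c := &diskCache{objects: map[string][]byte{}}
	putObject(c, "bucket/obj", []byte("hello"))
	fmt.Println("PUT returned; cache fills in the background")
}
```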
Fixes #7458 Fixes #7573 Fixes #7938 Fixes #6934 Fixes #6265 Fixes #6630
This will allow the cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
- All cached content is automatically bitrot protected.
- Avoid ETag verification if a cache-control header
is set and the cached content is still valid.
- This PR changes the cache backend format, and all existing
content will be migrated to the new format. Until the data is
migrated completely, all content will be served from the backend.
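A small sketch of the cache-control shortcut mentioned above, assuming a stored timestamp for each cache entry (helper name illustrative):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// stillValid reports whether a cached entry is fresh per its
// Cache-Control max-age, in which case ETag verification against
// the backend can be skipped.
func stillValid(cacheControl string, cachedAt time.Time) bool {
	for _, d := range strings.Split(cacheControl, ",") {
		d = strings.TrimSpace(d)
		if strings.HasPrefix(d, "max-age=") {
			secs, err := strconv.Atoi(strings.TrimPrefix(d, "max-age="))
			if err == nil {
				return time.Since(cachedAt) < time.Duration(secs)*time.Second
			}
		}
	}
	return false // no freshness info: fall back to ETag verification
}

func main() {
	fmt.Println(stillValid("public, max-age=3600",
		time.Now().Add(-10*time.Minute))) // true: serve from cache
}
```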
- Background Heal routine receives heal requests from a channel, either to
heal format, buckets or objects
- Daily sweeper lists all objects in all buckets; these objects
don't necessarily have read quorum, so they can be removed if
they are unhealable
- Heal daily ops receives objects from the daily sweeper
and sends them to the heal routine.
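An illustrative sketch of this flow: a sweeper feeding a heal channel consumed by one routine (task fields are assumptions):

```go
package main

import (
	"context"
	"fmt"
)

type healTask struct {
	bucket, object string // empty object could mean heal bucket/format
}

func healRoutine(ctx context.Context, tasks <-chan healTask) {
	for {
		select {
		case <-ctx.Done():
			return
		case t := <-tasks:
			fmt.Printf("healing %s/%s\n", t.bucket, t.object)
		}
	}
}

func dailySweeper(ctx context.Context, tasks chan<- healTask) {
	// A real sweeper lists all objects in all buckets; objects that
	// turn out to be unhealable can be removed instead.
	for _, o := range []string{"a.txt", "b.txt"} {
		select {
		case <-ctx.Done():
			return
		case tasks <- healTask{bucket: "mybucket", object: o}:
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	tasks := make(chan healTask)
	done := make(chan struct{})
	go func() { healRoutine(ctx, tasks); close(done) }()
	dailySweeper(ctx, tasks)
	cancel() // global shutdown stops the heal routine
	<-done
}
```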
This will allow cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.
Fixes: #7458, #7573, #6265, #6630
Bulk delete at storage level in Multiple Delete Objects API
In order to accelerate bulk delete in the Multiple Delete Objects API,
a new bulk delete is introduced in the storage layer, which accepts
a list of objects to delete rather than only one. Consequently,
a new API also needs to be added to the Object API.
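A hedged sketch of what such a storage-layer bulk delete could look like; the interface shape is illustrative, not the exact StorageAPI:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

type storage interface {
	// DeleteFileBulk deletes many objects in one call and reports a
	// per-object error so partial failures stay visible.
	DeleteFileBulk(volume string, paths []string) ([]error, error)
}

type posix struct{ root string }

func (p *posix) DeleteFileBulk(volume string, paths []string) ([]error, error) {
	errs := make([]error, len(paths))
	for i, pth := range paths {
		errs[i] = os.Remove(filepath.Join(p.root, volume, pth))
	}
	return errs, nil
}

func main() {
	var s storage = &posix{root: "/tmp/minio"}
	errs, _ := s.DeleteFileBulk("mybucket", []string{"a.txt", "b.txt"})
	fmt.Println(errs)
}
```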
Other listing optimizations include
- remove double sorting while filtering object entries
- improve error message when upload-id is not in quorum
- use jsoniter for full JSON unmarshaling, instead of gjson
- remove unused code
- [x] Support bucket and regular object operations
- [x] Supports Select API on HDFS
- [x] Implement multipart API support
- [x] Completion of ListObjects support
Move CopyObject precondition checks into GetObjectReader
in order to perform SSE-C precondition checks using the
last 32 bytes of the encrypted ETag rather than the decrypted
ETag.
This also necessitates moving precondition checks for
gateways into the gateway layer rather than the object handler.
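A minimal sketch of the trailing-32-bytes comparison, under the assumption that the stored encrypted ETag simply ends in one MD5 hex digest:

```go
package main

import "fmt"

// etagMatches compares an If-Match value against only the trailing 32
// characters (one MD5 hex digest) of the stored encrypted ETag, so no
// decryption is needed for the precondition check. Illustrative only.
func etagMatches(storedEncryptedETag, ifMatch string) bool {
	const n = 32
	if len(storedEncryptedETag) < n {
		return storedEncryptedETag == ifMatch
	}
	return storedEncryptedETag[len(storedEncryptedETag)-n:] == ifMatch
}

func main() {
	stored := "32f1aaf4c72ef85b37125237dd90b55f" + "0123456789abcdef0123456789abcdef"
	fmt.Println(etagMatches(stored, "0123456789abcdef0123456789abcdef")) // true
}
```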
To conform with the AWS S3 spec on ETag for SSE-S3 encrypted objects,
encrypt the client-sent MD5Sum and store it on the backend as the
ETag. Extend this behavior to SSE-C encrypted objects.
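An illustrative sketch of encrypting the client-sent MD5Sum before storing it as the ETag; AES-GCM here is a stand-in, not necessarily the actual encryption format used:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"io"
)

// encryptETag seals the plaintext MD5 ETag under the object key, so
// the stored ETag does not expose the plaintext MD5 for SSE objects.
func encryptETag(key []byte, etag string) (string, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(etag), nil)
	return hex.EncodeToString(sealed), nil
}

func main() {
	key := make([]byte, 32) // object encryption key (derivation elided)
	etag, _ := encryptETag(key, "9bb58f26192e4ba00f01e2e7b136bbd8")
	fmt.Println(etag)
}
```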
The new call combines GetObjectInfo and GetObject, and returns an
object with a ReadCloser interface.
Also adds a number of end-to-end encryption tests at the handler
level.
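A minimal sketch of the combined call shape: info and body returned together, with Close releasing the single lock (types here are illustrative):

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"sync"
)

type ObjectInfo struct {
	Name string
	Size int64
}

// GetObjectReader bundles ObjectInfo with an io.ReadCloser; Close
// also releases the lock held across the whole operation.
type GetObjectReader struct {
	ObjInfo ObjectInfo
	io.ReadCloser
}

type unlockingReader struct {
	io.ReadCloser
	unlock func()
}

func (r unlockingReader) Close() error {
	defer r.unlock()
	return r.ReadCloser.Close()
}

type objectLayer struct{ mu sync.RWMutex }

// GetObjectNInfo stats and opens the object under one lock, fixing
// the race between GetObjectInfo and GetObject.
func (o *objectLayer) GetObjectNInfo(bucket, object string) (*GetObjectReader, error) {
	o.mu.RLock()
	body := io.NopCloser(strings.NewReader("object body"))
	return &GetObjectReader{
		ObjInfo:    ObjectInfo{Name: object, Size: 11},
		ReadCloser: unlockingReader{body, o.mu.RUnlock},
	}, nil
}

func main() {
	ol := &objectLayer{}
	gr, _ := ol.GetObjectNInfo("bucket", "obj")
	defer gr.Close()
	data, _ := io.ReadAll(gr)
	fmt.Println(gr.ObjInfo.Name, len(data))
}
```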
* Revert "Encrypted reader wrapped in NewGetObjectReader should be closed (#6383)"
This reverts commit 53a0bbeb5b.
* Revert "Change SelectAPI to use new GetObjectNInfo API (#6373)"
This reverts commit 5b05df215a.
* Revert "Implement GetObjectNInfo object layer call (#6290)"
This reverts commit e6d740ce09.
This combines calling GetObjectInfo and GetObject while returning a
io.ReadCloser for the object's body. This allows the two operations to
be under a single lock, fixing a race between getting object info and
reading the object body.
Continuing from PR 157ed65c35
Our posix.go implementation did not handle I/O errors
properly on the disks; this led to situations where
top-level callers such as ListObjects might return early
without even verifying all the available disks.
This commit tries to address this for Kubernetes and drbd/nbd
based persistent volumes, which can disconnect under load and
result in situations where disks return I/O errors.
This commit also simplifies the listing operation: listing
never returns any error. We can do this since we pretty
much ignore most of the errors anyway. When objects are
accessed directly, we return proper errors.
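An illustrative sketch of mapping raw I/O errors to a "faulty disk" sentinel so callers can skip the disk instead of failing early:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"syscall"
)

var errFaultyDisk = errors.New("disk is faulty")

// toStorageErr normalizes low-level errors: drbd/nbd-backed volumes
// can disconnect under load and surface EIO, which we translate to a
// sentinel so top-level callers treat the disk as offline.
func toStorageErr(err error) error {
	if err == nil {
		return nil
	}
	var pathErr *fs.PathError
	if errors.As(err, &pathErr) && errors.Is(pathErr.Err, syscall.EIO) {
		return errFaultyDisk
	}
	return err
}

func main() {
	_, err := os.Open("/nonexistent")
	fmt.Println(toStorageErr(err))
}
```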
Better support for HEAD and listing of zero-sized objects with a
trailing slash (a.k.a. empty directories). For that, an isLeafDir
function is added to indicate whether the specified object is an
empty directory or not. Each backend (xl, fs) has the
responsibility to store that information. Currently, in both XL
and FS, an empty directory is represented by an empty directory
in the backend.
isLeafDir() checks whether the given path is an empty directory.
Since directory listing is costly when a directory contains too
many objects, readDirN() is added in this PR to list only N
entries. In isLeafDir(), we will only list one entry to check if
a directory is empty or not.
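A sketch of these two helpers under the stated behavior; this is illustrative, not the actual posix code:

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readDirN returns at most n entry names from dir; n <= 0 means all.
func readDirN(dir string, n int) ([]string, error) {
	f, err := os.Open(dir)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	names, err := f.Readdirnames(n)
	if err == io.EOF {
		err = nil // fewer than n entries is not an error here
	}
	return names, err
}

// isLeafDir reports whether dir is an empty directory (an object with
// a trailing slash); reading a single entry is enough to decide.
func isLeafDir(dir string) bool {
	names, err := readDirN(dir, 1)
	return err == nil && len(names) == 0
}

func main() {
	fmt.Println(isLeafDir(os.TempDir()))
}
```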
This PR adds disk-based edge caching support for the MinIO server.
Cache settings can be configured in config.json with a list of disk
drives, cache expiry in days and file patterns to exclude from the
cache, or via the environment variables MINIO_CACHE_DRIVES,
MINIO_CACHE_EXCLUDE and MINIO_CACHE_EXPIRY.
Design assumes that atime support is enabled and the list of cache
drives is fixed.
- Objects are cached on both GET and PUT/POST operations.
- Expiry is used as a hint to evict older entries from the cache;
eviction also kicks in once 80% of cache capacity is filled.
- When object storage backend is down, GET, LIST and HEAD operations fetch
object seamlessly from cache.
Current Limitations
- Bucket policies are not cached, so anonymous operations are not supported in
offline mode.
- Objects are distributed using deterministic hashing among the list
of cache drives specified. If one or more drives go offline, or the
cache drive configuration is altered, performance could degrade to
a linear lookup.
Fixes #4026
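A sketch of the deterministic placement mentioned in the limitations; crc32 is an illustrative choice, not necessarily the hash used:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// cacheDriveFor maps an object name to one cache drive. The same
// object always maps to the same drive; changing the drive list
// changes the mapping, which is why altering the configuration can
// degrade lookups to a linear scan across drives.
func cacheDriveFor(object string, drives []string) string {
	return drives[crc32.ChecksumIEEE([]byte(object))%uint32(len(drives))]
}

func main() {
	drives := []string{"/mnt/cache1", "/mnt/cache2", "/mnt/cache3"}
	fmt.Println(cacheDriveFor("mybucket/photo.jpg", drives))
}
```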