This PR adds disk based edge caching support for minio server. Cache settings can be configured in config.json to take list of disk drives, cache expiry in days and file patterns to exclude from cache or via environment variables MINIO_CACHE_DRIVES, MINIO_CACHE_EXCLUDE and MINIO_CACHE_EXPIRY Design assumes that Atime support is enabled and the list of cache drives is fixed. - Objects are cached on both GET and PUT/POST operations. - Expiry is used as hint to evict older entries from cache, or if 80% of cache capacity is filled. - When object storage backend is down, GET, LIST and HEAD operations fetch object seamlessly from cache. Current Limitations - Bucket policies are not cached, so anonymous operations are not supported in offline mode. - Objects are distributed using deterministic hashing among list of cache drives specified.If one or more drives go offline, or cache drive configuration is altered - performance could degrade to linear lookup. Fixes #4026
2.0 KiB
Disk based caching
Disk caching can be turned on by updating the "cache" config
settings for minio server. By default, this is at ${HOME}/.minio
.
"cache" takes the drives location, duration to expiry (in days) and any wildcard patterns to exclude certain content from cache as configuration settings.
"cache": {
"drives": ["/path/drive1", "/path/drive2", "/path/drive3"],
"expiry": 30,
"exclude": ["*.png","bucket1/a/b","bucket2/*"]
},
The cache settings can also be set by the environment variables below. When set, environment variables override any cache settings in config.json
export MINIO_CACHE_DRIVES="/drive1;/drive2;/drive3"
export MINIO_CACHE_EXPIRY=90
export MINIO_CACHE_EXCLUDE="pattern1;pattern2;pattern3"
-
Cache size is 80% of drive capacity. Disk caching requires Atime support to be enabled on the cache drive.
-
Expiration of entries takes user provided expiry as a hint, and defaults to 90 days if not provided.
-
Garbage collection sweep of the expired entries happens whenever disk usage is > 80% of drive capacity until sufficient disk space has been freed.
-
Object is cached only when drive has sufficient disk space for 100 times the size of current object
Behavior
Disk caching happens on both GET and PUT operations.
-
GET caches new objects for entries not found in cache. Otherwise serves from the cache.
-
PUT/POST caches all successfully uploaded objects. Replaces existing cached entry for the same object if needed.
When an object is deleted, it is automatically cleared from the cache.
NOTE: Expiration happens automatically based on the configured interval as explained above, frequently accessed objects stay alive in cache for a significantly longer time on every cache hit.
The following caveats apply for offline mode
- GET, LIST and HEAD operations will be served from the disk cache.
- PUT operations are disallowed when gateway backend is offline.
- Anonymous operations are not implemented as of now.