minio

Commit Graph

Author	SHA1	Message	Date
Poorna	75b925c326	Deprecate root disk for disk caching (#14527 ) This PR modifies #14513 to issue a deprecation warning rather than reject settings on startup.	2022-03-10 18:42:44 -08:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Poorna	1ef8babfef	cache: improve error reported for atime check (#14384 )	2022-02-23 11:57:06 -08:00
yfanswer	f4e373e0d2	de-couple cache completeMultipartUpload with caller context (#14181 )	2022-01-26 11:55:58 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Poorna K	111c6177d2	Deprecate caching for erasure/distributed mode (#13909 ) Fixes: #13907 Also removing default value of `writethrough` for cache commit which was interfering with cache_after setting	2021-12-15 16:48:34 -08:00
Poorna K	0a66a6f1e5	Avoid cache GC of writebacks before commit syncs (#13860 ) Save part.1 for writebacks in a separate folder and move it to cache dir atomically while saving the cache metadata. This is to avoid GC mistaking part.1 as orphaned cache entries and purging them. This PR also fixes object size being overwritten during retries for write-back mode.	2021-12-08 14:52:31 -08:00
Poorna K	d21466f595	cache: in writeback mode skip etag verification (#13781 ) if the commit is still in pending or failed status This PR also does some minor code cleanup	2021-11-30 10:22:42 -08:00
Poorna K	03725dc015	Default multipart caching to writethrough (#13613 ) when `MINIO_CACHE_COMMIT` is set. - `writeback` caching applies only to single uploads. When cache commit mode is `writeback`, default multipart caching to be synchronous. - Add writethrough caching for single uploads	2021-11-10 08:12:03 -08:00
Poorna K	15dcacc1fc	Add support for caching multipart in writethrough mode (#13507 )	2021-11-01 08:11:58 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Harshavardhana	81d5688d56	move the dependency to minio/pkg for common libraries (#12397 )	2021-05-28 15:17:01 -07:00
Harshavardhana	e84f533c6c	add missing wait groups for certain io.Pipe() usage (#12264 ) wait groups are necessary with io.Pipes() to avoid races when a blocking function may not be expected and a Write() -> Close() before Read() races on each other. We should avoid such situations.. Co-authored-by: Klaus Post <klauspost@gmail.com>	2021-05-11 09:18:37 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Andreas Auernhammer	d4b822d697	pkg/etag: add new package for S3 ETag handling (#11577 ) This commit adds a new package `etag` for dealing with S3 ETags. Even though ETag is often viewed as MD5 checksum of an object, handling S3 ETags correctly is a surprisingly complex task. While it is true that the ETag corresponds to the MD5 for the most basic S3 API operations, there are many exceptions in case of multipart uploads or encryption. In worse, some S3 clients expect very specific behavior when it comes to ETags. For example, some clients expect that the ETag is a double-quoted string and fail otherwise. Non-AWS compliant ETag handling has been a source of many bugs in the past. Therefore, this commit adds a dedicated `etag` package that provides functionality for parsing, generating and converting S3 ETags. Further, this commit removes the ETag computation from the `hash` package. Instead, the `hash` package (i.e. `hash.Reader`) should focus only on computing and verifying the content-sha256. One core feature of this commit is to provide a mechanism to communicate a computed ETag from a low-level `io.Reader` to a high-level `io.Reader`. This problem occurs when an S3 server receives a request and has to compute the ETag of the content. However, the server may also wrap the initial body with several other `io.Reader`, e.g. when encrypting or compressing the content: ``` reader := Encrypt(Compress(ETag(content))) ``` In such a case, the ETag should be accessible by the high-level `io.Reader`. The `etag` provides a mechanism to wrap `io.Reader` implementations such that the `ETag` can be accessed by a type-check. This technique is applied to the PUT, COPY and Upload handlers.	2021-02-23 12:31:53 -08:00
Poorna Krishnamoorthy	93fd248b52	fix: save ModTime properly in disk cache (#11522 ) fix #11414	2021-02-11 19:25:47 -08:00
Krishnan Parthasarathi	b87fae0049	Simplify PutObjReader for plain-text reader usage (#11470 ) This change moves away from a unified constructor for plaintext and encrypted usage. NewPutObjReader is simplified for the plain-text reader use. For encrypted reader use, WithEncryption should be called on an initialized PutObjReader. Plaintext: func NewPutObjReader(rawReader hash.Reader) PutObjReader The hash.Reader is used to provide payload size and md5sum to the downstream consumers. This is different from the previous version in that there is no need to pass nil values for unused parameters. Encrypted: func WithEncryption(encReader hash.Reader, key crypto.ObjectKey) (*PutObjReader, error) This method sets up encrypted reader along with the key to seal the md5sum produced by the plain-text reader (already setup when NewPutObjReader was called). Usage: ``` pReader := NewPutObjReader(rawReader) // ... other object handler code goes here // Prepare the encrypted hashed reader pReader, err = pReader.WithEncryption(encReader, objEncKey) ```	2021-02-10 08:52:50 -08:00
Poorna Krishnamoorthy	f3beb1236a	Add cache usage, total capacity to prometheus metrics (#11026 )	2020-12-07 16:35:11 -08:00
Poorna Krishnamoorthy	03fdbc3ec2	Add async caching commit option in diskcache (#10742 ) Add store and a forward option for a single part uploads when an async mode is enabled with env MINIO_CACHE_COMMIT=writeback It defaults to `writethrough` if unspecified.	2020-11-02 10:00:45 -08:00
Kaloyan Raev	df9894e275	avoid caching http ranges in background goroutine (#10724 )	2020-10-26 23:04:48 -07:00
Poorna Krishnamoorthy	46275c6547	cache: rename function declarations (#10763 )	2020-10-26 15:41:24 -07:00
Poorna Krishnamoorthy	0994ed9783	cache: fix call in GetObjectNInfo (#10762 ) Fixes: #10751	2020-10-26 12:30:40 -07:00
Klaus Post	b7438fe4e6	Copy metadata before spawning goroutine + prealloc maps (#10458 ) In `(*cacheObjects).GetObjectNInfo` copy the metadata before spawning a goroutine. Clean up a few map[string]string copies as well, reducing allocs and simplifying the code. Fixes #10426	2020-09-10 11:37:22 -07:00
Klaus Post	650dccfa9e	cache: Only start at high watermark (#10403 ) Currently, cache purges are triggered as soon as the low watermark is exceeded. To reduce IO this should only be done when reaching the high watermark. This simplifies checks and reduces all calls for a GC to go through `dcache.diskSpaceAvailable(size)`. While a comment claims that `dcache.triggerGC <- struct{}{}` was non-blocking I don't see how that was possible. Instead, we add a 1 size to the queue channel and use channel semantics to avoid blocking when a GC has already been requested. `bytesToClear` now takes the high watermark into account to it will not request any bytes to be cleared until that is reached.	2020-09-02 17:48:44 -07:00
Harshavardhana	caad314faa	add ruleguard support, fix all the reported issues (#10335 )	2020-08-24 12:11:20 -07:00
poornas	a2a5ec93d3	fix: use global context for filling cache in the background (#10308 )	2020-08-20 14:23:24 -07:00
Harshavardhana	e7ba78beee	use GlobalContext instead of context.Background when possible (#10254 )	2020-08-13 09:16:01 -07:00
poornas	1b6ba0d062	Add validation in cache for offline drives (#10146 ) closes #10144	2020-07-28 10:06:52 -07:00
poornas	55a3b071ea	Allow optionally to disable range caching. (#9908 ) The default behavior is to cache each range requested to cache drive. Add an environment variable `MINIO_RANGE_CACHE` - when set to off, it disables range caching and instead downloads entire object in the background. Fixes #9870	2020-06-29 13:25:29 -07:00
Harshavardhana	f9aa239973	fix: export prometheus metrics for cache GC triggers (#9815 ) Bonus change to use channel to serialize triggers, instead of using atomic variables. More efficient mechanism for synchronization. Co-authored-by: Nitish Tiwari <nitish@minio.io>	2020-06-15 09:05:35 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Nathan Brown	2af3004409	Use registry to check Atime support on Windows (#9741 )	2020-05-30 09:47:42 -07:00
Harshavardhana	41688a936b	fix: CopyObject behavior on expanded zones (#9729 ) CopyObject was not correctly figuring out the correct destination object location and would end up creating duplicate objects on two different zones, reproduced by doing encryption based key rotation.	2020-05-28 14:36:38 -07:00
Harshavardhana	7ea026ff1d	fix: reply back user-metadata in lower case form (#9697 ) some clients such as veeam expect the x-amz-meta to be sent in lower cased form, while this does indeed defeats the HTTP protocol contract it is harder to change these applications, while these applications get fixed appropriately in future. x-amz-meta is usually sent in lowercased form by AWS S3 and some applications like veeam incorrectly end up relying on the case sensitivity of the HTTP headers. Bonus fixes - Fix the iso8601 time format to keep it same as AWS S3 response - Increase maxObjectList to 50,000 and use maxDeleteList as 10,000 whenever multi-object deletes are needed.	2020-05-25 16:51:32 -07:00
poornas	3202f78f0f	Fix cache metadata update for range GET (#9636 ) This was inadvertently deleting cached ranges because HTTPRangeSpec was not being passed down fixes #9597	2020-05-18 18:33:43 -07:00
Egon Elbre	a5efcbab51	fix: cacheReader.Close in all paths that don't return it. (#9418 )	2020-04-22 12:13:57 -07:00
Harshavardhana	282c9f790a	fix: validate partNumber in queryParam as part of preConditions (#9386 )	2020-04-20 22:01:59 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Harshavardhana	43a3778b45	fix: support object-remaining-retention-days policy condition (#9259 ) This PR also tries to simplify the approach taken in object-locking implementation by preferential treatment given towards full validation. This in-turn has fixed couple of bugs related to how policy should have been honored when ByPassGovernance is provided. Simplifies code a bit, but also duplicates code intentionally for clarity due to complex nature of object locking implementation.	2020-04-06 13:44:16 -07:00
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179 ) - acquire since leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecyle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	2020-03-22 12:16:36 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00
poornas	c93157019f	Allow gc to run in parallel on cache drives (#9051 )	2020-03-03 06:42:26 +03:00
poornas	978bd4e2c4	check cacheControl not nil before access (#9055 ) Fixes: #9053	2020-02-27 10:57:00 -08:00
poornas	224b4f13b8	Add cache eviction low and high watermarks (#8958 ) To allow better control the cache eviction process. Introduce MINIO_CACHE_WATERMARK_LOW and MINIO_CACHE_WATERMARK_HIGH env. variables to specify when to stop/start cache eviction process. Deprecate MINIO_CACHE_EXPIRY environment variable. Cache gc sweeps at 30 minute intervals whenever high watermark is reached to clear least recently accessed entries in the cache until sufficient space is cleared to reach the low watermark. Garbage collection uses an adaptive file scoring approach based on last access time, with greater weights assigned to larger objects and those with more hits to find the candidates for eviction. Thanks to @klauspost for this file scoring algorithm Co-authored-by: Klaus Post <klauspost@minio.io>	2020-02-23 19:03:39 +05:30
poornas	013773065c	Save metadata correctly in cache.json on PUT (#8985 ) fixes #8979	2020-02-13 08:49:32 +05:30
poornas	9b4d46a6ed	evict cached entry for server side copy (#8947 ) Fixes #8942	2020-02-07 14:36:46 -08:00
poornas	278a165674	Allow caching based on a configurable number of hits. (#8891 ) Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-04 09:10:01 +05:30
Harshavardhana	0cbebf0f57	Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892 ) Rename to allow for more such features to come in a more proper hierarchical manner.	2020-01-27 14:12:34 -08:00
poornas	a78e5d4763	Add missing error check in cache GetObjectNInfo (#8889 )	2020-01-24 15:49:16 -08:00
poornas	60e60f68dd	Add support for object locking with legal hold. (#8634 )	2020-01-16 15:41:56 -08:00

1 2 3

115 Commits