minio

mirror of https://github.com/minio/minio.git synced 2025-05-02 07:23:59 -04:00

Author	SHA1	Message	Date
Klaus Post	fa9cf1251b	Improve healing and reporting (#11312 ) * Provide information on actively healing, buckets healed/queued, objects healed/failed. * Add concurrent healing of multiple sets (typically on startup). * Add bucket level resume, so restarts will only heal non-healed buckets. * Print summary after healing a disk is done.	2021-03-04 14:36:23 -08:00
Andreas Auernhammer	1f659204a2	remove GetObject from ObjectLayer interface (#11635 ) This commit removes the `GetObject` method from the `ObjectLayer` interface. The `GetObject` method is no longer used by the HTTP handlers implementing the high-level S3 semantics. Instead, they use the `GetObjectNInfo` method, which returns both an object handle and the object metadata. Therefore, it is no longer necessary that a concrete `ObjectLayer` implements `GetObject`.	2021-02-26 09:52:02 -08:00
Andreas Auernhammer	d4b822d697	pkg/etag: add new package for S3 ETag handling (#11577 ) This commit adds a new package `etag` for dealing with S3 ETags. Even though ETag is often viewed as MD5 checksum of an object, handling S3 ETags correctly is a surprisingly complex task. While it is true that the ETag corresponds to the MD5 for the most basic S3 API operations, there are many exceptions in case of multipart uploads or encryption. Worse, some S3 clients expect very specific behavior when it comes to ETags. For example, some clients expect that the ETag is a double-quoted string and fail otherwise. Non-AWS compliant ETag handling has been a source of many bugs in the past. Therefore, this commit adds a dedicated `etag` package that provides functionality for parsing, generating and converting S3 ETags. Further, this commit removes the ETag computation from the `hash` package. Instead, the `hash` package (i.e. `hash.Reader`) should focus only on computing and verifying the content-sha256. One core feature of this commit is to provide a mechanism to communicate a computed ETag from a low-level `io.Reader` to a high-level `io.Reader`. This problem occurs when an S3 server receives a request and has to compute the ETag of the content. However, the server may also wrap the initial body with several other `io.Reader`, e.g. when encrypting or compressing the content: ``` reader := Encrypt(Compress(ETag(content))) ``` In such a case, the ETag should be accessible by the high-level `io.Reader`. The `etag` package provides a mechanism to wrap `io.Reader` implementations such that the `ETag` can be accessed by a type-check. This technique is applied to the PUT, COPY and Upload handlers.	2021-02-23 12:31:53 -08:00
Harshavardhana	be7de911c4	fix: update minio-go to fix an issue with S3 gateway (#11591 ) Since we have changed our default envs to MINIO_ROOT_USER and MINIO_ROOT_PASSWORD, which the minio-go credentials package did not support, update minio-go to v7.0.10 for this support. This also addresses a few bugs where users had to specify AWS_ACCESS_KEY_ID as well to authenticate with their S3 backend if they only used MINIO_ROOT_USER.	2021-02-20 11:10:21 -08:00
Poorna Krishnamoorthy	e6b4ea7618	More fixes for delete marker replication (#11504 ) continuation of PR#11491 for multiple server pools and bi-directional replication. Moving proxying for GET/HEAD to handler level rather than server pool layer as this was also causing incorrect proxying of HEAD. Also fixing metadata update on CopyObject - minio-go was not passing source version ID in X-Amz-Copy-Source header	2021-02-10 17:25:04 -08:00
Krishnan Parthasarathi	b87fae0049	Simplify PutObjReader for plain-text reader usage (#11470 ) This change moves away from a unified constructor for plaintext and encrypted usage. NewPutObjReader is simplified for the plain-text reader use. For encrypted reader use, WithEncryption should be called on an initialized PutObjReader. Plaintext: func NewPutObjReader(rawReader hash.Reader) PutObjReader The hash.Reader is used to provide payload size and md5sum to the downstream consumers. This is different from the previous version in that there is no need to pass nil values for unused parameters. Encrypted: func WithEncryption(encReader hash.Reader, key crypto.ObjectKey) (*PutObjReader, error) This method sets up encrypted reader along with the key to seal the md5sum produced by the plain-text reader (already setup when NewPutObjReader was called). Usage: ``` pReader := NewPutObjReader(rawReader) // ... other object handler code goes here // Prepare the encrypted hashed reader pReader, err = pReader.WithEncryption(encReader, objEncKey) ```	2021-02-10 08:52:50 -08:00
Harshavardhana	f108873c48	fix: replication metadata comparison and other fixes (#11410 ) - using miniogo.ObjectInfo.UserMetadata is not correct - using UserTags from Map->String() can change order - ContentType comparison needs to be removed. - Compare both lowercase and uppercase key names. - do not silently error out constructing PutObjectOptions if tag parsing fails - avoid notification for empty object info, failed operations should rely on valid objInfo for notification in all situations - optimize copyObject implementation, also introduce a new replication event - clone ObjectInfo() before scheduling for replication - add additional headers for comparison - remove strings.EqualFold comparison avoid unexpected bugs - fix pool based proxying with multiple pools - compare only specific metadata Co-authored-by: Poorna Krishnamoorthy <poornas@users.noreply.github.com>	2021-02-03 20:41:33 -08:00
Andreas Auernhammer	838d4dafbd	gateway: don't use encrypted ETags for If-Match (#11400 ) This commit fixes a bug in the S3 gateway that causes GET requests to fail when the object is encrypted by the gateway itself. The gateway was not able to GET the object since it always specified an `If-Match` pre-condition checking that the object ETag matches an expected ETag - even for encrypted ETags. The problem is that an encrypted ETag will never match the ETag computed by the backend, causing the `If-Match` pre-condition to fail. This commit fixes this by not sending an `If-Match` header when the ETag is encrypted. This is acceptable because: 1. A gateway-encrypted object consists of two objects at the backend and there is no way to provide a concurrency-safe implementation of two consecutive S3 GETs in the deployment model of the S3 gateway. Ref: S3 gateways are self-contained and isolated - and there may be multiple instances at the same time (no lock across instances). 2. Even if the data object changes (concurrent PUT) while gateway A has downloaded the metadata object (but not issued the GET to the data object => data race) then we don't return invalid data to the client since the decryption (of the currently uploaded data) will fail - given the metadata of the previous object.	2021-02-01 23:02:08 -08:00
Anis Elleuch	e96fdcd5ec	tagging: Add event notif for PUT object tagging (#11366 ) An optimization to avoid double calling for during PutObject tagging	2021-02-01 13:52:51 -08:00
Ritesh H Shukla	b4add82bb6	Updated Prometheus metrics (#11141 ) * Add metrics for nodes online and offline * Add cluster capacity metrics * Introduce v2 metrics	2021-01-18 20:35:38 -08:00
Harshavardhana	cb0eaeaad8	feat: migrate to ROOT_USER/PASSWORD from ACCESS/SECRET_KEY (#11185 )	2021-01-05 10:22:57 -08:00
Harshavardhana	e7ae49f9c9	fix: calculate prometheus disks_offline/disks_total correctly (#11215 ) fixes #11196	2021-01-04 09:42:09 -08:00
Anis Elleuch	cffdb01279	azure/s3 gateways: Pass ETag during GET call to avoid data corruption (#11024 ) Both Azure & S3 gateways call for object information before returning the stream of the object, however, the object content/length could be modified meanwhile, which means it can return a corrupted object. Use ETag to ensure that the object was not modified during the GET call	2020-12-17 09:11:14 -08:00
Poorna Krishnamoorthy	251c1ef6da	Add support for replication of object tags, retention metadata (#10880 )	2020-11-19 18:56:09 -08:00
Harshavardhana	0f9e125cf3	fix: check for gateway backend online without http request (#10924 ) fixes #10921	2020-11-19 10:38:02 -08:00
Steven Reitsma	54120107ce	fix: infinite loop in cleanupStaleUploads of encrypted MPUs (#10845 ) fixes #10588	2020-11-06 11:53:42 -08:00
Steven Reitsma	74f7cf24ae	fix: s3 gateway SSE pagination (#10840 ) Fixes #10838	2020-11-05 15:04:03 -08:00
Harshavardhana	80fab03b63	fix: S3 gateway doesn't support full passthrough for encryption (#10484 ) The entire encryption layer is dependent on the fact that KMS should be configured for S3 encryption to work properly and we only support passing the headers as is to the backend for encryption only if KMS is configured. Make sure that this predictability is maintained; currently the code was allowing encryption to go through and fail later to indicate that KMS was not configured. We should simply reply "NotImplemented" if KMS is not configured, this allows clients to simply proceed with their tests.	2020-09-15 13:57:15 -07:00
Harshavardhana	0104af6bcc	delayed locks until we have started reading the body (#10474 ) This is to ensure that Go contexts work properly, after some interesting experiments I found that Go net/http doesn't cancel the context when Body is non-zero and hasn't been read till EOF. The following gist explains this, this can lead to pile up of go-routines on the server which will never be canceled and will die at a really later point in time, which can simply overwhelm the server. https://gist.github.com/harshavardhana/c51dcfd055780eaeb71db54f9c589150 To avoid this refactor the locking such that we take locks after we have started reading from the body and only take locks when needed. Also, remove contextReader as it's not useful, doesn't work as expected context is not canceled until the body reaches EOF so there is no point in wrapping it with context and putting a `select {` on it which can unnecessarily increase the CPU overhead. We will still use the context to cancel the lockers etc. Additional simplification in the locker code to avoid timers as re-using them is a complicated ordeal avoid them in the hot path, since locking is very common this may avoid lots of allocations.	2020-09-14 15:57:13 -07:00
Harshavardhana	6a0372be6c	cleanup tmpDir any older entries automatically just like multipart (#10439 ) also consider multipart uploads, temporary files in `.minio.sys/tmp` as stale beyond 24hrs and clean them up automatically	2020-09-08 15:55:40 -07:00
飞雪无情	ea1803417f	Use constants for gateway names to avoid bugs caused by spelling. (#10355 )	2020-08-26 08:52:46 -07:00
poornas	c43da3005a	Add support for server side bucket replication (#9882 )	2020-07-21 17:49:56 -07:00
Harshavardhana	ec06089eda	fix: re-implement cluster healthcheck (#10101 )	2020-07-20 18:31:22 -07:00
Harshavardhana	14b1c9f8e4	fix: return Range errors after If-Matches (#10045 ) closes #7292	2020-07-17 13:01:22 -07:00
Harshavardhana	4bfc50411c	fix: return versionId in tagging APIs (#10068 )	2020-07-16 22:38:58 -07:00
Anis Elleuch	778e9c864f	Move dependency from minio-go v6 to v7 (#10042 )	2020-07-14 09:38:05 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
kannappanr	225b812b5e	Update minio-go library to latest (#9813 )	2020-06-12 10:18:42 -07:00
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	2020-05-28 13:03:04 -07:00
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages avoids 100's of stats which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateway's simplifies the approach since azure, gcs, hdfs gateway's don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized, additionally just avoids small allocations of large data structure. Fixes #7206	2020-05-28 12:36:20 -07:00
P R	3f6d624c7b	add gateway object tagging support (#9124 )	2020-05-23 11:09:35 -07:00
Harshavardhana	1bc32215b9	enable full linter across the codebase (#9620 ) enable linter using golangci-lint across codebase to run a bunch of linters together, we shall enable new linters as we fix more things in the codebase. This PR fixes the first stage of this cleanup.	2020-05-18 09:59:45 -07:00
kannappanr	a62572fb86	Check for address flags in all positions (#9615 ) Fixes #9599	2020-05-17 08:46:23 -07:00
Harshavardhana	d348ec0f6c	avoid double listObjectParts calls improves performance (#9606 ) this PR is to avoid double calls across multiple calls in APIs - CopyObjectPart - PutObjectPart	2020-05-15 08:06:45 -07:00
kannappanr	6c1bbf918d	do not add quotes around etag, if already present (#9603 )	2020-05-14 17:43:54 -07:00
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	2020-05-08 13:44:44 -07:00
Harshavardhana	282c9f790a	fix: validate partNumber in queryParam as part of preConditions (#9386 )	2020-04-20 22:01:59 -07:00
Harshavardhana	69fb68ef0b	fix simplify code to start using context (#9350 )	2020-04-16 10:56:18 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	2020-03-27 21:52:59 -07:00
Harshavardhana	3d3beb6a9d	Add response header timeouts (#9170 ) - Add conservative timeouts up to 3 minutes for internode communication - Add aggressive timeouts of 30 seconds for gateway communication Fixes #9105 Fixes #8732 Fixes #8881 Fixes #8376 Fixes #9028	2020-03-21 22:10:13 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00
Harshavardhana	e3b44c3829	Remove partName, partETag requirement (#9044 ) This is a precursor change before versioning, removes/deprecates the requirement of remembering partName and partETag which are not useful after a multipart transaction has finished. This PR reduces the overall size of the backend JSON for large file uploads.	2020-03-03 03:29:30 +03:00
poornas	224b4f13b8	Add cache eviction low and high watermarks (#8958 ) To allow better control of the cache eviction process. Introduce MINIO_CACHE_WATERMARK_LOW and MINIO_CACHE_WATERMARK_HIGH env. variables to specify when to stop/start cache eviction process. Deprecate MINIO_CACHE_EXPIRY environment variable. Cache gc sweeps at 30 minute intervals whenever high watermark is reached to clear least recently accessed entries in the cache until sufficient space is cleared to reach the low watermark. Garbage collection uses an adaptive file scoring approach based on last access time, with greater weights assigned to larger objects and those with more hits to find the candidates for eviction. Thanks to @klauspost for this file scoring algorithm Co-authored-by: Klaus Post <klauspost@minio.io>	2020-02-23 19:03:39 +05:30
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-20 09:21:33 +05:30
Nitish Tiwari	63be4709b7	Add metrics support for Azure & GCS Gateway (#8954 ) We added support for caching and S3 related metrics in #8591. As a continuation, it would be helpful to add support for Azure & GCS gateway related metrics as well.	2020-02-11 21:08:01 +05:30
Harshavardhana	0cbebf0f57	Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892 ) Rename to allow for more such features to come in a more proper hierarchical manner.	2020-01-27 14:12:34 -08:00
Praveen raj Mani	5d09233115	Fix Readiness check (#8681 ) - Remove goroutine-check in Readiness check - Bring in quorum check for readiness Fixes #8385 Co-authored-by: Harshavardhana <harsha@minio.io>	2019-12-28 22:24:43 +05:30
Harshavardhana	c8d82588c2	Fix crash in console logger and also handle bucket DNS updates (#8654 ) Also fix listenBucketNotification bugs seen by minio-js listen bucket notification API.	2019-12-16 20:30:57 -08:00
Nitish Tiwari	3df7285c3c	Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591 ) This PR adds support for the metrics below - Cache Hit Count - Cache Miss Count - Data served from Cache (in Bytes) - Bytes received from AWS S3 - Bytes sent to AWS S3 - Number of requests sent to AWS S3 Fixes #8549	2019-12-05 23:16:06 -08:00


120 Commits