minio

Commit Graph

Author	SHA1	Message	Date
Krishnan Parthasarathi	c829e3a13b	Support for remote tier management (#12090 ) With this change, MinIO's ILM supports transitioning objects to a remote tier. This change includes support for Azure Blob Storage, AWS S3 compatible object storage incl. MinIO and Google Cloud Storage as remote tier storage backends. Some new additions include: - Admin APIs remote tier configuration management - Simple journal to track remote objects to be 'collected' This is used by object API handlers which 'mutate' object versions by overwriting/replacing content (Put/CopyObject) or removing the version itself (e.g DeleteObjectVersion). - Rework of previous ILM transition to fit the new model In the new model, a storage class (a.k.a remote tier) is defined by the 'remote' object storage type (one of s3, azure, GCS), bucket name and a prefix. * Fixed bugs, review comments, and more unit-tests - Leverage inline small object feature - Migrate legacy objects to the latest object format before transitioning - Fix restore to particular version if specified - Extend SharedDataDirCount to handle transitioned and restored objects - Restore-object should accept version-id for version-suspended bucket (#12091) - Check if remote tier creds have sufficient permissions - Bonus minor fixes to existing error messages Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Krishna Srinivas <krishna@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	d46386246f	api: Introduce metadata update APIs to update only metadata (#11962 ) Current implementation heavily relies on readAllFileInfo but with the advent of xl.meta inlined with data, we cannot easily avoid reading data when we are only interested is updating metadata, this leads to invariably write amplification during metadata updates, repeatedly reading data when we are only interested in updating metadata. This PR ensures that we implement a metadata only update API at storage layer, that handles updates to metadata alone for any given version - given the version is valid and present. This helps reduce the chattiness for following calls.. - PutObjectTags - DeleteObjectTags - PutObjectLegalHold - PutObjectRetention - ReplicateObject (updates metadata on replication status)	2021-04-04 13:32:31 -07:00
Poorna Krishnamoorthy	47c09a1e6f	Various improvements in replication (#11949 ) - collect real time replication metrics for prometheus. - add pending_count, failed_count metric for total pending/failed replication operations. - add API to get replication metrics - add MRF worker to handle spill-over replication operations - multiple issues found with replication - fixes an issue when client sends a bucket name with `/` at the end from SetRemoteTarget API call make sure to trim the bucket name to avoid any extra `/`. - hold write locks in GetObjectNInfo during replication to ensure that object version stack is not overwritten while reading the content. - add additional protection during WriteMetadata() to ensure that we always write a valid FileInfo{} and avoid ever writing empty FileInfo{} to the lowest layers. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-04-03 09:03:42 -07:00
Klaus Post	fa9cf1251b	Imporve healing and reporting (#11312 ) * Provide information on actively healing, buckets healed/queued, objects healed/failed. * Add concurrent healing of multiple sets (typically on startup). * Add bucket level resume, so restarts will only heal non-healed buckets. * Print summary after healing a disk is done.	2021-03-04 14:36:23 -08:00
Harshavardhana	c6a120df0e	fix: Prometheus metrics to re-use storage disks (#11647 ) also re-use storage disks for all `mc admin server info` calls as well, implement a new LocalStorageInfo() API call at ObjectLayer to lookup local disks storageInfo also fixes bugs where there were double calls to StorageInfo()	2021-03-02 17:28:04 -08:00
Harshavardhana	9171d6ef65	rename all references from crawl -> scanner (#11621 )	2021-02-26 15:11:42 -08:00
Krishna Srinivas	876b79b8d8	read-health check endpoint returns success if cluster can serve read requests (#11310 )	2021-02-09 01:00:44 -08:00
Anis Elleuch	e96fdcd5ec	tagging: Add event notif for PUT object tagging (#11366 ) An optimization to avoid double calling for during PutObject tagging	2021-02-01 13:52:51 -08:00
Harshavardhana	a6c146bd00	validate storage class across pools when setting config (#11320 ) ``` mc admin config set alias/ storage_class standard=EC:3 ``` should only succeed if parity ratio is valid for all server pools, if not we should fail proactively. This PR also needs to bring other changes now that we need to cater for variadic drive counts per pool. Bonus fixes also various bugs reproduced with - GetObjectWithPartNumber() - CopyObjectPartWithOffsets() - CopyObjectWithMetadata() - PutObjectPart,PutObject with truncated streams	2021-01-22 12:09:24 -08:00
Ritesh H Shukla	b4add82bb6	Updated Prometheus metrics (#11141 ) * Add metrics for nodes online and offline * Add cluster capacity metrics * Introduce v2 metrics	2021-01-18 20:35:38 -08:00
Anis Elleuch	2ecaab55a6	admin: ServerInfo returns info without object layer initialized (#11142 )	2020-12-21 09:35:19 -08:00
Harshavardhana	8368ab76aa	fix: remove the requirement for healing buckets in ListBucketsHeal (#11098 ) With new refactor of bucket healing, healing bucket happens automatically including its metadata, there is no need to redundant heal buckets also in ListBucketsHeal remove it.	2020-12-14 12:07:07 -08:00
Harshavardhana	2eb52ca5f4	fix: heal bucket metadata right before healing bucket (#11097 ) optimization mainly to avoid listing the entire `.minio.sys/buckets/.minio.sys` directory, this can get really huge and comes in the way of startup routines, contents inside `.minio.sys/buckets/.minio.sys` are rather transient and not necessary to be healed.	2020-12-13 11:57:08 -08:00
Harshavardhana	9c53cc1b83	fix: heal multiple buckets in bulk (#11029 ) makes server startup, orders of magnitude faster with large number of buckets	2020-12-05 13:00:44 -08:00
Klaus Post	2294e53a0b	Don't retain context in locker (#10515 ) Use the context for internal timeouts, but disconnect it from outgoing calls so we always receive the results and cancel it remotely.	2020-11-04 08:25:42 -08:00
Harshavardhana	b686bb9c83	fix: replaced drive properly by healing the entire drive (#10799 ) Bonus fixes, we do not need reload format anymore as the replaced drive is healed locally we only need to ensure that drive heal reloads the drive properly. We preserve the UUID of the original order, this means that the replacement in `format.json` doesn't mean that the drive needs to be reloaded into memory anymore. fixes #10791	2020-10-31 01:34:48 -07:00
Harshavardhana	0104af6bcc	delayed locks until we have started reading the body (#10474 ) This is to ensure that Go contexts work properly, after some interesting experiments I found that Go net/http doesn't cancel the context when Body is non-zero and hasn't been read till EOF. The following gist explains this, this can lead to pile up of go-routines on the server which will never be canceled and will die at a really later point in time, which can simply overwhelm the server. https://gist.github.com/harshavardhana/c51dcfd055780eaeb71db54f9c589150 To avoid this refactor the locking such that we take locks after we have started reading from the body and only take locks when needed. Also, remove contextReader as it's not useful, doesn't work as expected context is not canceled until the body reaches EOF so there is no point in wrapping it with context and putting a `select {` on it which can unnecessarily increase the CPU overhead. We will still use the context to cancel the lockers etc. Additional simplification in the locker code to avoid timers as re-using them is a complicated ordeal avoid them in the hot path, since locking is very common this may avoid lots of allocations.	2020-09-14 15:57:13 -07:00
Harshavardhana	a20d4568a2	fix: make sure to use uniform drive count calculation (#10208 ) It is possible in situations when server was deployed in asymmetric configuration in the past such as ``` minio server ~/fs{1...4}/disk{1...5} ``` Results in setDriveCount of 10 in older releases but with fairly recent releases we have moved to having server affinity which means that a set drive count ascertained from above config will be now '4' While the object layer make sure that we honor `format.json` the storageClass configuration however was by mistake was using the global value obtained by heuristics. Which leads to prematurely using lower parity without being requested by the an administrator. This PR fixes this behavior.	2020-08-05 13:31:12 -07:00
Harshavardhana	ec06089eda	fix: re-implement cluster healthcheck (#10101 )	2020-07-20 18:31:22 -07:00
Harshavardhana	2955aae8e4	feat: Add notification support for bucketCreates and removal (#10075 )	2020-07-20 12:52:49 -07:00
Anis Elleuch	778e9c864f	Move dependency from minio-go v6 to v7 (#10042 )	2020-07-14 09:38:05 -07:00
Harshavardhana	2d17c16d93	fix: make sure to honor versioning from browser UI deletes (#10016 )	2020-07-10 22:21:04 -07:00
Harshavardhana	91817d0d1a	fix: implement generic Walk for gateway (#9938 ) Walk() functionality was missing on gateway implementations leading to missing functionality for the browser UI such as remove multiple objects, download as zip file etc. This PR brings a generic implementation across all gateway's, it is not required to repeat the same code in all gateway's	2020-06-29 17:07:23 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Harshavardhana	5686a7e273	fix NAS gateway support for policy/notification (#9765 ) Fixes #9764	2020-06-03 13:18:54 -07:00
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages avoids 100's of stats which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateway's simplifies the approach since azure, gcs, hdfs gateway's don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized, additionally just avoids small allocations of large data structure. Fixes #7206	2020-05-28 12:36:20 -07:00
P R	3f6d624c7b	add gateway object tagging support (#9124 )	2020-05-23 11:09:35 -07:00
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	2020-05-08 13:44:44 -07:00
Bala FA	3773874cd3	add bucket tagging support (#9389 ) This patch also simplifies object tagging support	2020-05-05 14:18:13 -07:00
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208 ) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	2020-04-27 10:06:21 -07:00
Anis Elleuch	db2155551a	heal: Pass scan mode to HealObjects to deep scan full quorum objects (#9159 ) As an optimization of the healing, HealObjects() avoid sending an object to the background healing subsystem when the object is present in all disks. However, HealObjects() should have checked the scan type, if this deep, always pass the object to the healing subsystem.	2020-03-18 17:50:00 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00
Harshavardhana	23a8411732	Add a generic Walk()'er to list a bucket, optinally prefix (#9026 ) This generic Walk() is used by likes of Lifecyle, or KMS to rotate keys or any other functionality which relies on this functionality.	2020-02-25 21:22:28 +05:30
Harshavardhana	ab7d3cd508	fix: Speed up multi-object delete by taking bulk locks (#8974 ) Change distributed locking to allow taking bulk locks across objects, reduces usually 1000 calls to 1. Also allows for situations where multiple clients sends delete requests to objects with following names ``` {1,2,3,4,5} ``` ``` {5,4,3,2,1} ``` will block and ensure that we do not fail the request on each other.	2020-02-21 11:29:57 +05:30
Krishnan Parthasarathi	026265f8f7	Add support for bucket encryption feature (#8890 ) - pkg/bucket/encryption provides support for handling bucket encryption configuration - changes under cmd/ provide support for AES256 algorithm only Co-Authored-By: Poorna <poornas@users.noreply.github.com> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-05 15:12:34 +05:30
Harshavardhana	f98616dce7	heal: Optimize heal listing by avoiding batches (#8901 ) Also limit the heal per object if there is incoming requests by suspending heal for longer periods of time.	2020-01-29 12:05:44 +05:30
Harshavardhana	0cbebf0f57	Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892 ) Rename to allow for more such features to come in a more proper hierarchical manner.	2020-01-27 14:12:34 -08:00
Harshavardhana	f14f60a487	fix: Avoid double usage calculation on every restart (#8856 ) On every restart of the server, usage was being calculated which is not useful instead wait for sufficient time to start the crawling routine. This PR also avoids lots of double allocations through strings, optimizes usage of string builders and also avoids crawling through symbolic links. Fixes #8844	2020-01-21 14:07:49 -08:00
Nitish Tiwari	61c17c8933	Add ObjectTagging Support (#8754 ) This PR adds support for AWS S3 ObjectTagging API as explained here https://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html	2020-01-20 08:45:59 -08:00
Praveen raj Mani	5d09233115	Fix Readiness check (#8681 ) - Remove goroutine-check in Readiness check - Bring in quorum check for readiness Fixes #8385 Co-authored-by: Harshavardhana <harsha@minio.io>	2019-12-28 22:24:43 +05:30
Nitish Tiwari	3df7285c3c	Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591 ) This PR adds support below metrics - Cache Hit Count - Cache Miss Count - Data served from Cache (in Bytes) - Bytes received from AWS S3 - Bytes sent to AWS S3 - Number of requests sent to AWS S3 Fixes #8549	2019-12-05 23:16:06 -08:00
Harshavardhana	3a34d98db8	Initialize local nsLocker for gateway instances (#8540 )	2019-11-19 16:45:35 -08:00
Harshavardhana	e9b2bf00ad	Support MinIO to be deployed on more than 32 nodes (#8492 ) This PR implements locking from a global entity into a more localized set level entity, allowing for locks to be held only on the resources which are writing to a collection of disks rather than a global level. In this process this PR also removes the top-level limit of 32 nodes to an unlimited number of nodes. This is a precursor change before bring in bucket expansion.	2019-11-13 12:17:45 -08:00
Harshavardhana	8b80eca184	List buckets only once per sub-system initialization (#8333 ) Current master repeatedly calls ListBuckets() during initialization of multiple sub-systems Use single ListBuckets() call for each sub-system as follows - LifeCycle - Policy - Notification	2019-10-02 05:35:02 +05:30
Krishnan Parthasarathi	559a59220e	Add initial support for bucket lifecycle (#7563 ) This PR is based off @sinhaashish's PR for object lifecycle management, which includes support only for, - Expiration of object - Filter using object prefix (_not_ object tags) N B the code for actual expiration of objects will be included in a subsequent PR.	2019-07-19 21:20:33 +01:00
Anis Elleuch	7abadfccc2	Add self-healing feature (#7604 ) - Background Heal routine receives heal requests from a channel, either to heal format, buckets or objects - Daily sweeper lists all objects in all buckets, these objects don't necessarly have read quorum so they can be removed if these objects are unhealable - Heal daily ops receives objects from the daily sweeper and send them to the heal routine.	2019-06-08 22:14:07 -07:00
kannappanr	5ecac91a55	Replace Minio refs in docs with MinIO and links (#7494 )	2019-04-09 11:39:42 -07:00
Anis Elleuch	facbd653ba	Add normal/deep type of heal scanning (#7251 ) Healing scan used to read all objects parts to check for bitrot checksum. This commit will add a quicker way of healing scan by only checking if parts are actually present in disks or not.	2019-03-14 13:08:51 -07:00
Harshavardhana	7079abc931	Implement HealObjects API to simplify healing (#7351 )	2019-03-13 17:35:09 -07:00

1 2

77 Commits