minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	80ca120088	remove checkBucketExist check entirely to avoid fan-out calls (#18917 ) Each Put, List, Multipart operations heavily rely on making GetBucketInfo() call to verify if bucket exists or not on a regular basis. This has a large performance cost when there are tons of servers involved. We did optimize this part by vectorizing the bucket calls, however its not enough, beyond 100 nodes and this becomes fairly visible in terms of performance.	2024-01-30 12:43:25 -08:00
Harshavardhana	dd2542e96c	add codespell action (#18818 ) Original work here, #18474, refixed and updated.	2024-01-17 23:03:17 -08:00
Anis Eleuch	b7d11141e1	rename Force to Immediate for clarity (#18540 )	2023-11-28 22:35:16 -08:00
Harshavardhana	a4cfb5e1ed	return errors if dataDir is missing during HeadObject() (#18477 ) Bonus: allow replication to attempt Deletes/Puts when the remote returns quorum errors of some kind, this is to ensure that MinIO can rewrite the namespace with the latest version that exists on the source.	2023-11-20 21:33:47 -08:00
Anis Eleuch	15fd5ce2fa	fix: A typo in per pool make/delete bucket errs calculation (#17553 )	2023-07-03 09:47:40 -07:00
Harshavardhana	5569acd95c	disallow EC:0 if not set during server startup (#17141 )	2023-05-04 14:44:30 -07:00
Harshavardhana	6825bd7e75	fix: inlined objects don't need to honor long locks (#17039 )	2023-04-17 12:16:37 -07:00
Harshavardhana	f1bbb7fef5	vectorize cluster-wide calls such as bucket operations (#16313 )	2023-01-03 08:16:39 -08:00
Harshavardhana	b882310e2b	avoid locks for internal and invalid buckets in MakeBucket() (#16302 )	2022-12-23 07:46:00 -08:00
Harshavardhana	944c62daf4	skip flaky tests on windows OS (#16015 )	2022-11-07 00:11:21 -08:00
Klaus Post	ff12080ff5	Remove deprecated io/ioutil (#15707 )	2022-09-19 11:05:16 -07:00
Klaus Post	5c61c3ccdc	Fix flaky TestGetObjectWithOutdatedDisks (#15687 ) On occasion this test fails: ``` 2022-09-12T17:22:44.6562737Z === RUN TestGetObjectWithOutdatedDisks 2022-09-12T17:22:44.6563751Z erasure-object_test.go:1214: Test 2: Expected data to have md5sum = `c946b71bb69c07daf25470742c967e7c`, found `7d16d23f07072af1a809707ba101ae07` 2 ``` Theory: Both objects are written with the same timestamp due to lower timer resolution on Windows. This results in secondary resolution, which is deterministic, but random. Solution: Instead of hacking in a wait we request the specific version we want. Should still keep the test relevant. Bonus: Remote action dependency for vulncheck	2022-09-14 08:17:39 -07:00
Klaus Post	a9f1ad7924	Add extended checksum support (#15433 )	2022-08-29 16:57:16 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Eng Zer Jun	0a3b1ad4eb	test: use `T.TempDir` to create temporary test directory (#15400 ) This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The directory created by `t.TempDir` is automatically removed when the test and all its subtests complete. Prior to this commit, temporary directory created using `ioutil.TempDir` needs to be removed manually by calling `os.RemoveAll`, which is omitted in some tests. The error handling boilerplate e.g. defer func() { if err := os.RemoveAll(dir); err != nil { t.Fatal(err) } } is also tedious, but `t.TempDir` handles this for us nicely. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-07-25 12:37:26 -07:00
Praveen raj Mani	b49fc33cb3	purge objects immediately with `x-minio-force-delete` in DeleteObject and DeleteBucket API (#15148 )	2022-07-11 09:15:54 -07:00
Harshavardhana	9c605ad153	allow support for parity '0', '1' enabling support for 2,3 drive setups (#15171 ) allows for further granular setups - 2 drives (1 parity, 1 data) - 3 drives (1 parity, 2 data) Bonus: allows '0' parity as well.	2022-06-27 20:22:18 -07:00
Anis Elleuch	56a61bab56	test: Add GetObjectNInfo test with some outdated disks (#15004 ) Add a test reading an object which has some old data in some outdated disks, in a versioned and non-versioned bucket.	2022-05-30 17:52:59 -07:00
Anis Elleuch	77dc99e71d	Do not use inline data size in xl.meta quorum calculation (#14831 ) * Do not use inline data size in xl.meta quorum calculation Data shards of one object can different inline/not-inline decision in multiple disks. This happens with outdated disks when inline decision changes. For example, enabling bucket versioning configuration will change the small file threshold. When the parity of an object becomes low, GET object can return 503 because it is not unable to calculate the xl.meta quorum, just because some xl.meta has inline data and other are not. So this commit will be disable taking the size of the inline data into consideration when calculating the xl.meta quorum. * Add tests for simulatenous inline/notinline object Co-authored-by: Anis Elleuch <anis@min.io>	2022-05-24 06:26:38 -07:00
Harshavardhana	b0c84e3de7	fix: deleteVersions causing xl.meta to have empty Versions[] slice (#14483 ) This is a side-affect of the optimization done in PR #13544 which causes a certain type of delete operations on given object versions can cause lastVersion indication to be skipped, which leads to an `xl.meta` where Versions[] slice is empty while the entire file is intact by itself. This PR tries to ensure that such files are visible and deletable by regular means of listing as null 'delete-marker' and also avoid the situation where this potential issue might arise.	2022-03-04 20:01:26 -08:00
Harshavardhana	a60ac7ca17	fix: audit log to support object names in multipleObjectNames() handler (#14017 )	2022-01-03 01:28:52 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	30ba85bc67	no need to write storageClass globally (#13555 ) fixes #13548	2021-11-02 08:11:20 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Klaus Post	acc452b7ce	Add more erasure codes on degraded systems. (#11852 ) In cases where a cluster is degraded, we do not uphold our consistency guarantee and we will write fewer erasure codes and rely on healing to recreate the missing shards. In some cases replacing known bad disks in practice take days. We want to change the behavior of a known degraded system to keep the erasure code promise of the storage class for each object. This will create the objects with the same confidence as a fully functional cluster. The tradeoff will be that objects created during a partial outage will take up slightly more space. This means that when the storage class is EC:4, there should always be written 4 parity shards, even if some disks are unavailable. When an object is created on a set, the disks are immediately checked. If any disks are unavailable additional parity shards will be made for each offline disk, up to 50% of the number of disks. We add an internal metadata field with the actual and intended erasure code level, this can optionally be picked up later by the scanner if we decide that data like this should be re-sharded.	2021-05-27 11:38:09 -07:00
Anis Elleuch	0e80b5fe63	tests: Add test for upload of the same object inlined and not inlined (#12374 ) Upload an object smaller than small file threshold and upload another file bigger than small file threshold and tries to read it.	2021-05-26 08:09:23 -07:00
Harshavardhana	4840974d7a	fix: inline data upon overwrites should be readable (#12369 ) This PR fixes two bugs - Remove fi.Data upon overwrite of objects from inlined-data to non-inlined-data - Workaround for an existing bug on disk with latest releases to ignore fi.Data and instead read from the disk for non-inlined-data - Addtionally add a reserved metadata header to indicate data is inlined for a given version.	2021-05-25 16:33:06 -07:00
Harshavardhana	d84261aa6d	fix: ensure proper usage of DataDir (#12300 ) - GetObject() should always use a common dataDir to read from when it starts reading, this allows the code in erasure decoding to have sane expectations. - Healing should always heal on the common dataDir, this allows the code in dangling object detection to purge dangling content. These both situations can happen under certain types of retries during PUT when server is restarting etc, some namespace entries might be left over.	2021-05-14 16:50:47 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	2ef824bbb2	collapse two distinct calls into single RenameData() call (#12093 ) This is an optimization by reducing one extra system call, and many network operations. This reduction should increase the performance for small file workloads.	2021-04-20 10:44:39 -07:00
Klaus Post	4dcce17eb9	Determine small objects on shard size (#11935 ) Use shard size to determine when to inline data. For unversioned objects, use 128K/shard and for versioned 16K thresholds.	2021-03-31 09:19:14 -07:00
Klaus Post	2623338dc5	Inline small file data in xl.meta file (#11758 )	2021-03-29 17:00:55 -07:00
Harshavardhana	f903cae6ff	Support variable server pools (#11256 ) Current implementation requires server pools to have same erasure stripe sizes, to facilitate same SLA and expectations. This PR allows server pools to be variadic, i.e they do not have to be same erasure stripe sizes - instead they should have SLA for parity ratio. If the parity ratio cannot be guaranteed by the new server pool, the deployment is rejected i.e server pool expansion is not allowed.	2021-01-16 12:08:02 -08:00
Harshavardhana	1a5775e2e8	enable small and large file optimization (#11260 ) - for large objects we found that 1MiB block for r/w respectively. - for small objects we found that 128KiB block for r/w respectively.	2021-01-12 10:20:39 -08:00
Harshavardhana	f21d650ed4	fix: readData in bulk call using messagepack byte wrappers (#11228 ) This PR refactors the way we use buffers for O_DIRECT and to re-use those buffers for messagepack reader writer. After some extensive benchmarking found that not all objects have this benefit, and only objects smaller than 64KiB see this benefit overall. Benefits are seen from almost all objects from 1KiB - 32KiB Beyond this no objects see benefit with bulk call approach as the latency of bytes sent over the wire v/s streaming content directly from disk negate each other with no remarkable benefits. All other optimizations include reuse of msgp.Reader, msgp.Writer using sync.Pool's for all internode calls.	2021-01-07 19:27:31 -08:00
Harshavardhana	d0027c3c41	do not use large buffers if not necessary (#11220 ) without this change, there is a performance regression for small objects GETs, this makes the overall speed to go back to pre '59d363' commit days.	2021-01-04 18:51:52 -08:00
Harshavardhana	c4131c2798	feat: Small object optimization read data in single bulk call (#11207 )	2021-01-03 11:27:57 -08:00
Anis Elleuch	677e80c0f8	xl: Remove check-dir in ReadVersion (#11200 ) The only purpose of check-dir flag in ReadVersion is to return 404 when an object has xl.meta but without data. This is causing an extract call to the disk which can be penalizing in case of busy system where disks receive many concurrent access.	2021-01-02 10:35:57 -08:00
Harshavardhana	4ec45753e6	rename server sets to server pools	2020-12-01 13:50:33 -08:00
Harshavardhana	790833f3b2	Revert "Support variable server sets (#10314 )" This reverts commit `aabf053d2f`.	2020-12-01 12:02:29 -08:00
Harshavardhana	aabf053d2f	Support variable server sets (#10314 )	2020-11-25 16:28:47 -08:00
Harshavardhana	519c0077a9	fix: do not return an error for successfully deleted dangling objects (#10938 ) dangling objects when removed `mc admin heal -r` or crawler auto heal would incorrectly return error - this can interfere with usage calculation as the entry size for this would be returned as `0`, instead upon success use the resultant object size to calculate the final size for the object and avoid reporting this in the log messages Also do not set ObjectSize in healResultItem to be '-1' this has an effect on crawler metrics calculating 1 byte less for objects which seem to be missing their `xl.meta`	2020-11-23 09:12:17 -08:00
Harshavardhana	ad726b49b4	rename zones to serverSets to avoid terminology conflict (#10679 ) we are bringing in availability zones, we should avoid zones as per server expansion concept.	2020-10-15 14:28:50 -07:00
Klaus Post	493c714663	Remove erasureSets and erasureObjects from ObjectLayer (#10442 )	2020-09-10 09:18:19 -07:00
Harshavardhana	b68bc75dad	fix: quorum calculation mistake with reduced parity (#10186 ) With reduced parity our write quorum should be same as read quorum, but code was still assuming ``` readQuorum+1 ``` In all situations which is not necessary.	2020-08-03 12:15:08 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00

46 Commits