minio

mirror of https://github.com/minio/minio.git synced 2024-12-27 15:45:55 -05:00

Author	SHA1	Message	Date
Ritesh H Shukla	3ddd8b04d1	fix: handle unsupported APIs more granularly (#11674 )	2021-03-30 23:19:36 -07:00
Harshavardhana	8e6e287729	fix: delete/delete marker replication versions consistent (#11932 ) replication didn't work as expected when deletion of delete markers was requested in DeleteMultipleObjects API, this is due to incorrect lookup elements being used to look for delete markers.	2021-03-30 17:15:36 -07:00
Harshavardhana	014edd3462	allow configuring scanner cycles dynamically (#11931 ) This allows us to speed up or slow down sleeps between multiple scanner cycles, helps in testing as well as some deployments might want to run scanner more frequently. This change is also dynamic can be applied on a running cluster, subsequent cycles pickup the newly set value.	2021-03-30 13:59:02 -07:00
Steven Reitsma	e9fede88b3	fix: multi delete when using S3 Gateway with SSE (#11929 )	2021-03-30 13:09:48 -07:00
Harshavardhana	edf053c5c9	disksWithAllParts should use parts if present (#11923 )	2021-03-30 01:51:00 -07:00
Klaus Post	2623338dc5	Inline small file data in xl.meta file (#11758 )	2021-03-29 17:00:55 -07:00
Anis Elleuch	f5831174e6	iam: Use 'on' for enabled accounts for consistency (#11913 ) This commit does not fix any bug, just ensure consistency.	2021-03-29 09:32:36 -07:00
Harshavardhana	d93c6cb9c7	use Access() instead of Lstat() for frequent use (#11911 ) using Lstat() is causing tiny memory allocations, that are usually wasted and never used, instead we can simply uses Access() call that does 0 memory allocations.	2021-03-29 08:07:23 -07:00
Harshavardhana	7c5b35d20f	trace: enhance trace experience further	2021-03-27 13:19:14 -07:00
Anis Elleuch	07ab4d1250	trace: Add prefix to func names of OS & Storage (#11912 )	2021-03-27 10:07:07 -07:00
Anis Elleuch	d8b5adfd10	trace: Add storage & OS tracing (#11889 )	2021-03-26 23:24:07 -07:00
Poorna Krishnamoorthy	95096e31a7	Improve error message from SetRemoteTargetHandler (#11909 )	2021-03-26 18:58:13 -07:00
Harshavardhana	d8bda2dd92	[feat] Add targz transparent extract support (#11849 ) This feature brings in support for auto extraction of objects onto MinIO's namespace from an incoming tar gzipped stream, the only expected metadata sent by the client is to set `snowball-auto-extract`. All the contents from the tar stream are saved as folders and objects on the namespace. fixes #8715	2021-03-26 17:15:09 -07:00
Harshavardhana	df42b128db	fix: service accounts policy enforcement regression (#11910 ) service accounts were not inheriting parent policies anymore due to refactors in the PolicyDBGet() from the latest release, fix this behavior properly.	2021-03-26 13:55:42 -07:00
Anis Elleuch	2c296652f7	Simplify access to local node name (#11907 ) The local node name is heavily used in tracing, create a new global variable to store it. Multiple goroutines can access it since it won't be changed later.	2021-03-26 11:37:58 -07:00
Klaus Post	9efcb9e15c	Fix listPathRaw/WalkDir cancelation (#11905 ) In #11888 we observe a lot of running, WalkDir calls. There doesn't appear to be any listerners for these calls, so they should be aborted. Ensure that WalkDir aborts when upstream cancels the request. Fixes #11888	2021-03-26 11:18:30 -07:00
Anis Elleuch	8d5456c15a	Fix error returned by HealObject in some cases (#11906 ) The background healing can return NoSuchUpload error, the reason is that healing code can return errFileNotFound with three parameters. Simplify the code by returning exact errUploadNotFound error in multipart code. Also ensure that a typed error is always returned whatever the number of parameters because it is better than showing internal error.	2021-03-26 11:17:23 -07:00
Harshavardhana	cf87303094	do not call LocalStorageInfo on gateways (#11903 ) fixes https://github.com/minio/mc/issues/3665	2021-03-25 15:26:22 -07:00
Harshavardhana	90d8ec6310	fix: reject duplicate keys in PostPolicyJSON document (#11902 ) fixes #11894	2021-03-25 13:57:57 -07:00
Klaus Post	b383522743	fix error could not read /proc ion windows. (#11868 ) Bonus: Prealloc reasonable sizes for metrics.	2021-03-25 12:58:43 -07:00
Aditya Manthramurthy	b4d8bcf644	Converge PolicyDBGet functions in IAM (#11891 )	2021-03-25 00:38:15 -07:00
Harshavardhana	d7f32ad649	xl: avoid sending Delete() remote call for fully successful runs an optimization to avoid extra syscalls in PutObject(), adds up to our PutObject response times.	2021-03-24 17:32:12 -07:00
Aditya Manthramurthy	906d68c356	Fix LDAP policy application on user policy (#11887 )	2021-03-24 12:29:25 -07:00
Klaus Post	749e9c5771	metrics: Add canceled requests (#11881 ) Add metric for canceled requests	2021-03-24 10:25:27 -07:00
Harshavardhana	410e84d273	xl: add checks for minioTmpMetaBucket in CreateFile	2021-03-24 09:36:10 -07:00
Harshavardhana	75741dbf4a	xl: remove cleanupDir instead use Delete() (#11880 ) use a single call to remove directly at disk instead of doing recursively at network layer.	2021-03-24 09:08:05 -07:00
Anis Elleuch	fad7b27f15	metrics: Change type of minio_s3_requests_waiting_total to gauge (#11884 )	2021-03-24 09:06:37 -07:00
Harshavardhana	79564656eb	xl: CreateFile shouldn't prematurely timeout (#11878 ) For large objects taking more than '3 minutes' response times in a single PUT operation can timeout prematurely as 'ResponseHeader' timeout hits for 3 minutes. Avoid this by keeping the connection active during CreateFile phase.	2021-03-24 09:05:03 -07:00
Harshavardhana	21cfc4aa49	Revert "xl: CreateFile shouldn't prematurely timeout (#11854 )" This reverts commit `922c7b57f5`.	2021-03-23 23:47:45 -07:00
Harshavardhana	e80239a661	simplify OS instrumentation remove functions for global variables	2021-03-23 22:32:44 -07:00
Ritesh H Shukla	6a2ed44095	fix: optionally enable tracing posix calls	2021-03-23 22:23:08 -07:00
Aditya Manthramurthy	8adfeb0d84	fix: AccountInfo API for LDAP users (#11874 ) Also, ensure admin APIs auth additionally validates groups	2021-03-23 17:39:20 -07:00
Harshavardhana	d23485e571	fix: LDAP groups handling and group mapping (#11855 ) comprehensively handle group mapping for LDAP users across IAM sub-subsytem.	2021-03-23 15:15:51 -07:00
Harshavardhana	da70e6ddf6	avoid healObjects recursively healing at empty path (#11856 ) baseDirFromPrefix(prefix) for object names without parent directory incorrectly uses empty path, leading to long listing at various paths that are not useful for healing - avoid this listing completely if "baseDir" returns empty simple use the "prefix" as is. this improves startup performance significantly	2021-03-23 07:57:07 -07:00
Harshavardhana	922c7b57f5	xl: CreateFile shouldn't prematurely timeout (#11854 ) For large objects taking more than '3 minutes' response times in a single PUT operation can timeout prematurely as 'ResponseHeader' timeout hits for 3 minutes. Avoid this by keeping the connection active during CreateFile phase.	2021-03-22 18:25:05 -07:00
Harshavardhana	726d80dbb7	fix: merge duplicate keys in post policy (#11843 ) some SDKs might incorrectly send duplicate entries for keys such as "conditions", Go stdlib unmarshal for JSON does not support duplicate keys - instead skips the first duplicate and only preserves the last entry. This can lead to issues where a policy JSON while being valid might not properly apply the required conditions, allowing situations where POST policy JSON would end up allowing uploads to unauthorized buckets and paths. This PR fixes this properly.	2021-03-20 22:16:30 -07:00
Ritesh H Shukla	23b03dadb8	Add process uptime metric (#11844 )	2021-03-20 21:23:27 -07:00
Andreas Auernhammer	7b3719c17b	crypto: simplify Context encoding (#11812 ) This commit adds a `MarshalText` implementation to the `crypto.Context` type. The `MarshalText` implementation replaces the `WriteTo` and `AppendTo` implementation. It is slightly slower than the `AppendTo` implementation ``` goos: darwin goarch: arm64 pkg: github.com/minio/minio/cmd/crypto BenchmarkContext_AppendTo/0-elems-8 381475698 2.892 ns/op 0 B/op 0 allocs/op BenchmarkContext_AppendTo/1-elems-8 17945088 67.54 ns/op 0 B/op 0 allocs/op BenchmarkContext_AppendTo/3-elems-8 5431770 221.2 ns/op 72 B/op 2 allocs/op BenchmarkContext_AppendTo/4-elems-8 3430684 346.7 ns/op 88 B/op 2 allocs/op ``` vs. ``` BenchmarkContext/0-elems-8 135819834 8.658 ns/op 2 B/op 1 allocs/op BenchmarkContext/1-elems-8 13326243 89.20 ns/op 128 B/op 1 allocs/op BenchmarkContext/3-elems-8 4935301 243.1 ns/op 200 B/op 3 allocs/op BenchmarkContext/4-elems-8 2792142 428.2 ns/op 504 B/op 4 allocs/op goos: darwin ``` However, the `AppendTo` benchmark used a pre-allocated buffer. While this improves its performance it does not match the actual usage of `crypto.Context` which is passed to a `KMS` and always encoded into a newly allocated buffer. Therefore, this change seems acceptable since it should not impact the actual performance but reduces the overall code for Context marshaling.	2021-03-20 02:48:48 -07:00
Harshavardhana	9a6487319a	remove MINIO_IO_DEADLINE support (#11841 ) this feature in actual deployment was found to be not that useful, remove support for this for now.	2021-03-20 02:47:04 -07:00
Aditya Manthramurthy	94ff624242	Fix querying LDAP group/user policy (#11840 )	2021-03-20 02:37:52 -07:00
Anis Elleuch	98ff91b484	xl: Reduce usage of isDirEmpty() (#11838 ) When an object is removed, its parent directory is inspected to check if it is empty to remove if that is the case. However, we can use os.Remove() directly since it is only able to remove a file or an empty directory.	2021-03-19 15:42:01 -07:00
Anis Elleuch	4d86384dc7	xl: Remove non needed check for empty dir (#11835 ) RenameData renames xl.meta and data dir and removes the parent directory if empty, however, there is a duplicate check for empty dir, since the parent dir of xl.meta is always the same as the data-dir.	2021-03-19 12:26:53 -07:00
Ritesh H Shukla	b5dcaaccb4	Introduce metrics caching for performant metrics (#11831 )	2021-03-19 00:04:29 -07:00
Harshavardhana	b92a220db1	fix: handle weird drives sporadic read O_DIRECT behavior (#11832 ) on freshReads if drive returns errInvalidArgument, we should simply turn-off DirectIO and read normally, there are situations in k8s like environments where the drives behave sporadically in a single deployment and may not have been implemented properly to handle O_DIRECT for reads.	2021-03-18 20:16:50 -07:00
Harshavardhana	51a8619a79	[feat] Add configurable deadline for writers (#11822 ) This PR adds deadlines per Write() calls, such that slow drives are timed-out appropriately and the overall responsiveness for Writes() is always up to a predefined threshold providing applications sustained latency even if one of the drives is slow to respond.	2021-03-18 14:09:55 -07:00
Anis Elleuch	14d89eaae4	mrf: Enhance behavior for better results (#11788 ) MRF was starting to heal when it receives a disk connection event, which is not good when a node having multiple disks reconnects to the cluster. Besides, MRF needs Remove healing option to remove stale files.	2021-03-18 11:19:02 -07:00
Harshavardhana	add3cd4e44	allow configuring delete cleanup interval from default 10minutes (#11818 )	2021-03-17 15:15:58 -07:00
Harshavardhana	60b0f2324e	storage write call path optimizations (#11805 ) - write in o_dsync instead of o_direct for smaller objects to avoid unaligned double Write() situations that may arise for smaller objects < 128KiB - avoid fallocate() as its not useful since we do not use Append() semantics anymore, fallocate is not useful for streaming I/O we can save on a syscall - createFile() doesn't need to validate `bucket` name with a Lstat() call since createFile() is only used to write at `minioTmpBucket` - use io.Copy() when writing unAligned writes to allow usage of ReadFrom() from *os.File providing zero buffer writes().	2021-03-17 09:38:38 -07:00
Anis Elleuch	0eb146e1b2	add additional metrics per disk API latency, API call counts #11250 ) ``` mc admin info --json ``` provides these details, for now, we shall eventually expose this at Prometheus level eventually. Co-authored-by: Harshavardhana <harsha@minio.io>	2021-03-16 20:06:57 -07:00
Andreas Auernhammer	e197800f90	s3v4: read and verify S3 signature v4 chunks separately (#11801 ) This commit fixes a security issue in the signature v4 chunked reader. Before, the reader returned unverified data to the caller and would only verify the chunk signature once it has encountered the end of the chunk payload. Now, the chunk reader reads the entire chunk into an in-memory buffer, verifies the signature and then returns data to the caller. In general, this is a common security problem. We verifying data streams, the verifier MUST NOT return data to the upper layers / its callers as long as it has not verified the current data chunk / data segment: ``` func (r *Reader) Read(buffer []byte) { if err := r.readNext(r.internalBuffer); err != nil { return err } if err := r.verify(r.internalBuffer); err != nil { return err } copy(buffer, r.internalBuffer) } ```	2021-03-16 13:33:40 -07:00
Klaus Post	771dea175c	erasure pools enable faster checks for file not found (#11799 ) For operations that require the object to exist make it possible to detect if the file isn't found in any pool. This will allow these to return the error early without having to re-check.	2021-03-16 11:02:20 -07:00
Harshavardhana	6160188bf3	fix: erasure index based reading based on actual ParityBlocks (#11792 ) in some setups with ordering issues in drive configuration, we should rely on expected parityBlocks instead of `len(disks)/2`	2021-03-15 20:03:13 -07:00
Steve Wills	642ba3f2d6	fix: runtime issue on FreeBSD due to missing O_NOATIME/O_DSYNC support (#11790 ) See also: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253937	2021-03-15 14:02:36 -07:00
Harshavardhana	afbd3e41eb	add missing principalId in web notifications (#11777 ) fixes #11561	2021-03-13 10:52:43 -08:00
Poorna Krishnamoorthy	5e003549cc	Replication: Enforce DeleteMarker disable setting (#11720 ) This PR also enforces DeleteReplication disable setting	2021-03-13 10:28:35 -08:00
Nitish Tiwari	7fa3e4106b	Add consoleAdmin as a default canned policy (#11770 )	2021-03-12 12:51:43 -08:00
Philip Brown	75db500e85	cmd/os-readdir_other.go - return nil with err (#11772 )	2021-03-12 07:22:25 -08:00
Harshavardhana	feafccf007	handle trimming '/' if present in the object names (#11765 ) - MultipleDeletes should handle '/' prefix for objectnames - Trimming the slash alone is enough for ListObjects() prefix and markers fixes #11769	2021-03-11 13:57:03 -08:00
Anis Elleuch	f92b7a5621	Browser: Shared link has content-disposition header (#11712 ) The shared link will be automatically downloadable when the user opens the shared link in a browser.	2021-03-10 23:02:16 -08:00
Poorna Krishnamoorthy	c25e75f0b5	Fix redact LDAP password properly (#11762 ) fixes #11742 previous pull request #11750 fixed only the web trace	2021-03-10 11:05:38 -08:00
Harshavardhana	777344a594	add release build-arg to docker multiarch builds (#11754 ) additional paths to ignore for healing	2021-03-10 09:38:35 -08:00
Poorna Krishnamoorthy	878bc6c72b	Redact LDAP password if any in request trace (#11750 ) Fixes: #11742	2021-03-09 14:43:16 -08:00
Klaus Post	fdc2f69218	truncate xl.meta files upon rewrites #11749 ) If the destination files exist and is larger - junk data will be left at the end of the file.	2021-03-09 14:42:24 -08:00
Anis Elleuch	0d124095ea	lc: Return expiration header only when version id is unspecified (#11718 ) Follow S3 specification to return Expiration header in HEAD/GET call only when version-id is not passed in the request.	2021-03-09 13:19:08 -08:00
Harshavardhana	691035832a	fix: normalize object layer inputs (#11534 ) Cases where we have applications making request for `//` in object names make sure that all are normalized to `/` and all such requests that are prefixed '/' are removed. To ensure a consistent view from all operations.	2021-03-09 12:58:22 -08:00
Anis Elleuch	eac66e67ec	Use maximum parity for config files (#11740 ) Some deployments have low parity (EC:2), but we really do not need to save our config data with the same parity configuration. N/2 would be better to keep MinIO configurations intact when unexpected a number of drives fail.	2021-03-09 10:19:47 -08:00
Anis Elleuch	57f3ed22d4	erasure: Reduce the interval of cleaning up .trash folder (#11741 ) Reduce from 30 to 10 minutes.	2021-03-09 09:45:38 -08:00
Poorna Krishnamoorthy	2f29719e6b	resize replication worker pool dynamically after config update (#11737 )	2021-03-09 02:56:42 -08:00
Andreas Auernhammer	209fe61dcc	vault: disable Hashicorp Vault with opt-in (#11711 ) This commit disables the Hashicorp Vault support but provides a way to temp. enable it via the `MINIO_KMS_VAULT_DEPRECATION=off` Vault support has been deprecated long ago and this commit just requires users to take action if they maintain a Vault integration.	2021-03-09 00:02:35 -08:00
Harshavardhana	8ecffdb7a7	Revert "Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734 )"" This reverts commit `806df164b2`.	2021-03-08 16:12:17 -08:00
Harshavardhana	806df164b2	Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734 )" This reverts commit `64662a49ff`.	2021-03-08 14:43:24 -08:00
Klaus Post	4ac9ed4248	CopyObject: Do not remove crypto info when compressed (#11702 ) Removing crypto info makes it impossible to copy encrypted+compressed objects. Disable destination compression when encrypted.	2021-03-08 12:57:54 -08:00
Klaus Post	3ff5f55dcb	Fetch fileinfo concurrently (#11700 ) For non-erasure setups fetch up to 10 fileinfos concurrently. Fixes #11625	2021-03-08 11:30:43 -08:00
Max Xu	097e5eba9f	feat: remove go-bindata-assetfs in favor of embed by upgrading to go1.16 (#11733 )	2021-03-08 11:26:43 -08:00
Anis Elleuch	64662a49ff	heal: Heal bucket metadata when a fresh disk is inserted (#11734 ) Replacing disk with a fresh one never heals bucket metadata (policy, notification, etc..). This commit fixes the issue.	2021-03-08 10:54:13 -08:00
Harshavardhana	78e867e145	ignore healing .trash, .metacache amd .multipart paths (#11725 )	2021-03-07 09:38:31 -08:00
Harshavardhana	9ccc483df6	[feat]: change erasure coding default block size from 10MiB to 1MiB (#11721 ) major performance improvements in range GETs to avoid large read amplification when ranges are tiny and random ``` ------------------- Operation: GET Operations: 142014 -> 339421 Duration: 4m50s -> 4m56s * Average: +139.41% (+1177.3 MiB/s) throughput, +139.11% (+658.4) obj/s * Fastest: +125.24% (+1207.4 MiB/s) throughput, +132.32% (+612.9) obj/s * 50% Median: +139.06% (+1175.7 MiB/s) throughput, +133.46% (+660.9) obj/s * Slowest: +203.40% (+1267.9 MiB/s) throughput, +198.59% (+753.5) obj/s ``` TTFB from 10MiB BlockSize ``` * First Access TTFB: Avg: 81ms, Median: 61ms, Best: 20ms, Worst: 2.056s ``` TTFB from 1MiB BlockSize ``` * First Access TTFB: Avg: 22ms, Median: 21ms, Best: 8ms, Worst: 91ms ``` Full object reads however do see a slight change which won't be noticeable in real world, so not doing any comparisons TTFB still had improvements with full object reads with 1MiB ``` * First Access TTFB: Avg: 68ms, Median: 35ms, Best: 11ms, Worst: 1.16s ``` v/s TTFB with 10MiB ``` * First Access TTFB: Avg: 388ms, Median: 98ms, Best: 20ms, Worst: 4.156s ``` This change should affect all new uploads, previous uploads should continue to work with business as usual. But dramatic improvements can be seen with these changes.	2021-03-06 14:09:34 -08:00
Anis Elleuch	abce040088	fix: Remove repetitive IAM ready message (#11723 ) "IAM initialization complete" is printed each 5 minutes, avoid this by printing it only during the first initialization of IAM.	2021-03-06 09:27:46 -08:00
Anis Elleuch	558762bdf6	iam: Return a slice of policies for a group (#11722 ) A group can have multiple policies, a user subscribed to readwrite & diagnostics can perform S3 operations & admin operations as well. However, the current code only returns one policy for one group.	2021-03-06 09:27:06 -08:00
Harshavardhana	d971061305	use listPathRaw for HealObjects() instead of expensive WalkVersions() (#11675 )	2021-03-06 09:25:48 -08:00
Andreas Auernhammer	509bcc01ad	fips: do not use SHA-3 when building a FIPS-140 2 binary (#11710 ) This commit disables SHA-3 for OpenID when building a FIPS-140 2 compatible binary. While SHA-3 is a crypto. hash function accepted by NIST there is no FIPS-140 2 compliant implementation available when using the boringcrypto Go branch. Therefore, SHA-3 must not be used when building a FIPS-140 2 binary.	2021-03-05 20:43:42 -08:00
Krishnan Parthasarathi	bcf9825082	Data usage should account for transitioned objects (#11717 )	2021-03-05 14:15:53 -08:00
sgandon	124816f6a6	fix : IAM Intialization failing with a large number of users/policies (#11701 )	2021-03-05 08:36:16 -08:00
Klaus Post	fa9cf1251b	Imporve healing and reporting (#11312 ) * Provide information on actively healing, buckets healed/queued, objects healed/failed. * Add concurrent healing of multiple sets (typically on startup). * Add bucket level resume, so restarts will only heal non-healed buckets. * Print summary after healing a disk is done.	2021-03-04 14:36:23 -08:00
Harshavardhana	d73d756a80	fix: incorrect errors thrown by lint (#11699 ) fixes #11698	2021-03-04 14:27:38 -08:00
Aditya Manthramurthy	7488c77e7c	Test LDAP connection configuration at startup (#11684 )	2021-03-04 12:17:36 -08:00
Harshavardhana	786585009e	fix: capture disks when entire peer is offline (#11697 ) currently when one of the peer is down, the drives from that peer are reported as '0/0' offline instead we should capture/filter the drives from the peer and populate it appropriately such that `mc admin info` displays correct info.	2021-03-04 10:07:05 -08:00
Anis Elleuch	7be7109471	locking: Add Refresh for better locking cleanup (#11535 ) Co-authored-by: Anis Elleuch <anis@min.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-03-03 18:36:43 -08:00
Klaus Post	c3217bd6eb	Use actual size for buffer selection (#11687 ) For compressed inputs, this will be -1, but the object may be small.	2021-03-03 16:28:10 -08:00
Andreas Auernhammer	f14cc6c943	etag: add FromContentMD5 to parse content-md5 as ETag (#11688 ) This commit adds the `FromContentMD5` function to parse a client-provided content-md5 as ETag. Further, it also adds multipart ETag computation for future needs.	2021-03-03 12:58:28 -08:00
Harshavardhana	2c198ae7b6	fix: prometheus metrics disks_online count when disks are down (#11689 ) prometheus metrics was using total disks instead of online disk count, when disks were down, this PR fixes this and also adds a new metric for total_disk_count	2021-03-03 11:18:41 -08:00
Poorna Krishnamoorthy	690434514d	Avoid notification event for replicas (#11683 ) Creating notification events for replica creation is not particularly useful to send as the notification event generated at source already includes replication completion events. For applications using replica cluster as failover, avoiding duplicate notifications for replica event will allow seamless failover.	2021-03-03 11:13:31 -08:00
Harshavardhana	039f59b552	fix: missing user policy enforcement in PostPolicyHandler (#11682 )	2021-03-03 08:47:08 -08:00
Harshavardhana	c6a120df0e	fix: Prometheus metrics to re-use storage disks (#11647 ) also re-use storage disks for all `mc admin server info` calls as well, implement a new LocalStorageInfo() API call at ObjectLayer to lookup local disks storageInfo also fixes bugs where there were double calls to StorageInfo()	2021-03-02 17:28:04 -08:00
Klaus Post	cd9e30c0f4	IAM: Block while loading users (#11671 ) While starting up a request that needs all IAM data will start another load operation if the first on startup hasn't finished. This slows down both operations. Block these requests until initial load has completed. Blocking calls will be ListPolicies, ListUsers, ListServiceAccounts, ListGroups - and the calls that eventually trigger these. These will wait for the initial load to complete. Fixes issue seen in #11305	2021-03-02 17:08:25 -08:00
Harshavardhana	f96d4cf7d3	fix: do not deny admins to change other passwords fixes a regression from #11680	2021-03-02 17:02:32 -08:00
Harshavardhana	879599b0cf	fix: enforce deny if present for implicit permissions (#11680 ) Implicit permissions for any user is to be allowed to change their own password, we need to restrict this further even if there is an implicit allow for this scenario - we have to honor Deny statements if they are specified.	2021-03-02 15:35:50 -08:00
Harshavardhana	b1bb3f7016	[feat]: implement GetBucketPolicyStatus API (#11673 ) additionally also add more APIs in notImplemented list, adjust routing rules appropriately	2021-03-01 23:10:33 -08:00
Anis Elleuch	e8d8dfa3ae	Add metric for internode RPC calls errors (#11669 )	2021-03-01 12:31:33 -08:00
Nitish Tiwari	bbd1244a88	Add support for mTLS for Audit log target (#11645 )	2021-03-01 09:19:13 -08:00
Klaus Post	10bdb78699	fix: listObjectVersions Include object in marker (#11562 ) ListObjectVersions would skip past the object in the marker when version id is specified. Make `listPath` return the object with the marker and truncate it if not needed. Avoid having to parse unintended objects to find a version marker.	2021-03-01 08:12:02 -08:00
Shireesh Anjal	289b22d911	fix: pool number not added for one server (#11670 ) The previous code was iterating over replies from peers and assigning pool numbers to them, thus missing to add it for the local server. Fixed by iterating over the server properties of all the servers including the local one.	2021-03-01 08:09:43 -08:00
Harshavardhana	0b9c17443e	update gopsutil to use the v3 API (#11638 )	2021-03-01 00:15:46 -08:00
Bala FA	23f7ab40b3	Add PoolNumber field to madmin.ServerProperties (#11327 )	2021-02-28 21:26:28 -08:00
Harshavardhana	2f4af09c01	fix: alow changes to readAllData to decrement activeCount()	2021-02-28 20:09:23 -08:00
Harshavardhana	37960cbc2f	fix: avoid writing more content on network with O_DIRECT reads (#11659 ) There was an io.LimitReader was missing for the 'length' parameter for ranged requests, that would cause client to get truncated responses and errors. fixes #11651	2021-02-28 15:33:03 -08:00
cbows	c67d1bf120	add unauthenticated lookup-bind mode to LDAP identity (#11655 ) Closes #11646	2021-02-28 12:57:31 -08:00
Klaus Post	c5b3a675fa	Block profiling tweaks (#11612 ) The base profiles contains no valuable data, don't record them. Reduce block rate by 2 orders of magnitude, should still capture just as valuable data with less CPU strain.	2021-02-27 09:22:14 -08:00
Harshavardhana	b690304eed	use faster way for siphash (#11640 )	2021-02-26 16:53:06 -08:00
Harshavardhana	9171d6ef65	rename all references from crawl -> scanner (#11621 )	2021-02-26 15:11:42 -08:00
Harshavardhana	6386b45c08	[feat] use rename instead of recursive deletes (#11641 ) most of the delete calls today spend time in a blocking operation where multiple calls need to be recursively sent to delete the objects, instead we can use rename operation to atomically move the objects from the namespace to `tmp/.trash` we can schedule deletion of objects at this location once in 15, 30mins and we can also add wait times between each delete operation. this allows us to make delete's faster as well less chattier on the drives, each server runs locally a groutine which would clean this up regularly.	2021-02-26 09:52:27 -08:00
Andreas Auernhammer	1f659204a2	remove GetObject from ObjectLayer interface (#11635 ) This commit removes the `GetObject` method from the `ObjectLayer` interface. The `GetObject` method is not longer used by the HTTP handlers implementing the high-level S3 semantics. Instead, they use the `GetObjectNInfo` method which returns both, an object handle as well as the object metadata. Therefore, it is no longer necessary that a concrete `ObjectLayer` implements `GetObject`.	2021-02-26 09:52:02 -08:00
Harshavardhana	f9f6fd0421	fix: service account permissions generated from LDAP user (#11637 ) service accounts generated from LDAP parent user did not inherit correct permissions, this PR fixes this fully.	2021-02-25 13:49:59 -08:00
Klaus Post	85620dfe93	use bucket in path in distribution hash (#11634 ) Use bucket in erasure distribution hash. For the rare cases where objects with the same names are uploaded to many buckets.	2021-02-25 10:11:31 -08:00
Harshavardhana	a8e4f64ff3	Revert "fix: remove persistence layer for metacache store in memory (#11538 )" This reverts commit `b23659927c`.	2021-02-24 22:24:51 -08:00
Krishnan Parthasarathi	ca5c6e3160	fix: translate empty versionID string to null version where appropriate (#11629 ) We store the null version as empty string. We should translate it to null version for bucket with version suspended too.	2021-02-24 18:39:10 -08:00
Harshavardhana	b23659927c	fix: remove persistence layer for metacache store in memory (#11538 ) store the cache in-memory instead of disks to avoid large write amplifications for list heavy workloads, store in memory instead and let it auto expire.	2021-02-24 15:51:41 -08:00
Andreas Auernhammer	c1a49be639	use crypto/sha256 for FIPS 140-2 compliance (#11623 ) This commit replaces the usage of github.com/minio/sha256-simd with crypto/sha256 of the standard library in all non-performance critical paths. This is necessary for FIPS 140-2 compliance which requires that all crypto. primitives are implemented by a FIPS-validated module. Go can use the Google FIPS module. The boringcrypto branch of the Go standard library uses the BoringSSL FIPS module to implement crypto. primitives like AES or SHA256. We only keep github.com/minio/sha256-simd when computing the content-SHA256 of an object. Therefore, this commit relies on a build tag `fips`. When MinIO is compiled without the `fips` flag it will use github.com/minio/sha256-simd. When MinIO is compiled with the fips flag (go build --tags "fips") then MinIO uses crypto/sha256 to compute the content-SHA256.	2021-02-24 09:00:15 -08:00
Klaus Post	03172b89e2	Ensure cache has finished deserializing (#11620 ) Make sure that response has been fully deserialized before returning.	2021-02-24 02:59:49 -08:00
Harshavardhana	b517c791e9	[feat]: use DSYNC for xl.meta writes and NOATIME for reads (#11615 ) Instead of using O_SYNC, we are better off using O_DSYNC instead since we are only ever interested in data to be persisted to disk not the associated filesystem metadata. For reads we ask customers to turn off noatime, but instead we can proactively use O_NOATIME flag to avoid atime updates upon reads.	2021-02-24 00:14:16 -08:00
Petr Tichý	14aef52004	remove Content-MD5 on Range requests (#11611 ) This removes the Content-MD5 response header on Range requests in Azure Gateway mode. The partial content MD5 doesn't match the full object MD5 in metadata.	2021-02-23 19:32:56 -08:00
Andreas Auernhammer	d4b822d697	pkg/etag: add new package for S3 ETag handling (#11577 ) This commit adds a new package `etag` for dealing with S3 ETags. Even though ETag is often viewed as MD5 checksum of an object, handling S3 ETags correctly is a surprisingly complex task. While it is true that the ETag corresponds to the MD5 for the most basic S3 API operations, there are many exceptions in case of multipart uploads or encryption. In worse, some S3 clients expect very specific behavior when it comes to ETags. For example, some clients expect that the ETag is a double-quoted string and fail otherwise. Non-AWS compliant ETag handling has been a source of many bugs in the past. Therefore, this commit adds a dedicated `etag` package that provides functionality for parsing, generating and converting S3 ETags. Further, this commit removes the ETag computation from the `hash` package. Instead, the `hash` package (i.e. `hash.Reader`) should focus only on computing and verifying the content-sha256. One core feature of this commit is to provide a mechanism to communicate a computed ETag from a low-level `io.Reader` to a high-level `io.Reader`. This problem occurs when an S3 server receives a request and has to compute the ETag of the content. However, the server may also wrap the initial body with several other `io.Reader`, e.g. when encrypting or compressing the content: ``` reader := Encrypt(Compress(ETag(content))) ``` In such a case, the ETag should be accessible by the high-level `io.Reader`. The `etag` provides a mechanism to wrap `io.Reader` implementations such that the `ETag` can be accessed by a type-check. This technique is applied to the PUT, COPY and Upload handlers.	2021-02-23 12:31:53 -08:00
Harshavardhana	aa7244a9a4	fix: make sure to convert the error properly in HealBucket() (#11610 ) server startup code expects the object layer to properly convert error into a proper type, so that in situations when servers are coming up and quorum is not available servers wait on each other.	2021-02-23 09:23:11 -08:00
Harshavardhana	2a79ea0332	isServerResolvable its sufficient to check server is reachable (#11609 ) using isServerResolvable for expiration can lead to chicken and egg problems, a lock might expire knowingly when server is booting up causing perpetual locks getting expired.	2021-02-22 16:29:53 -08:00
Aditya Manthramurthy	02e7de6367	LDAP config: fix substitution variables (#11586 ) - In username search filter and username format variables we support %s for replacing with the username. - In group search filter we support %s for username and %d for the full DN of the username.	2021-02-22 13:20:36 -08:00
Harshavardhana	da676ac298	remove network calls for getLocalDisks (#11603 )	2021-02-22 13:19:44 -08:00
Harshavardhana	18ec933085	fix: for containers use root-disk detection cleverly (#11593 ) root-disk implemented currently had issues where root disk partitions getting modified might race and provide incorrect results, to avoid this lets rely again back on DeviceID and match it instead. In-case of containers `/data` is one such extra entity that needs to be verified for root disk, due to how 'overlay' filesystem works and the 'overlay' presents a completely different 'device' id - using `/data` as another entity for fallback helps because our containers describe 'VOLUME' parameter that allows containers to automatically have a virtual `/data` that points to the container root path this can either be at `/` or `/var/lib/` (on different partition)	2021-02-22 10:32:21 -08:00
Harshavardhana	c31d2c3fdc	fix: CrawlAndGetDataUsage close pipe() before using a new one (#11600 ) also additionally make sure errors during deserializer closes the reader with right error type such that Write() end actually see the final error, this avoids a waitGroup usage and waiting.	2021-02-22 10:04:32 -08:00
Harshavardhana	8778828a03	fix: read metadata in O_DIRECT if configured and supported (#11594 ) reduce the page-cache pressure completely by moving the entire read-phase of our operations to O_DIRECT, primarily this is going to be very useful for chatty metadata operations such as listing, scanner, ilm, healing like operations to avoid filling up the page-cache upon repeated runs.	2021-02-22 01:36:17 -08:00
Sarasa Kisaragi	48b212dd8e	Fix HDFS wrong filepath if subpath provided (#11574 )	2021-02-20 15:32:18 -08:00
Harshavardhana	be7de911c4	fix: update minio-go to fix an issue with S3 gateway (#11591 ) since we have changed our default envs to MINIO_ROOT_USER, MINIO_ROOT_PASSWORD this was not supported by minio-go credentials package, update minio-go to v7.0.10 for this support. This also addresses few bugs related to users had to specify AWS_ACCESS_KEY_ID as well to authenticate with their S3 backend if they only used MINIO_ROOT_USER.	2021-02-20 11:10:21 -08:00
Harshavardhana	8cad407e0b	fix: Bring support for symlink on regular files on NAS (#11383 ) fixes #11203	2021-02-20 00:30:12 -08:00
Poorna Krishnamoorthy	85d2187c20	fix: ETag mismatch for large upload in replica (#11587 )	2021-02-20 00:22:17 -08:00
Anis Elleuch	98d3f94996	metrics: Add the number of requests in the waiting queue (#11580 ) We can use this metric to check if there are too many S3 clients in the queue and could explain why some of those S3 clients are timing out. ``` minio_s3_requests_waiting_total{server="127.0.0.1:9000"} 9981 ``` If max_requests is 10000 then there is a strong possibility that clients are timing out because of the queue deadline.	2021-02-20 00:21:55 -08:00
mailsmail	173284903b	fix incorrect http range in SelectObjectContentHandler (#11585 )	2021-02-19 17:55:28 -08:00
Poorna Krishnamoorthy	2dce5d9442	fix: delete marker permanent delete replication (#11581 )	2021-02-18 16:35:37 -08:00
Anis Elleuch	f28b063091	heal: Use healDeleteDangling global const in self healing (#11579 ) A small fix, use healDeleteDangling constant instead of 'true' in the self-healing code.	2021-02-18 15:16:20 -08:00
Klaus Post	c5b2a8441b	fix: faster healing when disk is replaced. (#11520 )	2021-02-18 11:06:54 -08:00
Klaus Post	8a6b13c239	Avoid synchronizing usage writes (#11560 ) If the periodic `case <-t.C:` save gets held up for a long time it will end up synchronize all disk writes for saving the caches. We add jitter to per set writes so they don't sync up and don't hold a lock for the write, since it isn't needed anyway. If an outage prevents writes for a long while we also add individual waits for each disk in case there was a queue. Furthermore limit the number of buffers kept to 2GiB, since this could get huge in large clusters. This will not act as a hard limit but should be enough for normal operation.	2021-02-18 00:38:37 -08:00
Poorna Krishnamoorthy	8e8a792d9d	Allow delete marker replication from replica (#11566 ) in the case of active-active replication. This PR also has the following changes: - add docs on replication design - fix corner case of completing versioned delete on a delete marker when the target is down and `mc rm --vid` is performed repeatedly. Instead the version should still be retained in the `PENDING\|FAILED` state until replication sync completes. - remove `s3:Replication:OperationCompletedReplication` and `s3:Replication:OperationFailedReplication` from ObjectCreated events type	2021-02-18 00:33:51 -08:00
Harshavardhana	95e0acbb26	fix: allow accountInfo with creds with parentUsers (#11568 )	2021-02-17 20:57:17 -08:00
Poorna Krishnamoorthy	55037e6e54	lifecycle:Fix args passed to determine expiry header (#11567 )	2021-02-17 19:25:19 -08:00
Harshavardhana	289e1d8b2a	fix: reduce crawler memory usage by orders of magnitude (#11556 ) currently crawler waits for an entire readdir call to return until it processes usage, lifecycle, replication and healing - instead we should pass the applicator all the way down to avoid building any special stack for all the contents in a single directory. This allows for - no need to remember the entire list of entries per directory before applying the required functions - no need to wait for entire readdir() call to finish before applying the required functions	2021-02-17 15:34:42 -08:00
Harshavardhana	ffea6fcf09	fix: rename crawler as scanner in config (#11549 )	2021-02-17 12:04:11 -08:00
Klaus Post	11b2220696	Don't autoheal if disks are healing (#11558 ) Don't spawn automatic healing ops if a disk is healing.	2021-02-17 10:18:12 -08:00
Harshavardhana	aa8450a2a1	fix: parallelize getPoolIdx() for object lookup (#11547 )	2021-02-16 19:36:15 -08:00
Harshavardhana	7d4a2d2b68	fix: multiple pool reads parallelize when possible (#11537 )	2021-02-16 02:43:47 -08:00
Anis Elleuch	c4e12dc846	fix: in MultiDelete API return MalformedXML upon empty input (#11532 ) To follow S3 spec	2021-02-13 09:48:25 -08:00
Harshavardhana	a94a9c37fa	fix: support IAM policy handling for wildcard actions (#11530 ) This PR fixes - allow 's3:versionid` as a valid conditional for Get,Put,Tags,Object locking APIs - allow additional headers missing for object APIs - allow wildcard based action matching	2021-02-12 23:05:09 -08:00
Harshavardhana	79b6a43467	fix: avoid timed value for network calls (#11531 ) additionally simply timedValue to have RWMutex to avoid concurrent calls to DiskInfo() getting serialized, this has an effect on all calls that use GetDiskInfo() on the same disks. Such as getOnlineDisks, getOnlineDisksWithoutHealing	2021-02-12 18:17:52 -08:00
Shireesh Anjal	928de04f7a	fix: osinfos incomplete in case of warnings (#11505 ) The function used for getting host information (host.SensorsTemperaturesWithContext) returns warnings in some cases. Returning with error in such cases means we miss out on the other useful information already fetched (os info). If the OS info has been succesfully fetched, it should always be included in the output irrespective of whether the other data (CPU sensors, users) could be fetched or not.	2021-02-12 17:57:57 -08:00
Poorna Krishnamoorthy	93fd248b52	fix: save ModTime properly in disk cache (#11522 ) fix #11414	2021-02-11 19:25:47 -08:00
Harshavardhana	2a7b123895	turn off http2 for TLS setups for now (#11523 ) due to lots of issues with x/net/http2, as well as the bundled h2_bundle.go in the go runtime should be avoided for now. https://github.com/golang/go/issues/23559 https://github.com/golang/go/issues/42534 https://github.com/golang/go/issues/43989 https://github.com/golang/go/issues/33425 https://github.com/golang/go/issues/29246 With collection of such issues present, it make sense to remove HTTP2 support for now	2021-02-11 15:53:04 -08:00
Harshavardhana	b3c56b53fb	fix: metacache should only rename entries during cleanup (#11503 ) To avoid large delays in metacache cleanup, use rename instead of recursive delete calls, renames are cheaper move the content to minioMetaTmpBucket and then cleanup this folder once in 24hrs instead. If the new cache can replace an existing one, we should let it replace since that is currently being saved anyways, this avoids pile up of 1000's of metacache entires for same listing calls that are not necessary to be stored on disk.	2021-02-11 10:22:03 -08:00
Poorna Krishnamoorthy	f24d8127ab	fix: DeleteMultipleObjectsHandler to process deleted objects correctly (#11515 ) DeleteMarkerVersionID which is returned by the lower layer should not be used in the key to lookup ObjectToDelete map	2021-02-10 23:41:41 -08:00
Harshavardhana	7875d472bc	avoid notification for non-existent delete objects (#11514 ) Skip notifications on objects that might have had an error during deletion, this also avoids unnecessary replication attempt on such objects. Refactor some places to make sure that we have notified the client before we - notify - schedule for replication - lifecycle etc.	2021-02-10 22:00:42 -08:00
Harshavardhana	711adb9652	remove ipv6 fallbackdelay leave it as default	2021-02-10 17:35:09 -08:00
Poorna Krishnamoorthy	e6b4ea7618	More fixes for delete marker replication (#11504 ) continuation of PR#11491 for multiple server pools and bi-directional replication. Moving proxying for GET/HEAD to handler level rather than server pool layer as this was also causing incorrect proxying of HEAD. Also fixing metadata update on CopyObject - minio-go was not passing source version ID in X-Amz-Copy-Source header	2021-02-10 17:25:04 -08:00
Aditya Manthramurthy	466e95bb59	Return group DN instead of group name in LDAP STS (#11501 ) - Additionally, check if the user or their groups has a policy attached during the STS call. - Remove the group name attribute configuration value.	2021-02-10 16:52:49 -08:00
Harshavardhana	881f98e511	fix: use getPoolIdx in DeleteObjects() (#11513 ) filter out relevant objects for each pool to avoid calling, further delete operations on subsequent pools where some of these objects might not exist. This is mainly useful to avoid situations during bi-directional bucket replication.	2021-02-10 14:25:43 -08:00
Harshavardhana	cbf4bb62e0	fix: getPoolIdx decouple from top level options (#11512 ) top-level options shouldn't be passed down for GetObjectInfo() while verifying the objects in different pools, this is to make sure that we always get the value from the pool where the object exists.	2021-02-10 11:45:02 -08:00
Anis Elleuch	682482459d	Change the default object content-type to binary/octet-stream (#11508 )	2021-02-10 08:56:37 -08:00
Krishnan Parthasarathi	b87fae0049	Simplify PutObjReader for plain-text reader usage (#11470 ) This change moves away from a unified constructor for plaintext and encrypted usage. NewPutObjReader is simplified for the plain-text reader use. For encrypted reader use, WithEncryption should be called on an initialized PutObjReader. Plaintext: func NewPutObjReader(rawReader hash.Reader) PutObjReader The hash.Reader is used to provide payload size and md5sum to the downstream consumers. This is different from the previous version in that there is no need to pass nil values for unused parameters. Encrypted: func WithEncryption(encReader hash.Reader, key crypto.ObjectKey) (*PutObjReader, error) This method sets up encrypted reader along with the key to seal the md5sum produced by the plain-text reader (already setup when NewPutObjReader was called). Usage: ``` pReader := NewPutObjReader(rawReader) // ... other object handler code goes here // Prepare the encrypted hashed reader pReader, err = pReader.WithEncryption(encReader, objEncKey) ```	2021-02-10 08:52:50 -08:00
Shireesh Anjal	5a18d437ce	fix: drive hw info incomplete when smartinfo fails (#11509 ) Collection of SMART information doesn't work in certain scenarios e.g. in a container based setup. In such cases, instead of returning an error (without any data), we should only set the error on the smartinfo struct, so that other important drive hw info like device, mountpoint, etc is retained in the output.	2021-02-10 08:48:14 -08:00
Poorna Krishnamoorthy	93eb549a83	fix: duplicate delete marker attempts in bi-directional replication (#11491 )	2021-02-09 15:11:43 -08:00
Harshavardhana	fe3c39b583	use the new errgroup API whereever applicable (#11466 ) start using the new errgroup concurrency control API introduced in #11457	2021-02-09 12:08:25 -08:00
Harshavardhana	84d400487f	fix: accountInfo API to cater for federated setups (#11484 ) when MinIO is deployed in a federated setup, use etcd based listing of buckets to provide appropriate filtering of buckets per user.	2021-02-09 09:53:07 -08:00
Shireesh Anjal	3afa499885	fix: empty buckets/objects nodes in new setup (#11493 )	2021-02-09 09:52:38 -08:00
Krishna Srinivas	876b79b8d8	read-health check endpoint returns success if cluster can serve read requests (#11310 )	2021-02-09 01:00:44 -08:00
Ritesh H Shukla	3d74efa6b1	fux: copy object for encrypted objects (#11490 )	2021-02-08 19:58:17 -08:00
Harshavardhana	68d299e719	fix: case-insensitive lookups for metadata (#11489 ) continuation of #11487, with more changes	2021-02-08 18:12:28 -08:00
Poorna Krishnamoorthy	f9c5636c2d	fix: lookup metdata case insensitively (#11487 ) while setting replication options	2021-02-08 16:19:05 -08:00
Klaus Post	9b10118d34	Metacache add abs entry limit (#11483 ) Add an absolute limit to the number of metacaches for a bucket. Delete excess caches if they haven't been handed out in an hour.	2021-02-08 11:36:16 -08:00
Harshavardhana	0e3211f4ad	fix: server upgrades should have more descriptive error messages (#11476 ) during rolling upgrade, provide a more descriptive error message and discourage rolling upgrade in such situations, allowing users to take action. additionally also rename `slashpath -> pathutil` to avoid a slighly mis-pronounced usage of `path` package.	2021-02-08 10:15:12 -08:00
Harshavardhana	2e4d9124ad	honor region specified for remote targets (#11480 ) fixes #11472	2021-02-08 08:54:27 -08:00
Harshavardhana	6fef4c21b9	fix: align atomic variables for 32bit arch (#11475 ) fixes #11474	2021-02-08 08:51:12 -08:00
Poorna Krishnamoorthy	8e1bbd989a	replication:alloc UserDefined map before use (#11478 )	2021-02-07 22:01:10 -08:00
Sarasa Kisaragi	152d7cd95b	HDFS support keytab (#11473 )	2021-02-07 17:29:47 -08:00
Harshavardhana	0d057c777a	remove restriction for multi pool distribution algo	2021-02-06 16:19:05 -08:00
Anis Elleuch	275f7a63e8	lc: Apply DeleteAction correctly to objects (#11471 ) When lifecycle decides to Delete an object and not a version in a versioned bucket, the code should create a delete marker and not removing the scanned version. This commit fixes the issue.	2021-02-06 16:10:33 -08:00
Shireesh Anjal	97fe57bba9	Remove Connections from SysProcess struct (#11373 ) The connections info of the processes takes up a huge amount of space, and is not important for adding any useful health checks. Removing it will significantly reduce the size of the subnet health report.	2021-02-05 21:32:28 -08:00
Harshavardhana	88c1bb0720	fix: improper ticker usage in goroutines (#11468 ) - lock maintenance loop was incorrectly sleeping as well as using ticker badly, leading to extra expiration routines getting triggered that could flood the network. - multipart upload cleanup should be based on timer instead of ticker, to ensure that long running jobs don't get triggered twice. - make sure to get right lockers for object name	2021-02-05 19:23:48 -08:00
Harshavardhana	1fdafaf72f	fix: listing for directory object when delimiter is present (#11463 ) When you have heirarchy of prefixes with directory objects our current master would list directory objects as prefixes when delimiter is present, this is inconsistent with AWS S3 ``` aws s3api list-objects --endpoint-url http://localhost:9000 \ --profile minio --bucket testbucket-v --prefix new/ --delimiter / { "CommonPrefixes": [ { "Prefix": "new/" }, { "Prefix": "new/new/" } ] } ``` Instead this PR fixes this to behave like AWS S3 ``` aws s3api list-objects --endpoint-url http://localhost:9000 \ --profile minio --bucket testbucket-v --prefix new/ --delimiter / { "Contents": [ { "Key": "new/", "LastModified": "2021-02-05T06:27:42.660Z", "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"", "Size": 0, "StorageClass": "STANDARD", "Owner": { "DisplayName": "", "ID": "02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4" } } ], "CommonPrefixes": [ { "Prefix": "new/new/" } ] } ```	2021-02-05 16:24:40 -08:00
Ritesh H Shukla	5fe4bb6b36	Reduce redundant crawler logging (#11448 )	2021-02-05 15:51:11 -08:00
Harshavardhana	99b733d44c	fix: deletion of delete marker regression (#11465 ) fixes #11440 fixes #11451 fixes #11454	2021-02-05 15:06:23 -08:00
Klaus Post	b4ac05523b	Add parallel bucket healing during startup (#11457 ) Replaces #11449 Does concurrent healing but limits concurrency to 50 buckets. Aborts on first error. `errgroup.Group` is extended to facilitate this in a generic way.	2021-02-05 13:04:26 -08:00
Anis Elleuch	c7eacba41c	health-info: Add tags to errors (#11412 ) We use multiple libraries in health info, but the returned error does not indicate exactly what library call is failing, hence adding named tags to returned errors whenever applicable.	2021-02-05 12:37:15 -08:00
Anis Elleuch	1887c25279	xl: Fix feeding NumVersions & SuccessorModTime to lifecycle (#11462 ) After recent refactor where lifecycle started to rely on ObjectInfo to make decisions, it turned out there are some issues calculating Successor Modtime and NumVersions, hence the lifecycle is not working as expected in a versioning bucket in some cases. This commit fixes the behavior.	2021-02-05 11:59:08 -08:00
Harshavardhana	c9b0f595b9	support directory objects in listing in certain scenarios (#11452 ) When a directory object is presented as a `prefix` param our implementation tend to only list objects present common to the `prefix` than the `prefix` itself, to mimic AWS S3 like flat key behavior this PR ensures that if `prefix` is directory object, it should be automatically considered to be part of the eventual listing result. fixes #11370	2021-02-05 10:12:25 -08:00
Harshavardhana	8bb580abfc	fix: use getObjectNInfo to avoid bytes.Buffer usage (#11428 ) few places were still using legacy call GetObject() which was mainly designed for client response writer, use GetObjectNInfo() for internal calls instead.	2021-02-05 09:57:30 -08:00
Harshavardhana	da55a05587	fix aggressive expiration detection (#11446 ) for some flaky networks this may be too fast of a value choose a defensive value, and let this be addressed properly in a new refactor of dsync with renewal logic. Also enable faster fallback delay to cater for misconfigured IPv6 servers refer - https://golang.org/pkg/net/#Dialer - https://tools.ietf.org/html/rfc6555	2021-02-04 16:56:40 -08:00
Harshavardhana	3fc4d6f620	update dependenices for relevant projects (#11445 ) - minio-go -> v7.0.8 - ldap/v3 -> v3.2.4 - reedsolomon -> v1.9.11 - sio-go -> v0.3.1 - msgp -> v1.1.5 - simdjson-go, md5-simd, highwayhash	2021-02-04 13:49:52 -08:00
Ritesh H Shukla	67a8f37df0	fix: disk usage capacity metric reporting (#11435 )	2021-02-04 12:26:58 -08:00
ArthurMa	df0c678167	fix: ldap config parsing issue for UserDNSearchFilter (#11437 )	2021-02-04 11:07:29 -08:00
Harshavardhana	f108873c48	fix: replication metadata comparsion and other fixes (#11410 ) - using miniogo.ObjectInfo.UserMetadata is not correct - using UserTags from Map->String() can change order - ContentType comparison needs to be removed. - Compare both lowercase and uppercase key names. - do not silently error out constructing PutObjectOptions if tag parsing fails - avoid notification for empty object info, failed operations should rely on valid objInfo for notification in all situations - optimize copyObject implementation, also introduce a new replication event - clone ObjectInfo() before scheduling for replication - add additional headers for comparison - remove strings.EqualFold comparison avoid unexpected bugs - fix pool based proxying with multiple pools - compare only specific metadata Co-authored-by: Poorna Krishnamoorthy <poornas@users.noreply.github.com>	2021-02-03 20:41:33 -08:00
Andreas Auernhammer	871b450dbd	crypto: add support for decrypting SSE-KMS metadata (#11415 ) This commit refactors the SSE implementation and add S3-compatible SSE-KMS context handling. SSE-KMS differs from SSE-S3 in two main aspects: 1. The client can request a particular key and specify a KMS context as part of the request. 2. The ETag of an SSE-KMS encrypted object is not the MD5 sum of the object content. This commit only focuses on the 1st aspect. A client can send an optional SSE context when using SSE-KMS. This context is remembered by the S3 server such that the client does not have to specify the context again (during multipart PUT / GET / HEAD ...). The crypto. context also includes the bucket/object name to prevent renaming objects at the backend. Now, AWS S3 behaves as following: - If the user does not provide a SSE-KMS context it does not store one - resp. does not include the SSE-KMS context header in the response (e.g. HEAD). - If the user specifies a SSE-KMS context without the bucket/object name then AWS stores the exact context the client provided but adds the bucket/object name internally. The response contains the KMS context without the bucket/object name. - If the user specifies a SSE-KMS context with the bucket/object name then AWS again stores the exact context provided by the client. The response contains the KMS context with the bucket/object name. This commit implements this behavior w.r.t. SSE-KMS. However, as of now, no such object can be created since the server rejects SSE-KMS encryption requests. This commit is one stepping stone for SSE-KMS support. Co-authored-by: Harshavardhana <harsha@minio.io>	2021-02-03 15:19:08 -08:00
Harshavardhana	f71e192343	avoid listing an empty dir without __XLDIR__ (#11427 ) ``` minio server /tmp/disk{1...4} mc mb myminio/testbucket/ mkdir -p /tmp/disk{1..4}/testbucket/test-prefix/ ``` This would end up being listed in the current master, this PR fixes this situation. If a directory is a leaf dir we should it being listed, since it cannot be deleted anymore with DeleteObject, DeleteObjects() API calls because we natively support directories now. Avoid listing it and let healing purge this folder eventually in the background.	2021-02-03 14:06:54 -08:00
Anis Elleuch	b3f81e75f6	xl: Make it clear when to create delete marker for a non existant object (#11423 )	2021-02-03 10:33:43 -08:00
Klaus Post	a71e0483c9	Fix nil disks in getOnlineDisksWithHealing (#11419 ) If a disk is skipped when nil it is still returned.	2021-02-02 17:04:37 -08:00
Klaus Post	4a9d9c8585	Update colinmarc/hdfs (#11417 ) Updates needed dependency as well. Fixes #11416	2021-02-02 15:37:30 -08:00
Harshavardhana	c885777ac6	Add support for TCP_QUICKACK (#11369 ) TCP_QUICKACK is a setting that allows TCP endpoints to acknowledge the receipt of data instantly in situations where they would normally wait to see if more data would be arriving. https://assets.extrahop.com/whitepapers/TCP-Optimization-Guide-by-ExtraHop.pdf	2021-02-02 09:44:18 -08:00
Poorna Krishnamoorthy	fe3aca70c3	Make number of replication workers configurable. (#11379 ) MINIO_API_REPLICATION_WORKERS env.var and `mc admin config set api` allow number of replication workers to be configurable. Defaults to half the number of cpus available. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2021-02-02 16:45:06 +05:30
Ritesh H Shukla	c4848f9b4f	Add process start time to cluster metrics. (#11405 )	2021-02-01 23:02:18 -08:00
Andreas Auernhammer	838d4dafbd	gateway: don't use encrypted ETags for If-Match (#11400 ) This commit fixes a bug in the S3 gateway that causes GET requests to fail when the object is encrypted by the gateway itself. The gateway was not able to GET the object since it always specified a `If-Match` pre-condition checking that the object ETag matches an expected ETag - even for encrypted ETags. The problem is that an encrypted ETag will never match the ETag computed by the backend causing the `If-Match` pre-condition to fail. This commit fixes this by not sending an `If-Match` header when the ETag is encrypted. This is acceptable because: 1. A gateway-encrypted object consists of two objects at the backend and there is no way to provide a concurrency-safe implementation of two consecutive S3 GETs in the deployment model of the S3 gateway. Ref: S3 gateways are self-contained and isolated - and there may be multiple instances at the same time (no lock across instances). 2. Even if the data object changes (concurrent PUT) while gateway A has download the metadata object (but not issued the GET to the data object => data race) then we don't return invalid data to the client since the decryption (of the currently uploaded data) will fail - given the metadata of the previous object.	2021-02-01 23:02:08 -08:00
Anis Elleuch	e96fdcd5ec	tagging: Add event notif for PUT object tagging (#11366 ) An optimization to avoid double calling for during PutObject tagging	2021-02-01 13:52:51 -08:00
Anis Elleuch	6ef678663e	xl: Create a delete-marker when no other version exists (#11362 ) Currently, it is not possible to create a delete-marker when xl.meta does not exist (no version is created for that object yet). This makes a problem for replication and mc mirroring with versioning enabled. This also follows S3 specification.	2021-02-01 13:23:50 -08:00
Harshavardhana	f737a027cf	fix: regression introduced in federated listing buckets regression was introduced in `6cd255d516` fix it properly.	2021-02-01 12:06:58 -08:00
Anis Elleuch	65aa2bc614	ilm: Remove object in HEAD/GET if having an applicable ILM rule (#11296 ) Remove an object on the fly if there is a lifecycle rule with delete expiry action for the corresponding object.	2021-02-01 09:52:11 -08:00
Andreas Auernhammer	33554651e9	crypto: deprecate native Hashicorp Vault support (#11352 ) This commit deprecates the native Hashicorp Vault support and removes the legacy Vault documentation. The native Hashicorp Vault documentation is marked as outdated and deprecated for over a year now. We give another 6 months before we start removing Hashicorp Vault support and show a deprecation warning when a MinIO server starts with a native Vault configuration.	2021-01-29 17:55:37 -08:00
Poorna Krishnamoorthy	c82aef0a56	fix ObjectInfo returned by CopyObject (#11377 ) erasure CopyObject was returning old metadata	2021-01-29 14:49:18 -08:00
Harshavardhana	1e53bf2789	fix: allow expansion with newer constraints for older setups (#11372 ) currently we had a restriction where older setups would need to follow previous style of "stripe" count being same expansion, we can relax that instead newer pools can be expanded for older setups with newer constraints of common parity ratio.	2021-01-29 11:40:55 -08:00
Ritesh H Shukla	c8489a8f0c	fix: log notification errors only once (#11350 )	2021-01-28 13:40:31 -08:00
Klaus Post	2680772d4b	Don't mark remotes online when shutting down (#11368 ) Shutting down will mark remotes online when the shutdown has started since the context is canceled. For example: ``` API: SYSTEM() Time: 16:21:31 CET 01/28/2021 DeploymentID: 313b0065-c5a1-4aa3-9233-07223e77a730 Error: Storage resources are insufficient for the write operation .minio.sys/tmp/ced455c4-3d27-4bdd-95fc-b4707a179b8a/fd934ef3-8fc8-4330-abc1-f039fbbb9700/part.1 (cmd.InsufficientWriteQuorum) 1: d:\minio\minio\cmd\data-usage.go:56:cmd.storeDataUsageInBackend() Exiting on signal: INTERRUPT Client http://127.0.0.1:9002/minio/lock/v5 online Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d3/v24 online Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d2/v24 online Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d1/v24 online Client http://127.0.0.1:9002/minio/peer/v12 online Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d4/v24 online ``` Use a fresh context for health checks.	2021-01-28 13:38:12 -08:00
Harshavardhana	567f7bdd05	fix: verify overlapping domains when > 1	2021-01-28 13:08:53 -08:00
Harshavardhana	6cd255d516	fix: allow updated domain names in federation (#11365 ) additionally also disallow overlapping domain names	2021-01-28 11:44:48 -08:00
Aditya Manthramurthy	e79829b5b3	Bind to lookup user after user auth to lookup ldap groups (#11357 )	2021-01-27 17:31:21 -08:00
Poorna Krishnamoorthy	fd3f02637a	fix: replication regression due to proxying requests (#11356 ) In PR #11165 due to incorrect proxying for 2 way replication even when the object was not yet replicated Additionally, fix metadata comparisons when deciding to do full replication vs metadata copy. fixes #11340	2021-01-27 11:22:34 -08:00
Harshavardhana	e019f21bda	fix: trigger heal if one of the parts are not found (#11358 ) Previously we added heal trigger when bit-rot checks failed, now extend that to support heal when parts are not found either. This healing gets only triggered if we can successfully decode the object i.e read quorum is still satisfied for the object.	2021-01-27 10:21:14 -08:00
Anis Elleuch	e9ac7b0fb7	heal: Remove empty directories (#11354 ) Since the introduction of __XLDIR__, an empty directory does not have a meaning anymore in erasure mode. Make healing removes it wherever it finds it.	2021-01-27 02:19:28 -08:00
Harshavardhana	1debd722b5	rename last remaining Zone->Pool	2021-01-26 20:47:42 -08:00
massintha azamoum	e7f6051f19	Send bucket name to peers when bucket notification is enabled (#11351 )	2021-01-26 13:48:28 -08:00
Harshavardhana	6717295e18	fix: rename audit log docs and datastructure	2021-01-26 13:39:55 -08:00
Anis Elleuch	00cff1aac5	audit: per object send pool number, set number and servers per operation (#11233 )	2021-01-26 13:21:51 -08:00
Harshavardhana	9722531817	fix: purge LDAP deprecated keys	2021-01-26 09:53:29 -08:00
Harshavardhana	5c6bfae4c7	fix: load credentials from etcd directly when possible (#11339 ) under large deployments loading credentials might be time consuming, while this is okay and we will not respond quickly for `mc admin user list` like queries but it is possible to support `mc admin user info` just like how we handle authentication by fetching the user directly from persistent store. additionally support service accounts properly, reloaded from etcd during watch() - this was missing This PR is also half way remedy for #11305	2021-01-25 20:01:49 -08:00
Aditya Manthramurthy	5f51ef0b40	Add LDAP Lookup-Bind mode (#11318 ) This change allows the MinIO server to be configured with a special (read-only) LDAP account to perform user DN lookups. The following configuration parameters are added (along with corresponding environment variables) to LDAP identity configuration (under `identity_ldap`): - lookup_bind_dn / MINIO_IDENTITY_LDAP_LOOKUP_BIND_DN - lookup_bind_password / MINIO_IDENTITY_LDAP_LOOKUP_BIND_PASSWORD - user_dn_search_base_dn / MINIO_IDENTITY_LDAP_USER_DN_SEARCH_BASE_DN - user_dn_search_filter / MINIO_IDENTITY_LDAP_USER_DN_SEARCH_FILTER This lookup-bind account is a service account that is used to lookup the user's DN from their username provided in the STS API. When configured, searching for the user DN is enabled and configuration of the base DN and filter for search is required. In this "lookup-bind" mode, the username format is not checked and must not be specified. This feature is to support Active Directory setups where the DN cannot be simply derived from the username. When the lookup-bind is not configured, the old behavior is enabled: the minio server performs LDAP lookups as the LDAP user making the STS API request and the username format is checked and configuring it is required.	2021-01-25 14:26:10 -08:00
Harshavardhana	7e266293e6	fix: notify bucket replication after replication/ilm (#11343 )	2021-01-25 14:04:41 -08:00
Harshavardhana	eb6871ecd9	fix: LoginSTS should be an inline implementation (#11337 ) STS tokens can be obtained by using local APIs once the remote JWT token is presented, current code was not validating the incoming token in the first place and was incorrectly making a network operation using that token. For the most part this always works without issues, but under adversarial scenarios it exposes client to hand-craft a request that can reach internal services without authentication. This kind of proxying should be avoided before validating the incoming token.	2021-01-25 10:15:03 -08:00
Harshavardhana	9cdd981ce7	fix: expire locks only on participating lockers (#11335 ) additionally also add a new ForceUnlock API, to allow forcibly unlocking locks if possible.	2021-01-25 10:01:27 -08:00
Anis Elleuch	bd8020aba8	heal: Decode object name in healing result (#11348 ) The user can see __XLDIR__ prefix in mc admin heal when the command heals an empty object with a trailing slash. This commit decodes the name of the object before sending it back to the upper level.	2021-01-25 09:53:37 -08:00
Harshavardhana	09bc49bd51	fix: healBucket across sets should capture results properly (#11341 ) healing `.minio.sys/config` returns incorrect quorum errors across sets, healing of the buckets.	2021-01-25 09:45:09 -08:00
Harshavardhana	82f0471d1b	honor maxWait heal config when maxIO hits (#11338 )	2021-01-25 07:53:12 -08:00
Harshavardhana	6a95f412c9	avoid double CORS headers in federation (#11334 ) CORS proxying adds double headers one by the receiving server, one by proxied server. Remove them before proxying when 'Origin' header is found.	2021-01-23 18:27:23 -08:00
Ritesh H Shukla	7575c24037	Add open FD and FD limit to cluster metrics (#11328 )	2021-01-22 18:30:16 -08:00
Harshavardhana	43f973c4cf	fix: check for O_DIRECT support for reads and writes (#11331 ) In-case user enables O_DIRECT for reads and backend does not support it we shall proceed to turn it off instead and print a warning. This validation avoids any unexpected downtimes that users may incur.	2021-01-22 15:38:21 -08:00
Harshavardhana	1b453728a3	initialize forwarder after init() to avoid crashes (#11330 ) DNSCache dialer is a global value initialized in init(), whereas `go` keeps `var =` before `init()` , also we don't need to keep proxy routers as global entities - register the forwarder as necessary to avoid crashes.	2021-01-22 15:37:41 -08:00
Harshavardhana	a6c146bd00	validate storage class across pools when setting config (#11320 ) ``` mc admin config set alias/ storage_class standard=EC:3 ``` should only succeed if parity ratio is valid for all server pools, if not we should fail proactively. This PR also needs to bring other changes now that we need to cater for variadic drive counts per pool. Bonus fixes also various bugs reproduced with - GetObjectWithPartNumber() - CopyObjectPartWithOffsets() - CopyObjectWithMetadata() - PutObjectPart,PutObject with truncated streams	2021-01-22 12:09:24 -08:00
Klaus Post	2167ba0111	Feed correct part number to sio (#11326 ) When offsets were specified we relied on the first part number to be correct. Recalculate based on offset.	2021-01-21 08:43:03 -08:00
Klaus Post	4e6d717f39	Compress profiling data (#11313 ) Trace data can be rather large and compresses fine. Compress profile data in zip files: ``` 277.895.314 before.profiles.zip 152.800.318 after.profiles.zip ```	2021-01-20 15:49:53 -08:00
Poorna Krishnamoorthy	845e251fa9	fix: crash in notificationsys when peers online is 0 (#11307 ) Check if the number of peers online > 0 before using peerClient	2021-01-20 13:13:05 -08:00
Harshavardhana	d1a8f0b786	fix possible crashes on deleteMarker replication (#11308 ) Delete marker can have `metaSys` set to nil, that can lead to crashes after the delete marker has been healed. Additionally also fix isObjectDangling check for transitioned objects, that do not have parts should be treated similar to Delete marker.	2021-01-20 13:12:12 -08:00
Klaus Post	dac19d7272	Clarify root disk error (#11314 ) Make it clearer what the problem is and how to resolve it.	2021-01-20 13:11:42 -08:00
Harshavardhana	7624c8b9bb	fix: honor storage class uniformity for multiple pools (#11309 )	2021-01-20 01:41:18 -08:00
Klaus Post	19fb1086b2	select: Fix leak on compressed files (#11302 ) Properly close gzip reader when done reading fixes #11300	2021-01-19 17:51:46 -08:00
Harshavardhana	a5e23a40ff	fix: allow delayed etcd updates to have fallbacks (#11151 ) fixes #11149	2021-01-19 10:05:41 -08:00
Harshavardhana	1ad2b7b699	fix: add stricter validation for erasure server pools (#11299 ) During expansion we need to validate if - new deployment is expanded with newer constraints - existing deployment is expanded with older constraints - multiple server pools rejected if they have different deploymentID and distribution algo	2021-01-19 10:01:31 -08:00
Harshavardhana	b5049d541f	fix: reduce an extra readdir() attempted on non-legacy setups (#11301 ) to verify moving content and preserving legacy content, we have way to detect the objects through readdir() this path is not necessary for most common cases on newer setups, avoid readdir() to save multiple system calls. also fix the CheckFile behavior for most common use case i.e without legacy format.	2021-01-19 10:01:06 -08:00
Harshavardhana	e0055609bb	fix: crawler to skip healing the drives in a set being healed (#11274 ) If an erasure set had a drive replacement recently, we don't need to attempt healing on another drive with in the same erasure set - this would ensure we do not double heal the same content and also prioritizes usage for such an erasure set to be calculated sooner.	2021-01-19 02:40:52 -08:00
Klaus Post	e8ce348da1	crypto: Escape JSON text (#10794 ) Escape the JSON keys+values from the context. We do not add the HTML escapes, since that is an extra escape level not mandatory for JSON.	2021-01-19 01:39:04 -08:00
Ritesh H Shukla	b4add82bb6	Updated Prometheus metrics (#11141 ) * Add metrics for nodes online and offline * Add cluster capacity metrics * Introduce v2 metrics	2021-01-18 20:35:38 -08:00
Harshavardhana	3ca6330661	fix: optimize parentDirIsObject by moving isObject to storage layer (#11291 ) For objects with `N` prefix depth, this PR reduces `N` such network operations by converting `CheckFile` into a single bulk operation. Reduction in chattiness here would allow disks to be utilized more cleanly, while maintaining the same functionality along with one extra volume check stat() call is removed. Update tests to test multiple sets scenario	2021-01-18 12:25:22 -08:00
Aditya Manthramurthy	3163a660aa	Fix support for multiple LDAP user formats (#11276 ) Fixes support for using multiple base DNs for user search in the LDAP directory allowing users from different subtrees in the LDAP hierarchy to request credentials. - The username in the produced credentials is now the full DN of the LDAP user to disambiguate users in different base DNs.	2021-01-17 21:54:32 -08:00
Harshavardhana	0dadfd1b3d	fix: do not compute usage for not found lifecycle operations (#11288 ) Currently we would proceed to apply incorrect lifecycle policies for non-existent objects, this PR handles them appropriately.	2021-01-17 13:58:41 -08:00
Harshavardhana	4315f93421	fix: make sure parentDirIsObject is used at set level (#11280 ) parentDirIsObject is not using set level understanding to check for parent objects, without this it can lead to objects that can actually reside on a separate set as objects and would conflict.	2021-01-17 01:11:48 -08:00
Harshavardhana	ddb5d7043a	fix: standard storage class is allowed to be '0'	2021-01-16 17:32:25 -08:00
Harshavardhana	f903cae6ff	Support variable server pools (#11256 ) Current implementation requires server pools to have same erasure stripe sizes, to facilitate same SLA and expectations. This PR allows server pools to be variadic, i.e they do not have to be same erasure stripe sizes - instead they should have SLA for parity ratio. If the parity ratio cannot be guaranteed by the new server pool, the deployment is rejected i.e server pool expansion is not allowed.	2021-01-16 12:08:02 -08:00
Poorna Krishnamoorthy	7090bcc8e0	fix: doc links and delete replication permissions enforcement (#11285 )	2021-01-15 15:22:55 -08:00
Harshavardhana	c222bde14b	fix: use common logging implementation for DNSCache (#11284 )	2021-01-15 14:04:56 -08:00
Poorna Krishnamoorthy	feaf8dfb9a	Fix replication status reported on completion (#11273 ) Fixes: #11272	2021-01-13 11:52:28 -08:00
Harshavardhana	628ef081d1	fix: preserve cache calculated previously while moving from v2 to v3 (#11269 ) This ensures that all the prometheus monitoring and usage trackers to avoid alerts configured, although we cannot support v1 to v2 here - we can v2 to v3.	2021-01-13 09:58:08 -08:00
Harshavardhana	44dff36ff7	listing with prefix prefixed with '/' should be ignored (#11268 ) fixes #11265	2021-01-13 09:44:11 -08:00
Poorna Krishnamoorthy	b97d53b29c	fix remote target healthcheck (#11267 )	2021-01-12 20:48:04 -08:00
Harshavardhana	1a5775e2e8	enable small and large file optimization (#11260 ) - for large objects we found that 1MiB block for r/w respectively. - for small objects we found that 128KiB block for r/w respectively.	2021-01-12 10:20:39 -08:00
Anis Elleuch	e2579b1f5a	azure: Use default upload parameters to avoid consuming too much memory (#11251 ) A lot of memory is consumed when uploading small files in parallel, use the default upload parameters and add MINIO_AZURE_UPLOAD_CONCURRENCY for users to tweak.	2021-01-11 22:48:09 -08:00
Poorna Krishnamoorthy	7824e19d20	Allow synchronous replication if enabled. (#11165 ) Synchronous replication can be enabled by setting the --sync flag while adding a remote replication target. This PR also adds proxying on GET/HEAD to another node in a active-active replication setup in the event of a 404 on the current node.	2021-01-11 22:36:51 -08:00
Harshavardhana	317305d5f9	fix: regression in adding new replication targets (#11257 )	2021-01-11 09:08:42 -08:00
Harshavardhana	e4e117faab	fix: enable xl.json to xl.meta only if legacy drive is found (#11255 ) another optimization is renameLegacyMetadata() never needs to validate bucket with os.Stat() again, leading to reduction in one extra syscall.	2021-01-11 02:27:04 -08:00
Klaus Post	51dad1d130	Fix missing GetObjectNInfo Closure (#11243 ) Review for missing Close of returned value from `GetObjectNInfo`. This was often obscured by the stuff that auto-unlocks when reaching EOF.	2021-01-08 10:12:26 -08:00
Harshavardhana	4593b146be	fix: print errors only when metacache status has errors (#11248 )	2021-01-08 16:52:19 +05:30
Harshavardhana	f21d650ed4	fix: readData in bulk call using messagepack byte wrappers (#11228 ) This PR refactors the way we use buffers for O_DIRECT and to re-use those buffers for messagepack reader writer. After some extensive benchmarking found that not all objects have this benefit, and only objects smaller than 64KiB see this benefit overall. Benefits are seen from almost all objects from 1KiB - 32KiB Beyond this no objects see benefit with bulk call approach as the latency of bytes sent over the wire v/s streaming content directly from disk negate each other with no remarkable benefits. All other optimizations include reuse of msgp.Reader, msgp.Writer using sync.Pool's for all internode calls.	2021-01-07 19:27:31 -08:00
Harshavardhana	a4f6705874	expire stale locks when owner is down (#11247 ) fixes #11246	2021-01-07 19:16:18 -08:00
Poorna Krishnamoorthy	b35b537e3f	Pass versionID to checkReplicateDelete in web handler (#11244 )	2021-01-07 15:28:27 -08:00
Harshavardhana	5c52d5ffc7	fix: treat errVolumeNotFound as EOF error in listPathRaw (#11238 )	2021-01-07 09:52:53 -08:00
Harshavardhana	f0808bb2e5	fix: getObject fd leaks in transition and replication code (#11237 )	2021-01-06 16:13:10 -08:00
Harshavardhana	a6dee21092	initialize IAM store before Init() to avoid any crash (#11236 )	2021-01-06 13:40:20 -08:00
Anis Elleuch	6f781c5e7a	heal: Reduce whitespace ticker to 5 seconds (#11234 ) 30 seconds white spaces is long for some setups which time out when no read activity in short time, reduce the subnet health white space ticker to 5 seconds, since it has no cost at all.	2021-01-06 13:29:50 -08:00
Harshavardhana	f8ca859790	fix: server/gateway banner formatting (#11230 )	2021-01-06 10:38:07 -08:00
Harshavardhana	76e2713ffe	fix: use buffers only when necessary for io.Copy() (#11229 ) Use separate sync.Pool for writes/reads Avoid passing buffers for io.CopyBuffer() if the writer or reader implement io.WriteTo or io.ReadFrom respectively then its useless for sync.Pool to allocate buffers on its own since that will be completely ignored by the io.CopyBuffer Go implementation. Improve this wherever we see this to be optimal. This allows us to be more efficient on memory usage. ``` 385 // copyBuffer is the actual implementation of Copy and CopyBuffer. 386 // if buf is nil, one is allocated. 387 func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) { 388 // If the reader has a WriteTo method, use it to do the copy. 389 // Avoids an allocation and a copy. 390 if wt, ok := src.(WriterTo); ok { 391 return wt.WriteTo(dst) 392 } 393 // Similarly, if the writer has a ReadFrom method, use it to do the copy. 394 if rt, ok := dst.(ReaderFrom); ok { 395 return rt.ReadFrom(src) 396 } ``` From readahead package ``` // WriteTo writes data to w until there's no more data to write or when an error occurs. // The return value n is the number of bytes written. // Any error encountered during the write is also returned. func (a *reader) WriteTo(w io.Writer) (n int64, err error) { if a.err != nil { return 0, a.err } n = 0 for { err = a.fill() if err != nil { return n, err } n2, err := w.Write(a.cur.buffer()) a.cur.inc(n2) n += int64(n2) if err != nil { return n, err } ```	2021-01-06 09:36:55 -08:00
Harshavardhana	b5d291ea88	fix: rename remaining zone -> pool (#11231 )	2021-01-06 09:35:47 -08:00
Klaus Post	eb9172eecb	Allow Compression + encryption (#11103 )	2021-01-05 20:08:35 -08:00
Poorna Krishnamoorthy	64bddf47d8	Pass deletemarker correctly to replicate opts (#11227 ) fixes: #11180	2021-01-05 14:12:37 -08:00
Harshavardhana	4ed45ce543	fix: healing buckets during pool expansion (#11224 ) fixes #11209	2021-01-05 13:24:22 -08:00
Klaus Post	ad511b0eb8	tests: Fix occasional data race (#11223 ) CI tests could trigger a data race. Servers are generally not expected to reinitialize, so tests could trigger data races when reinitializing and async operations are running. We add the option to safely reset global vars instead of overwriting. Fixes races like: ``` WARNING: DATA RACE Read at 0x00000477ab18 by goroutine 1159: github.com/minio/minio/cmd.FileInfo.ToObjectInfo() /home/runner/work/minio/minio/cmd/erasure-metadata.go:105 +0x16d github.com/minio/minio/cmd.erasureObjects.putObject() /home/runner/work/minio/minio/cmd/erasure-object.go:748 +0x13f8 github.com/minio/minio/cmd.(erasureObjects).listPath.func3.2() /home/runner/work/minio/minio/cmd/metacache-set.go:682 +0x7d3 github.com/minio/minio/cmd.newMetacacheBlockWriter.func1.2() /home/runner/work/minio/minio/cmd/metacache-stream.go:777 +0x1c4 github.com/minio/minio/cmd.newMetacacheBlockWriter.func1() /home/runner/work/minio/minio/cmd/metacache-stream.go:806 +0x614 Previous write at 0x00000477ab18 by goroutine 1269: [failed to restore the stack] Goroutine 1159 (running) created at: github.com/minio/minio/cmd.newMetacacheBlockWriter() /home/runner/work/minio/minio/cmd/metacache-stream.go:760 +0x112 github.com/minio/minio/cmd.(erasureObjects).listPath.func3() /home/runner/work/minio/minio/cmd/metacache-set.go:672 +0xe22 Goroutine 1269 (running) created at: testing.(T).Run() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1095 +0x537 testing.runTests.func1() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1339 +0xa6 testing.tRunner() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1050 +0x1eb testing.runTests() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1337 +0x594 testing.(M).Run() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1252 +0x2ff github.com/minio/minio/cmd.TestMain() /home/runner/work/minio/minio/cmd/test-utils_test.go:120 +0x44e main.main() _testmain.go:1408 +0x223 ================== ================== WARNING: DATA RACE Read at 0x00000477aae8 by goroutine 1159: github.com/minio/minio/cmd.(BucketVersioningSys).Enabled() /home/runner/work/minio/minio/cmd/bucket-versioning.go:26 +0x52 github.com/minio/minio/cmd.FileInfo.ToObjectInfo() /home/runner/work/minio/minio/cmd/erasure-metadata.go:105 +0x197 github.com/minio/minio/cmd.erasureObjects.putObject() /home/runner/work/minio/minio/cmd/erasure-object.go:748 +0x13f8 github.com/minio/minio/cmd.(erasureObjects).listPath.func3.2() /home/runner/work/minio/minio/cmd/metacache-set.go:682 +0x7d3 github.com/minio/minio/cmd.newMetacacheBlockWriter.func1.2() /home/runner/work/minio/minio/cmd/metacache-stream.go:777 +0x1c4 github.com/minio/minio/cmd.newMetacacheBlockWriter.func1() /home/runner/work/minio/minio/cmd/metacache-stream.go:806 +0x614 Previous write at 0x00000477aae8 by goroutine 1269: [failed to restore the stack] Goroutine 1159 (running) created at: github.com/minio/minio/cmd.newMetacacheBlockWriter() /home/runner/work/minio/minio/cmd/metacache-stream.go:760 +0x112 github.com/minio/minio/cmd.(erasureObjects).listPath.func3() /home/runner/work/minio/minio/cmd/metacache-set.go:672 +0xe22 Goroutine 1269 (running) created at: testing.(T).Run() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1095 +0x537 testing.runTests.func1() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1339 +0xa6 testing.tRunner() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1050 +0x1eb testing.runTests() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1337 +0x594 testing.(*M).Run() /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1252 +0x2ff github.com/minio/minio/cmd.TestMain() /home/runner/work/minio/minio/cmd/test-utils_test.go:120 +0x44e main.main() _testmain.go:1408 +0x223 ================== ```	2021-01-05 10:45:26 -08:00
Harshavardhana	cb0eaeaad8	feat: migrate to ROOT_USER/PASSWORD from ACCESS/SECRET_KEY (#11185 )	2021-01-05 10:22:57 -08:00
Harshavardhana	d0027c3c41	do not use large buffers if not necessary (#11220 ) without this change, there is a performance regression for small objects GETs, this makes the overall speed to go back to pre '59d363' commit days.	2021-01-04 18:51:52 -08:00
Anis Elleuch	cb7fc99368	handlers: Avoid initializing a struct in each handler call (#11217 )	2021-01-04 09:54:22 -08:00
Harshavardhana	a4383051d9	remove/deprecate crawler disable environment (#11214 ) with changes present to automatically throttle crawler at runtime, there is no need to have an environment value to disable crawling. crawling is a fundamental piece for healing, lifecycle and many other features there is no good reason anyone would need to disable this on a production system. * Apply suggestions from code review	2021-01-04 09:43:31 -08:00
Harshavardhana	e7ae49f9c9	fix: calculate prometheus disks_offline/disks_total correctly (#11215 ) fixes #11196	2021-01-04 09:42:09 -08:00
Anis Elleuch	153d4be032	tracing: NumSubscribers() to use atomic instead of mutex (#11219 ) globalSubscribers.NumSubscribers() is heavily used in S3 requests and it uses mutex, use atomic.Load instead since it is faster Co-authored-by: Anis Elleuch <anis@min.io>	2021-01-04 09:40:30 -08:00
Anis Elleuch	dfd99b6d8f	handlers: Little bit more optimizations (#11211 )	2021-01-04 00:01:06 -08:00
Harshavardhana	c4b1d394d6	erasure: avoid io.Copy in hotpaths to reduce allocation (#11213 )	2021-01-03 16:27:34 -08:00
Harshavardhana	c4131c2798	feat: Small object optimization read data in single bulk call (#11207 )	2021-01-03 11:27:57 -08:00
Anis Elleuch	c9d502e6fa	parentDirIsObject() to return quickly with inexistant parent (#11204 ) Rewrite parentIsObject() function. Currently if a client uploads a/b/c/d, we always check if c, b, a are actual objects or not. The new code will check with the reverse order and quickly quit if the segment doesn't exist. So if a, b, c in 'a/b/c' does not exist in the first place, then returns false quickly.	2021-01-02 12:01:29 -08:00
Anis Elleuch	677e80c0f8	xl: Remove check-dir in ReadVersion (#11200 ) The only purpose of check-dir flag in ReadVersion is to return 404 when an object has xl.meta but without data. This is causing an extract call to the disk which can be penalizing in case of busy system where disks receive many concurrent access.	2021-01-02 10:35:57 -08:00
Harshavardhana	aa85af4d1a	fix: missing CopyObjectPart maxClients reorder	2021-01-01 23:07:37 -08:00
Anis Elleuch	ae731d232f	trace: Reorder http/trace maxClients wrapping for correct tracing (#11202 ) mc admin trace does not show the correct handler name in the output: it is printing `maxClients' for all handlers. The reason is that the wrong order of handler wrapping.	2021-01-01 23:06:07 -08:00
Anis Elleuch	a317d220ed	xl-storage: Do not stat bucket assuming the object exists (#11201 ) In HEAD/GET, only STAT the bucket if the object does not exist to return the correct error response.	2021-01-01 09:44:36 -08:00
Harshavardhana	3e1221a01c	fix: log once updating dataUsageCache versions (#11190 ) also reduce usage of *bytes.Buffer for reading `usage-cache.bin`	2020-12-31 09:45:09 -08:00
Ritesh H Shukla	36fc2f98ed	fix: admin trace throttled requests (#11192 )	2020-12-30 21:04:55 -08:00
Ritesh H Shukla	556524c715	Reduce logging when peer is offline (#11184 )	2020-12-30 14:38:54 -08:00
Harshavardhana	cc457f1798	fix: enhance logging in crawler use console.Debug instead of logger.Info (#11179 )	2020-12-29 01:57:28 -08:00
Harshavardhana	ca0d31b09a	fix: re-arrange handlers to handle requests on /minio (#11177 ) fixes #11175	2020-12-28 17:10:33 -08:00
Harshavardhana	445a9bd827	fix: heal optimizations in crawler to avoid multiple healing attempts (#11173 ) Fixes two problems - Double healing when bitrot is enabled, instead heal attempt once in applyActions() before lifecycle is applied. - If applyActions() is successful and getSize() returns proper value, then object is accounted for and should be removed from the oldCache namespace map to avoid double heal attempts.	2020-12-28 10:31:00 -08:00
Harshavardhana	d8d25a308f	fix: use HealObject for cleaning up dangling objects (#11171 ) main reason is that HealObjects starts a recursive listing for each object, this can be a really really long time on large namespaces instead avoid recursive listing just perform HealObject() instead at the prefix. delete's already handle purging dangling content, we don't need to achieve this by doing recursive listing, this in-turn can delay crawling significantly.	2020-12-27 15:42:20 -08:00
Harshavardhana	c19e6ce773	avoid a crash in crawler when lifecycle is not initialized (#11170 ) Bonus for static buffers use bytes.NewReader instead of bytes.NewBuffer, to use a more reader friendly implementation	2020-12-26 22:58:06 -08:00
Harshavardhana	59d3639396	fix: inherit heal opts globally, including bitrot settings (#11166 ) Bonus re-use ReadFileStream internal io.Copy buffers, fixes lots of chatty allocations when reading metacache readers with many sustained concurrent listing operations ``` 17.30GB 1.27% 84.80% 35.26GB 2.58% io.copyBuffer ```	2020-12-24 23:04:03 -08:00
Harshavardhana	027e17468a	fix: discarding results do not attempt in-memory metacache writer (#11163 ) Optimizations include - do not write the metacache block if the size of the block is '0' and it is the first block - where listing is attempted for a transient prefix, this helps to avoid creating lots of empty metacache entries for `minioMetaBucket` - avoid the entire initialization sequence of cacheCh , metacacheBlockWriter if we are simply going to skip them when discardResults is set to true. - No need to hold write locks while writing metacache blocks - each block is unique, per bucket, per prefix and also is written by a single node.	2020-12-24 15:02:02 -08:00
Harshavardhana	45ea161f8d	webUI: change listing to 1000 keys from browser UI (#11159 ) gateway implementations do not handle maxKeys being `-1` properly unlike MinIO implementation, handle it by setting an appropriate value. fixes #11158	2020-12-23 19:58:15 -08:00
Harshavardhana	6a66f142d4	fix: strict quorum in list should list on all drives (#11157 ) current implementation was incorrect, it in-fact assumed only read quorum number of disks. in-fact that value is only meant for read quorum good entries from all online disks. This PR fixes this behavior properly.	2020-12-23 09:26:40 -08:00
Harshavardhana	5982965839	fix: re-use bytes.Buffer using sync.Pool (#11156 )	2020-12-22 23:22:37 -08:00
Harshavardhana	8565cefe4e	fix: allow HTTP2.0 to be always configured	2020-12-22 16:32:58 -08:00
Andreas Auernhammer	8cdf2106b0	refactor cmd/crypto code for SSE handling and parsing (#11045 ) This commit refactors the code in `cmd/crypto` and separates SSE-S3, SSE-C and SSE-KMS. This commit should not cause any behavior change except for: - `IsRequested(http.Header)` which now returns the requested type {SSE-C, SSE-S3, SSE-KMS} and does not consider SSE-C copy headers. However, SSE-C copy headers alone are anyway not valid.	2020-12-22 09:19:32 -08:00
Harshavardhana	35fafb837b	fix: issues with handling delete markers in metacache (#11150 ) Additional cases handled - fix address situations where healing is not triggered on failed writes and deletes. - consider object exists during listing when metadata can be successfully decoded.	2020-12-22 09:16:43 -08:00
Harshavardhana	274bbad5cb	fix: select always online peers for remote listing (#11153 ) always find the right set of online peers for remote listing, this may have an effect on listing if the server is down - we should do this to avoid always performing transient operations on bucket->peerClient that is permanently or down for a long period.	2020-12-22 09:16:07 -08:00
Harshavardhana	5c451d1690	update x/net/http2 to address few bugs (#11144 ) additionally also configure http2 healthcheck values to quickly detect unstable connections and let them timeout. also use single transport for proxying requests	2020-12-21 21:42:38 -08:00
Poorna Krishnamoorthy	c987313431	Encrypt remote target if kms is configured (#11034 ) Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2020-12-21 16:21:33 -08:00
Anis Elleuch	2ecaab55a6	admin: ServerInfo returns info without object layer initialized (#11142 )	2020-12-21 09:35:19 -08:00
Harshavardhana	3e792ae2a2	fix: change defaults for DNS cache dialer (#11145 )	2020-12-21 09:33:29 -08:00
Harshavardhana	4cc500a041	normalize users with double // in accessKeys (#11143 ) Bonus fix, use constant time compare for secret keys in web-handlers.go:SetAuth()	2020-12-20 10:09:51 -08:00
Harshavardhana	d8e28830cf	fix: allow STS creds for admin accounts to add users (#11138 ) Allow rotating creds with privileges to add users fixes https://github.com/minio/console/issues/529	2020-12-19 13:24:21 -08:00
Harshavardhana	3e16ec457a	fix: support user/groups with '/' character (#11127 ) NOTE: user/groups with `//` shall be normalized to `/` fixes #11126	2020-12-19 09:36:37 -08:00
Harshavardhana	e5d378931d	fix: delimiter based listing was broken without marker (#11136 ) with missing nextMarker with delimiter based listing, top level prefixes beyond 4500 or max-keys value wouldn't be sent back for client to ask for the next batch. reproduced at a customer deployment, create prefixes as shown below ``` for year in $(seq 2017 2020) do for month in {01..12} do for day in {01..31} do mc -q cp file myminio/testbucket/dir/day_id=$year-$month-$day/; done done done ``` Then perform ``` aws s3api --profile minio --endpoint-url http://localhost:9000 list-objects \ --bucket testbucket --prefix dir/ --delimiter / --max-keys 1000 ``` You shall see missing NextMarker, this would disallow listing beyond max-keys requested and also disallow beyond 4500 (maxKeyObjectList) prefixes being listed because client wouldn't know the NextMarker available. This PR addresses this situation properly by making the implementation more spec compatible. i.e NextMarker in-fact can be either an object, a prefix with delimiter depending on the input operation. This issue was introduced after the list caching changes and has been present for a while.	2020-12-19 09:36:04 -08:00
Anis Elleuch	e63a10e505	Profiling does not required object layer to be initialized (#11133 )	2020-12-18 11:51:15 -08:00
Anis Elleuch	5434088c51	replication: Ensure to always use nano precision source modtime (#11135 )	2020-12-18 11:37:28 -08:00
Harshavardhana	a773cf48d8	fix: overlapping object and prefix rejected (#11130 ) fixes #11129	2020-12-18 08:51:09 -08:00
Harshavardhana	f714840da7	add _MINIO_SERVER_DEBUG env for enabling debug messages (#11128 )	2020-12-17 16:52:47 -08:00
Harshavardhana	7c9ef76f66	fix: timer deadlock on expired timers (#11124 ) issue was introduced in #11106 the following pattern <-t.C // timer fired if !t.Stop() { <-t.C // timer hangs } Seems to hang at the last `t.C` line, this issue happens because a fired timer cannot be Stopped() anymore and t.Stop() returns `false` leading to confusing state of usage. Refactor the code such that use timers appropriately with exact requirements in place.	2020-12-17 12:35:02 -08:00
Anis Elleuch	cffdb01279	azure/s3 gateways: Pass ETag during GET call to avoid data corruption (#11024 ) Both Azure & S3 gateways call for object information before returning the stream of the object, however, the object content/length could be modified meanwhile, which means it can return a corrupted object. Use ETag to ensure that the object was not modified during the GET call	2020-12-17 09:11:14 -08:00
Harshavardhana	b390a2a0b9	fix: reuser timers in erasure set hotpaths (#11106 ) reuser timers in - connectDisks() monitoring - healMRFRoutine() channel timeouts	2020-12-16 14:33:05 -08:00
Harshavardhana	90158f1e33	fix: avoid logging for Heal APIs in FS mode (#11121 ) fixes #11120	2020-12-16 09:46:13 -08:00
Harshavardhana	c606c76323	fix: prioritized latest buckets for crawler to finish the scans faster (#11115 ) crawler should only ListBuckets once not for each serverPool, buckets are same across all pools, across sets and ListBuckets always returns an unified view, once list buckets returns sort it by create time to scan the latest buckets earlier with the assumption that latest buckets would have lesser content than older buckets allowing them to be scanned faster and also to be able to provide more closer to latest view.	2020-12-15 17:34:54 -08:00
Klaus Post	e7d3b49a20	metacache: Make very small requests transient (#11109 )	2020-12-15 11:25:36 -08:00
Harshavardhana	5df61ab96b	fix: remove gorilla/rpc/ deps fully after our fork (#11108 )	2020-12-15 11:18:06 -08:00
Poorna Krishnamoorthy	3456b03b12	Ignore ObjectNotFound errors in delete api while enforcing locking (#11114 ) AWS does not report this or version not found as errors in the response.	2020-12-15 11:15:49 -08:00
Klaus Post	f6fb27e8f0	Don't copy interesting ids, clean up logging (#11102 ) When searching the caches don't copy the ids, instead inline the loop. ``` Benchmark_bucketMetacache_findCache-32 19200 63490 ns/op 8303 B/op 5 allocs/op Benchmark_bucketMetacache_findCache-32 20338 58609 ns/op 111 B/op 4 allocs/op ``` Add a reasonable, but still the simplistic benchmark. Bonus - make nicer zero alloc logging	2020-12-14 13:13:33 -08:00
Harshavardhana	8368ab76aa	fix: remove the requirement for healing buckets in ListBucketsHeal (#11098 ) With new refactor of bucket healing, healing bucket happens automatically including its metadata, there is no need to redundant heal buckets also in ListBucketsHeal remove it.	2020-12-14 12:07:07 -08:00
Harshavardhana	3e83643320	lifecycle improvements and additional debug logging (#11096 ) Bonus change fix browser assets	2020-12-13 12:05:54 -08:00
Harshavardhana	2eb52ca5f4	fix: heal bucket metadata right before healing bucket (#11097 ) optimization mainly to avoid listing the entire `.minio.sys/buckets/.minio.sys` directory, this can get really huge and comes in the way of startup routines, contents inside `.minio.sys/buckets/.minio.sys` are rather transient and not necessary to be healed.	2020-12-13 11:57:08 -08:00
Anis Elleuch	f164085227	xl: Always set root disk to true in test environment (#11094 ) Tests environments (go test or manual testing) should always consider the passed disks are root disks and should not rely on disk.IsRootDisk() function. The reason is that this latter can return a false negative when called in a busy system. However, returning a false negative will only occur in a testing environment and not in a production, so we can accept this trade-off for now.	2020-12-12 16:10:07 -08:00
Harshavardhana	48191dd748	return NoSuchVersion if invalid version-id is specified (#11091 )	2020-12-11 20:44:08 -08:00
Anis Elleuch	c4f29d24da	metacache: Ask all disks when drive count is 4 (#11087 )	2020-12-11 17:54:31 -08:00
Harshavardhana	db7890660e	fix: a crash when disk is nil, safe access on erasureDisks (#11089 ) fixes #11088	2020-12-11 16:58:36 -08:00
Poorna Krishnamoorthy	9adc33efbb	Return version-id header in DeleteObject response (#11090 ) even when the object version is non-existent To make this consistent with aws behavior. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2020-12-11 16:58:15 -08:00
Poorna Krishnamoorthy	8f65aba04b	ignore NoSuchVersion error in DeleteObjects API (#11086 ) Currently, the error response reports NoSuchVersion for a non-existent version-id, whereas AWS ignores it.	2020-12-11 12:39:09 -08:00
Harshavardhana	3a0082f0f1	fix: TTFB prometheus metrics calculation (#11082 ) until now metrics was reporting entire call duration instead of ttfb's this PR fixes it	2020-12-10 23:02:25 -08:00
Klaus Post	4bca62a0bd	crawler: Stream bucket usage cache data (#11068 ) Stream bucket caches to storage and through RPC calls.	2020-12-10 13:03:22 -08:00
Klaus Post	82e2be4239	metacache: Speed up cleanup operation (#11078 ) Perform cleanup operations on copied data. Avoids read locking data while determining which caches to keep. Also, reduce the log(NN) operation to log(NM) where M caches with the same root or below when checking potential replacements.	2020-12-10 12:30:28 -08:00
Harshavardhana	4550ac6fff	fix: refactor locks to apply them uniquely per node (#11052 ) This refactor is done for few reasons below - to avoid deadlocks in scenarios when number of nodes are smaller < actual erasure stripe count where in N participating local lockers can lead to deadlocks across systems. - avoids expiry routines to run 1000 of separate network operations and routes per disk where as each of them are still accessing one single local entity. - it is ideal to have since globalLockServer per instance. - In a 32node deployment however, each server group is still concentrated towards the same set of lockers that partipicate during the write/read phase, unlike previous minio/dsync implementation - this potentially avoids send 32 requests instead we will still send at max requests of unique nodes participating in a write/read phase. - reduces overall chattiness on smaller setups.	2020-12-10 07:28:37 -08:00
Klaus Post	e65ed2e44f	listcache: Add path index (#11063 ) Add a root path index. ``` Before: Benchmark_bucketMetacache_findCache-32 10000 730737 ns/op With excluded prints: Benchmark_bucketMetacache_findCache-32 10000 207100 ns/op With the root path: Benchmark_bucketMetacache_findCache-32 705765 1943 ns/op ``` Benchmark used (not linear): ```Go func Benchmark_bucketMetacache_findCache(b *testing.B) { bm := newBucketMetacache("", false) for i := 0; i < b.N; i++ { bm.findCache(listPathOptions{ ID: mustGetUUID(), Bucket: "", BaseDir: "prefix/" + mustGetUUID(), Prefix: "", FilterPrefix: "", Marker: "", Limit: 0, AskDisks: 0, Recursive: false, Separator: slashSeparator, Create: true, CurrentCycle: 0, OldestCycle: 0, }) } } ``` Replaces #11058	2020-12-09 08:37:43 -08:00
Anis Elleuch	d90044b847	federation: Redirect Lifecycle PUT request by bucket name (#11062 ) The bucket forwarder handler considers MakeBucket to be always local but it mistakenly thinks that PUT bucket lifecycle to be a MakeBucket call. Fix the check of the MakeBucket call by ensuring that the query is empty in the PUT url.	2020-12-09 07:25:26 -08:00
Harshavardhana	d8c1f93de6	reject mixed drive situations with drives on root disks (#11057 ) till now we used to match the inode number of the root drive and the drive path minio would use, if they match we knew that its a root disk. this may not be true in all situations such as running inside a container environment where the container might be mounted from a different partition altogether, root disk detection might fail.	2020-12-09 00:27:02 -08:00
Anis Elleuch	a51488cbaa	s3: Fix reading GET with partNumber specified (#11032 ) partNumber was miscalculting the start and end of parts when partNumber query is specified in the GET request. This commit fixes it and also fixes the ContentRange header in that case.	2020-12-08 13:12:42 -08:00
Harshavardhana	dc819afa44	fix: auto update crawler meta version PR `038bcd9079` introduced version '3', we need to make sure that we do not print an unexpected error instead log a message to indicate we will auto update the version.	2020-12-08 10:40:51 -08:00
Harshavardhana	4a564336fe	Revert "Add metrics for nodes online and offline (#11050 )" This reverts commit `f60bbdf86b`.	2020-12-08 09:23:35 -08:00
Ritesh H Shukla	f60bbdf86b	Add metrics for nodes online and offline (#11050 )	2020-12-08 01:06:27 -08:00
Poorna Krishnamoorthy	f3beb1236a	Add cache usage, total capacity to prometheus metrics (#11026 )	2020-12-07 16:35:11 -08:00
Poorna Krishnamoorthy	934bed47fa	Add transition event notification (#11047 ) This is a MinIO specific extension to allow monitoring of transition events.	2020-12-07 13:53:28 -08:00
Ritesh H Shukla	038bcd9079	Add replication capacity metrics support in crawler (#10786 )	2020-12-07 13:47:48 -08:00
Harshavardhana	ce93b2681b	fix: re-use er.getDisks() properly in certain calls (#11043 )	2020-12-07 10:04:07 -08:00
Harshavardhana	8d036ed6d8	fix: allow sub-admin to modify password for other users (#11039 ) fixes #11037	2020-12-06 20:36:34 -08:00
Harshavardhana	9c53cc1b83	fix: heal multiple buckets in bulk (#11029 ) makes server startup, orders of magnitude faster with large number of buckets	2020-12-05 13:00:44 -08:00
Harshavardhana	3514e89eb3	support envs as well for new crawler sub-system (#11033 )	2020-12-04 21:54:24 -08:00
Klaus Post	a896125490	Add crawler delay config + dynamic config values (#11018 )	2020-12-04 09:32:35 -08:00
Harshavardhana	e083471ec4	use argon2 with sync.Pool for better memory management (#11019 )	2020-12-03 19:23:19 -08:00
Harshavardhana	80d31113e5	fix: etcd import paths again depend on v3.4.14 release (#11020 ) Due to botched upstream renames of project repositories and incomplete migration to go.mod support, our current dependency version of `go.mod` had bugs i.e it was using commits from master branch which didn't have the required fixes present in release-3.4 branches which leads to some rare bugs https://github.com/etcd-io/etcd/pull/11477 provides a workaround for now and we should migrate to this. release-3.5 eventually claims to fix all of this properly until then we cannot use /v3 import right now	2020-12-03 11:35:18 -08:00
Ritesh H Shukla	7e2b79984e	Stream bucket bandwidth measurements (#11014 )	2020-12-03 11:34:42 -08:00
Harshavardhana	951b6b203b	skip metacache entries healing to speed up startup	2020-12-02 21:30:54 -08:00
Harshavardhana	44e23b7f4f	fix: startup being slow - wait only if IOCount > 0	2020-12-02 21:06:17 -08:00
Harshavardhana	96c0ce1f0c	add support for tuning healing to make healing more aggressive (#11003 ) supports `mc admin config set <alias> heal sleep=100ms` to enable more aggressive healing under certain times. also optimize some areas that were doing extra checks than necessary when bitrotscan was enabled, avoid double sleeps make healing more predictable. fixes #10497	2020-12-02 11:12:00 -08:00
ebozduman	303be1866d	Adds "x-amz-usr-agent" and "x-id" params to be used in authentication of presignedURL (#10792 )	2020-12-02 02:02:49 -08:00
Harshavardhana	4ec45753e6	rename server sets to server pools	2020-12-01 13:50:33 -08:00
Klaus Post	e6ea5c2703	crawler: Missing folder heal check per set (#10876 )	2020-12-01 12:07:39 -08:00
Harshavardhana	790833f3b2	Revert "Support variable server sets (#10314 )" This reverts commit `aabf053d2f`.	2020-12-01 12:02:29 -08:00
Harshavardhana	7cbca43eb1	fix: allow admins to create users (#11005 ) PR #10978 introduced a regression, root credential should be allowed to create users	2020-11-30 21:53:23 -08:00
Poorna Krishnamoorthy	2f564437ae	Disallow writeback caching with cache_after (#11002 ) fixes #10974	2020-11-30 20:53:27 -08:00
Harshavardhana	bdd094bc39	fix: avoid sending errors on missing objects on locked buckets (#10994 ) make sure multi-object delete returned errors that are AWS S3 compatible	2020-11-28 21:15:45 -08:00
Harshavardhana	e6fa410778	fix: allow accountInfo, addUser and getUserInfo implicit (#10978 ) - accountInfo API that returns information about user, access to buckets and the size per bucket - addUser - user is allowed to change their secretKey - getUserInfo - returns user info if the incoming is the same user requesting their information	2020-11-27 17:23:57 -08:00
Harshavardhana	aabf053d2f	Support variable server sets (#10314 )	2020-11-25 16:28:47 -08:00
Anis Elleuch	91130e884b	Avoid sending errors in gob in storage requests (#10977 )	2020-11-25 12:42:48 -08:00
Poorna Krishnamoorthy	2ff655a745	Refactor replication, ILM handling in DELETE API (#10945 )	2020-11-25 11:24:50 -08:00
Klaus Post	0422eda6a2	metacache: Always close block writer (#10973 ) In some cases a writer could be left behind unclosed, leaking compression blocks. Always close and set compression concurrency to 2 which should be fine to keep up.	2020-11-25 09:37:30 -08:00
Harshavardhana	31e6f60847	fix: improve error handling in metacache (#10965 )	2020-11-25 01:11:22 -08:00
Poorna Krishnamoorthy	3ad41fe89d	Add admin API to edit remote bucket target credentials (#10848 )	2020-11-24 19:09:05 -08:00
Klaus Post	a75fafdbe2	Remove msgp workaround (#10964 ) The error in `github.com/philhofer/fwd` was quickly fixed through https://github.com/philhofer/fwd/pull/22 - update the dependency and remove the workaround.	2020-11-24 11:58:10 -08:00
Klaus Post	a58b7874ef	Temporary workaround for msgp skipping (#10960 ) Due to https://github.com/philhofer/fwd/issues/20 when skipping a metadata entry that is >2048 bytes and the buffer is full (2048 bytes) the skip will fail with `io.ErrNoProgress`. Enlarge the buffer so we temporarily make this much more unlikely. If it still happens we will have to rewrite the skips to reads. Fixes #10959	2020-11-23 18:51:59 -08:00
Harshavardhana	6990de9c94	fix: dangling object delete shall return object doesn't exist (#10961 ) dangling object when deleted means object doesn't exist anymore, so we should return appropriate errors, this allows crawler heal to ensure that it removes the tracker for dangling objects.	2020-11-23 18:50:53 -08:00
Anis Elleuch	75a8e81f8f	azure: Specify different Azure storage in the shell env (#10943 ) AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY are used in azure CLI to specify the azure blob storage access & secret keys. With this commit, it is possible to set them if you want the gateway's own credentials to be different from the Azure blob credentials. Co-authored-by: Harshavardhana <harsha@minio.io>	2020-11-23 16:45:56 -08:00
Harshavardhana	519c0077a9	fix: do not return an error for successfully deleted dangling objects (#10938 ) dangling objects when removed `mc admin heal -r` or crawler auto heal would incorrectly return error - this can interfere with usage calculation as the entry size for this would be returned as `0`, instead upon success use the resultant object size to calculate the final size for the object and avoid reporting this in the log messages Also do not set ObjectSize in healResultItem to be '-1' this has an effect on crawler metrics calculating 1 byte less for objects which seem to be missing their `xl.meta`	2020-11-23 09:12:17 -08:00
Harshavardhana	734d07a532	fix: all hosts local and port same should be local erasure setup (#10951 ) this is needed to avoid initializing notification peers that can lead to races in many sub-systems fixes #10950	2020-11-23 09:07:50 -08:00
Harshavardhana	df93102235	fix: unwrapping issues with os.Is* functions (#10949 ) reduces 3 stat calls, reducing the overall startup time significantly.	2020-11-23 08:36:49 -08:00
Poorna Krishnamoorthy	39f3d5493b	Show Delete replication status header (#10946 ) X-Minio-Replication-Delete-Status header shows the status of the replication of a permanent delete of a version. All GETs are disallowed and return 405 on this object version. In the case of replicating delete markers. X-Minio-Replication-DeleteMarker-Status shows the status of replication, and would similarly return 405. Additionally, this PR adds reporting of delete marker event completion and updates documentation	2020-11-21 23:48:50 -08:00
Klaus Post	692ff41ef7	Unwrap network errors (#10934 ) Alternative to #10927 Instead of having an upstream fix, do unwrap when checking network errors. 'As' will also work when destination is an interface as checked by the tests.	2020-11-20 22:55:35 -08:00
Harshavardhana	86409fa93d	add audit/admin trace support for browser requests (#10947 ) To support this functionality we had to fork the gorilla/rpc package with relevant changes	2020-11-20 22:52:17 -08:00
Shireesh Anjal	7bc47a14cc	Rename OBD to Health (#10842 ) Also, Remove thread stats and openfds from the health report as we already have process stats and numfds	2020-11-20 12:52:53 -08:00
Harshavardhana	73e308079a	fix: handle errors appropriately as they are wrapped (#10917 )	2020-11-20 10:43:07 -08:00
Poorna Krishnamoorthy	08b24620c0	Display storage-class of transitioned object in HEAD	2020-11-20 09:17:31 -08:00
Harshavardhana	95675b0c9a	fix: do not crash PutObjectTags when node is down (#10940 ) fixes #10939	2020-11-20 09:10:48 -08:00
Poorna Krishnamoorthy	251c1ef6da	Add support for replication of object tags, retention metadata (#10880 )	2020-11-19 18:56:09 -08:00
Poorna Krishnamoorthy	0fa430c1da	validate service type of target in replication/ilm transition config (#10928 )	2020-11-19 18:47:33 -08:00
Poorna Krishnamoorthy	f60b6eb82e	fix validation for deletemarker replication on object locked bucket (#10892 )	2020-11-19 18:47:19 -08:00
Poorna Krishnamoorthy	1ebf6f146a	Add support for ILM transition (#10565 ) This PR adds transition support for ILM to transition data to another MinIO target represented by a storage class ARN. Subsequent GET or HEAD for that object will be streamed from the transition tier. If PostRestoreObject API is invoked, the transitioned object can be restored for duration specified to the source cluster.	2020-11-19 18:47:17 -08:00
Harshavardhana	8f7fe0405e	fix: delete marker replication should support directories (#10878 ) allow directories to be replicated as well, along with their delete markers in replication. Bonus fix to fix bloom filter updates for directories to be preserved.	2020-11-19 18:47:12 -08:00
Harshavardhana	9a34fd5c4a	Revert "Revert "Add delete marker replication support (#10396 )"" This reverts commit `267d7bf0a9`.	2020-11-19 18:43:58 -08:00
Harshavardhana	f794fe79e3	fix: network shutdown was not handle properly (#10927 ) fixes a regression introduced in #10859, due to the error returned by rest.Client being typed i.e *rest.NetworkError - IsNetworkHostDown function didn't work as expected to detect network issues. This in-turn aggravated the situations when nodes are disconnected leading to performance loss.	2020-11-19 13:53:49 -08:00
Harshavardhana	0f9e125cf3	fix: check for gateway backend online without http request (#10924 ) fixes #10921	2020-11-19 10:38:02 -08:00
Harshavardhana	d778d9493f	remove MinIO release tag as part of HTTP Server string (#10929 )	2020-11-19 09:16:02 -08:00
Harshavardhana	70d2c2ccc9	skip files that are not erasure objects or directories (#10926 ) without this change WalkDir reports errors while trying to read `format.json/xl.meta` which is a replicated file	2020-11-19 09:15:09 -08:00
Harshavardhana	9dea7020f0	allow prefix filtering for WalkDir to be optional (#10923 )	2020-11-18 12:03:16 -08:00
Klaus Post	990d074f7d	metacache: Allow prefix filtering (#10920 ) Do listings with prefix filter when bloom filter is dirty. This will forward the prefix filter to the lister which will make it only scan the folders/objects with the specified prefix. If we have a clean bloom filter we try to build a more generally useful cache so in that case, we will list all objects/folders.	2020-11-18 10:44:18 -08:00
Klaus Post	e413f05397	Save listing error async (#10922 ) Since the RPC call may have to time out save an error state async to not hold up the listing returning. Fixes #10919	2020-11-18 10:28:22 -08:00
Harshavardhana	d1b1fee080	fix: save healing tracker right before healing (#10915 ) this change avoids a situation where accidentally if the user deleted the healing tracker or drives were replaced again within the 10sec window.	2020-11-18 09:34:46 -08:00
Harshavardhana	9738d605e4	increase readdir per block memory to facilitate faster WalkDir (#10908 )	2020-11-18 09:21:02 -08:00
Klaus Post	10099357b6	listcache: Wrap returned errors (#10882 ) To give an indication of where they happen	2020-11-17 09:11:59 -08:00
Harshavardhana	80b8ce89a4	remove context deadline from Delete calls (#10901 )	2020-11-17 09:09:45 -08:00
Poorna Krishnamoorthy	0b766288ef	fix: send replication completed event notification (#10902 )	2020-11-15 22:16:41 -08:00
Rafael Bodill	598ca0569c	fix: global in-place update boolean check (#10900 )	2020-11-15 13:34:12 -08:00
Poorna Krishnamoorthy	d295ce5708	Fix disk cache usage percent for prometheus (#10898 ) Fixes: #10895 Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2020-11-14 19:18:00 -08:00
Klaus Post	b5a3d79bce	listobjectversions: Add shortcut for Veeam blocks (#10893 ) Add shortcut for `APN/1.0 Veeam/1.0 Backup/10.0` It requests unique blocks with a specific prefix. We skip scanning the parent directory for more objects matching the prefix.	2020-11-13 16:58:20 -08:00
Harshavardhana	17a5ff51ff	fix: move context timeout closer to network for Delete calls (#10897 ) allowing for disconnects to be limited to the drive themselves instead of disconnecting all drives.	2020-11-13 16:56:45 -08:00
Harshavardhana	0bcb1b679d	fix: disallow update if dates are same (#10890 ) fixes #10889	2020-11-12 14:18:59 -08:00
Klaus Post	a3017c724e	Sort directory objects correctly (#10886 ) Decode dir objects when listing and sort them correctly.	2020-11-12 13:09:34 -08:00
Harshavardhana	267d7bf0a9	Revert "Add delete marker replication support (#10396 )" This reverts commit `50c10a5087`. PR is moved to origin/dev branch	2020-11-12 11:43:14 -08:00
cksac	be83dfc52a	fix: HDFS list bucket when subpath is provided (#10884 )	2020-11-12 11:26:51 -08:00
Harshavardhana	ca88ca753c	ignore typed errors correctly in list cache layer (#10879 ) bonus write bucket metadata cache with enough quorum Possible fix for #10868	2020-11-12 09:28:56 -08:00
Klaus Post	f86d3538f6	Allow deeper sleep (#10883 ) Allow each crawler operation to sleep up to 10 seconds on very heavily loaded systems. This will of course make minimum crawler speed less, but should be more effective at stopping.	2020-11-12 09:17:56 -08:00
Klaus Post	1c3590078d	Skip 0 byte stream writes (#10875 ) Don't send a packet when receiving 0 bytes or there is an error recorded	2020-11-11 18:07:40 -08:00
Harshavardhana	aa158228f9	fix: simplify healing metadata objects per set (#10867 )	2020-11-11 10:58:16 -08:00
Klaus Post	8747834c69	DeletedObjects: Return objects on lock failure (#10874 ) Return objects when locking fails. <details> <summary>Panic</summary> ``` : 2020/11/10 04:15:55 http: panic serving 10.10.62.153:44858: runtime error: index out of range [0] with length 0 : goroutine 363537270 [running]: : net/http.(conn).serve.func1(0xc019232780) : net/http/server.go:1801 +0x147 : panic(0x1cadd60, 0xc001719260) : runtime/panic.go:975 +0x47a : github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP.func1(0xc0121d1200, 0x210cda0, 0xc0141940e0) : github.com/minio/minio/cmd/generic-handlers.go:781 +0x1a8 : panic(0x1cadd60, 0xc001719260) : runtime/panic.go:969 +0x1b9 : github.com/minio/minio/cmd.objectAPIHandlers.DeleteMultipleObjectsHandler(0x1e71ce8, 0x1e71cc8, 0x2108420, 0xc0192328c0, 0xc0121d1400) : github.com/minio/minio/cmd/bucket-handlers.go:465 +0x2490 : net/http.HandlerFunc.ServeHTTP(...) : net/http/server.go:2042 : github.com/minio/minio/cmd.httpTraceAll.func1(0x2108420, 0xc0192328c0, 0xc0121d1400) : github.com/minio/minio/cmd/handler-utils.go:353 +0x158 : net/http.HandlerFunc.ServeHTTP(...) : net/http/server.go:2042 : github.com/minio/minio/cmd.collectAPIStats.func1(0x2108420, 0xc019232820, 0xc0121d1400) : github.com/minio/minio/cmd/handler-utils.go:380 +0xed : net/http.HandlerFunc.ServeHTTP(...) : net/http/server.go:2042 : github.com/minio/minio/cmd.maxClients.func1(0x2108420, 0xc019232820, 0xc0121d1400) : github.com/minio/minio/cmd/handler-api.go:132 +0x33b : net/http.HandlerFunc.ServeHTTP(0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400) : net/http/server.go:2042 +0x44 : github.com/minio/minio/cmd.redirectHandler.ServeHTTP(0x20e2180, 0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:192 +0x156 : github.com/minio/minio/cmd.customHeaderHandler.ServeHTTP(0x20e1060, 0xc0141a22b0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:751 +0x162 : github.com/minio/minio/cmd.securityHeaderHandler.ServeHTTP(0x20e0fc0, 0xc0141a22c0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:766 +0x1d6 : github.com/minio/minio/cmd.bucketForwardingHandler.ServeHTTP(0xc0121c7a40, 0x20e1120, 0xc0141a22d0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:624 +0xbf : github.com/minio/minio/cmd.requestValidityHandler.ServeHTTP(0x20e0f20, 0xc01814d280, 0x21083e0, 0xc01814d2e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:608 +0x42a : github.com/minio/minio/cmd.httpStatsHandler.ServeHTTP(0x20e10c0, 0xc0141a2300, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:536 +0xe4 : github.com/minio/minio/cmd.requestSizeLimitHandler.ServeHTTP(0x20e0fe0, 0xc0141a2310, 0x50004000000, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:68 +0xd4 : github.com/minio/minio/cmd.requestHeaderSizeLimitHandler.ServeHTTP(0x20e10a0, 0xc01814d2a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:93 +0x1b7 : github.com/minio/minio/cmd.crossDomainPolicy.ServeHTTP(0x20e1080, 0xc0141a2320, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/crossdomain-xml-handler.go:51 +0x82 : github.com/minio/minio/cmd.browserRedirectHandler.ServeHTTP(0x20e0fa0, 0xc0141a2330, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:276 +0x68 : github.com/minio/minio/cmd.minioReservedBucketHandler.ServeHTTP(0x20e0f00, 0xc0141a2340, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:344 +0xb8 : github.com/minio/minio/cmd.cacheControlHandler.ServeHTTP(0x20e1020, 0xc0141a2350, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:303 +0x1ce : github.com/minio/minio/cmd.timeValidityHandler.ServeHTTP(0x20e0f40, 0xc0141a2360, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:414 +0x3ca : github.com/minio/minio/cmd.resourceHandler.ServeHTTP(0x20e1160, 0xc0141a2370, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:516 +0xab : github.com/minio/minio/cmd.authHandler.ServeHTTP(0x20e1100, 0xc0141a2380, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/auth-handler.go:502 +0x2e7 : github.com/minio/minio/cmd.sseTLSHandler.ServeHTTP(0x20e0ee0, 0xc0141a2390, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:802 +0x79 : github.com/minio/minio/cmd.reservedMetadataHandler.ServeHTTP(0x20e1140, 0xc0141a23a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400) : github.com/minio/minio/cmd/generic-handlers.go:139 +0x1b7 : github.com/gorilla/mux.(Router).ServeHTTP(0xc00073fb00, 0x210cda0, 0xc0141940e0, 0xc0121d1200) : github.com/gorilla/mux@v1.8.0/mux.go:210 +0xd3 : github.com/rs/cors.(Cors).Handler.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200) : github.com/rs/cors@v1.7.0/cors.go:219 +0x1b9 : net/http.HandlerFunc.ServeHTTP(0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200) : net/http/server.go:2042 +0x44 : github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP(0x20e2180, 0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200) : github.com/minio/minio/cmd/generic-handlers.go:784 +0x85 : github.com/minio/minio/cmd/http.(Server).Start.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200) : github.com/minio/minio/cmd/http/server.go:101 +0x258 : net/http.HandlerFunc.ServeHTTP(0xc000dc4080, 0x210cda0, 0xc0141940e0, 0xc0121d1200) : net/http/server.go:2042 +0x44 : net/http.serverHandler.ServeHTTP(0xc000764c60, 0x210cda0, 0xc0141940e0, 0xc0121d1200) : net/http/server.go:2843 +0xa3 : net/http.(conn).serve(0xc019232780, 0x2114720, 0xc03381f6c0) : net/http/server.go:1925 +0x8ad : created by net/http.(Server).Serve : net/http/server.go:2969 +0x36c ``` </details>	2020-11-11 09:14:32 -08:00
Poorna Krishnamoorthy	50c10a5087	Add delete marker replication support (#10396 ) Delete marker replication is implemented for V2 configuration specified in AWS spec (though AWS allows it only in the V1 configuration). This PR also brings in a MinIO only extension of replicating permanent deletes, i.e. deletes specifying version id are replicated to target cluster.	2020-11-10 15:24:14 -08:00
Steven Reitsma	4683a623dc	fix: negative STS IAM token TTL value (#10866 )	2020-11-10 12:24:01 -08:00
Klaus Post	06899210a7	Reduce health check output (#10859 ) This will make the health check clients 'silent'. Use `IsNetworkOrHostDown` determine if network is ok so it mimics the functionality in the actual client.	2020-11-10 09:28:23 -08:00
Harshavardhana	cbdab62c1e	fix: heal user/metadata right away upon server startup (#10863 ) this is needed such that we make sure to heal the users, policies and bucket metadata right away as we do listing based on list cache which only lists '3' sufficiently good drives, to avoid possibly losing access to these users upon upgrade make sure to heal them.	2020-11-10 09:02:06 -08:00
Harshavardhana	8df6112204	fix: avoid divide by zero error single node distributed setup (#10862 )	2020-11-09 20:40:39 -08:00
Harshavardhana	97692bc772	re-route requests if IAM is not initialized (#10850 )	2020-11-07 21:03:06 -08:00
Steven Reitsma	54120107ce	fix: infinite loop in cleanupStaleUploads of encrypted MPUs (#10845 ) fixes #10588	2020-11-06 11:53:42 -08:00
Klaus Post	9bf5990ea9	metadata: Invalidate cache if unreadable and not updating (#10844 ) If a scanning server shuts down unexpectedly we may have "successful" caches that are incomplete on a set. In this case mark the cache with an error so it will no longer be handed out.	2020-11-06 08:54:09 -08:00
Steven Reitsma	74f7cf24ae	fix: s3 gateway SSE pagination (#10840 ) Fixes #10838	2020-11-05 15:04:03 -08:00
Harshavardhana	fb28aa847b	fix: add missing deleted key element in multiObjectDelete (#10839 ) fixes #10832	2020-11-05 12:47:46 -08:00
Klaus Post	0724205f35	metacache: Add option for life extension (#10837 ) Add `MINIO_API_EXTEND_LIST_CACHE_LIFE` that will extend the life of generated caches for a while. This changes caches to remain valid until no updates have been received for the specified time plus a fixed margin. This also changes the caches from being invalidated when the first set finishes until the last set has finished plus the specified time has passed.	2020-11-05 11:49:56 -08:00
Harshavardhana	b72cac4cf3	fix: dangling objects on actual namespace (#10822 )	2020-11-05 11:48:55 -08:00
Klaus Post	bd77f29fc4	Don't replace caches that are receiving updates (#10834 ) Keep caches while they are receiving updates. Move update code to separate function.	2020-11-05 07:34:08 -08:00
Klaus Post	d1e1205036	metacache: Always close the s2 writer (#10836 ) The s2 writer could be leaked if there was an error. Make sure it is always closed.	2020-11-05 07:30:14 -08:00
Harshavardhana	71753e21e0	add missing TTL for STS credentials on etcd (#10828 )	2020-11-04 13:06:05 -08:00
Harshavardhana	fde3299bf3	re-use optimized readdir for isDirEmpty() (#10829 ) reduces effective memory usage by an order of magnitude, also increases performance for small objects	2020-11-04 13:05:21 -08:00
Harshavardhana	1a1f00fa15	fix: use internode data for DisksInfo, VolsInfo in message pack (#10821 ) Similar to #10775 for fewer memory allocations, since we use getOnlineDisks() extensively for listing we should optimize it further. Additionally, remove all unused walkers from the storage layer	2020-11-04 10:10:54 -08:00
Bill Thorp	4a1efabda4	Context based AccessKey passing (#10615 ) A new field called AccessKey is added to the ReqInfo struct and populated. Because ReqInfo is added to the context, this allows the AccessKey to be accessed from 3rd-party code, such as a custom ObjectLayer. Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Kaloyan Raev <kaloyan@storj.io>	2020-11-04 09:13:34 -08:00
Klaus Post	3b88a646ec	Add remote online/offline information (#10825 ) Log information about remote clients being marked offline. This will help to identify root causes of failures.	2020-11-04 08:27:32 -08:00
Klaus Post	2294e53a0b	Don't retain context in locker (#10515 ) Use the context for internal timeouts, but disconnect it from outgoing calls so we always receive the results and cancel it remotely.	2020-11-04 08:25:42 -08:00
Klaus Post	f0819cce75	Keep transient lists while they are updating (#10826 ) On extremely long running listings keep the transient list 15 minutes after last update instead of using start time. Also don't do overlap checks on transient lists.	2020-11-04 08:01:33 -08:00
Klaus Post	1e11b4629f	Add remote Diskinfo caching (#10824 ) Add 1 second remote disk info cache. Should decrease need for remote calls a great deal due to how actively it is used now.	2020-11-04 08:00:18 -08:00
Harshavardhana	5c72a34fa8	fix: honor delimiter as per AWS S3 spec (#10823 )	2020-11-04 07:56:58 -08:00
Klaus Post	b9277c8030	metacache: Add trashcan (#10820 ) Add trashcan that keeps recently updated lists after bucket deletion. All caches were deleted once a bucket was deleted, so caches still running would report errors. Now they are canceled. Fix `.minio.sys` not being transient.	2020-11-03 12:47:52 -08:00
Harshavardhana	8c76e1353e	initialize IAM after etcd has initialized (#10819 )	2020-11-03 12:12:30 -08:00
Harshavardhana	ad382799b1	use list cache for Walk() with webUI and quota (#10814 ) bring list cache optimizations for web UI object listing, also FIFO quota enforcement through list cache as well.	2020-11-03 08:53:48 -08:00
Harshavardhana	68de5a6f6a	fix: IAM store fallback to list users and policies from disk (#10787 ) Bonus fixes, remove package retry it is harder to get it right, also manage context remove it such that we don't have to rely on it anymore instead use a simple Jitter retry.	2020-11-02 17:52:13 -08:00
Harshavardhana	4ea31da889	fix: move list quorum ENV to config (#10804 )	2020-11-02 17:21:56 -08:00
Klaus Post	0a796505c1	metacache: Check only one disk for updates (#10809 ) Check only one disk for updates. This will reduce IO while waiting for lists to finish.	2020-11-02 17:20:27 -08:00
Klaus Post	37749f4623	Optimize FileInfo(Version) transfer (#10775 ) File Info decoding, in particular, is showing up as a major allocator and time consumer for internode data transfers Switch to message pack for cross-server transfers: ``` MSGP: Size: 945 bytes BenchmarkEncodeFileInfoMsgp-32 1558444 866 ns/op 1.16 MB/s 0 B/op 0 allocs/op BenchmarkDecodeFileInfoMsgp-32 479968 2487 ns/op 0.40 MB/s 848 B/op 18 allocs/op GOB: Size: 1409 bytes BenchmarkEncodeFileInfoGOB-32 333339 3237 ns/op 0.31 MB/s 576 B/op 19 allocs/op BenchmarkDecodeFileInfoGOB-32 20869 57837 ns/op 0.02 MB/s 16439 B/op 428 allocs/op ```	2020-11-02 17:07:52 -08:00
Klaus Post	86e0d272f3	Reduce WriteAll allocs (#10810 ) WriteAll saw 127GB allocs in a 5 minute timeframe for 4MiB buffers used by `io.CopyBuffer` even if they are pooled. Since all writers appear to write byte buffers, just send those instead and write directly. The files are opened through the `os` package so they have no special properties anyway. This removes the alloc and copy for each operation. REST sends content length so a precise alloc can be made.	2020-11-02 16:14:31 -08:00
Harshavardhana	8527f22df1	optimize request URL encoding for internode (#10811 ) this reduces allocations in order of magnitude Also, revert "erasure: delete dangling objects automatically (#10765)" affects list caching should be investigated.	2020-11-02 15:15:12 -08:00
Anis Elleuch	b456292295	erasure: delete dangling objects automatically (#10765 )	2020-11-02 10:49:30 -08:00
Poorna Krishnamoorthy	03fdbc3ec2	Add async caching commit option in diskcache (#10742 ) Add store and a forward option for a single part uploads when an async mode is enabled with env MINIO_CACHE_COMMIT=writeback It defaults to `writethrough` if unspecified.	2020-11-02 10:00:45 -08:00
Harshavardhana	4c773f7068	re-use remote transports in Peer,Storage,Locker clients (#10788 ) use one transport for internode communication	2020-11-02 07:43:11 -08:00
Harshavardhana	5412d730c1	simplify monitoring doesn't need to be canceled (#10803 ) connect disks monitoring doesn't need to be canceled upon drive replacement, since we only need to replace the newly replaced drive.	2020-10-31 14:10:12 -07:00
Klaus Post	fe9f23e632	Recreate bucket metacache if corrupted (#10800 ) If bucket metadata cannot be read, clean up existing and create a new.	2020-10-31 10:26:16 -07:00
Klaus Post	422898d9b3	Clean up metadata cache when deleting bucket (#10802 ) Metadata caches were left behind when deleting a bucket.	2020-10-31 09:46:18 -07:00
Harshavardhana	b686bb9c83	fix: replaced drive properly by healing the entire drive (#10799 ) Bonus fixes, we do not need reload format anymore as the replaced drive is healed locally we only need to ensure that drive heal reloads the drive properly. We preserve the UUID of the original order, this means that the replacement in `format.json` doesn't mean that the drive needs to be reloaded into memory anymore. fixes #10791	2020-10-31 01:34:48 -07:00
Harshavardhana	5e5cdc581d	remove unnecessary logging and move to log once (#10798 ) the current master logs way too much when a node is down, instead log once and move on.	2020-10-30 14:55:50 -07:00
Harshavardhana	02cfa774be	allow requests to be proxied when server is booting up (#10790 ) when server is booting up there is a possibility that users might see '503' because object layer when not initialized, then the request is proxied to neighboring peers first one which is online.	2020-10-30 12:20:28 -07:00
Krishna Srinivas	3a2f89b3c0	fix: add support for O_DIRECT reads for erasure backends (#10718 )	2020-10-30 11:04:29 -07:00
Klaus Post	6135f072d2	Fix invalidated metacaches (#10784 ) * Fix caches having EOF marked as a failure. * Simplify cache updates. * Provide context for checkMetacacheState failures. * Log 499 when the client disconnects.	2020-10-30 09:33:16 -07:00
Klaus Post	e63a44b734	rest client: Expect context timeouts for locks (#10782 ) Add option for rest clients to not mark a remote offline for context timeouts. This can be used if context timeouts are expected on the call.	2020-10-29 09:52:11 -07:00
Klaus Post	6b14c4ab1e	Optimize decryptObjectInfo (#10726 ) `decryptObjectInfo` is a significant bottleneck when listing objects. Reduce the allocations for a significant speedup. https://github.com/minio/sio/pull/40 ``` λ benchcmp before.txt after.txt benchmark old ns/op new ns/op delta Benchmark_decryptObjectInfo-32 24260928 808656 -96.67% benchmark old MB/s new MB/s speedup Benchmark_decryptObjectInfo-32 0.04 1.24 31.00x benchmark old allocs new allocs delta Benchmark_decryptObjectInfo-32 75112 48996 -34.77% benchmark old bytes new bytes delta Benchmark_decryptObjectInfo-32 287694772 4228076 -98.53% ```	2020-10-29 09:34:20 -07:00
Harshavardhana	4bf90ca67f	fix: handle a crash when AskDisks is set to -1 (#10777 )	2020-10-29 09:25:43 -07:00
Harshavardhana	e0655e24f2	fix: A possible crash when fi.Erasure.Distribution is empty (#10779 )	2020-10-28 19:24:01 -07:00
Klaus Post	bfc36aed89	Add update retry limit and compare error by string instead (#10776 )	2020-10-28 13:19:53 -07:00
Kaloyan Raev	be7f67268d	fix: Do not cleanup range files in cache SaveMetadata when total hits are false (#10728 )	2020-10-28 09:23:17 -07:00
Klaus Post	a982baff27	ListObjects Metadata Caching (#10648 ) Design: https://gist.github.com/klauspost/025c09b48ed4a1293c917cecfabdf21c Gist of improvements: * Cross-server caching and listing will use the same data across servers and requests. * Lists can be arbitrarily resumed at a constant speed. * Metadata for all files scanned is stored for streaming retrieval. * The existing bloom filters controlled by the crawler is used for validating caches. * Concurrent requests for the same data (or parts of it) will not spawn additional walkers. * Listing a subdirectory of an existing recursive cache will use the cache. * All listing operations are fully streamable so the number of objects in a bucket no longer dictates the amount of memory. * Listings can be handled by any server within the cluster. * Caches are cleaned up when out of date or superseded by a more recent one.	2020-10-28 09:18:35 -07:00
Krishna Srinivas	f53c5a020e	fix: heal object shards with ec.index and ec.distribution mismatches (#10773 ) Co-authored-by: Harshavardhana <harsha@minio.io>	2020-10-28 00:10:20 -07:00
Harshavardhana	5b30bbda92	fix: add more protection distribution to match EcIndex (#10772 ) allows for more stricter validation in picking up the right set of disks for reconstruction.	2020-10-28 00:09:15 -07:00
Shireesh Anjal	858e2a43df	Remove logging info from OBDInfoHandler (#10727 ) A lot of logging data is counterproductive. A better implementation with precise useful log data can be introduced later.	2020-10-27 17:41:48 -07:00
Kaloyan Raev	df9894e275	avoid caching http ranges in background goroutine (#10724 )	2020-10-26 23:04:48 -07:00
Krishna Srinivas	592f2f23a3	fix: heal rejects objects with disk re-ordering issue (#10766 )	2020-10-26 18:48:47 -07:00
Krishna Srinivas	c49a80db41	fix: use meta.Erasure.Index for GetObject() to reconstruct object (#10764 )	2020-10-26 16:19:42 -07:00
Poorna Krishnamoorthy	46275c6547	cache: rename function declarations (#10763 )	2020-10-26 15:41:24 -07:00
Poorna Krishnamoorthy	0994ed9783	cache: fix call in GetObjectNInfo (#10762 ) Fixes: #10751	2020-10-26 12:30:40 -07:00
Anis Elleuch	eb95353cb1	fix: Get/HeadObject return 404 on non quorum objects (#10753 )	2020-10-26 10:30:46 -07:00
Harshavardhana	029758cb20	fix: retain the previous UUID for newly replaced drives (#10759 ) only newly replaced drives get the new `format.json`, this avoids disks reloading their in-memory reference format, ensures that drives are online without reloading the in-memory reference format. keeping reference format in-tact means UUIDs never change once they are formatted.	2020-10-26 10:29:29 -07:00
Harshavardhana	646d6917ed	turn-off checking for updates completely if MINIO_UPDATE=off (#10752 )	2020-10-24 22:39:44 -07:00
Harshavardhana	d9db7f3308	expire lockers if lockers are offline (#10749 ) lockers currently might leave stale lockers, in unknown ways waiting for downed lockers. locker check interval is high enough to safely cleanup stale locks.	2020-10-24 13:23:16 -07:00
Harshavardhana	6a8c62f9fd	make sure to preserve UUID from reference format (#10748 ) reference format should be source of truth for inconsistent drives which reconnect, add them back to their original position remove automatic fix for existing offline disk uuids	2020-10-24 13:23:08 -07:00
Anis Elleuch	00124c56d9	erasure: Commit data before xl.meta in RenameData() (#10734 ) This will reduce the chance to have updated xl.meta without data.	2020-10-23 21:54:58 -07:00
Anis Elleuch	2c32c2149e	tests: Avoid running TestNSRace in short test mode (#10735 )	2020-10-23 21:23:12 -07:00
Harshavardhana	734f258878	fix: slow down auto healing more aggressively (#10730 ) Bonus fixes - logging improvements to ensure that we don't use `go logger.LogIf` to avoid runtime.Caller missing the function name. log where necessary. - remove unused code at erasure sets	2020-10-22 13:36:24 -07:00
Anis Elleuch	0e0c53bba4	tests: Lower expectation in addr selection in rand cache dialer (#10739 ) Test TestDialContextWithDNSCacheRand was failing sometimes because it depends on a random selection of addresses when testing random DNS resolution from cache. Lower addr selection exception to 10%	2020-10-22 09:35:32 -07:00
Poorna Krishnamoorthy	5cc23ae052	validate if iam store is initialized (#10719 ) Fixes panic - regression from `d6d770c1b1`	2020-10-20 21:28:24 -07:00
Harshavardhana	d6d770c1b1	initialize object layer right after config has loaded	2020-10-19 22:04:59 -07:00
Harshavardhana	b07df5cae1	initialize IAM as soon as object layer is initialized (#10700 ) Allow requests to come in for users as soon as object layer and config are initialized, this allows users to be authenticated sooner and would succeed automatically on servers which are yet to fully initialize.	2020-10-19 09:54:40 -07:00
Harshavardhana	c107728676	fix: s3 gateway DNS cache initialization (#10706 ) fixes #10705	2020-10-19 01:34:23 -07:00
Anis Elleuch	284a2b9021	ilm: Send delete marker creation event when appropriate (#10696 ) Before this commit, the crawler ILM will always send object delete event notification though this is wrong.	2020-10-16 21:22:12 -07:00
Ritesh H Shukla	0b53e30ecb	Clean up monitor on delete bucket (#10698 )	2020-10-16 17:59:31 -07:00
Harshavardhana	bd2131ba34	add DNS cache support to avoid DNS flooding (#10693 ) Go stdlib resolver doesn't support caching DNS resolutions, since we compile with CGO disabled we are more probe to DNS flooding for all network calls to resolve for DNS from the DNS server. Under various containerized environments such as VMWare this becomes a problem because there are no DNS caches available and we may end up overloading the kube-dns resolver under concurrent I/O. To circumvent this issue implement a DNSCache resolver which resolves DNS and caches them for around 10secs with every 3sec invalidation attempted.	2020-10-16 14:49:05 -07:00
ebozduman	1aec168c84	fix: azure gateway should reject bucket names with "." (#10635 )	2020-10-16 09:30:18 -07:00
Klaus Post	21a549a83b	fix: keep MRF channel open to avoid random CI crash (#10686 ) There doesn't seem to be any benefit to closing the channel, so just keep it open and let it die with the server.	2020-10-16 09:08:51 -07:00
Ritesh H Shukla	8a16a1a1a9	fix: misc fixes for bandwidth reporting amd monitoring (#10683 ) * Set peer for fetch bandwidth * Fix the limit for bandwidth that is reported. * Reduce CPU burn from bandwidth management.	2020-10-16 09:07:50 -07:00
Harshavardhana	ad726b49b4	rename zones to serverSets to avoid terminology conflict (#10679 ) we are bringing in availability zones, we should avoid zones as per server expansion concept.	2020-10-15 14:28:50 -07:00
Anis Elleuch	db2241066b	heal: Enable removing dangling delete markers (#10688 )	2020-10-15 13:06:40 -07:00
Harshavardhana	f1cc16e788	fix: background heal rely on getOnlineDisks() (#10687 )	2020-10-15 13:06:23 -07:00
Klaus Post	3820a905e0	in getOnlineDisks wait for disks to be populated (#10685 )	2020-10-15 06:37:10 -07:00
Harshavardhana	2042d4873c	rename crawler config option to heal (#10678 )	2020-10-14 13:51:51 -07:00
Harshavardhana	f9be783f3e	fix: allow crawler to crawl on disks without usage constraints (#10677 ) additionally also change the resolution usage wise return of disks, allows to small byte level differences to be masked.	2020-10-14 12:12:10 -07:00
Harshavardhana	71b97fd3ac	fix: connect disks pre-emptively during startup (#10669 ) connect disks pre-emptively upon startup, to ensure we have enough disks are connected at startup rather than wait for them. we need to do this to avoid long wait times for server to be online when we have servers come up in rolling upgrade fashion	2020-10-13 18:28:42 -07:00
Klaus Post	03991c5d41	crawler: Remove waitForLowActiveIO (#10667 ) Only use dynamic delays for the crawler. Even though the max wait was 1 second the number of waits could severely impact crawler speed. Instead of relying on a global metric, we use the stateless local delays to keep the crawler running at a speed more adjusted to current conditions. The only case we keep it is before bitrot checks when enabled.	2020-10-13 13:45:08 -07:00
飞雪无情	614060764d	fix: use the correct Action type for policy.Args and iampolicy.Args (#10650 )	2020-10-12 15:18:22 -07:00
Harshavardhana	a3ba8188d7	fix: allow locker to be niladic	2020-10-12 14:23:44 -07:00
Harshavardhana	2760fc86af	Bump default idleConnsPerHost to control conns in time_wait (#10653 ) This PR fixes a hang which occurs quite commonly at higher concurrency by allowing following changes - allowing lower connections in time_wait allows faster socket open's - lower idle connection timeout to ensure that we let kernel reclaim the time_wait connections quickly - increase somaxconn to 4096 instead of 2048 to allow larger tcp syn backlogs. fixes #10413	2020-10-12 14:19:46 -07:00
Ritesh H Shukla	8ceb2a93fd	fix: peer replication bandwidth monitoring in distributed setup (#10652 )	2020-10-12 09:04:55 -07:00
Ritesh H Shukla	c2f16ee846	Add basic bandwidth monitoring for replication. (#10501 ) This change tracks bandwidth for a bucket and object - [x] Add Admin API - [x] Add Peer API - [x] Add BW throttling - [x] Admin APIs to set replication limit - [x] Admin APIs for fetch bandwidth	2020-10-09 20:36:00 -07:00
Harshavardhana	6484453fc6	optionally allow strict quorum listing (#10649 ) ``` export MINIO_API_LIST_STRICT_QUORUM=on ``` would enable listing in quorum if necessary	2020-10-09 15:40:46 -07:00
Harshavardhana	a0d0645128	remove safeMode behavior in startup (#10645 ) In almost all scenarios MinIO now is mostly ready for all sub-systems independently, safe-mode is not useful anymore and do not serve its original intended purpose. allow server to be fully functional even with config partially configured, this is to cater for availability of actual I/O v/s manually fixing the server. In k8s like environments it will never make sense to take pod into safe-mode state, because there is no real access to perform any remote operation on them.	2020-10-09 09:59:52 -07:00
Harshavardhana	253194e491	do not hold write locks - if objects don't exist (#10644 )	2020-10-08 17:47:21 -07:00
Harshavardhana	736e58dd68	fix: handle concurrent lockers with multiple optimizations (#10640 ) - select lockers which are non-local and online to have affinity towards remote servers for lock contention - optimize lock retry interval to avoid sending too many messages during lock contention, reduces average CPU usage as well - if bucket is not set, when deleteObject fails make sure setPutObjHeaders() honors lifecycle only if bucket name is set. - fix top locks to list out always the oldest lockers always, avoid getting bogged down into map's unordered nature.	2020-10-08 12:32:32 -07:00
Poorna Krishnamoorthy	907a171edd	Generalize error messages for remote targets (#10638 ) This is to allow remote targets to be generalized for replication/ILM transition Also adding a field in BucketTarget to identify a remote target with a label.	2020-10-08 10:54:11 -07:00
Andreas Auernhammer	ed6d2a100f	logger: avoid writing audit log response header twice (#10642 ) This commit fixes a misuse of the `http.ResponseWriter.WriteHeader`. A caller should either call `WriteHeader` exactly once or write to the response writer and causing an implicit 200 OK. Writing the response headers more than once causes a `http: superfluous response.WriteHeader call` log message. This commit fixes this by preventing a 2nd `WriteHeader` call being forwarded to the underlying `ResponseWriter`. Updates #10587	2020-10-08 09:29:10 -07:00
Harshavardhana	effe131090	fix: allow read unlocks to be defensive about split brains (#10637 )	2020-10-07 09:15:01 -07:00
Harshavardhana	18063bf25c	fix: cleanup old directory handling code (#10633 ) we don't need them anymore, remove legacy code.	2020-10-06 12:03:57 -07:00
Poorna Krishnamoorthy	dbbed6f7f0	update minio-go dependency (#10634 )	2020-10-06 08:37:09 -07:00
Poorna Krishnamoorthy	7fbfdceba3	Fix replication slowness (#10632 ) - Increase channel buffer length - Avoid blocking wait on replicaCh	2020-10-05 14:45:42 -07:00
Shireesh Anjal	f1418a50f0	add NVMe drive info [model num, serial num, drive temp. etc.] (#10613 ) * add NVMe drive info [model num, serial num, drive temp. etc.] * Ignore fuse partitions * Add the nvme logic only for linux * Move smart/nvme structs to a separate file Co-authored-by: wlan0 <sidharthamn@gmail.com>	2020-10-04 10:18:46 -07:00
Krishna Srinivas	045e30f2c1	Set LastModified time from source for bucket replication (#10627 )	2020-10-02 18:32:22 -07:00
Harshavardhana	c6a9a94f94	fix: optimize ServerInfo() handler to avoid reading config (#10626 ) fixes #10620	2020-10-02 16:19:44 -07:00
Harshavardhana	8e7c00f3d4	add missing request-id from DeleteObject events (#10623 ) fixes #10621	2020-10-02 13:36:13 -07:00
Harshavardhana	23e8390997	fix: Allow Walk to honor load balanced drives (#10610 )	2020-10-01 20:24:34 -07:00
Anis Elleuch	71403be912	fix: consider partNumber in GET/HEAD requests (#10618 )	2020-10-01 15:41:12 -07:00
Harshavardhana	f28d02b7f2	fix: simplify obd how we calculate transferred bytes (#10617 )	2020-10-01 14:34:51 -07:00
Harshavardhana	e0cb814f3f	fail if port is not accessible (#10616 ) throw proper error when port is not accessible for the regular user, this is possibly a regression. ``` ERROR Unable to start the server: Insufficient permissions to use specified port > Please ensure MinIO binary has 'cap_net_bind_service=+ep' permissions HINT: Use 'sudo setcap cap_net_bind_service=+ep /path/to/minio' to provide sufficient permissions ```	2020-10-01 13:23:31 -07:00
Harshavardhana	98a08e1644	fix: protect updating latencies/throughput slices in obd (#10611 ) Additionally close the transferChan upon function exit.	2020-10-01 09:50:08 -07:00
Klaus Post	3047121255	dataupdate: Bump to force rescan (#10609 ) After #10594 let's invalidate the bloom filters to force the next cycles to go through all data. There is a small chance that the linked PR could have caused missing bloom filter data. This will invalidate the current bloom filters and make the crawler go through everything.	2020-09-30 16:10:40 -07:00
Ritesh H Shukla	5a7f92481e	fix: client errors for DNS service creation errors (#10584 )	2020-09-30 14:09:41 -07:00
Anis Elleuch	0d45c38782	List v1/versions routes based on source IP if found (#10603 ) Routing using on source IP if found. This should distribute the listing load for V1 and versioning on multiple nodes evenly between different clients. If source IP is not found from the http request header, then falls back to bucket name instead.	2020-09-30 13:38:27 -07:00
Poorna Krishnamoorthy	56d1b227cf	Handle changes to versioning config for replication (#10598 ) Disallow versioning suspension on a bucket with pre-existing replication configuration If versioning is suspended on the target,replication should fail.	2020-09-30 13:36:37 -07:00
Lenin Alevski	bea87a5a20	fix: reading multiple TLS certificates when deployed in K8S (#10601 ) Ignore all regular files, CAs directory and any directory that starts with `..` inside the `.minio/certs` folder	2020-09-30 08:21:30 -07:00
Harshavardhana	2b4eb87d77	pick disks which are common maximally used (#10600 ) further optimization to ensure that good disks are always used for listing, other than healing we only use disks that are maximally used.	2020-09-29 22:54:02 -07:00
Harshavardhana	1f9abbee4d	make sure to release locks upon timeout (#10596 ) fixes #10418	2020-09-29 15:18:34 -07:00
Klaus Post	fdf0ae9167	exit data update tracker only upon context completion (#10594 ) The data update tracker saver would exit if data wasn't updated for between cycles.	2020-09-29 13:23:53 -07:00
Harshavardhana	00eb6f6bc9	cache DiskInfo at storage layer for performance (#10586 ) `mc admin info` on busy setups will not move HDD heads unnecessarily for repeated calls, provides a better responsiveness for the call overall. Bonus change allow listTolerancePerSet be N-1 for good entries, to avoid skipping entries for some reason one of the disk went offline.	2020-09-29 09:54:41 -07:00
Harshavardhana	66174692a2	add '.healing.bin' for tracking currently healing disk (#10573 ) add a hint on the disk to allow for tracking fresh disk being healed, to allow for restartable heals, and also use this as a way to track and remove disks. There are more pending changes where we should move all the disk formatting logic to backend drives, this PR doesn't deal with this refactor instead makes it easier to track healing in the future.	2020-09-28 19:39:32 -07:00
飞雪无情	209680e89f	Remove redundant http.HandlerFunc type conversion. (#10576 )	2020-09-28 13:33:49 -07:00
飞雪无情	27d9bd04e5	Handling unhandled errors in the InfoCannedPolicy method. (#10575 )	2020-09-27 10:24:04 -07:00
Harshavardhana	bebcf4f004	unlock() only if locking was successful	2020-09-25 19:36:47 -07:00
Harshavardhana	eafa775952	fix: add lock ownership to expire locks (#10571 ) - Add owner information for expiry, locking, unlocking a resource - TopLocks returns now locks in quorum by default, provides a way to capture stale locks as well with `?stale=true` - Simplify the quorum handling for locks to avoid from storage class, because there were challenges to make it consistent across all situations. - And other tiny simplifications to reset locks.	2020-09-25 19:21:52 -07:00
Harshavardhana	66b4a862e0	fix: network failure err check should ignore context canceled errors (#10567 ) context canceled errors bubbling up from the network layer has the potential to be misconstrued as network errors, taking prematurely a server offline and triggering a health check routine avoid this potential occurrence.	2020-09-25 14:35:47 -07:00
Anis Elleuch	9603489dd3	federation: Honor range with UploadObjectPart to a different cluster (#10570 ) Use gr & length instead of srcInfo.Reader & srcInfo.Size because they don't honor range header	2020-09-25 12:06:42 -07:00
Anis Elleuch	b302c8a5f4	heal: Fix periodic healing cleanup (#10569 ) isEnded() was incorrectly calculating if the current healing sequence is ended or not. h.currentStatus.Items could be empty if healing is very slow and mc admin heal consumed all items.	2020-09-25 10:29:00 -07:00
Praveen raj Mani	b880796aef	Set the maximum open connections limit in PG and MySQL target configs (#10558 ) As the bulk/recursive delete will require multiple connections to open at an instance, The default open connections limit will be reached which results in the following error ```FATAL: sorry, too many clients already``` By setting the open connections to a reasonable value - `2`, We ensure that the max open connections will not be exhausted and lie under bounds. The queries are simple inserts/updates/deletes which is operational and sufficient with the the maximum open connection limit is 2. Fixes #10553 Allow user configuration for MaxOpenConnections	2020-09-24 22:20:30 -07:00
Harshavardhana	37a5d5d7a0	reduce timeouts between servers for faster disconnects (#10562 )	2020-09-24 20:10:07 -07:00
Harshavardhana	3cac262dd1	report heal drives properly, also from global state (#10561 ) It is possible the heal drives are not reported from the maintenance check because the background heal state simply relied on the `format.json` for capturing unformatted drives. It is possible that drives might be still healing - make sure that applications which rely on cluster health check respond back this detail.	2020-09-24 15:36:47 -07:00
poornas	e6ab4db6b8	Fix minimum replication workers started (#10560 ) This PR also fixes GetReplicationConfiguration permission in web-handlers.go to use bucket as resource	2020-09-24 12:25:41 -07:00
Harshavardhana	ca989eb0b3	avoid ListBuckets returning quorum errors when node is down (#10555 ) Also, revamp the way ListBuckets work make few portions of the healing logic parallel - walk objects for healing disks in parallel - collect the list of buckets in parallel across drives - provide consistent view for listBuckets()	2020-09-24 09:53:38 -07:00
飞雪无情	d778d034e7	Remove redundant mgmtQueryKey type. (#10557 ) Remove redundant type conversion.	2020-09-24 08:40:21 -07:00
Harshavardhana	f7f9517b6a	fix: host extraction without port	2020-09-23 12:10:14 -07:00
Harshavardhana	90cff10e2b	avoid crash if disks are not initialized	2020-09-23 12:00:29 -07:00
Harshavardhana	81caf35926	fix: reduce healthcheck interval for storage rest client (#10544 )	2020-09-23 10:43:42 -07:00
poornas	5726cef3ca	validate bucket exists in ListRemoteTargets api (#10552 )	2020-09-23 10:37:54 -07:00
Harshavardhana	8b74a72b21	fix: rename READY deadline to CLUSTER deadline ENV (#10535 )	2020-09-23 09:14:33 -07:00
Klaus Post	eec69d6796	Fix stale context for bucket retrieval (#10551 ) The provided context gets captured by the closure making all subsequent calls fail.	2020-09-23 08:30:31 -07:00
Harshavardhana	0537a21b79	avoid concurrenct use of rand.NewSource (#10543 )	2020-09-22 15:34:27 -07:00
poornas	4c54ed8748	Close replica channel only once (#10542 ) Also enforce s3:GetReplicationConfiguration permission check as a bucket level resource.	2020-09-22 12:47:24 -07:00
Anis Elleuch	4c81201f95	fix: healing delete marker on versioned buckets (#10530 ) Healing was not working correctly in the distributed mode because errFileVersionNotFound was not properly converted in storage rest client. Besides, fixing the healing delete marker is not working as expected.	2020-09-21 15:16:16 -07:00
Harshavardhana	cd8d511d3d	move versionsOrder struct to xl-storage-utils	2020-09-21 14:24:42 -07:00
Harshavardhana	17e17da00d	add parallel workers to perform replication in parallel (#10525 ) set the concurrency for replication be to runtime.NumCPU()/2	2020-09-21 13:43:29 -07:00
Harshavardhana	a5da9120f3	fix: [fs] an error upon rwPool.Write() just attempt rwPool.Create() (#10533 ) On some NFS clients looks like errno is incorrectly set, which leads to incorrect errors thrown upwards.	2020-09-21 12:54:23 -07:00
poornas	aa12d75d75	fix crawler to detect lifecycle on bucket even if filter nil (#10532 )	2020-09-21 11:41:07 -07:00
Harshavardhana	6fcbdd5607	remove unused putObjectDir code (#10528 )	2020-09-21 09:41:39 -07:00
Harshavardhana	3831cc9e3b	fix: [fs] CompleteMultipart use trie structure for partMatch (#10522 ) performance improves by around 100x or more ``` go test -v -run NONE -bench BenchmarkGetPartFile goos: linux goarch: amd64 pkg: github.com/minio/minio/cmd BenchmarkGetPartFileWithTrie BenchmarkGetPartFileWithTrie-4 1000000000 0.140 ns/op 0 B/op 0 allocs/op PASS ok github.com/minio/minio/cmd 1.737s ``` fixes #10520	2020-09-21 01:18:13 -07:00
Krishna Srinivas	230fc0d186	Support for "directory" objects (#10499 )	2020-09-19 08:39:41 -07:00
Harshavardhana	7f9498f43f	fix: ignore faulty drives and continue (#10511 ) drives might return different types of errors handle them individually, and for some errors just log an error and continue	2020-09-18 12:09:05 -07:00
Harshavardhana	1cf322b7d4	change leader locker only for crawler (#10509 )	2020-09-18 11:15:54 -07:00
Klaus Post	0b1c824618	Fix incorrect request start time (#10516 ) Log request start time BEFORE starting processing the request	2020-09-18 09:30:52 -07:00
Klaus Post	c851e022b7	Tweaks to dynamic locks (#10508 ) * Fix cases where minimum timeout > default timeout. * Add defensive code for too small/negative timeouts. * Never set timeout below the maximum value of a request. * Protect against (unlikely) int64 wraps. * Decrease timeout slower. * Don't re-lock before copying.	2020-09-18 09:18:18 -07:00
Klaus Post	5ad032826a	Add a reasonable if unable to get total RAM (#10506 ) Though unlikely we shouldn't skip initializing the API if we cannot get RAM. Add 16GiB as a default and log the error.	2020-09-18 02:03:02 -07:00
Harshavardhana	84bf4624a4	fix: make sure to preserve metadata during overwrite in FS mode (#10512 ) This bug was introduced in `14f0047295` almost 3yrs ago, as a side affect of removing stale `fs.json` but we in-fact end up removing existing good `fs.json` for an existing object, leading to some form of a data loss. fixes #10496	2020-09-18 00:16:16 -07:00
Harshavardhana	4a36cd7035	fix: improve performance ListObjectParts in FS mode (#10510 ) from 20s for 10000 parts to less than 1sec Without the patch ``` ~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \ list-parts --bucket testbucket --key test \ --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a real 0m20.394s user 0m0.589s sys 0m0.174s ``` With the patch ``` ~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \ list-parts --bucket testbucket --key test \ --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a real 0m0.891s user 0m0.624s sys 0m0.182s ``` fixes #10503	2020-09-17 18:51:16 -07:00
Klaus Post	03490c811b	Fix obd goroutine leak (#10504 ) The gouroutine collecting transfer stats never exits. Add missing channel close.	2020-09-17 10:10:20 -07:00
Harshavardhana	ed78854cea	fix: list across all drives to avoid stale disks	2020-09-16 21:17:10 -07:00
Harshavardhana	e60834838f	fix: background disk heal, to reload format consistently (#10502 ) It was observed in VMware vsphere environment during a pod replacement, `mc admin info` might report incorrect offline nodes for the replaced drive. This issue eventually goes away but requires quite a lot of time for all servers to be in sync. This PR fixes this behavior properly.	2020-09-16 21:14:35 -07:00
Harshavardhana	d616d8a857	serialize replication and feed it through task model (#10500 ) this allows for eventually controlling the concurrency of replication and overally control of throughput	2020-09-16 16:04:55 -07:00
Anis Elleuch	24cab7f9df	ilm: Remove a 'null' version if not latest (#10494 ) If the ILM document requires removing noncurrent versions, the the server should be able to remove 'null' versions as well. 'null' versions are created when versioning is not enabled or suspended.	2020-09-16 10:21:50 -07:00
Harshavardhana	02c1a08a5b	fix: make sure to lock CopyObject for in-place updates (#10492 )	2020-09-15 20:44:48 -07:00
Ritesh H Shukla	5c47ce456e	Run replication in the background (#10491 )	2020-09-15 18:44:58 -07:00
Anis Elleuch	8ea55f9dba	obd: Add console log to OBD output (#10372 )	2020-09-15 18:02:54 -07:00
poornas	80e3dce631	azure: update content-md5 to metadata after upload (#10482 ) Fixes #10453	2020-09-15 16:31:47 -07:00
Harshavardhana	80fab03b63	fix: S3 gateway doesn't support full passthrough for encryption (#10484 ) The entire encryption layer is dependent on the fact that KMS should be configured for S3 encryption to work properly and we only support passing the headers as is to the backend for encryption only if KMS is configured. Make sure that this predictability is maintained, currently the code was allowing encryption to go through and fail at later to indicate that KMS was not configured. We should simply reply "NotImplemented" if KMS is not configured, this allows clients to simply proceed with their tests.	2020-09-15 13:57:15 -07:00
Harshavardhana	730d2dc7be	fix: allow CopyObject/PutObjecTags on pre-existing content (#10485 ) fixes #10475	2020-09-15 09:18:41 -07:00
Harshavardhana	0ee9678190	fix: add missing delete marker created filter (#10481 )	2020-09-14 21:32:52 -07:00
Klaus Post	34859c6d4b	Preallocate (safe) slices when we know the size (#10459 )	2020-09-14 20:44:18 -07:00
Klaus Post	b1c99e88ac	reduce CPU usage upto 50% in readdir (#10466 )	2020-09-14 17:19:54 -07:00
Harshavardhana	0104af6bcc	delayed locks until we have started reading the body (#10474 ) This is to ensure that Go contexts work properly, after some interesting experiments I found that Go net/http doesn't cancel the context when Body is non-zero and hasn't been read till EOF. The following gist explains this, this can lead to pile up of go-routines on the server which will never be canceled and will die at a really later point in time, which can simply overwhelm the server. https://gist.github.com/harshavardhana/c51dcfd055780eaeb71db54f9c589150 To avoid this refactor the locking such that we take locks after we have started reading from the body and only take locks when needed. Also, remove contextReader as it's not useful, doesn't work as expected context is not canceled until the body reaches EOF so there is no point in wrapping it with context and putting a `select {` on it which can unnecessarily increase the CPU overhead. We will still use the context to cancel the lockers etc. Additional simplification in the locker code to avoid timers as re-using them is a complicated ordeal avoid them in the hot path, since locking is very common this may avoid lots of allocations.	2020-09-14 15:57:13 -07:00
Harshavardhana	34ea1d2167	fix: return correct error code for MetadataTooLarge (#10470 ) fixes #10469	2020-09-13 21:26:35 -07:00
Harshavardhana	9d95937018	update KMS docs indicating deprecation of AUTO_ENCRYPTION env	2020-09-13 16:23:28 -07:00

... 10 11 12 13 14 ...

4024 Commits