minio

Commit Graph

Author	SHA1	Message	Date
Klaus Post	cdb1b48ad9	Make localLocker lock attempts cancellable (#16510 )	2023-01-31 09:41:17 -08:00
Kaan Kabalak	2d0f30f062	Fix typo in code comment (#16509 )	2023-01-31 07:54:19 +05:30
Krishnan Parthasarathi	cea2ca8c8e	Add restore-status header for multipart objects (#16508 )	2023-01-31 07:53:45 +05:30
Klaus Post	f713436dd0	Fix truncated list response on deleted replicated objects (#16504 )	2023-01-30 09:13:53 -08:00
Klaus Post	b923a62425	Check pool-index for invalid setups (#16501 )	2023-01-30 18:33:07 +05:30
Harshavardhana	67fce4a5b3	fix: dangling delete() upon success should return 404 (#16494 )	2023-01-27 12:43:45 -08:00
Poorna	eaa65b7ade	fix replication healing on list to consider all versions (#16496 )	2023-01-27 12:43:28 -08:00
Poorna	820d94447c	replication: fix target bucket passed on GET proxy (#16495 )	2023-01-27 10:24:51 -08:00
Poorna	ed20134a7b	replication: detect proxy header presence correctly (#16489 )	2023-01-27 01:29:32 -08:00
Harshavardhana	d19cbc81b5	fix: do not return IAM/Bucket metadata replication errors to client (#16486 )	2023-01-26 11:11:54 -08:00
Anis Elleuch	1fd7946dce	Print golang http errors in MinIO log format (#16465 )	2023-01-26 22:46:16 +05:30
Klaus Post	027ff0f3a8	fix: set modTime to current in snowball if archive shows empty (#16482 )	2023-01-26 22:20:35 +05:30
Harshavardhana	54b561898f	fix: anonymize the x-amz-id-2 value from hostname (#16478 )	2023-01-25 10:25:36 -08:00
Harshavardhana	65c104a589	add x-amz-id-2 to indicate the node that received the request (#16474 )	2023-01-25 09:14:10 -08:00
Anis Elleuch	0a0416b6ea	Better error when setting up replication with a service account alias (#16472 )	2023-01-25 21:50:12 +05:30
Anis Elleuch	441babdc41	Rename peer S3 prefix to avoid collision in the future (#16473 )	2023-01-25 06:46:30 -08:00
Harshavardhana	e64b9f6751	fix: disallow SSE-C encrypted objects on replicated buckets (#16467 )	2023-01-24 15:46:33 -08:00
Florian Schwab	d67a846ec4	allow restarting of decommissioning if completed, failed or canceled (#16464 )	2023-01-24 07:07:59 -08:00
Poorna	ca2a1c3f60	replication: clone metrics while loading metrics cache (#16462 )	2023-01-24 02:10:32 -08:00
Poorna	93fbb228bf	Validate if parent user exists for service acct (#16443 )	2023-01-24 08:17:18 +05:30
Anis Elleuch	f37a5b6dae	Add CPU info in the check update user-agent (#16447 )	2023-01-23 08:07:55 -08:00
Harshavardhana	31b0decd46	migrate to minio/mux from gorilla/mux (#16456 )	2023-01-23 16:42:47 +05:30
Harshavardhana	eb561e1c05	allow bootstrap platform checks to be pool specific (#16455 )	2023-01-23 16:24:50 +05:30
Poorna	ddad231921	replication: Avoid logging PreConditionFailed error (#16450 )	2023-01-21 07:33:04 +05:30
Klaus Post	03b94f907f	fix: deleted object names for directory objects (#16448 )	2023-01-20 21:16:06 +05:30
Shireesh Anjal	0f591d245d	fix: incorrect anonymization of drive endpoint (#16442 )	2023-01-20 07:35:44 +05:30
Poorna	1b02e046c2	Fix bandwidth monitoring to be per remote target (#16360 )	2023-01-19 18:52:16 +05:30
Harshavardhana	d08e3cc895	add a way to avoid blocking queueHealTask() depending on caller (#16433 )	2023-01-19 18:50:54 +05:30
Anis Elleuch	d98116559b	Use async healing in PutObject call (#16431 )	2023-01-19 00:54:22 -08:00
Krishnan Parthasarathi	71c95ad0d0	Signal stop-rebalance to all rebalancing pools (#16438 )	2023-01-19 06:54:23 +05:30
Aditya Manthramurthy	698862ec5d	Fix transports/timeouts related regressions (#16427 )	2023-01-18 10:06:38 +05:30
Harshavardhana	b4ef5ff294	remove unnecessary code checking for supported features (#16423 )	2023-01-17 19:37:47 +05:30
Harshavardhana	3db658e51e	use correct xml package for custom MarshalXML() (#16421 )	2023-01-17 05:08:33 +05:30
Shireesh Anjal	5a9f7516d6	Add monthly license update job (#16391 )	2023-01-17 05:08:15 +05:30
Anis Elleuch	3039fd4519	Optimize background heal status to use LocalStorageInfo (#16414 )	2023-01-17 05:02:00 +05:30
Harshavardhana	095fc0561d	feat: allow decom of multiple pools (#16416 )	2023-01-16 21:36:34 +05:30
Anis Elleuch	beb1924437	Properly restart fresh disk healing when failed in some places (#16413 )	2023-01-14 05:06:46 +05:30
jiuker	c8e1154f1e	fix: reading from erasureDisks must be protected via read lock() (#16407 )	2023-01-13 04:16:23 -08:00
Poorna	b204c2dbec	fix: enforce deny on DeleteVersionAction (#16409 )	2023-01-13 04:16:00 -08:00
Poorna	b22b39de96	Avoid dangling deletes if disk not found (#16401 )	2023-01-12 22:20:19 -08:00
Harshavardhana	c242e6c391	fix: calculate common parity properly (#16406 )	2023-01-13 03:28:16 +05:30
Anis Elleuch	e05205756f	metrics: Add more logs when unable to read bucket usage (#16405 )	2023-01-13 02:32:00 +05:30
Anis Elleuch	475a88b555	fix: error out if an object is found after a full decom (#16277 )	2023-01-12 05:52:51 +05:30
Anis Elleuch	1ece3d1dfe	Add comment field to service accounts (#16380 )	2023-01-10 21:57:52 +04:00
Anis Elleuch	2146ed4033	xl: Quit early when EC config is incorrect (#16390 ) Co-authored-by: Anis Elleuch <anis@min.io>	2023-01-09 23:07:45 -08:00
Anis Elleuch	ebd4388cca	s3: Return XMinioInvalidObjectName if the object contains null char (#16372 )	2023-01-06 10:11:18 -08:00
Anis Elleuch	0333412148	fix: heal only once per disk per set among multiple disks (#16358 )	2023-01-05 20:41:19 -08:00
Harshavardhana	e0086c1be7	reduce startup delays on kubernetes (#16356 )	2023-01-05 02:32:43 -08:00
Anis Elleuch	7883e55da2	Merge buckets list from different nodes in ListBuckets() call (#16357 )	2023-01-04 08:53:58 -08:00
Harshavardhana	a15a2556c3	converge listBuckets() as a peer call (#16346 )	2023-01-03 23:39:40 -08:00
Harshavardhana	f1bbb7fef5	vectorize cluster-wide calls such as bucket operations (#16313 )	2023-01-03 08:16:39 -08:00
Harshavardhana	1cd8e1d8b6	remove the startup jitter before locks() (#16340 )	2023-01-02 01:40:09 -08:00
jiuker	62cd918061	fix: close helmInfo file descriptor (#16319 )	2023-01-01 23:26:59 -08:00
Klaus Post	6a04067514	fix: tweak read buffer size to reduce over-reading (#16338 )	2023-01-01 08:14:20 -08:00
Taran Pelkey	49b3908635	fix: misplaced write response command in DetachPolicy() (#16333 )	2022-12-30 20:04:03 -08:00
Harshavardhana	f93183f66e	fix: a deadlock by refactoring listBuckets() under site replication (#16323 )	2022-12-29 00:08:31 -08:00
Harshavardhana	2937711390	fix: DeleteObject() API with versionId under replication (#16325 )	2022-12-28 22:48:33 -08:00
Anis Elleuch	27417459fb	metrics: Show healing info for all nodes (#16315 )	2022-12-26 08:35:32 -08:00
Harshavardhana	5b8fe2e89a	allow locks with object affinity to spread across pools (#16312 )	2022-12-23 20:55:45 -08:00
Anis Elleuch	acc9c033ed	debug: Add X-Amz-Request-ID to lock/unlock calls (#16309 )	2022-12-23 19:49:07 -08:00
Poorna	8528b265a9	Validate replication target update to avoid duplicate endpoints (#16311 )	2022-12-23 15:44:48 -08:00
Harshavardhana	b882310e2b	avoid locks for internal and invalid buckets in MakeBucket() (#16302 )	2022-12-23 07:46:00 -08:00
Poorna	de0b43de32	persist replication stats with leader lock (#16282 )	2022-12-22 14:25:13 -08:00
jiuker	29dd7f1d68	tier verification leaks fd, that must be closed (#16296 ) Co-authored-by: Harshavardhana <harsha@minio.io>	2022-12-22 10:35:54 -08:00
Poorna	6423e4c767	Remove site replication config if it succeeded locally (#16279 )	2022-12-22 01:31:20 -08:00
Krishnan Parthasarathi	2fa35def2c	Fix DeleteObject when only free versions remain (#16289 )	2022-12-21 16:24:07 -08:00
Anis Elleuch	34167c51d5	trace: Add bootstrap tracing events (#16286 )	2022-12-21 15:52:29 -08:00
Harshavardhana	a5f8af4efb	serialize replication stats() only when needed (#16280 )	2022-12-20 00:07:53 -08:00
Harshavardhana	5a218f38a1	allow retries for transaction lock on startup (#16273 )	2022-12-19 22:00:00 -08:00
Anis Elleuch	e57e946206	Do not save credentials in config.json (#16275 )	2022-12-19 12:27:06 -08:00
Klaus Post	b4f71362e9	Avoid config migration on every startup (#16278 )	2022-12-19 11:10:14 -08:00
Taran Pelkey	ed37b7a9d5	Add API to fetch policy user/group associations (#16239 )	2022-12-19 10:37:03 -08:00
Anis Elleuch	89db3fdb5d	Do not return an error when version disparity is detected (#16269 )	2022-12-16 08:52:12 -08:00
Harshavardhana	80fc3a8a52	use newDynamicTimeoutWithOpts() when appropriate (#16266 )	2022-12-15 13:11:37 -08:00
Klaus Post	988a2e8fed	Faster startup of large distributed systems with latency (#16259 )	2022-12-15 08:31:21 -08:00
Harshavardhana	2433698372	fix: remove unnecessary logs for client conn errors (#16261 )	2022-12-15 08:25:05 -08:00
Harshavardhana	5d7e8f79ed	fix: remove scanner healing with unnecessary logs (#16260 )	2022-12-14 16:39:18 -08:00
Harshavardhana	bad229e16e	fix: support event name s3:Restore:* (#16257 )	2022-12-14 05:12:07 -08:00
Poorna	d37e514733	Cleanup remote targets automatically on replication config removal. (#16221 )	2022-12-14 03:24:06 -08:00
Harshavardhana	c73ea27ed7	do not log checksum mismatch error, client received the error (#16246 )	2022-12-14 01:57:40 -08:00
Krishnan Parthasarathi	0159b56717	fix: rebalance to account for object's on-disk size (#16240 )	2022-12-14 00:15:14 -08:00
Aditya Manthramurthy	9e6cc847f8	Add HTTP2 config option for policy plugin (#16225 )	2022-12-13 14:28:48 -08:00
Taran Pelkey	709eb283d9	Add endpoints for managing IAM policies (#15897 ) Co-authored-by: Taran <taran@minio.io> Co-authored-by: ¨taran-p¨ <¨taran@minio.io¨> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2022-12-13 12:13:23 -08:00
Anis Elleuch	76dde82b41	Implement STS account info API (#16115 )	2022-12-13 08:38:50 -08:00
Anis Elleuch	939c0100a6	log: Do not interpret verbs in object names in console output (#16233 )	2022-12-13 08:27:40 -08:00
Aditya Manthramurthy	2d60bf8c50	Refactor HTTP transports (#16222 )	2022-12-12 20:31:21 -08:00
Harshavardhana	37e20f6ef2	feat: allow listening specific addrs for API port (#16223 )	2022-12-12 18:48:46 -08:00
Harshavardhana	2fc182d8e6	fix: iso8601TimeFormat padding issue for certain nanoseconds (#16207 )	2022-12-12 10:28:30 -08:00
Shireesh Anjal	a2cbeaa9e6	Use different subnet public key during dev/test (#16216 )	2022-12-12 10:28:15 -08:00
Harshavardhana	444ff20bc5	do not rename multipart failed transactions back to tmp (#16204 )	2022-12-12 01:40:29 -08:00
Harshavardhana	20ef5e7a6a	avoid double deletes() when no more versions (#16206 )	2022-12-12 01:40:04 -08:00
Aditya Manthramurthy	e06127566d	Add IAM API to attach/detach policies for LDAP (#16182 )	2022-12-09 13:08:33 -08:00
Harshavardhana	dfe73629a3	fix: delete marker discrepancies via DeleteObject() API (#16195 )	2022-12-08 18:15:16 -08:00
Harshavardhana	b03dd1af17	remove hard limit for number of buckets (#16194 )	2022-12-08 12:24:03 -08:00
Harshavardhana	4bc367c490	fix: translate tier add errors properly (#16191 )	2022-12-08 11:18:07 -08:00
Klaus Post	3eb2d086b2	Replace filepathx with fork (#16192 )	2022-12-08 10:42:44 -08:00
Klaus Post	70986b6e6e	Add version id to healresult (#16193 )	2022-12-08 07:49:10 -08:00
Klaus Post	ebe395788b	feat: Encrypt s3zip file index (#16179 )	2022-12-07 14:56:07 -08:00
Klaus Post	12fd6678ee	Encrypt checksums with KMS on CompleteMultipartUpload (#16177 )	2022-12-07 10:18:18 -08:00
Harshavardhana	90d35b70b4	remove unnecessary logs for truncated XML inputs (#16184 )	2022-12-07 08:30:52 -08:00
Javier Adriel	04ae9058ed	Populate end_session_endpoint (#16183 )	2022-12-06 16:56:37 -08:00
Aditya Manthramurthy	a30cfdd88f	Bump up madmin-go to v2 (#16162 )	2022-12-06 13:46:50 -08:00
Anis Elleuch	1bae32dc96	xl: Delete older data-dir when replacing an existing version-id (#16176 )	2022-12-06 13:43:18 -08:00
Anis Elleuch	932d2c3c62	Add X-Amz-Request-Id to internode calls (#16146 )	2022-12-06 09:27:26 -08:00
jiuker	8d8d07ac5c	use readlock instead of writelock to get heal information (#16175 )	2022-12-06 08:08:22 -08:00
Anis Elleuch	44735be38e	s3: Return correct error when Version is invalid in policy document (#16178 )	2022-12-06 08:07:24 -08:00
Klaus Post	3fd9059b4e	opt: Only stream big data usage caches (#16168 )	2022-12-05 13:01:11 -08:00
Klaus Post	a713aee3d5	Run staticcheck on CI (#16170 )	2022-12-05 11:18:50 -08:00
Andreas Auernhammer	d882ba2cb4	kms: add support for KES enclaves (#16139 ) Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-12-04 02:34:24 -08:00
jiuker	6086f45d25	fix: in disk cache readCacheFileStream should closed upon return (#16138 )	2022-12-04 02:28:10 -08:00
Klaus Post	98cffbce03	s3zip: Limit over-read for single file (#16161 )	2022-12-02 08:53:24 -08:00
Klaus Post	1cd875de1e	Persist updated metadata (#16160 )	2022-12-02 08:35:04 -08:00
Harshavardhana	5a8df7efb3	re-implement StorageInfo to be a peer call (#16155 )	2022-12-01 14:31:35 -08:00
Anis Elleuch	c84e2939e4	trace: Publish storage layer errors (#16153 )	2022-12-01 12:10:54 -08:00
Anis Elleuch	641ab24aec	repl: resync orchestrator to use global shared lock (#16154 )	2022-12-01 12:10:09 -08:00
Harshavardhana	71133105d7	re-order the top-level config keys for priority (#16150 )	2022-12-01 07:50:08 -08:00
Aditya Manthramurthy	87cbd41265	feat: Allow at most one claim based OpenID IDP (#16145 )	2022-11-29 15:40:49 -08:00
Klaus Post	cc1d8f0057	Check for abandoned data when healing (#16122 )	2022-11-28 10:20:55 -08:00
Anis Elleuch	1f1dcdce65	move HTTP recorder to an internal library (#16128 )	2022-11-28 10:20:27 -08:00
Shireesh Anjal	98a67a3776	Improvements in logger and audit webhooks (#16102 )	2022-11-28 08:03:26 -08:00
Poorna	63fc6ba2cd	preserve replicated ETag properly on target (#16129 )	2022-11-26 14:43:32 -08:00
jiuker	ce53d7f6c2	add disk.Close() in healFreshDisk to indicate idiomatic flow of code (#16124 )	2022-11-26 00:26:15 -08:00
jiuker	fe8eed963e	fix: wrapped error will not equal in decommissioning (#16113 )	2022-11-24 08:00:42 -08:00
Shireesh Anjal	59f877fc64	fix: Timestamp not added in diagnostics report (#16114 )	2022-11-23 07:11:22 -08:00
Klaus Post	f96fe9773c	fix: duplicated shared prefix with custom delimiter when listing (#16111 )	2022-11-22 08:51:04 -08:00
Anis Elleuch	04948b4d55	fix: checking for stale STS account under site replication (#16109 )	2022-11-22 07:26:33 -08:00
Klaus Post	98ba622679	Reduce temporary file clean-up waits (#16110 )	2022-11-22 07:23:36 -08:00
Harshavardhana	08103870a5	update single drive setup error message (#16098 )	2022-11-18 14:47:38 -08:00
Anis Elleuch	993e586855	config: return XMinioConfigNotFound code for non existing config (#16065 )	2022-11-18 10:28:14 -08:00
Harshavardhana	58ec835af0	fix: skip free version ID and marker in metadata equality (#16093 )	2022-11-18 05:48:22 -08:00
Harshavardhana	6aea950d74	avoid partID lock validating uploadID exists prematurely (#16086 )	2022-11-18 03:09:35 -08:00
Poorna	7198be5be9	bucket resync: persist reset id to bucket metadata (#16088 )	2022-11-18 01:39:05 -08:00
Klaus Post	a22b4adf4c	distribute replication ops based on names (#16083 )	2022-11-17 15:20:09 -08:00
Klaus Post	b7bb122be8	fix: replication auto-scaling deadlock (#16084 )	2022-11-17 07:35:02 -08:00
Krishnan Parthasarathi	8441a3bf5f	fix: update metacache entry only once (#16072 )	2022-11-16 11:25:00 -08:00
Harshavardhana	853c4de75a	allow changing endpoints in distributed setups (#16071 )	2022-11-16 07:59:10 -08:00
jiuker	3597af789e	allow resultCh to be closed() after clusterMetaHealthInfo() (#16073 )	2022-11-16 03:04:36 -08:00
Shireesh Anjal	5246e3be84	Send health diagnostics data as part of callhome (#16006 )	2022-11-15 13:53:05 -08:00
Klaus Post	8a07000e58	fix: refactor getReplicationDiff for safe use (#16051 )	2022-11-15 07:59:21 -08:00
Krishnan Parthasarathi	3bb82ef60d	top-locks: Include lock-held duration (#16061 )	2022-11-15 07:57:52 -08:00
Harshavardhana	91f45c4aa6	avoid inconsistent versions healing when versions are large (#16066 )	2022-11-14 18:35:26 -08:00
Poorna	d6bc141bd1	feat: Add support for site level resync (#15753 )	2022-11-14 07:16:40 -08:00
jiuker	7ac64ad24a	fix: use errors.Is for wrapped returns (#16062 )	2022-11-14 07:15:46 -08:00
Harshavardhana	6d76db9d6c	improve server startup error when pools are incorrect (#16056 )	2022-11-11 19:40:45 -08:00
jiuker	bdcb485740	netPerfRX Reset() should use write Lock() (#16043 )	2022-11-10 19:44:20 -08:00
Poorna	e32b948a49	fix: parsing multipart uploadID under site replicated setup (#16048 ) continue the fix from #16034	2022-11-10 16:17:45 -08:00
Klaus Post	5b242f1d11	Add Audit target metrics (#16044 )	2022-11-10 10:20:21 -08:00
Poorna	34d28dd79f	replication: Avoid blocking on mrf save (#16045 )	2022-11-10 10:20:02 -08:00
Krishnan Parthasarathi	6eef9b4a23	lifecycle: simplify Eval and HasActiveRules (#16036 )	2022-11-10 07:17:45 -08:00
Aditya Manthramurthy	5f1999cc71	fix: avoid URL unsafe chars in multipart upload ID (#16034 )	2022-11-09 16:41:16 -08:00
Krishnan Parthasarathi	40a2c6b882	Return remote tier as StorageClass for transitioned objects (#16035 )	2022-11-09 15:57:34 -08:00
jiuker	7b7356f04c	close the reader under disk cache bitrot verification (#16024 )	2022-11-09 04:20:11 -08:00
Klaus Post	bbc312fce6	Add notification queue metrics (#16026 )	2022-11-08 16:36:47 -08:00
Anis Elleuch	7260241511	Remove some logs caused by external apps (#16027 )	2022-11-08 13:29:05 -08:00
Anis Elleuch	3b1a9b9fdf	Use the same lock for the scanner and site replication healing (#15985 )	2022-11-08 08:55:55 -08:00
Harshavardhana	72afc2727a	rebalance status must return appropriate error initially (#16022 )	2022-11-08 07:56:45 -08:00
Aditya Manthramurthy	76d822bf1e	Add LDAP policy entities API (#15908 )	2022-11-07 14:35:09 -08:00
Klaus Post	ddeca9f12a	fix: filter rest errors and logs returned (#16019 )	2022-11-07 10:38:08 -08:00
Harshavardhana	1f3db03bf0	allow changing argument for path for SNSD setup (#16013 )	2022-11-07 00:11:58 -08:00
Harshavardhana	944c62daf4	skip flaky tests on windows OS (#16015 )	2022-11-07 00:11:21 -08:00
Harshavardhana	9547b7d0e9	add deadlineConnections on remoteTransport (#16010 )	2022-11-05 11:09:21 -07:00
Klaus Post	808ecfe0f2	merge versions across sets when listing (#16003 )	2022-11-04 11:33:22 -07:00
Klaus Post	2894dd4d1a	fix: hold lock while serializing replication stats (#16007 )	2022-11-04 09:59:14 -07:00
jiuker	fd8750e959	fix: http body must be drained in downloadBinary() (#16001 )	2022-11-04 08:22:38 -07:00
Poorna	4f5d38a4b1	site replication edit: validate endpoint belongs to deployment (#16000 )	2022-11-03 16:23:45 -07:00
Anis Elleuch	7e73fc2870	Implement inspect data API v2 (#15474 ) Co-authored-by: Klaus Post <klauspost@gmail.com>	2022-11-02 13:36:38 -07:00
Harshavardhana	0d49b365ff	converge SNSD deployments into single code (#15988 )	2022-11-01 16:41:01 -07:00
Anis Elleuch	7721595aa9	config: Deprecated delay/max_wait/scanner and introduce speed (#15941 )	2022-11-01 08:04:07 -07:00
Harshavardhana	fd6f6fc8df	cleanup stale parent multipart directories (#15980 )	2022-11-01 08:00:02 -07:00
Aditya Manthramurthy	4fb47cd568	fix: update admin IDP APIs to be more RESTful (#15896 )	2022-10-31 14:52:26 -07:00
Klaus Post	ecc932d5dd	Clean entire tmp-old on restart (#15979 )	2022-10-31 07:27:50 -07:00
Harshavardhana	b57fbff7c1	ignore background healInfo in single drive setup (#15968 )	2022-10-31 07:26:10 -07:00
Poorna	d765b89a63	improve validation for replication resync API (#15964 )	2022-10-28 23:21:33 -07:00
Harshavardhana	6e4acf0504	add a message of removal for gateway and hide the command (#15965 )	2022-10-28 14:11:20 -07:00
Klaus Post	71954faa3a	mark pubsub type safe via generics (#15961 )	2022-10-28 10:55:42 -07:00
Klaus Post	0f0e154315	fix: inconsistent replication delete marker timestamps (#15956 )	2022-10-27 09:46:52 -07:00
Harshavardhana	136d41775f	remove numAvailableDisks check as it doesn't serve any purpose (#15954 )	2022-10-27 09:05:24 -07:00
Harshavardhana	ec77d28e62	make subnet subsys dynamic and simplify callhome (#15927 )	2022-10-27 00:20:01 -07:00
Klaus Post	86420a1f46	Store multipart checksums (#15953 )	2022-10-26 18:14:58 -07:00
Poorna	7dd8b6c8ed	ensure ILM expiry creates non null deleteMarker for versioned bucket (#15947 )	2022-10-26 16:09:27 -07:00
Anis Elleuch	533c9d4fe3	fix: lockName to disallow parallel same erasure set healing (#15951 )	2022-10-26 12:43:54 -07:00
Anis Elleuch	a35ef155fc	return appropriate error status code in the lock handler (#15950 )	2022-10-26 09:51:26 -07:00
Poorna	8dd3c41b2a	allow MakeBucket errors to be handled lazily (#15945 ) remote error is not required to be passed back to the client - this is mostly because we have healing that should eventually catch up on this and heal the bucket.	2022-10-25 23:32:37 -07:00
Krishnan Parthasarathi	4523da6543	feat: introduce pool-level rebalance (#15483 )	2022-10-25 12:36:57 -07:00
Poorna	ce8456a1a9	proxy multipart to peers via multipart uploadID (#15926 )	2022-10-25 10:52:29 -07:00
Poorna	9ce1884732	reject editing bucket replication config when site replication is enabled (#15937 )	2022-10-24 20:24:32 -07:00
Harshavardhana	23b329b9df	remove gateway completely (#15929 )	2022-10-24 17:44:15 -07:00
Krishnan Parthasarathi	0c34e51a75	Filter out tiering metadata during CopyObject (#15936 )	2022-10-24 16:32:31 -07:00
Anis Elleuch	fc6c794972	Audit dangling object removal (#15933 )	2022-10-24 11:35:07 -07:00
Klaus Post	86d543d0f6	Check for s3zip content offset (#15924 )	2022-10-21 15:37:48 -07:00
Poorna	e4e90b53c1	fix: delete-marker replication check properly (#15923 )	2022-10-21 14:45:06 -07:00
Anis Elleuch	58d776daa0	Set CONSOLE_MINIO_SERVER to 127.0.0.1 by default (#15887 )	2022-10-21 14:42:28 -07:00
Krishnan Parthasarathi	f6b2e89109	Pass encrypted etag as is for immediate tiering (#15925 )	2022-10-21 14:40:50 -07:00
Anis Elleuch	ac85c2af76	lifecycle: refactor rules filtering and tagging support (#15914 )	2022-10-21 10:46:53 -07:00
Shireesh Anjal	5aba2aedb3	Do not freeze s3 traffic in healthinfo api (#15912 )	2022-10-21 00:34:32 -07:00
Harshavardhana	a8332efa94	fix: Delete() of bucket metadata should not parse the config (#15904 )	2022-10-19 17:55:09 -07:00
Aditya Manthramurthy	3dbef72dc7	fix: AccountInfo API for roleARN based accounts (#15907 )	2022-10-19 17:54:41 -07:00
Aditya Manthramurthy	2d16e74f38	Add LDAP IDP Configuration APIs (#15840 )	2022-10-19 11:00:10 -07:00
Anis Elleuch	de5070446d	Deprecate --listeners flag (#15900 )	2022-10-19 08:45:50 -07:00
Harshavardhana	374abd1e7d	add filter support for tags and metadata in batch replication (#15885 )	2022-10-18 21:22:21 -07:00
Anis Elleuch	0506d9e83d	storage: Return errDiskNotFound when a peer is during shutdown (#15868 )	2022-10-18 13:50:46 -07:00
Klaus Post	bd3dfad8b9	Add concurrent Snowball extraction + options (#15836 )	2022-10-18 13:50:21 -07:00
Harshavardhana	9fff315555	do not need to trace ignored objects (#15894 )	2022-10-18 13:47:55 -07:00
Anis Elleuch	18fb86b7be	convert context.DeadlineExceed to offline disk in DiskInfo() (#15886 )	2022-10-18 03:01:16 -07:00
Harshavardhana	58a8275e84	do not assume invalid buf to be non-xl.meta (#15843 )	2022-10-17 09:39:21 -07:00
Aditya Manthramurthy	85fc7cea97	Pass role ARN for OIDC providers to console (#15862 )	2022-10-15 12:57:03 -07:00
Harshavardhana	328d660106	support CRC32 Checksums on single drive setup (#15873 )	2022-10-15 11:58:47 -07:00
Harshavardhana	c68910005b	validate bucket before attempting batch replication (#15861 )	2022-10-15 11:58:31 -07:00
Harshavardhana	c79bcc8838	Revert "convert context.DeadlineExceed to offline disk in DiskInfo() (#15869 )" This reverts commit `0fe58dbb34`.	2022-10-14 20:37:50 -07:00
Anis Elleuch	0fe58dbb34	convert context.DeadlineExceed to offline disk in DiskInfo() (#15869 )	2022-10-14 19:32:13 -07:00
Harshavardhana	6cb2f56395	Revert "Revert "tests: Add context cancelation (#15374 )"" This reverts commit `564a0afae1`.	2022-10-14 03:08:40 -07:00
Harshavardhana	59e33b3b21	validate setBucketTarget properly as per BucketExists() call (#15860 )	2022-10-13 17:46:49 -07:00
Poorna	0e3c92c027	attempt delete marker replication after object is replicated (#15857 ) Ensure delete marker replication success, especially since the recent optimizations to heal on HEAD, LIST and GET can force replication attempts on delete marker before underlying object version could have synced.	2022-10-13 17:45:23 -07:00
Anis Elleuch	db7a9b2c37	heal-info: Return the endpoint of a disk with unknown state (#15854 )	2022-10-13 16:41:44 -07:00
Harshavardhana	44097faec1	support deleteMarkers and all versions in batch replication (#15858 )	2022-10-13 14:42:10 -07:00
Klaus Post	bf3da5081f	Omit empty checksums in responses (#15850 )	2022-10-13 00:49:46 -07:00
Harshavardhana	5532982857	do not disable IsKubernetes(), IsDocker() checks with MINIO_CI_CD (#15852 )	2022-10-12 23:40:48 -07:00
Anis Elleuch	783dd875f7	refactor objectQuorumFromMeta() to search for parity quorum (#15844 )	2022-10-12 16:42:45 -07:00
Harshavardhana	97112c69be	fix: replication stats() to not crash under any situation (#15851 ) Co-authored-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>	2022-10-12 15:47:41 -07:00
Javier Adriel	2939000342	Add metrics, version and apis handlers (#15839 )	2022-10-12 12:08:03 -07:00
Harshavardhana	41e1654f9a	remove spurious logging for object not found (#15842 )	2022-10-12 04:28:21 -07:00
Harshavardhana	e3cb0278ce	honor specified target prefix under batch replication (#15834 )	2022-10-11 14:36:06 -07:00
Harshavardhana	0c81f1bdb3	indicate how long it took to bring the drive online (#15835 )	2022-10-11 11:33:56 -07:00
Klaus Post	6220875803	Add missing server info fields (#15826 )	2022-10-11 11:31:26 -07:00
Aditya Manthramurthy	64cf887b28	use LDAP config from minio/pkg to share with console (#15810 )	2022-10-07 22:12:36 -07:00
Harshavardhana	927a879052	authenticate the request first for headObject() (#15820 )	2022-10-07 21:45:53 -07:00
Anis Elleuch	dfe0c96b87	preserve Version and DeleteMarker sort order in the list XML response (#15819 )	2022-10-07 16:12:36 -07:00
Anis Elleuch	e856e10ac2	ignore VersionNotFound in addition to ObjectNotFound while replicating (#15814 )	2022-10-07 16:11:41 -07:00
Harshavardhana	928feb0889	remove unused debug param from evalActionFromLifecycle (#15813 )	2022-10-07 10:24:12 -07:00
Anis Elleuch	158d0e26a2	decom: Ignore object/version error during deletion (#15806 )	2022-10-06 09:41:58 -07:00
Harshavardhana	78385bfbeb	set bucket creation timestamp properly for legacy FS backend (#15800 )	2022-10-06 02:46:31 -07:00
Harshavardhana	2a13cc28f2	feat: implement support batch replication (#15554 )	2022-10-05 23:00:43 -07:00
Lenin Alevski	4bdf41a6c7	Removing unused getUpdateReaderFromFile function (#15794 ) Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>	2022-10-05 07:58:27 -07:00
Klaus Post	3c605c93fe	warn when 0 parity has been set as default parity (#15790 )	2022-10-04 22:41:42 -07:00
Anis Elleuch	121f18a443	Use admin request check for ReplicationDiff handler (#15793 )	2022-10-04 17:47:31 -07:00
Harshavardhana	538aeef27a	fix: heal service accounts for LDAP users in site replication (#15785 )	2022-10-04 10:41:47 -07:00
Poorna	be0d2537b7	site replication: fix typo in meta collection (#15792 )	2022-10-04 10:19:17 -07:00
Javier Adriel	3307aa1260	Implement KMS handlers (#15737 )	2022-10-04 10:05:09 -07:00
Harshavardhana	57cfdfd8fb	remove 'perf' tests from health diagnostics (#15780 )	2022-10-03 00:18:41 -07:00
Harshavardhana	f696a221af	allow tagging policy condition for GetObject (#15777 )	2022-10-02 12:29:29 -07:00
Harshavardhana	2aac50571d	fix: de-duplicate conflicting object names on namespace (#15772 )	2022-09-30 15:44:21 -07:00
Shireesh Anjal	45edd27ad7	Re-load config after 'mc admin config reset' (#15771 )	2022-09-30 10:55:53 -07:00
Daryl White	d44f3526dc	Update links to documentation site (#15750 )	2022-09-28 21:28:45 -07:00
Harshavardhana	41b633f5ea	support tagging based policy conditions (#15763 )	2022-09-28 11:25:46 -07:00
Anis Elleuch	86bb48792c	non-blocking initialization of bucket target notifications (#15571 )	2022-09-27 17:23:28 -07:00
Harshavardhana	94dbb4a427	fix: generalize SC config and also skip healing sub-sys under SD (#15757 )	2022-09-26 09:04:54 -07:00
Anis Elleuch	048a46ec2a	Add RPC tcp timeout/errs and AVG duration to prometheus (#15747 )	2022-09-26 09:04:26 -07:00
Poorna	8ea6fb368d	Add auto configuration of replication workers (#15636 )	2022-09-24 16:20:28 -07:00
Harshavardhana	b04c0697e1	validate correct ETag for the parts sent during CompleteMultipart (#15751 )	2022-09-23 21:17:08 -07:00
Harshavardhana	50a8ba6a6f	fix: parse and save retainUntilDate in correct time format (#15741 )	2022-09-23 08:49:27 -07:00
Anis Elleuch	20c89ebbb3	freeze before exit when _MINIO_DEBUG_NO_EXIT is defined (#15709 ) this is to ensure keep k8s pods running, when they reach a "crashloop" stage	2022-09-22 11:57:27 -07:00
Krishnan Parthasarathi	6f56ba80b3	lifecycle: Assign unique id to rules with empty id (#15731 )	2022-09-22 10:51:54 -07:00
Anis Elleuch	6e84283c66	fix: ignoring O_DIRECT in case of erasure single disk (#15734 ) fixes #15733 fixes #15735	2022-09-22 10:41:06 -07:00
Harshavardhana	9d6fddcfdf	persist the non-default creds in config (#15711 )	2022-09-21 16:14:47 -07:00
jiuker	749ce107ee	fix: context leak with replication endpoint heartbeat (#15721 )	2022-09-21 03:08:45 -07:00
Poorna	aec2aa3497	site replication: clear config if remove --all specified (#15716 )	2022-09-20 14:32:23 -07:00
Klaus Post	ff12080ff5	Remove deprecated io/ioutil (#15707 )	2022-09-19 11:05:16 -07:00
Minio Trusted	d89f6af6c4	avoid replication stats crash in Prometheus	2022-09-16 17:09:45 -07:00
Harshavardhana	2c68a19dfd	upgrade all deps and update CREDITS (#15650 )	2022-09-16 01:59:45 -07:00
Harshavardhana	9e5853ecc0	optimize double reads by reusing results from checkUploadIDExists() (#15692 ) Move to using `xl.meta` data structure to keep temporary partInfo, this allows for a future change where we move to different parts to different drives.	2022-09-15 12:43:49 -07:00
Harshavardhana	124544d834	add pre-conditions support for PUT calls during replication (#15674 ) PUT shall only proceed if pre-conditions are met, the new code uses - x-minio-source-mtime - x-minio-source-etag to verify if the object indeed needs to be replicated or not, allowing us to avoid StatObject() call.	2022-09-14 18:44:04 -07:00
Poorna	b910904fa6	change replication stats save path for windows (#15690 )	2022-09-14 13:49:13 -07:00
Klaus Post	eee1ce305c	When listing, do not count delete markers (#15689 ) When limiting listing do not count delete, since they may be discarded. Extend limit, since we may be discarding the forward-to marker. Fix directories always being sent to resolve, since they didn't return as match.	2022-09-14 12:11:27 -07:00
Klaus Post	5c61c3ccdc	Fix flaky TestGetObjectWithOutdatedDisks (#15687 ) On occasion this test fails: ``` 2022-09-12T17:22:44.6562737Z === RUN TestGetObjectWithOutdatedDisks 2022-09-12T17:22:44.6563751Z erasure-object_test.go:1214: Test 2: Expected data to have md5sum = `c946b71bb69c07daf25470742c967e7c`, found `7d16d23f07072af1a809707ba101ae07` 2 ``` Theory: Both objects are written with the same timestamp due to lower timer resolution on Windows. This results in secondary resolution, which is deterministic, but random. Solution: Instead of hacking in a wait we request the specific version we want. Should still keep the test relevant. Bonus: Remote action dependency for vulncheck	2022-09-14 08:17:39 -07:00
Poorna	a0fb0c1835	panic if replication config could not be read from disk (#15685 ) If replication config could not be read from bucket metadata for some reason, issue a panic so that unexpected replication outcomes can be avoided for replicated buckets. For similar reasons, adding a panic while fetching object-lock config if it failed for reason other than non-existence of config.	2022-09-13 21:23:33 -07:00
Aditya Manthramurthy	e152b2a975	Pass groups claim into condition values (#15679 ) This allows using `jwt:groups` as a multi-valued condition key in policies.	2022-09-13 09:45:36 -07:00
Poorna	6b9fd256e1	Persist in-memory replication stats to disk (#15594 ) to avoid relying on scanner-calculated replication metrics. This will improve the accuracy of the replication stats reported. This PR also adds on to #15556 by handing replication traffic that could not be queued by available workers to the MRF queue so that entries in `PENDING` status are healed faster.	2022-09-12 12:40:02 -07:00
Klaus Post	ff9a74b91f	Add fast max-keys=1 support for Listing (#15670 ) Add a listing option to stop when the limit is reached. This can be used by stateless listings for fast results.	2022-09-09 08:13:06 -07:00
Harshavardhana	b579163802	limit number of buckets to 500k (#15668 ) 500k is a reasonable limit for any single MinIO cluster deployment, in future we may increase this value. However for now we are going to keep this limit.	2022-09-09 03:06:34 -07:00
Krishnan Parthasarathi	96bfa77856	serialize updates to healing tracker (#15647 ) When healing is parallelized by setting the `_MINIO_HEAL_WORKERS` environment variable, multiple goroutines may race while updating the disk's healing tracker. This change serializes only these concurrent updates using a channel. Note, the healing tracker is still not concurrency safe in other contexts.	2022-09-07 08:47:21 -07:00
Harshavardhana	8e997eba4a	fix: trigger Heal when xl.meta needs healing during PUT (#15661 ) This PR is a continuation of the previous change: instead of returning an error, trigger a spot heal on the 'xl.meta' and return only after the healing is complete. This allows for future GETs on the same resource to be consistent for any version of the object.	2022-09-07 07:25:39 -07:00
Harshavardhana	228c6686f8	allow non-standards fallback for all http.TimeFormats (#15662 ) fixes #15645	2022-09-07 07:24:54 -07:00
Harshavardhana	7776d064cf	allow non-standards fallback for Expires header (#15655 ) fixes #15645	2022-09-05 19:18:18 -07:00
Harshavardhana	2d9b5a65f1	verify RenameData() versions to be consistent (#15649 ) xl.meta gets written and never rolled back, however we definitely need to validate the state that is persisted on the disk. If there are inconsistencies: - on more than write quorum, we should return an error to the client - if write quorum was achieved but there are inconsistent xl.meta's, we should simply trigger an MRF on them	2022-09-05 16:51:37 -07:00
Shireesh Anjal	c240da6568	Reuse madmin.ClusterRegistrationInfo (#15654 ) The `clusterInfo` struct in admin-handlers is the same as madmin.ClusterRegistrationInfo, except for small differences in field names. Removing this and using madmin.ClusterRegistrationInfo in its place will help in the following ways: - The JSON payload generated by mc in case of cluster registration will be consistent (same keys) with cluster.info generated by minio as part of the profile and inspect zip - health-analyzer can parse the cluster.info using the same struct and won't have to define its own	2022-09-05 10:02:25 -07:00
Harshavardhana	157272dc5b	fix: use optimized json.NewEncoder instead for metrics (#15648 )	2022-09-05 08:06:35 -07:00
yudoutingle	f4c56026a2	fix: potential deadlock caused by unlocking a non-existing lock (#15635 )	2022-09-02 14:24:32 -07:00
Harshavardhana	37e3f5de10	do not print object not found errors in MRF healing (#15646 )	2022-09-02 14:22:40 -07:00
Harshavardhana	5ea629beb2	avoid printing io.ErrUnexpectedEOF for .metacache objects (#15642 )	2022-09-02 12:47:17 -07:00
Anis Elleuch	cf52691959	Save resync status in the backend using a last update timestamp (#15638 ) Currently, there is a short time window where the code is allowed to save the status of a replication resync. Currently, the window is `now.Sub(st.EndTime) <= resyncTimeInterval`. Also, any failure to write in the backend disks is not retried. Refactor the code a little bit to rely on the last timestamp of a successful write of the resync status of any given bucket in the backend disks.	2022-09-01 16:53:36 -07:00
Anis Elleuch	10e75116ef	Avoid replicating dirs in listing with replication enabled (#15641 ) When replication is enabled in a particular bucket, the listing will send objects to bucket replication, but it is also sending prefixes for non recursive listing which is useless and shows a lot of error logs. This commit will ignore prefixes.	2022-09-01 15:22:11 -07:00
Harshavardhana	f649968c69	tier: avoid stats infinite loop in forwardTo method (#15640 ) under some sequence of events following code would reach an infinite loop. ``` idx1, idx2 := 0, 1 for ; idx2 != idx1; idx2++ { fmt.Println(idx2) } ``` fixes #15639	2022-09-01 13:51:06 -07:00
Harshavardhana	bcedc2b0d9	fix: add healing metric type for heal tracing (#15631 ) changes the `heal.checkBucket` to `heal.Bucket` instead since the latter is more meaningful.	2022-08-31 12:28:03 -07:00
Klaus Post	8e4a45ec41	fix: encrypt checksums in metadata (#15620 )	2022-08-31 08:13:23 -07:00
Klaus Post	dec942beb6	feat: Add healing trace (#15616 )	2022-08-31 01:56:12 -07:00
Abirdcfly	d4e0f13bb3	chore: remove duplicate word in comments (#15607 ) Signed-off-by: Abirdcfly <fp544037857@gmail.com>	2022-08-30 08:26:43 -07:00
Anis Elleuch	1f28a3bb80	Avoid messages from go test output (#15601 ) A lot of warning messages are printed in CI/CD failures generated by go test. Avoid that by requiring at least Error level for logging when doing go test.	2022-08-30 08:23:40 -07:00
Krishnan Parthasarathi	3a1d3a7952	audit-log: Add time to get/restore object from remote-tier (#15602 )	2022-08-29 21:33:59 -07:00
Klaus Post	a9f1ad7924	Add extended checksum support (#15433 )	2022-08-29 16:57:16 -07:00
Poorna	929b9e164e	site replication: Avoid returning root svcacct info in sr metadata (#15608 ) Service accounts of root users should not be replicated.	2022-08-29 11:19:51 -07:00
Harshavardhana	97376f6e8f	improve performance for inlined data (#15603 ) inlined data often is bigger than the allowed O_DIRECT alignment, so potentially we can write 'xl.meta' without O_DSYNC and rely on O_DIRECT + fdatasync() instead. This PR allows O_DIRECT on inlined data that would gain the benefits of performing O_DIRECT, eventually performing an fdatasync() at the end. Performance boost can be observed here for small objects < 128KiB. The performance boost is mainly seen on HDD, and marginal on NVMe setups.	2022-08-29 11:19:29 -07:00
Febriananda Wida Pramudita	1f22a16b15	fix: endpoints for single local disks must retain port info (#15585 )	2022-08-26 12:53:15 -07:00
Harshavardhana	433b6fa8fe	upgrade golang-lint to the latest (#15600 )	2022-08-26 12:52:29 -07:00
Krishnan Parthasarathi	99fbfe2421	Add concurrency to healing objects on a fresh disk (#15575 )	2022-08-25 13:07:15 -07:00
Poorna	b1b6264bea	fix: validate deployment id when adding peer clusters (#15591 ) Fixes: #15573	2022-08-25 11:30:52 -07:00
Aditya Manthramurthy	18dffb26e7	Allow querying a single target in config get API (#15587 )	2022-08-25 00:17:05 -07:00
Harshavardhana	edba7c987b	fix: objects matching prefixes should not leave delete markers (#15586 ) This is needed to ensure that we do not leave prefixes where version is suspended, instead we never leave versions on these paths.	2022-08-24 13:46:29 -07:00
Anis Elleuch	b737c83a66	Ensure that only one node performs site replication healing (#15584 ) When a node finds a change in the other replication cluster and applies it to itself, it already notifies other peers. There is no need for all nodes in a given cluster to do site replication healing; one node is sufficient.	2022-08-24 13:46:09 -07:00
Anis Elleuch	97a6322de1	Fix regression in notifying peers about new policy mapping (#15583 ) Switch from mux.Vars() to r.Form to avoid the issue of missing arguments passed to LoadPolicyMappingHandler.	2022-08-24 12:34:52 -07:00
Klaus Post	037fe4afdc	Add listing block reuse (#15579 ) When streaming results, pool metadata slices when sent.	2022-08-24 09:11:16 -07:00
Aditya Manthramurthy	afbb63a197	Factor out external event notification funcs (#15574 ) This change moves external event notification functionality into `event-notification.go`. This simplifies notification related code.	2022-08-24 06:42:36 -07:00
Harshavardhana	8902561f3c	use new xxml for XML responses to support rare control characters (#15511 ) use new xxml/XML responses to support rare control characters fixes #15023	2022-08-23 17:04:11 -07:00
Anis Elleuch	b8cdf060c8	Properly replicate policy mapping for virtual users (#15558 ) Currently, replicating policy mapping for STS users does not work. Fix it by passing user type to PolicyDBSet.	2022-08-23 11:11:45 -07:00
Poorna	4155c5b695	replication: improve MRF healing. (#15556 ) This PR improves the replication failure healing by persisting most recent failures to disk and re-queuing them until the replication is successful. While this does not eliminate the need for healing during a full scan, queuing MRF vastly improves the ETA to keeping replicated buckets in sync as it does not wait for the scanner visit to detect unreplicated object versions.	2022-08-22 16:53:06 -07:00
Poorna	471467d310	fix: ensure metadata update happens after deletemarker replication (#15564 ) Fixes regression caused by #15521	2022-08-22 15:59:06 -07:00
Harshavardhana	ae4ee95d25	change default lock retry interval to 50ms (#15560 ) competing mutating calls on the same object in a versioned bucket may unexpectedly have higher delays. This can be reproduced with a replicated bucket by repeatedly overwriting and deleting the same object. For longer locks like scanner keep the 1sec interval	2022-08-19 16:21:05 -07:00
Harshavardhana	e9055e9ef7	fix: walk() should cancel itself upon context cancellation (#15553 ) This PR fixes possible leaks that may emanate from not listening on context cancelation or timeouts. ``` goroutine 60957610 [chan send, 16 minutes]: github.com/minio/minio/cmd.(erasureServerPools).Walk.func1.1.1(...) github.com/minio/minio/cmd/erasure-server-pool.go:1724 +0x368 github.com/minio/minio/cmd.listPathRaw({0x4a9a740, 0xc0666dffc0},... github.com/minio/minio/cmd/metacache-set.go:1022 +0xfc4 github.com/minio/minio/cmd.(erasureServerPools).Walk.func1.1() github.com/minio/minio/cmd/erasure-server-pool.go:1764 +0x528 created by github.com/minio/minio/cmd.(*erasureServerPools).Walk.func1 github.com/minio/minio/cmd/erasure-server-pool.go:1697 +0x1b7 ```	2022-08-18 17:49:08 -07:00
Harshavardhana	d350b666ff	feat: add idempotent delete marker support (#15521 ) The bottom line is delete markers are a nuisance, most applications are not version aware and this has simply complicated the version management. AWS S3 gave an unnecessary complication overhead for customers, they need to now manage these markers by applying ILM settings and clean them up on a regular basis. To make matters worse all these delete markers get replicated as well in a replicated setup, requiring two ILM settings on each site. This PR is an attempt to address this inferior implementation by deviating MinIO towards an idempotent delete marker implementation, i.e. MinIO will never create more than a single consecutive delete marker. This significantly reduces operational overhead by making versioning more useful for real data. This is an S3 spec deviation for pragmatic reasons.	2022-08-18 16:41:59 -07:00
Harshavardhana	895357607a	avoid using errors.As for 'errors.New' use errors.Is (#15549 ) Bonus: ignore coredns CVE, for now, there is no fix yet https://github.com/coredns/coredns/issues/5574	2022-08-18 11:10:49 -07:00
Harshavardhana	bf38c0c0d1	fix: increase concurrency of DeleteObjects() to N/10th (#15546 ) instead of keeping the value 10 and static, make the concurrency a function of incoming number of objects being deleted.	2022-08-18 09:33:56 -07:00
Poorna	21fe14201f	replication: centralize healthcheck for remote targets (#15516 ) This PR moves health check from minio-go client to being managed on the server. Additionally integrating health check into site replication	2022-08-16 17:46:22 -07:00
Harshavardhana	48640b1de2	fix: trim arn:aws:kms from incoming SSE aws-kms-key-id (#15540 )	2022-08-16 11:28:30 -07:00
Anis Elleuch	5682685c80	Introduce disk io stats metrics (#15512 )	2022-08-16 07:13:49 -07:00
Harshavardhana	c7d535c648	init console after IAM init() (#15531 ) fixes #15527	2022-08-13 12:54:41 -07:00
Aditya Manthramurthy	9986e103cf	Fix env var output in config get/export APIs (#15528 ) Fix a bug where env vars are not output when the config for the subsystem is specified solely via env vars.	2022-08-13 10:39:01 -07:00
Krishnan Parthasarathi	91e6af4470	Add trace support for decommissioning (#15502 ) * Add trace support for decommissioning * Add support for tracing errors during decommission	2022-08-10 12:46:45 -07:00
Shireesh Anjal	316c492842	Upgrade madmin-go to latest version (v1.4.15) (#15510 )	2022-08-10 07:36:13 -07:00
Harshavardhana	74418b542a	fix: incorrect context timeout during listPath() (#15509 ) This PR cleans up the listing code for single drive to ensure that we do not add an incorrect context timeout, while resuming the listing. fixes #15508	2022-08-10 07:35:29 -07:00
Poorna	172e63dbb6	fix: site replication group updates to set status correctly (#15507 ) Fixes: #15486	2022-08-09 15:17:43 -07:00
Poorna	21bf5b4db7	replication: heal proactively upon access (#15501 ) Queue failed/pending replication for healing during listing and GET/HEAD API calls. This includes healing of existing objects that were never replicated or those in the middle of a resync operation. This PR also fixes a bug in ListObjectVersions where lifecycle filtering should be done.	2022-08-09 15:00:24 -07:00
Harshavardhana	a406bb0288	restrict number of disks used for scanning buckets up to GOMAXPROCS (#15492 ) control scanner parallelism to avoid higher CPU usage on nodes that have more drives but an old CPU.	2022-08-08 16:16:44 -07:00
Harshavardhana	1823ab6808	LDAP/OpenID must be initialized IAM Init() (#15491 ) This allows for LDAP/OpenID to be non-blocking, allowing for unreachable Identity targets to be initialized in IAM.	2022-08-08 16:16:27 -07:00
Harshavardhana	8eec49304d	use logger.Info instead of logger.LogIf	2022-08-08 16:13:58 -07:00
Harshavardhana	ecdc2f2f5f	fix: maxConcurrent '0' is an invalid value (#15500 ) log and continue with defaults instead of crashing the service.	2022-08-08 15:18:45 -07:00
Harshavardhana	e178c55bc3	remove non-working GetRawData() from FS mode (#15498 )	2022-08-08 11:34:09 -07:00
Poorna	2c137c0d04	fix: handle invalid endpoint errors in site replication (#15499 ) fixes #15497	2022-08-08 11:12:05 -07:00
Harshavardhana	638c57e466	revert changes in FS implementation for umask fixes #15494	2022-08-08 09:48:24 -07:00
Harshavardhana	5e4213b3be	fix: keep writing previous speedtest result (#15484 ) when object speedtest is running keep writing previous speedtest result back to client until we have a new result - this avoids sending back blank entries in between the speedtest when it is running in 'autotune' mode.	2022-08-07 23:04:03 -07:00
Harshavardhana	e0b0a351c6	remove IAM old migration code (#15476 ) ``` commit `7bdaf9bc50` Author: Aditya Manthramurthy <donatello@users.noreply.github.com> Date: Wed Jul 24 17:34:23 2019 -0700 Update on-disk storage format for users system (#7949) ``` Bonus: fixes a bug when etcd keys were being re-encrypted.	2022-08-05 17:53:23 -07:00
Anis Elleuch	1d2ff46a89	Ensure lock/versioning permissions when creating a bucket (#15432 ) Currently, the code doesn't check if the user creating a bucket with locking feature has bucket locking and versioning permissions enabled, adding it in accordance with S3 spec. https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateBucket.html Object Lock - If ObjectLockEnabledForBucket is set to true in your CreateBucket request, s3:PutBucketObjectLockConfiguration and s3:PutBucketVersioning permissions are required.	2022-08-05 16:27:09 -07:00
Harshavardhana	8f7c739328	feat: add SpeedTest ResponseTimes and TTFB (#15479 ) Capture average, p50, p99, p999 response times and ttfb values. These are needed for latency measurements and overall understanding of our speedtest results.	2022-08-05 09:40:03 -07:00
Poorna	1beea3daba	fix: bucket metadata import to return a summary (#15462 )	2022-08-05 01:52:50 -07:00
Aditya Manthramurthy	3d94c38ec4	Add env variables to configuration APIs output (#15465 ) Config export and config get APIs now include environment variables set on the server	2022-08-04 22:21:52 -07:00
Harshavardhana	f4af2d3cdc	fix: decodeDirObject() in single drive DeleteObjects() call (#15477 ) Thanks to @bh4t for reproducing this issue.	2022-08-04 18:57:43 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464 )	2022-08-04 16:10:08 -07:00
Anis Elleuch	e93867488b	actively cancel listIAMConfigItems to avoid goroutine leak (#15471 ) listConfigItems creates a goroutine but sometimes callers will exit without properly asking listAllIAMConfigItems() to stop sending results, hence a goroutine leak. Create a new context and cancel it for each listAllIAMConfigItems call.	2022-08-04 13:20:43 -07:00
Harshavardhana	3bd9615d0e	fix: log if there is readDir() failure with ListBuckets (#15461 ) This is actionable and must be logged. Bonus: also honor umask by using 0o666 for all Open() syscalls.	2022-08-04 07:23:05 -07:00
Harshavardhana	a6e0ec4e6f	Add support for converting non-inlined to inlined (#15444 ) This is a feature to allow for inode compaction on large clusters that use a lot of small files spread across a large hierarchy.	2022-08-02 23:10:22 -07:00
Andreas Auernhammer	d774a3309b	kes: automatically reload KES client certificate (#15450 ) This commit adds support for automatically reloading the MinIO client certificate for authentication to KES. The client certificate will now be reloaded: - when the private key / certificate file changes - when a SIGHUP signal is received - every 15 minutes Fixes #14869 Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-08-02 16:58:09 -07:00
Anis Elleuch	b3edb25377	bloom: healObject to mark a path dirty only for dangling objects (#15458 ) The path is marked dirty automatically when healObject() is called, which is wrong. HealObject() is called during self-healing and this will lead to an increase in the false positive result of the bloom filter. Also move NSUpdated() from renameData() and call it directly in CompleteMultipart and PutObject, this is not a functional change but it will make it less prone to errors in the future.	2022-08-02 16:57:39 -07:00
Harshavardhana	53a816b17a	fix: readdir fallback on root of the drive (#15457 ) fixes #15452	2022-08-02 14:57:36 -07:00
Harshavardhana	043aaa792d	fix: instrument os.OpenFile differently for Reads and Writes (#15449 ) allows us to trace latency for READs or WRITEs	2022-08-01 13:22:43 -07:00
Shireesh Anjal	e6eab2091f	fix: Incorrect ServersCount in cluster.info (#15431 ) The `ServersCount` field in cluster.info is expected to contain the number of nodes, and not number of endpoints.	2022-07-29 22:21:40 -07:00
Harshavardhana	3cdb609cca	allow root users to return appropriate policy in AccountInfo (#15437 ) fixes #15436 This fixes a regression caused after the removal of "consoleAdmin" policy usage for 'root users' in PR #15402	2022-07-29 20:58:03 -07:00
Harshavardhana	aa874010e2	fix: regression in resolving the right versions (#15430 ) fix: regression in resolving right versions commit `d480022711` caused a regression in real resolver, by picking up incorrect versionID.	2022-07-29 10:03:53 -07:00
Cesar Celis Hernandez	8ec888d13d	feat: update binary once and push it to other servers (#15407 )	2022-07-29 08:34:30 -07:00
Harshavardhana	916f274c83	choose starting concurrency based on number of local disks (#15428 ) smaller setups may have fewer drives per server; choose the concurrency based on the number of local drives, and let the MinIO server change the overall concurrency as necessary.	2022-07-29 00:00:06 -07:00
Aditya Manthramurthy	7ac53c07af	fix: passing application configuration to console (#15409 ) This is an update to MinIO server after swagger codegen related build fixes added after issues introduced in `39fd7b0b3b`	2022-07-28 18:30:24 -07:00
Harshavardhana	bc72e4226e	do not allow filesystem fallback in server download (#15429 ) It is possible for anyone with admin access to get the content of any OS location relatively easily by simply providing the file path, as in `mc admin update alias/ /etc/passwd`. The workaround is to disable the 'admin:ServiceUpdate' action. Everyone is advised to upgrade to this patch. Thanks to @alevsk for finding this bug.	2022-07-28 17:44:21 -07:00
Poorna	5e0776e96a	replication: Include replica object versions for resync (#15427 )	2022-07-28 13:43:02 -07:00
Anis Elleuch	2f1ef02d35	Do not update directory access time (#15426 ) Most setups will have relatime, which only updates the access time following a change in the directory.	2022-07-28 12:40:48 -07:00
Harshavardhana	aff236e20e	fix: cluster healthcheck for single drive setups (#15415 ) single drive setups must return '200 OK' if drive is accessible, current master returns '503'	2022-07-27 16:46:34 -07:00
Harshavardhana	cbd70d26b5	optimize speedtest for smaller setups (#15414 ) it has been observed in multiple environments with small setups that `speedtest` naturally fails with the default '10s', and that the concurrency of '32' is too big for such clusters. Choose a smaller value, i.e. equal to the number of drives in such clusters, and let 'autotune' increase the concurrency instead.	2022-07-27 14:41:59 -07:00
Harshavardhana	5e763b71dc	use logger.LogOnce to reduce printing disconnection logs (#15408 ) fixes #15334 - re-use net/url parsed value for http.Request{} - remove gosimple, structcheck and unused due to https://github.com/golangci/golangci-lint/issues/2649 - unwrapErrs up to leafErr to ensure that we store exactly the correct errors	2022-07-27 09:44:59 -07:00
Aditya Manthramurthy	7e4e7a66af	Remove internal usage of consoleAdmin (#15402 ) "consoleAdmin" was used as the policy for root derived accounts, but this led to unexpected bugs when an administrator modified the consoleAdmin policy. This change avoids evaluating a policy for root derived accounts as by default no policy is mapped to the root user. If a session policy is attached to a root derived account, it will be evaluated as expected.	2022-07-26 19:06:55 -07:00
Shireesh Anjal	906947a285	fix: typo in json key ClusterInfo DeploymentID (#15406 ) deployement_id -> deployment_id	2022-07-26 19:05:33 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Anis Elleuch	e4b51235f8	upgrade: Split in two steps to ensure a stable retry (#15396 ) Currently, if one server in a distributed setup fails to upgrade for any reason, it is not possible to upgrade again unless nodes are restarted. To fix this, split the upgrade process into two steps: - download the new binary on all servers - If successful, overwrite the old binary with the new one	2022-07-25 17:49:47 -07:00
Eng Zer Jun	0a3b1ad4eb	test: use `T.TempDir` to create temporary test directory (#15400 ) This commit replaces `ioutil.TempDir` with `t.TempDir` in tests. The directory created by `t.TempDir` is automatically removed when the test and all its subtests complete. Prior to this commit, a temporary directory created using `ioutil.TempDir` needed to be removed manually by calling `os.RemoveAll`, which is omitted in some tests. The error handling boilerplate e.g. defer func() { if err := os.RemoveAll(dir); err != nil { t.Fatal(err) } }() is also tedious, but `t.TempDir` handles this for us nicely. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-07-25 12:37:26 -07:00
Anis Elleuch	f23f442d33	Add cluster info to inspect/profiling archive (#15360 ) Add cluster info to inspect and profiling archive. In addition to the existing data generation for both inspect and profiling, a cluster.info file is added. The latter contains some info about the cluster. The generation of cluster.info is done as the last step and it can fail if it exceeds 10 seconds.	2022-07-25 09:11:35 -07:00
Klaus Post	3795b2c8ba	Add compression scheme to header (#15395 ) For easier debugging. We still do not return compressed size for security reasons.	2022-07-24 07:15:49 -07:00
Harshavardhana	7725425e05	fix: fork os.MkdirAll to optimize cases where parent exists (#15379 ) a/b/c/d/ where `a/b/c/` exists results in additional syscalls such as an Lstat() call to verify that `a/b/c/` exists and is a directory. We do not need to do this on MinIO: if the parent prefixes exist, we can simply return success without spending additional syscalls. Also this implementation attempts to simply use Access() calls to avoid os.Stat() calls since the latter does memory allocation for things we do not need to use. Access() is simpler since we have a predictable structure on the backend and we know exactly how our path structures are.	2022-07-24 00:43:11 -07:00
Aditya Manthramurthy	39fd7b0b3b	Pass multiple IDP config to console (#15270 ) This change passes multiple IDP config via a struct rather than env variables.	2022-07-22 15:28:02 -07:00
Harshavardhana	b0d70a0e5e	support additional claim info in Auditing STS calls (#15381 ) Bonus: Adds a missing AuditLog from AssumeRoleWithCertificate API Fixes #9529	2022-07-22 11:12:03 -07:00
Poorna	7d8c8de827	single drive: Remove bucket metadata on DeleteBucket (#15378 ) from disk and in-memory map	2022-07-21 19:51:53 -07:00
jiuker	3faef829c5	expect full quorum for writing 'format.json' everywhere (#15362 )	2022-07-21 18:04:17 -07:00
Poorna	7560fb6f9a	save IAM export assets relative at a folder prefix (#15355 )	2022-07-21 17:51:33 -07:00
Klaus Post	69bf39f42e	fix: make complete multipart uploads faster encrypted/compressed backends (#15375 ) - Only fetch the parts we need and abort as soon as one is missing. - Only fetch the number of parts requested by "ListObjectParts".	2022-07-21 16:47:58 -07:00
Minio Trusted	564a0afae1	Revert "tests: Add context cancelation (#15374 )" This reverts commit `1e332f0eb1`. Reverting this as tests are failing randomly.	2022-07-21 13:58:56 -07:00
Klaus Post	1e332f0eb1	tests: Add context cancelation (#15374 ) A huge number of goroutines would build up from various monitors. When creating test filesystems, provide a context so they can shut down when no longer needed.	2022-07-21 11:52:18 -07:00
Poorna	cab8d3d568	feat: add API to return list of objects waiting to be replicated (#15091 )	2022-07-21 11:05:44 -07:00
Klaus Post	be8c4cb24a	fix: support multiple validateAdminReq actions (#15372 ) handle multiple validateAdminReq actions and remove duplicate error responses.	2022-07-21 10:26:59 -07:00
Harshavardhana	65166e4ce4	fix: readQuorum calculation when defaultParityCount is 0 (#15363 ) when parity is '0' the readQuorum must be equal to the number of data disks.	2022-07-21 07:25:54 -07:00
Harshavardhana	d3f89fa6e3	remove unnecessary logs in IAM store (#15356 )	2022-07-20 08:19:12 -07:00
Harshavardhana	ce8397f7d9	use partInfo only for intermediate part.x.meta (#15353 )	2022-07-19 18:56:24 -07:00
Klaus Post	cae9aeca00	fix: reused field crash in PartIndices (#15351 ) `PartIndices` may be set if xlMetaV2Version is reused. Clear before unmarshaling and add sanity check when reading.	2022-07-19 16:49:46 -07:00
Klaus Post	f939d1c183	Independent Multipart Uploads (#15346 ) Do completely independent multipart uploads. In distributed mode, a lock was held to merge each multipart upload as it was added. This lock was highly contested and retries are expensive (timewise) in distributed mode. Instead, each part adds its metadata information uniquely. This eliminates the per object lock required for each to merge. The metadata is read back and merged by "CompleteMultipartUpload" without locks when constructing final object. Co-authored-by: Harshavardhana <harsha@minio.io>	2022-07-19 08:35:29 -07:00
Andreas Auernhammer	242d06274a	kms: add `context.Context` to KMS API calls (#15327 ) This commit adds a `context.Context` to the KMS `{Stat, CreateKey, GenerateKey}` API calls. The context will be used to terminate external calls as soon as the client request gets canceled. A follow-up PR will add a `context.Context` to the remaining `DecryptKey` API call. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-07-18 18:54:27 -07:00
Poorna	957e3ed729	export IAM: include site replicator svcacct (#15339 )	2022-07-18 17:38:53 -07:00
Harshavardhana	b6eb8dff64	Add decommission compression+encryption enabled tests (#15322 ) update compression environment variables to follow the expected sub-system style, however support fallback mode.	2022-07-17 08:43:14 -07:00
Harshavardhana	7da9e3a6f8	support encrypted/compressed objects properly during decommission (#15320 ) fixes #15314	2022-07-16 19:35:24 -07:00
Anis Elleuch	876970baea	Exclude upload-ids with incomplete part upload in multipart listing (#15318 ) Uploading a part object can leave an inconsistent state inside .minio.sys/multipart where data are uploaded but xl.meta is not committed yet. Do not list upload-ids that have this state in the multipart listing.	2022-07-16 13:25:58 -07:00
LHHDZ	e68e76e143	fix: data race, which caused tests execution to fail (#15313 )	2022-07-16 07:57:55 -07:00
Harshavardhana	e7ac1ea54c	allow decommission to continue when healing (#15312 ) Bonus: - heal buckets in case the new pools have buckets missing during startup.	2022-07-15 21:03:23 -07:00
Harshavardhana	5ac6d91525	support 'admin update' for hotfix versions (#15308 ) hotfixed versions are rejected as invalid, allow `mc admin update` from hotfix repos.	2022-07-15 16:00:34 -07:00
Harshavardhana	1cd6713e24	copy query values before update to preserve the expected keys (#15310 ) in success_action_redirect we were missing required query params as per S3 spec - updated tests.	2022-07-15 15:04:48 -07:00
Harshavardhana	1b339ea062	allow force delete on decom pool (#15302 ) Bonus: - skip suspended pool from being considered for multipart uploads - add more context for decomErrors()	2022-07-14 20:44:22 -07:00
Harshavardhana	236ef03dbd	fix: skip objects expired via lifecycle rules during decommission (#15300 )	2022-07-14 16:47:09 -07:00
Poorna	7e32a17742	fix: site replication healing of missing buckets (#15298 ) fixes a regression from #15186 - Adding tests to cover healing of buckets. - Also dereference quota in SiteReplicationStatus only when non-nil	2022-07-14 14:27:47 -07:00
Krishnan Parthasarathi	1d42133d44	listing: Expire object versions past expiry (#15287 ) We skip object versions which are past their ILM expiry. This change schedules them for expiry while at it.	2022-07-14 07:21:26 -07:00
Poorna	b4f6901903	resync: Avoid concurrent access/write on map (#15286 ) fixes a crash ``` fatal error: concurrent map iteration and map write minio[19309]: goroutine 18640 [running]: minio[19309]: runtime.throw({0x27a3399?, 0x1785?}) minio[19309]: runtime/panic.go:992 +0x71 fp=0xc0062f1c80 sp=0xc0062f1c50 pc=0x438671 minio[19309]: runtime.mapiternext(0xc0062f1e90?) minio[19309]: runtime/map.go:871 +0x4eb fp=0xc0062f1cf0 sp=0xc0062f1c80 pc=0x41002b minio[19309]: github.com/minio/minio/cmd.(*ReplicationPool).periodicResyncMetaSave(0xc0056c00c0, {0x4d06a48, 0xc0005b2480}, {0x4d22fc0, 0xc0015ea0 ```	2022-07-13 16:29:10 -07:00
Klaus Post	0149382cdc	Add padding to compressed+encrypted files (#15282 ) Add up to 256 bytes of padding for compressed+encrypted files. This will obscure the obvious cases of extremely compressible content and leave a similar output size for a very wide variety of inputs. This does not mean the compression ratio doesn't leak information about the content, but the outcome space is much smaller, so often less information is leaked.	2022-07-13 07:52:15 -07:00
Klaus Post	697c9973a7	Upgrade compression package (#15284 ) Includes mitigation for CVE-2022-30631 (Go should still be updated) Remove functions now available upstream.	2022-07-13 07:48:14 -07:00
Harshavardhana	788fd3df81	preserve incoming query params in success_action_redirect (#15280 ) fixes #15274	2022-07-13 07:46:44 -07:00
Anis Elleuch	996cac5fed	Avoid listing buckets from a suspended pool (#15283 ) Make-bucket requests sent after decommissioning has started do not create buckets in a suspended pool. Therefore listing buckets should avoid suspended pools as well.	2022-07-13 07:44:50 -07:00
Harshavardhana	0a8b78cb84	fix: simplify passing auditLog eventType (#15278 ) Rename Trigger -> Event to be a more appropriate name for the audit event. Bonus: fixes a bug in AddMRFWorker() it did not cancel the waitgroup, leading to waitgroup leaks.	2022-07-12 10:43:32 -07:00
Harshavardhana	b4eb74f5ff	allow custom speedtest bucket (#15271 ) this allows for specifying existing buckets with - object replication enabled - object encryption enabled - object versioning enabled - object locking enabled	2022-07-12 10:12:47 -07:00
Anis Elleuch	57d1f31054	Do not log erasure read failure when disk goes offline (#15277 ) Avoid printing the following log ``` API: SYSTEM Time: Fri Jul 08 2022 11:48:40 GMT+0100 Error: Error(disk not found) reading erasure shards at... Backtrace: 0: internal/logger/logger.go:278:logger.LogIf() 1: cmd/bitrot-streaming.go:156:cmd.(streamingBitrotReader).ReadAt() 2: cmd/erasure-decode.go:165:cmd.(parallelReader).Read.func1() ```	2022-07-12 09:56:56 -07:00
Klaus Post	9f02f51b87	Add 4K minimum compressed size (#15273 ) There is no point in compressing very small files. Typically the effective size on disk will be the same due to disk blocks. So don't waste resources on extremely small files. We don't check on multipart. 1) because we don't know and 2) this is very likely a big object anyway.	2022-07-12 07:42:04 -07:00
Klaus Post	911a17b149	Add compressed file index (#15247 )	2022-07-11 17:30:56 -07:00
Poorna	3d969bd2b4	fix: ignore missing targets/replication config during site removal (#15269 )	2022-07-11 14:11:46 -07:00
Andreas Auernhammer	f800cee4fa	metric: add KMS-related metrics (#15258 ) This commit adds a minimal set of KMS-related metrics: ``` # HELP minio_cluster_kms_online Reports whether the KMS is online (1) or offline (0) # TYPE minio_cluster_kms_online gauge minio_cluster_kms_online{server="127.0.0.1:9000"} 1 # HELP minio_cluster_kms_request_error Number of KMS requests that failed with a well-defined error # TYPE minio_cluster_kms_request_error counter minio_cluster_kms_request_error{server="127.0.0.1:9000"} 16790 # HELP minio_cluster_kms_request_success Number of KMS requests that succeeded # TYPE minio_cluster_kms_request_success counter minio_cluster_kms_request_success{server="127.0.0.1:9000"} 348031 ``` Currently, we report whether the KMS is available and how many requests succeeded/failed. However, KES exposes much more metrics that can be exposed if necessary. See: https://pkg.go.dev/github.com/minio/kes#Metric Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-07-11 09:17:28 -07:00
Praveen raj Mani	b49fc33cb3	purge objects immediately with `x-minio-force-delete` in DeleteObject and DeleteBucket API (#15148 )	2022-07-11 09:15:54 -07:00
Klaus Post	37a6b2da67	Allow compaction at bucket top level. (#15266 ) If more than 1M folders (objects or prefixes) are found at the top level in a bucket allow it to be compacted. While very suboptimal structure we should limit memory usage at some point.	2022-07-11 07:59:03 -07:00
Harshavardhana	913e977c8d	remove auto-port warning for console-address (#15260 )	2022-07-08 13:36:41 -07:00
Harshavardhana	c2ddcb3b40	do not recreate deprecated delete-journal.bin, only read it (#15185 ) simplify deprecated code, re-enable hot-swap replace disks	2022-07-08 12:17:02 -07:00
Anis Elleuch	ed0cbfb31e	fix: rootdisk detection by not using cached value when GetDiskInfo() errors out (#15249 ) GetDiskInfo() uses timedValue to cache the disk info for one second. timedValue behavior was recently changed to return an old cached value when calculating a new value returns an error. When a mount point is empty, GetDiskInfo() will return errUnformattedDisk, timedValue will return cached disk info with unexpected IsRootDisk value, e.g. false if the mount point belongs to a root disk. Therefore, the mount point will be considered a valid disk and will be formatted as well. This commit will also add more defensive code when marking root disks: always mark a disk offline for any GetDiskInfo() error except errUnformattedDisk. The server will try anyway to reconnect to those disks every 10 seconds.	2022-07-07 17:05:23 -07:00
Harshavardhana	32b2f6117e	fix: do not pass around sync.Map (#15250 ) it is not safe to pass around sync.Map through pointers, as it may be concurrently updated by different callers. this PR simplifies by avoiding sync.Map altogether, we do not need sync.Map to keep object->erasureMap association. This PR fixes a crash when concurrently using this value when audit logs are configured. ``` fatal error: concurrent map iteration and map write goroutine 247651580 [running]: runtime.throw({0x277a6c1?, 0xc002381400?}) runtime/panic.go:992 +0x71 fp=0xc004d29b20 sp=0xc004d29af0 pc=0x438671 runtime.mapiternext(0xc0d6e87f18?) runtime/map.go:871 +0x4eb fp=0xc004d29b90 sp=0xc004d29b20 pc=0x41002b ```	2022-07-07 17:04:25 -07:00
Harshavardhana	ae92521310	remove unnecessary nAgreed value in partial() func (#15242 )	2022-07-07 13:45:34 -07:00
Harshavardhana	5802df4365	retry and resume decom operation upon retriable failures (#15244 ) it is possible in a k8s-like system reading pool.bin might not have quorum during startup, however, add a way to retry after this failure.	2022-07-07 12:31:44 -07:00
Anis Elleuch	8d98282afd	Better reporting of total/free usable capacity of the cluster (#15230 ) The current code approximates capacity using a ratio. The approximation can be skewed if we have multiple pools with different disk capacities. Replace the algorithm with a simpler one which counts data disks and ignores parity disks.	2022-07-06 13:29:49 -07:00
Harshavardhana	3af6073576	no 'replicate status' without replication config (#15233 ) 'replicate status' shouldn't be displaying historic values unless replication config is present on the relevant bucket.	2022-07-06 09:53:33 -07:00
Harshavardhana	2518af5f9e	fix: allow certain mutations on objects during decommissioning (#15231 ) fix: allow certain mutation on objects during decommission currently by mistake deletion of objects was skipped, if the object resided on the pool being decommissioned. delete's are okay to be allowed since decommission is designed to run on a cluster with active I/O.	2022-07-06 09:53:16 -07:00
Harshavardhana	7b793d84c8	fix: calculate scanner metric paths for single drive (#15232 ) Additionally use pathJoin() to avoid double `//` in path names.	2022-07-06 07:48:38 -07:00
Aditya Manthramurthy	af9bc7ea7d	Add external IDP management Admin API for OpenID (#15152 )	2022-07-05 18:18:04 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
haslersn	df42914da6	Fix missing whitespace in error message for IncompleteBody (#15227 )	2022-07-05 12:19:57 -07:00
Klaus Post	2471bdda00	fix: for DiskInfo call cache disk metrics (#15229 ) Small uploads spend a significant amount of time (~5%) fetching disk info metrics. Also maps are allocated for each call. Add a 100ms cache to disk metrics.	2022-07-05 11:02:30 -07:00
Harshavardhana	9d80ff5a05	fix: decommission delete markers for non-current objects (#15225 ) versioned buckets were not creating the delete markers present in the versioned stack of an object; this would essentially prevent decommission from succeeding. This PR fixes creating such delete markers properly during a decommissioning process, and adds tests as well.	2022-07-05 07:37:24 -07:00
Harshavardhana	b311abed31	decom IAM, Bucket metadata properly (#15220 ) Current code incorrectly passed the config asset object name while decommissioning, make sure that we pass the right object name to be hashed on the newer set of pools. This PR fixes situations after a successful decommission, the users and policies might go missing due to wrong hashed set.	2022-07-04 14:02:54 -07:00
Harshavardhana	ce667ddae0	do not print errFileNotFound in entries.resolve() (#15216 )	2022-07-04 06:40:46 -07:00
Harshavardhana	0fee993a4b	return appropriate error under 'decom status' (#15213 ) fixes #15208	2022-07-01 16:21:23 -07:00
Poorna	0ea5c9d8e8	site healing: Skip stale iam asset updates from peer. (#15203 ) Allow healing to apply IAM change only when peer gave the most recent update.	2022-07-01 13:19:13 -07:00
Harshavardhana	63ac260bd5	Simplify Prometheus metrics gather (#15210 )	2022-07-01 13:18:39 -07:00
Harshavardhana	f9a4ad7904	update banner with version+runtime (#15206 )	2022-06-30 13:58:09 -07:00
Minio Trusted	e60b67d246	Revert "Tighten enforcement of object retention (#14993 )" This reverts commit `5e3010d455`. This commit causes a regression on object locked buckets, causing delete-markers to not be created.	2022-06-30 13:06:32 -07:00
Klaus Post	9004d69c6f	Make ReqInfo concurrency safe (#15204 ) Some read/writes of ReqInfo did not get appropriate locks, leading to races. Make sure reading and writing holds appropriate locks.	2022-06-30 10:48:50 -07:00
Harshavardhana	8856a2d77b	finalize startup-banner and remove unnecessary logs (#15202 )	2022-06-29 16:32:04 -07:00
Anis Elleuch	54a061bdda	Save minio version information centrally (#15181 )	2022-06-29 14:45:49 -07:00
Poorna	7cc9286e0f	site healing: Skip stale bucket metadata updates from peer (#15186 ) Allow healing to apply bucket metadata change only when peer gave the most recent update.	2022-06-28 18:09:20 -07:00
Harshavardhana	2f25639ea0	update banner to reflect the final agreed UI (#15192 )	2022-06-28 16:37:40 -07:00
Harshavardhana	2070c215a2	handle missing funcNames for handlers (#15188 ) also use designated names for internal calls - storageREST calls are storageR - lockREST calls are lockR - peerREST calls are just peer Named in this fashion to facilitate wildcard matches by having prefixes of the same name. Additionally, also enable funcNames for generic handlers that return errors, currently we disable '<unknown>'	2022-06-28 05:04:10 -07:00
Harshavardhana	9c605ad153	allow support for parity '0', '1' enabling support for 2,3 drive setups (#15171 ) allows for further granular setups - 2 drives (1 parity, 1 data) - 3 drives (1 parity, 2 data) Bonus: allows '0' parity as well.	2022-06-27 20:22:18 -07:00
Anis Elleuch	b7c7e59dac	Revert proxying requests with precondition errors (#15180 ) In a replicated setup, when an object is updated in one cluster but still waiting to be replicated to the other cluster, GET requests with if-match, and range headers will likely fail. It is better to proxy requests instead. Also, this commit avoids printing verbose logs about precondition & range errors.	2022-06-27 14:03:44 -07:00
Harshavardhana	699cf6ff45	perform object sweep after enqueue of the latest CopyObject() (#15183 ) keep it similar to PutObject/CompleteMultipart	2022-06-27 12:11:33 -07:00
Anis Elleuch	9201870f6c	Remove unnecessary code in WalkDir() (#15168 ) Recalculating forward is useless. It is never used and it will be computed again when calling scanDir() again.	2022-06-27 10:26:56 -07:00
Harshavardhana	6722f58668	save MinIO version with each version (8-bytes extra) (#15170 ) store MinIO version along with each version in 'xl.meta' for future purposes, can be used as ways to add specific code for bug fixes if any.	2022-06-27 03:59:41 -07:00
Harshavardhana	7b9b7cef11	add license banner for GNU AGPLv3 (#15178 ) Bonus: rewrite subnet re-use of Transport	2022-06-27 03:58:25 -07:00
Harshavardhana	bd099f5e71	fix: change timedValue to return the previously cached value (#15169 ) fix: change timedvalue to return previous cached value caller can interpret the underlying error and decide accordingly, places where we do not interpret the errors upon timedValue.Get() - we should simply use the previously cached value instead of returning "empty". Bonus: remove some unused code	2022-06-25 08:50:16 -07:00
Klaus Post	baf257adcb	fix: health client leak when calling UpdateAllTargets (#15167 ) When `LoadBucketMetadataHandler` is called and `UpdateAllTargets` gets called. Since targets are rebuilt we cancel all.	2022-06-24 11:12:52 -07:00
Anis Elleuch	4fd1986885	Trace all http requests (#15064 ) Add a generic handler that adds a new tracing context to the request if tracing is enabled. Other handlers are free to modify the tracing context to update information on the fly, such as, func name, enable body logging etc.. With this commit, requests like this ``` curl -H "Host: ::1:3000" http://localhost:9000/ ``` will be traced as well.	2022-06-23 23:19:24 -07:00
Harshavardhana	e1afac9439	reduce sha256 CPU usage by turning it off for speedtests (#15154 ) continuation of the PR #15151, keeping signature v4 for the headers however avoiding sha256 for the body.	2022-06-23 11:26:53 -07:00
Poorna	580d9db85e	Add APIs to import/export IAM data (#15014 )	2022-06-23 09:25:15 -07:00
Anis Elleuch	42e2fd35d8	heal: Include dir markers when healing a fresh disk (#15158 ) Directory markers are not healed when healing a new fresh disk. A proper fix would be moving object names encoding/decoding to erasure object level but it is too late now since the object to set distribution is calculated at a higher level.	2022-06-23 06:47:33 -07:00
Harshavardhana	1a40c7c27c	use signature-v2 for 'object perf' tests to avoid CPU using sha256 (#15151 ) It is observed in a local 8 drive system the CPU seems to be bottlenecked at ``` (pprof) top Showing nodes accounting for 1385.31s, 88.47% of 1565.88s total Dropped 1304 nodes (cum <= 7.83s) Showing top 10 nodes out of 159 flat flat% sum% cum cum% 724s 46.24% 46.24% 724s 46.24% crypto/sha256.block 219.04s 13.99% 60.22% 226.63s 14.47% syscall.Syscall 158.04s 10.09% 70.32% 158.04s 10.09% runtime.memmove 127.58s 8.15% 78.46% 127.58s 8.15% crypto/md5.block 58.67s 3.75% 82.21% 58.67s 3.75% github.com/minio/highwayhash.updateAVX2 40.07s 2.56% 84.77% 40.07s 2.56% runtime.epollwait 33.76s 2.16% 86.93% 33.76s 2.16% github.com/klauspost/reedsolomon._galMulAVX512Parallel84 8.88s 0.57% 87.49% 11.56s 0.74% runtime.step 7.84s 0.5% 87.99% 7.84s 0.5% runtime.memclrNoHeapPointers 7.43s 0.47% 88.47% 22.18s 1.42% runtime.pcvalue ``` Bonus changes: - re-use transport for bucket replication clients, also site replication clients. - use 32KiB buffer for all read and writes at transport layer seems to help TLS read connections. - Do not have 'MaxConnsPerHost' this is problematic to be used with net/http connection pooling 'MaxIdleConnsPerHost' is enough.	2022-06-22 16:28:25 -07:00
Poorna	cb097e6b0a	CopyObject: fix read/write err on closed pipe (#15135 ) Fixes: #15128 Regression from PR#14971	2022-06-21 19:20:11 -07:00
Poorna	1cfb03fb74	replication: Avoid proxying when precondition failed (#15134 ) Proxying is not required when content is on this cluster and does not meet pre-conditions specified in the request. Fixes #15124	2022-06-21 14:11:35 -07:00
Harshavardhana	f293df647c	s3/zip: extract metadata properly for Zipped objects (#15123 ) s3/zip: extract metadata properly for Zipped objects fixes #15121	2022-06-21 14:11:12 -07:00
sota	e2e5bd6f19	fix: cant parse comment without '=' in environment file (#15130 )	2022-06-21 10:37:15 -07:00
Andreas Auernhammer	cd7a0a9757	fips: simplify TLS configuration (#15127 ) This commit simplifies the TLS configuration. It inlines the FIPS / non-FIPS code. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-06-21 07:54:48 -07:00
Anis Elleuch	b3eda248a3	Parallelize new disks healing of different erasure sets (#15112 ) - Always reformat all disks when a new disk is detected, this will ensure new uploads to be written in new fresh disks - Always heal all buckets first when an erasure set started to be healed - Use a lock to prevent two disks belonging to different nodes but in the same erasure set to be healed in parallel - Heal different sets in parallel Bonus: - Avoid logging errUnformattedDisk when a new fresh disk is inserted but not detected by healing mechanism yet (10 seconds lag)	2022-06-21 07:53:55 -07:00
Harshavardhana	486888f595	remove gateway banner and some other TODO loggers (#15125 )	2022-06-21 05:25:40 -07:00
Poorna	b3ebc69034	improve error message for bucket metadata export/import API (#15120 )	2022-06-20 16:13:45 -07:00
Harshavardhana	761dde2f1b	fix: add 'mc support inspect' support for single drive deployment (#15122 )	2022-06-20 16:11:19 -07:00
Harshavardhana	2bb6a3f4d0	cleanup site replication error handling (#15113 ) site replication errors were printed at various random locations, repeatedly - this PR attempts to remove double logging and capture all of them at a common place. This PR also enhances the code to show partial success and errors as well.	2022-06-20 10:48:11 -07:00
Anis Elleuch	73733a8fb9	heal: Report correctly in multi-pool setup (#15117 ) `mc admin heal -r <alias>` in a multi-pool setup incorrectly returns grey objects. The reason is that erasure-server-pools.HealObject() runs HealObject in all pools and returns the result of the first nil error. However, in the lower erasureObject level, HealObject() returns nil if an object does not exist + missing error in each disk of the object in that pool, therefore confusing mc. Make erasureObject.HealObject() return a not found error in the lower level, so at least erasureServerPools will know what pools to ignore.	2022-06-20 08:07:45 -07:00
Poorna	2fa1d8ac48	Add import/export APIs to migrate bucket metadata (#14929 )	2022-06-18 06:55:39 -07:00
Poorna	8b9a19eef1	fix: typo in site replication version healing (#15103 )	2022-06-17 16:43:24 -07:00
Aditya Manthramurthy	7f629df4d5	Add generic function to retrieve config value with metadata (#15083 ) `config.ResolveConfigParam` returns the value of a configuration for any subsystem based on checking env, config store, and default value. Also returns info about which config source returned the value. This is useful to return info about config params overridden via env in the user APIs. Currently implemented only for OpenID subsystem, but will be extended for others subsequently.	2022-06-17 11:39:21 -07:00
Anis Elleuch	98ddc3596c	Avoid CompleteMultipart freeze with unexpected network issue (#15102 ) If sending a white space during a long S3 handler call fails, the whitespace goroutine forgets to return a result to the caller. Therefore, the complete multipart handler will be blocked. Remember to send the header written result to the caller or/and close the channel.	2022-06-17 10:41:25 -07:00
Harshavardhana	5d23be6242	fix: ignore printing io.EOF during WalkDir() on concurrently modified objects (#15100 ) fix: ignore print io.EOF during WalkDir() on concurrently modified objects	2022-06-17 08:23:47 -07:00
Poorna	55ee94bed0	initialize site replication subsys after loading metadata (#15099 )	2022-06-16 19:00:35 -07:00
Harshavardhana	d228d29944	update '-v' flag behavior to include copyRight and license (#15097 ) ``` ~ minio -v minio version DEVELOPMENT.2022-06-16T20-40-14Z (commit-id=e083228e2a06bfdcd006fee28d449cd2b47c542a) Runtime: go1.18.3 linux/amd64 Copyright (c) 2015-2022 MinIO, Inc. Licence AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html> ```	2022-06-16 16:10:48 -07:00
Harshavardhana	013cc66d8e	add dataErrs for healing debug log (#15092 )	2022-06-16 09:42:45 -07:00
Harshavardhana	c7ed6eee5e	fix: background local test also via channel (#15086 ) current implementation for `standalone` setups was blocking the `perf drive`. Bonus: remove all old unused complicated code.	2022-06-15 14:51:42 -07:00
Harshavardhana	8082d1fed6	add bucket level S3 received/sent bytes (#15084 ) adds bucket level metrics for bytes received and sent bytes on all S3 API calls.	2022-06-14 15:14:24 -07:00
Harshavardhana	d2a10dbe69	fix: simplify healthcheck code to freeze calls only once (#15082 ) - currently subnet health check was freezing and calling locks at multiple locations, avoid them. - throw errors if first attempt itself fails with no results	2022-06-14 11:22:07 -07:00
Anis Elleuch	14645142db	erasure-sd: Evaluate versioning Prefix in multi-delete objects (#15081 ) Erasure SD DeleteObjects() is only inheriting bucket versioning status from the handler layer. Add the missing versioning prefix evaluation for each object that will be deleted.	2022-06-14 10:05:12 -07:00
Anis Elleuch	0d00f3a55b	kms: initialize after cli parsing (#15076 ) KMS depends on the --certs-dir flag. Ensure KMS is initialized after loading the flag.	2022-06-13 13:06:13 -07:00
Anis Elleuch	dd53b287f2	sts: Avoid printing all STS errors (#15065 ) Limit printing STS errors to - STS internal error - STS not initialized - STS upstream error	2022-06-11 12:55:32 -07:00
Harshavardhana	7413045f0e	fix: add missing minio_s3_requests_total (#15070 ) PR #15052 caused a regression, add the missing metrics back. Bonus: - internode information should be only for distributed setups - update the dashboard to include 4xx and 5xx error panels.	2022-06-11 00:50:31 -07:00
Harshavardhana	af1944f28d	support reading systemctl config automatically on baremetal setups (#15066 ) this allows for customers to use `mc admin service restart` directly even when performing RPM, DEB upgrades. Upon such 'restart' after upgrade MinIO will re-read the /etc/default/minio for any newer environment variables. As long as `MINIO_CONFIG_ENV_FILE=/etc/default/minio` is set, this is honored.	2022-06-10 09:59:15 -07:00
Harshavardhana	214ea14f29	fix: for frozen calls return if client disconnects (#15062 )	2022-06-09 05:06:47 -07:00
Anis Elleuch	5fb420c703	prometheus: Add S3 4xx and 5xx S3 monitoring (#15052 ) Currently minio_s3_requests_errors_total covers 4xx and 5xx S3 responses which can be confusing when s3 applications send a lot of HEAD requests with obvious 404 responses or when the replication is enabled. Add - minio_s3_requests_4xx_errors_total - minio_s3_requests_5xx_errors_total to help users monitor 4xx and 5xx HTTP status codes separately.	2022-06-08 11:22:34 -07:00
Harshavardhana	2420f6c000	fix: make metrics endpoint responsive by reducing the chatter (#15055 ) peerOnlineCounter was making NxN calls to many peers, this can be really long and tedious if there are random servers that are going down. Instead we should calculate online peers from the point of view of "self" and return those online and offline appropriately by performing a healthcheck.	2022-06-08 02:43:13 -07:00
Harshavardhana	b0d7332a0c	healthcheck cluster endpoint should honor write/readQuorum per pool (#15053 )	2022-06-07 19:08:21 -07:00
Harshavardhana	d55efc791f	relax O_DIRECT in single drive mode if unsupported (#15045 )	2022-06-07 06:44:01 -07:00
Minio Trusted	e2d4d097e7	do not print errors upon 'nil' err	2022-06-06 17:33:41 -07:00
Shireesh Anjal	4ce81fd07f	Add periodic callhome functionality (#14918 ) * Add periodic callhome functionality Periodically (every 24hrs by default), fetch callhome information and upload it to SUBNET. New config keys under the `callhome` subsystem: enable - Set to `on` for enabling callhome. Default `off` frequency - Interval between callhome cycles. Default `24h` * Improvements based on review comments - Update `enableCallhome` safely - Rename pctx to ctx - Block during execution of callhome - Store parsed proxy URL in global subnet config - Store callhome URL(s) in constants - Use existing global transport - Pass auth token to subnetPostReq - Use `config.EnableOn` instead of `"on"` * Use atomic package instead of lock * Use uber atomic package * Use `Cancel` instead of `cancel` Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2022-06-06 16:14:52 -07:00
Harshavardhana	df9eeb7f8f	fix: do not log concurrently when multiple disks return errors (#15044 ) since the values inside 'context' are mutated internally by logger, make sure to log serially upon errors not concurrently.	2022-06-06 15:15:11 -07:00
Harshavardhana	31c4fdbf79	fix: resyncing 'null' version on pre-existing content (#15043 ) PR #15041 fixed replicating the 'null' version; however, a regression from #14994 caused the target versions for these 'null' versioned objects to have different 'versions'. This may cause confusion with bi-directional replication and cause double replication. This PR fixes this properly by making sure we replicate the correct versions on the objects.	2022-06-06 15:14:56 -07:00
Harshavardhana	48e367ff7d	reject resync start on misconfigured replication rules (#15041 ) we expect resync to start on buckets with replication rule ExistingObjects enabled, if not we reject such calls.	2022-06-06 02:54:39 -07:00
Anis Elleuch	fd02492cb7	avoid limits on the number of parallel trace/bucket notifications listeners (#14799 ) Simplifies overall limits on the incoming listeners for notifications. Fixes #14566	2022-06-05 14:29:12 -07:00
Harshavardhana	5afdc56796	allow single drive mode to run on root disk (#15037 ) for practical reasons, allow root disk based installs for single drive mode.	2022-06-03 12:53:42 -07:00
Harshavardhana	c3e1da8e04	honor canceled context and do not leak on mergeChannels (#15034 ) mergeEntryChannels has the potential to perpetually wait on the results channel, context might be closed and we did not honor the caller context canceling.	2022-06-03 05:59:02 -07:00
Anis Elleuch	20a753e2e5	Fix a possible service freeze after perf object (#15036 ) The S3 service can be frozen indefinitely if a client or mc asks for object perf API but quits early or has some networking issues. The reason is that partialWrite() can block indefinitely. This commit makes partialWrite() listen to context cancellation as well. It also renames deadlinedCtx to healthCtx since it covers handler context cancellation and not only the speedtest deadline.	2022-06-03 05:58:45 -07:00
Aditya Manthramurthy	61a7434379	Update --version option behavior (#15032 ) - Add git commit ID - Add go version	2022-06-02 18:40:53 -07:00
Poorna	29edb4ccfe	fix: site replication bucket heal to not panic if replication config is missing (#15025 )	2022-06-02 12:34:03 -07:00
Anis Elleuch	d4e565e595	Add defensive check for one stream message size (#15029 ) In a streaming response, the client knows the size of a streamed message but never checks the message size. Add the check to error out if the response message is truncated.	2022-06-02 09:16:26 -07:00
Klaus Post	f7cecf0945	Make isIndexedMetaV2 return errors (#15012 ) Indexed streams would be decoded by the legacy loader if there was an error loading it. Return an error when the stream is indexed and it cannot be loaded. Fixes "unknown minor metadata version" on corrupted xl.meta files and returns an actual error.	2022-05-31 19:06:57 -07:00
Harshavardhana	52221db7ef	fix: for unexpected errors in reading versioning config panic (#14994 ) We need to make sure if we cannot read bucket metadata for some reason, and bucket metadata is not missing and returning corrupted information we should panic such handlers to disallow I/O to protect the overall state on the system. In-case of such corruption we have a mechanism now to force recreate the metadata on the bucket, using `x-minio-force-create` header with `PUT /bucket` API call. Additionally fix the versioning config updated state to be set properly for the site replication healing to trigger correctly.	2022-05-31 02:57:57 -07:00
Anis Elleuch	56a61bab56	test: Add GetObjectNInfo test with some outdated disks (#15004 ) Add a test reading an object which has some old data in some outdated disks, in a versioned and non-versioned bucket.	2022-05-30 17:52:59 -07:00
Harshavardhana	d480022711	fix: invalidate outdated disks appropriately during readAllXL (#15002 ) readAllXL would return inlined data for outdated disks causing "read" to return incorrect content to the client, this PR fixes this behavior by making sure we skip such outdated disks appropriately based on the latest ModTime on the disk.	2022-05-30 12:43:54 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	5792be71fa	fix: add timeouts to avoid goroutine leaks in net/http (#14995 ) The following code can reproduce an unending go-routine buildup, keeping connections established because the client does not close them. https://gist.github.com/harshavardhana/2d00e6f909054d2d2524c71485ad02e1 Without this PR all MinIO deployments are exposed to denial of service attacks that make the entire service unavailable. We bring in two timeouts at this stage to control such go-routine buildups: - IdleTimeout (to kill off idle connections) - ReadHeaderTimeout (to kill off connections that are too slow) This change also brings two hidden options to make any additional relevant changes if desired in some setups.	2022-05-30 06:24:51 -07:00
Poorna	5e3010d455	Tighten enforcement of object retention (#14993 ) Ref issue#14991 - in the rare case that object in bucket under retention has null version, make sure to enforce retention rules.	2022-05-28 02:21:19 -07:00
Anis Elleuch	ccbf65c8e8	site-repl: Fix deadlock after an IAM loading error (#14990 ) Fix forgotten IAM cache lock releases when reading some data from disk/etcd Co-authored-by: Anis Elleuch <anis@min.io>	2022-05-27 10:26:38 -07:00
Harshavardhana	9d07cde385	use crypto/sha256 only for FIPS 140-2 compliance (#14983 ) It would seem like the PR #11623 had chewed more than it wanted to, non-fips build shouldn't really be forced to use slower crypto/sha256 even for presumed "non-performance" codepaths. In MinIO there are really no "non-performance" codepaths. This assumption seems to have had an adverse effect in certain areas of CPU usage. This PR ensures that we stick to sha256-simd on all non-FIPS builds, our most common build to ensure we get the best out of the CPU at any given point in time.	2022-05-27 06:00:19 -07:00
Aditya Manthramurthy	464b9d7c80	Add support for Identity Management Plugin (#14913 ) - Adds an STS API `AssumeRoleWithCustomToken` that can be used to authenticate via the Id. Mgmt. Plugin. - Adds a sample identity manager plugin implementation - Add doc for plugin and STS API - Add an example program using go SDK for AssumeRoleWithCustomToken	2022-05-26 17:58:09 -07:00
Poorna	5c81d0d89a	site replication: heal missing/invalid replication config (#14979 ) Validate remote target ARNs and heal any stale rules in the replication config	2022-05-26 17:57:23 -07:00
Klaus Post	c0bf02b8b2	Ignore disks with 0 total space (#14981 ) Ignore disks with 0 total Mainly defensive to ensure no `/0` in percent calculation.	2022-05-26 06:01:50 -07:00
Harshavardhana	fd46a1c3b3	fix: some races when accessing ldap/openid config globally (#14978 )	2022-05-25 18:32:53 -07:00
Aditya Manthramurthy	5aae7178ad	Fix listing of service and sts accounts (#14977 ) Now returns user does not exist error if the user is not known to the system	2022-05-25 15:28:54 -07:00
Harshavardhana	dea8220eee	do not heal outdated disks > parityBlocks (#14976 ) this PR also fixes a situation where incorrect partsMetadata slice was used where fi.Data was re-used from a single drive causing duplication of the shards across all drives. This happens for situations where shouldHeal() returns true for all drives > parityBlocks. To avoid this we should never attempt to heal on all drives > parityBlocks, unless we are doing metadata migration from xl.json -> xl.meta	2022-05-25 15:17:10 -07:00
Klaus Post	a4be0b88f6	Add server pool reserved space (#14974 ) If one or more pools reach 85% usage in a set, we will only use pools that have more free space. In case all pools are above 85% we allow all of them to be used with the regular distribution.	2022-05-25 13:20:20 -07:00
Poorna	d8101573be	Disallow deletion of ARN when under active replication (#14972 ) fixes a regression from #12880	2022-05-24 19:40:45 -07:00
Klaus Post	41cdb357bb	Compensate for different server pool sizes (#14968 ) When a server pool with a different number of sets is added they are not compensated when choosing a destination pool for new objects. This leads to the unbalanced placement of objects with smaller pools getting a bigger number of objects since we only compare the destination sets directly. This change will compensate for differences in set sizes when choosing the destination pool. Different set sizes are already compensated by fewer disks.	2022-05-24 18:57:14 -07:00
Harshavardhana	38caddffe7	fix: copyObject on versioned bucket when updating metadata (#14971 ) updating metadata with CopyObject on a versioned bucket causes the latest version to be not readable, this PR fixes this properly by handling the inline data bug fix introduced in PR #14780. This bug affects only inlined data.	2022-05-24 17:27:45 -07:00
Poorna	0e26f983d6	site replication: Allow replication rule edit (#14969 ) Revert commit `b42cfcea60` as too restrictive	2022-05-24 13:27:33 -07:00
Anis Elleuch	77dc99e71d	Do not use inline data size in xl.meta quorum calculation (#14831 ) * Do not use inline data size in xl.meta quorum calculation Data shards of one object can have different inline/not-inline decisions on multiple disks. This happens with outdated disks when the inline decision changes. For example, enabling bucket versioning configuration will change the small file threshold. When the parity of an object becomes low, GET object can return 503 because it is unable to calculate the xl.meta quorum, just because some xl.meta have inline data and others do not. So this commit will disable taking the size of the inline data into consideration when calculating the xl.meta quorum. * Add tests for simultaneous inline/notinline object Co-authored-by: Anis Elleuch <anis@min.io>	2022-05-24 06:26:38 -07:00
Anis Elleuch	5041bfcb5c	replication healing: Fix typo when healing bucket quota info (#14966 ) A typo is found in the replication healing code where an empty quota configuration is sent to peer sites instead of the correct one.	2022-05-24 06:26:13 -07:00
Harshavardhana	f8650a3493	fetch bucket replication stats across peers in single call (#14956 ) current implementation relied on recursively calling one bucket at a time across all peers, this would be very slow and chatty when there are 100's of buckets which would mean 100*peerCount amount of network operations. This PR attempts to reduce this entire call into `peerCount` amount of network calls only. This functionality addresses also a concern where the Prometheus metrics would significantly slow down when one of the peers is offline.	2022-05-23 09:15:30 -07:00
Klaus Post	90a52a29c5	Fix WalkDir fallback hot loop (#14961 ) Fix fallback hot loop fd was never refreshed, leading to an infinite hot loop if a disk failed and the fallback disk fails as well. Fix & simplify retry loop. Fixes #14960	2022-05-23 06:28:46 -07:00
Poorna	8859c92f80	Relax site replication syncing of service accounts (#14955 ) Synchronous replication of service/sts accounts can be relaxed as site replication healing should catch up when peer clusters are back online.	2022-05-20 19:09:11 -07:00
Anis Elleuch	01e5632949	mrf: Fix stale MRF data shown in heal info (#14953 ) One user reported that the ETA in the mc admin heal status output was increasing over time. It turned out that MRF was not clearing its data due to a bug in the code. pendingItems is increased when an object is queued to be healed but never decreased when there is a healing error. This commit will decrease pendingItems and pendingBytes even when there is an error to give accurate reporting.	2022-05-20 07:33:18 -07:00
Anis Elleuch	95a6b2c991	Merge LDAP STS policy evaluation with the generic STS code (#14944 ) If LDAP is enabled, STS security token policy is evaluated using a different code path and expects ldapUser claim to exist in the security token. This means other STS temporary accounts generated by any Assume Role function, such as AssumeRoleWithCertificate, won't be allowed to do any operation as these accounts do not have LDAP user claim. Since IsAllowedLDAPSTS() is similar to IsAllowedSTS(), this commit will merge both. Non harmful changes: - IsAllowed for LDAP will start supporting RoleARN claim - IsAllowed for LDAP will not check for parent claim anymore. This check doesn't seem to be useful since all STS login compare access/secret/security-token with the one saved in the disk. - LDAP will support $username condition in policy documents. Co-authored-by: Anis Elleuch <anis@min.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2022-05-19 11:06:55 -07:00
Harshavardhana	30c9e50701	make sure to ignore expected errors and dirname deletes (#14945 )	2022-05-18 17:58:19 -07:00
Aditya Manthramurthy	9aadd725d2	Avoid calling .Reset() on active timer (#14941 ) .Reset() documentation states: For a Timer created with NewTimer, Reset should be invoked only on stopped or expired timers with drained channels. This change is just to comply with this requirement as there might be some runtime dependent situation that might lead to unexpected behavior.	2022-05-18 15:37:58 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Harshavardhana	2dc8ac1e62	allow IAM cache load to be granular and capture missed state (#14930 ) anything that is stuck on the disk today can cause latency spikes for all incoming S3 I/O, we need to have this de-coupled so that we can make sure that latency in loading credentials is not reflected back to the S3 API calls. The approach this PR takes is by checking if the calls were updated just in case when the IAM load was in progress, so that we can use merge instead of "replacement" to avoid missing state.	2022-05-17 19:58:47 -07:00
Harshavardhana	040ac5cad8	fix: when logger queue is full exit quickly upon doneCh (#14928 ) Additionally only reload requested sub-system not everything	2022-05-16 16:10:51 -07:00
Harshavardhana	03f8b25b50	disable connectDisks loop under testing (#14920 ) avoids races during tests, keeps tests predictable	2022-05-16 05:36:00 -07:00
Aditya Manthramurthy	f28a8eca91	Add Access Management Plugin tests with OpenID (#14919 )	2022-05-13 12:48:02 -07:00
Anis Elleuch	ca69e54cb6	tests: Fix sporadic failure of TestXLStorageDeleteFile (#14911 ) The test expects from DeleteFile to return errDiskNotFound when the disk is not available. It calls os.RemoveAll() to remove one disk after XL storage initialization. However, this latter contains some goroutines which can race with os.RemoveAll() and then the test fails sporadically with returning random errors. The commit will tweak the initialization routine of the XL storage to only run deletion of temporary and metacache data in the background, so TestXLStorageDeleteFile won't fail anymore.	2022-05-12 15:24:58 -07:00
Aditya Manthramurthy	4629abd5a2	Add tests for Access Management Plugin (#14909 )	2022-05-12 15:24:19 -07:00
Harshavardhana	dc99f4a7a3	allow bucket to be listed when GetBucketLocation is enabled (#14903 ) currently, we allowed buckets to be listed from the API call if and when the user has ListObject() permission at the global level, this is okay to be extended to GetBucketLocation() as well since GetBucketLocation() is a "read" call and allowing "reads" on a bucket has an implicit assumption that ListBuckets() should be allowed. This makes discoverability of access easier for read-only users or users with specific restrictions on their policies.	2022-05-12 10:46:20 -07:00
Harshavardhana	9341201132	logger lock should be more granular (#14901 ) This PR simplifies few things by splitting the locks between audit, logger targets to avoid potential contention between them. any failures inside audit/logger HTTP targets must only log to console instead of other targets to avoid cyclical dependency. avoids unneeded atomic variables instead uses RWLock to differentiate a more common read phase v/s lock phase.	2022-05-12 07:20:58 -07:00
Krishnan Parthasarathi	88dd83a365	lifecycle: Set opts.VersionSuspended when expiring objects (#14902 )	2022-05-12 06:09:24 -07:00
Harshavardhana	60d0611ac2	use BadRequest HTTP status instead of Conflict for certain errors (#14900 ) PutBucketVersioning API should return BadRequest for errors instead of Conflict, Conflict is used for "AlreadyExists" resource situations.	2022-05-11 13:44:16 -07:00
Harshavardhana	f939222942	add support for extra prometheus labels (#14899 ) fixes #14353	2022-05-11 13:04:53 -07:00
Krishna Srinivas	e34ca9acd1	retry each object decom up to 3 times, in case of failure (#14861 )	2022-05-11 11:37:32 -07:00
Aditya Manthramurthy	83071a3459	Add support for Access Management Plugin (#14875 ) - This change renames the OPA integration as Access Management Plugin - there is nothing specific to OPA in the integration, it is just a webhook. - OPA configuration is automatically migrated to Access Management Plugin and OPA specific configuration is marked as deprecated. - OPA doc is updated and moved.	2022-05-10 17:14:55 -07:00
Anis Elleuch	edf364bf21	tracing: Add disk path to storage tracing (#14883 ) Example: 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/2 / 227.834µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/4 / 236.042µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/3 / 130.958µs 2022-05-09T17:14:04:000 [STORAGE] storage.ListVols 127.0.0.1:9000 /tmp/xl/1 / 102.875µs	2022-05-10 07:48:07 -07:00
Anis Elleuch	1e037883b0	pools: GetObjectNInfo should cover locking during object read (#14887 ) In case of multi-pools setup, GetObjectNInfo returns a GetObjectReader but it unlocks the read lock when quitting GetObjectNInfo. This should not happen, unlock should only happen when GetObjectReader is closed.	2022-05-10 07:47:40 -07:00
Klaus Post	d909f167ff	tests: Add localLocker RUnlock test (#14882 )	2022-05-09 09:55:52 -07:00
Harshavardhana	62aa42cccf	avoid replication proxy on version excluded paths (#14878 ) no need to attempt proxying objects that were never replicated, but do have local `null` versions on them.	2022-05-08 16:50:31 -07:00
Harshavardhana	5cffd3780a	fix: multiple fixes in prefix exclude implementation (#14877 ) - do not need to restrict prefix exclusions that do not have `/` as suffix, relax this requirement as spark may have staging folders with other autogenerated characters, so we are better off doing full prefix match and skip. - multiple delete objects was incorrectly creating a null delete marker on a versioned bucket instead of creating a proper versioned delete marker. - do not suspend paths on the excluded prefixes during delete operations to avoid creating `null` delete markers, honor suspension of versioning only at bucket level for delete markers.	2022-05-07 22:06:44 -07:00
Harshavardhana	def75ffcfe	allow versioning config changes under site replication (#14876 ) PR #14828 introduced prefix-level exclusion of versioning and replication - however our site replication implementation since it defaults versioning on all buckets did not allow changing versioning configuration once the bucket was created. This PR changes this and ensures that such changes are honored and also propagated/healed across sites appropriately.	2022-05-07 18:39:40 -07:00
Krishnan Parthasarathi	ad8e611098	feat: implement prefix-level versioning exclusion (#14828 ) Spark/Hadoop workloads which use Hadoop MR Committer v1/v2 algorithm upload objects to a temporary prefix in a bucket. These objects are 'renamed' to a different prefix on Job commit. Object storage admins are forced to configure separate ILM policies to expire these objects and their versions to reclaim space. Our solution: This can be avoided by simply marking objects under these prefixes to be excluded from versioning, as shown below. Consequently, these objects are excluded from replication, and don't require ILM policies to prune unnecessary versions. - MinIO Extension to Bucket Version Configuration ```xml <VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> <Status>Enabled</Status> <ExcludeFolders>true</ExcludeFolders> <ExcludedPrefixes> <Prefix>app1-jobs//_temporary/</Prefix> </ExcludedPrefixes> <ExcludedPrefixes> <Prefix>app2-jobs//__magic/</Prefix> </ExcludedPrefixes> <!-- .. up to 10 prefixes in all --> </VersioningConfiguration> ``` Note: `ExcludeFolders` excludes all folders in a bucket from versioning. This is required to prevent the parent folders from accumulating delete markers, especially those which are shared across spark workloads spanning projects/teams. - To enable version exclusion on a list of prefixes ``` mc version enable --excluded-prefixes "app1-jobs//_temporary/,app2-jobs//_magic," --exclude-prefix-marker myminio/test ```	2022-05-06 19:05:28 -07:00
Shireesh Anjal	3ec1844e4a	return kubernetes info in health report (#14865 )	2022-05-06 12:41:07 -07:00
Poorna	523670ba0d	fix: site removal API error handling (#14870 ) when the site being removed is missing replication config. This can happen when a new deployment is brought in place of a site that is lost/destroyed and needs to delink old deployment from site replication.	2022-05-06 12:40:34 -07:00
Harshavardhana	35dea24ffd	fix: console log peer API from its broken implementation (#14873 ) console logging peer API was broken as it would time out after 15 minutes, this never really worked beyond this value and basically failed to provide the streaming "log" functionality that was expected from this implementation. also fix convoluted channel handling by keeping things simple, this is rewritten.	2022-05-06 12:39:58 -07:00
Harshavardhana	c7df1ffc6f	avoid concurrent reads and writes to opts.UserDefined (#14862 ) do not modify opts.UserDefined after object-handler has set all the necessary values, any mutation needed should be done on a copy of this value not directly. As there are other pieces of code that access opts.UserDefined concurrently this becomes challenging. fixes #14856	2022-05-05 04:14:41 -07:00
Aditya Manthramurthy	2b7e75e079	Add OPA doc and remove deprecation marking (#14863 )	2022-05-04 23:53:42 -07:00
Anis Elleuch	44a3b58e52	Add audit log for decommissioning (#14858 )	2022-05-04 00:45:27 -07:00
Anis Elleuch	46de9ac03e	Decom: Easily restart decommission when it is done (#14855 ) When a decommission task is successfully completed, failed, or canceled, this commit allows restarting the decommission again. Restarting is not allowed when there is an ongoing decommission task.	2022-05-03 13:36:08 -07:00
Harshavardhana	f0462322fd	fix: remove embedded-policy as requested by the user (#14847 ) this PR introduces a few changes such as - sessionPolicyName is not reused in an extracted manner to apply policies for incoming authenticated calls, instead uses a different key to designate this information for the callers. - this differentiation is needed to ensure that service account updates do not accidentally store JSON representation instead of base64 equivalent on the disk. - relax requirements for Deleting a service account, allow deleting a service account that might be unreadable, i.e a situation where the user might have removed session policy which now carries a JSON representation, making it unparsable. - introduce some constants to reuse instead of strings. fixes #14784	2022-05-02 17:56:19 -07:00
Klaus Post	c59d2a6288	Log Range Header if present in the request (#14851 ) Add Range header as param to easier debug of Range requests.	2022-05-02 10:37:26 -07:00
Klaus Post	3e3ff2a70b	Check error status codes (#14850 ) If an invalid status code is generated from an error we risk panicking. Even if there are no potential problems at the moment we should prevent this in the future. Add safeguards against this. Sample trace: ``` May 02 06:41:39 minio[52806]: panic: "GET /20180401230655.PDF": invalid WriteHeader code 0 May 02 06:41:39 minio[52806]: goroutine 16040430822 [running]: May 02 06:41:39 minio[52806]: runtime/debug.Stack(0xc01fff7c20, 0x25c4b00, 0xc0490e4080) May 02 06:41:39 minio[52806]: runtime/debug/stack.go:24 +0x9f May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd.setCriticalErrorHandler.func1.1(0xc022048800, 0x4f38ab0, 0xc0406e0fc0) May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd/generic-handlers.go:469 +0x85 May 02 06:41:39 minio[52806]: panic(0x25c4b00, 0xc0490e4080) May 02 06:41:39 minio[52806]: runtime/panic.go:965 +0x1b9 May 02 06:41:39 minio[52806]: net/http.checkWriteHeaderCode(...) May 02 06:41:39 minio[52806]: net/http/server.go:1092 May 02 06:41:39 minio[52806]: net/http.(response).WriteHeader(0xc0406e0fc0, 0x0) May 02 06:41:39 minio[52806]: net/http/server.go:1126 +0x718 May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger.(ResponseWriter).WriteHeader(0xc032fa3ea0, 0x0) May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger/audit.go:116 +0xb1 May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger.(ResponseWriter).WriteHeader(0xc032fa3f40, 0x0) May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger/audit.go:116 +0xb1 May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger.(ResponseWriter).WriteHeader(0xc002ce8000, 0x0) May 02 06:41:39 minio[52806]: github.com/minio/minio/internal/logger/audit.go:116 +0xb1 May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd.writeResponse(0x4f364a0, 0xc002ce8000, 0x0, 0xc0443b86c0, 0x1cb, 0x224, 0x2a9651e, 0xf) May 02 06:41:39 minio[52806]: 
github.com/minio/minio/cmd/api-response.go:736 +0x18d May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd.writeErrorResponse(0x4f44218, 0xc069086ae0, 0x4f364a0, 0xc002ce8000, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc00656afc0) May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd/api-response.go:798 +0x306 May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd.objectAPIHandlers.getObjectHandler(0x4b73768, 0x4b73730, 0x4f44218, 0xc069086ae0, 0x4f82090, 0xc002d80620, 0xc040e03885, 0xe, 0xc040e03894, 0x61, ...) May 02 06:41:39 minio[52806]: github.com/minio/minio/cmd/object-handlers.go:456 +0x252c ```	2022-05-02 10:36:29 -07:00
Harshavardhana	16bc11e72e	fix: disallow newer policies, users & groups with space characters (#14845 ) space characters at the beginning or at the end can lead to confusion under various UI elements in differentiating the actual name of "policy, user or group" - to avoid this behavior this PR onwards we shall reject such inputs for newer entries. existing saved entries will behave as is and are going to be operable until they are removed/renamed to something more meaningful.	2022-05-02 09:27:35 -07:00
Harshavardhana	2719f1efaa	fix: reject invalid r.Host headers (#14846 ) r.Host headers can come in unparsed that may contain invalid hostnames, reject such requests as invalid. This is a continuation fix from #14844	2022-05-02 04:42:41 -07:00
Harshavardhana	39ac62a1a1	fix: panic in browser redirect handler for unexpected r.Host (#14844 ) ``` panic: "GET /": invalid hostname goroutine 148 [running]: runtime/debug.Stack() runtime/debug/stack.go:24 +0x65 github.com/minio/minio/cmd.setCriticalErrorHandler.func1.1() github.com/minio/minio/cmd/generic-handlers.go:469 +0x8e panic({0x2201f00, 0xc001f1ddd0}) runtime/panic.go:1038 +0x215 github.com/minio/pkg/net.URL.String({{0x25aa417, 0x5}, {0x0, 0x0}, 0x0, {0xc000174380, 0xd7}, {0x0, 0x0}, {0x0, ...}, ...}) github.com/minio/pkg@v1.1.23/net/url.go:97 +0xfe github.com/minio/minio/cmd.setBrowserRedirectHandler.func1({0x49af080, 0xc0003c20e0}, 0xc00002ea00) github.com/minio/minio/cmd/generic-handlers.go:136 +0x118 net/http.HandlerFunc.ServeHTTP(0xc00002ea00, {0x49af080, 0xc0003c20e0}, 0xa) net/http/server.go:2047 +0x2f github.com/minio/minio/cmd.setAuthHandler.func1({0x49af080, 0xc0003c20e0}, 0xc00002ea00) github.com/minio/minio/cmd/auth-handler.go:525 +0x3d8 net/http.HandlerFunc.ServeHTTP(0xc00002e900, {0x49af080, 0xc0003c20e0}, 0xc001f33701) net/http/server.go:2047 +0x2f github.com/gorilla/mux.(Router).ServeHTTP(0xc0025d0780, {0x49af080, 0xc0003c20e0}, 0xc00002e800) github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1cf github.com/rs/cors.(Cors).Handler.func1({0x49af080, 0xc0003c20e0}, 0xc00002e800) github.com/rs/cors@v1.7.0/cors.go:219 +0x1bd net/http.HandlerFunc.ServeHTTP(0x0, {0x49af080, 0xc0003c20e0}, 0xc00068d9f8) net/http/server.go:2047 +0x2f github.com/minio/minio/cmd.setCriticalErrorHandler.func1({0x49af080, 0xc0003c20e0}, 0x4a5cd3) github.com/minio/minio/cmd/generic-handlers.go:476 +0x83 net/http.HandlerFunc.ServeHTTP(0x72, {0x49af080, 0xc0003c20e0}, 0x0) net/http/server.go:2047 +0x2f github.com/minio/minio/internal/http.(Server).Start.func1({0x49af080, 0xc0003c20e0}, 0x10000c001f1dda0) github.com/minio/minio/internal/http/server.go:105 +0x1b6 net/http.HandlerFunc.ServeHTTP(0x0, {0x49af080, 0xc0003c20e0}, 0x46982e) net/http/server.go:2047 +0x2f 
net/http.serverHandler.ServeHTTP({0xc003dc1950}, {0x49af080, 0xc0003c20e0}, 0xc00002e800) net/http/server.go:2879 +0x43b net/http.(conn).serve(0xc000514d20, {0x49cfc38, 0xc0010c0e70}) net/http/server.go:1930 +0xb08 created by net/http.(*Server).Serve net/http/server.go:3034 +0x4e8 ```	2022-05-01 13:45:45 -07:00
Harshavardhana	85f3a9f3b0	Remove Azure gateway implementation (#14418 ) refer #14331	2022-04-29 12:51:23 -07:00
Klaus Post	13ba4b433d	Clean up cpuio profiling (#14838 ) Don't start regular cpu profile as well. Use bed madmin const.	2022-04-29 09:35:42 -07:00
Aditya Manthramurthy	0e502899a8	Add support for multiple OpenID providers with role policies (#14223 ) - When using multiple providers, claim-based providers are not allowed. All providers must use role policies. - Update markdown config to allow `details` HTML element	2022-04-28 18:27:09 -07:00
Harshavardhana	424b44c247	allow changing server command line from http->https (#14832 ) this is allowed as long as order is preserved as is on an existing setup, the new command line is updated in `pool.bin` to facilitate future decommission's on these pools.	2022-04-28 16:27:53 -07:00
Harshavardhana	01a71c366d	allow service accounts and temp credentials site-level healing (#14829 ) This PR introduces support for site level - service account healing - temporary credentials healing	2022-04-28 02:39:00 -07:00
Harshavardhana	5a9a898ba2	allow forcibly creating metadata on buckets (#14820 ) introduce x-minio-force-create environment variable to force create a bucket and its metadata as required, it is useful in some situations when bucket metadata needs recovery.	2022-04-27 04:44:07 -07:00
Harshavardhana	c56a139fdc	fix: support decommissioning directory objects (#14822 ) improvements in this PR include - decommission objects that have __XLDIR__ suffix - decommission objects that have `null` version on a versioned bucket. - make sure to look for any "decom" failures to ensure that we do not wrongly conclude decom as complete without all files getting copied over. - break out eagerly upon first error for objects with multiple versions, leave the object as is for support debugging and analysis.	2022-04-26 20:06:41 -07:00
Anis Elleuch	df50eda811	Add number of versions in server info API (#14812 ) The goal is to show the number of versions in the server info API.	2022-04-25 22:04:10 -07:00
Aditya Manthramurthy	f5d3313210	Increase context timeout for IAM concurrency test (#14817 ) - This should reduce failures in Windows CI	2022-04-25 20:14:20 -07:00
Daniel Valdivia	b7dd61f6bc	Fix double slash subpath for console (#14815 ) Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>	2022-04-25 13:05:56 -07:00
Harshavardhana	0cc993f403	Remove GCS, HDFS gateway implementations #14418 refer #14331	2022-04-24 10:19:17 -07:00
Poorna	3a64580663	Add support for site replication healing (#14572 ) heal bucket metadata and IAM entries for sites participating in site replication from the site with the most updated entry. Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Aditya Manthramurthy <aditya@minio.io>	2022-04-24 02:36:31 -07:00
Harshavardhana	d087e28dce	start using t.SetEnv instead of os.Setenv (#14787 )	2022-04-23 15:33:45 -07:00
Klaus Post	96adfaebe1	Make storage class config dynamic (#14791 ) Updating the storage class is already thread safe, so we can do this safely.	2022-04-21 12:07:33 -07:00
Aditya Manthramurthy	ddf84f8257	fix: concurrency bug in site-replication (#14786 ) The site replication status call was using a loop iteration variable sent directly into go-routines instead of being passed as an argument. As the variable is being updated in the loop, previously launched go routines do not necessarily use the value at the time they were launched.	2022-04-20 16:20:07 -07:00
Harshavardhana	507f993075	attempt to real resolve when there is a quorum failure on reads (#14613 )	2022-04-20 12:49:05 -07:00
Harshavardhana	73a6a60785	fix: replication deleteObject() regression and CopyObject() behavior (#14780 ) This PR fixes two issues - The first fix is a regression from #14555, the fix itself in #14555 is correct but the interpretation of that information by the object layer code for "replication" was not correct. This PR tries to fix this situation by making sure the "Delete" replication works as expected when "VersionPurgeStatus" is already set. Without this fix, there is a DELETE marker created incorrectly on the source where the "DELETE" was triggered. - The second fix is perhaps an older problem started since we inlined-data on the disk for small objects, CopyObject() incorrectly inlines non-inlined data. This is due to the fact that we have code where we read the `part.1` under certain conditions where the size of the `part.1` is less than the specific "threshold". This eventually causes problems when we are "deleting" the data that is only inlined, which means dataDir is ignored leaving such dataDir on the disk, which looks like inconsistent content on the namespace. fixes #14767	2022-04-20 10:22:05 -07:00
Anis Elleuch	cf4cf58faf	Do not allow parallel upgrade in one server (#14782 ) It is wasteful to allow parallel upgrades of MinIO server. This also generates weird error invoked by selfupdate module when it happens such as: 'rename /opt/bin/.minio.old /opt/bin/..minio.old.old'	2022-04-20 06:18:21 -07:00
polaris-megrez	6bc3c74c0c	honor client context in IAM user/policy listing calls (#14682 )	2022-04-19 09:00:19 -07:00
Harshavardhana	598ce1e354	supply prefix filtering when necessary (#14772 ) currently filterPrefix, which would filter out entries when needed, was never set or used when `prefix` doesn't end with `/` - this often leads to objects getting Walked(), Healed() that were never requested by the caller.	2022-04-19 08:20:48 -07:00
Harshavardhana	7e248fc0ba	wait on parallel decom to complete before returning (#14764 ) without this wait there is a potential for some objects that are actively being decommissioned to be canceled, however the decommission status might wrongly conclude this as "Complete". To avoid this make sure to add waitgroups on the parallel workers, allowing parallel copies to complete fully before we return.	2022-04-18 13:26:29 -07:00
Daniel Valdivia	c526fa9119	Support console UI access at a subpath on a subdomain (#14761 ) fixes #14285 Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>	2022-04-17 16:01:49 -07:00
Anis Elleuch	a5b3548ede	Bring back listing LDAP users temporarily (#14760 ) In previous releases, mc admin user list would return the list of users that have policies mapped in IAM database. However, this was removed but this commit will bring it back until we revamp this.	2022-04-15 21:26:02 -07:00
Harshavardhana	8318aa0113	cancel active routine only after metadata has been saved (#14757 ) currently updated pool.bin was not saved properly, which made it impossible to remove a pool upon a successful decommission. fixes #14756	2022-04-15 13:16:15 -07:00
Harshavardhana	e69c42956b	fix: IAM reload should only list at config/iam/ precisely (#14753 )	2022-04-15 12:12:45 -07:00
Aditya Manthramurthy	e8e48e4c4a	S3 select switch to new parquet library and reduce locking (#14731 ) - This change switches to a new parquet library - SelectObjectContent now takes a single lock at the beginning and holds it during the operation. Previously the operation took a lock every time the parquet library performed a Seek on the underlying object stream. - Add basic support for LogicalType annotations for timestamps.	2022-04-14 06:54:47 -07:00
Harshavardhana	2a6a40e93b	enable go1.18.x builds (#14746 )	2022-04-13 14:21:55 -07:00
Harshavardhana	eda34423d7	update gofumpt -w - new changes	2022-04-13 12:00:11 -07:00
Shireesh Anjal	5c53620a72	Include speedtest as part of healthinfo api (#14696 ) Execute the object, drive and net speedtests as part of the healthinfo (if requested by the client), and include their result in the response. The options for the speedtests have been picked from the default values used by `mc support perf` command.	2022-04-12 13:17:44 -07:00
Krishna Srinivas	5f94cec1e2	Allow parallel decom migration threads to be more than erasure sets (#14733 )	2022-04-12 10:49:53 -07:00
Krishnan Parthasarathi	28d3ad3ada	Honor object retention when applying ILM policies (#14732 )	2022-04-11 21:55:56 -07:00
Aditya Manthramurthy	66b14a0d32	Fix service account privilege escalation (#14729 ) Ensure that a regular unprivileged user is unable to create service accounts for other users/root.	2022-04-11 15:30:28 -07:00
Harshavardhana	153a612253	fetch bucket retention config once for ILM evalAction (#14727 ) This is mainly an optimization, does not change any existing functionality.	2022-04-11 13:25:32 -07:00
Krishnan Parthasarathi	1a1b55e133	Add support for minio tier type (#14468 )	2022-04-11 13:24:40 -07:00
Harshavardhana	e77ad3f9bb	make sure to pass Lifecycle if set for List filtering (#14722 ) PR #14606 never really passed the Lifecycle filter down to the listing callers to ensure skipping the entries.	2022-04-10 11:14:52 -07:00
Harshavardhana	4ce86ff5fa	align atomic variables once more for 32bit (#14721 )	2022-04-09 22:19:44 -07:00
Harshavardhana	601a744159	pass the necessary query params for remote NSScanner (#14719 ) fixes a regression from #14464	2022-04-09 08:09:52 -07:00
Poorna	a1b01e6d5f	Combine profiling start/stop APIs into one (#14662 ) Take profile duration as a query parameter for profile API	2022-04-08 12:44:35 -07:00
Krishna Srinivas	48594617b5	Parallelize decommissioning process (#14704 )	2022-04-07 23:19:13 -07:00
Krishna Srinivas	b35b9dcff7	Use S3 client for uploads/downloads during perf test (#14570 )	2022-04-07 21:20:40 -07:00
Lenin Alevski	a3e317773a	Skip commented lines when parsing MinIO configuration file (#14710 ) Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>	2022-04-07 16:02:51 -07:00
Anis Elleuch	16431d222c	heal: Enable periodic bitrot scan configuration (#14464 )	2022-04-07 08:10:40 -07:00
Harshavardhana	ee49a23220	resume/start decommission on the first node of the pool under decommission (#14705 ) Additionally fixes - IsSuspended() can use read locks - Avoid double cancels panic on canceler	2022-04-06 23:42:05 -07:00
Harshavardhana	a9eef521ec	skip config/history/ during IAM load (#14698 )	2022-04-06 21:03:41 -07:00
Klaus Post	901d33b59c	Tweak listing quorum (#14703 ) Always go for 50% quorum, and only use non-healing disks. Fixes #14635	2022-04-06 12:24:21 -07:00
Harshavardhana	00ebea2536	skip config/history/ during IAM load (#14698 )	2022-04-05 19:00:59 -07:00
Klaus Post	dedf9774c7	Set inspect-input.txt modtime (#14688 ) If no time given, use current time.	2022-04-05 13:06:10 -07:00
Andreas Auernhammer	6b1c62133d	listing: improve listing of encrypted objects (#14667 ) This commit improves the listing of encrypted objects: - Use `etag.Format` and `etag.Decrypt` - Detect SSE-S3 single-part objects in a single iteration - Fix batch size to `250` - Pass request context to `DecryptAll` to not waste resources when a client cancels the operation. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-04-04 11:42:03 -07:00
Anis Elleuch	d4251b2545	Remove unnecessary log printing (#14685 ) Co-authored-by: Anis Elleuch <anis@min.io>	2022-04-04 11:10:06 -07:00
Andreas Auernhammer	b9d1698d74	etag: add `Format` and `Decrypt` functions (#14659 ) This commit adds two new functions to the internal `etag` package: - `ETag.Format` - `Decrypt` The `Decrypt` function decrypts an encrypted ETag using a decryption key. It returns not encrypted / multipart ETags unmodified. The `Decrypt` function is mainly used when handling SSE-S3 encrypted single-part objects. In particular, the ETag of an SSE-S3 encrypted single-part object needs to be decrypted since S3 clients expect that this ETag is equal to the content MD5. The `ETag.Format` method also covers SSE ETag handling. MinIO encrypts all ETags of SSE single part objects. However, only the ETag of SSE-S3 encrypted single part objects needs to be decrypted. The ETag of an SSE-C or SSE-KMS single part object does not correspond to its content MD5 and can be a random value. The `ETag.Format` function formats an ETag such that it is an AWS S3 compliant ETag. In particular, it returns non-encrypted ETags (single / multipart) unmodified. However, for encrypted ETags it returns the trailing 16 bytes as ETag. For encrypted ETags the last 16 bytes will be a random value. The main purpose of `Format` is to format ETags such that clients accept them as well-formed AWS S3 ETags. It differs from the `String` method since `String` will return string representations for encrypted ETags that are not AWS S3 compliant. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-04-03 13:29:13 -07:00
Shireesh Anjal	7c696e1cb6	Write deployment id to health report at the start (#14673 ) The deployment id was being written to the health report towards the end of the handler. Because of this, if there was a timeout in any of the data fetching, the deployment id was not getting written at all. Upload of such reports fails on SUBNET as deployment id is the unique identifier for a cluster in subnet. Fixed by writing the deployment id at the beginning of the processing.	2022-04-03 13:15:02 -07:00
Aditya Manthramurthy	165d60421d	Add metrics for observing IAM sync operations (#14680 )	2022-04-03 13:08:59 -07:00
Poorna	0e6aedc7ed	Capture cmdline args for inspect API (#14668 ) Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2022-03-31 16:05:43 -07:00
Aditya Manthramurthy	fc9668baa5	Increase IAM refresh rate to every 10 mins (#14661 ) Add timing information for IAM init and refresh	2022-03-30 17:02:59 -07:00
Andreas Auernhammer	ba17d46f15	ListObjectParts: simplify ETag decryption and size adjustment (#14653 ) This commit simplifies the ETag decryption and size adjustment when listing object parts. When listing object parts, MinIO has to decrypt the ETag of all parts if and only if the object resp. the parts is encrypted using SSE-S3. In case of SSE-KMS and SSE-C, MinIO returns a pseudo-random ETag. This is inline with AWS S3 behavior. Further, MinIO has to adjust the size of all encrypted parts due to the encryption overhead. The ListObjectParts does specifically not use the KMS bulk decryption API (`4d2fc530d0`) since the ETags of all parts are encrypted using the same object encryption key. Therefore, MinIO only has to connect to the KMS once, even if there are multiple parts resp. ETags. It can simply reuse the same object encryption key. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-30 15:23:25 -07:00
Krishna Srinivas	bdd816488d	Get the BackendInfo to fill the appropriate struct fields (#14660 )	2022-03-30 10:48:35 -07:00
Krishna Srinivas	36dcfee2f7	Allow decommission of pool even if a drive in it is down (#14656 )	2022-03-29 22:51:31 -07:00
Poorna	4d13ddf6b3	Avoid shadowing error during replication proxy check (#14655 ) Fixes #14652	2022-03-29 10:53:09 -07:00
Poorna	9e25475475	Validate tier manager is initialized in tier Empty() check (#14646 ) Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2022-03-29 10:10:06 -07:00
Andreas Auernhammer	e955aa7f2a	kes: add support for encrypted private keys (#14650 ) This commit adds support for encrypted KES client private keys. Now, it is possible to encrypt the KES client private key (`MINIO_KMS_KES_KEY_FILE`) with a password. For example, KES CLI already supports the creation of encrypted private keys: ``` kes identity new --encrypt --key client.key --cert client.crt MinIO ``` To decrypt an encrypted private key, the password needs to be provided: ``` MINIO_KMS_KES_KEY_PASSWORD=<password> ``` Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-29 09:53:33 -07:00
Harshavardhana	7956ff0313	fix: multiple pool setup return incorrect DeleteMarker metadata (#14642 )	2022-03-27 23:39:50 -07:00
Aditya Manthramurthy	9ff25fb64b	Load IAM in-memory cache using only a single list call (#14640 ) - Increase global IAM refresh interval to 30 minutes - Also print a log after loading IAM subsystem	2022-03-27 18:48:01 -07:00
Andreas Auernhammer	04df69f633	listing: decrypt only SSE-S3 single-part ETags (#14638 ) This commit optimises the ETag decryption when listing objects. When MinIO lists objects, it has to decrypt the ETags of single-part SSE-S3 objects. It does not need to decrypt ETags of - plaintext objects => Their ETag is not encrypted - SSE-C objects => Their ETag is not the content MD5 - SSE-KMS objects => Their ETag is not the content MD5 - multipart objects => Their ETag is not encrypted Hence, MinIO only needs to make a call to the KMS when it needs to decrypt a single-part SSE-S3 object. It can resolve the ETags of all other object types locally. This commit implements the above semantics by processing an object listing in batches. If the batch contains no single-part SSE-S3 object, then no KMS calls will be made. If the batch contains at least one single-part SSE-S3 object we have to make at least one KMS call. Now we first filter all single-part SSE-S3 objects such that we only request the decryption keys for these objects. Once we know which objects resp. ETags require a decryption key, MinIO either uses the KES bulk decryption API (if supported) or decrypts each ETag serially. This commit is a significant improvement compared to the previous listing code. Before, a single non-SSE-S3 object caused MinIO to fall back to a serial ETag decryption. For example, if a batch consisted of 249 SSE-S3 objects and one single SSE-KMS object, MinIO would send 249 requests to the KMS. Now, MinIO will send a single request for exactly those 249 objects and skip the one SSE-KMS object since it can handle its ETag locally. Further, MinIO would request decryption keys for SSE-S3 multipart objects in the past - even though multipart ETags are not encrypted. So, if a bucket contained only multipart SSE-S3 objects, MinIO would make totally unnecessary requests to the KMS. Now, MinIO simply skips these multipart objects since it can handle the ETags locally. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-27 18:34:11 -07:00
Anis Elleuch	908eb57795	Always get the actual object size (#14637 ) In bulk ETag decryption, do not rely on the etag to check if it is encrypted or not to decide if we should set the actual object size in ObjectInfo. The reason is that multipart objects ETags are not encrypted. Always get the actual object size in that case.	2022-03-27 08:54:25 -07:00
Harshavardhana	5cfedcfe33	askDisks for strict quorum to be equal to read quorum (#14623 )	2022-03-25 16:29:45 -07:00
Andreas Auernhammer	4d2fc530d0	add support for SSE-S3 bulk ETag decryption (#14627 ) This commit adds support for bulk ETag decryption for SSE-S3 encrypted objects. If KES supports a bulk decryption API, then MinIO will check whether its policy grants access to this API. If so, MinIO will use a bulk API call instead of sending encrypted ETags serially to KES. Note that MinIO will not use the KES bulk API if its client certificate is an admin identity. MinIO will process object listings in batches. A batch has a configurable size that can be set via `MINIO_KMS_KES_BULK_API_BATCH_SIZE=N`. It defaults to `500`. This env. variable is experimental and may be renamed / removed in the future. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-25 15:01:41 -07:00
Harshavardhana	f046f557fa	request only 1 best version for latest version resolution (#14625 ) ListObjects, ListObjectsV2 calls are being heavily taxed when there are many versions on objects left over from a previous release or ILM was never setup to clean them up. Instead of being absolutely correct at resolving the exact latest version of an object, we simply rely on the top most 1 version and resolve the rest. Once we have obtained the top most "1" version for ListObject, ListObjectsV2 call we break out.	2022-03-25 08:50:07 -07:00
Harshavardhana	401958938d	add load balance properly restClientFromHash() bucket/prefix (#14621 ) spread out resuming further to other nodes	2022-03-25 03:41:31 -07:00
Poorna	566cffe53d	save format.json by default for inspect API (#14620 )	2022-03-25 02:02:17 -07:00
Minio Trusted	a42b576382	keep maximum concurrent operations to 512 (to sustain up to 1024 open fds)	2022-03-23 17:02:04 -07:00
Klaus Post	2ac54e5a7b	ListObjects: Filter lifecycle expired objects (#14606 ) For ListObjects and ListObjectsV2 perform lifecycle checks on all objects before returning. This will filter out objects that are pending lifecycle expiration. Bonus: Cheaper server pool conflict resolution by not converting to FileInfo.	2022-03-22 12:39:45 -07:00
Harshavardhana	8eecdc6d1f	odd stripe sizes should choose (odd+1)/2 to get correct quorum (#14610 )	2022-03-22 12:21:14 -07:00
Klaus Post	50577e2bd2	Allow adjusting request pool both ways (#14609 ) When reloading a dynamic config allow the request pool to scale both ways. Existing requests hold on to the previous pool, so they will pop the elements from that.	2022-03-22 11:28:54 -07:00
Klaus Post	7bc1f986e8	Do not wait for results when canceled (#14607 ) When canceled nobody may be listening for the results. Prevents memory buildup from cancelled requests.	2022-03-22 09:37:01 -07:00
Harshavardhana	d796621ccc	choose smaller default deadline for diagnostics without --full (#14599 )	2022-03-21 23:25:24 -07:00
Harshavardhana	f6113264f4	add detection for GOMAXPROCS < NumCPU	2022-03-21 19:05:10 -07:00
Harshavardhana	a3534a730b	fallback quorum should be "strict" globally if config is not loaded (#14589 )	2022-03-20 17:39:06 -07:00
Harshavardhana	bd6f7b6d83	fix: make decommission restart non-blocking (#14591 ) currently an on-going decommission, during a server restart might block the startup sequence for relatively longer periods, instead start the decommission in background lazily.	2022-03-20 14:46:43 -07:00
Andreas Auernhammer	b0a4beb66a	PutObjectPart: set SSE-KMS headers and truncate ETags. (#14578 ) This commit fixes two bugs in the `PutObjectPartHandler`. First, `PutObjectPart` should return SSE-KMS headers when the object is encrypted using SSE-KMS. Before, this was not the case. Second, the ETag should always be a 16 byte hex string, perhaps followed by a `-X` (where `X` is the number of parts). However, `PutObjectPart` used to return the encrypted ETag in case of SSE-KMS. This leaks MinIO internal etag details through the S3 API. The combination of both bugs causes clients that use SSE-KMS to fail when trying to validate the ETag. Since `PutObjectPart` did not send the SSE-KMS response headers, the response looked like a plaintext `PutObjectPart` response. Hence, the client tries to verify that the ETag is the content-md5 of the part. This could never be the case, since MinIO used to return the encrypted ETag. Therefore, clients behaving as specified by the S3 protocol tried to verify the ETag in a situation they should not. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-19 10:15:12 -07:00
Harshavardhana	01ee49045e	fix: handle race in server setup global CI/CD variable (#14579 )	2022-03-18 18:21:09 -07:00
Harshavardhana	7bd9f821dd	return correct context errors for locking operations (#14569 ) if a context is canceled do not need to return a timeout error instead, return the appropriate error for context canceled.	2022-03-18 15:32:45 -07:00
Klaus Post	61eb9d4e29	Fix listing fallback re-using disks (#14576 ) When more than 2 disks are unavailable for listing, the same disk will be used for fallback. This makes quorum calculations incorrect since the same disk will have multiple entries. This PR keeps track of which fallback disks have been handed out and only ever returns a disk once.	2022-03-18 11:35:27 -07:00
Harshavardhana	43eb5a001c	re-use transport for AdminInfo() call (#14571 ) avoids creating new transport for each `isServerResolvable` request, instead re-use the available global transport and do not try to forcibly close connections to avoid TIME_WAIT build-up on large clusters. Never use httpClient.CloseIdleConnections() since that can have a drastic effect on existing connections on the transport pool. Remove it everywhere.	2022-03-17 16:20:10 -07:00
Klaus Post	c1760fb764	Move apiCalls to front for field alignment (#14568 ) Fixes #14565	2022-03-17 10:57:52 -07:00
Minio Trusted	ffcadcd99e	Revert "Use S3 client for uploads/downloads during perf test (#14553 )" This reverts commit `ff811f594b`. Speedtest is broken; this needs to be fixed more cleanly.	2022-03-16 23:34:49 -07:00
Krishnan Parthasarathi	7b81967a3c	Fix handling of object versions pending purge (#14555 ) - GetObject() with vid should return 405 - GetObject() without vid should return 404 - ListObjects() should ignore this object if this is the "latest" version of the object - ListObjectVersions() should list this object as "DELETE marker" - Remove data parts before sync'ing the version pending purge	2022-03-16 16:59:43 -07:00
Krishna Srinivas	ff811f594b	Use S3 client for uploads/downloads during perf test (#14553 )	2022-03-16 16:58:46 -07:00
Harshavardhana	e3071157f0	allow MakeBucketLocation to work for metaBucket (#14548 ) decommission would fail to start due to failure in MakeBucketLocation() error on .minio.sys/ bucket creation. Allow these special buckets.	2022-03-14 11:25:24 -07:00
Klaus Post	c07af89e48	select: Add ScanRange to CSV&JSON (#14546 ) Implements https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html#AmazonS3-SelectObjectContent-request-ScanRange Fixes #14539	2022-03-14 09:48:36 -07:00
Harshavardhana	9c846106fa	decouple service accounts from root credentials (#14534 ) changing root credentials makes service accounts in-operable, this PR changes the way sessionToken is generated for service accounts. It changes service account behavior to generate sessionToken claims from its own secret instead of using global root credential. Existing credentials will be supported by falling back to verify using root credential. fixes #14530	2022-03-14 09:09:22 -07:00
Harshavardhana	cf94d1f1f1	do not crash readXLMetaNoData - if the `xl.meta` has incorrect content (#14538 ) ``` tmp = buf[want:] ``` Would potentially crash when `buf` is truncated for some reason and does not have the expected bytes, this is of course considered not normal and is an odd situation. But we do not need to crash here instead allow for errors to be returned and let callers handle the errors.	2022-03-14 09:07:46 -07:00
Poorna	f8d6eaaa96	fix: regression from range GET proxy on replicated buckets #14345 (#14532 ) Fixes: #14531	2022-03-11 15:56:49 -08:00
Poorna	75b925c326	Deprecate root disk for disk caching (#14527 ) This PR modifies #14513 to issue a deprecation warning rather than reject settings on startup.	2022-03-10 18:42:44 -08:00
Harshavardhana	91d419ee6c	warn issues about large block I/O performance for Linux older than 4.0.0 (#14524 ) This PR simply adds a warning message when it detects older kernel versions and warns users about potential performance issues on this kernel. The issue can be seen only with parallel I/O across all drives on denser setups such as 90 drives or 45 drives per server configurations.	2022-03-10 17:36:13 -08:00
Harshavardhana	41079f1015	heal: remove blocking healDiskMeta upon startup (#14514 ) This type of code is not necessary, reads of all metadata content at `.minio.sys/config` automatically trigger healing when necessary in the GetObjectNInfo() call-path. Having this code is not useful and this also adds to the overall startup time of MinIO when there are lots of users and policies.	2022-03-10 02:45:14 -08:00
Poorna	712dfa40cd	Add missing site replication hook for clearing sse config (#14512 )	2022-03-10 00:04:34 -08:00
Klaus Post	b890bbfa63	Add local disk health checks (#14447 ) The main goal of this PR is to solve the situation where disks stop responding to operations. This generally causes an FD build-up and eventually will crash the server. This adds detection of hung disks, where calls on disk get stuck. We add functionality to `xlStorageDiskIDCheck` where it keeps track of the number of concurrent requests on a given disk. A total number of 100 operations are allowed. If this limit is reached we will block (but not reject) new requests, but we will monitor the state of the disk. If no requests have been completed or updated within a 15-second window, we mark the disk as offline. Requests that are blocked will be unblocked and return an error as "faulty disk". New requests will be rejected until the disk is marked OK again. Once a disk has been marked faulty, a check will run every 5 seconds that will attempt to write and read back a file. As long as this fails the disk will remain faulty. To prevent lots of long-running requests to mark the disk faulty we implement a callback feature that allows updating the status as parts of these operations are running. We add a reader and writer wrapper that will update the status of each successful read/write operation. This should allow fine enough granularity that a slow, but still operational disk will not reach 15 seconds where 50 operations have not progressed. Note that errors themselves are not enough to mark a disk faulty. A nil (or io.EOF) error will mark a disk as "good". * Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`. * de-couple IsOnline() from disk health tracker The purpose of IsOnline() is to ensure that we reconnect the drive only when the "drive" was - disconnected from network we need to validate if the drive is "correct" and is the same drive which belongs to this server. - drive was replaced we have to format it - we support hot swapping of the drives. 
IsOnline() is not meant for taking the drive offline when it is hung, it is not useful we can let the drive be online instead "return" errors for relevant calls. * return errFaultyDisk for DiskInfo() call Co-authored-by: Harshavardhana <harsha@minio.io> Possible future Improvements: * Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly. * Allow reads/writes to be aborted by the context. * Add usage stats, concurrent count, blocked operations, etc.	2022-03-09 11:38:54 -08:00
Poorna	46ba15ab03	Return MethodNotAllowed if force del on replicated bucket (#14505 )	2022-03-08 14:28:51 -08:00
Poorna	1e39ca39c3	fix: consistent replies for incorrect range requests on replicated buckets (#14345 ) Propagate error from replication proxy target correctly to the client if range GET is unsatisfiable.	2022-03-08 13:58:55 -08:00
Krishnan Parthasarathi	80ef1ae51c	Simplify assembling of tierStats from data-usage (#14504 )	2022-03-08 12:08:29 -08:00
Krishna Srinivas	4d0715d226	Implement netperf for "mc support perf net" (#14397 ) Co-authored-by: Klaus Post <klauspost@gmail.com>	2022-03-08 09:54:38 -08:00
Klaus Post	8a274169da	heal: Fix first entry on dangling (#14495 ) Instead of the first, the last entry was returned pointerizing the range value.	2022-03-08 09:04:20 -08:00
Harshavardhana	5d6f6d8d5b	create missing .minio.sys/config, .minio.sys/buckets during decommission (#14497 )	2022-03-07 16:18:57 -08:00
Anis Elleuch	bacf6156c1	metrics: Avoid crash when fetching tier metrics (#14493 ) Data usage does not always contain tiering info even if the data usage information is valid. Avoid a crash in that case. (e.g. the scanner scanned the namespace, the user enables tiering, prometheus scrapes the server before the scanner gets a chance to update the data usage with new tiering information)	2022-03-07 10:59:32 -08:00
Klaus Post	1d1b213f1f	scanner: Consider preselection bias when selecting for Healing (#14492 ) Healing decisions would align with skipped folder counters. This can lead to files never being selected for heal checks on "clean" paths. Use different hashing methods and take objectHealProbDiv into account when calculating the cycle. Found by @vadmeste	2022-03-07 09:25:53 -08:00
Harshavardhana	92a77cc78e	update pkg v1.1.20 to reload certs in k8s always (#14470 )	2022-03-04 20:34:39 -08:00
Harshavardhana	b0c84e3de7	fix: deleteVersions causing xl.meta to have empty Versions[] slice (#14483 ) This is a side effect of the optimization done in PR #13544, which causes a certain type of delete operation on given object versions to skip the lastVersion indication, which leads to an `xl.meta` where Versions[] slice is empty while the entire file is intact by itself. This PR tries to ensure that such files are visible and deletable by regular means of listing as null 'delete-marker' and also avoid the situation where this potential issue might arise.	2022-03-04 20:01:26 -08:00
Anis Elleuch	bbc914e174	heal: Do not override heal scan mode if it is set (#14476 ) mc admin heal has --scan=deep flag which enforces bitrot checking when doing the healing. Do not force override an existing heal scan option.	2022-03-04 18:25:06 -08:00
Anis Elleuch	3fca4055d2	heal: Re-heal an object when a corruption is found during normal scan (#14482 ) When scanning using normal mode, HealObject() can report an error saying that it found a corrupted part. This doesn't happen when HealObject() is called with bitrot scan flag. However, when this happens, we can still restart HealObject() with the bitrot scan. This is also important because this means the scanner and the new disks healer will not be able to heal an object that doesn't exist in a specific disk and has corruption in another disk. Also without this PR, mc admin heal command without bitrot will report an error.	2022-03-04 18:24:34 -08:00
Harshavardhana	66afa16aed	canceled PUTs throw frivolous logs (#14475 ) remote drives might throw frivolous logs, if the caller canceled the PUT operation in such scenarios there is no reason to log.	2022-03-04 10:31:33 -08:00
Harshavardhana	0e3bafcc54	improve logs, fix banner formatting (#14456 )	2022-03-03 13:21:16 -08:00
Andreas Auernhammer	b48f719b8e	kes: remove unnecessary error conversion (#14459 ) This commit removes some duplicate code that converts KES API errors. This code was added since KES `0.18.0` changed some exported API errors. However, the KES SDK handles this error conversion itself. Therefore, it is not necessary to duplicate this behavior in MinIO. See: `21555fa624/error.go (L94)` Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-03-03 09:42:37 -08:00
Lenin Alevski	289fcbd08c	KES dependency upgrade (#14454 ) - Updating KES dependency to v.0.18.0 - Fixing incompatibility issue when checking for errors during KES key creation Signed-off-by: Lenin Alevski <alevsk.8772@gmail.com>	2022-03-02 23:03:40 -08:00
Harshavardhana	7e803adf13	do not attempt force delete on bucket (#14452 ) caller needs to ask explicitly for force delete otherwise, the force delete might end up deleting an existing bucket with data. fixes #14445	2022-03-02 20:47:53 -08:00
Anis Elleuch	4a15bd8ff8	Return info for DiskInfo when the disk is unformatted (#14427 ) In a distributed setup, a DiskInfo REST call to an unformatted disk returns an error with no disk information, such as the disk endpoint URL, which is unexpected.	2022-03-01 15:06:47 -08:00
Klaus Post	b030ef1aca	tests: Clean up dsync package (#14415 ) Add non-constant timeouts to dsync package. Reduce test runtime by minutes. Hopefully not too aggressive.	2022-03-01 11:14:28 -08:00
Harshavardhana	cc46a99f97	skip object-lock headers without values (#14430 ) metadata headers can have headers without values as per AWS S3 spec however, we need to skip some headers that do not have values that potentially can have empty values set.	2022-03-01 11:04:47 -08:00
Xuehan Xu	becec6cb6b	correct mrf.newSetReconnected invocation's param order (#14426 ) Signed-off-by: xuxuehan <xuxuehan@qianxin.com>	2022-02-28 09:13:19 -08:00
Harshavardhana	b7c90751b0	allow drive tests to respond only drive paths	2022-02-25 18:54:46 -08:00
Harshavardhana	e43cc316ff	remove errCh usage from HealObjects() simplify it (#14414 ) errCh is not needed instead, rely on errs slice to capture and return errors instead. most probably fixes #14247	2022-02-25 12:20:41 -08:00
hellivan	03b35ecdd0	collect correct parentUser for OIDC creds auto expiration (#14400 )	2022-02-24 11:43:15 -08:00
Harshavardhana	c08540c7b7	reject speedtest when there isn't enough disk space available (#14402 ) small setups do not return appropriate errors when speedtest cannot run on small tiny setups, allow the tests to fail appropriately more pro-actively. many users bring toy setups, this PR simply returns an error in such situations.	2022-02-24 09:06:18 -08:00
Shireesh Anjal	3934700a08	Make audit webhook and kafka config dynamic (#14390 )	2022-02-24 09:05:33 -08:00
Harshavardhana	2d78e20120	enable CI environment additionally for MINIO_CI_CD (#14395 ) all CI/CD environments set CI=true this is enough for MinIO to be run inside CI environments, support it.	2022-02-23 16:01:59 -08:00
Harshavardhana	2e6f8bdf19	do not skip healing disks during deletes (#14394 ) healing disks take active I/O it is possible that deleted objects might stay in .trash folder for a really long time until the drive is fully healed. this PR changes it such that we are making sure we purge the active content written to these disks as well.	2022-02-23 14:30:46 -08:00
Shireesh Anjal	25144fedd5	Send deployment id and minio version in http header (#14378 )	2022-02-23 13:36:01 -08:00
Krishnan Parthasarathi	27f64dd9a4	Add support for tier-remove and tier-verify (#14382 ) * Add tier remove support only if it's empty * Add support for tier verify	2022-02-23 13:34:25 -08:00
Harshavardhana	9d7648f02f	reduce unnecessary logging during speedtest (#14387 ) - speedtest logs calls that were canceled spuriously, in situations where it should be ignored. - all errors of interest are always sent back to the client there is no need to log them on the server console. - PUT failures should negate the increments such that GET is not attempted on unsuccessful calls. - do not attempt MRF on speedtest objects.	2022-02-23 11:59:13 -08:00
Poorna	1ef8babfef	cache: improve error reported for atime check (#14384 )	2022-02-23 11:57:06 -08:00
Poorna	4ea7bf0510	Use custom transport for site replication (#14391 ) Also, ensure that tiering uses a different instance of custom transport	2022-02-23 11:50:40 -08:00
Anis Elleuch	5dcf1d13a9	ci: Always set disks as non root disks (#14389 ) In the testing mode, reformatting disks will fail because the healing code will complain if one disk is in root mode. This commit will automatically set all disks as non-root if MINIO_CI_CD is set.	2022-02-23 10:11:33 -08:00
Shireesh Anjal	94d37d05e5	Apply dynamic config at sub-system level (#14369 ) Currently, when applying any dynamic config, the system reloads and re-applies the config of all the dynamic sub-systems. This PR refactors the code in such a way that changing config of a given dynamic sub-system will work on only that sub-system.	2022-02-22 10:59:28 -08:00
Harshavardhana	0cbdc458c5	fix: do not reload disk format.json on a reconnected disk (#14351 ) An onlineDisk means it's a valid disk but it may be a re-connected disk, this PR verifies that based on LastConn() to only trigger MRF. Current code would again re-load the disk 'format.json' which is not necessary and perhaps an unnecessary call. A potential side effect of this is closing perfectly online disks and getting re-replaced by reloading 'format.json'. This PR tries to avoid this situation by making sure MRF is triggered but not reloading 'format.json' because of MRF.	2022-02-21 15:51:54 -08:00
Harshavardhana	65b1a4282e	fix: console logger regression with dynamic logger webhook registration (#14346 ) fixes a regression from #14289	2022-02-17 17:50:10 -08:00
Harshavardhana	af3dc25dfe	align 32bit integers with atomic values in structs (#14344 ) fixes #14341	2022-02-17 15:22:26 -08:00
Krishnan Parthasarathi	5a0c0079a1	Don't add free-version on restore-object (#14340 )	2022-02-17 15:05:19 -08:00
Harshavardhana	af8f563ed3	allow clearing FIFO config as fallback (#14338 ) FIFO is already removed; users who upgrade are allowed to clear their configs.	2022-02-17 12:49:46 -08:00
Poorna	93af4a4864	Handle non existent kms key correctly (#14329 ) - in PutBucketEncryption API - admin APIs for `mc admin KMS key [create\|info]` - PutObject API when invalid KMS key is specified	2022-02-17 11:36:14 -08:00
Shireesh Anjal	28f188e3ef	Make logger webhook config dynamic (#14289 ) It should not be required to restart the server after setting the logger webhook config.	2022-02-17 11:11:15 -08:00
Harshavardhana	d756da41b9	fix: print gateway banner on removal notice	2022-02-16 20:34:47 -08:00
Krishnan Parthasarathi	cdab4a3b85	Update hourly tier-stats only on successful tiering (#14330 )	2022-02-16 17:29:12 -08:00
Klaus Post	b88c57ba93	Add fgprof profiles (#14321 ) https://github.com/felixge/fgprof#rocket-fgprof---the-full-go-profiler	2022-02-16 12:00:10 -08:00
Klaus Post	60cd513a33	Fix leaked healing goroutines (#14322 ) Only the first `listAndHeal` would ever be able to write on errCh, blocking all others infinitely. Instead read all errors but return the first non-nil, if any. The intention appears to be that this should cancel on any error, so that part is kept. Regression from #13990	2022-02-16 08:40:18 -08:00
Harshavardhana	03a6e8aee2	fix: creating steep directory structure on trash folder (#14314 ) weird directory structures get created on the '.trash' folder upon server restarts, this PR fixes this.	2022-02-15 16:34:03 -08:00
Anis Elleuch	4afbb89774	nas: Clean stale background appended files (#14295 ) When more than one gateway reads and writes from the same mount point and there is a load balancer pointing to those gateways, each gateway will try to create its own temporary append file but fails to clear it later when not needed. This commit creates a routine that checks all upload IDs saved in the multipart directory and removes any stale entry with the same upload id in memory and in the temporary background append folder as well.	2022-02-15 09:25:47 -08:00
Klaus Post	5ec57a9533	Add GetObject gzip option (#14226 ) Enabled with `mc admin config set alias/ api gzip_objects=on` Standard filtering applies (1K response minimum, not compressed content type, not range request, gzip accepted by client).	2022-02-14 09:19:01 -08:00
Anis Elleuch	1f92fc3fc0	Always check for root disks unless MINIO_CI_CD is set (#14232 ) The current code considers a pool with all root disks to be part of a testing environment even if there are other pools with mounted disks. This will result in illegitimate writes to root disks. Fix this by simplifying the logic: require MINIO_CI_CD in order to skip root disk check.	2022-02-13 15:42:07 -08:00
Harshavardhana	fad3d66093	parallelize background cleanup on local disks across sets (#14290 )	2022-02-11 14:22:48 -08:00
Poorna	ed3418c046	Refactor replication resync to be an active process (#14266 ) When resync is triggered, walk the bucket namespace and resync objects that are unreplicated. This PR also adds an API to report resync progress.	2022-02-10 10:16:52 -08:00
Anis Elleuch	71bab74148	Fix adding bucket forwarder handler in server mode (#14288 ) MinIO configuration is loaded after the initialization of the server handlers, which will miss the initialization of the bucket forwarder handler. Though the federation is deprecated, let's fix this for the time being.	2022-02-10 08:49:36 -08:00
Anis Elleuch	661ea57907	restore: Add quotes to some fields in x-amz-restore header (#14281 ) S3 spec returns x-amz-restore header in HEAD/GET object with the following format: ``` x-amz-restore: ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT" ``` This commit adds quotes as the current code does not support it. It will also support the old format saved in the disk (in xl.meta) for backward compatibility.	2022-02-09 13:17:41 -08:00
Anis Elleuch	1f18efb0ba	gateway: Active bucket forwarding handler (#14277 ) A regression removed support of federation in the gateway mode. Enable it again. Federation is deprecated for a while but let's fix this for the time being.	2022-02-09 09:31:47 -08:00
Daniel	8ae46bce93	fix error logs being omitted because retryCount never exceeds 10 (#14268 )	2022-02-09 03:14:22 -08:00
Harshavardhana	f19a414e09	fix: allow dangling objects to be purged properly deleteMultipleObjects() (#14273 ) Deleting bulk objects had an issue since the relevant versionID is not passed through the layers to ensure that the dangling object purge actually works cleanly. This is a continuation of quorum related error returned by multi-object delete API from #14248 This PR ensures that we pass down correct information as well as extend the scope of dangling object detection.	2022-02-08 20:08:23 -08:00
Krishnan Parthasarathi	0ee2933234	Export tier metrics via Prometheus (#13413 ) e.g ``` minio_cluster_ilm_transitioned_bytes{server="minio3:9000",tier="S3TIER-1"} 1.36317772e+08 minio_cluster_ilm_transitioned_bytes{server="minio3:9000",tier="S3TIER-2"} 2892 minio_cluster_ilm_transitioned_bytes{server="minio3:9000",tier="STANDARD"} 1.3631488e+08 minio_cluster_ilm_transitioned_objects{server="minio3:9000",tier="S3TIER-1"} 1 minio_cluster_ilm_transitioned_objects{server="minio3:9000",tier="S3TIER-2"} 0 minio_cluster_ilm_transitioned_objects{server="minio3:9000",tier="STANDARD"} 1 minio_cluster_ilm_transitioned_versions{server="minio3:9000",tier="S3TIER-1"} 3 minio_cluster_ilm_transitioned_versions{server="minio3:9000",tier="S3TIER-2"} 2 minio_cluster_ilm_transitioned_versions{server="minio3:9000",tier="STANDARD"} 1 ```	2022-02-08 12:45:28 -08:00
Shireesh Anjal	9890f579f8	Add subsystem level validation on `config set` (#14269 ) When setting a config of a particular sub-system, validate the existing config and notification targets of only that sub-system, so that existing errors related to one sub-system (e.g. notification target offline) do not result in errors for other sub-systems.	2022-02-08 10:36:41 -08:00
Anis Elleuch	2ee337ead5	prometheus: Add incoming requests metrics since last scrape (#14261 ) Some users running MinIO claim that their system became slow. One way to investigate is to look at the Prometheus history of the number of requests reaching the server. The existing current S3 requests metric is not enough because it can increase if the system really becomes slow, due to disk issues for example.	2022-02-07 16:30:14 -08:00
Harshavardhana	3c87e1e60d	fix: rename some function names to avoid confusion (#14262 )	2022-02-07 11:49:07 -08:00
Harshavardhana	0cac868a36	speed-up startup time, do not block on ListBuckets() (#14240 ) Bonus fixes #13816	2022-02-07 10:39:57 -08:00
Harshavardhana	186c477f3c	init console server after server config is initialized fixes #14259	2022-02-07 00:17:33 -08:00
Harshavardhana	6123377e66	speedup getFormatErasureInQuorum use driveCount (#14239 ) startup speed-up, currently getFormatErasureInQuorum() would spend up to 2-3secs when there are 3000+ drives for example in a setup, simplify this implementation to use drive counts.	2022-02-04 12:21:21 -08:00
Harshavardhana	0256dae657	fix: quorum requirement for DeleteMarkers and parity upgraded objects (#14248 ) DeleteMarkers do not have a default quorum, i.e. it is possible that DeleteMarkers were created with n/2+1 quorum as well. To satisfy such situations, we need to make sure delete markers only expect n/2 read quorum. Additionally we should also look at additional metadata on the actual objects that might have been "erasure" upgraded with new parity when disks are down. In such a scenario do not default to the standard storage class parity, instead use the parityBlocks present on the FileInfo to ensure that we are dealing with the correct quorum for READs and DELETEs.	2022-02-04 02:47:36 -08:00
Harshavardhana	84b121bbe1	return error with empty x-amz-copy-source-range headers (#14249 ) fixes #14246	2022-02-03 16:58:27 -08:00
Harshavardhana	01e550a9be	ignore unreadable metrics on certain closed systems (#14234 ) fixes #14233	2022-02-03 09:45:12 -08:00
Poorna	63a2e0bab6	Remove notification from NotificationSys on bucket deletion (#14236 )	2022-02-02 17:11:56 -08:00
Harshavardhana	24657859a8	when o_direct is disabled do not attempt fadvise call (#14230 )	2022-02-02 08:54:52 -08:00
Sidhartha Mani	d7df6bc738	add support for speedtest drive (#14182 )	2022-02-01 22:38:05 -08:00
Poorna	a4e1de93a7	Add API for removing site(s) from site replication (#14104 )	2022-02-01 17:26:09 -08:00
Klaus Post	067d21d0f2	fs: Retry listing if no marker (#14221 ) Retry listings, when no next marker is returned and the result isn't truncated. This can happen when an object is queued, but no info can be fetched. Fixes #14190	2022-02-01 10:00:14 -08:00
Shireesh Anjal	3882da6ac5	Add subnet proxy config (#14225 ) Will store the HTTP(S) proxy URL to use for connecting to SUBNET.	2022-02-01 09:52:38 -08:00
Anis Elleuch	127e8bf3b6	heal: Avoid printing repetitive error to heal a root disk (#14220 ) The healing code repeatedly tries to heal a root disk when it is empty. The reason is that connectEndpoint() returns errUnformattedDisk even if the disk is a root disk. Changing that to return another error will avoid queueing the disk to the healing code in each connect disks iteration.	2022-01-31 17:28:20 -08:00
Harshavardhana	74faed166a	Add quota usage as part of prometheus metrics (#14222 ) Bonus: pass caller context when needed to all bucket metadata handling calls.	2022-01-31 17:27:43 -08:00
Harshavardhana	dbd05d6e82	remove FIFO bucket quota, use ILM expiration instead (#14206 )	2022-01-31 11:07:04 -08:00
Harshavardhana	b5d35c7e09	ignore disk metrics for single drive mode (#14212 ) fixes #14211	2022-01-31 00:44:26 -08:00
Poorna	0f88cdc80e	Return all stats in SiteReplicationStatus API if options unset (#14207 )	2022-01-28 21:19:38 -08:00
Poorna	38e3c7a8f7	Added filters for SiteReplicationStatus API to support new UI changes (#14177 )	2022-01-28 15:37:55 -08:00
Poorna	a4be47d7ad	Validate config before saving changes after config reset (#14203 )	2022-01-27 18:28:16 -08:00
Harshavardhana	aaea94a48d	update quorum requirement to list all objects (#14201 ) some upgraded objects might not get listed due to different quorum ratios across objects. make sure to list all objects that satisfy the maximum possible quorum.	2022-01-27 17:00:15 -08:00
Aditya Manthramurthy	c3d9c45f58	Ensure that AssumeRole calls are sent to Audit log (#14202 ) When authentication fails MinIO was not sending out an Audit log event for this STS call	2022-01-27 16:17:11 -08:00
Klaus Post	a2a48cc065	Optimize read locker cleanup (#14200 ) When objects hold a lot of read locks cleanup time grows exponentially. ``` BEFORE: Unable to complete tests. AFTER: === RUN Test_localLocker_expireOldLocksExpire/100-locks/1-read local-locker_test.go:298: Scan Took: 0s. Left: 100/100 local-locker_test.go:317: Expire 50% took: 0s. Left: 44/44 local-locker_test.go:331: Expire rest took: 0s. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/100-locks/100-read local-locker_test.go:298: Scan Took: 0s. Left: 10000/100 local-locker_test.go:317: Expire 50% took: 1ms. Left: 5000/100 local-locker_test.go:331: Expire rest took: 1ms. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/100-locks/1000-read local-locker_test.go:298: Scan Took: 2ms. Left: 100000/100 local-locker_test.go:317: Expire 50% took: 55ms. Left: 50038/100 local-locker_test.go:331: Expire rest took: 29ms. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/10000-locks/1-read local-locker_test.go:298: Scan Took: 1ms. Left: 10000/10000 local-locker_test.go:317: Expire 50% took: 2ms. Left: 5019/5019 local-locker_test.go:331: Expire rest took: 2ms. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/10000-locks/100-read local-locker_test.go:298: Scan Took: 23ms. Left: 1000000/10000 local-locker_test.go:317: Expire 50% took: 160ms. Left: 499798/10000 local-locker_test.go:331: Expire rest took: 138ms. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/10000-locks/1000-read local-locker_test.go:298: Scan Took: 200ms. Left: 10000000/10000 local-locker_test.go:317: Expire 50% took: 5.888s. Left: 5000196/10000 local-locker_test.go:331: Expire rest took: 3.417s. Left: 0/0 === RUN Test_localLocker_expireOldLocksExpire/1000000-locks/1-read local-locker_test.go:298: Scan Took: 133ms. Left: 1000000/1000000 local-locker_test.go:317: Expire 50% took: 348ms. Left: 500255/500255 local-locker_test.go:331: Expire rest took: 307ms. Left: 0/0 ```	2022-01-27 14:10:57 -08:00
Harshavardhana	cf407f7176	do not expect 'speedtest' to be a bucket (#14199 ) fixes #14196	2022-01-27 08:13:03 -08:00
Harshavardhana	d6dd17a483	make sure to pass groups for all credentials while verifying policies (#14193 ) fixes #14180	2022-01-26 21:53:36 -08:00
Aditya Manthramurthy	7dfa565d00	Identity LDAP: Allow multiple search base DNs (#14191 ) This change allows the MinIO server to lookup users in different directory sub-trees by allowing specification of multiple search bases separated by semicolons.	2022-01-26 15:05:59 -08:00
Krishnan Parthasarathi	d2e5f01542	feat: maintain in-memory tier stats for the last 24hrs (#13782 )	2022-01-26 14:33:10 -08:00
yfanswer	f4e373e0d2	de-couple cache completeMultipartUpload with caller context (#14181 )	2022-01-26 11:55:58 -08:00
Harshavardhana	57118919d2	cached diskIDs are not needed for scanner healing (#14170 ) This PR removes an unnecessary state that gets passed around for DiskIDs, which is not necessary since each disk knows exactly which pool and which set it belongs to on a running system. Currently cached DiskIDs won't work properly because it always ends up skipping offline disks and never runs healing when disks are offline, as it expects all the cached diskIDs to be present always. This also sort of made things inflexible in terms of, perhaps, a new diskID for `format.json`. (however this is not a big issue) This is an unnecessary requirement that healing via scanner needs all drives to be online, instead healing should trigger even when partial nodes and drives are available; this ensures that we keep the SLA intact on the objects when disks are offline for a prolonged period of time.	2022-01-26 08:34:56 -08:00
Klaus Post	7db05a80dd	locking: Fix wrong map id (#14184 ) Wrong resource is being fetched, since idx is incremented, but mapID is reused. Regression caused by #13454 - that part didn't optimize anything anyway.	2022-01-26 08:34:09 -08:00
Anis Elleuch	45a99c3fd3	publish storage API latency through node metrics (#14117 ) Publish storage functions latency to help compare the performance of different disks in a single deployment. e.g.: ``` minio_node_disk_latency_us{api="storage.WalkDir",disk="/tmp/xl/1",server="localhost:9001"} 226 minio_node_disk_latency_us{api="storage.WalkDir",disk="/tmp/xl/2",server="localhost:9002"} 1180 minio_node_disk_latency_us{api="storage.WalkDir",disk="/tmp/xl/3",server="localhost:9003"} 1183 minio_node_disk_latency_us{api="storage.WalkDir",disk="/tmp/xl/4",server="localhost:9004"} 1625 ```	2022-01-25 16:31:44 -08:00
Harshavardhana	b68f0cbde4	ignore remote disks with diskID empty as offline (#14168 ) concurrent loading of erasure sets can now expose a situation in a distributed setup that might return diskID as empty, treat such disks as offline.	2022-01-24 19:40:02 -08:00
Krishnan Parthasarathi	ebc3627c73	further improvements to newXLStorage (#14166 ) - create internal erasure volumes only if the disk is unformatted - return a copy of format data in xlStorage.ReadAll - parse env vars only once, to be re-used by xl-storage	2022-01-24 17:09:12 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis once read during init is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json' only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Harshavardhana	f6d13f57bb	fix: correct parentUser lookup for OIDC auto expiration (#14154 ) fixes #14026 This is a regression from #13884	2022-01-22 16:36:11 -08:00
Poorna	48da4aeee0	Add API for removing site(s) from site replication (#14022 )	2022-01-21 08:48:21 -08:00
Harshavardhana	7f214a0e46	use dnscache resolver for resolving command line endpoints (#14135 ) this helps in caching the resolved values early on, avoids causing further resolution for individual nodes when object layer comes online. this can speed up our startup time during upgrades etc. by an order of magnitude. additional changes in connectLoadInitFormats() parallelize all calls that might be potentially blocking.	2022-01-20 13:03:15 -08:00
Klaus Post	e1a0a1e73c	fs: Return prefix as listing marker if no objects (#14143 ) Fixes #14132	2022-01-20 10:55:18 -08:00
Harshavardhana	9d588319dd	support site replication to replicate IAM users,groups (#14128 ) - Site replication was missing replicating users, groups when an empty site was added. - Add site replication for groups and users when they are disabled and enabled. - Add support for replicating bucket quota config.	2022-01-19 20:02:24 -08:00
Klaus Post	0012ca8ca5	Fix inconsistent metadata after healing (#14125 ) When calculating signatures empty part ETags were not discarded, leading to a different signature compared to freshly created ones. This would mean that after a heal the signature of the healed metadata would be different. Fixing the calculation of signature will make these consistent. Furthermore, when there were inconsistent entries with zero version ID and the same mod times but different signatures, the one with the lowest signature would be picked for quorum check. Since this is 50/50, we fall back to a simple quorum count on all signatures. Each of these fixes by themselves will lead to quorum. Tests were added for regressions and expected outcomes.	2022-01-19 10:48:00 -08:00
Poorna	288e276abe	Specify tags in options while selecting replication targets (#14126 ) When the replication rule is based on tag matches, the replication process should pick up targets matching the tags specified in the replication rule. Fixing regression due to #12880	2022-01-19 10:45:42 -08:00
Jarbitz	f22e745514	fix: ListBucketUsers comment doc (#14129 )	2022-01-19 10:45:13 -08:00
Krishnan Parthasarathi	070c31eac5	Wait for updates collector when disk.NSScanner returns error (#14127 )	2022-01-19 00:46:43 -08:00
Harshavardhana	70e1cbda21	allow disabling O_DIRECT in certain environments for reads (#14115 ) repeated reads on single large objects in HPC-like workloads need the following option to disable O_DIRECT for a more effective usage of the kernel page-cache. However this option should be used in very specific situations only, and shouldn't be enabled on all servers. NVMe servers always benefit from keeping O_DIRECT on.	2022-01-17 08:34:14 -08:00
Harshavardhana	60f2df54e0	Add envVars for CLI arguments (#14114 ) fixes #14107	2022-01-15 16:20:02 -08:00
Harshavardhana	ba708f51f2	fix: copyMetrics to avoid map references elsewhere (#14113 ) map labels might have been referenced elsewhere, this can lead to concurrent access at lower layers. avoid this by copying the information while concurrently serving the metrics.	2022-01-14 16:48:19 -08:00
Harshavardhana	0df31f63ab	reject changing pools when there are pending decommissions in-progress (#14102 ) do not allow mutation to pool command line when there are unfinished decommissions in place, disallow such scenarios to avoid user mistakes. also add testcases to cover all relevant scenarios.	2022-01-14 10:32:35 -08:00
Klaus Post	64d4da5a37	Add Put input readahead (#14084 ) When reading input for PutObject or PutObjectPart add a readahead buffer for big inputs. This will make network reads+hashing run separately, async with erasure coding and writes. This will reduce overall latency in distributed setups where the input is from upstream and writes go to other servers. We will read at 2 buffers ahead, meaning one will always be ready/waiting and one is currently being read from. This improves PutObject and PutObjectParts for these cases.	2022-01-14 10:01:25 -08:00
Harshavardhana	7aec38a73e	Simplify the messaging for internode versions (#14103 ) provide a cleaner message instead of cryptic logs, also provide the relevant link on the recommended way to upgrade.	2022-01-13 17:25:08 -08:00
Klaus Post	a2fd8caa69	Ignore version not found in deleteVersions (#14093 ) When deleting multiple versions it "gives" up with an errFileVersionNotFound if a version cannot be found. This effectively skips deleting other versions sent in the same request. This can happen on inconsistent objects. We should ignore errFileVersionNotFound and continue with others. We already ignore these at the caller level, this PR is continuation of `54a9877`	2022-01-13 14:28:07 -08:00
Harshavardhana	f546636c52	fix: use renameAll instead of deleteObject() for purging temporary files (#14096 ) This PR simplifies a few things - Multipart parts are renamed, and upon failure are unrenamed(); keep this multipart specific behavior, it is needed and works fine. - AbortMultipart should blindly delete once lock is acquired instead of re-reading metadata and calculating quorum, abort is a delete() operation and client has no business looking for errors on this. - Skip Access() calls to folders that are operating on `.minio.sys/multipart` folder as well.	2022-01-13 11:07:41 -08:00
Harshavardhana	38ccc4f672	fix: make sure to avoid calling RenameData() on disconnected disks. (#14094 ) Large clusters with multiple sets, or multi-pool setups at times might fail and report unexpected "file not found" errors. This can become a problem during startup sequence when some files need to be created at multiple locations. - This PR ensures that we nil the erasure writers such that they are skipped in RenameData() call. - RenameData() doesn't need "Access()" calls for `.minio.sys` folders, they always exist. - Make sure PutObject() never returns ObjectNotFound{} for any errors, make sure it always returns "WriteQuorum" when renameData() fails with ObjectNotFound{}. Return appropriate errors for all other cases.	2022-01-12 18:49:01 -08:00
Harshavardhana	cc3f139d1f	replication: attempt abort multipart-upload at max 3 times on remote (#14087 ) this is mainly an attempt to relinquish space on the remote site, if this still doesn't do it we give up and let the admin know with a log message.	2022-01-11 22:32:29 -08:00
Harshavardhana	d50442da01	fix: simplify usage calculation and progress (#14086 )	2022-01-11 18:48:43 -08:00
Harshavardhana	404b05a44c	fix: ignore drained pool in Healing, hold lock additionally (#14080 )	2022-01-11 12:27:47 -08:00
Harshavardhana	3d7c1ad31d	ignore configNotFound error in AccountInfo() (#14082 ) fixes #14081	2022-01-11 08:43:18 -08:00
yinhen	d300e775a6	Avoid reconnect of disk during startup sequence (#14070 )	2022-01-10 23:33:58 -08:00
Harshavardhana	7ee2d1c339	fix: when healing, log the path when we give up (#14079 )	2022-01-10 21:22:17 -08:00
Poorna	54a98773f8	fix: replication of tag removal (#14056 ) Currently tag removal leaves replication state as `PENDING` because the `HEAD` api returns just a tag count but not the actual tags, and this is treated as a no-op	2022-01-10 19:06:10 -08:00
Harshavardhana	737a3f0bad	fix: decommission bugfixes found during migration of .minio.sys/config (#14078 )	2022-01-10 17:26:00 -08:00
Harshavardhana	3bd9636a5b	do not remove Sid from svcaccount policies (#14064 ) fixes #13905	2022-01-10 14:26:26 -08:00
Harshavardhana	76b21de0c6	feat: decommission feature for pools (#14012 ) ``` λ mc admin decommission start alias/ http://minio{1...2}/data{1...4} ``` ``` λ mc admin decommission status alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Active │ │ 2nd │ http://minio{3...4}/data{1...4} │ 329 GiB (used) / 421 GiB (total) │ Active │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────┘ ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} Progress: ===================> [1GiB/sec] [15%] [4TiB/50TiB] Time Remaining: 4 hours (started 3 hours ago) ``` ``` λ mc admin decommission status alias/ http://minio{1...2}/data{1...4} ERROR: This pool is not scheduled for decommissioning currently. ``` ``` λ mc admin decommission cancel alias/ ┌─────┬─────────────────────────────────┬──────────────────────────────────┬──────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining │ └─────┴─────────────────────────────────┴──────────────────────────────────┴──────────┘ ``` > NOTE: Canceled decommission will not make the pool active again, since we might have > Potentially partial duplicate content on the other pools, to avoid this scenario be > very sure to start decommissioning as a planned activity. ``` λ mc admin decommission cancel alias/ http://minio{1...2}/data{1...4} ┌─────┬─────────────────────────────────┬──────────────────────────────────┬────────────────────┐ │ ID │ Pools │ Capacity │ Status │ │ 1st │ http://minio{1...2}/data{1...4} │ 439 GiB (used) / 561 GiB (total) │ Draining(Canceled) │ └─────┴─────────────────────────────────┴──────────────────────────────────┴────────────────────┘ ```	2022-01-10 09:07:49 -08:00
Harshavardhana	b7c5e45fff	heal: isObjectDangling should return false when it cannot decide (#14053 ) In a multi-pool setup when disks are coming up, or in a single pool setup let's say with 100's of erasure sets with a slow network. It's possible when healing is attempted on `.minio.sys/config` folder, it can lead to healing unexpectedly deleting some policy files as dangling due to a mistake in understanding when `isObjectDangling` is considered to be 'true'. This issue happened in commit `30135eed86` when we assumed the validMeta with empty ErasureInfo is considered to be fully dangling. This implementation issue gets exposed when the server is starting up. This is most easily seen with multiple-pool setups because of the disconnected fashion pools that come up. The decision to purge the object as dangling is taken incorrectly prior to the correct state being achieved on each pool, when the corresponding drive let's say returns 'errDiskNotFound', a 'delete' is triggered. At this point, the 'drive' comes online because this is part of the startup sequence as drives can come online lazily. This kind of situation exists because we allow (totalDisks/2) number of drives to be online when the server is being restarted. Implementation made an incorrect assumption here leading to policies getting deleted. Added tests to capture the implementation requirements.	2022-01-07 19:11:54 -08:00
Aditya Manthramurthy	0a224654c2	fix: propagation of service accounts for site replication (#14054 ) - Only non-root-owned service accounts are replicated for now. - Add integration tests for OIDC with site replication	2022-01-07 17:41:43 -08:00
Aditya Manthramurthy	1981fe2072	Add internal IDP and OIDC users support for site-replication (#14041 ) - This allows site-replication to be configured when using OpenID or the internal IDentity Provider. - Internal IDP IAM users and groups will now be replicated to all members of the set of replicated sites. - When using OpenID as the external identity provider, STS and service accounts are replicated. - Currently this change dis-allows root service accounts from being replicated (TODO: discuss security implications).	2022-01-06 15:52:43 -08:00
Minio Trusted	76877eb6fa	move gofumpt to golang-ci	2022-01-06 13:08:21 -08:00
Klaus Post	3d66d053c7	Add small client TLS PSK cache (#14039 )	2022-01-06 11:34:02 -08:00
Klaus Post	0e31cff762	fix: DeleteMultipleObjects to finish even if cancelled + concurrent sets (#14038 ) * Process sets concurrently. * Disconnect context from request. * Insert context cancellation checks. * errFileNotFound and errFileVersionNotFound are ok, unless creating delete markers.	2022-01-06 10:47:49 -08:00
Shireesh Anjal	c27110e37d	Add timeinfo to health data (#14013 ) Capture RoundtripDuration to figure out NTP issues in subnet health analyzer.	2022-01-06 01:51:10 -08:00
Harshavardhana	89441a22aa	enforceRetentionForDeletion should return false early for delete-marker (#14033 )	2022-01-05 17:05:28 -08:00
Poorna	4d39fd4165	Add API for cluster replication status visibility (#13885 )	2022-01-05 02:44:08 -08:00
Harshavardhana	001b77e7e1	use readConfig/saveConfig to simplify I/O on usage/tracker info (#14019 )	2022-01-03 10:22:58 -08:00
Harshavardhana	a60ac7ca17	fix: audit log to support object names in multipleObjectNames() handler (#14017 )	2022-01-03 01:28:52 -08:00
Harshavardhana	42ba0da6b0	fix: initialize new drwMutex for each attempt in 'for {' loop. (#14009 ) It is possible that GetLock() call remembers a previously failed releaseAll() when there are networking issues, now this state can have potential side effects. This PR tries to avoid this side effect by making sure to initialize NewNSLock() for each GetLock() attempt made to avoid any prior state in the memory that can interfere with the new lock grants.	2022-01-02 09:15:34 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	79df2c7ce7	correctly calculate read quorum based on the available fileInfo (#14000 ) The current usage of assuming `default` parity of `4` is not correct for all objects stored on MinIO, objects in .minio.sys have maximum parity, healing won't trigger on these objects due to incorrect verification of quorum.	2021-12-28 15:33:03 -08:00
Harshavardhana	866a95de38	fix: choose appropriate quorum for a given erasure set (#13998 ) multiObject delete should honor expected quorum	2021-12-28 12:41:52 -08:00
Minio Trusted	bb97eafa82	madmin-go v1.1.23 and pkg v1.1.11	2021-12-26 23:23:18 -08:00
Harshavardhana	c980804514	trim values from environment files (#13991 ) trim values to remove any spaces, newlines from the files while importing credentials and other values.	2021-12-25 22:02:54 -08:00
Harshavardhana	b883803b21	fix: healing across pools removing dangling objects (#13990 ) adds other simplifications to the code when running namespace heals across pools.	2021-12-25 09:01:44 -08:00
Harshavardhana	7e3a7d7044	add healing for invalid shards by skipping the blocks (#13978 ) Built on top of #13945, now we need to simply skip the shards and it's automated.	2021-12-23 23:01:46 -08:00
Aditya Manthramurthy	5a96cbbeaa	Fix user privilege escalation bug (#13976 ) The AddUser() API endpoint was accepting a policy field. This API is used to update a user's secret key and account status, and allows a regular user to update their own secret key. The policy update is also applied though does not appear to be used by any existing client-side functionality. This fix changes the accepted request body type and removes the ability to apply policy changes as that is possible via the policy set API. NOTE: Changing passwords can be disabled as a workaround for this issue by adding an explicit "Deny" rule to disable the API for users.	2021-12-23 09:21:21 -08:00
Harshavardhana	54ec0a1308	add configurable delta for skipping shards (#13967 ) This PR is an attempt to make this configurable as not all situations have same level of tolerable delta, i.e disks are replaced days apart or even hours. There is also a possibility that nodes have drifted in time, when NTP is not configured on the system.	2021-12-22 11:43:01 -08:00
Harshavardhana	1cf726348f	return meaningful error for disabled users (#13968 ) fixes #13958	2021-12-22 11:40:21 -08:00
Harshavardhana	0e3037631f	skip inconsistent shards if possible (#13945 ) data shards were wrong due to a healing bug reported in #13803 mainly with unaligned object sizes. This PR is an attempt to automatically avoid these shards, with available information about the `xl.meta` and actual disk mtime.	2021-12-21 10:08:26 -08:00
Aditya Manthramurthy	6fbf4f96b6	Move last remaining IAM notification calls into IAMSys methods (#13941 )	2021-12-21 02:16:50 -08:00
Aditya Manthramurthy	526e10a2e0	Fix regression in STS permissions via group in internal IDP (#13955 ) - When using MinIO's internal IDP, STS credential permissions did not check the groups of a user. - Also fix bug in policy checking in AccountInfo call	2021-12-20 14:07:16 -08:00
Harshavardhana	499872f31d	Add configurable channel queue_size for audit/logger webhook targets (#13819 ) Also log all the missed events and logs instead of silently swallowing the events. Bonus: Extend the logger webhook to support mTLS similar to audit webhook target.	2021-12-20 13:16:53 -08:00
Anis Elleuch	5cc16e098c	env: Remove quotes when parsing a config env file (#13953 ) The code parsing the config environment file does not remove quotes of environment variables values. This commit adds this capability.	2021-12-20 13:13:06 -08:00
Aditya Manthramurthy	1f4e0bd17c	fix: access for root user's STS credential (#13947 ) add a test to cover this case	2021-12-19 23:05:20 -08:00
Aditya Manthramurthy	997e808088	fix: race in bucket replication stats (#13942 ) - r.ulock was not locked when r.UsageCache was being modified Bonus: - simplify code by removing some unnecessary clone methods - we can do this because go arrays are values (not pointers/references) that are automatically copied on assignment. - remove some unnecessary map allocation calls	2021-12-17 15:33:13 -08:00
Shireesh Anjal	13441ad0f8	Add IsKubernetes and IsDocker to health data (#13936 )	2021-12-17 14:46:54 -08:00
Harshavardhana	aa508591c1	cache only metrics served from the disks (#13940 ) do not need to cache in-memory instant metrics	2021-12-17 11:40:09 -08:00
Harshavardhana	818f0201fc	re-implement prometheus metrics endpoint to be simpler (#13922 ) data-structures were repeatedly initialized this causes GC pressure, instead re-use the collectors. Initialize collectors in `init()`, also make sure to honor the cache semantics for performance requirements. Avoid a global map and a global lock for metrics lookup instead let them all be lock-free unless the cache is being invalidated.	2021-12-17 10:11:04 -08:00
Aditya Manthramurthy	890f43ffa5	Map policy to parent for STS (#13884 ) When STS credentials are created for a user, a unique (hopefully stable) parent user value exists for the credential, which corresponds to the user for whom the credentials are created. The access policy is mapped to this parent-user and is persisted. This helps ensure that all STS credentials of a user have the same policy assignment at all times. Before this change, for an OIDC STS credential, when the policy claim changes in the provider (when not using RoleARNs), the change would not take effect on existing credentials, but only on new ones. To support existing STS credentials without parent-user policy mappings, we lookup the policy in the policy claim value. This behavior should be deprecated when such support is no longer required, as it can still lead to stale policy mappings. Additionally this change also simplifies the implementation for all non-RoleARN STS credentials. Specifically, for AssumeRole (internal IDP) STS credentials, policies are picked up from the parent user's policies; for AssumeRoleWithCertificate STS credentials, policies are picked up from the parent user mapping created when the STS credential is generated. AssumeRoleWithLDAP already picks up policies mapped to the virtual parent user.	2021-12-17 00:46:30 -08:00
Poorna K	e270ab65b3	fix: healing of replication delete markers (#13933 ) A corner case can occur where the delete-marker was propagated but the metadata could not be updated on the primary. Sending a RemoveObject call with the Delete marker version would end up permanently deleting the version on target. Instead, perform a Stat on the delete-marker version on target and redo replication only if the delete-marker is missing on target.	2021-12-16 15:34:55 -08:00
Anis Elleuch	926373f9c1	Run the data scanner routine in a loop (#13928 ) After the introduction of Refresh logic in locks, the data scanner can quit when the data scanner lock is not able to get refreshed. In that case, the context of the data scanner will get canceled and runDataScanner() will quit. Another server would pick the scanning routine but after some time, all nodes can just have all scanning routine aborted, as described above. This fix will just run the data scanner in a loop.	2021-12-16 08:32:15 -08:00
Poorna K	111c6177d2	Deprecate caching for erasure/distributed mode (#13909 ) Fixes: #13907 Also removing default value of `writethrough` for cache commit which was interfering with cache_after setting	2021-12-15 16:48:34 -08:00
Poorna K	b42cfcea60	Disallow versioning/replication change in cluster replication setup (#13910 )	2021-12-15 10:37:08 -08:00
Klaus Post	aca6dfbd60	Check for nil RPC in listing (#13917 ) Fixes #13915	2021-12-15 09:19:11 -08:00
Harshavardhana	5f7e6d03ff	copy bucket slice to avoid skipping .minio.sys/buckets (#13912 ) healing was skipping `.minio.sys/buckets` path so essentially not healing `.usage.json` - fix this by making a copy of `buckets` slice.	2021-12-15 09:18:09 -08:00
Harshavardhana	88ad742da0	fix: error handling cases in site-replication (#13901 ) - Allow proper SRError to be propagated to handlers and converted appropriately. - Make sure to enable object locking on buckets when requested in MakeBucketHook. - When DNSConfig is enabled attempt to delete it first before deleting buckets locally.	2021-12-14 14:09:57 -08:00
Krishnan Parthasarathi	44a9339c0a	Newer noncurrent versions (#13815 ) - Rename MaxNoncurrentVersions tag to NewerNoncurrentVersions Note: We apply overlapping NewerNoncurrentVersions rules such that we honor the highest among applicable limits. e.g if 2 overlapping rules are configured with 2 and 3 noncurrent versions to be retained, we will retain 3. - Expire newer noncurrent versions after noncurrent days - MinIO extension: allow noncurrent days to be zero, allowing expiry of noncurrent version as soon as more than configured NewerNoncurrentVersions are present. - Allow NewerNoncurrentVersions rules on object-locked buckets - No x-amz-expiration when NewerNoncurrentVersions configured - ComputeAction should skip rules with NewerNoncurrentVersions > 0 - Add unit tests for lifecycle.ComputeAction - Support lifecycle rules with MaxNoncurrentVersions - Extend ExpectedExpiryTime to work with zero days - Fix all-time comparisons to be relative to UTC	2021-12-14 09:41:44 -08:00
Harshavardhana	113c7ff49a	add code to parse secrets natively instead of shell scripts (#13883 )	2021-12-13 18:23:31 -08:00
Poorna K	d422d24278	replication: warn if insufficient workers (#13899 ) This should give an early warning if configured replication workers are insufficient to meet application workload.	2021-12-13 18:22:56 -08:00
Aditya Manthramurthy	de400f3473	Allow setting non-existent policy on a user/group (#13898 )	2021-12-13 15:55:52 -08:00
Harshavardhana	8144a125ce	check for update in background (#13889 )	2021-12-13 09:43:03 -08:00
jiangfucheng	88c0d0120c	update heal object unit test (#13886 )	2021-12-11 09:04:07 -08:00
Aditya Manthramurthy	44fefe5b9f	Add option to policy info API to return create/mod timestamps (#13796 ) - This introduces a new admin API with a query parameter (v=2) to return a response with the timestamps - Older API still works for compatibility/smooth transition in console	2021-12-11 09:03:39 -08:00
Aditya Manthramurthy	f2bd026d0e	Allow OIDC user to query user info if policies permit (#13882 )	2021-12-10 15:03:39 -08:00
Klaus Post	81e43b87c2	Don't zero buffer if big enough (#13877 ) Only append zeroed bytes when we don't have enough space anyway.	2021-12-10 13:08:10 -08:00
Aditya Manthramurthy	a02e17f15c	Add tests to ensure that OIDC user can create IAM users (#13881 )	2021-12-10 13:04:21 -08:00
Harshavardhana	5b7c00ff52	add more tests to cover areas for weird object names (#13873 ) continuation of #13858 to add more tests and also validate the written object data.	2021-12-09 17:52:53 -08:00
Aditya Manthramurthy	b9f0046ee7	Allow STS credentials to create users (#13874 ) - allow any regular user to change their own password - allow STS credentials to create users if permissions allow Bonus: do not allow changes to sts/service account credentials (via add user API)	2021-12-09 17:48:51 -08:00
Harshavardhana	3b79f7e4ae	ignore if volume exists in MakeVolBulk, return other errors (#13866 )	2021-12-09 15:55:42 -08:00
Aditya Manthramurthy	85d2df02b9	fix: user listing with LDAP (#13872 ) Users listing was showing just a weird policy mapping output which does not make sense here.	2021-12-09 15:55:28 -08:00
Harshavardhana	2f1e8ba612	add more directory marker tests and fix a bug (#13871 ) ListObjects() should never list a delete-marked folder if latest is delete marker and delimiter is not provided. ListObjectVersions() should list a delete-marked folder even if latest is delete marker and delimiter is not provided. Enhance further versioning listing on the buckets	2021-12-09 14:59:23 -08:00
Anis Elleuch	84c690cb07	storage: Use request.Form and avoid mux matching (#13858 ) request.Form uses less memory allocation and avoids gorilla mux matching with weird characters in parameters such as '\n' - Remove Queries() to avoid matching - Ensure r.ParseForm is called to populate fields - Add a unit test for object names with '\n'	2021-12-09 08:38:46 -08:00
Harshavardhana	239bbad7ab	add test to expect prefix without a directory object (#13865 ) Motivation is to cover more areas	2021-12-09 08:36:54 -08:00
Harshavardhana	dcff6c996d	fix: do not list delete-marked objects (#13864 ) Delete-marked objects should not be considered for listing when listing is delimited. This issue was introduced in PR #13804, which was mainly meant to address listing of directories when listing is delimited. This PR fixes this properly and adds tests to ensure that we behave in accordance with how an S3 API behaves for ListObjects() without versions.	2021-12-08 17:34:52 -08:00
Poorna K	0a66a6f1e5	Avoid cache GC of writebacks before commit syncs (#13860 ) Save part.1 for writebacks in a separate folder and move it to cache dir atomically while saving the cache metadata. This is to avoid GC mistaking part.1 as orphaned cache entries and purging them. This PR also fixes object size being overwritten during retries for write-back mode.	2021-12-08 14:52:31 -08:00
Harshavardhana	e82a5c5c54	fix: site replication issues and add tests (#13861 ) - deleting policies was deleting all LDAP user mappings; this was a regression introduced in #13567 - deletion of policies is properly sent across all sites. - remove unexpected errors; instead, embed the real errors as part of the 500 error response.	2021-12-08 11:50:15 -08:00
Harshavardhana	b9aae1aaae	fix: speedtest should exit upon errors cleanly (#13851 ) - deleteBucket() should be called for cleanup if client abruptly disconnects - out of disk errors should be sent to client properly and also cancel the calls - limit concurrency to available MAXPROCS not 32 for auto-tuned setup, if procs are beyond 32 then continue normally. this is to handle smaller setups. fixes #13834	2021-12-06 16:36:14 -08:00
Harshavardhana	7d70afc937	fix: potential crash in diskCache when fileScorer is empty (#13850 ) ``` goroutine 115 [running]: github.com/minio/minio/cmd.(*diskCache).purge.func3({0xc007a10a40, 0x40}, 0x40) github.com/minio/minio/cmd/disk-cache-backend.go:430 +0x90d ```	2021-12-06 15:55:29 -08:00
Aditya Manthramurthy	12b63061c2	Fix LDAP service account creation (#13849 ) - when a user has only group permissions - fixes regression from `ac74237f0` (#13657) - fixes https://github.com/minio/console/issues/1291	2021-12-06 15:55:11 -08:00
Klaus Post	038fdeea83	snowball: return errors on failures (#13836 ) Return errors at once when untar fails. Current error handling was quite a mess. Errors are written to the stream, but processing continues. Instead, return errors when they occur and transform internal errors to bad request errors, since it is likely a problem with the input. Fixes #13832	2021-12-06 09:45:23 -08:00
Anis Elleuch	0b6225bcc3	Better error msg when version mismatch of internode API (#13845 ) Sometimes, we see an error message like "Server expects 'storage' API version 'v41', instead found 'v41'". This change shows a more generic error message with the path of the REST call.	2021-12-06 09:44:48 -08:00
Anis Elleuch	f286ef8e17	isMultipart to test on parts sizes only if object is encrypted (#13839 ) ObjectInfo.isMultipart() tests whether part sizes are compatible with encrypted parts, but this can only be done if the object is encrypted.	2021-12-06 09:43:43 -08:00
Harshavardhana	b120bcb60a	validate if cached value is empty before use (#13830 ) fixes a crash reproduced while running hadoop tests ``` goroutine 201564 [running]: github.com/minio/minio/cmd.metaCacheEntries.resolve({0xc0206ab7a0, 0x4, 0xc0015b1908}, 0xc0212a7040) github.com/minio/minio/cmd/metacache-entries.go:352 +0x58a ``` Bonus: HeadBucket() should always provide content-type	2021-12-06 02:59:51 -08:00
Harshavardhana	be34fc9134	fix: kms-id header should have arn:aws:kms: prefix (#13833 ) arn:aws:kms: is a must for KMS keyID.	2021-12-06 00:39:32 -08:00
Harshavardhana	8591d17d82	return appropriate errors upon parseErrors (#13831 )	2021-12-05 11:36:26 -08:00
Harshavardhana	f6190d6751	Add single drive support for directory prefixes in Listing (#13829 ) This fixes the compatibility issue with Hadoop 3.3.1 fixes #13710	2021-12-03 18:08:40 -08:00
Aditya Manthramurthy	4f35054d29	Ensure that role ARNs don't collide (#13817 ) This is to prepare for multiple providers enhancement.	2021-12-03 13:15:56 -08:00
Shireesh Anjal	d29df6714a	Introduce new config `subnet api_key` (#13793 ) The earlier approach of using a license token for communicating with SUBNET is being replaced with a simpler mechanism of API keys. Unlike the license which is a JWT token, these API keys will be simple UUID tokens and don't have any embedded information in them. SUBNET would generate the API key on cluster registration, and then it would be saved in this config, to be used for subsequent communication with SUBNET.	2021-12-03 09:32:11 -08:00
jiangfucheng	7460fb8349	fix padding error and compatible with uploaded objects (#13803 )	2021-12-03 09:26:30 -08:00
Harshavardhana	a7c430355a	fix: throw appropriate errors when all disks fail (#13820 ) when all disks fail with same error, fail server startup anyways - we cannot proceed. fixes #13818	2021-12-03 09:25:17 -08:00
Aditya Manthramurthy	b14527b7af	If role policy is configured, require that role ARN be set in STS (#13814 )	2021-12-02 15:43:39 -08:00
Klaus Post	3db931dc0e	Improve listing consistency with version merging (#13723 )	2021-12-02 11:29:16 -08:00
Klaus Post	8309ddd486	Fix panic (not fatal) on connection drops (#13811 ) Fix more regressions from #13597 with double closed channels. ``` panic: "POST /minio/storage/data/distxl-plain/s1/d2/v42/createfile?disk-id=c789f7e1-2b52-442a-b518-aa2dac03f3a1&file-path=f6161668-b939-4543-9873-91b9da4cdff6%2F5eafa986-a3bf-4b1c-8bc0-03a37de390a3%2Fpart.1&length=2621760&volume=.minio.sys%2Ftmp": send on closed channel goroutine 1977 [running]: runtime/debug.Stack() c:/go/src/runtime/debug/stack.go:24 +0x65 github.com/minio/minio/cmd.setCriticalErrorHandler.func1.1() d:/minio/minio/cmd/generic-handlers.go:468 +0x8e panic({0x2928860, 0x4fb17e0}) c:/go/src/runtime/panic.go:1038 +0x215 github.com/minio/minio/cmd.keepHTTPReqResponseAlive.func2({0x4fe4ea0, 0xc02737d8a0}) d:/minio/minio/cmd/storage-rest-server.go:818 +0x48 github.com/minio/minio/cmd.(*storageRESTServer).CreateFileHandler(0xc0015a8510, {0x50073e0, 0xc0273ec460}, 0xc029b9a400) d:/minio/minio/cmd/storage-rest-server.go:334 +0x1d2 net/http.HandlerFunc.ServeHTTP(...) c:/go/src/net/http/server.go:2046 github.com/minio/minio/cmd.httpTraceHdrs.func1({0x50073e0, 0xc0273ec460}, 0x0) d:/minio/minio/cmd/handler-utils.go:372 +0x53 net/http.HandlerFunc.ServeHTTP(0x5007380, {0x50073e0, 0xc0273ec460}, 0x10) c:/go/src/net/http/server.go:2046 +0x2f github.com/minio/minio/cmd.addCustomHeaders.func1({0x5007380, 0xc0273dcf00}, 0xc0273f7340) ``` Reverts but adds write checks.	2021-12-02 11:22:32 -08:00
Harshavardhana	21c868a646	fix: do not ignore delete-marker directories in ListObjects() (#13804 ) In a scenario such as the following, objects that exist inside a prefix, say `folder/`, must be included in the listObjects() response. ``` 2aa16073-387e-492c-9d59-b4b0b7b6997a v2 DEL folder/ a5b9ce68-7239-4921-90ab-20aed402c7a2 v1 PUT folder/ f2211798-0eeb-4d9e-9184-fcfeae27d069 v1 PUT folder/1.txt ``` Current master does not handle this scenario, because it ignores the top-level delete-marker on folders. This is, however, unexpected. It is expected that list-objects returns the top-level prefix in this situation. ``` aws s3api list-objects --bucket harshavardhana --prefix unique/ \ --delimiter / --profile minio --endpoint-url http://localhost:9000 { "CommonPrefixes": [ { "Prefix": "unique/folder/" } ] } ``` There are applications in the wild, such as the Hadoop s3a connector, that exploit this behavior and expect the folder to be present in the response. This also makes the behavior consistent with AWS S3.	2021-12-02 08:46:33 -08:00
Harshavardhana	24d904d194	reload certs from disk upon SIGHUP (#13792 )	2021-12-01 00:38:32 -08:00

5752 Commits