minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	8a291e1dc0	Cluster healthcheck improvements (#10408 ) - do not fail the healthcheck if heal status was not obtained from one of the nodes, if many nodes fail then report this as a catastrophic error. - add "x-minio-write-quorum" value to match the write tolerance supported by server. - admin info now states if a drive is healing where madmin.Disk.Healing is set to true and madmin.Disk.State is "ok"	2020-09-02 22:54:56 -07:00
Harshavardhana	2acb530ccd	update rulesguard with new rules (#10392 ) Co-authored-by: Nitish Tiwari <nitish@minio.io> Co-authored-by: Praveen raj Mani <praveen@minio.io>	2020-09-01 16:58:13 -07:00
Andreas Auernhammer	18725679c4	crypto: allow multiple KES endpoints (#10383 ) This commit addresses a maintenance / automation problem when MinIO-KES is deployed on bare-metal. In orchestrated env. the orchestrator (K8S) will make sure that `n` KES servers (IPs) are available via the same DNS name. There it is sufficient to provide just one endpoint.	2020-08-31 18:10:52 -07:00
Klaus Post	1b119557c2	getDisksInfo: Attribute failed disks to correct endpoint (#10360 ) If DiskInfo calls failed the information returned was used anyway resulting in no endpoint being set. This would make the drive be attributed to the local system since `disk.Endpoint == disk.DrivePath` in that case. Instead, if the call fails record the endpoint and the error only.	2020-08-26 10:11:26 -07:00
Harshavardhana	e57c742674	use single dynamic timeout for most locked API/heal ops (#10275 ) newDynamicTimeout should be allocated once, in-case of temporary locks in config and IAM we should have allocated timeout once before the `for loop` This PR doesn't fix any issue as such, but provides enough dynamism for the timeout as per expectation.	2020-08-17 11:29:58 -07:00
Harshavardhana	2a9819aff8	fix: refactor background heal for cluster health (#10225 )	2020-08-07 19:43:06 -07:00
Harshavardhana	6c6137b2e7	add cluster maintenance healthcheck drive heal affinity (#10218 )	2020-08-07 13:22:53 -07:00
Harshavardhana	0b8255529a	fix: proxies set keep-alive timeouts to be system dependent (#10199 ) Split the DialContext's one for internode and another for all other external communications especially proxy forwarders, gateway transport etc.	2020-08-04 14:55:53 -07:00
Harshavardhana	25a55bae6f	fix: avoid buffering of server sent events by proxies (#10164 )	2020-07-30 19:45:12 -07:00
Harshavardhana	3a73f1ead5	refactor server update behavior (#10107 )	2020-07-23 08:03:31 -07:00
Harshavardhana	e7d7d5232c	fix: admin info output and improve overall performance (#10015 ) - admin info node offline check is now quicker - admin info now doesn't duplicate the code across doing the same checks for disks - rely on StorageInfo to return appropriate errors instead of calling locally. - diskID checks now return proper errors when disk not found v/s format.json missing. - add more disk states for more clarity on the underlying disk errors.	2020-07-13 09:51:07 -07:00
Andreas Auernhammer	a317a2531c	admin: new API for creating KMS master keys (#9982 ) This commit adds a new admin API for creating master keys. An admin client can send a POST request to: ``` /minio/admin/v3/kms/key/create?key-id=<keyID> ``` The name / ID of the new key is specified as request query parameter `key-id=<ID>`. Creating new master keys requires KES - it does not work with the native Vault KMS (deprecated) nor with a static master key (deprecated). Further, this commit removes the `UpdateKey` method from the `KMS` interface. This method is not needed and not used anymore.	2020-07-08 18:50:43 -07:00
Harshavardhana	2743d4ca87	fix: Add support for preserving mtime for replication (#9995 ) This PR is needed for bucket replication support	2020-07-08 17:36:56 -07:00
Harshavardhana	cdb0e6ffed	support proper values for listMultipartUploads/listParts (#9970 ) object KMS is configured with auto-encryption, there were issues when using docker registry - this has been left unnoticed for a while. This PR fixes an issue with compatibility. Additionally also fix the continuation-token implementation infinite loop issue which was missed as part of #9939 Also fix the heal token to be generated as a client facing value instead of what is remembered by the server, this allows for the server to be stateless regarding the token's behavior.	2020-07-03 19:27:13 -07:00
Anis Elleuch	2be20588bf	Reroute requests based token heal/listing (#9939 ) When manual healing is triggered, one node in a cluster will become the authority to heal. mc regularly sends new requests to fetch the status of the ongoing healing process, but a load balancer could land the healing request to a node that is not doing the healing request. This PR will redirect a request to the node based on the node index found described as part of the client token. A similar technique is also used to proxy ListObjectsV2 requests by encoding this information in continuation-token	2020-07-03 11:53:03 -07:00
Harshavardhana	a38ce29137	fix: simplify background heal and trigger heal items early (#9928 ) Bonus fix during versioning merge one of the PR was missing the offline/online disk count fix from #9801 port it correctly over to the master branch from release. Additionally, add versionID support for MRF Fixes #9910 Fixes #9931	2020-06-29 13:07:26 -07:00
Harshavardhana	b8cb21c954	allow more than N number of locks in TopLocks (#9883 )	2020-06-20 06:33:01 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Harshavardhana	96ed0991b5	fix: optimize IAM users load, add fallback (#9809 ) Bonus fix, load service accounts properly when service accounts were generated with LDAP	2020-06-11 14:11:30 -07:00
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	2020-05-28 13:03:04 -07:00
Harshavardhana	53aaa5d2a5	Export bucket usage counts as part of bucket metrics (#9710 ) Bonus fixes in quota enforcement to use the new datastructure and use timedValue to cache a value/reload automatically avoids one less global variable.	2020-05-27 06:45:43 -07:00
Sidhartha Mani	c121d27f31	progressively report obd results (#9639 )	2020-05-22 17:56:45 -07:00
Harshavardhana	bd032d13ff	migrate all bucket metadata into a single file (#9586 ) this is a major overhaul by migrating off all bucket metadata related configs into a single object '.metadata.bin' this allows us for faster bootups across 1000's of buckets and as well as keeps the code simple enough for future work and additions. Additionally also fixes #9396, #9394	2020-05-19 13:53:54 -07:00
Harshavardhana	1bc32215b9	enable full linter across the codebase (#9620 ) enable linter using golangci-lint across codebase to run a bunch of linters together, we shall enable new linters as we fix more things the codebase. This PR fixes the first stage of this cleanup.	2020-05-18 09:59:45 -07:00
Harshavardhana	814ddc0923	add missing admin actions, enhance AccountUsageInfo (#9607 )	2020-05-15 18:16:45 -07:00
Harshavardhana	6ac48a65cb	fix: use unused cacheMetrics code in prometheus (#9588 ) remove all other unusued/deadcode	2020-05-13 08:15:26 -07:00
Harshavardhana	337c2a7cb4	add audit logging for all admin calls (#9568 ) - add ServiceRestart/ServiceStop actions - audit log appropriately in all admin handlers fixes #9522	2020-05-11 10:34:08 -07:00
Harshavardhana	27d716c663	simplify usage of mutexes and atomic constants (#9501 )	2020-05-03 22:35:40 -07:00
Harshavardhana	7a5271ad96	fix: re-use connections in webhook/elasticsearch (#9461 ) - elasticsearch client should rely on the SDK helpers instead of pure HTTP calls. - webhook shouldn't need to check for IsActive() for all notifications, failure should be delayed. - Remove DialHTTP as its never used properly Fixes #9460	2020-04-28 13:57:56 -07:00
Praveen raj Mani	322385f1b6	fix: only show active/available ARNs in server startup banner (#9392 )	2020-04-21 09:38:32 -07:00
Klaus Post	c4464e36c8	fix: limit HTTP transport tuables to affordable values (#9383 ) Close connections pro-actively in transient calls	2020-04-17 11:20:56 -07:00
Harshavardhana	69fb68ef0b	fix simplify code to start using context (#9350 )	2020-04-16 10:56:18 -07:00
Sidhartha Mani	ec11e99667	implement configurable timeout for OBD tests (#9324 )	2020-04-14 11:48:32 -07:00
Harshavardhana	37d066b563	fix: deprecate requirement of session token for service accounts (#9320 ) This PR fixes couple of behaviors with service accounts - not need to have session token for service accounts - service accounts can be generated by any user for themselves implicitly, with a valid signature. - policy input for AddNewServiceAccount API is not fully typed allowing for validation before it is sent to the server. - also bring in additional context for admin API errors if any when replying back to client. - deprecate GetServiceAccount API as we do not need to reply back session tokens	2020-04-14 11:28:56 -07:00
Praveen raj Mani	bfec5fe200	fix: fetchLambdaInfo should return consistent results (#9332 ) - Introduced a function `FetchRegisteredTargets` which will return a complete set of registered targets irrespective to their states, if the `returnOnTargetError` flag is set to `False` - Refactor NewTarget functions to return non-nil targets - Refactor GetARNList() to return a complete list of configured targets	2020-04-14 11:19:25 -07:00
Harshavardhana	4314ee1670	fix: remove unusued PerfInfoHandler code (#9328 ) - Removes PerfInfo admin API as its not OBDInfo - Keep the drive path without the metaBucket in OBD global latency map. - Remove all the unused code related to PerfInfo API - Do not redefined global mib,gib constants use humanize.MiByte and humanize.GiByte instead always	2020-04-12 19:37:09 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Harshavardhana	2642e12d14	fix: change policies API to return and take struct (#9181 ) This allows for order guarantees in returned values can be consumed safely by the caller to avoid any additional parsing and validation. Fixes #9171	2020-04-07 19:30:59 -07:00
Sidhartha Mani	c8243706b4	Add Parallel NetOBD tests to saturate all nodes at once (#9241 )	2020-03-31 17:08:28 -07:00
Sidhartha Mani	7b732b566f	[Bugfix] Fix Net tests being omitted (#9234 )	2020-03-31 01:15:21 -07:00
Sidhartha Mani	0c80bf45d0	Implement oboard diagnostics admin API (#9024 ) - Implement a graph algorithm to test network bandwidth from every node to every other node - Saturate any network bandwidth adaptively, accounting for slow and fast network capacity - Implement parallel drive OBD tests - Implement a paging mechanism for OBD test to provide periodic updates to client - Implement Sys, Process, Host, Mem OBD Infos	2020-03-26 21:07:39 -07:00
Harshavardhana	3d3beb6a9d	Add response header timeouts (#9170 ) - Add conservative timeouts upto 3 minutes for internode communication - Add aggressive timeouts of 30 seconds for gateway communication Fixes #9105 Fixes #8732 Fixes #8881 Fixes #8376 Fixes #9028	2020-03-21 22:10:13 -07:00
Harshavardhana	b4bfdc92cc	fix: admin console logger changes to log.Info	2020-03-20 15:14:14 -07:00
Harshavardhana	ae654831aa	Add madmin package context support (#9172 ) This is to improve responsiveness for all admin API operations and allowing callers to cancel any on-going admin operations, if they happen to be waiting too long.	2020-03-20 15:00:44 -07:00
Harshavardhana	cfd12914e1	fix: crash in serverInfo handler when ldap is configured (#9123 )	2020-03-11 23:13:32 -07:00
Anis Elleuch	fdf65aa9b9	heal: Add info about the next background healing round (#9122 ) - avoid setting last heal activity when starting self-healing This can be confusing to users thinking that the self healing cycle was already performed. - add info about the next background healing round	2020-03-11 23:00:31 -07:00
Nitish Tiwari	7c32f3f554	Fix the URL for MinIO update when using custom download server (#9111 ) Co-authored-by: Nitish Tiwari <nitish@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-03-11 20:09:20 +05:30
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-20 09:21:33 +05:30
Andreas Auernhammer	086fbb745e	fix and improve KMS server info (#8944 ) This commit fixes typos in the displayed server info w.r.t. the KMS and removes the update status. For more information about why the update status is removed see: PR #8943	2020-02-06 06:18:34 +05:30
Andreas Auernhammer	4f37c8ccf2	refine the KMS admin API (#8943 ) This commit removes the `Update` functionality from the admin API. While this is technically a breaking change I think this will not cause any harm because: - The KMS admin API is not complete, yet. At the moment only the status can be fetched. - The `mc` integration hasn't been merged yet. So no `mc` client could have used this API in the past. The `Update`/`Rewrap` status is not useful anymore. It provided a way to migrate from one master key version to another. However, KES does not support the concept of key versions. Instead, key migration should be implemented as migration from one master key to another. Basically, the `Update` functionality has been implemented just for Vault.	2020-02-05 22:47:35 +05:30

1 2 3 4 5

205 Commits