minio

Commit Graph

Author	SHA1	Message	Date
Shireesh Anjal	e6eab2091f	fix: Incorrect ServersCount in cluster.info (#15431 ) The `ServersCount` field in cluster.info is expected to contain the number of nodes, and not number of endpoints.	2022-07-29 22:21:40 -07:00
Cesar Celis Hernandez	8ec888d13d	feat: update binary once and push it to other servers (#15407 )	2022-07-29 08:34:30 -07:00
Harshavardhana	916f274c83	choose starting concurrency based on number of local disks (#15428 ) smaller setups may have less drives per server choosing the concurrency based on number of local drives, and let the MinIO server change the overall concurrency as necessary.	2022-07-29 00:00:06 -07:00
Harshavardhana	cbd70d26b5	optimize speedtest for smaller setups (#15414 ) this has been observed in multiple environments where the setups are small `speedtest` naturally fails with default '10s' and the concurrency of '32' is big for such clusters. choose a smaller value i.e equal to number of drives in such clusters and let 'autotune' increase the concurrency instead.	2022-07-27 14:41:59 -07:00
Shireesh Anjal	906947a285	fix: typo in json key ClusterInfo DeploymentID (#15406 ) deployement_id -> deployment_id	2022-07-26 19:05:33 -07:00
Poorna	426c902b87	site replication: fix healing of bucket deletes. (#15377 ) This PR changes the handling of bucket deletes for site replicated setups to hold on to deleted bucket state until it syncs to all the clusters participating in site replication.	2022-07-25 17:51:32 -07:00
Anis Elleuch	e4b51235f8	upgrade: Split in two steps to ensure a stable retry (#15396 ) Currently, if one server in a distributed setup fails to upgrade due to any reasons, it is not possible to upgrade again unless nodes are restarted. To fix this, split the upgrade process into two steps : - download the new binary on all servers - If successful, overwrite the old binary with the new one	2022-07-25 17:49:47 -07:00
Anis Elleuch	f23f442d33	Add cluster info to inspect/profiling archive (#15360 ) Add cluster info to inspect and profiling archive. In addition to the existing data generation for both inspect and profiling, a cluster.info file is added, which contains some information about the cluster. The generation of cluster.info is done as the last step and it can fail if it exceeds 10 seconds.	2022-07-25 09:11:35 -07:00
Andreas Auernhammer	242d06274a	kms: add `context.Context` to KMS API calls (#15327 ) This commit adds a `context.Context` to the KMS `{Stat, CreateKey, GenerateKey}` API calls. The context will be used to terminate external calls as soon as the client request gets canceled. A follow-up PR will add a `context.Context` to the remaining `DecryptKey` API call. Signed-off-by: Andreas Auernhammer <hi@aead.dev>	2022-07-18 18:54:27 -07:00
Harshavardhana	b4eb74f5ff	allow custom speedtest bucket (#15271 ) this allows for specifying existing buckets with - object replication enabled - object encryption enabled - object versioning enabled - object locking enabled	2022-07-12 10:12:47 -07:00
Anis Elleuch	8d98282afd	Better reporting of total/free usable capacity of the cluster (#15230 ) The current code uses approximation using a ratio. The approximation can skew if we have multiple pools with different disk capacities. Replace the algorithm with a simpler one which counts data disks and ignore parity disks.	2022-07-06 13:29:49 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
Harshavardhana	c7ed6eee5e	fix: background local test also via channel (#15086 ) current implementation for `standalone` setups was blocking the `perf drive`. Bonus: remove all old unused complicated code.	2022-06-15 14:51:42 -07:00
Harshavardhana	8082d1fed6	add bucket level S3 received/sent bytes (#15084 ) adds bucket level metrics for bytes received and sent bytes on all S3 API calls.	2022-06-14 15:14:24 -07:00
Harshavardhana	d2a10dbe69	fix: simplify healthcheck code to freeze calls only once (#15082 ) - currently subnet health check was freezing and calling locks at multiple locations, avoid them. - throw errors if first attempt itself fails with no results	2022-06-14 11:22:07 -07:00
Anis Elleuch	5fb420c703	prometheus: Add S3 4xx and 5xx S3 monitoring (#15052 ) Currently minio_s3_requests_errors_total covers 4xx and 5xx S3 responses which can be confusing when s3 applications sent a lot of HEAD requests with obvious 404 responses or when the replication is enabled. Add - minio_s3_requests_4xx_errors_total - minio_s3_requests_5xx_errors_total to help users monitor 4xx and 5xx HTTP status codes separately.	2022-06-08 11:22:34 -07:00
Anis Elleuch	fd02492cb7	avoid limits on the number of parallel trace/bucket notifications listeners (#14799 ) Simplifies overall limits on the incoming listeners for notifications. Fixes #14566	2022-06-05 14:29:12 -07:00
Anis Elleuch	20a753e2e5	Fix a possible service freeze after perf object (#15036 ) The S3 service can be frozen indefinitely if a client or mc asks for object perf API but quits early or has some networking issues. The reason is that partialWrite() can block indefinitely. This commit makes partialWrite() listen to context cancellation as well. It also renames deadlinedCtx to healthCtx since it covers handler context cancellation and not only the speedtest deadline.	2022-06-03 05:58:45 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	6cfb1cb6fd	fix: timer usage across codebase (#14935 ) it seems in some places we have been wrongly using the timer.Reset() function, nicely exposed by an example shared by @donatello https://go.dev/play/p/qoF71_D1oXD this PR fixes all the usage comprehensively	2022-05-17 22:42:59 -07:00
Shireesh Anjal	3ec1844e4a	return kubernetes info in health report (#14865 )	2022-05-06 12:41:07 -07:00
Anis Elleuch	df50eda811	Add number of versions in server info API (#14812 ) The goal is to show the number of versions in the server info API.	2022-04-25 22:04:10 -07:00
Shireesh Anjal	5c53620a72	Include speedtest as part of healthinfo api (#14696 ) Execute the object, drive and net speedtests as part of the healthinfo (if requested by the client), and include their result in the response. The options for the speedtests have been picked from the default values used by `mc support perf` command.	2022-04-12 13:17:44 -07:00
Poorna	a1b01e6d5f	Combine profiling start/stop APIs into one (#14662 ) Take profile duration as a query parameter for profile API	2022-04-08 12:44:35 -07:00
Krishna Srinivas	b35b9dcff7	Use S3 client for uplooads/downloads during perf test (#14570 )	2022-04-07 21:20:40 -07:00
Klaus Post	dedf9774c7	Set inspect-input.txt modtime (#14688 ) If no time given, use current time.	2022-04-05 13:06:10 -07:00
Shireesh Anjal	7c696e1cb6	Write deployment id to health report at the start (#14673 ) The deployment id was being written to the health report towards the end of the handler. Because of this, if there was a timeout in any of the data fetching, the deployment id was not getting written at all. Upload of such reports fails on SUBNET as deployment id is the unique identifier for a cluster in subnet. Fixed by writing the deployment id at the beginning of the processing.	2022-04-03 13:15:02 -07:00
Poorna	0e6aedc7ed	Capture cmdline args for inspect API (#14668 ) Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>	2022-03-31 16:05:43 -07:00
Poorna	566cffe53d	save format.json by default for inspect API (#14620 )	2022-03-25 02:02:17 -07:00
Harshavardhana	d796621ccc	choose smaller default deadline for diagnostics without --full (#14599 )	2022-03-21 23:25:24 -07:00
Harshavardhana	43eb5a001c	re-use transport for AdminInfo() call (#14571 ) avoids creating new transport for each `isServerResolvable` request, instead re-use the available global transport and do not try to forcibly close connections to avoid TIME_WAIT build upon large clusters. Never use httpClient.CloseIdleConnections() since that can have a drastic effect on existing connections on the transport pool. Remove it everywhere.	2022-03-17 16:20:10 -07:00
Minio Trusted	ffcadcd99e	Revert "Use S3 client for uplooads/downloads during perf test (#14553 )" This reverts commit `ff811f594b`. Speedtest is broken need to fix this more cleanly.	2022-03-16 23:34:49 -07:00
Krishna Srinivas	ff811f594b	Use S3 client for uplooads/downloads during perf test (#14553 )	2022-03-16 16:58:46 -07:00
Krishna Srinivas	4d0715d226	Implement netperf for "mc support perf net" (#14397 ) Co-authored-by: Klaus Post <klauspost@gmail.com>	2022-03-08 09:54:38 -08:00
Harshavardhana	b0c84e3de7	fix: deleteVersions causing xl.meta to have empty Versions[] slice (#14483 ) This is a side effect of the optimization done in PR #13544, because of which a certain type of delete operation on given object versions can cause the lastVersion indication to be skipped, leading to an `xl.meta` where the Versions[] slice is empty while the entire file is intact by itself. This PR tries to ensure that such files are visible and deletable by regular means of listing as null 'delete-marker' and also avoid the situation where this potential issue might arise.	2022-03-04 20:01:26 -08:00
Harshavardhana	c08540c7b7	reject speedtest when there isn't enough disk space available (#14402 ) small setups do not return appropriate errors when speedtest cannot run on small tiny setups, allow the tests to fail appropriately more pro-actively. many users bring toy setups, this PR simply returns an error in such situations.	2022-02-24 09:06:18 -08:00
Shireesh Anjal	3934700a08	Make audit webhook and kafka config dynamic (#14390 )	2022-02-24 09:05:33 -08:00
Anis Elleuch	2ee337ead5	prometheus: Add incoming requests metrics since last scrape (#14261 ) Some users running MinIO claim that their system became slow. One way to investigate is to look at the Prometheus history of the number of requests reaching the server. The existing current S3 requests metric is not enough because it can increase if the system really becomes slow, due to disk issues for example.	2022-02-07 16:30:14 -08:00
Sidhartha Mani	d7df6bc738	add support for speedtest drive (#14182 )	2022-02-01 22:38:05 -08:00
Harshavardhana	cf407f7176	do not expect 'speedtest' to be a bucket (#14199 ) fixes #14196	2022-01-27 08:13:03 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Shireesh Anjal	13441ad0f8	Add IsKubernetes and IsDocker to health data (#13936 )	2021-12-17 14:46:54 -08:00
Harshavardhana	b9aae1aaae	fix: speedtest should exit upon errors cleanly (#13851 ) - deleteBucket() should be called for cleanup if client abruptly disconnects - out of disk errors should be sent to client properly and also cancel the calls - limit concurrency to available MAXPROCS not 32 for auto-tuned setup, if procs are beyond 32 then continue normally. this is to handle smaller setups. fixes #13834	2021-12-06 16:36:14 -08:00
Harshavardhana	99d87c5ca2	fix: totalDrives reported in speedTest for multiple-pools (#13770 ) totalDrives reported in speedTest result were wrong for multiple pools, this PR fixes this. Bonus: add support for configurable storage-class, this allows us to test REDUCED_REDUNDANCY to see further maximum throughputs across the cluster.	2021-11-29 09:05:46 -08:00
Aditya Manthramurthy	4ce6d35e30	Add new `site` config sub-system intended to replace `region` (#13672 ) - New sub-system has "region" and "name" fields. - `region` subsystem is marked as deprecated, however still works, unless the new region parameter under `site` is set - in this case, the region subsystem is ignored. `region` subsystem is hidden from top-level help (i.e. from `mc admin config set myminio`), but appears when specifically requested (i.e. with `mc admin config set myminio region`). - MINIO_REGION, MINIO_REGION_NAME are supported as legacy environment variables for server region. - Adds MINIO_SITE_REGION as the current environment variable to configure the server region and MINIO_SITE_NAME for the site name.	2021-11-25 13:06:25 -08:00
Harshavardhana	91e0823ff0	allow service freeze/unfreeze on a setup (#13707 ) an active running speedTest will reject all new S3 requests to the server, until speedTest is complete. this is to ensure that speedTest results are accurate and trusted. Co-authored-by: Klaus Post <klauspost@gmail.com>	2021-11-23 12:02:16 -08:00
Harshavardhana	fb268add7a	do not flush if Write() failed (#13597 ) - Go might reset the internal http.ResponseWriter() to `nil` after Write() failure if the go-routine has returned, do not flush() such scenarios and avoid spurious flushes() as returning handlers always flush. - fix some racy tests with the console - avoid ticker leaks in certain situations	2021-11-18 17:19:58 -08:00
Shireesh Anjal	7152915318	Use pointer based TLS field (#13659 ) This will help other projects like `health-analyzer` to verify that the struct was indeed populated by the minio server, and is not default-populated during unmarshalling of the JSON. Signed-off-by: Shireesh Anjal <shireesh@minio.io>	2021-11-18 09:02:33 -08:00
Harshavardhana	661b263e77	add gocritic/ruleguard checks back again, cleanup code. (#13665 ) - remove some duplicated code - reported a bug, separately fixed in #13664 - using strings.ReplaceAll() when needed - using filepath.ToSlash() use when needed - remove all non-Go style comments from the codebase Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2021-11-16 09:28:29 -08:00
Harshavardhana	c3d24fb26d	use single encoder for sending speedtest results (#13579 ) Bonus: if runs have PUT higher then capture it anyways to display an unexpected result, which provides a way to understand what might be slowing things down on the system. For example on a Data24 WDC setup it is clearly visible there is a bug in the hardware. ``` ./mc admin speedtest wdc/ ⠧ Running speedtest (With 64 MiB object size, 32 concurrency) PUT: 31 GiB/s GET: 24 GiB/s ⠹ Running speedtest (With 64 MiB object size, 48 concurrency) PUT: 38 GiB/s GET: 24 GiB/s MinIO 2021-11-04T06:08:33Z, 6 servers, 48 drives PUT: 38 GiB/s, 605 objs/s GET: 24 GiB/s, 383 objs/s ``` Reads are almost 14GiB/sec slower than Writes which is practically not possible.	2021-11-04 12:11:52 -07:00

340 Commits