minio

Commit Graph

Author	SHA1	Message	Date
Anis Eleuch	8432fd5ac2	prom: Add online and healing drives metrics per erasure set (#18700 )	2023-12-21 16:56:43 -08:00
Harshavardhana	6829ae5b13	completely remove drive caching layer from gateway days (#18217 ) This has already been deprecated for close to a year now.	2023-10-11 21:18:17 -07:00
Harshavardhana	9788d85ea3	remove logging for invalid metadata values (#18068 )	2023-09-20 15:49:55 -07:00
Aditya Manthramurthy	cbc0ef459b	Fix policy package import name (#18031 ) We do not need to rename the import of minio/pkg/v2/policy as iampolicy any more.	2023-09-14 14:50:16 -07:00
Aditya Manthramurthy	1c99fb106c	Update to minio/pkg/v2 (#17967 )	2023-09-04 12:57:37 -07:00
Poorna	b48bbe08b2	Add additional info for replication metrics API (#17293 ) to track the replication transfer rate across different nodes, number of active workers in use and in-queue stats to get an idea of the current workload. This PR also adds replication metrics to the site replication status API. For site replication, prometheus metrics are no longer at the bucket level - but at the cluster level. Add prometheus metric to track credential errors since uptime	2023-08-30 01:00:59 -07:00
Harshavardhana	6426b74770	move bucket centric metrics to /minio/v2/metrics/bucket handlers (#17663 ) users/customers do not have a reasonable number of buckets anymore, this is why we must avoid overpopulating cluster endpoints, instead move the bucket monitoring to a separate endpoint. some of it's a breaking change here for a couple of metrics, but it is imperative that we do it to improve the responsiveness of our Prometheus cluster endpoint. Bonus: Added new cluster metrics for usage, objects and histograms	2023-07-18 22:25:12 -07:00
Anis Eleuch	6d0bc5ab1e	prometheus: Fix internode stats (#17594 ) Internode calculation was done inside S3 handlers, fix it by moving it to internode handlers. Remove admin stats since it is not used.	2023-07-08 07:35:11 -07:00
Klaus Post	d85da9236e	Add Object Version count histogram (#16739 )	2023-03-10 08:53:59 -08:00
Harshavardhana	14cf8f1b22	upgrade deps for minio/pkg v1.6.1 to include groups conditions (#16538 )	2023-02-06 09:27:29 -08:00
Harshavardhana	2433698372	fix: remove unnecessary logs for client conn errors (#16261 )	2022-12-15 08:25:05 -08:00
Anis Elleuch	932d2c3c62	Add X-Amz-Request-Id to internode calls (#16146 )	2022-12-06 09:27:26 -08:00
Harshavardhana	5a8df7efb3	re-implement StorageInfo to be a peer call (#16155 )	2022-12-01 14:31:35 -08:00
Harshavardhana	23b329b9df	remove gateway completely (#15929 )	2022-10-24 17:44:15 -07:00
Poorna	6b9fd256e1	Persist in-memory replication stats to disk (#15594 ) to avoid relying on scanner-calculated replication metrics. This will improve the accuracy of the replication stats reported. This PR also adds on to #15556 by handing replication traffic that could not be queued by available workers to the MRF queue so that entries in `PENDING` status are healed faster.	2022-09-12 12:40:02 -07:00
ebozduman	b57e7321e7	Replaces 'disk'=>'drive' visible to end user (#15464 )	2022-08-04 16:10:08 -07:00
Anis Elleuch	8d98282afd	Better reporting of total/free usable capacity of the cluster (#15230 ) The current code uses approximation using a ratio. The approximation can skew if we have multiple pools with different disk capacities. Replace the algorithm with a simpler one which counts data disks and ignore parity disks.	2022-07-06 13:29:49 -07:00
Harshavardhana	63ac260bd5	Simplify Prometheus metrics gather (#15210 )	2022-07-01 13:18:39 -07:00
Harshavardhana	2420f6c000	fix: make metrics endpoint responsive by reducing the chatter (#15055 ) peerOnlineCounter was making NxN calls to many peers, this can be really long and tedious if there are random servers that are going down. Instead we should calculate online peers from the point of view of "self" and return those online and offline appropriately by performing a healthcheck.	2022-06-08 02:43:13 -07:00
Harshavardhana	f1abb92f0c	feat: Single drive XL implementation (#14970 ) Main motivation is move towards a common backend format for all different types of modes in MinIO, allowing for a simpler code and predictable behavior across all features. This PR also brings features such as versioning, replication, transitioning to single drive setups.	2022-05-30 10:58:37 -07:00
Harshavardhana	85f3a9f3b0	Remove Azure gateway implementation (#14418 ) refer #14331	2022-04-29 12:51:23 -07:00
Harshavardhana	0cc993f403	Remove GCS, HDFS gateway implementations #14418 refer #14331	2022-04-24 10:19:17 -07:00
Harshavardhana	9c846106fa	decouple service accounts from root credentials (#14534 ) changing root credentials makes service accounts in-operable, this PR changes the way sessionToken is generated for service accounts. It changes service account behavior to generate sessionToken claims from its own secret instead of using global root credential. Existing credentials will be supported by falling back to verify using root credential. fixes #14530	2022-03-14 09:09:22 -07:00
Harshavardhana	d6dd17a483	make sure to pass groups for all credentials while verifying policies (#14193 ) fixes #14180	2022-01-26 21:53:36 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	914bfb2d9c	fix: allow compaction on replicated buckets (#13711 ) currently getReplicationConfig() failure incorrectly returns error on unexpected buckets upon upgrade, we should always calculate usage as much as possible.	2021-11-19 14:46:14 -08:00
Anis Elleuch	4caed7cc0d	metrics: Add replication latency metrics (#13515 ) Add a new Prometheus metric for bucket replication latency e.g.: minio_bucket_replication_latency_ns{ bucket="testbucket", operation="upload", range="LESS_THAN_1_MiB", server="127.0.0.1:9001", targetArn="arn:minio:replication::45da043c-14f5-4da4-9316-aba5f77bf730:testbucket"} 2.2015663e+07 Co-authored-by: Klaus Post <klauspost@gmail.com>	2021-11-17 12:10:57 -08:00
Daniel A. Ochoa	07dd0692b6	Fix hdfs gateway concurrent map writes (#13596 ) Co-authored-by: Harshavardhana <harsha@minio.io>	2021-11-08 09:07:58 -08:00
Anis Elleuch	20761e053e	replication: Fix replica stats during crawling (#13499 ) Also show replica stats with an ARN in Prometheus output.	2021-10-22 19:13:50 -07:00
Poorna K	e7f559c582	Fixes to replication metrics (#13493 ) For reporting ReplicaSize and loading initial replication metrics correctly.	2021-10-21 18:52:55 -07:00
Poorna Krishnamoorthy	19ecdc75a8	replication: Simplify metrics calculation (#13274 ) Also doing some code cleanup	2021-09-22 10:48:45 -07:00
Poorna Krishnamoorthy	c4373ef290	Add support for multi site replication (#12880 )	2021-09-18 13:31:35 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00
Poorna Krishnamoorthy	3690de0c6b	Drop Pending size and count from replication metrics (#12378 ) Real-time metrics calculated in-memory rely on the initial replication metrics saved with data usage. However, this can lag behind the actual state of the cluster at the time of server restart leading to inaccurate Pending size/counts reported to Prometheus. Dropping the Pending metrics as this can be more reliably monitored by applications with replication notifications. Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>	2021-05-31 20:26:52 -07:00
Harshavardhana	fdc2020b10	move to iam, bucket policy from minio/pkg (#12400 )	2021-05-29 21:16:42 -07:00
Harshavardhana	ebf75ef10d	fix: remove all unused code (#12360 )	2021-05-24 09:28:19 -07:00
Harshavardhana	1aa5858543	move madmin to github.com/minio/madmin-go (#12239 )	2021-05-06 08:52:02 -07:00
Harshavardhana	069432566f	update license change for MinIO Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-23 11:58:53 -07:00
Harshavardhana	d0d67f9de0	feat: allow prometheus for only authorized users (#12121 ) allow restrictions on who can access Prometheus endpoint, additionally add prometheus as part of diagnostics canned policy. Signed-off-by: Harshavardhana <harsha@minio.io>	2021-04-22 18:55:30 -07:00
Poorna Krishnamoorthy	075bccda42	Fix cluster bucket stats API for prometheus (#11970 ) Metrics calculation was accumulating inital usage across all nodes rather than using initial usage only once. Also fixing: - bug where all peer traffic was going to the same node. - reset counters when replication status changes from PENDING -> FAILED	2021-04-06 08:36:54 -07:00
Harshavardhana	09ee303244	add cluster support for realtime bucket stats (#11963 ) implementation in #11949 only catered from single node, but we need cluster metrics by capturing from all peers. introduce bucket stats API that will be used for capturing in-line bucket usage as well eventually	2021-04-04 15:34:33 -07:00
Poorna Krishnamoorthy	47c09a1e6f	Various improvements in replication (#11949 ) - collect real time replication metrics for prometheus. - add pending_count, failed_count metric for total pending/failed replication operations. - add API to get replication metrics - add MRF worker to handle spill-over replication operations - multiple issues found with replication - fixes an issue when client sends a bucket name with `/` at the end from SetRemoteTarget API call make sure to trim the bucket name to avoid any extra `/`. - hold write locks in GetObjectNInfo during replication to ensure that object version stack is not overwritten while reading the content. - add additional protection during WriteMetadata() to ensure that we always write a valid FileInfo{} and avoid ever writing empty FileInfo{} to the lowest layers. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-04-03 09:03:42 -07:00
Anis Elleuch	2c296652f7	Simplify access to local node name (#11907 ) The local node name is heavily used in tracing, create a new global variable to store it. Multiple goroutines can access it since it won't be changed later.	2021-03-26 11:37:58 -07:00
Klaus Post	749e9c5771	metrics: Add canceled requests (#11881 ) Add metric for canceled requests	2021-03-24 10:25:27 -07:00
Harshavardhana	c6a120df0e	fix: Prometheus metrics to re-use storage disks (#11647 ) also re-use storage disks for all `mc admin server info` calls as well, implement a new LocalStorageInfo() API call at ObjectLayer to lookup local disks storageInfo also fixes bugs where there were double calls to StorageInfo()	2021-03-02 17:28:04 -08:00
Harshavardhana	ffea6fcf09	fix: rename crawler as scanner in config (#11549 )	2021-02-17 12:04:11 -08:00
Ritesh H Shukla	67a8f37df0	fix: disk usage capacity metric reporting (#11435 )	2021-02-04 12:26:58 -08:00
Ritesh H Shukla	b4add82bb6	Updated Prometheus metrics (#11141 ) * Add metrics for nodes online and offline * Add cluster capacity metrics * Introduce v2 metrics	2021-01-18 20:35:38 -08:00
Harshavardhana	a4383051d9	remove/deprecate crawler disable environment (#11214 ) with changes present to automatically throttle crawler at runtime, there is no need to have an environment value to disable crawling. crawling is a fundamental piece for healing, lifecycle and many other features there is no good reason anyone would need to disable this on a production system. * Apply suggestions from code review	2021-01-04 09:43:31 -08:00
Harshavardhana	e7ae49f9c9	fix: calculate prometheus disks_offline/disks_total correctly (#11215 ) fixes #11196	2021-01-04 09:42:09 -08:00

1 2

87 Commits