minio

mirror of https://github.com/minio/minio.git synced 2024-12-26 23:25:54 -05:00

Author	SHA1	Message	Date
Aditya Manthramurthy	62ce52c8fd	cachevalue: simplify exported interface (#19137 ) - Also add cache options type	2024-02-28 09:09:09 -08:00
Klaus Post	2b5e4b853c	Improve caching (#19130 ) * Remove lock for cached operations. * Rename "Relax" to `ReturnLastGood`. * Add `CacheError` to allow caching values even on errors. * Add NoWait that will return current value with async fetching if within 2xTTL. * Make benchmark somewhat representative. ``` Before: BenchmarkCache-12 16408370 63.12 ns/op 0 B/op After: BenchmarkCache-12 428282187 2.789 ns/op 0 B/op ``` * Remove `storageRESTClient.scanning`. Nonsensical - RPC clients will not have any idea about scanning. * Always fetch remote diskinfo metrics and cache them. Seems most calls are requesting metrics. * Do async fetching of usage caches.	2024-02-26 10:49:19 -08:00
Harshavardhana	a3ac62596c	move timedValue -> cachevalue package (#19114 )	2024-02-23 13:28:14 -08:00
Harshavardhana	2faba02d6b	fix: allow diskInfo at storageRPC to be cached (#19112 ) Bonus: convert timedValue into a typed implementation	2024-02-23 09:21:38 -08:00
Harshavardhana	53aa8f5650	use typos instead of codespell (#19088 )	2024-02-21 22:26:06 -08:00
Klaus Post	92180bc793	Add array recycling safety (#19103 ) Nil entries when recycling arrays.	2024-02-21 12:27:35 -08:00
Klaus Post	e06168596f	Convert more peer <--> peer REST calls (#19004 ) * Convert more peer <--> peer REST calls * Clean up in general. * Add JSON wrapper. * Add slice wrapper. * Add option to make handler return nil error if no connection is given, `IgnoreNilConn`. Converts the following: ``` + HandlerGetMetrics + HandlerGetResourceMetrics + HandlerGetMemInfo + HandlerGetProcInfo + HandlerGetOSInfo + HandlerGetPartitions + HandlerGetNetInfo + HandlerGetCPUs + HandlerServerInfo + HandlerGetSysConfig + HandlerGetSysServices + HandlerGetSysErrors + HandlerGetAllBucketStats + HandlerGetBucketStats + HandlerGetSRMetrics + HandlerGetPeerMetrics + HandlerGetMetacacheListing + HandlerUpdateMetacacheListing + HandlerGetPeerBucketMetrics + HandlerStorageInfo + HandlerGetLocks + HandlerBackgroundHealStatus + HandlerGetLastDayTierStats + HandlerSignalService + HandlerGetBandwidth ```	2024-02-19 14:54:46 -08:00
Poorna	0cc9fb73e1	metrics: fix typo in namespace for proxy tagging metric (#19039 ) Relevant PR introducing this metric: #18957	2024-02-12 13:02:27 -08:00
Harshavardhana	eac4e4b279	honor replaced disk properly by updating globalLocalDrives (#19038 ) globalLocalDrives does not seem to be updated during HealFormat(), so the server has to be restarted for healing to continue.	2024-02-12 13:00:20 -08:00
Poorna	27d02ea6f7	metrics: add replication metrics on proxied requests (#18957 )	2024-02-05 22:00:45 -08:00
Poorna	29b1a29044	fix metrics panic in node metrics endpoint (#18894 )	2024-01-29 12:32:44 -08:00
Harshavardhana	cff8235068	remove getReplicationNodeMetrics() from peer metrics groups	2024-01-28 18:45:20 -08:00
Harshavardhana	944f3c1477	remove local disk metrics from cluster metrics (#18886 ) local disk metrics were polluting cluster metrics Please remove them instead of adding relevant ones. - batch job metrics were incorrectly kept at bucket metrics endpoint, move it to cluster metrics. - add tier metrics to cluster peer metrics from the node. - fix missing set level cluster health metrics	2024-01-28 12:53:59 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace when a 'nil' channel is closed, and otherwise closes the channel safely.	2024-01-28 10:04:17 -08:00
Shubhendu	65c4d550cb	Distribution bucket metrics with site replication (#18841 ) If site replication is enabled, we should still show the size and version distribution histogram metrics at bucket level. Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2024-01-22 08:45:36 -08:00
Harshavardhana	e11d851aee	add new drive I/O waiting/tokens metric (#18836 ) Bonus: add virtual memory used as well part of the system resource metrics.	2024-01-19 14:51:36 -08:00
Shubhendu	19387cafab	Use +Inf label additionally for Histogram metrics (#18807 )	2024-01-18 14:51:28 -08:00
jiuker	c1a78224cf	fix: prevent queries from starting before initialization (#18766 )	2024-01-10 15:21:52 -08:00
jiuker	53ceb0791f	fix: prevent queries from starting before initialization (#18756 ) Prevent queries from starting before initialization	2024-01-08 12:40:27 -08:00
Anis Eleuch	414bcb0c73	prom: Add read quorum per erasure set metric (#18736 )	2024-01-04 15:05:13 -08:00
Harshavardhana	a50ea92c64	feat: introduce list_quorum="auto" to prefer quorum drives (#18084 ) NOTE: This feature is not retroactive; it will not cater to previous transactions on existing setups. To enable this feature, please set the `_MINIO_DRIVE_QUORUM=on` environment variable as part of the systemd service or k8s configmap. Once this has been enabled, you also need to set `list_quorum`. ``` ~ mc admin config set alias/ api list_quorum=auto ``` A new debugging tool is available to check for any missing counters.	2023-12-29 15:52:41 -08:00
Anis Eleuch	8432fd5ac2	prom: Add online and healing drives metrics per erasure set (#18700 )	2023-12-21 16:56:43 -08:00
Krishnan Parthasarathi	56b7045c20	Export tier metrics (#18678 ) minio_node_tier_ttlb_seconds - Distribution of time to last byte for streaming objects from warm tier minio_node_tier_requests_success - Number of requests to download object from warm tier that were successful minio_node_tier_requests_failure - Number of requests to download object from warm tier that failed	2023-12-20 20:13:40 -08:00
Krishnan Parthasarathi	162eced7d2	Fix incorrect metric desc for bucketRequestsDuration (#18657 )	2023-12-14 19:02:11 -08:00
Krishnan Parthasarathi	bec1f7c26a	metrics: Refactor handling of histogram vectors (#18632 )	2023-12-14 14:02:52 -08:00
Praveen raj Mani	10ca0a6936	Label the notification target metrics by their target IDs (#18633 ) This patch adds the targetID to the existing notification target metrics and deprecates the current target metrics which points to the overall event notification subsystem	2023-12-14 09:09:26 -08:00
jiuker	6ca6788bb7	feat: add events_errors_total metric (#18610 )	2023-12-07 16:21:17 -08:00
Harshavardhana	e98172d72d	avoid hot-tier SLA to be tied to warm-tier SLA (#18581 ) it is okay if the warm-tier cannot keep up; we should continue to take I/O at hot-tier, and only fail or block hot-tier when the disk is full. Bonus: add a metrics counter for these missed tasks, so we will know for sure if one of the nodes is lagging behind or is losing too many tasks during transitioning.	2023-12-02 13:02:12 -08:00
Krishnan Parthasarathi	a50f26b7f5	Implement batch-expiration for objects (#17946 ) Based on an initial PR from - https://github.com/minio/minio/pull/17792 But fully completes it with newer finalized YAML spec.	2023-12-02 02:51:33 -08:00
Anis Eleuch	fe63664164	prom: Add drive failure tolerance per erasure set (#18424 )	2023-11-13 00:59:48 -08:00
vicmunoz	da95a2d13f	fix: object versions metric help (#18388 )	2023-11-03 11:43:52 -07:00
Klaus Post	128256e3ab	Add event counters (#18232 ) Export metric for global events sent and skipped for the lifetime of the server.	2023-10-12 15:39:22 -07:00
Harshavardhana	6829ae5b13	completely remove drive caching layer from gateway days (#18217 ) This has already been deprecated for close to a year now.	2023-10-11 21:18:17 -07:00
Matthew Toohey	f731e7ea36	Fix current_send_in_progress metric always being zero (#18160 )	2023-10-09 17:28:17 -07:00
Shireesh Anjal	6d20ec3bea	Add support for resource metrics (#18057 ) Add a new endpoint for "resource" metrics `/v2/metrics/resource` This should return system metrics related to drives, network, CPU and memory. Except for drives, other metrics should have corresponding "avg" and "max" values also. Reuse the real-time feature to capture the required data, introducing CPU and memory metrics in it. Collect the data every minute and keep updating the average and max values accordingly, returning the latest values when the API is called.	2023-09-30 13:40:20 -07:00
Harshavardhana	9788d85ea3	remove logging for invalid metadata values (#18068 )	2023-09-20 15:49:55 -07:00
Poorna	812f5a02d7	metrics: fix panic in replication stats reporting (#17979 )	2023-09-05 10:26:18 -07:00
Poorna	b48bbe08b2	Add additional info for replication metrics API (#17293 ) to track the replication transfer rate across different nodes, number of active workers in use and in-queue stats to get an idea of the current workload. This PR also adds replication metrics to the site replication status API. For site replication, prometheus metrics are no longer at the bucket level - but at the cluster level. Add prometheus metric to track credential errors since uptime	2023-08-30 01:00:59 -07:00
Harshavardhana	ba4566e86d	add missing IAM node metrics to cluster and node endpoint (#17908 )	2023-08-24 09:26:37 -07:00
Harshavardhana	c4ca0a5a57	add two more drive metrics when metrics is available (#17854 )	2023-08-15 10:55:47 -07:00
Yang Wu	23e4895dfc	Create metrics slice when necessary (#17809 )	2023-08-07 02:21:22 -07:00
Harshavardhana	2fa561f22e	do not crash on invalid metric values (#17764 ) ``` minio[1032735]: panic: label value "\xc0.\xc0." is not valid UTF-8 minio[1032735]: goroutine 1781101 [running]: minio[1032735]: github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...) ``` log such errors for investigation	2023-08-01 00:55:39 -07:00
Harshavardhana	114fab4c70	export cluster health as prometheus metrics (#17741 )	2023-07-28 01:16:53 -07:00
drivebyer	a7fb3a3853	fix: Create metrics slice when necessary in getCacheMetrics() (#17711 )	2023-07-24 08:40:21 -07:00
Krishnan Parthasarathi	9eeee92d36	Add deletemarker_total metric (#17689 )	2023-07-20 07:52:32 -07:00
Harshavardhana	6426b74770	move bucket centric metrics to /minio/v2/metrics/bucket handlers (#17663 ) users/customers do not have a reasonable number of buckets anymore, which is why we must avoid overpopulating cluster endpoints and instead move the bucket monitoring to a separate endpoint. This is a breaking change for a couple of metrics, but it is imperative that we do it to improve the responsiveness of our Prometheus cluster endpoint. Bonus: Added new cluster metrics for usage, objects and histograms	2023-07-18 22:25:12 -07:00
drivebyer	04c792476f	fix: provide a possible slice cap for heal failed metrics items (#17647 ) Signed-off-by: Wu <yang.wu@daocloud.io>	2023-07-14 11:02:45 -07:00
Harshavardhana	5b7c83341b	move per bucket metrics to peer location (#17627 )	2023-07-11 07:46:24 -07:00
Anis Eleuch	6d0bc5ab1e	prometheus: Fix internode stats (#17594 ) Internode calculation was done inside S3 handlers, fix it by moving it to internode handlers. Remove admin stats since it is not used.	2023-07-08 07:35:11 -07:00
Harshavardhana	abb1f22057	Revert "change ttfb_distribution metrics to histogramMetric (#17115 )" This reverts commit `9112ca4e29`.	2023-07-07 13:57:37 -07:00


135 Commits
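
Commit 6d20ec3bea above describes sampling resource data every minute and keeping average and max values up to date alongside the latest reading. The bookkeeping it describes could look roughly like the Go sketch below. The type `resourceStat` and its methods `observe` and `avg` are hypothetical names chosen for illustration; they are not taken from the MinIO source, and the actual implementation may differ.

```go
package main

// resourceStat is a hypothetical accumulator for one resource metric
// (for example CPU load). It is an illustration only, not MinIO's type.
type resourceStat struct {
	latest float64 // most recent sample
	max    float64 // largest sample seen so far
	sum    float64 // running sum, used to derive the average
	n      int     // number of samples recorded
}

// observe records one sample, as a periodic collector would do each minute.
func (s *resourceStat) observe(v float64) {
	if s.n == 0 || v > s.max {
		s.max = v
	}
	s.latest = v
	s.sum += v
	s.n++
}

// avg returns the mean of all samples seen so far, or 0 if there are none.
func (s *resourceStat) avg() float64 {
	if s.n == 0 {
		return 0
	}
	return s.sum / float64(s.n)
}
```

A collector goroutine would call `observe` once per interval, and the metrics handler would read `latest`, `max`, and `avg()` when the endpoint is scraped. Any locking needed between the two is left out of this sketch.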