Commit Graph

12 Commits

Author SHA1 Message Date
Shireesh Anjal
074d70112d
Consolidate drive health related metrics into single metric (#19706)
Instead of having "online" and "healing" as two metrics, replace with a
single metric "health" which can have following values:

0 = offline
1 = healthy
2 = healing
2024-05-12 10:23:50 -07:00
Shireesh Anjal
60d7e8143a
Move /cluster/audit to /audit (#19708)
As the audit metrics are server level and not 
overall cluster level.
2024-05-10 07:50:39 -07:00
Shireesh Anjal
04f92f1291
Change endpoint format for per-bucket metrics (#19655)
Per-bucket metrics endpoints always start with /bucket and the bucket
name is appended to the path. e.g. if the collector path is /bucket/api,
the endpoint for the bucket "mybucket" would be
/minio/metrics/v3/bucket/api/mybucket

Change the existing bucket api endpoint accordingly from /api/bucket to
/bucket/api
2024-05-02 10:37:57 -07:00
Bala FA
e5b16adb1c
Add cluster IAM metrics in metrics-v3 (#19595)
Signed-off-by: Bala.FA <bala@minio.io>
2024-05-02 01:20:42 -07:00
Shireesh Anjal
4caa3422bd
Add process metrics in metrics-v3 (#19612)
endpoint: /minio/metrics/v3/system/process
metrics:
- locks_read_total
- locks_write_total
- cpu_total_seconds
- go_routine_total
- io_rchar_bytes
- io_read_bytes
- io_wchar_bytes
- io_write_bytes
- start_time_seconds
- uptime_seconds
- file_descriptor_limit_total
- file_descriptor_open_total
- syscall_read_total
- syscall_write_total
- resident_memory_bytes
- virtual_memory_bytes
- virtual_memory_max_bytes

Since the standard process collector implements only a subset of these
metrics, remove it and implement our own custom process collector that
captures all the process metrics we need.
2024-04-26 09:07:23 -07:00
Harshavardhana
c54ffde568
add metrics ioerror counter for alerts on I/O errors (#19618) 2024-04-25 15:01:31 -07:00
Bala FA
14cdadfb56
Add cluster notification metrics in metrics-v3 (#19533)
Signed-off-by: Bala.FA <bala@minio.io>
2024-04-23 21:10:35 -07:00
Shireesh Anjal
f7b665347e
Add system CPU metrics to metrics-v3 (#19560)
endpoint: /minio/metrics/v3/system/cpu

metrics:
- minio_system_cpu_avg_idle
- minio_system_cpu_avg_iowait
- minio_system_cpu_load
- minio_system_cpu_load_perc
- minio_system_cpu_nice
- minio_system_cpu_steal
- minio_system_cpu_system
- minio_system_cpu_user
2024-04-23 16:56:12 -07:00
Shireesh Anjal
ca5fab8656
Add cluster audit metrics in metrics-v3 (#19514)
endpoint: /minio/metrics/v3/cluster/audit
metrics:
- failed_messages (counter)
- total_messages (counter)
- target_queue_length (gauge)
2024-04-17 02:18:02 -07:00
Shireesh Anjal
6df76ca73c
Add system memory metrics in v3 (#19486)
Following memory metrics will be added under /system/memory

- available
- buffers
- cache
- free
- shared
- total
- used
- used_perc
2024-04-16 22:10:25 -07:00
Shireesh Anjal
08d3d06a06
Add drive metrics in metrics-v3 (#19452)
Add following metrics:

- used_inodes
- total_inodes
- healing
- online
- reads_per_sec
- reads_kb_per_sec
- reads_await
- writes_per_sec
- writes_kb_per_sec
- writes_await
- perc_util

To be able to calculate the `per_sec` values, we capture the IOStats-related 
data in the beginning (along with the time at which they were captured), 
and compare them against the current values subsequently. This is because 
dividing by "time since server uptime." doesn't work in k8s environments.
2024-04-11 10:46:34 -07:00
Aditya Manthramurthy
b2c5b75efa
feat: Add Metrics V3 API (#19068)
Metrics v3 is mainly a reorganization of metrics into smaller groups of
metrics and the removal of internal aggregation of metrics received from
peer nodes in a MinIO cluster.

This change adds the endpoint `/minio/metrics/v3` as the top-level metrics
endpoint and under this, various sub-endpoints are implemented. These
are currently documented in `docs/metrics/v3.md`

The handler will serve metrics at any path
`/minio/metrics/v3/PATH`, as follows:

when PATH is a sub-endpoint listed above => serves the group of
metrics under that path; or when PATH is a (non-empty) parent 
directory of the sub-endpoints listed above => serves metrics
from each child sub-endpoint of PATH. otherwise, returns a no 
resource found error

All available metrics are listed in the `docs/metrics/v3.md`. More will
be added subsequently.
2024-03-10 01:15:15 -08:00