Metrics being added:

- read_tolerance: Number of drive failures that can be tolerated without disrupting read operations
- write_tolerance: Number of drive failures that can be tolerated without disrupting write operations
- read_health: Health of the erasure set in a pool for read operations (1=healthy, 0=unhealthy)
- write_health: Health of the erasure set in a pool for write operations (1=healthy, 0=unhealthy)
Metrics Version 3
In metrics version 3, all metrics are available under the endpoint:
/minio/metrics/v3
However, a specific path under this endpoint is required.
Metrics are organized into groups at paths relative to the top-level endpoint above.
Metrics Request Handling
Each endpoint below can be queried at different intervals as needed via a scrape configuration in Prometheus or a compatible metrics collection tool.
For ease of configuration, each (non-empty) parent of the path serves all metric endpoints that are at descendant paths. For example, to query all system metrics, one only needs to scrape /minio/metrics/v3/system/.
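The parent-path rule above can be modelled with a simple prefix match. The following is only a sketch of the described behaviour, not the server's implementation: the `ENDPOINTS` list is copied from the path tables in this document, and `served_by` is a hypothetical helper name.

```python
# Group paths from the tables in this document, relative to /minio/metrics/v3.
ENDPOINTS = [
    "/api/requests",
    "/api/bucket",
    "/audit",
    "/system/drive",
    "/system/memory",
    "/system/network/internode",
    "/system/process",
    "/debug/go",
    "/cluster/health",
    "/cluster/usage/objects",
    "/cluster/usage/buckets",
    "/cluster/erasure-set",
]


def served_by(parent: str) -> list[str]:
    """Return the endpoints located at or below `parent` (hypothetical helper)."""
    prefix = parent.rstrip("/")
    if not prefix:
        # Only non-empty parents serve their descendants.
        raise ValueError("parent path must not be empty")
    return [p for p in ENDPOINTS if p == prefix or p.startswith(prefix + "/")]


print(served_by("/system/"))
```

Scraping `/system/` in this model covers the drive, memory, internode network and process groups in a single request.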
Some metrics are bucket specific. These have a /bucket component in their path. As the number of buckets can be large, the metrics scrape operation needs to be provided with a specific list of buckets via the buckets query parameter. Only metrics for the given buckets are returned (with the bucket label set). For example, to query API metrics for buckets test1 and test2, make a scrape request to /minio/metrics/v3/api/bucket?buckets=test1,test2.
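As an illustration, a bucket-scoped scrape URL can be assembled as follows. This is a minimal sketch: the function name `bucket_metrics_url` and the host `http://localhost:9000` are assumptions, while the path and the comma-separated query value follow the example in the text above.

```python
from urllib.parse import urlencode


def bucket_metrics_url(base_url: str, buckets: list[str]) -> str:
    """Build a scrape URL for bucket API metrics (hypothetical helper)."""
    if not buckets:
        raise ValueError("at least one bucket name is required")
    # Keep the comma literal so the list reads test1,test2 as in the text.
    query = urlencode({"buckets": ",".join(buckets)}, safe=",")
    return f"{base_url.rstrip('/')}/minio/metrics/v3/api/bucket?{query}"


# The host below is an assumed local deployment.
print(bucket_metrics_url("http://localhost:9000", ["test1", "test2"]))
```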
Instead of a metrics scrape, it is also possible to list the metrics that would be returned by a path. This is done by adding a ?list query parameter. The MinIO server then lists all possible metrics that could be returned. During an actual metrics scrape, only the metrics that are currently available are returned, which may be a subset of this list. With the list query parameter, the output format can be selected: set the request Content-Type to application/json for JSON output, or text/plain for a simple markdown formatted table. The latter is the default.
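A listing request of this kind could be prepared as shown below. This is a sketch under assumptions: `list_metrics_request` is a hypothetical helper, the host is an assumed local deployment, and any authentication the server may require is omitted. The request object is only constructed here, not sent.

```python
from urllib.request import Request


def list_metrics_request(base_url: str, path: str, as_json: bool = False) -> Request:
    """Prepare a request that lists the metrics under `path` (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/minio/metrics/v3{path}?list"
    # The output format is selected through the request Content-Type header.
    content_type = "application/json" if as_json else "text/plain"
    return Request(url, headers={"Content-Type": content_type})


req = list_metrics_request("http://localhost:9000", "/system/drive", as_json=True)
print(req.full_url, req.get_header("Content-type"))
```

Passing the prepared object to `urllib.request.urlopen` would perform the actual call.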
Request, System and Cluster Metrics
At a high level, metrics are grouped into categories, listed in the following sub-sections. The path in each of the tables is relative to the top-level endpoint.
Request metrics
These are metrics about requests served by the (current) node.
Path | Description |
---|---|
/api/requests | Metrics over all requests |
/api/bucket | Metrics over all requests split by bucket labels |
Audit metrics
These are metrics about MinIO's audit functionality.
Path | Description |
---|---|
/audit | Metrics related to audit functionality |
System metrics
These are metrics about the minio process and the node.
Path | Description |
---|---|
/system/drive | Metrics about drives on the system |
/system/memory | Metrics about memory on the system |
/system/network/internode | Metrics about internode requests made by the node |
/system/process | Standard process metrics |
Debug metrics
These are metrics for debugging.
Path | Description |
---|---|
/debug/go | Standard Go lang metrics |
Cluster metrics
These present metrics about the whole MinIO cluster.
Path | Description |
---|---|
/cluster/health | Cluster health metrics |
/cluster/usage/objects | Object statistics |
/cluster/usage/buckets | Object statistics by bucket |
/cluster/erasure-set | Erasure set metrics |
Metrics Listing
Each of the following sub-sections lists the metrics returned by one of the endpoints.
The standard metrics group for GoCollector is not shown below.
/api/requests
Name | Type | Help | Labels |
---|---|---|---|
minio_api_requests_rejected_auth_total | counter | Total number of requests rejected for auth failure | type,pool_index,server |
minio_api_requests_rejected_header_total | counter | Total number of requests rejected for invalid header | type,pool_index,server |
minio_api_requests_rejected_timestamp_total | counter | Total number of requests rejected for invalid timestamp | type,pool_index,server |
minio_api_requests_rejected_invalid_total | counter | Total number of invalid requests | type,pool_index,server |
minio_api_requests_waiting_total | gauge | Total number of requests in the waiting queue | type,pool_index,server |
minio_api_requests_incoming_total | gauge | Total number of incoming requests | type,pool_index,server |
minio_api_requests_inflight_total | gauge | Total number of requests currently in flight | name,type,pool_index,server |
minio_api_requests_total | counter | Total number of requests | name,type,pool_index,server |
minio_api_requests_errors_total | counter | Total number of requests with (4xx and 5xx) errors | name,type,pool_index,server |
minio_api_requests_5xx_errors_total | counter | Total number of requests with 5xx errors | name,type,pool_index,server |
minio_api_requests_4xx_errors_total | counter | Total number of requests with 4xx errors | name,type,pool_index,server |
minio_api_requests_canceled_total | counter | Total number of requests canceled by the client | name,type,pool_index,server |
minio_api_requests_ttfb_seconds_distribution | counter | Distribution of time to first byte across API calls | name,type,le,pool_index,server |
minio_api_requests_traffic_sent_bytes | counter | Total number of bytes sent | type,pool_index,server |
minio_api_requests_traffic_received_bytes | counter | Total number of bytes received | type,pool_index,server |
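As one way to use the counters in this table, the sketch below derives a 5xx error ratio from scrape output in the Prometheus text format. The sample lines, including all label values and numbers, are invented for illustration, and the helper names `sum_by_metric` and `server_error_ratio` are hypothetical.

```python
import re
from collections import defaultdict

# Matches "metric_name{labels} value"; the label block is optional.
SAMPLE_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{.*\})?\s+(\S+)")


def sum_by_metric(exposition: str) -> dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals: dict[str, float] = defaultdict(float)
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


def server_error_ratio(exposition: str) -> float:
    """Fraction of all requests that ended in a 5xx error (0.0 if no requests)."""
    totals = sum_by_metric(exposition)
    requests = totals.get("minio_api_requests_total", 0.0)
    if requests == 0:
        return 0.0
    return totals.get("minio_api_requests_5xx_errors_total", 0.0) / requests


# Invented sample lines; label values and numbers are illustrative only.
SAMPLE = """\
# TYPE minio_api_requests_total counter
minio_api_requests_total{name="GetObject",type="s3",pool_index="0",server="node1:9000"} 150
minio_api_requests_total{name="PutObject",type="s3",pool_index="0",server="node1:9000"} 50
minio_api_requests_5xx_errors_total{name="GetObject",type="s3",pool_index="0",server="node1:9000"} 4
"""

print(server_error_ratio(SAMPLE))  # 4 errors out of 200 requests
```

Because these counters accumulate over time, the ratio computed this way covers the whole counting period; a monitoring system would more commonly compare rates over a recent window.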
/bucket/api
Name | Type | Help | Labels |
---|---|---|---|
minio_bucket_api_traffic_received_bytes | counter | Total number of bytes received for a bucket | bucket,type,server,pool_index |
minio_bucket_api_traffic_sent_bytes | counter | Total number of bytes sent for a bucket | bucket,type,server,pool_index |
minio_bucket_api_inflight_total | gauge | Total number of requests currently in flight for a bucket | bucket,name,type,server,pool_index |
minio_bucket_api_total | counter | Total number of requests for a bucket | bucket,name,type,server,pool_index |
minio_bucket_api_canceled_total | counter | Total number of requests canceled by the client for a bucket | bucket,name,type,server,pool_index |
minio_bucket_api_4xx_errors_total | counter | Total number of requests with 4xx errors for a bucket | bucket,name,type,server,pool_index |
minio_bucket_api_5xx_errors_total | counter | Total number of requests with 5xx errors for a bucket | bucket,name,type,server,pool_index |
minio_bucket_api_ttfb_seconds_distribution | counter | Distribution of time to first byte across API calls for a bucket | bucket,name,le,type,server,pool_index |
/audit
Name | Type | Help | Labels |
---|---|---|---|
minio_audit_failed_messages | counter | Total number of messages that failed to send since start | target_id,server |
minio_audit_target_queue_length | gauge | Number of unsent messages in queue for target | target_id,server |
minio_audit_total_messages | counter | Total number of messages sent since start | target_id,server |
/system/drive
Name | Type | Help | Labels |
---|---|---|---|
minio_system_drive_used_bytes | gauge | Total storage used on a drive in bytes | drive,set_index,drive_index,pool_index,server |
minio_system_drive_free_bytes | gauge | Total storage free on a drive in bytes | drive,set_index,drive_index,pool_index,server |
minio_system_drive_total_bytes | gauge | Total storage available on a drive in bytes | drive,set_index,drive_index,pool_index,server |
minio_system_drive_used_inodes | gauge | Total used inodes on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_free_inodes | gauge | Total free inodes on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_total_inodes | gauge | Total inodes available on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_timeout_errors_total | counter | Total timeout errors on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_io_errors_total | counter | Total I/O errors on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_availability_errors_total | counter | Total availability errors (I/O errors, timeouts) on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_waiting_io | gauge | Total waiting I/O operations on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_api_latency_micros | gauge | Average last minute latency in µs for drive API storage operations | drive,api,set_index,drive_index,pool_index,server |
minio_system_drive_offline_count | gauge | Count of offline drives | pool_index,server |
minio_system_drive_online_count | gauge | Count of online drives | pool_index,server |
minio_system_drive_count | gauge | Count of all drives | pool_index,server |
minio_system_drive_health | gauge | Drive health (0 = offline, 1 = healthy, 2 = healing) | drive,set_index,drive_index,pool_index,server |
minio_system_drive_reads_per_sec | gauge | Reads per second on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_reads_kb_per_sec | gauge | Kilobytes read per second on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_reads_await | gauge | Average time for read requests served on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_writes_per_sec | gauge | Writes per second on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_writes_kb_per_sec | gauge | Kilobytes written per second on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_writes_await | gauge | Average time for write requests served on a drive | drive,set_index,drive_index,pool_index,server |
minio_system_drive_perc_util | gauge | Percentage of time the disk was busy | drive,set_index,drive_index,pool_index,server |
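The numeric codes of minio_system_drive_health can be decoded on the consumer side. The sketch below mirrors the three values documented in the table; the names `DriveHealth` and `describe_drive_health` are hypothetical, and mapping any other value to "unknown" is an assumption.

```python
from enum import IntEnum


class DriveHealth(IntEnum):
    """Values documented for minio_system_drive_health."""
    OFFLINE = 0
    HEALTHY = 1
    HEALING = 2


def describe_drive_health(value: float) -> str:
    """Translate a scraped gauge value into a readable state name."""
    try:
        return DriveHealth(int(value)).name.lower()
    except ValueError:
        # Assumption: values outside the documented set are reported as unknown.
        return "unknown"


print(describe_drive_health(2.0))
```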
/system/memory
Name | Type | Help | Labels |
---|---|---|---|
minio_system_memory_used | gauge | Used memory on the node | server |
minio_system_memory_used_perc | gauge | Used memory percentage on the node | server |
minio_system_memory_free | gauge | Free memory on the node | server |
minio_system_memory_total | gauge | Total memory on the node | server |
minio_system_memory_buffers | gauge | Buffers memory on the node | server |
minio_system_memory_cache | gauge | Cache memory on the node | server |
minio_system_memory_shared | gauge | Shared memory on the node | server |
minio_system_memory_available | gauge | Available memory on the node | server |
/system/cpu
Name | Type | Help | Labels |
---|---|---|---|
minio_system_cpu_avg_idle | gauge | Average CPU idle time | server |
minio_system_cpu_avg_iowait | gauge | Average CPU IOWait time | server |
minio_system_cpu_load | gauge | CPU load average 1min | server |
minio_system_cpu_load_perc | gauge | CPU load average 1min (percentage) | server |
minio_system_cpu_nice | gauge | CPU nice time | server |
minio_system_cpu_steal | gauge | CPU steal time | server |
minio_system_cpu_system | gauge | CPU system time | server |
minio_system_cpu_user | gauge | CPU user time | server |
/system/network/internode
Name | Type | Help | Labels |
---|---|---|---|
minio_system_network_internode_errors_total | counter | Total number of failed internode calls | server,pool_index |
minio_system_network_internode_dial_errors_total | counter | Total number of internode TCP dial timeouts and errors | server,pool_index |
minio_system_network_internode_dial_avg_time_nanos | gauge | Average dial time of internode TCP calls in nanoseconds | server,pool_index |
minio_system_network_internode_sent_bytes_total | counter | Total number of bytes sent to other peer nodes | server,pool_index |
minio_system_network_internode_recv_bytes_total | counter | Total number of bytes received from other peer nodes | server,pool_index |
/system/process
Name | Type | Help | Labels |
---|---|---|---|
locks_read_total | gauge | Number of current READ locks on this peer | server |
locks_write_total | gauge | Number of current WRITE locks on this peer | server |
cpu_total_seconds | counter | Total user and system CPU time spent in seconds | server |
go_routine_total | gauge | Total number of goroutines running | server |
io_rchar_bytes | counter | Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar | server |
io_read_bytes | counter | Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes | server |
io_wchar_bytes | counter | Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar | server |
io_write_bytes | counter | Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes | server |
start_time_seconds | gauge | Start time for MinIO process in seconds since Unix epoch | server |
uptime_seconds | gauge | Uptime for MinIO process in seconds | server |
file_descriptor_limit_total | gauge | Limit on total number of open file descriptors for the MinIO Server process | server |
file_descriptor_open_total | gauge | Total number of open file descriptors by the MinIO Server process | server |
syscall_read_total | counter | Total read SysCalls to the kernel. /proc/[pid]/io syscr | server |
syscall_write_total | counter | Total write SysCalls to the kernel. /proc/[pid]/io syscw | server |
resident_memory_bytes | gauge | Resident memory size in bytes | server |
virtual_memory_bytes | gauge | Virtual memory size in bytes | server |
virtual_memory_max_bytes | gauge | Maximum virtual memory size in bytes | server |
/cluster/health
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_health_drives_offline_count | gauge | Count of offline drives in the cluster | |
minio_cluster_health_drives_online_count | gauge | Count of online drives in the cluster | |
minio_cluster_health_drives_count | gauge | Count of all drives in the cluster | |
minio_cluster_health_nodes_offline_count | gauge | Count of offline nodes in the cluster | |
minio_cluster_health_nodes_online_count | gauge | Count of online nodes in the cluster | |
minio_cluster_health_capacity_raw_total_bytes | gauge | Total cluster raw storage capacity in bytes | |
minio_cluster_health_capacity_raw_free_bytes | gauge | Total cluster raw storage free in bytes | |
minio_cluster_health_capacity_usable_total_bytes | gauge | Total cluster usable storage capacity in bytes | |
minio_cluster_health_capacity_usable_free_bytes | gauge | Total cluster usable storage free in bytes | |
/cluster/usage/objects
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_usage_objects_since_last_update_seconds | gauge | Time since last update of usage metrics in seconds | |
minio_cluster_usage_objects_total_bytes | gauge | Total cluster usage in bytes | |
minio_cluster_usage_objects_count | gauge | Total cluster objects count | |
minio_cluster_usage_objects_versions_count | gauge | Total cluster object versions (including delete markers) count | |
minio_cluster_usage_objects_delete_markers_count | gauge | Total cluster delete markers count | |
minio_cluster_usage_objects_buckets_count | gauge | Total cluster buckets count | |
minio_cluster_usage_objects_size_distribution | gauge | Cluster object size distribution | range |
minio_cluster_usage_objects_version_count_distribution | gauge | Cluster object version count distribution | range |
/cluster/usage/buckets
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_usage_buckets_since_last_update_seconds | gauge | Time since last update of usage metrics in seconds | |
minio_cluster_usage_buckets_total_bytes | gauge | Total bucket size in bytes | bucket |
minio_cluster_usage_buckets_objects_count | gauge | Total objects count in bucket | bucket |
minio_cluster_usage_buckets_versions_count | gauge | Total object versions (including delete markers) count in bucket | bucket |
minio_cluster_usage_buckets_delete_markers_count | gauge | Total delete markers count in bucket | bucket |
minio_cluster_usage_buckets_quota_total_bytes | gauge | Total bucket quota in bytes | bucket |
minio_cluster_usage_buckets_object_size_distribution | gauge | Bucket object size distribution | range,bucket |
minio_cluster_usage_buckets_object_version_count_distribution | gauge | Bucket object version count distribution | range,bucket |
/cluster/erasure-set
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_erasure_set_overall_write_quorum | gauge | Overall write quorum across pools and sets | |
minio_cluster_erasure_set_overall_health | gauge | Overall health across pools and sets (1=healthy, 0=unhealthy) | |
minio_cluster_erasure_set_read_quorum | gauge | Read quorum for the erasure set in a pool | pool_id,set_id |
minio_cluster_erasure_set_write_quorum | gauge | Write quorum for the erasure set in a pool | pool_id,set_id |
minio_cluster_erasure_set_online_drives_count | gauge | Count of online drives in the erasure set in a pool | pool_id,set_id |
minio_cluster_erasure_set_healing_drives_count | gauge | Count of healing drives in the erasure set in a pool | pool_id,set_id |
minio_cluster_erasure_set_health | gauge | Health of the erasure set in a pool (1=healthy, 0=unhealthy) | pool_id,set_id |
minio_cluster_erasure_set_read_tolerance | gauge | Number of drive failures that can be tolerated without disrupting read operations | pool_id,set_id |
minio_cluster_erasure_set_write_tolerance | gauge | Number of drive failures that can be tolerated without disrupting write operations | pool_id,set_id |
minio_cluster_erasure_set_read_health | gauge | Health of the erasure set in a pool for read operations (1=healthy, 0=unhealthy) | pool_id,set_id |
minio_cluster_erasure_set_write_health | gauge | Health of the erasure set in a pool for write operations (1=healthy, 0=unhealthy) | pool_id,set_id |
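The per-set health and tolerance gauges lend themselves to a simple check. The sketch below flags erasure sets that are unhealthy for writes or that cannot absorb another drive failure. The flagging policy, the `ErasureSetSample` type, and the function name are assumptions made for illustration; the field names follow the metric and label names in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureSetSample:
    """One erasure set's scraped values (hypothetical container type)."""
    pool_id: str
    set_id: str
    write_health: int      # minio_cluster_erasure_set_write_health
    write_tolerance: int   # minio_cluster_erasure_set_write_tolerance


def sets_needing_attention(samples: list[ErasureSetSample]) -> list[tuple[str, str, str]]:
    """Return (pool_id, set_id, reason) for sets matching the assumed policy."""
    flagged = []
    for s in samples:
        if s.write_health == 0:
            flagged.append((s.pool_id, s.set_id, "write-unhealthy"))
        elif s.write_tolerance == 0:
            # Still healthy, but one more drive failure would disrupt writes.
            flagged.append((s.pool_id, s.set_id, "no-write-tolerance"))
    return flagged


# Invented values for illustration.
samples = [
    ErasureSetSample("0", "0", write_health=1, write_tolerance=3),
    ErasureSetSample("0", "1", write_health=1, write_tolerance=0),
    ErasureSetSample("1", "0", write_health=0, write_tolerance=0),
]
print(sets_needing_attention(samples))
```

The same pattern applies to the read_health and read_tolerance gauges.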
/cluster/notification
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_notification_current_send_in_progress | counter | Number of concurrent async Send calls active to all targets | |
minio_cluster_notification_events_errors_total | counter | Events that failed to be sent to the targets | |
minio_cluster_notification_events_sent_total | counter | Total number of events sent to the targets | |
minio_cluster_notification_events_skipped_total | counter | Events that were not sent to the targets because the in-memory queue was full | |
/cluster/iam
Name | Type | Help | Labels |
---|---|---|---|
minio_cluster_iam_last_sync_duration_millis | counter | Last successful IAM data sync duration in milliseconds | |
minio_cluster_iam_plugin_authn_service_failed_requests_minute | counter | When plugin authentication is configured, returns failed requests count in the last full minute | |
minio_cluster_iam_plugin_authn_service_last_fail_seconds | counter | When plugin authentication is configured, returns time (in seconds) since the last failed request to the service | |
minio_cluster_iam_plugin_authn_service_last_succ_seconds | counter | When plugin authentication is configured, returns time (in seconds) since the last successful request to the service | |
minio_cluster_iam_plugin_authn_service_succ_avg_rtt_ms_minute | counter | When plugin authentication is configured, returns average round-trip-time of successful requests in the last full minute | |
minio_cluster_iam_plugin_authn_service_succ_max_rtt_ms_minute | counter | When plugin authentication is configured, returns maximum round-trip-time of successful requests in the last full minute | |
minio_cluster_iam_plugin_authn_service_total_requests_minute | counter | When plugin authentication is configured, returns total requests count in the last full minute | |
minio_cluster_iam_since_last_sync_millis | counter | Time (in milliseconds) since last successful IAM data sync | |
minio_cluster_iam_sync_failures | counter | Number of failed IAM data syncs since server start | |
minio_cluster_iam_sync_successes | counter | Number of successful IAM data syncs since server start | |