Metrics v3 is mainly a reorganization of metrics into smaller groups, together with the removal of internal aggregation of metrics received from peer nodes in a MinIO cluster. This change adds `/minio/metrics/v3` as the top-level metrics endpoint and implements various sub-endpoints under it. These are currently documented in `docs/metrics/v3.md`.

The handler serves metrics at any path `/minio/metrics/v3/PATH` as follows. When PATH is one of the documented sub-endpoints, it serves the group of metrics under that path. When PATH is a (non-empty) parent directory of documented sub-endpoints, it serves metrics from each child sub-endpoint of PATH. Otherwise, it returns a "no resource found" error.

All available metrics are listed in `docs/metrics/v3.md`. More will be added subsequently.
Metrics Version 3
In metrics version 3, all metrics are available under the endpoint `/minio/metrics/v3`. However, a specific path under this endpoint is required.
Metrics are organized into groups at paths relative to the top-level endpoint above.
Metrics Request Handling
Each endpoint below can be queried at different intervals as needed via a scrape configuration in Prometheus or a compatible metrics collection tool.
For ease of configuration, each (non-empty) parent of the path serves all metric endpoints that are at descendant paths. For example, to query all system metrics, one only needs to scrape `/minio/metrics/v3/system/`.
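The path matching described above can be sketched as follows. This is only an illustrative model, not the server's implementation: the function name `resolve` is hypothetical, and the list of sub-endpoints is copied from the tables in this document.

```python
# Sub-endpoints as listed in this document, relative to /minio/metrics/v3.
SUB_ENDPOINTS = [
    "/api/requests",
    "/api/bucket",
    "/system/drive",
    "/system/network/internode",
    "/system/process",
    "/system/go",
    "/cluster/health",
    "/cluster/usage/objects",
    "/cluster/usage/buckets",
    "/cluster/erasure-set",
]


def resolve(path):
    """Return the sub-endpoints that PATH would serve.

    An empty list models the "no resource found" case. This hypothetical
    helper models path matching only; it ignores query parameters.
    """
    normalized = "/" + path.strip("/")
    if normalized == "/":
        return []  # a specific (non-empty) path is required
    if normalized in SUB_ENDPOINTS:
        return [normalized]  # exact sub-endpoint: serve just that group
    prefix = normalized + "/"
    # Parent path: serve every sub-endpoint at a descendant path.
    return [e for e in SUB_ENDPOINTS if e.startswith(prefix)]


if __name__ == "__main__":
    print(resolve("/system/"))
    print(resolve("/cluster/usage"))
    print(resolve("/unknown"))
```

Matching on `normalized + "/"` rather than on the bare prefix keeps a partial name such as `/sys` from matching `/system/...`.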
Some metrics are bucket specific. These will have a `/bucket` component in their path. As the number of buckets can be large, the metrics scrape operation needs to be provided with a specific list of buckets via the `buckets` query parameter. Only metrics for the given buckets will be returned (with the bucket label set). For example, to query API metrics for buckets `test1` and `test2`, make a scrape request to `/minio/metrics/v3/api/bucket?buckets=test1,test2`.
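As a small sketch of how such a scrape URL could be assembled, the helper below joins bucket names into the comma-separated query value. The function name `bucket_scrape_url` and the base URL `http://localhost:9000` are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def bucket_scrape_url(base_url, buckets, path="/api/bucket"):
    """Build a metrics v3 scrape URL for a specific list of buckets.

    Hypothetical helper: base_url is whatever address the MinIO server
    is reachable at; path is a bucket-specific sub-endpoint.
    """
    if not buckets:
        raise ValueError("at least one bucket name is required")
    # Keep commas unescaped so the value reads buckets=test1,test2.
    query = urlencode({"buckets": ",".join(buckets)}, safe=",")
    return f"{base_url.rstrip('/')}/minio/metrics/v3{path}?{query}"


if __name__ == "__main__":
    print(bucket_scrape_url("http://localhost:9000", ["test1", "test2"]))
```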
Instead of a metrics scrape, it is also possible to list the metrics that would be returned by a path. This is done by adding a `?list` query parameter. The MinIO server will then list all possible metrics that could be returned. During an actual metrics scrape, only available metrics are returned, not all of them. With the `list` query parameter, the output format can be selected: set the request `Content-Type` to `application/json` for JSON output, or `text/plain` for a simple Markdown-formatted table. The latter is the default.
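A listing request could be prepared along these lines. This is a minimal sketch: the helper name `build_list_request` and the base URL are hypothetical, and any authentication the server may require is left out.

```python
import urllib.request


def build_list_request(base_url, path, as_json=True):
    """Prepare (but do not send) a request that lists metrics for PATH.

    Hypothetical helper: the ?list parameter and the Content-Type values
    come from the text above; everything else is an assumption.
    """
    url = f"{base_url.rstrip('/')}/minio/metrics/v3{path}?list"
    content_type = "application/json" if as_json else "text/plain"
    return urllib.request.Request(url, headers={"Content-Type": content_type})


if __name__ == "__main__":
    req = build_list_request("http://localhost:9000", "/api/requests")
    print(req.full_url)
    # Sending it would look like this (requires a reachable server):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```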
Request, System and Cluster Metrics
At a high level, metrics are grouped into three categories, listed in the following sub-sections. The path in each of the tables is relative to the top-level endpoint.
Request metrics
These are metrics about requests served by the (current) node.
Path | Description |
---|---|
`/api/requests` | Metrics over all requests |
`/api/bucket` | Metrics over all requests split by bucket labels |
System metrics
These are metrics about the MinIO process and the node.
Path | Description |
---|---|
`/system/drive` | Metrics about drives on the system |
`/system/network/internode` | Metrics about internode requests made by the node |
`/system/process` | Standard process metrics |
`/system/go` | Standard Go metrics |
Cluster metrics
These present metrics about the whole MinIO cluster.
Path | Description |
---|---|
`/cluster/health` | Cluster health metrics |
`/cluster/usage/objects` | Object statistics |
`/cluster/usage/buckets` | Object statistics by bucket |
`/cluster/erasure-set` | Erasure set metrics |
Metrics Listing
Each of the following sub-sections lists the metrics returned by one of the endpoints.
The standard metrics groups for ProcessCollector and GoCollector are not shown below.
/api/requests
Name | Type | Help | Labels |
---|---|---|---|
`minio_api_requests_rejected_auth_total` | `counter` | Total number of requests rejected for auth failure | type,pool_index,server |
`minio_api_requests_rejected_header_total` | `counter` | Total number of requests rejected for invalid header | type,pool_index,server |
`minio_api_requests_rejected_timestamp_total` | `counter` | Total number of requests rejected for invalid timestamp | type,pool_index,server |
`minio_api_requests_rejected_invalid_total` | `counter` | Total number of invalid requests | type,pool_index,server |
`minio_api_requests_waiting_total` | `gauge` | Total number of requests in the waiting queue | type,pool_index,server |
`minio_api_requests_incoming_total` | `gauge` | Total number of incoming requests | type,pool_index,server |
`minio_api_requests_inflight_total` | `gauge` | Total number of requests currently in flight | name,type,pool_index,server |
`minio_api_requests_total` | `counter` | Total number of requests | name,type,pool_index,server |
`minio_api_requests_errors_total` | `counter` | Total number of requests with (4xx and 5xx) errors | name,type,pool_index,server |
`minio_api_requests_5xx_errors_total` | `counter` | Total number of requests with 5xx errors | name,type,pool_index,server |
`minio_api_requests_4xx_errors_total` | `counter` | Total number of requests with 4xx errors | name,type,pool_index,server |
`minio_api_requests_canceled_total` | `counter` | Total number of requests canceled by the client | name,type,pool_index,server |
`minio_api_requests_ttfb_seconds_distribution` | `counter` | Distribution of time to first byte across API calls | name,type,le,pool_index,server |
`minio_api_requests_traffic_sent_bytes` | `counter` | Total number of bytes sent | type,pool_index,server |
`minio_api_requests_traffic_received_bytes` | `counter` | Total number of bytes received | type,pool_index,server |
/api/bucket
Name | Type | Help | Labels |
---|---|---|---|
`minio_api_bucket_traffic_received_bytes` | `counter` | Total number of bytes received for a bucket | bucket,type,server,pool_index |
`minio_api_bucket_traffic_sent_bytes` | `counter` | Total number of bytes sent for a bucket | bucket,type,server,pool_index |
`minio_api_bucket_inflight_total` | `gauge` | Total number of requests currently in flight for a bucket | bucket,name,type,server,pool_index |
`minio_api_bucket_total` | `counter` | Total number of requests for a bucket | bucket,name,type,server,pool_index |
`minio_api_bucket_canceled_total` | `counter` | Total number of requests canceled by the client for a bucket | bucket,name,type,server,pool_index |
`minio_api_bucket_4xx_errors_total` | `counter` | Total number of requests with 4xx errors for a bucket | bucket,name,type,server,pool_index |
`minio_api_bucket_5xx_errors_total` | `counter` | Total number of requests with 5xx errors for a bucket | bucket,name,type,server,pool_index |
`minio_api_bucket_ttfb_seconds_distribution` | `counter` | Distribution of time to first byte across API calls for a bucket | bucket,name,le,type,server,pool_index |
/system/drive
Name | Type | Help | Labels |
---|---|---|---|
`minio_system_drive_used_bytes` | `gauge` | Total storage used on a drive in bytes | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_free_bytes` | `gauge` | Total storage free on a drive in bytes | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_total_bytes` | `gauge` | Total storage available on a drive in bytes | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_free_inodes` | `gauge` | Total free inodes on a drive | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_timeout_errors_total` | `counter` | Total timeout errors on a drive | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_availability_errors_total` | `counter` | Total availability errors (I/O errors, permission denied and timeouts) on a drive | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_waiting_io` | `gauge` | Total waiting I/O operations on a drive | drive,set_index,drive_index,pool_index,server |
`minio_system_drive_api_latency_micros` | `gauge` | Average last minute latency in µs for drive API storage operations | drive,api,set_index,drive_index,pool_index,server |
`minio_system_drive_offline_count` | `gauge` | Count of offline drives | pool_index,server |
`minio_system_drive_online_count` | `gauge` | Count of online drives | pool_index,server |
`minio_system_drive_count` | `gauge` | Count of all drives | pool_index,server |
/system/network/internode
Name | Type | Help | Labels |
---|---|---|---|
`minio_system_network_internode_errors_total` | `counter` | Total number of failed internode calls | server,pool_index |
`minio_system_network_internode_dial_errors_total` | `counter` | Total number of internode TCP dial timeouts and errors | server,pool_index |
`minio_system_network_internode_dial_avg_time_nanos` | `gauge` | Average dial time of internodes TCP calls in nanoseconds | server,pool_index |
`minio_system_network_internode_sent_bytes_total` | `counter` | Total number of bytes sent to other peer nodes | server,pool_index |
`minio_system_network_internode_recv_bytes_total` | `counter` | Total number of bytes received from other peer nodes | server,pool_index |
/cluster/health
Name | Type | Help | Labels |
---|---|---|---|
`minio_cluster_health_drives_offline_count` | `gauge` | Count of offline drives in the cluster | |
`minio_cluster_health_drives_online_count` | `gauge` | Count of online drives in the cluster | |
`minio_cluster_health_drives_count` | `gauge` | Count of all drives in the cluster | |
`minio_cluster_health_nodes_offline_count` | `gauge` | Count of offline nodes in the cluster | |
`minio_cluster_health_nodes_online_count` | `gauge` | Count of online nodes in the cluster | |
`minio_cluster_health_capacity_raw_total_bytes` | `gauge` | Total cluster raw storage capacity in bytes | |
`minio_cluster_health_capacity_raw_free_bytes` | `gauge` | Total cluster raw storage free in bytes | |
`minio_cluster_health_capacity_usable_total_bytes` | `gauge` | Total cluster usable storage capacity in bytes | |
`minio_cluster_health_capacity_usable_free_bytes` | `gauge` | Total cluster usable storage free in bytes | |
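As one possible use of these gauges, the sketch below derives the fraction of usable capacity in use from a scrape of `/cluster/health`. It assumes the response is in the Prometheus text format with label-less samples and no timestamps; the helper names and the sample values are made up for illustration.

```python
def parse_samples(text):
    """Parse label-less 'name value' lines from a text-format scrape.

    Simplifying assumptions: no labels and no trailing timestamps, which
    matches the empty Labels column of the table above.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def usable_used_fraction(samples):
    """Return used/total usable capacity, or None if total is not positive."""
    total = samples["minio_cluster_health_capacity_usable_total_bytes"]
    free = samples["minio_cluster_health_capacity_usable_free_bytes"]
    if total <= 0:
        return None
    return (total - free) / total


if __name__ == "__main__":
    # Made-up sample values, not real server output.
    scrape = """
    # TYPE minio_cluster_health_capacity_usable_total_bytes gauge
    minio_cluster_health_capacity_usable_total_bytes 1000
    minio_cluster_health_capacity_usable_free_bytes 750
    """
    print(usable_used_fraction(parse_samples(scrape)))
```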
/cluster/usage/objects
Name | Type | Help | Labels |
---|---|---|---|
`minio_cluster_usage_objects_since_last_update_seconds` | `gauge` | Time since last update of usage metrics in seconds | |
`minio_cluster_usage_objects_total_bytes` | `gauge` | Total cluster usage in bytes | |
`minio_cluster_usage_objects_count` | `gauge` | Total cluster objects count | |
`minio_cluster_usage_objects_versions_count` | `gauge` | Total cluster object versions (including delete markers) count | |
`minio_cluster_usage_objects_delete_markers_count` | `gauge` | Total cluster delete markers count | |
`minio_cluster_usage_objects_buckets_count` | `gauge` | Total cluster buckets count | |
`minio_cluster_usage_objects_size_distribution` | `gauge` | Cluster object size distribution | range |
`minio_cluster_usage_objects_version_count_distribution` | `gauge` | Cluster object version count distribution | range |
/cluster/usage/buckets
Name | Type | Help | Labels |
---|---|---|---|
`minio_cluster_usage_buckets_since_last_update_seconds` | `gauge` | Time since last update of usage metrics in seconds | |
`minio_cluster_usage_buckets_total_bytes` | `gauge` | Total bucket size in bytes | bucket |
`minio_cluster_usage_buckets_objects_count` | `gauge` | Total objects count in bucket | bucket |
`minio_cluster_usage_buckets_versions_count` | `gauge` | Total object versions (including delete markers) count in bucket | bucket |
`minio_cluster_usage_buckets_delete_markers_count` | `gauge` | Total delete markers count in bucket | bucket |
`minio_cluster_usage_buckets_quota_total_bytes` | `gauge` | Total bucket quota in bytes | bucket |
`minio_cluster_usage_buckets_object_size_distribution` | `gauge` | Bucket object size distribution | range,bucket |
`minio_cluster_usage_buckets_object_version_count_distribution` | `gauge` | Bucket object version count distribution | range,bucket |
/cluster/erasure-set
Name | Type | Help | Labels |
---|---|---|---|
`minio_cluster_erasure_set_overall_write_quorum` | `gauge` | Overall write quorum across pools and sets | |
`minio_cluster_erasure_set_overall_health` | `gauge` | Overall health across pools and sets (1=healthy, 0=unhealthy) | |
`minio_cluster_erasure_set_read_quorum` | `gauge` | Read quorum for the erasure set in a pool | pool_id,set_id |
`minio_cluster_erasure_set_write_quorum` | `gauge` | Write quorum for the erasure set in a pool | pool_id,set_id |
`minio_cluster_erasure_set_online_drives_count` | `gauge` | Count of online drives in the erasure set in a pool | pool_id,set_id |
`minio_cluster_erasure_set_healing_drives_count` | `gauge` | Count of healing drives in the erasure set in a pool | pool_id,set_id |
`minio_cluster_erasure_set_health` | `gauge` | Health of the erasure set in a pool (1=healthy, 0=unhealthy) | pool_id,set_id |