Add cluster and bucket replication metrics in metrics-v3 (#19546)

endpoint: /minio/metrics/v3/cluster/replication
metrics:
- average_active_workers
- average_queued_bytes
- average_queued_count
- average_data_transfer_rate
- current_active_workers
- current_data_transfer_rate
- last_minute_queued_bytes
- last_minute_queued_count
- max_active_workers
- max_queued_bytes
- max_queued_count
- max_data_transfer_rate
- recent_backlog_count

endpoint: /minio/metrics/v3/api/bucket/replication
metrics:
- last_hour_failed_bytes
- last_hour_failed_count
- last_minute_failed_bytes
- last_minute_failed_count
- latency_ms
- proxied_delete_tagging_requests_total
- proxied_get_requests_failures
- proxied_get_requests_total
- proxied_get_tagging_requests_failures
- proxied_get_tagging_requests_total
- proxied_head_requests_failures
- proxied_head_requests_total
- proxied_put_tagging_requests_failures
- proxied_put_tagging_requests_total
- sent_bytes
- sent_count
- total_failed_bytes
- total_failed_count
- proxied_delete_tagging_requests_failures
Shireesh Anjal
2024-05-23 13:11:18 +05:30
committed by GitHub
parent 6d5bc045bc
commit 7981509cc8
6 changed files with 395 additions and 45 deletions

@@ -31,7 +31,7 @@ These are metrics about requests served by the (current) node.
| Path | Description |
|-----------------|--------------------------------------------------|
| `/api/requests` | Metrics over all requests |
| `/api/bucket` | Metrics over all requests split by bucket labels |
| `/bucket/api` | Metrics over all requests for a given bucket |
| | |
### Audit metrics
@@ -122,6 +122,30 @@ The standard metrics group for GoCollector is not shown below.
| `minio_bucket_api_5xx_errors_total` | `counter` | Total number of requests with 5xx errors for a bucket | `bucket,name,type,server,pool_index` |
| `minio_bucket_api_ttfb_seconds_distribution` | `counter` | Distribution of time to first byte across API calls for a bucket | `bucket,name,le,type,server,pool_index` |
### `/bucket/replication`
| Name | Type | Help | Labels |
|---------------------------------------------------------------------|-----------|---------------------------------------------------------------------------------------------|-------------------------------------------|
| `minio_bucket_replication_last_hour_failed_bytes` | `gauge` | Total number of bytes failed at least once to replicate in the last hour on a bucket | `bucket,server` |
| `minio_bucket_replication_last_hour_failed_count` | `gauge` | Total number of objects which failed replication in the last hour on a bucket | `bucket,server` |
| `minio_bucket_replication_last_minute_failed_bytes` | `gauge` | Total number of bytes failed at least once to replicate in the last full minute on a bucket | `bucket,server` |
| `minio_bucket_replication_last_minute_failed_count` | `gauge` | Total number of objects which failed replication in the last full minute on a bucket | `bucket,server` |
| `minio_bucket_replication_latency_ms` | `gauge` | Replication latency on a bucket in milliseconds | `bucket,operation,range,targetArn,server` |
| `minio_bucket_replication_proxied_delete_tagging_requests_total` | `counter` | Number of DELETE tagging requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_get_requests_failures` | `counter` | Number of failures in GET requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_get_requests_total` | `counter` | Number of GET requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_get_tagging_requests_failures` | `counter` | Number of failures in GET tagging requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_get_tagging_requests_total` | `counter` | Number of GET tagging requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_head_requests_failures` | `counter` | Number of failures in HEAD requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_head_requests_total` | `counter` | Number of HEAD requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_put_tagging_requests_failures` | `counter` | Number of failures in PUT tagging requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_proxied_put_tagging_requests_total` | `counter` | Number of PUT tagging requests proxied to replication target | `bucket,server` |
| `minio_bucket_replication_sent_bytes` | `counter` | Total number of bytes replicated to the target | `bucket,server` |
| `minio_bucket_replication_sent_count` | `counter` | Total number of objects replicated to the target | `bucket,server` |
| `minio_bucket_replication_total_failed_bytes` | `counter` | Total number of bytes failed at least once to replicate since server start | `bucket,server` |
| `minio_bucket_replication_total_failed_count` | `counter` | Total number of objects which failed replication since server start | `bucket,server` |
| `minio_bucket_replication_proxied_delete_tagging_requests_failures` | `counter` | Number of failures in DELETE tagging requests proxied to replication target | `bucket,server` |
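
The proxied request counters above come in `_total` / `_failures` pairs, so a per-bucket failure ratio can be derived from a single scrape. The following is a minimal sketch, assuming the endpoint returns standard Prometheus text exposition format and that `_total` counts every proxied request, failed ones included. The sample text, bucket names, and the helper names `parse_samples` and `proxied_get_failure_ratio` are hypothetical and not part of MinIO.

```python
import re
from collections import defaultdict

# One labelled sample line: name{label="value",...} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{([^}]*)\}\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each labelled sample line.

    Simplified parser: it does not handle escaped quotes or braces
    inside label values.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m:
            labels = dict(LABEL_RE.findall(m.group(2)))
            yield m.group(1), labels, float(m.group(3))


def proxied_get_failure_ratio(text):
    """Return {bucket: failures / total}, summed across servers."""
    prefix = "minio_bucket_replication_proxied_get_requests_"
    totals = defaultdict(float)
    failures = defaultdict(float)
    for name, labels, value in parse_samples(text):
        bucket = labels.get("bucket", "")
        if name == prefix + "total":
            totals[bucket] += value
        elif name == prefix + "failures":
            failures[bucket] += value
    # Buckets with no proxied GET requests are omitted to avoid dividing by zero.
    return {b: failures[b] / t for b, t in totals.items() if t > 0}


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# TYPE minio_bucket_replication_proxied_get_requests_total counter
minio_bucket_replication_proxied_get_requests_total{bucket="photos",server="node1:9000"} 200
minio_bucket_replication_proxied_get_requests_failures{bucket="photos",server="node1:9000"} 5
minio_bucket_replication_proxied_get_requests_total{bucket="logs",server="node1:9000"} 0
"""

ratios = proxied_get_failure_ratio(SAMPLE)
print(ratios)
```

Because these are counters that reset on server restart, a monitoring system would normally compute the ratio over rate-of-change windows rather than raw values; the raw ratio here is only meant to show how the metric pairs relate.
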
### `/audit`
| Name | Type | Help | Labels |
@@ -195,25 +219,25 @@ The standard metrics group for GoCollector is not shown below.
### `/system/process`
| Name | Type | Help | Labels |
|-------------------------------|-----------|----------------------------------------------------------------------------------------------------------------|----------|
| `locks_read_total` | `gauge` | Number of current READ locks on this peer | `server` |
| `locks_write_total` | `gauge` | Number of current WRITE locks on this peer | `server` |
| `cpu_total_seconds` | `counter` | Total user and system CPU time spent in seconds | `server` |
| `go_routine_total` | `gauge` | Total number of go routines running | `server` |
| `io_rchar_bytes` | `counter` | Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar | `server` |
| `io_read_bytes` | `counter` | Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes | `server` |
| `io_wchar_bytes` | `counter` | Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar | `server` |
| `io_write_bytes` | `counter` | Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes | `server` |
| `start_time_seconds` | `gauge` | Start time for MinIO process in seconds since Unix epoch | `server` |
| `uptime_seconds` | `gauge` | Uptime for MinIO process in seconds | `server` |
| `file_descriptor_limit_total` | `gauge` | Limit on total number of open file descriptors for the MinIO Server process | `server` |
| `file_descriptor_open_total` | `gauge` | Total number of open file descriptors by the MinIO Server process | `server` |
| `syscall_read_total` | `counter` | Total read SysCalls to the kernel. /proc/[pid]/io syscr | `server` |
| `syscall_write_total` | `counter` | Total write SysCalls to the kernel. /proc/[pid]/io syscw | `server` |
| `resident_memory_bytes` | `gauge` | Resident memory size in bytes | `server` |
| `virtual_memory_bytes` | `gauge` | Virtual memory size in bytes | `server` |
| `virtual_memory_max_bytes` | `gauge` | Maximum virtual memory size in bytes | `server` |
| Name | Type | Help | Labels |
|----------------------------------------------------|-----------|----------------------------------------------------------------------------------------------------------------|----------|
| `minio_system_process_locks_read_total` | `gauge` | Number of current READ locks on this peer | `server` |
| `minio_system_process_locks_write_total` | `gauge` | Number of current WRITE locks on this peer | `server` |
| `minio_system_process_cpu_total_seconds` | `counter` | Total user and system CPU time spent in seconds | `server` |
| `minio_system_process_go_routine_total` | `gauge` | Total number of go routines running | `server` |
| `minio_system_process_io_rchar_bytes` | `counter` | Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar | `server` |
| `minio_system_process_io_read_bytes` | `counter` | Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes | `server` |
| `minio_system_process_io_wchar_bytes` | `counter` | Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar | `server` |
| `minio_system_process_io_write_bytes` | `counter` | Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes | `server` |
| `minio_system_process_start_time_seconds` | `gauge` | Start time for MinIO process in seconds since Unix epoch | `server` |
| `minio_system_process_uptime_seconds` | `gauge` | Uptime for MinIO process in seconds | `server` |
| `minio_system_process_file_descriptor_limit_total` | `gauge` | Limit on total number of open file descriptors for the MinIO Server process | `server` |
| `minio_system_process_file_descriptor_open_total` | `gauge` | Total number of open file descriptors by the MinIO Server process | `server` |
| `minio_system_process_syscall_read_total` | `counter` | Total read SysCalls to the kernel. /proc/[pid]/io syscr | `server` |
| `minio_system_process_syscall_write_total` | `counter` | Total write SysCalls to the kernel. /proc/[pid]/io syscw | `server` |
| `minio_system_process_resident_memory_bytes` | `gauge` | Resident memory size in bytes | `server` |
| `minio_system_process_virtual_memory_bytes` | `gauge` | Virtual memory size in bytes | `server` |
| `minio_system_process_virtual_memory_max_bytes` | `gauge` | Maximum virtual memory size in bytes | `server` |
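
Several of the process metrics are most useful in combination. As an illustrative sketch (the helper name `process_summary` and the sample values are hypothetical), the file descriptor gauges give a usage ratio, and dividing `cpu_total_seconds` by `uptime_seconds` gives the average number of CPU cores kept busy since the process started.

```python
def process_summary(m):
    """Derive two ratios from a dict of /system/process metric values.

    `m` maps full metric names to their numeric values for one server.
    """
    fd_limit = m["minio_system_process_file_descriptor_limit_total"]
    fd_open = m["minio_system_process_file_descriptor_open_total"]
    uptime = m["minio_system_process_uptime_seconds"]
    cpu = m["minio_system_process_cpu_total_seconds"]
    return {
        # Fraction of the file descriptor limit currently in use.
        "fd_usage": fd_open / fd_limit if fd_limit > 0 else None,
        # Average CPU cores busy since process start (user + system time).
        "avg_cpu_cores": cpu / uptime if uptime > 0 else None,
    }


# Hypothetical values for a single server, for illustration only.
sample = {
    "minio_system_process_file_descriptor_limit_total": 1048576.0,
    "minio_system_process_file_descriptor_open_total": 2048.0,
    "minio_system_process_uptime_seconds": 3600.0,
    "minio_system_process_cpu_total_seconds": 900.0,
}

summary = process_summary(sample)
print(summary)
```
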
### `/cluster/health`
@@ -302,3 +326,20 @@ The standard metrics group for GoCollector is not shown below.
| `minio_logger_webhook_failed_messages` | `counter` | Number of messages that failed to send | `server,name,endpoint` |
| `minio_logger_webhook_queue_length` | `gauge` | Webhook queue length | `server,name,endpoint` |
| `minio_logger_webhook_total_message` | `counter` | Total number of messages sent to this target | `server,name,endpoint` |
### `/replication`
| Name | Type | Help | Labels |
|---------------------------------------------------|---------|-----------------------------------------------------------------------------|----------|
| `minio_replication_average_active_workers` | `gauge` | Average number of active replication workers | `server` |
| `minio_replication_average_queued_bytes` | `gauge` | Average number of bytes queued for replication since server start | `server` |
| `minio_replication_average_queued_count` | `gauge` | Average number of objects queued for replication since server start | `server` |
| `minio_replication_average_data_transfer_rate` | `gauge` | Average replication data transfer rate in bytes/sec | `server` |
| `minio_replication_current_active_workers` | `gauge` | Total number of active replication workers | `server` |
| `minio_replication_current_data_transfer_rate` | `gauge` | Current replication data transfer rate in bytes/sec | `server` |
| `minio_replication_last_minute_queued_bytes` | `gauge` | Number of bytes queued for replication in the last full minute | `server` |
| `minio_replication_last_minute_queued_count` | `gauge` | Number of objects queued for replication in the last full minute | `server` |
| `minio_replication_max_active_workers` | `gauge` | Maximum number of active replication workers seen since server start | `server` |
| `minio_replication_max_queued_bytes` | `gauge` | Maximum number of bytes queued for replication since server start | `server` |
| `minio_replication_max_queued_count` | `gauge` | Maximum number of objects queued for replication since server start | `server` |
| `minio_replication_max_data_transfer_rate` | `gauge` | Maximum replication data transfer rate in bytes/sec seen since server start | `server` |
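
One way to read the queue and transfer-rate gauges together is to ask how long the current transfer rate would need to send what was queued in the last full minute. The sketch below is a rough heuristic and not something MinIO itself computes: if the result is well above 60 seconds, replication is probably not keeping up. It assumes Prometheus text exposition format with one server per scrape and no trailing timestamps; the sample values and the helper names `parse_values` and `seconds_to_send_last_minute` are hypothetical.

```python
def parse_values(text):
    """Map metric name -> value, ignoring labels.

    Assumes one server per scrape and no timestamp after the value.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        series, value = line.rsplit(None, 1)
        values[series.split("{", 1)[0]] = float(value)
    return values


def seconds_to_send_last_minute(values):
    """Seconds needed, at the current rate, to send the last minute's queued bytes.

    Returns None when no data is currently being transferred.
    """
    rate = values.get("minio_replication_current_data_transfer_rate", 0.0)
    queued = values.get("minio_replication_last_minute_queued_bytes", 0.0)
    if rate <= 0:
        return None
    return queued / rate


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# TYPE minio_replication_current_data_transfer_rate gauge
minio_replication_current_data_transfer_rate{server="node1:9000"} 5000000
minio_replication_last_minute_queued_bytes{server="node1:9000"} 600000000
"""

needed = seconds_to_send_last_minute(parse_values(SAMPLE))
falling_behind = needed is not None and needed > 60
print(needed, falling_behind)
```
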