Split the replication dashboard in cluster and node level (#19374)

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
This commit is contained in:
Shubhendu 2024-03-28 22:45:39 +05:30 committed by GitHub
parent aa0eec16ab
commit d87f91720b
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
8 changed files with 5379 additions and 2826 deletions

View File

@ -15,9 +15,13 @@ Refer to the dashboard [json file here](https://raw.githubusercontent.com/minio/
![Grafana](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/grafana-minio.png) ![Grafana](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/grafana-minio.png)
Replication metrics can be viewed in the Grafana dashboard using [json file here](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/minio-replication.json) Node level Replication metrics can be viewed in the Grafana dashboard using [json file here](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/minio-replication-node.json)
![Grafana](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/grafana-replication.png) ![Grafana](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/grafana-replication-node.png)
CLuster level Replication metrics can be viewed in the Grafana dashboard using [json file here](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/minio-replication-cluster.json)
![Grafana](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/replication/grafana-replication-cluster.png)
Bucket metrics can be viewed in the Grafana dashboard using [json file here](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/bubcket/minio-bucket.json) Bucket metrics can be viewed in the Grafana dashboard using [json file here](https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/bubcket/minio-bucket.json)

Binary file not shown.

After

Width:  |  Height:  |  Size: 444 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 230 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 228 KiB

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -82,6 +82,36 @@ For deployments behind a load balancer, use the load balancer hostname instead o
Metrics marked as ``Site Replication Only`` only populate on deployments with [Site Replication](https://min.io/docs/minio/linux/operations/install-deploy-manage/multi-site-replication.html) configurations. Metrics marked as ``Site Replication Only`` only populate on deployments with [Site Replication](https://min.io/docs/minio/linux/operations/install-deploy-manage/multi-site-replication.html) configurations.
For deployments with [bucket](https://min.io/docs/minio/linux/administration/bucket-replication.html) or [batch](https://min.io/docs/minio/linux/administration/batch-framework.html#replicate) configurations, these metrics populate instead under the [Bucket Metrics](#bucket-metrics) endpoint. For deployments with [bucket](https://min.io/docs/minio/linux/administration/bucket-replication.html) or [batch](https://min.io/docs/minio/linux/administration/batch-framework.html#replicate) configurations, these metrics populate instead under the [Bucket Metrics](#bucket-metrics) endpoint.
| Name | Description
|:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------|
| `minio_cluster_replication_last_hour_failed_bytes` | (_Site Replication Only_) Total number of bytes failed at least once to replicate in the last full hour. |
| `minio_cluster_replication_last_hour_failed_count` | (_Site Replication Only_) Total number of objects which failed replication in the last full hour. |
| `minio_cluster_replication_last_minute_failed_bytes` | Total number of bytes failed at least once to replicate in the last full minute. |
| `minio_cluster_replication_last_minute_failed_count` | Total number of objects which failed replication in the last full minute. |
| `minio_cluster_replication_total_failed_bytes` | (_Site Replication Only_) Total number of bytes failed at least once to replicate since server start. |
| `minio_cluster_replication_total_failed_count` | (_Site Replication Only_) Total number of objects which failed replication since server start. |
| `minio_cluster_replication_received_bytes` | (_Site Replication Only_) Total number of bytes replicated to this cluster from another source cluster. |
| `minio_cluster_replication_received_count` | (_Site Replication Only_) Total number of objects received by this cluster from another source cluster. |
| `minio_cluster_replication_sent_bytes` | (_Site Replication Only_) Total number of bytes replicated to the target cluster. |
| `minio_cluster_replication_sent_count` | (_Site Replication Only_) Total number of objects replicated to the target cluster. |
| `minio_cluster_replication_credential_errors` | (_Site Replication Only_) Total number of replication credential errors since server start |
| `minio_cluster_replication_proxied_get_requests_total` | (_Site Replication Only_)Number of GET requests proxied to replication target |
| `minio_cluster_replication_proxied_head_requests_total` | (_Site Replication Only_)Number of HEAD requests proxied to replication target |
| `minio_cluster_replication_proxied_delete_tagging_requests_total` | (_Site Replication Only_)Number of DELETE tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_get_tagging_requests_total` | (_Site Replication Only_)Number of GET tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_put_tagging_requests_total` | (_Site Replication Only_)Number of PUT tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_get_requests_failures` | (_Site Replication Only_)Number of failures in GET requests proxied to replication target |
| `minio_cluster_replication_proxied_head_requests_failures` | (_Site Replication Only_)Number of failures in HEAD requests proxied to replication target |
| `minio_cluster_replication_proxied_delete_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying DELETE tagging requests to replication target |
| `minio_cluster_replication_proxied_get_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying GET tagging requests to replication target |
| `minio_cluster_replication_proxied_put_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying PUT tagging requests to replication target |
## Node Replication Metrics
Metrics marked as ``Site Replication Only`` only populate on deployments with [Site Replication](https://min.io/docs/minio/linux/operations/install-deploy-manage/multi-site-replication.html) configurations.
For deployments with [bucket](https://min.io/docs/minio/linux/administration/bucket-replication.html) or [batch](https://min.io/docs/minio/linux/administration/batch-framework.html#replicate) configurations, these metrics populate instead under the [Bucket Metrics](#bucket-metrics) endpoint.
| Name | Description | Name | Description
|:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------| |:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------|
| `minio_cluster_replication_current_active_workers` | Total number of active replication workers | | `minio_cluster_replication_current_active_workers` | Total number of active replication workers |
@ -103,28 +133,6 @@ For deployments with [bucket](https://min.io/docs/minio/linux/administration/buc
| `minio_cluster_replication_max_queued_bytes` | Maximum number of bytes queued for replication seen since server start | | `minio_cluster_replication_max_queued_bytes` | Maximum number of bytes queued for replication seen since server start |
| `minio_cluster_replication_max_queued_count` | Maximum number of objects queued for replication seen since server start | | `minio_cluster_replication_max_queued_count` | Maximum number of objects queued for replication seen since server start |
| `minio_cluster_replication_recent_backlog_count` | Total number of objects seen in replication backlog in the last 5 minutes | | `minio_cluster_replication_recent_backlog_count` | Total number of objects seen in replication backlog in the last 5 minutes |
| `minio_cluster_replication_last_minute_failed_bytes` | Total number of bytes failed at least once to replicate in the last full minute. |
| `minio_cluster_replication_last_minute_failed_count` | Total number of objects which failed replication in the last full minute. |
| `minio_cluster_replication_last_hour_failed_bytes` | (_Site Replication Only_) Total number of bytes failed at least once to replicate in the last full hour. |
| `minio_cluster_replication_last_hour_failed_count` | (_Site Replication Only_) Total number of objects which failed replication in the last full hour. |
| `minio_cluster_replication_total_failed_bytes` | (_Site Replication Only_) Total number of bytes failed at least once to replicate since server start. |
| `minio_cluster_replication_total_failed_count` | (_Site Replication Only_) Total number of objects which failed replication since server start. |
| `minio_cluster_replication_received_bytes` | (_Site Replication Only_) Total number of bytes replicated to this cluster from another source cluster. |
| `minio_cluster_replication_received_count` | (_Site Replication Only_) Total number of objects received by this cluster from another source cluster. |
| `minio_cluster_replication_sent_bytes` | (_Site Replication Only_) Total number of bytes replicated to the target cluster. |
| `minio_cluster_replication_sent_count` | (_Site Replication Only_) Total number of objects replicated to the target cluster. |
| `minio_cluster_replication_credential_errors` | (_Site Replication Only_) Total number of replication credential errors since server start |
| `minio_cluster_replication_proxied_get_requests_total` | (_Site Replication Only_)Number of GET requests proxied to replication target |
| `minio_cluster_replication_proxied_head_requests_total` | (_Site Replication Only_)Number of HEAD requests proxied to replication target |
| `minio_cluster_replication_proxied_delete_tagging_requests_total` | (_Site Replication Only_)Number of DELETE tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_get_tagging_requests_total` | (_Site Replication Only_)Number of GET tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_put_tagging_requests_total` | (_Site Replication Only_)Number of PUT tagging requests proxied to replication target |
| `minio_cluster_replication_proxied_get_requests_failures` | (_Site Replication Only_)Number of failures in GET requests proxied to replication target |
| `minio_cluster_replication_proxied_head_requests_failures` | (_Site Replication Only_)Number of failures in HEAD requests proxied to replication target |
| `minio_cluster_replication_proxied_delete_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying DELETE tagging requests to replication target |
| `minio_cluster_replication_proxied_get_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying GET tagging requests to replication target |
| `minio_cluster_replication_proxied_put_tagging_requests_failures` | (_Site Replication Only_)Number of failures proxying PUT tagging requests to replication target |
## Healing Metrics ## Healing Metrics