Add top level metrics document to summarize monitoring endpoints (#5923)

Minio server supports unauthenticated healthcheck and Prometheus endpoints. This document summarizes that information in a single place and adds links to more detailed documentation where needed.
2025-11-20 01:50:24 -05:00 · 2018-05-16 00:53:21 +05:30
parent 5c21e89559
commit 9cab0f25e0
3 changed files with 22 additions and 38 deletions
--- a/docs/metric/README.md
+++ b/docs/metric/README.md
@@ -1,36 +0,0 @@
-## Minio Prometheus Metric
-
-Minio server exposes an endpoint for Promethueus to scrape server data at `/minio/prometheus/metrics`.
-
-### Prometheus probe
-Prometheus is used to monitor Minio server information like http request, disk storage, network stats etc.. It uses a config file named `prometheus.yaml` to scrape data from server. The value for `metrics_path` and `targets` need to be configured in the config yaml to specify the endpoint and url as shown:
-```
-scrape_configs:
-  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
-  - job_name: minio
-    metrics_path: /minio/prometheus/metrics
-
-    # metrics_path defaults to '/metrics'
-    # scheme defaults to 'http'.
-
-    static_configs:
-      - targets: ['localhost:9000']
-```
- Prometheus can be run by executing :
-```
-./prometheus --config.file=prometheus.yml
-```
-
-### List of Minio metric exposed
-Minio exposes the following list of metric to Prometheus
- `minio_disk_storage_bytes` : Total byte count of disk storage available to current Minio server instance
- `minio_disk_storage_free_bytes` : Total byte count of free disk storage available to current Minio server instance
- `minio_http_requests_duration_seconds_bucket` : The bucket into which observations are counted for creating Histogram
- `minio_http_requests_duration_seconds_count` : The count of current number of observations i.e. total HTTP requests (HEAD/GET/PUT/POST/DELETE).
- `minio_http_requests_duration_seconds_sum` : The current aggregate time spent servicing all HTTP requests (HEAD/GET/PUT/POST/DELETE) in seconds
- `minio_http_requests_total` : Total number of requests served by current Minio server instance
- `minio_network_received_bytes_total` : Total number of bytes received by current Minio server instance
- `minio_network_sent_bytes_total` : Total number of bytes sent by current Minio server instance
- `minio_offline_disks` : Total number of offline disks for current Minio server instance
- `minio_total_disks` : Total number of disks for current Minio server instance
- `minio_server_start_time_seconds` : Time Unix time in seconds when current Minio server instance started
--- a/docs/metrics/README.md
+++ b/docs/metrics/README.md
@@ -0,0 +1,20 @@
+## Minio Monitoring Guide
+
+Minio server exposes monitoring data over un-authenticated endpoints so that monitoring tools can pick up the data without you having to share Minio server credentials. This document lists the monitoring endpoints and links to the relevant documentation.
+
+### Healthcheck Probe
+
+Minio server has two healthcheck-related endpoints: a liveness probe that indicates whether the server is working fine, and a readiness probe that indicates whether the server is not accepting connections due to heavy load.
+
+- Liveness probe available at `/minio/health/live`
+- Readiness probe available at `/minio/health/ready`
+
+Read more on how to use these endpoints in [Minio healthcheck guide](./healthcheck/README.md).
+
+### Prometheus Probe
+
+Minio server exposes Prometheus-compatible data on a single endpoint.
+
+- Prometheus data available at `/minio/prometheus/metrics`
+
+To use this endpoint, set up Prometheus to scrape data from it. Read more on how to use Prometheus to monitor Minio server in [How to monitor Minio server with Prometheus](https://github.com/minio/cookbook/blob/master/docs/how-to-monitor-minio-with-prometheus.md).
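
As a rough illustration of how the three un-authenticated endpoints listed in the new guide might be checked from a script, here is a minimal Python sketch. The paths are taken from the guide. The helper names (`endpoint_url`, `interpret_status`, `probe`) are hypothetical and not part of Minio. The 200/503 mapping follows the liveness description in the healthcheck guide; applying the same mapping to the readiness and Prometheus endpoints is an assumption.

```python
import urllib.error
import urllib.request

# Endpoint paths as listed in the monitoring guide.
ENDPOINTS = {
    "liveness": "/minio/health/live",
    "readiness": "/minio/health/ready",
    "prometheus": "/minio/prometheus/metrics",
}


def endpoint_url(base_url, name):
    """Join a server base URL with one of the known monitoring paths."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def interpret_status(status_code):
    """Map an HTTP status code to a coarse state.

    200 and 503 are the codes the healthcheck guide names for the
    liveness probe; treating other endpoints the same way is an assumption.
    """
    if status_code == 200:
        return "ok"
    if status_code == 503:
        return "unavailable"
    return "unknown"


def probe(base_url, name, timeout=5.0):
    """Send an unauthenticated GET to the endpoint and classify the result."""
    url = endpoint_url(base_url, name)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return interpret_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for non-2xx responses such as 503.
        return interpret_status(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar errors.
        return "unreachable"
```

Against a server assumed to listen on `localhost:9000` (the target used in the removed Prometheus example), `probe("http://localhost:9000", "liveness")` would issue a single request without any credentials and return one of the four state strings.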
--- a/docs/metrics/healthcheck/README.md
+++ b/docs/metrics/healthcheck/README.md
@@ -4,11 +4,11 @@ Minio server exposes two un-authenticated, healthcheck endpoints - liveness prob

 ### Liveness probe

-This probe is used to identify situations where the server is running but may not behave optimally, i.e. sluggish response or corrupt backend. Such problems can be *only* fixed by a restart.
+This probe is used to identify situations where the server is running but may not behave optimally, i.e. sluggish response or corrupt back-end. Such problems can be *only* fixed by a restart.

 Internally, Minio liveness probe handler does a ListBuckets call. If successful, the server returns 200 OK, otherwise 503 Service Unavailable.

-When liveness probe fails, Kubernetes like platforms restart the container. 
+When liveness probe fails, Kubernetes like platforms restart the container.

 ### Readiness probe