Export bucket usage counts as part of bucket metrics (#9710)

Bonus fixes in quota enforcement to use the
new data structure, and use timedValue to cache
a value and reload it automatically, which removes
the need for one global variable.
Harshavardhana
2020-05-27 06:45:43 -07:00
committed by GitHub
parent cccf2de129
commit 53aaa5d2a5
9 changed files with 281 additions and 115 deletions


@@ -110,7 +110,7 @@ These are the new set of metrics which will be in effect after `RELEASE.2019-10-
- Metrics that record HTTP statistics and latencies are labeled by their respective APIs (putobject, getobject, etc.).
- Disk usage metrics are distributed and labeled to the respective disk paths.
For more details, please check the `Migration guide for the new set of metrics`
For more details, please check the `Migration guide for the new set of metrics`.
The list of metrics and its definition are as follows. (NOTE: instance here is one MinIO node)
@@ -118,42 +118,83 @@ The list of metrics and its definition are as follows. (NOTE: instance here is o
> 1. Instance here is one MinIO node.
> 2. `s3 requests` exclude internode requests.
- standard go runtime metrics prefixed by `go_`
- process level metrics prefixed with `process_`
- prometheus scrape metrics prefixed with `promhttp_`
### Default set of information
| name | description |
|:------------|:--------------------------------|
| `go_` | all standard go runtime metrics |
| `process_` | all process level metrics |
| `promhttp_` | all prometheus scrape metrics |
- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `minio_disks_offline`: Total number of offline disks in current MinIO instance.
- `minio_disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds`: Histogram that holds the latency information of the requests.
### MinIO node specific information
| name | description |
|:---------------------------|:-------------------------------------------------------------------------------|
| `minio_version_info` | Current MinIO version with its commit-id |
| `minio_disks_offline` | Total number of offline disks on current MinIO instance |
| `minio_disks_total` | Total number of disks on current MinIO instance |
### Disk metrics are labeled by 'disk' which identifies each disk
| name | description |
|:---------------------------|:-------------------------------------------------------------------------------|
| `disk_storage_total` | Total size of the disk |
| `disk_storage_used` | Total disk space used per disk |
| `disk_storage_available` | Total available disk space per disk |
### S3 API metrics are labeled by 'api' which identifies different S3 API requests
| name | description |
|:---------------------------|:-------------------------------------------------------------------------------|
| `s3_requests_total` | Total number of s3 requests in current MinIO instance |
| `s3_errors_total` | Total number of errors in s3 requests in current MinIO instance |
| `s3_requests_current` | Total number of active s3 requests in current MinIO instance |
| `s3_rx_bytes_total` | Total number of s3 bytes received by current MinIO server instance |
| `s3_tx_bytes_total` | Total number of s3 bytes sent by current MinIO server instance |
| `s3_ttfb_seconds` | Histogram that holds the latency information of the requests |
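
As an illustration of how the `api` label can be used, here is a minimal Python sketch that derives a per-API error ratio from `s3_errors_total` and `s3_requests_total`. The sample text is hypothetical (not real server output), and the parser is a simplification that assumes `api` is the only label on each line; the helper names `per_api` and `error_ratio` are made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical scrape excerpt; values are invented for illustration only.
SAMPLE = """\
# Hypothetical sample, not real server output.
s3_requests_total{api="getobject"} 120
s3_requests_total{api="putobject"} 30
s3_errors_total{api="getobject"} 6
"""

# Simplified matcher: metric name, a single `api` label, then a numeric value.
LINE_RE = re.compile(r'^(\w+)\{api="([^"]*)"\}\s+([0-9.eE+-]+)$')


def per_api(text, metric):
    """Sum the values of `metric` grouped by the `api` label."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(1) == metric:
            totals[match.group(2)] += float(match.group(3))
    return dict(totals)


def error_ratio(text):
    """Return {api: errors / requests} for every API with at least one request."""
    requests = per_api(text, "s3_requests_total")
    errors = per_api(text, "s3_errors_total")
    return {api: errors.get(api, 0.0) / n for api, n in requests.items() if n > 0}


if __name__ == "__main__":
    for api, ratio in sorted(error_ratio(SAMPLE).items()):
        print(f"{api}: {ratio:.2%}")
```

In a real deployment the same ratio would more likely be computed in the monitoring system itself; the sketch only shows how the two counters relate through the shared label.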
#### Internode metrics only available in a distributed setup
| name | description |
|:---------------------------|:-------------------------------------------------------------------------------|
| `internode_rx_bytes_total` | Total number of internode bytes received by current MinIO server instance |
| `internode_tx_bytes_total` | Total number of bytes sent to the other nodes by current MinIO server instance |
Apart from the above metrics, MinIO also exposes the following mode-specific metrics
### Bucket usage specific metrics
All metrics are labeled by `bucket`; each metric is reported per bucket. `bucket_objects_histogram` is additionally labeled by an `object_size` string, which takes one of the following values:
- *LESS_THAN_1024_B*
- *BETWEEN_1024_B_AND_1_MB*
- *BETWEEN_1_MB_AND_10_MB*
- *BETWEEN_10_MB_AND_64_MB*
- *BETWEEN_64_MB_AND_128_MB*
- *BETWEEN_128_MB_AND_512_MB*
- *GREATER_THAN_512_MB*
| name | description |
|:---------------------------|:----------------------------------------------------|
| `bucket_usage_size` | Total size of the bucket |
| `bucket_objects_count` | Total number of objects in a bucket |
| `bucket_objects_histogram` | Total number of objects filtered by different sizes |
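
To make the `object_size` labels concrete, the following Python sketch maps an object size in bytes to one of the labels listed above. The label strings come from this document; the helper name `object_size_label` is hypothetical, and the boundary handling (lower bound inclusive, upper bound exclusive, with 1 MB taken as 1024 * 1024 bytes) is an assumption rather than confirmed server behavior.

```python
KIB = 1024
MIB = 1024 * KIB

# (label, exclusive upper bound in bytes); None means "no upper bound".
# Boundary inclusivity is an assumption made for this sketch.
SIZE_INTERVALS = [
    ("LESS_THAN_1024_B", KIB),
    ("BETWEEN_1024_B_AND_1_MB", MIB),
    ("BETWEEN_1_MB_AND_10_MB", 10 * MIB),
    ("BETWEEN_10_MB_AND_64_MB", 64 * MIB),
    ("BETWEEN_64_MB_AND_128_MB", 128 * MIB),
    ("BETWEEN_128_MB_AND_512_MB", 512 * MIB),
    ("GREATER_THAN_512_MB", None),
]


def object_size_label(size_bytes):
    """Return the histogram label whose interval contains size_bytes."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    for label, upper in SIZE_INTERVALS:
        if upper is None or size_bytes < upper:
            return label
    raise AssertionError("unreachable: the last interval has no upper bound")


if __name__ == "__main__":
    for size in (512, 1024, 5 * MIB, 600 * MIB):
        print(size, object_size_label(size))
```

Under these assumptions an object of exactly 512 MiB falls into `GREATER_THAN_512_MB`; if the server treats that boundary differently, only the comparison needs to change.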
### Cache specific metrics
MinIO Gateway instances enabled with Disk-Caching expose caching related metrics.
- `cache_data_served`: Total number of bytes served from cache.
- `cache_hits_total`: Total number of cache hits.
- `cache_misses_total`: Total number of cache misses.
| name | description |
|:---------------------|:----------------------------------------|
| `cache_data_served` | Total number of bytes served from cache |
| `cache_hits_total` | Total number of cache hits |
| `cache_misses_total` | Total number of cache misses |
### Gateway & Cache specific metrics
MinIO Gateway instance exposes metrics related to Gateway communication with the cloud backend (S3, Azure & GCS Gateway).
- `gateway_<gateway_type>_requests`: Total number of requests made to cloud backend. This metric has a label `method` that identifies GET, HEAD, PUT and POST Requests.
- `gateway_<gateway_type>_bytes_sent`: Total number of bytes sent to cloud backend (in PUT & POST Requests).
- `gateway_<gateway_type>_bytes_received`: Total number of bytes received from cloud backend (in GET & HEAD Requests).
`<gateway_type>` depends on the gateway in use and can be 's3', 'gcs' or 'azure'. Other metrics are labeled with `method`, which identifies HTTP GET, HEAD, PUT and POST requests to the backend.
| name | description |
|:----------------------------------------|:---------------------------------------------------------------------------|
| `gateway_<gateway_type>_requests` | Total number of requests made to the gateway backend |
| `gateway_<gateway_type>_bytes_sent` | Total number of bytes sent to cloud backend (in PUT & POST Requests) |
| `gateway_<gateway_type>_bytes_received` | Total number of bytes received from cloud backend (in GET & HEAD Requests) |
Note that this is currently only supported for Azure, S3 and GCS Gateway.
@@ -161,10 +202,12 @@ Note that this is currently only supported for Azure, S3 and GCS Gateway.
MinIO exposes self-healing related metrics for erasure-code deployments _only_. These metrics are _not_ available on Gateway or Single Node, Single Drive deployments. Note that these metrics will be exposed _only_ when there is a relevant event happening on MinIO server.
- `self_heal_time_since_last_activity`: Time elapsed since last self-healing related activity.
- `self_heal_objects_scanned`: Number of objects scanned by self-healing thread in its current run. This will reset when a fresh self-healing run starts. This is labeled with the object type scanned.
- `self_heal_objects_healed`: Number of objects healed by self-healing thread in its current run. This will reset when a fresh self-healing run starts. This is labeled with the object type scanned.
- `self_heal_objects_heal_failed`: Number of objects for which self-healing failed in its current run. This will reset when a fresh self-healing run starts. This is labeled with disk status and its endpoint.
| name | description |
|:-------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `self_heal_time_since_last_activity` | Time elapsed since last self-healing related activity |
| `self_heal_objects_scanned` | Number of objects scanned by self-healing thread in its current run. This will reset when a fresh self-healing run starts. This is labeled with the object type scanned |
| `self_heal_objects_healed` | Number of objects healed by self-healing thread in its current run. This will reset when a fresh self-healing run starts. This is labeled with the object type scanned |
| `self_heal_objects_heal_failed` | Number of objects for which self-healing failed in its current run. This will reset when a fresh self-healing run starts. This is labeled with disk status and its endpoint |
## Migration guide for the new set of metrics
@@ -174,20 +217,20 @@ This migration guide applies for older releases or any releases before `RELEASE.
The migrations include
- `minio_total_disks` to `minio_disks_total`
- `minio_offline_disks` to `minio_disks_offline`
- `minio_total_disks` to `minio_disks_total`
- `minio_offline_disks` to `minio_disks_offline`
### MinIO disk level metrics - `disk_storage_*`
These metrics have one label.
- `disk`: Holds the disk path
- `disk`: Holds the disk path
The migrations include
- `minio_disk_storage_used_bytes` to `disk_storage_used`
- `minio_disk_storage_available_bytes` to `disk_storage_available`
- `minio_disk_storage_total_bytes` to `disk_storage_total`
- `minio_disk_storage_used_bytes` to `disk_storage_used`
- `minio_disk_storage_available_bytes` to `disk_storage_available`
- `minio_disk_storage_total_bytes` to `disk_storage_total`
### MinIO network level metrics
@@ -195,11 +238,11 @@ These metrics are detailed to cover the s3 and internode network statistics.
The migrations include
- `minio_network_sent_bytes_total` to `s3_tx_bytes_total` and `internode_tx_bytes_total`
- `minio_network_received_bytes_total` to `s3_rx_bytes_total` and `internode_rx_bytes_total`
- `minio_network_sent_bytes_total` to `s3_tx_bytes_total` and `internode_tx_bytes_total`
- `minio_network_received_bytes_total` to `s3_rx_bytes_total` and `internode_rx_bytes_total`
Some of the additional metrics added were
- `s3_requests_total`
- `s3_errors_total`
- `s3_ttfb_seconds`
- `s3_requests_total`
- `s3_errors_total`
- `s3_ttfb_seconds`
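
The renames in this migration guide can be collected into a lookup table, for example to flag dashboard queries that still reference old metric names. The Python sketch below is one way to do that; the mapping is copied from the lists above, while the helper name `find_old_metrics` and the token-based scan are assumptions of this example. Note that the two network metrics each split into an `s3_` and an `internode_` metric, so they cannot be replaced one for one.

```python
import re

# Old metric name -> new metric name(s), copied from the migration lists above.
RENAMES = {
    "minio_total_disks": ["minio_disks_total"],
    "minio_offline_disks": ["minio_disks_offline"],
    "minio_disk_storage_used_bytes": ["disk_storage_used"],
    "minio_disk_storage_available_bytes": ["disk_storage_available"],
    "minio_disk_storage_total_bytes": ["disk_storage_total"],
    "minio_network_sent_bytes_total": [
        "s3_tx_bytes_total",
        "internode_tx_bytes_total",
    ],
    "minio_network_received_bytes_total": [
        "s3_rx_bytes_total",
        "internode_rx_bytes_total",
    ],
}

# Identifier-like tokens; good enough to spot metric names inside a query string.
TOKEN_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*")


def find_old_metrics(query):
    """Return {old_name: [new_names]} for every old metric referenced in query."""
    found = {}
    for token in TOKEN_RE.findall(query):
        if token in RENAMES:
            found[token] = RENAMES[token]
    return found


if __name__ == "__main__":
    query = "sum(minio_disk_storage_used_bytes) / sum(minio_disk_storage_total_bytes)"
    for old, new in find_old_metrics(query).items():
        print(f"{old} -> {', '.join(new)}")
```

The sketch only reports candidates; rewriting a query automatically is left out on purpose because the one-to-two network renames need a human decision.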