Commit Graph

59 Commits

Author SHA1 Message Date
Shubhendu
3bd3470d0b
Corrected names of node replication metrics (#19932)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-13 15:26:54 -07:00
Shubhendu
39ac720826
Remove hardcoded override as not needed (#19868)
Fixes: https://github.com/minio/minio/issues/19867

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-06-04 06:24:37 -07:00
Shubhendu
1654a9b7e6
Use point in time values for gauge metrics in graphs (#19690)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-05-24 04:11:51 -07:00
Markus Wagner
0cf3d93360
removed hardcoded datasource uid (#19477) 2024-04-15 03:03:01 -07:00
Shubhendu
d96d696841
Dont use deprecated angular (#19396)
Support for Angular would be stopped with newer versions of grafana

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-04-03 19:01:53 -07:00
Shubhendu
d87f91720b
Split the replication dashboard in cluster and node level (#19374)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-28 10:15:39 -07:00
Shubhendu
d63e603040
Pre populate the server names using a query (#19367)
User doesn't need to remember and enter the server values,
rather they can select from the pre populated list.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-28 08:14:26 -07:00
Shubhendu
3d4fc28ec9
Render node graphs by node (#19356)
As total drives count, online vs offline are per node basis, its
corect to select node for which graphs need to be rendered.

Set prometheus scrape jobs to fetch metrics from all nodes. A sample
scrape job for node metrics could be as below

```
- job_name: minio-job-node
  bearer_token: <token>
  metrics_path: /minio/v2/metrics/node
  scheme: https
  tls_config:
    insecure_skip_verify: true
  static_configs:
  - targets: [tenant1-ss-0-0.tenant1-hl.tenant-ns.svc.cluster.local:9000,tenant1-ss-0-1.tenant1-hl.tenant-ns.svc.cluster.local:9000,tenant1-ss-0-2.tenant1-hl.tenant-ns.svc.cluster.local:9000,tenant1-ss-0-3.tenant1-hl.tenant-ns.svc.cluster.local:9000]
```

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-27 10:41:08 -07:00
Shubhendu
53a14c7301
Adding dashboard for MinIO node metrics (#19329)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-03-26 08:01:28 -07:00
Shubhendu
f46bee242c
Re-organized grafana dashboards (#19157)
Moved different dashboards to their specific directories. Also
mentioned that these dashbards are examples of how to create
graphs using MinIO provided and metrics and customers should
change / add graphs on their specific need basis.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-02-29 10:35:20 -08:00
Shubhendu
cb7dab17cb
Graph cluster and bucket replication proxied requests (#19078)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-02-20 01:45:00 -08:00
Harshavardhana
944f3c1477
remove local disk metrics from cluster metrics (#18886)
local disk metrics were polluting cluster metrics
Please remove them instead of adding relevant ones.

- batch job metrics were incorrectly kept at bucket
  metrics endpoint, move it to cluster metrics.

- add tier metrics to cluster peer metrics from the node.

- fix missing set level cluster health metrics
2024-01-28 12:53:59 -08:00
Harshavardhana
dd2542e96c
add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Mario Bros
fbd8dfe60f
Adding ~ to match job when multiple jobs (#18706) 2023-12-27 15:39:20 -08:00
Shubhendu
9d7660b409
Graph cluster wide where applicable (#18705)
Graph the maximum value reported across nodes at cluster
level for applicable scenarios.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-12-22 08:14:32 -08:00
Shubhendu
6d4c1156d6
Changed the expression to render the value (#18627)
The metrics `minio_bucket_replication_received_bytes` and
`minio_bucket_replication_sent_bytes` are additive in nature
and rendering the value as is looks fine.

Also added sort order for few graphs for better reading of tool
tips as keeping ones with highest value at top helps.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-12-13 10:05:47 -08:00
Shubhendu
e4b619ce1a
Added graph for Erasure Set Tolerance value (#18472)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-17 10:38:15 -08:00
Shubhendu
ef67c39910
Added graphs for KMS metrics (#18321)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-10-30 03:20:53 -07:00
Shubhendu
e47e625f73
Added replication graphs for site replication metrics (#17951)
This dashboard graphs the metrics when site replication is enabled
across MinIO instances.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-31 08:31:16 -07:00
Shubhendu
0ce9e00ffa
Added node scanner and node drives graphs (#17949)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-30 14:01:51 -07:00
Shubhendu
c778c381b5
Added new bucket replication graphs (#17947)
This PR adds new bucket replication graphs for better and granular
monitoring of bucket replication. Also arranged all replication graphs
together.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-30 11:57:41 -07:00
Shubhendu
c3c8441a1d
Corrected the count of buckets and objects graphs (#17883)
In distributed setup with a load balancer, randmoly any server
would report the metrics `minio_cluster_bucket_total` and
`minio_cluster_usage_object_total` and while graphing it, we should
take max of reported values.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-21 09:04:38 -07:00
Harshavardhana
8a9b886011
update grafana dashboard with disk -> drive rename (#17857) 2023-08-15 16:04:20 -07:00
Harshavardhana
c4ca0a5a57
add two more drive metrics when metrics is available (#17854) 2023-08-15 10:55:47 -07:00
Shubhendu
b6b6d6e8d8
Removed replication dashboard (#17815)
As all replication metrics are moved at bucket level, all replication
graphs as well are added under minio-bucket.json. Removing the independent
replication dashboard.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-08-08 08:13:45 -07:00
Shubhendu
e1731d9403
Added bucket specific grafana dashboard (#17727)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-07-26 15:10:11 -07:00
Harshavardhana
e1094dde08 update MinIO replication dashboard with latest metrics 2023-07-21 17:30:04 -07:00
Doom And Love
d004c45386
grafana-dashboard: Update scrape_jobs variable to be single select (#17696)
Set `includeAll` and `mult` to be false since this dashboard only works with a single value being selected
2023-07-21 14:12:44 -07:00
Harshavardhana
c0a5bdaed9
update grafana dashboard JSON with the new metrics (#17683) 2023-07-19 08:16:04 -07:00
Harshavardhana
6426b74770
move bucket centric metrics to /minio/v2/metrics/bucket handlers (#17663)
users/customers do not have a reasonable number of buckets anymore,
this is why we must avoid overpopulating cluster endpoints, instead
move the bucket monitoring to a separate endpoint.

some of it's a breaking change here for a couple of metrics, but
it is imperative that we do it to improve the responsiveness of
our Prometheus cluster endpoint.

Bonus: Added new cluster metrics for usage, objects and histograms
2023-07-18 22:25:12 -07:00
Anis Eleuch
46d45a6923
grafana: Add TCP dial errors panel (#17101) 2023-04-28 11:11:17 -07:00
Anis Eleuch
2448a9e047
grafana: Remove minio_s3_requests_errors_total metric (#17094) 2023-04-27 10:55:30 -07:00
Jiffs Maverick
61101d82d9 Rename inodes metric in grafana dashboards (#17030) 2023-04-21 11:07:30 -07:00
Harshavardhana
e47a31f9fc
fix: object size distribution in metrics for all objects (#16539) 2023-02-04 21:10:10 -08:00
Anis Elleuch
e73894fa50
grafana: Show one metric for the total data growth (#16449) 2023-01-20 09:39:28 -08:00
ebozduman
b57e7321e7
Replaces 'disk'=>'drive' visible to end user (#15464) 2022-08-04 16:10:08 -07:00
MohammadReza
f4d5c861f3
update grafana dashboard (#15357) 2022-07-21 15:17:44 -07:00
Minio Trusted
f34b2ef90b update dashboard Data Usage Growth as time series 2022-06-13 22:05:36 -07:00
Harshavardhana
7413045f0e
fix: add missing minio_s3_requests_total (#15070)
PR #15052 caused a regression, add the missing metrics back.

Bonus:

- internode information should be only for distributed setups 
- update the dashboard to include 4xx and 5xx error panels.
2022-06-11 00:50:31 -07:00
Minio Trusted
f63645546d update minimum goroutine threshold on dashboard 2022-06-06 22:13:54 -07:00
Harshavardhana
c2630bb3a3 add total usage pie chart based on total/free bytes 2022-05-28 09:53:53 -07:00
Harshavardhana
e3e0532613
cleanup markdown docs across multiple files (#14296)
enable markdown-linter
2022-02-11 16:51:25 -08:00
chrisbecke
ef0b8367b5
Update minio-overview.json data source panel (#13730)
Add missing datasource in `Healing` panel.
2021-11-23 09:01:07 -08:00
Mani
7b82411e6f
change the unit of measurement from TB to TiB (#13686) 2021-11-18 20:06:37 -08:00
Ashish Kumar Sinha
3d2bc15e9a
Add grafana json file for replication metrics (#13678) 2021-11-17 14:49:46 -08:00
Harshavardhana
90e505e58f calculate API requests/error as increase() intervals not as rate() 2021-09-12 11:28:28 -07:00
Nitish Tiwari
60394ddf83
Add support for changing job name in Grafana dashboard (#13050) 2021-08-24 09:51:09 -07:00
Nitish Tiwari
32017454ee
fix typo in Grafana dashboard json (#12471) 2021-06-09 08:04:12 -07:00
Nitish Tiwari
00c5d7e1b3
Add healing related metrics in official dashboard (#12456) 2021-06-07 12:46:54 -07:00
Nitish Tiwari
a592d3be19
fix the dashboard to use $rate_interval (#12277)
refer https://grafana.com/blog/2020/09/28/new-in-grafana-7.2-__rate_interval-for-prometheus-rate-queries-that-just-work/
for further information
2021-05-12 08:06:47 -07:00