Commit Graph

78 Commits

Author SHA1 Message Date
Harshavardhana
e7d7d5232c
fix: admin info output and improve overall performance (#10015)
- admin info node offline check is now quicker
- admin info now doesn't duplicate the code
  across doing the same checks for disks
- rely on StorageInfo to return appropriate errors
  instead of calling locally.
- diskID checks now return proper errors when
  disk not found v/s format.json missing.
- add more disk states for more clarity on the
  underlying disk errors.
2020-07-13 09:51:07 -07:00
Harshavardhana
f9aa239973
fix: export prometheus metrics for cache GC triggers (#9815)
Bonus change to use channel to serialize triggers,
instead of using atomic variables. More efficient
mechanism for synchronization.

Co-authored-by: Nitish Tiwari <nitish@minio.io>
2020-06-15 09:05:35 -07:00
Harshavardhana
4915433bd2
Support bucket versioning (#9377)
- Implement a new xl.json 2.0.0 format to support,
  this moves the entire marshaling logic to POSIX
  layer, top layer always consumes a common FileInfo
  construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
  for object placements.

Fixes #2111
2020-06-12 20:04:01 -07:00
Harshavardhana
b2db8123ec
Preserve errors returned by diskInfo to detect disk errors (#9727)
This PR basically reverts #9720 and re-implements it differently
2020-05-28 13:03:04 -07:00
Harshavardhana
53aaa5d2a5
Export bucket usage counts as part of bucket metrics (#9710)
Bonus fixes in quota enforcement to use the
new datastructure and use timedValue to cache
a value/reload automatically avoids one less
global variable.
2020-05-27 06:45:43 -07:00
Harshavardhana
6ac48a65cb
fix: use unused cacheMetrics code in prometheus (#9588)
remove all other unusued/deadcode
2020-05-13 08:15:26 -07:00
Harshavardhana
27d716c663
simplify usage of mutexes and atomic constants (#9501) 2020-05-03 22:35:40 -07:00
Harshavardhana
f44cfb2863
use GlobalContext whenever possible (#9280)
This change is throughout the codebase to
ensure that all codepaths honor GlobalContext
2020-04-09 09:30:02 -07:00
poornas
336460f67e
fix: gateway_s3_bytes_sent metric for all API methods (#9242)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-04-01 12:52:31 -07:00
Nitish Tiwari
6b984410d5
Add support for self-healing related metrics in Prometheus (#9079)
Fixes #8988

Co-authored-by: Anis Elleuch <vadmeste@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-24 22:40:45 -07:00
Anis Elleuch
d4dcf1d722
metrics: Use StorageInfo() instead to have consistent info (#9006)
Metrics used to have its own code to calculate offline disks.
StorageInfo() was avoided because it is an expensive operation
by sending calls to all nodes.

To make metrics & server info share the same code, a new
argument `local` is added to StorageInfo() so it will only
query local disks when needed.

Metrics now calls StorageInfo() as server info handler does
but with the local flag set to false.

Co-authored-by: Praveen raj Mani <praveen@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-20 09:21:33 +05:30
Nitish Tiwari
63be4709b7
Add metrics support for Azure & GCS Gateway (#8954)
We added support for caching and S3 related metrics in #8591. As
a continuation, it would be helpful to add support for Azure & GCS
gateway related metrics as well.
2020-02-11 21:08:01 +05:30
Nitish Tiwari
24ad59316d
Use atomic.Uint64 for gateway metrics count instead of mutex (#8615) 2019-12-07 11:21:52 +05:30
Nitish Tiwari
3df7285c3c Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591)
This PR adds support below metrics

- Cache Hit Count
- Cache Miss Count
- Data served from Cache (in Bytes)
- Bytes received from AWS S3
- Bytes sent to AWS S3
- Number of requests sent to AWS S3

Fixes #8549
2019-12-05 23:16:06 -08:00
Harshavardhana
347b29d059 Implement bucket expansion (#8509) 2019-11-19 17:42:27 -08:00
Harshavardhana
822eb5ddc7 Bring in safe mode support (#8478)
This PR refactors object layer handling such
that upon failure in sub-system initialization
server reaches a stage of safe-mode operation
wherein only certain API operations are enabled
and available.

This allows for fixing many scenarios such as

 - incorrect configuration in vault, etcd,
   notification targets
 - missing files, incomplete config migrations
   unable to read encrypted content etc
 - any other issues related to notification,
   policies, lifecycle etc
2019-11-09 09:27:23 -08:00
Harshavardhana
9e7a3e6adc Extend further validation of config values (#8469)
- This PR allows config KVS to be validated properly
  without being affected by ENV overrides, rejects
  invalid values during set operation

- Expands unit tests and refactors the error handling
  for notification targets, returns error instead of
  ignoring targets for invalid KVS

- Does all the prep-work for implementing safe-mode
  style operation for MinIO server, introduces a new
  global variable to toggle safe mode based operations
  NOTE: this PR itself doesn't provide safe mode operations
2019-10-30 23:39:09 -07:00
Praveen raj Mani
8836d57e3c The prometheus metrics refractoring (#8003)
The measures are consolidated to the following metrics

- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.

And this PR also modifies the current StorageInfo queries

- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.

NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR

Fixes #7873
2019-10-22 21:01:14 -07:00
Praveen raj Mani
ad75683bde Authorize prometheus endpoint with bearer token (#7640) 2019-09-22 20:27:12 +05:30
Praveen raj Mani
be72609d1f Expose version info in prometheus (#7812)
Fixes #7795
2019-06-26 10:36:54 -07:00
kannappanr
5ecac91a55
Replace Minio refs in docs with MinIO and links (#7494) 2019-04-09 11:39:42 -07:00
Harshavardhana
0188009c7e Expose total and available disk space (#7453) 2019-04-05 09:51:50 +05:30
Krishna Srinivas
52f6d5aafc Rename of structs and methods (#6230)
Rename of ErasureStorage to Erasure (and rename of related variables and methods)
2018-08-23 23:35:37 -07:00
Harshavardhana
5282639f3c Add prometheus endpoint to support total Used storageInfo (#5988)
Since we deprecated Total/Free we don't need to update
prometheus with those metrics. This PR also adds support
for caching implementation.
2018-05-30 11:30:14 -07:00
Yaroslav Skopets
a50cc7e937 Add Prometheus metrics for Minio gateway (#5987) 2018-05-30 10:13:46 +05:30
Nitish Tiwari
e6ebcc4cb6 Remove redundant prometheus data points (#5934)
Removed field minio_http_requests_total as it was redundant with
minio_http_requests_duration_seconds_count

Also removed field minio_server_start_time_seconds as it was
redundant with process_start_time_seconds
2018-05-15 12:23:43 -07:00
Ashish Kumar Sinha
deb685c5b5 Enhancements in Minio Prometheus exporter (#5848)
Standardized Minio collectors based on Prometheus 
recommendations.
2018-05-09 01:38:27 -07:00
Ashish Kumar Sinha
9ebb72aa99 Introduce new unauthenticated endpoint /metric (#5723) (#5829)
/metric exposes Promethus compatible data for scraping metrics

Fixes: #5723
2018-04-18 16:01:42 -07:00