Commit Graph

48 Commits

Author SHA1 Message Date
Anis Elleuch 0eb146e1b2
add additional metrics per disk API latency, API call counts #11250)
```
mc admin info --json
```

provides these details, for now, we shall eventually 
expose this at Prometheus level eventually. 

Co-authored-by: Harshavardhana <harsha@minio.io>
2021-03-16 20:06:57 -07:00
Harshavardhana 651487507a
fix: Merge() should merge and return a copy (#11714)
fixes #11713
2021-03-05 09:42:46 -08:00
Klaus Post fa9cf1251b
Imporve healing and reporting (#11312)
* Provide information on *actively* healing, buckets healed/queued, objects healed/failed.
* Add concurrent healing of multiple sets (typically on startup).
* Add bucket level resume, so restarts will only heal non-healed buckets.
* Print summary after healing a disk is done.
2021-03-04 14:36:23 -08:00
Harshavardhana c6a120df0e
fix: Prometheus metrics to re-use storage disks (#11647)
also re-use storage disks for all `mc admin server info`
calls as well, implement a new LocalStorageInfo() API
call at ObjectLayer to lookup local disks storageInfo

also fixes bugs where there were double calls to StorageInfo()
2021-03-02 17:28:04 -08:00
Bala FA 23f7ab40b3
Add PoolNumber field to madmin.ServerProperties (#11327) 2021-02-28 21:26:28 -08:00
Shireesh Anjal 3afa499885
fix: empty buckets/objects nodes in new setup (#11493) 2021-02-09 09:52:38 -08:00
Harshavardhana af9cb5f5f2 remove deprecated StandardSCData 2021-02-05 01:34:23 -08:00
Andreas Auernhammer 33554651e9
crypto: deprecate native Hashicorp Vault support (#11352)
This commit deprecates the native Hashicorp Vault
support and removes the legacy Vault documentation.

The native Hashicorp Vault documentation is marked as
outdated and deprecated for over a year now. We give
another 6 months before we start removing Hashicorp Vault
support and show a deprecation warning when a MinIO server
starts with a native Vault configuration.
2021-01-29 17:55:37 -08:00
Harshavardhana a6c146bd00
validate storage class across pools when setting config (#11320)
```
mc admin config set alias/ storage_class standard=EC:3
```

should only succeed if parity ratio is valid for all
server pools, if not we should fail proactively.

This PR also needs to bring other changes now that
we need to cater for variadic drive counts per pool.

Bonus fixes also various bugs reproduced with

- GetObjectWithPartNumber()
- CopyObjectPartWithOffsets()
- CopyObjectWithMetadata()
- PutObjectPart,PutObject with truncated streams
2021-01-22 12:09:24 -08:00
Anis Elleuch 2ecaab55a6
admin: ServerInfo returns info without object layer initialized (#11142) 2020-12-21 09:35:19 -08:00
Shireesh Anjal 6e138f955e
Fix a couple of typos in json config (#10605)
Vault.Encrypt: encryp -> encrypt
SysOBDProcess.Uids: uidsomitempty -> uids,omitempty
2020-09-30 13:08:11 -07:00
Shireesh Anjal b17dc81540
Change "disks" node to "drives" in OBD output (#10540) 2020-09-22 11:53:19 -07:00
Harshavardhana 8a291e1dc0
Cluster healthcheck improvements (#10408)
- do not fail the healthcheck if heal status
  was not obtained from one of the nodes,
  if many nodes fail then report this as a
  catastrophic error.
- add "x-minio-write-quorum" value to match
  the write tolerance supported by server.
- admin info now states if a drive is healing
  where madmin.Disk.Healing is set to true
  and madmin.Disk.State is "ok"
2020-09-02 22:54:56 -07:00
Harshavardhana 74116204ce
handle fresh setup with mixed drives (#10273)
fresh drive setups when one of the drive is
a root drive, we should ignore such a root
drive and not proceed to format.

This PR handles this properly by marking
the disks which are root disk and they are
taken offline.
2020-08-18 14:37:26 -07:00
Harshavardhana e7d7d5232c
fix: admin info output and improve overall performance (#10015)
- admin info node offline check is now quicker
- admin info now doesn't duplicate the code
  across doing the same checks for disks
- rely on StorageInfo to return appropriate errors
  instead of calling locally.
- diskID checks now return proper errors when
  disk not found v/s format.json missing.
- add more disk states for more clarity on the
  underlying disk errors.
2020-07-13 09:51:07 -07:00
Harshavardhana 4915433bd2
Support bucket versioning (#9377)
- Implement a new xl.json 2.0.0 format to support,
  this moves the entire marshaling logic to POSIX
  layer, top layer always consumes a common FileInfo
  construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
  for object placements.

Fixes #2111
2020-06-12 20:04:01 -07:00
Harshavardhana 1bc32215b9
enable full linter across the codebase (#9620)
enable linter using golangci-lint across
codebase to run a bunch of linters together,
we shall enable new linters as we fix more
things the codebase.

This PR fixes the first stage of this
cleanup.
2020-05-18 09:59:45 -07:00
Harshavardhana 814ddc0923
add missing admin actions, enhance AccountUsageInfo (#9607) 2020-05-15 18:16:45 -07:00
Harshavardhana 4314ee1670
fix: remove unusued PerfInfoHandler code (#9328)
- Removes PerfInfo admin API as its not OBDInfo
- Keep the drive path without the metaBucket in OBD
  global latency map.
- Remove all the unused code related to PerfInfo API
- Do not redefined global mib,gib constants use
  humanize.MiByte and humanize.GiByte instead always
2020-04-12 19:37:09 -07:00
Harshavardhana ae654831aa
Add madmin package context support (#9172)
This is to improve responsiveness for all
admin API operations and allowing callers
to cancel any on-going admin operations,
if they happen to be waiting too long.
2020-03-20 15:00:44 -07:00
Klaus Post 8d98662633
re-implement data usage crawler to be more efficient (#9075)
Implementation overview: 

https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b
2020-03-18 16:19:29 -07:00
Andreas Auernhammer 086fbb745e
fix and improve KMS server info (#8944)
This commit fixes typos in the displayed server info
w.r.t. the KMS and removes the update status.

For more information about why the update status
is removed see: PR #8943
2020-02-06 06:18:34 +05:30
Anis Elleuch 52bdbcd046
Add new admin API to return Accounting Usage (#8689) 2020-02-04 18:20:39 -08:00
Ashish Kumar Sinha abc266caa1 Add bucket and object count along with total object size (#8639) 2019-12-12 09:58:59 -08:00
Anis Elleuch 555969ee42 Add data usage collect with its new admin API (#8553)
Admin data usage info API returns the following

(Only FS & XL, for now)

- Number of buckets
- Number of objects
- The total size of objects
- Objects histogram
- Bucket sizes
2019-12-12 06:02:37 -08:00
Ashish Kumar Sinha e2c5d29017 Bucket,Object count & Usage removed if set to default (#8638) 2019-12-11 21:56:47 -08:00
kannappanr d266b3a066
Admin Info: Modify Uptime to return seconds (#8635) 2019-12-11 17:56:02 -08:00
Ashish Kumar Sinha 24fb1bf258 New Admin Info (#8497) 2019-12-11 14:27:03 -08:00
Praveen raj Mani e3273bc5bf Fix runtime panic in BackendDisks.Merge() (#8524) 2019-11-14 10:17:41 -08:00
Praveen raj Mani 8836d57e3c The prometheus metrics refractoring (#8003)
The measures are consolidated to the following metrics

- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.

And this PR also modifies the current StorageInfo queries

- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.

NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR

Fixes #7873
2019-10-22 21:01:14 -07:00
Bala FA 2a2ff96ee1 change `ReadPerf` into `ReadThroughput` in NetPerfInfo. (#8316)
Previously `ReadPerf` was in time.Duration is changed to `ReadThroughput` in uint64.
2019-09-27 00:01:18 +05:30
Krishnan Parthasarathi 6ba323b009 Add ability to test drive speeds on a MinIO setup (#7664)
- Extends existing Admin API to measure disk performance
2019-09-13 03:22:30 +05:30
Bala FA fa3546bb03 Add NetPerfInfo() API in madmin (#8112) 2019-08-31 08:27:53 +05:30
kannappanr 5ecac91a55
Replace Minio refs in docs with MinIO and links (#7494) 2019-04-09 11:39:42 -07:00
Harshavardhana 0188009c7e Expose total and available disk space (#7453) 2019-04-05 09:51:50 +05:30
Krishnan Parthasarathi 8a77a298f2 Add deploymentID to ServerInfo (#7372) 2019-03-19 16:12:24 +05:30
Sidhartha Mani 34e7259f95 Add Historic CPU and memory stats (#7136)
Collect historic cpu and mem stats.  Also, use actual values 
instead of formatted strings while returning to the client. The string 
formatting prevents values from being processed by the server or 
by the client without parsing it. 

This change will allow the values to be processed (eg. 
compute rolling-average over the lifetime of the minio server)
and offloads the formatting to the client.
2019-01-30 12:47:32 +05:30
Sidhartha Mani f3f47d8cd3 Add ServerCPULoadInfo() and ServerMemUsageInfo() admin API (#7038) 2019-01-09 19:04:19 -08:00
Nitish Tiwari fcb56d864c Add ServerDrivesPerfInfo() admin API (#6969)
This is part of implementation for mc admin health command. The
ServerDrivesPerfInfo() admin API returns read and write speed
information for all the drives (local and remote) in a given Minio
server deployment.

Part of minio/mc#2606
2018-12-31 09:46:44 -08:00
Harshavardhana 000e360196 Deprecate showing drive capacity and total free (#5976)
This addresses a situation that we shouldn't be
displaying Total/Free anymore, instead we should simply
show the total usage.
2018-05-23 17:30:25 -07:00
Harshavardhana e6ec645035 Implement support for calculating disk usage per tenant (#5969)
Fixes #5961
2018-05-23 15:41:29 +05:30
Harshavardhana fb96779a8a Add large bucket support for erasure coded backend (#5160)
This PR implements an object layer which
combines input erasure sets of XL layers
into a unified namespace.

This object layer extends the existing
erasure coded implementation, it is assumed
in this design that providing > 16 disks is
a static configuration as well i.e if you started
the setup with 32 disks with 4 sets 8 disks per
pack then you would need to provide 4 sets always.

Some design details and restrictions:

- Objects are distributed using consistent ordering
  to a unique erasure coded layer.
- Each pack has its own dsync so locks are synchronized
  properly at pack (erasure layer).
- Each pack still has a maximum of 16 disks
  requirement, you can start with multiple
  such sets statically.
- Static sets set of disks and cannot be
  changed, there is no elastic expansion allowed.
- Static sets set of disks and cannot be
  changed, there is no elastic removal allowed.
- ListObjects() across sets can be noticeably
  slower since List happens on all servers,
  and is merged at this sets layer.

Fixes #5465
Fixes #5464
Fixes #5461
Fixes #5460
Fixes #5459
Fixes #5458
Fixes #5460
Fixes #5488
Fixes #5489
Fixes #5497
Fixes #5496
2018-02-15 17:45:57 -08:00
Aditya Manthramurthy a337ea4d11 Move admin APIs to new path and add redesigned heal APIs (#5351)
- Changes related to moving admin APIs
   - admin APIs now have an endpoint under /minio/admin
   - admin APIs are now versioned - a new API to server the version is
     added at "GET /minio/admin/version" and all API operations have the
     path prefix /minio/admin/v1/<operation>
   - new service stop API added
   - credentials change API is moved to /minio/admin/v1/config/credential
   - credentials change API and configuration get/set API now require TLS
     so that credentials are protected
   - all API requests now receive JSON
   - heal APIs are disabled as they will be changed substantially

- Heal API changes
   Heal API is now provided at a single endpoint with the ability for a
   client to start a heal sequence on all the data in the server, a
   single bucket, or under a prefix within a bucket.

   When a heal sequence is started, the server returns a unique token
   that needs to be used for subsequent 'status' requests to fetch heal
   results.

   On each status request from the client, the server returns heal result
   records that it has accumulated since the previous status request. The
   server accumulates upto 1000 records and pauses healing further
   objects until the client requests for status. If the client does not
   request any further records for a long time, the server aborts the
   heal sequence automatically.

   A heal result record is returned for each entity healed on the server,
   such as system metadata, object metadata, buckets and objects, and has
   information about the before and after states on each disk.

   A client may request to force restart a heal sequence - this causes
   the running heal sequence to be aborted at the next safe spot and
   starts a new heal sequence.
2018-01-22 14:54:55 -08:00
Nitish Tiwari 42633748db
Update madmin package to return storage class parity (#5387)
After the addition of Storage Class support, readQuorum
and writeQuorum are decided on a per object basis, instead
of deployment wide static quorums.

This PR updates madmin api to remove readQuorum/writeQuorum
and add Standard storage class and reduced redundancy storage
class parity as return values. Since these parity values are
used to decide the quorum for each object.

Fixes #5378
2018-01-12 07:52:52 +05:30
Matthieu Paret 374feda237 add HTTPStats to madmin (#5299) 2017-12-22 17:47:30 -08:00
Anis Elleuch 465274cd21 server-info: Change Error type to string (#4346)
Golang std error type doesn't marshal/unmarshal with json. So errors
are not actually being sent when a client calls ServerInfo() API.
2017-05-15 07:28:47 -07:00
Anis Elleuch 83abad0b37 admin: ServerInfo() returns info for each node (#4150)
ServerInfo() will gather information from all nodes before returning
it back to the client.
2017-04-21 07:15:53 -07:00
Anis Elleuch 7f86a21317 admin: Add ServerInfo API() (#3743) 2017-02-15 10:45:45 -08:00