minio

Commit Graph

Author	SHA1	Message	Date
Anis Elleuch	c434dff0a4	posix: Add missing error return in RenameFile() (#9319 ) Although it should not happen in most cases.	2020-04-11 11:15:30 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Harshavardhana	e20e08d700	fix: remove the sleep from listing operations (#9287 ) make rest of the Walk() function more predictable, it was observed that in nominal deployments even without much workload the drives are generally slow to respond to readdir operations, for the sleepDuration factor of 10 this can cause unexpected slowness in the Listing calls, while it is good for all other I/O, it may simply slow down Listing immensely which is not useful. fixes #9261	2020-04-08 19:42:57 -07:00
Anis Elleuch	e51e465543	delete: Use physical Dir() for proper prefix cleanup in Windows (#9297 ) In FS mode under Windows, removing an object will not automatically remove empty parent prefixes. The reason is that path.Dir() was used, however filepath.Dir() is more appropriate since filepath is physical (meaning it operates on OS filesystem paths). This went unnoticed because Windows CI failures are not caught.	2020-04-08 11:32:58 -07:00
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	2020-03-27 21:52:59 -07:00
Harshavardhana	6f992134a2	fix: startup load time by reusing storageDisks (#9210 )	2020-03-27 14:48:30 -07:00
Krishna Srinivas	45b1c66195	fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186 )	2020-03-22 19:23:47 -07:00
Harshavardhana	cfc9cfd84a	fix: various optimizations, idiomatic changes (#9179 ) - acquire single leader lock for all background operations - healing, crawling and applying lifecycle policies. - simplify lifecycle to avoid network calls, which was a bug in implementation - we should hold a leader and do everything from there, we have access to entire name space. - make listing, walking not interfere by slowing itself down like the crawler. - effectively use global context everywhere to ensure proper shutdown, in cache, lifecycle, healing - don't read `format.json` for prometheus metrics in StorageInfo() call.	2020-03-22 12:16:36 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00
Krishna Srinivas	2e9fed1a14	non-empty dirs should not be listed as objects (#9129 )	2020-03-13 17:43:00 -07:00
Kody A Kantor	06e30b5aa1	Skip building directio on platforms that don't support Direct IO (#9059 )	2020-03-12 18:57:41 -07:00
Anis Elleuch	0af62d35a0	xl: Implement posix.DeletePrefixes to enhance delete perf (#9100 ) Bulk delete API was using cleanupObjectsBulk() which calls posix listing and delete API to remove objects internal files in the backend (xl.json and parts) one by one. Add DeletePrefixes in the storage API to remove the content of a directory in a single call. Also use a remove goroutine for each disk to accelerate removal.	2020-03-11 08:56:36 -07:00
Harshavardhana	88ae0f1196	Improve delete performance by reducing the number of calls (#9092 ) - Remove the requirement to honor storage class for deletes - Improve `posix.DeleteFileBulk` code to Stat the volumeDir only once per call, rather than for all object paths.	2020-03-06 13:44:24 -08:00
Harshavardhana	23a8411732	Add a generic Walk()'er to list a bucket, optionally prefix (#9026 ) This generic Walk() is used by likes of Lifecycle, or KMS to rotate keys or any other functionality which relies on this functionality.	2020-02-25 21:22:28 +05:30
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-20 09:21:33 +05:30
Klaus Post	d0cea7adea	Fix stream read IO count (#8961 ) Streams are returning a readcloser and returning would decrement io count instantly, fix it. change maxActiveIOCount to 3, meaning it will pause crawling if 3 operations are running.	2020-02-07 09:43:55 +05:30
Harshavardhana	2d295a31de	Avoid select inside a recursive function to avoid CPU spikes (#8923 ) Additionally also allow configurable go-routines	2020-02-03 16:45:59 -08:00
Harshavardhana	f14f60a487	fix: Avoid double usage calculation on every restart (#8856 ) On every restart of the server, usage was being calculated which is not useful instead wait for sufficient time to start the crawling routine. This PR also avoids lots of double allocations through strings, optimizes usage of string builders and also avoids crawling through symbolic links. Fixes #8844	2020-01-21 14:07:49 -08:00
Harshavardhana	fc5213258e	posix: Do not take disk offline on I/O errors (#8836 ) Choosing maxAllowedIOError is arbitrary and prone to errors, when drives might be perfectly capable of taking I/O with only a few locations returning I/O errors. This is a hindrance of sorts where backend filesystems like ZFS can automatically fix and handle these scenarios. The added problem with the current approach is that we take the drive offline, making it virtually impossible to bring it online without restarting the server, which is not desirable on a busy cluster. Remove this state so that the backend returns the error appropriately to the caller and the caller decides what to do with the error.	2020-01-17 13:34:43 -08:00
Anis Elleuch	c18fbdb29a	posix: Remove a non needed nil check in DiskInfo() (#8830 ) posix.DiskInfo() returns errFaultyDisk when posix is nil, but there is no way that this would happen any time, therefore removing un-needed code.	2020-01-16 11:27:50 -08:00
Harshavardhana	0879a4f743	rest/storage: Remove racy LastError usage (#8817 ) instead perform a liveness check call to verify if server is online and print relevant errors. Also introduce a StorageErr string error type instead of errors.New() deprecate usage of VerifyFileError, DeleteFileError for gob, change in datastructure also requires bump in storage REST version to v13. Fixes #8811	2020-01-14 18:45:17 -08:00
Klaus Post	37b32199e3	Validate XL sets on format (#8779 ) When formatting a set validate if a host failure will likely lead to data loss. While we don't know what config will be set in the future evaluate to our best knowledge, assuming default settings.	2020-01-13 13:09:10 -08:00
Harshavardhana	5aa5dcdc6d	lock: improve locker initialization at init (#8776 ) Use reference format to initialize lockers during startup, also handle `nil` for NetLocker in dsync and remove errorLocker implementation Add further tuning parameters such as - DialTimeout is now 15 seconds from 30 seconds - KeepAliveTimeout is now 20 seconds, 5 seconds more than default 15 seconds - ResponseHeaderTimeout to 10 seconds - ExpectContinueTimeout is reduced to 3 seconds - DualStack is enabled by default remove setting it to `true` - Reduce IdleConnTimeout to 30 seconds from 1 minute to avoid idleConn build up Fixes #8773	2020-01-10 02:35:06 -08:00
Harshavardhana	f68a7005c0	Improve disk formatting stage for large disk sets (#8690 )	2019-12-23 16:31:03 -08:00
Anis Elleuch	555969ee42	Add data usage collect with its new admin API (#8553 ) Admin data usage info API returns the following (Only FS & XL, for now) - Number of buckets - Number of objects - The total size of objects - Objects histogram - Bucket sizes	2019-12-12 06:02:37 -08:00
Nitish Tiwari	3df7285c3c	Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591 ) This PR adds support below metrics - Cache Hit Count - Cache Miss Count - Data served from Cache (in Bytes) - Bytes received from AWS S3 - Bytes sent to AWS S3 - Number of requests sent to AWS S3 Fixes #8549	2019-12-05 23:16:06 -08:00
Harshavardhana	2ab8d5e47f	Enable build verification with race (#8583 )	2019-12-02 15:54:26 -08:00
Klaus Post	c7844fb1fb	posix: cache disk ID for a short while (#8564 ) `posix.getDiskID()` takes up to 30% of all CPU due to the `os.Stat` call on `GET` calls. Before: ``` Operation: GET - Concurrency: 12 Average: 1333.97 MB/s, 1365.99 obj/s, 1365.98 ops ended/s (4m59.975s) * First Byte: Average: 7.801487ms, Median: 7.9974ms, Best: 1.9822ms, Worst: 110.0021ms Aggregated, split into 299 x 1s time segments: * Fastest: 1453.50 MB/s, 1488.38 obj/s, 1492.00 ops ended/s (1s) * 50% Median: 1360.47 MB/s, 1393.12 obj/s, 1393.00 ops ended/s (1s) * Slowest: 978.68 MB/s, 1002.17 obj/s, 1004.00 ops ended/s (1s) ``` After: ``` Operation: GET - Concurrency: 12 * Average: 1706.07 MB/s, 1747.02 obj/s, 1747.01 ops ended/s (4m59.985s) * First Byte: Average: 5.797886ms, Median: 5.9959ms, Best: 996.3µs, Worst: 84.0007ms Aggregated, split into 299 x 1s time segments: * Fastest: 1830.03 MB/s, 1873.96 obj/s, 1872.00 ops ended/s (1s) * 50% Median: 1735.04 MB/s, 1776.68 obj/s, 1776.00 ops ended/s (1s) * Slowest: 994.94 MB/s, 1018.82 obj/s, 1018.00 ops ended/s (1s) ``` TLDR; `os.Stat` is not free.	2019-11-29 02:57:14 -08:00
Klaus Post	890b493a2e	Use random file name for write check (#8563 ) Since there may be multiple writes going on concurrently Use a random file name for the write check to avoid collisions.	2019-11-22 09:50:17 -08:00
Klaus Post	1dd38750f7	Remove read-ahead for small files (#8522 ) We should only read ahead if we are reading big files. We enable it for files >= 16MB. Benchmark on 64KB objects. Before: ``` Operation: GET Errors: 0 Average: 59.976s, 87.13 MB/s, 1394.07 ops ended/s. Fastest: 1s, 90.99 MB/s, 1455.00 ops ended/s. 50% Median: 1s, 87.53 MB/s, 1401.00 ops ended/s. Slowest: 1s, 81.39 MB/s, 1301.00 ops ended/s. ``` After: ``` Operation: GET Errors: 0 Average: 59.992s, 207.99 MB/s, 3327.85 ops ended/s. Fastest: 1s, 219.20 MB/s, 3507.00 ops ended/s. 50% Median: 1s, 210.54 MB/s, 3368.00 ops ended/s. Slowest: 1s, 179.14 MB/s, 2865.00 ops ended/s. ``` The 64KB buffer is actually a small disadvantage for this case, but I believe it will be better in general than no buffer.	2019-11-14 12:58:41 -08:00
Praveen raj Mani	fa325665b1	Do not append the endpoint for fs/xl disks in StorageInfo (#8472 )	2019-10-31 09:13:54 -07:00
Krishna Srinivas	980bf78b4d	Detect underlying disk mount/unmount (#8408 )	2019-10-25 10:37:53 -07:00
Praveen raj Mani	8836d57e3c	The prometheus metrics refactoring (#8003 ) The measures are consolidated to the following metrics - `disk_storage_used` : Disk space used by the disk. - `disk_storage_available`: Available disk space left on the disk. - `disk_storage_total`: Total disk space on the disk. - `disks_offline`: Total number of offline disks in current MinIO instance. - `disks_total`: Total number of disks in current MinIO instance. - `s3_requests_total`: Total number of s3 requests in current MinIO instance. - `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance. - `s3_requests_current`: Total number of active s3 requests in current MinIO instance. - `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance. - `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance. - `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance. - `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance. - `minio_version_info`: Current MinIO version with commit-id. - `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests. And this PR also modifies the current StorageInfo queries - Decouples StorageInfo from ServerInfo . - StorageInfo is enhanced to give endpoint information. NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR Fixes #7873	2019-10-22 21:01:14 -07:00
Harshavardhana	ff5bf51952	admin/heal: Fix deep healing to heal objects under more conditions (#8321 ) - Heal if the part.1 is truncated from its original size - Heal if the part.1 fails while being verified in between - Heal if the part.1 fails while being at a certain offset Other cleanups include make sure to flush the HTTP responses properly from storage-rest-server, avoid using 'defer' to improve call latency. 'defer' incurs latency avoid them in our hot-paths such as storage-rest handlers. Fixes #8319	2019-10-02 01:42:15 +05:30
Harshavardhana	975134e42b	Add checks in DiskInfo() to protect against changing mounts (#8286 )	2019-09-23 15:16:55 -07:00
Anis Elleuch	3f258062d8	bitrot: Verify file size inside storage interface (#7932 )	2019-09-12 02:19:53 +05:30
Harshavardhana	a7be313230	Start using new errors package (#8207 )	2019-09-11 22:51:43 +05:30
Krishna Srinivas	c38ada1a26	write() to disk in 4MB blocks for better performance (#7888 )	2019-08-23 15:36:46 -07:00
Harshavardhana	e6d8e272ce	Use const slashSeparator instead of "/" everywhere (#8028 )	2019-08-06 12:08:58 -07:00
Harshavardhana	e40c29e834	Fail appropriately if the disk has I/O errors (#7972 ) If the disk has I/O errors, we should simply ignore such a disk and not be bothered about it - until it is replaced.	2019-07-25 13:35:27 -07:00
Anis Elleuch	000a60f238	xl: Heal empty parts (#7860 ) posix.VerifyFile() doesn't know how to check if a file is corrupted if that file is empty. We do have the part size in xl.json so we pass it to VerifyFile to return an error so healing empty parts can work properly.	2019-07-13 00:29:44 +01:00
Krishna Srinivas	58d90ed73c	Avoid network transfer for bitrot verification during healing (#7375 )	2019-07-08 13:51:18 -07:00
Harshavardhana	39b3e4f9b3	Avoid using io.ReadFull() for WriteAll and CreateFile (#7676 ) With these changes we are now able to peak performances for all Write() operations across disks HDD and NVMe. Also adds readahead for disk reads, which also increases performance for reads by 3x.	2019-05-22 13:47:15 -07:00
Krishnan Parthasarathi	c871456269	File must be sync'd before closing (#7657 ) - group sync and close action into a single defer statement to avoid evaluation order related bugs in future.	2019-05-16 18:30:51 -07:00
Harshavardhana	b3f22eac56	Offload listing to posix layer (#7611 ) This PR adds one API WalkCh which sorts and sends list over the network Each disk walks independently in a sorted manner.	2019-05-14 13:49:10 -07:00
Anis Elleuch	9c90a28546	Implement bulk delete (#7607 ) Bulk delete at storage level in Multiple Delete Objects API In order to accelerate bulk delete in Multiple Delete objects API, a new bulk delete is introduced in storage layer, which will accept a list of objects to delete rather than only one. Consequently, a new API also needs to be added to Object API.	2019-05-13 12:25:49 -07:00
Harshavardhana	3eb7a8bde8	Sync before Close() to avoid random I/O (#7638 )	2019-05-11 15:03:10 -07:00
poornas	cf2a436bc8	Show SlowDown error message if backend is busy (#7521 ) or if there are too many open file descriptors.	2019-05-02 07:09:57 -07:00
Praveen raj Mani	c113d4e49c	Posix CreateFile should work for compressed lengths (#7584 )	2019-04-30 16:27:31 -07:00
Krishna Srinivas	b93ef73f9b	Fix divide by 0 error when directio.AlignSize is 0 (#7591 )	2019-04-26 16:08:15 -07:00

136 Commits