minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	fc5213258e	posix: Do not take disk offline on I/O errors (#8836 ) Choosing maxAllowedIOError is arbitrary and prone to errors, since a drive might be perfectly capable of taking I/O while only a few locations return I/O errors. This is a hindrance where backend filesystems like ZFS can automatically fix and handle these scenarios. An added problem with the current approach is that we take the drive offline, making it virtually impossible to bring it back online without restarting the server, which is not desirable on a busy cluster. Remove this state, let the backend return the error to the caller, and let the caller decide what to do with the error.	2020-01-17 13:34:43 -08:00
Anis Elleuch	c18fbdb29a	posix: Remove an unneeded nil check in DiskInfo() (#8830 ) posix.DiskInfo() returns errFaultyDisk when posix is nil, but this can never happen, so remove the unneeded code.	2020-01-16 11:27:50 -08:00
Harshavardhana	0879a4f743	rest/storage: Remove racy LastError usage (#8817 ) Instead, perform a liveness check call to verify if the server is online and print relevant errors. Also introduce a StorageErr string error type instead of errors.New(), and deprecate usage of VerifyFileError and DeleteFileError for gob. The change in data structure also requires a bump in the storage REST version to v13. Fixes #8811	2020-01-14 18:45:17 -08:00
Klaus Post	37b32199e3	Validate XL sets on format (#8779 ) When formatting a set, validate if a host failure will likely lead to data loss. While we don't know what config will be set in the future, evaluate to the best of our knowledge, assuming default settings.	2020-01-13 13:09:10 -08:00
Harshavardhana	5aa5dcdc6d	lock: improve locker initialization at init (#8776 ) Use reference format to initialize lockers during startup, also handle `nil` for NetLocker in dsync and remove errorLocker implementation. Add further tuning parameters such as - DialTimeout is now 15 seconds from 30 seconds - KeepAliveTimeout is now 20 seconds, 5 seconds more than default 15 seconds - ResponseHeaderTimeout to 10 seconds - ExpectContinueTimeout is reduced to 3 seconds - DualStack is enabled by default, remove setting it to `true` - Reduce IdleConnTimeout to 30 seconds from 1 minute to avoid idleConn build up Fixes #8773	2020-01-10 02:35:06 -08:00
Harshavardhana	f68a7005c0	Improve disk formatting stage for large disk sets (#8690 )	2019-12-23 16:31:03 -08:00
Anis Elleuch	555969ee42	Add data usage collect with its new admin API (#8553 ) Admin data usage info API returns the following (Only FS & XL, for now) - Number of buckets - Number of objects - The total size of objects - Objects histogram - Bucket sizes	2019-12-12 06:02:37 -08:00
Nitish Tiwari	3df7285c3c	Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591 ) This PR adds support below metrics - Cache Hit Count - Cache Miss Count - Data served from Cache (in Bytes) - Bytes received from AWS S3 - Bytes sent to AWS S3 - Number of requests sent to AWS S3 Fixes #8549	2019-12-05 23:16:06 -08:00
Harshavardhana	2ab8d5e47f	Enable build verification with race (#8583 )	2019-12-02 15:54:26 -08:00
Klaus Post	c7844fb1fb	posix: cache disk ID for a short while (#8564 ) `posix.getDiskID()` takes up to 30% of all CPU due to the `os.Stat` call on `GET` calls. Before: ``` Operation: GET - Concurrency: 12 Average: 1333.97 MB/s, 1365.99 obj/s, 1365.98 ops ended/s (4m59.975s) * First Byte: Average: 7.801487ms, Median: 7.9974ms, Best: 1.9822ms, Worst: 110.0021ms Aggregated, split into 299 x 1s time segments: * Fastest: 1453.50 MB/s, 1488.38 obj/s, 1492.00 ops ended/s (1s) * 50% Median: 1360.47 MB/s, 1393.12 obj/s, 1393.00 ops ended/s (1s) * Slowest: 978.68 MB/s, 1002.17 obj/s, 1004.00 ops ended/s (1s) ``` After: ``` Operation: GET - Concurrency: 12 * Average: 1706.07 MB/s, 1747.02 obj/s, 1747.01 ops ended/s (4m59.985s) * First Byte: Average: 5.797886ms, Median: 5.9959ms, Best: 996.3µs, Worst: 84.0007ms Aggregated, split into 299 x 1s time segments: * Fastest: 1830.03 MB/s, 1873.96 obj/s, 1872.00 ops ended/s (1s) * 50% Median: 1735.04 MB/s, 1776.68 obj/s, 1776.00 ops ended/s (1s) * Slowest: 994.94 MB/s, 1018.82 obj/s, 1018.00 ops ended/s (1s) ``` TLDR; `os.Stat` is not free.	2019-11-29 02:57:14 -08:00
Klaus Post	890b493a2e	Use random file name for write check (#8563 ) Since there may be multiple writes going on concurrently, use a random file name for the write check to avoid collisions.	2019-11-22 09:50:17 -08:00
Klaus Post	1dd38750f7	Remove read-ahead for small files (#8522 ) We should only read ahead if we are reading big files. We enable it for files >= 16MB. Benchmark on 64KB objects. Before: ``` Operation: GET Errors: 0 Average: 59.976s, 87.13 MB/s, 1394.07 ops ended/s. Fastest: 1s, 90.99 MB/s, 1455.00 ops ended/s. 50% Median: 1s, 87.53 MB/s, 1401.00 ops ended/s. Slowest: 1s, 81.39 MB/s, 1301.00 ops ended/s. ``` After: ``` Operation: GET Errors: 0 Average: 59.992s, 207.99 MB/s, 3327.85 ops ended/s. Fastest: 1s, 219.20 MB/s, 3507.00 ops ended/s. 50% Median: 1s, 210.54 MB/s, 3368.00 ops ended/s. Slowest: 1s, 179.14 MB/s, 2865.00 ops ended/s. ``` The 64KB buffer is actually a small disadvantage for this case, but I believe it will be better in general than no buffer.	2019-11-14 12:58:41 -08:00
Praveen raj Mani	fa325665b1	Do not append the endpoint for fs/xl disks in StorageInfo (#8472 )	2019-10-31 09:13:54 -07:00
Krishna Srinivas	980bf78b4d	Detect underlying disk mount/unmount (#8408 )	2019-10-25 10:37:53 -07:00
Praveen raj Mani	8836d57e3c	The Prometheus metrics refactoring (#8003 ) The measures are consolidated to the following metrics - `disk_storage_used`: Disk space used by the disk. - `disk_storage_available`: Available disk space left on the disk. - `disk_storage_total`: Total disk space on the disk. - `disks_offline`: Total number of offline disks in current MinIO instance. - `disks_total`: Total number of disks in current MinIO instance. - `s3_requests_total`: Total number of s3 requests in current MinIO instance. - `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance. - `s3_requests_current`: Total number of active s3 requests in current MinIO instance. - `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance. - `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance. - `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance. - `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance. - `minio_version_info`: Current MinIO version with commit-id. - `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests. This PR also modifies the current StorageInfo queries - Decouples StorageInfo from ServerInfo. - StorageInfo is enhanced to give endpoint information. NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR Fixes #7873	2019-10-22 21:01:14 -07:00
Harshavardhana	ff5bf51952	admin/heal: Fix deep healing to heal objects under more conditions (#8321 ) - Heal if the part.1 is truncated from its original size - Heal if the part.1 fails while being verified in between - Heal if the part.1 fails while being at a certain offset Other cleanups include making sure to flush the HTTP responses properly from storage-rest-server and avoiding 'defer' to improve call latency. 'defer' incurs latency, so avoid it in our hot paths such as storage-rest handlers. Fixes #8319	2019-10-02 01:42:15 +05:30
Harshavardhana	975134e42b	Add checks in DiskInfo() to protect against changing mounts (#8286 )	2019-09-23 15:16:55 -07:00
Anis Elleuch	3f258062d8	bitrot: Verify file size inside storage interface (#7932 )	2019-09-12 02:19:53 +05:30
Harshavardhana	a7be313230	Start using new errors package (#8207 )	2019-09-11 22:51:43 +05:30
Krishna Srinivas	c38ada1a26	write() to disk in 4MB blocks for better performance (#7888 )	2019-08-23 15:36:46 -07:00
Harshavardhana	e6d8e272ce	Use const slashSeparator instead of "/" everywhere (#8028 )	2019-08-06 12:08:58 -07:00
Harshavardhana	e40c29e834	Fail appropriately if the disk has I/O errors (#7972 ) If the disk has I/O errors, we should simply ignore such a disk and not be bothered about it - until it is replaced.	2019-07-25 13:35:27 -07:00
Anis Elleuch	000a60f238	xl: Heal empty parts (#7860 ) posix.VerifyFile() doesn't know how to check if a file is corrupted if that file is empty. We do have the part size in xl.json so we pass it to VerifyFile to return an error so healing empty parts can work properly.	2019-07-13 00:29:44 +01:00
Krishna Srinivas	58d90ed73c	Avoid network transfer for bitrot verification during healing (#7375 )	2019-07-08 13:51:18 -07:00
Harshavardhana	39b3e4f9b3	Avoid using io.ReadFull() for WriteAll and CreateFile (#7676 ) With these changes we are now able to reach peak performance for all Write() operations across HDD and NVMe disks. Also adds readahead for disk reads, which increases read performance by 3x.	2019-05-22 13:47:15 -07:00
Krishnan Parthasarathi	c871456269	File must be sync'd before closing (#7657 ) - group sync and close action into a single defer statement to avoid evaluation order related bugs in future.	2019-05-16 18:30:51 -07:00
Harshavardhana	b3f22eac56	Offload listing to posix layer (#7611 ) This PR adds one API, WalkCh, which sorts and sends the list over the network. Each disk walks independently in a sorted manner.	2019-05-14 13:49:10 -07:00
Anis Elleuch	9c90a28546	Implement bulk delete (#7607 ) Bulk delete at storage level in Multiple Delete Objects API. In order to accelerate bulk delete in the Multiple Delete Objects API, a new bulk delete is introduced in the storage layer, which accepts a list of objects to delete rather than only one. Consequently, a new API also needs to be added to the Object API.	2019-05-13 12:25:49 -07:00
Harshavardhana	3eb7a8bde8	Sync before Close() to avoid random I/O (#7638 )	2019-05-11 15:03:10 -07:00
poornas	cf2a436bc8	Show SlowDown error message if backend is busy (#7521 ) or if there are too many open file descriptors.	2019-05-02 07:09:57 -07:00
Praveen raj Mani	c113d4e49c	Posix CreateFile should work for compressed lengths (#7584 )	2019-04-30 16:27:31 -07:00
Krishna Srinivas	b93ef73f9b	Fix divide by 0 error when directio.AlignSize is 0 (#7591 )	2019-04-26 16:08:15 -07:00
Krishna Srinivas	a3ec71bc28	Use O_DIRECT while writing to disk (#7479 ) - Use O_DIRECT while writing to disk - Remove MINIO_DRIVE_SYNC option	2019-04-23 21:25:06 -07:00
Harshavardhana	f767a2538a	Optimize listing with leaf check offloaded to posix (#7541 ) Other listing optimizations include - remove double sorting while filtering object entries - improve error message when upload-id is not in quorum - use jsoniter for full unmarshal json, instead of gjson - remove unused code	2019-04-23 14:54:28 -07:00
kannappanr	5ecac91a55	Replace Minio refs in docs with MinIO and links (#7494 )	2019-04-09 11:39:42 -07:00
Harshavardhana	4a698c731b	HealObjects should remove objects without quorum (#7407 ) This PR adds a way to list objects without quorum such that they can be purged by `mc admin heal --remove`	2019-03-26 14:57:44 -07:00
Kirill Motkov	3d29ab4059	Rewrite if-else chains to switch statements (#7382 )	2019-03-18 07:46:20 -07:00
Harshavardhana	6702d23d52	Simplify ReadFileStream closer, make sure to flush all HTTP responses (#7374 )	2019-03-18 10:50:26 +05:30
Krishna Srinivas	6dd26b8231	Detect change in underlying mounted disks (#7229 )	2019-02-20 13:32:29 -08:00
Harshavardhana	df35d7db9d	Introduce staticcheck for stricter builds (#7035 )	2019-02-13 18:29:36 +05:30
Krishna Srinivas	6f08edfb36	Use O_EXCL when creating file as we never overwrite an existing file (#7189 )	2019-02-01 19:01:06 -08:00
Krishna Srinivas	82af0be1aa	Healing process should not heal root disk (#7089 )	2019-01-23 15:29:29 -08:00
Harshavardhana	8e0910ab3e	Fix build issues on BSDs in pkg/cpu (#7116 ) Also add a cross compile script to always test cross compilation for some well known platforms and architectures. We support out of the box compilation on these platforms even if we don't make an official release build. This script is to avoid regressions in this area when we add platform dependent code.	2019-01-22 09:27:23 +05:30
Krishna Srinivas	98c950aacd	Streaming bitrot verification support (#7004 )	2019-01-17 18:28:18 +05:30
poornas	5a80cbec2a	Add double encryption at S3 gateway. (#6423 ) This PR adds pass-through, single encryption at gateway and double encryption support (gateway encryption with pass through of SSE headers to backend). If KMS is set up (either with Vault as KMS or using MINIO_SSE_MASTER_KEY), the gateway will automatically perform single encryption. If MINIO_GATEWAY_SSE is set up in addition to Vault KMS, double encryption is performed. When neither KMS nor MINIO_GATEWAY_SSE is set, do a pass through to backend. When double encryption is specified, MINIO_GATEWAY_SSE can be set to "C" for SSE-C encryption at gateway and backend, "S3" for SSE-S3 encryption at gateway/backend or both to support more than one option. Fixes #6323, #6696	2019-01-05 14:16:42 -08:00
Harshavardhana	b9b353db4b	Add env to support synchronous ops for all calls (#6877 )	2018-12-11 16:22:56 -08:00
Nitish Tiwari	2a810c7da2	Improve du thread performance (#6849 )	2018-11-26 10:35:14 +05:30
Harshavardhana	f1f23f6f11	Add sync mode for 'xl.json' (#6798 ) xl.json is the source of truth for all erasure coded objects, without which we won't be able to read the objects properly. This PR enables sync mode for writing `xl.json` such that all writes hit the disk and are persistent under situations such as abrupt power failures on servers running Minio.	2018-11-14 19:48:35 +05:30
Praveen raj Mani	ce9d36d954	Add object compression support (#6292 ) Add support for streaming (golang/LZ77/snappy) compression.	2018-09-28 09:06:17 +05:30
Anis Elleuch	7571582000	Print storage errors during distributed initialization (#6441 ) This commit will print connection failures to other disks in other nodes after 5 retries. It is useful for users to understand why the distributed cluster fails to boot up.	2018-09-10 16:21:59 -07:00

118 Commits