minio

mirror of https://github.com/minio/minio.git synced 2025-11-27 04:46:53 -05:00

Author	SHA1	Message	Date
Klaus Post	2040559f71	Fix SkipReader performance with small initial read (#20030 ) If `SkipReader` is called with a small initial buffer it may be doing a huge number of Reads to skip the requested number of bytes. If a small buffer is provided grab a 32K buffer and use that. Fixes slow execution of `testAPIGetObjectWithMPHandler`. Bonuses: * Use `-short` with `-race` test. * Do all suite test types with `-short`. * Enable compressed+encrypted in `testAPIGetObjectWithMPHandler`. * Disable big file tests in `testAPIGetObjectWithMPHandler` when using `-short`.	2024-07-02 08:13:05 -07:00
Shubhendu	7c7650b7c3	Add sufficient deadlines and countermeasures to handle hung node scenario (#19688 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2024-05-22 16:07:14 -07:00
Harshavardhana	03767d26da	fix: get rid of large buffers (#19549 ) these lead to run-away usage of memory beyond what Go's GC can handle, we have to re-visit this differently, remove this for now.	2024-04-19 04:26:59 -07:00
Harshavardhana	c957e0d426	fix: increase the tiering part size to 128MiB (#19424 ) also introduce 8MiB buffer to read from for bigger parts	2024-04-08 02:22:27 -07:00
Harshavardhana	c201d8bda9	write anything beyond 4k to be written in 4k pages (#19269 ) we were prematurely not writing 4k pages while we could have due to the fact that most buffers would be multiples of 4k up to some number and there shall be some remainder. We only need to write the remainder without O_DIRECT.	2024-03-15 12:27:59 -07:00
Harshavardhana	d99d16e8c3	simplify deadlineWriter, re-use WithDeadline (#18948 )	2024-02-02 03:02:31 -08:00
Harshavardhana	9ef132c33b	remove excessive logging due to runtime.debugStack	2024-01-28 18:10:42 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace for 'nil' channel closes(), if not safely closes it.	2024-01-28 10:04:17 -08:00
Klaus Post	4a6c97463f	Fix all racy use of NewDeadlineWorker (#18861 ) Almost all uses of NewDeadlineWorker, which relied on secondary values, were used in a racy fashion, which could lead to inconsistent errors/data being returned. It also propagates the deadline downstream. Rewrite all these to use a generic WithDeadline caller that can return an error alongside a value. Remove the stateful aspect of DeadlineWorker - it was racy if used - but it wasn't AFAICT. Fixes races like: ``` WARNING: DATA RACE Read at 0x00c130b29d10 by goroutine 470237: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).ReadVersion() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:702 +0x611 github.com/minio/minio/cmd.readFileInfo() github.com/minio/minio/cmd/erasure-metadata-utils.go:160 +0x122 github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.1() github.com/minio/minio/cmd/erasure-object.go:809 +0x27a github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.2() github.com/minio/minio/cmd/erasure-object.go:828 +0x61 Previous write at 0x00c130b29d10 by goroutine 470298: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).ReadVersion.func1() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:698 +0x244 github.com/minio/minio/internal/ioutil.(DeadlineWorker).Run.func1() github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33 WARNING: DATA RACE Write at 0x00c0ba6e6c00 by goroutine 94507: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).StatVol.func1() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:419 +0x104 github.com/minio/minio/internal/ioutil.(DeadlineWorker).Run.func1() github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33 Previous read at 0x00c0ba6e6c00 by goroutine 94463: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).StatVol() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:422 +0x47e github.com/minio/minio/cmd.getBucketInfoLocal.func1() github.com/minio/minio/cmd/peer-s3-server.go:275 +0x122 github.com/minio/pkg/v2/sync/errgroup.(*Group).Go.func1() ``` Probably back from #17701	2024-01-24 10:08:31 -08:00
Harshavardhana	dd2542e96c	add codespell action (#18818 ) Original work here, #18474, refixed and updated.	2024-01-17 23:03:17 -08:00
Klaus Post	6c89a81af4	Fix CreateFile shared buffer corruption. (#18652 ) `(xlStorageDiskIDCheck).CreateFile` wraps the incoming reader in `xioutil.NewDeadlineReader`. The wrapped reader is handed to `(xlStorage).CreateFile`. This performs a Read call via `writeAllDirect`, which reads into an `ODirectPool` buffer. `(*DeadlineReader).Read` spawns an async read into the buffer. If a timeout is hit while reading, the read operation returns to `writeAllDirect`. The operation returns an error and the buffer is reused. However, if the async `Read` call unblocks, it will write to the now recycled buffer. Fix: Remove the `DeadlineReader` - it is inherently unsafe. Instead, rely on the network timeouts. This is not a disk timeout, anyway. Regression in https://github.com/minio/minio/pull/17745	2023-12-14 10:51:57 -08:00
Harshavardhana	65f34cd823	fix: remove ODirectReader entirely since we do not need it anymore (#18619 )	2023-12-09 10:17:51 -08:00
Harshavardhana	754f7a8a39	replace io.Discard usage to fix some NUMA copy() latencies (#18394 ) replace io.Discard usage to fix NUMA copy() latencies On NUMA systems copying from 8K buffer allocated via io.Discard leads to large latency build-up for every ``` copy(new8kbuf, largebuf) ``` can incur up to 1ms worth of latencies on NUMA systems due to memory sharding across NUMA nodes.	2023-11-06 14:26:08 -08:00
Harshavardhana	d9f1df01eb	return an error in CopyAligned upon premature EOF (#18110 ) add a unit-test to capture this corner case	2023-09-26 11:20:06 -07:00
Harshavardhana	731e03fe5a	add ReadFileStream deadline for disk call (#17745 ) timeout the reader side if hung via disk max timeout	2023-07-28 15:37:53 -07:00
Harshavardhana	e7b60c4d65	Add slow drive timeouts to match with active disk monitoring (#17701 ) allow active disk-monitoring to be configurable, and use these to add deadlines in various call layers for various syscalls.	2023-07-25 16:58:31 -07:00
Klaus Post	76913a9fd5	Signed trailers for signature v4 (#16484 )	2023-05-05 19:53:12 -07:00
Klaus Post	e8c0a50862	optimization use small blocks up to 64KB (#17107 )	2023-05-01 09:47:49 -07:00
jiuker	c8b92f6067	protect wg.Done from being called twice (#17075 )	2023-04-27 07:55:36 -07:00
Klaus Post	6a04067514	fix: tweak read buffer size to reduce over-reading (#16338 )	2023-01-01 08:14:20 -08:00
Klaus Post	ff12080ff5	Remove deprecated io/ioutil (#15707 )	2022-09-19 11:05:16 -07:00
Harshavardhana	97376f6e8f	improve performance for inlined data (#15603 ) inlined data often is bigger than the allowed O_DIRECT alignment, so potentially we can write 'xl.meta' without O_DSYNC instead we can rely on O_DIRECT + fdatasync() instead. This PR allows O_DIRECT on inlined data that would gain the benefits of performing O_DIRECT, eventually performing an fdatasync() at the end. Performance boost can be observed here for small objects < 128KiB. The performance boost is mainly seen on HDD, and marginal on NVMe setups.	2022-08-29 11:19:29 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
Klaus Post	b890bbfa63	Add local disk health checks (#14447 ) The main goal of this PR is to solve the situation where disks stop responding to operations. This generally causes an FD build-up and eventually will crash the server. This adds detection of hung disks, where calls on disk get stuck. We add functionality to `xlStorageDiskIDCheck` where it keeps track of the number of concurrent requests on a given disk. A total number of 100 operations are allowed. If this limit is reached we will block (but not reject) new requests, but we will monitor the state of the disk. If no requests have been completed or updated within a 15-second window, we mark the disk as offline. Requests that are blocked will be unblocked and return an error as "faulty disk". New requests will be rejected until the disk is marked OK again. Once a disk has been marked faulty, a check will run every 5 seconds that will attempt to write and read back a file. As long as this fails the disk will remain faulty. To prevent lots of long-running requests to mark the disk faulty we implement a callback feature that allows updating the status as parts of these operations are running. We add a reader and writer wrapper that will update the status of each successful read/write operation. This should allow fine enough granularity that a slow, but still operational disk will not reach 15 seconds where 50 operations have not progressed. Note that errors themselves are not enough to mark a disk faulty. A nil (or io.EOF) error will mark a disk as "good". * Make concurrent disk setting configurable via `_MINIO_DISK_MAX_CONCURRENT`. * de-couple IsOnline() from disk health tracker The purpose of IsOnline() is to ensure that we reconnect the drive only when the "drive" was - disconnected from network we need to validate if the drive is "correct" and is the same drive which belongs to this server. - drive was replaced we have to format it - we support hot swapping of the drives. IsOnline() is not meant for taking the drive offline when it is hung, it is not useful we can let the drive be online instead "return" errors for relevant calls. * return errFaultyDisk for DiskInfo() call Co-authored-by: Harshavardhana <harsha@minio.io> Possible future Improvements: * Unify the REST server and local xlStorageDiskIDCheck. This would also improve stats significantly. * Allow reads/writes to be aborted by the context. * Add usage stats, concurrent count, blocked operations, etc.	2022-03-09 11:38:54 -08:00
Harshavardhana	5a9f133491	speed up startup sequence for all operations (#14148 ) This speed-up is intended for faster startup times for almost all MinIO operations. Changes here are - Drives are not re-read for 'format.json' on a regular basis once read during init is remembered and refreshed at 5 second intervals. - Do not do O_DIRECT tests on drives with existing 'format.json' only fresh setups need this check. - Parallelize initializing erasureSets for multiple sets. - Avoid re-reading format.json when migrating 'format.json' from really old V1->V2->V3 - Keep a copy of local drives for any given server in memory for a quick lookup.	2022-01-24 11:28:45 -08:00
Harshavardhana	f527c708f2	run gofumpt cleanup across code-base (#14015 )	2022-01-02 09:15:06 -08:00
Harshavardhana	9c5d9ae376	fallback O_DIRECT if not supported, do regular reads() (#13680 )	2021-11-17 15:48:47 -08:00
Harshavardhana	661b263e77	add gocritic/ruleguard checks back again, cleanup code. (#13665 ) - remove some duplicated code - reported a bug, separately fixed in #13664 - using strings.ReplaceAll() when needed - using filepath.ToSlash() use when needed - remove all non-Go style comments from the codebase Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2021-11-16 09:28:29 -08:00
Harshavardhana	14d8a931fe	re-use io.Copy buffers with 32k pools (#13553 ) Borrowed idea from Go's usage of this optimization for ReadFrom() on client side, we should re-use the 32k buffers io.Copy() allocates for generic copy from a reader to writer. the performance increase for reads for really tiny objects is at this range after this change. > * Fastest: +7.89% (+1.3 MiB/s) throughput, +7.89% (+1308.1) obj/s	2021-11-02 08:11:50 -07:00
Klaus Post	974073a2e5	directio: Check if buffers are set. (#13440 ) Check if directio buffers have actually been fetched and prevent errors on double Close. Return error on Read after Close. Fixes ``` panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xf8582f] goroutine 210 [running]: github.com/minio/minio/internal/ioutil.(ODirectReader).Read(0xc0054f8320, {0xc0014560b0, 0xa8, 0x44d012}) github.com/minio/minio/internal/ioutil/odirect_reader.go:88 +0x10f io.ReadAtLeast({0x428c5c0, 0xc0054f8320}, {0xc0014560b0, 0xa8, 0xa8}, 0xa8) io/io.go:328 +0x9a io.ReadFull(...) io/io.go:347 github.com/minio/minio/internal/ioutil.ReadFile({0xc001bf60e0, 0x6}) github.com/minio/minio/internal/ioutil/read_file.go:48 +0x19b github.com/minio/minio/cmd.(FSObjects).scanBucket.func1({{0xc00444e1e0, 0x4d}, 0x0, {0xc0040cf240, 0xe}, {0xc0040cf24f, 0x18}, {0xc0040cf268, 0x18}, 0x0, ...}) github.com/minio/minio/cmd/fs-v1.go:366 +0x1ea github.com/minio/minio/cmd.(folderScanner).scanFolder.func1({0xc00474a6a8, 0xc0065d6793}, 0x0) github.com/minio/minio/cmd/data-scanner.go:494 +0xb15 github.com/minio/minio/cmd.readDirFn({0xc002803e80, 0x34}, 0xc000670270) github.com/minio/minio/cmd/os-readdir_unix.go:172 +0x638 github.com/minio/minio/cmd.(folderScanner).scanFolder(0xc002deeb40, {0x42dc9d0, 0xc00068cbc0}, {{0xc001c6e2d0, 0x27}, 0xc0023db8e0, 0x1}, 0xc0001c7ab0) github.com/minio/minio/cmd/data-scanner.go:427 +0xa8f github.com/minio/minio/cmd.(folderScanner).scanFolder.func2({{0xc001c6e2d0, 0x27}, 0xc0023db8e0, 0x27}) github.com/minio/minio/cmd/data-scanner.go:549 +0xd0 github.com/minio/minio/cmd.(folderScanner).scanFolder(0xc002deeb40, {0x42dc9d0, 0xc00068cbc0}, {{0xc0013fa9e0, 0xe}, 0x0, 0x1}, 0xc000670dd8) github.com/minio/minio/cmd/data-scanner.go:623 +0x205d github.com/minio/minio/cmd.scanDataFolder({_, _}, {_, _}, {{{0xc0013fa9e0, 0xe}, 0x802, {0x210f15d2, 0xed8f903b8, 0x5bc0e80}, ...}, ...}, ...) github.com/minio/minio/cmd/data-scanner.go:333 +0xc51 github.com/minio/minio/cmd.(FSObjects).scanBucket(_, {_, _}, {_, _}, {{{0xc0013fa9e0, 0xe}, 0x802, {0x210f15d2, 0xed8f903b8, ...}, ...}, ...}) github.com/minio/minio/cmd/fs-v1.go:364 +0x305 github.com/minio/minio/cmd.(FSObjects).NSScanner(0x42dc9d0, {0x42dc9d0, 0xc00068cbc0}, 0x0, 0xc003bcfda0, 0x802) github.com/minio/minio/cmd/fs-v1.go:307 +0xa16 github.com/minio/minio/cmd.runDataScanner({0x42dc9d0, 0xc00068cbc0}, {0x436a6c0, 0xc000bfcf50}) github.com/minio/minio/cmd/data-scanner.go:150 +0x749 created by github.com/minio/minio/cmd.initDataScanner github.com/minio/minio/cmd/data-scanner.go:73 +0xb0 ```	2021-10-14 10:19:17 -07:00
Harshavardhana	d00ff3c453	use O_DIRECT for all ReadFileStream (#13324 ) This PR also removes #13312 to ensure that we can use a better mechanism to handle page-cache, using O_DIRECT even for Range GETs.	2021-09-29 16:40:28 -07:00
Harshavardhana	38027c8f52	use fadvise to control Linux page-cache (#13312 ) This PR brings two optimizations mainly for page-cache build-up and how to avoid getting OOM killed in the process. Although these memories are reclaimable Linux is not fast enough to reclaim them as needed on a very busy system. fadvise is a system call implemented in Linux to advise page-cache to avoid overload as we get significant amount of requests on the server. - FADV_SEQUENTIAL tells that all I/O from now is going to be sequential, allowing for more responsive throughput. - FADV_NOREUSE tells kernel to start removing things for this 'fd' from page-cache.	2021-09-28 10:02:56 -07:00
Klaus Post	470553ff5d	Tweak readall allocation and renameData buffer reuse (#13108 ) Use a single allocation for reading the file, not the growing buffer of `io.ReadAll`. Reuse the write buffer if we can when writing metadata in RenameData.	2021-08-30 08:38:11 -07:00
Harshavardhana	202d0b64eb	fix: enable go1.17 github ci/cd (#12997 )	2021-08-18 18:35:22 -07:00
Harshavardhana	039978640f	fix: honor system umask for file creates (#12601 ) use 0666 os.FileMode to honor system umask	2021-07-06 12:54:16 -07:00
Harshavardhana	1f262daf6f	rename all remaining packages to internal/ (#12418 ) This is to ensure that there are no projects that try to import `minio/minio/pkg` into their own repo. Any such common packages should go to `https://github.com/minio/pkg`	2021-06-01 14:59:40 -07:00

36 Commits