minio

Commit Graph

Author	SHA1	Message	Date
Harshavardhana	88c1bb0720	fix: improper ticker usage in goroutines (#11468 ) - lock maintenance loop was incorrectly sleeping as well as using ticker badly, leading to extra expiration routines getting triggered that could flood the network. - multipart upload cleanup should be based on timer instead of ticker, to ensure that long running jobs don't get triggered twice. - make sure to get right lockers for object name	2021-02-05 19:23:48 -08:00
Harshavardhana	76e2713ffe	fix: use buffers only when necessary for io.Copy() (#11229 ) Use separate sync.Pool for writes/reads. Avoid passing buffers to io.CopyBuffer(): if the reader or writer implements io.WriterTo or io.ReaderFrom respectively, it is useless for sync.Pool to allocate buffers on its own, since they will be completely ignored by the io.CopyBuffer Go implementation. Improve this wherever we see this to be optimal. This allows us to be more efficient on memory usage. ``` // copyBuffer is the actual implementation of Copy and CopyBuffer. // if buf is nil, one is allocated. func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) { // If the reader has a WriteTo method, use it to do the copy. // Avoids an allocation and a copy. if wt, ok := src.(WriterTo); ok { return wt.WriteTo(dst) } // Similarly, if the writer has a ReadFrom method, use it to do the copy. if rt, ok := dst.(ReaderFrom); ok { return rt.ReadFrom(src) } ``` From readahead package ``` // WriteTo writes data to w until there's no more data to write or when an error occurs. // The return value n is the number of bytes written. // Any error encountered during the write is also returned. func (a *reader) WriteTo(w io.Writer) (n int64, err error) { if a.err != nil { return 0, a.err } n = 0 for { err = a.fill() if err != nil { return n, err } n2, err := w.Write(a.cur.buffer()) a.cur.inc(n2) n += int64(n2) if err != nil { return n, err } ```	2021-01-06 09:36:55 -08:00
Klaus Post	2294e53a0b	Don't retain context in locker (#10515 ) Use the context for internal timeouts, but disconnect it from outgoing calls so we always receive the results and cancel it remotely.	2020-11-04 08:25:42 -08:00
Harshavardhana	3831cc9e3b	fix: [fs] CompleteMultipart use trie structure for partMatch (#10522 ) performance improves by around 100x or more ``` go test -v -run NONE -bench BenchmarkGetPartFile goos: linux goarch: amd64 pkg: github.com/minio/minio/cmd BenchmarkGetPartFileWithTrie BenchmarkGetPartFileWithTrie-4 1000000000 0.140 ns/op 0 B/op 0 allocs/op PASS ok github.com/minio/minio/cmd 1.737s ``` fixes #10520	2020-09-21 01:18:13 -07:00
Harshavardhana	84bf4624a4	fix: make sure to preserve metadata during overwrite in FS mode (#10512 ) This bug was introduced in `14f0047295` almost 3yrs ago, as a side effect of removing stale `fs.json`, but we in fact end up removing existing good `fs.json` for an existing object, leading to some form of data loss. fixes #10496	2020-09-18 00:16:16 -07:00
Harshavardhana	4a36cd7035	fix: improve performance ListObjectParts in FS mode (#10510 ) from 20s for 10000 parts to less than 1sec Without the patch ``` ~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \ list-parts --bucket testbucket --key test \ --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a real 0m20.394s user 0m0.589s sys 0m0.174s ``` With the patch ``` ~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \ list-parts --bucket testbucket --key test \ --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a real 0m0.891s user 0m0.624s sys 0m0.182s ``` fixes #10503	2020-09-17 18:51:16 -07:00
Harshavardhana	0104af6bcc	delayed locks until we have started reading the body (#10474 ) This is to ensure that Go contexts work properly. After some interesting experiments I found that Go net/http doesn't cancel the context when Body is non-zero and hasn't been read till EOF. The following gist explains this. This can lead to a pile-up of goroutines on the server which will never be canceled and will only die at a much later point in time, which can simply overwhelm the server. https://gist.github.com/harshavardhana/c51dcfd055780eaeb71db54f9c589150 To avoid this, refactor the locking such that we take locks after we have started reading from the body and only take locks when needed. Also, remove contextReader as it's not useful and doesn't work as expected: the context is not canceled until the body reaches EOF, so there is no point in wrapping it with context and putting a `select {` on it, which can unnecessarily increase the CPU overhead. We will still use the context to cancel the lockers etc. Additional simplification in the locker code to avoid timers, as re-using them is a complicated ordeal; avoid them in the hot path, since locking is very common and this may avoid lots of allocations.	2020-09-14 15:57:13 -07:00
Klaus Post	b7438fe4e6	Copy metadata before spawning goroutine + prealloc maps (#10458 ) In `(*cacheObjects).GetObjectNInfo` copy the metadata before spawning a goroutine. Clean up a few map[string]string copies as well, reducing allocs and simplifying the code. Fixes #10426	2020-09-10 11:37:22 -07:00
Harshavardhana	6a0372be6c	cleanup tmpDir any older entries automatically just like multipart (#10439 ) also consider multipart uploads, temporary files in `.minio.sys/tmp` as stale beyond 24hrs and clean them up automatically	2020-09-08 15:55:40 -07:00
Harshavardhana	c13afd56e8	Remove MaxConnsPerHost settings to avoid potential hangs (#10438 ) MaxConnsPerHost can potentially hang a call without any way to timeout, we do not need this setting for our proxy and gateway implementations instead IdleConn settings are good enough. Also ensure to use NewRequestWithContext and make sure to take the disks offline only for network errors. Fixes #10304	2020-09-08 14:22:04 -07:00
Harshavardhana	102ad60dee	simplify removing temporary files (#10389 )	2020-08-31 12:35:40 -07:00
Harshavardhana	e57c742674	use single dynamic timeout for most locked API/heal ops (#10275 ) newDynamicTimeout should be allocated once, in-case of temporary locks in config and IAM we should have allocated timeout once before the `for loop` This PR doesn't fix any issue as such, but provides enough dynamism for the timeout as per expectation.	2020-08-17 11:29:58 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Harshavardhana	8befedef14	simplify FS multipart cleanup (#9740 ) fixes #9671	2020-05-30 13:56:31 -07:00
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages: avoids 100's of stats which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, and dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateways it simplifies the approach, since the azure, gcs and hdfs gateways don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized; this additionally just avoids small allocations of a large data structure. Fixes #7206	2020-05-28 12:36:20 -07:00
Krishna Srinivas	94f1a1dea3	add option for O_SYNC writes for standalone FS backend (#9581 )	2020-05-12 19:24:59 -07:00
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208 ) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	2020-04-27 10:06:21 -07:00
Harshavardhana	60d415bb8a	deprecate/remove global WORM mode (#9436 ) global WORM mode is a complex piece for which the time has passed, with the advent of S3 compatible object locking and retention implementation global WORM is sort of deprecated, this has been mentioned in our documentation for some time, now the time has come for this to go.	2020-04-24 16:37:05 -07:00
Harshavardhana	69fb68ef0b	fix simplify code to start using context (#9350 )	2020-04-16 10:56:18 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00
Harshavardhana	e3b44c3829	Remove partName, partETag requirement (#9044 ) This is a precursor change before versioning, removes/deprecates the requirement of remembering partName and partETag which are not useful after a multipart transaction has finished. This PR reduces the overall size of the backend JSON for large file uploads.	2020-03-03 03:29:30 +03:00
poornas	30922148fb	Fix bug preventing overwrite of object if (#8796 ) object lock config is enabled for a bucket. Creating a bucket with object lock configuration enabled does not automatically cause WORM protection to be applied. PUT operation needs to specifically request object locking or bucket has to have default retention settings configured. Fixes regression introduced in #8657	2020-01-13 17:29:31 -08:00
Harshavardhana	5f2318567e	Allow metadata updates on meta bucket even in WORM mode (#8657 ) This ensures that we can update the meta bucket: - .minio.sys is updated for accounting/data usage purposes - .minio.sys is updated to indicate if backend is encrypted or not.	2019-12-17 10:13:12 -08:00
poornas	ca96560d56	Add object retention at the per object (#8528 ) level - this PR builds on #8120 which added PutBucketObjectLockConfiguration and GetBucketObjectLockConfiguration APIS This PR implements PutObjectRetention, GetObjectRetention API and enhances PUT and GET API operations to display governance metadata if permissions allow.	2019-11-20 13:18:09 -08:00
Harshavardhana	e9b2bf00ad	Support MinIO to be deployed on more than 32 nodes (#8492 ) This PR moves locking from a global entity to a more localized set-level entity, allowing for locks to be held only on the resources which are writing to a collection of disks rather than at a global level. In the process, this PR also lifts the top-level limit of 32 nodes, allowing an unlimited number of nodes. This is a precursor change before bringing in bucket expansion.	2019-11-13 12:17:45 -08:00
Bala FA	fb48ca5020	Add Get/Put Bucket Lock Configuration API support (#8120 ) This feature implements [PUT Bucket object lock configuration][1] and [GET Bucket object lock configuration][2]. After object lock configuration is set, existing and new objects are set to WORM for specified duration. Currently Governance mode works exactly like Compliance mode. Fixes #8101 [1] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUTObjectLockConfiguration.html [2] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGETObjectLockConfiguration.html	2019-11-12 14:50:18 -08:00
Anis Elleuch	ee05280721	fs: Remove stale background append temporary file (#8404 ) Background append creates a temporary file to which uploaded parts are appended as they become available, but when a client stops the upload, the temporary file is never removed. This commit removes the temporary file when the server does its regular removal of stale multipart uploads.	2019-10-17 00:27:52 +05:30
poornas	d7060c4c32	Allow logging targets to be configured to receive `minio` (#8347 ) specific errors, `application` errors or `all` by default. console logging on server by default lists all logs - enhance admin console API to accept `type` as query parameter to subscribe to application/minio logs.	2019-10-11 18:50:54 -07:00
Harshavardhana	b52a3e523c	Avoid using fastjson parser pool, move back to jsoniter (#8190 ) From an implementation point of view, it looks like the fastjson parser pool doesn't behave as expected when dealing with many `xl.json` from multiple disks. The fastjson parser pool usage ends up returning incorrect xl.json entries for checksums, with references pointing to older entries. This led to the subtle bug where checksum info is duplicated from a previous xl.json read of a different file from a different disk.	2019-09-06 04:21:27 +05:30
Harshavardhana	9ca7470ccc	Avoid using jsoniter, move to fastjson (#8063 ) This is to avoid using unsafe.Pointer type code dependency for MinIO, this causes crashes on ARM64 platforms Refer #8005 collection of runtime crashes due to unsafe.Pointer usage incorrectly. We have seen issues like this before when using jsoniter library in the past. This PR hopes to fix this using fastjson	2019-08-19 08:35:52 -10:00
Harshavardhana	e6d8e272ce	Use const slashSeparator instead of "/" everywhere (#8028 )	2019-08-06 12:08:58 -07:00
Daryl Finlay	9389a55e5d	Cancel PutObjectPart on upload abort (#7940 ) Calling ListMultipartUploads fails if an upload is aborted while a part is being uploaded because the directory for the upload exists (since fsRenameFile ends up calling os.MkdirAll) but the meta JSON file doesn't. To fix this we make sure an upload hasn't been aborted during PutObjectPart by checking the existence of the directory for the upload while moving the temporary part file into it.	2019-07-22 22:36:15 -07:00
Krishna Srinivas	338e9a9be9	Put object client disconnect (#7824 ) Fail putObject and postpolicy in case client prematurely disconnects Use request's context to cancel lock requests on client disconnects	2019-06-28 22:09:17 -07:00
Harshavardhana	72929ec05b	Turn off md5sum optionally if content-md5 is not set (#7609 ) This PR also brings a --compat option to run MinIO in strict S3 compatibility mode; MinIO by default will now try to run in high performance mode.	2019-05-08 18:35:40 -07:00
Krishnan Parthasarathi	35ef5eb236	Don't exit background append if backend specific files show up (#7519 )	2019-04-12 15:51:32 -07:00
kannappanr	5ecac91a55	Replace Minio refs in docs with MinIO and links (#7494 )	2019-04-09 11:39:42 -07:00
poornas	023866642c	canonicalize ETag correctly (#7442 ) Fixes #7441 Trim extra quotes prefixing/suffixing ETag in CompleteMultipartUpload request.	2019-04-01 12:19:52 -07:00
Harshavardhana	c184038b6a	Add proper custom errors object creations (#7387 ) In scenario 1 ``` - bucket/object-prefix - bucket/object-prefix/object ``` Server responds with `XMinioParentIsObject` In scenario 2 ``` - bucket/object-prefix/object - bucket/object-prefix ``` Server responds with `XMinioObjectExistsAsDirectory` Fixes #6566	2019-03-20 13:06:53 -07:00
poornas	40b8d11209	Move metadata into ObjectOptions for NewMultipart and PutObject (#7060 )	2019-02-09 11:01:06 +05:30
poornas	5a80cbec2a	Add double encryption at S3 gateway. (#6423 ) This PR adds pass-through, single encryption at gateway and double encryption support (gateway encryption with pass through of SSE headers to backend). If KMS is set up (either with Vault as KMS or using MINIO_SSE_MASTER_KEY), gateway will automatically perform single encryption. If MINIO_GATEWAY_SSE is set up in addition to Vault KMS, double encryption is performed. When neither KMS nor MINIO_GATEWAY_SSE is set, do a pass through to backend. When double encryption is specified, MINIO_GATEWAY_SSE can be set to "C" for SSE-C encryption at gateway and backend, "S3" for SSE-S3 encryption at gateway/backend or both to support more than one option. Fixes #6323, #6696	2019-01-05 14:16:42 -08:00
Ashish Kumar Sinha	9bb88e610e	Deletion of subfolders of multipart (#6961 ) Delete subfolders under multipart folder upon completion of CompleteMultipartUpload, AbortMultipartUpload and cleanupStaleMultipartUploads functions	2018-12-19 11:27:10 -08:00
poornas	5f6d717b7a	Fix: Preserve MD5Sum for SSE encrypted objects (#6680 ) To conform with AWS S3 Spec on ETag for SSE-S3 encrypted objects, encrypt client sent MD5Sum and store it on backend as ETag. Extend this behavior to SSE-C encrypted objects.	2018-11-14 17:36:41 -08:00
Aarushi Arya	7c2ae4eaf7	Remove tmp file and multipart folder in FS mode. (#6677 ) Fixes #6588	2018-10-22 07:36:30 -07:00
Harshavardhana	839a758a36	Fix a crash due to race between Abort/CompleteMultipart (#6544 ) Fixes #6429	2018-10-01 09:50:09 -07:00
Praveen raj Mani	ce9d36d954	Add object compression support (#6292 ) Add support for streaming (golang/LZ77/snappy) compression.	2018-09-28 09:06:17 +05:30
Anis Elleuch	aa4e2b1542	Use GetObjectNInfo in CopyObject and CopyObjectPart (#6489 )	2018-09-25 12:39:46 -07:00
poornas	5c0b98abf0	Add ObjectOptions to ObjectLayer calls (#6382 )	2018-09-10 09:42:43 -07:00
Harshavardhana	2debe77586	Remove error returned when part sizes are un-equal (#6183 ) Since a `pwrite`-like implementation would require more complex code than the background append implementation, it is better to keep the current code as is and not implement `pwrite` based functionality. Closes #4881	2018-07-24 21:31:03 -07:00
Harshavardhana	e5e522fc61	docs: fix all Chinese doc links for the new docs site (#6097 ) Additionally fix typos, default to US locale words	2018-06-28 16:02:02 -07:00
Anis Elleuch	9439dfef64	Use defer style to stop tickers to avoid current/possible misuse (#5883 ) This commit ensures that all tickers are stopped using defer ticker.Stop() style. This will also fix one bug seen when a client starts to listen to event notifications, a case that results in a leak of tickers.	2018-05-04 10:43:20 -07:00


137 Commits