Commit Graph

120 Commits

Author SHA1 Message Date
Harshavardhana
1f8b9b4bd5 fix: do not listAndHeal() inline with PutObject() (#17499)
there is a possibility that slow drives can actually add latency
to the overall call, leading to a large spike in latency.

this can happen if there are other parallel listObjects()
calls to the same drive, in-turn causing each other to sort
of serialize.

this potentially improves performance and makes PutObject()
also non-blocking.
2023-06-24 19:31:04 -07:00
Harshavardhana
64de61d15d fallback on etags if they match when mtime is not same (#17424)
on "unversioned" buckets there are situations
when successive concurrent I/O can lead to
an inconsistent state() with mtime while the
etag might be the same for the object on disk.

in such a scenario it is possible for us to
allow reading of the object since etag matches
and if etag matches we are guaranteed that we
have enough copies the object will be readable
and same.

This PR allows fallback in such scenarios.
2023-06-17 19:18:20 -07:00
Harshavardhana
75c6fc4f02 only allow decryption of etag for only sse-s3 (#17335) 2023-06-05 13:08:51 -07:00
Harshavardhana
1cd7f1e38d fix: cleanup empty multipart folders upon stale upload cleanup (#17312) 2023-05-30 09:56:50 -07:00
Harshavardhana
394690dcfb check for upto 50%+ data disks to be offline (#17294) 2023-05-26 22:56:19 -07:00
Klaus Post
c839b64f6a fix: compressed+encrypted block overhead (#17289) 2023-05-26 10:57:07 -07:00
Anis Eleuch
54c5c88fe6 Add number of offline disks in quorum errors (#16822) 2023-05-25 09:39:06 -07:00
Harshavardhana
ef54200db7 offline drives more than 50% of total drives return error (#17252) 2023-05-23 07:57:57 -07:00
Harshavardhana
f7d29b4a53 cleanup of multipart per disk must cleanup itself only (#17223) 2023-05-17 01:45:58 -07:00
Klaus Post
76913a9fd5 Signed trailers for signature v4 (#16484) 2023-05-05 19:53:12 -07:00
Klaus Post
7fad0c8b41 Remove checksums from HTTP range request, add part checksums (#17105) 2023-04-28 08:26:32 -07:00
Praveen raj Mani
72802a5972 Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069) 2023-04-25 22:57:40 -07:00
Krishnan Parthasarathi
fae9000304 heal: Pick maximally occuring modTime in quorum (#17071) 2023-04-25 10:13:57 -07:00
Harshavardhana
84f31ed45d simplify MRF, converge it to regular healing (#17026) 2023-04-19 07:47:42 -07:00
Klaus Post
c133979b8e Add part count to checksum (#17035) 2023-04-14 09:44:45 -07:00
Klaus Post
9acf1024e4 Remove bloom filter (#16682)
Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner.

Also removes a tiny bit of CPU by each write operation.
2023-02-24 09:03:31 +05:30
Anis Elleuch
acc9c033ed debug: Add X-Amz-Request-ID to lock/unlock calls (#16309) 2022-12-23 19:49:07 -08:00
Anis Elleuch
89db3fdb5d Do not return an error when version disparity is detected (#16269) 2022-12-16 08:52:12 -08:00
Harshavardhana
444ff20bc5 do not rename multipart failed transactions back to tmp (#16204) 2022-12-12 01:40:29 -08:00
Klaus Post
12fd6678ee Encrypt checksums with KMS on CompleteMultipartUpload (#16177) 2022-12-07 10:18:18 -08:00
Poorna
63fc6ba2cd preserve replicated ETag properly on target (#16129) 2022-11-26 14:43:32 -08:00
Klaus Post
98ba622679 Reduce temporary file clean-up waits (#16110) 2022-11-22 07:23:36 -08:00
Harshavardhana
6aea950d74 avoid partID lock validating uploadID exists prematurely (#16086) 2022-11-18 03:09:35 -08:00
Aditya Manthramurthy
5f1999cc71 fix: avoid URL unsafe chars in multipart upload ID (#16034) 2022-11-09 16:41:16 -08:00
Harshavardhana
fd6f6fc8df cleanup stale parent multipart directories (#15980) 2022-11-01 08:00:02 -07:00
Klaus Post
86420a1f46 Store multipart checksums (#15953) 2022-10-26 18:14:58 -07:00
Poorna
ce8456a1a9 proxy multipart to peers via multipart uploadID (#15926) 2022-10-25 10:52:29 -07:00
Harshavardhana
328d660106 support CRC32 Checksums on single drive setup (#15873) 2022-10-15 11:58:47 -07:00
Harshavardhana
b04c0697e1 validate correct ETag for the parts sent during CompleteMultipart (#15751) 2022-09-23 21:17:08 -07:00
Harshavardhana
9e5853ecc0 optimize double reads by reusing results from checkUploadIDExists() (#15692)
Move to using `xl.meta` data structure to keep temporary partInfo,
this allows for a future change where we move to different parts to
different drives.
2022-09-15 12:43:49 -07:00
Harshavardhana
124544d834 add pre-conditions support for PUT calls during replication (#15674)
PUT shall only proceed if pre-conditions are met, the new
code uses

- x-minio-source-mtime
- x-minio-source-etag

to verify if the object indeed needs to be replicated
or not, allowing us to avoid StatObject() call.
2022-09-14 18:44:04 -07:00
Klaus Post
8e4a45ec41 fix: encrypt checksums in metadata (#15620) 2022-08-31 08:13:23 -07:00
Klaus Post
a9f1ad7924 Add extended checksum support (#15433) 2022-08-29 16:57:16 -07:00
Anis Elleuch
b3edb25377 bloom: healObject to mark a path dirty only for dangling objects (#15458)
The path is marked dirty automatically when healObject() is called, which is
wrong. HealObject() is called during self-healing and this will lead to
an increase in the false positive result of the bloom filter.

Also move NSUpdated() from renameData() and call it directly in
CompleteMultipart and PutObject, this is not a functional change but
it will make it less prone to errors in the future.
2022-08-02 16:57:39 -07:00
Klaus Post
69bf39f42e fix: make complete multipart uploads faster encrypted/compressed backends (#15375)
- Only fetch the parts we need and abort as soon as one is missing.
- Only fetch the number of parts requested by "ListObjectParts".
2022-07-21 16:47:58 -07:00
Harshavardhana
ce8397f7d9 use partInfo only for intermediate part.x.meta (#15353) 2022-07-19 18:56:24 -07:00
Klaus Post
f939d1c183 Independent Multipart Uploads (#15346)
Do completely independent multipart uploads.

In distributed mode, a lock was held to merge each multipart 
upload as it was added. This lock was highly contested and 
retries are expensive (timewise) in distributed mode.

Instead, each part adds its metadata information uniquely. 
This eliminates the per object lock required for each to merge.
The metadata is read back and merged by "CompleteMultipartUpload" 
without locks when constructing final object.

Co-authored-by: Harshavardhana <harsha@minio.io>
2022-07-19 08:35:29 -07:00
Harshavardhana
7da9e3a6f8 support encrypted/compressed objects properly during decommission (#15320)
fixes #15314
2022-07-16 19:35:24 -07:00
Anis Elleuch
876970baea Exclude upload-ids with incomplete part upload in multipart listing (#15318)
Uploading a part object can leave an inconsistent state inside
.minio.sys/multipart where data are uploaded but xl.meta is not
committed yet.

Do not list upload-ids that have this state in the multipart listing.
2022-07-16 13:25:58 -07:00
Harshavardhana
1b339ea062 allow force delete on decom pool (#15302)
Bonus:

- skip suspended pool from being
  considered for multipart uploads

- add more context for decomErrors()
2022-07-14 20:44:22 -07:00
Harshavardhana
236ef03dbd fix: skip objects expired via lifecycle rules during decommission (#15300) 2022-07-14 16:47:09 -07:00
Klaus Post
911a17b149 Add compressed file index (#15247) 2022-07-11 17:30:56 -07:00
Praveen raj Mani
b49fc33cb3 purge objects immediately with x-minio-force-delete in DeleteObject and DeleteBucket API (#15148) 2022-07-11 09:15:54 -07:00
Anis Elleuch
54a061bdda Save minio version information centrally (#15181) 2022-06-29 14:45:49 -07:00
Harshavardhana
9c605ad153 allow support for parity '0', '1' enabling support for 2,3 drive setups (#15171)
allows for further granular setups

- 2 drives (1 parity, 1 data)
- 3 drives (1 parity, 2 data)

Bonus: allows '0' parity as well.
2022-06-27 20:22:18 -07:00
Harshavardhana
6722f58668 save MinIO version with each version (8-bytes extra) (#15170)
store MinIO version along with each version in 'xl.meta'
for future purposes, can be used as ways to add specific
code for bug fixes if any.
2022-06-27 03:59:41 -07:00
Harshavardhana
52221db7ef fix: for unexpected errors in reading versioning config panic (#14994)
We need to make sure if we cannot read bucket metadata
for some reason, and bucket metadata is not missing and
returning corrupted information we should panic such
handlers to disallow I/O to protect the overall state
on the system.

In-case of such corruption we have a mechanism now
to force recreate the metadata on the bucket, using
`x-minio-force-create` header with `PUT /bucket` API
call.

Additionally fix the versioning config updated state
to be set properly for the site replication healing
to trigger correctly.
2022-05-31 02:57:57 -07:00
Harshavardhana
c7df1ffc6f avoid concurrent reads and writes to opts.UserDefined (#14862)
do not modify opts.UserDefined after object-handler
has set all the necessary values, any mutation needed
should be done on a copy of this value not directly.

As there are other pieces of code that access opts.UserDefined
concurrently this becomes challenging.

fixes #14856
2022-05-05 04:14:41 -07:00
Anis Elleuch
44a3b58e52 Add audit log for decommissioning (#14858) 2022-05-04 00:45:27 -07:00
Klaus Post
64d4da5a37 Add Put input readahead (#14084)
When reading input for PutObject or PutObjectPart add a readahead buffer for big inputs.

This will make network reads+hashing separate run async with erasure coding and writes. This will reduce overall latency in distributed setups where the input is from upstream and writes go to other servers.

We will read at 2 buffers ahead, meaning one will always be ready/waiting and one is currently being read from.

This improves PutObject and PutObjectParts for these cases.
2022-01-14 10:01:25 -08:00