Commit Graph

143 Commits

Author SHA1 Message Date
Harshavardhana 1dd8ef09a6
remove unnecessary 'recreate' code (#19136) 2024-02-27 01:47:58 -08:00
Harshavardhana 6d381f7c0a
relax pre-emptive GetBucketInfo() for multi-object delete (#19035) 2024-02-12 08:46:46 -08:00
Harshavardhana 404d8b3084
fix: dangling objects honor parityBlocks instead of dataBlocks (#19019)
Bonus: do not recreate buckets if NoRecreate is asked.
2024-02-08 15:22:16 -08:00
Anis Eleuch ba975ca320
Add defensive code to ignore checking parts with transitioned objects (#18973)
Though dataErrs are nil with transitioned objects, add a more defensive
code to ignore counting missing parts in that case
2024-02-05 10:48:03 -08:00
Harshavardhana f225ca3312
Add more advanced cases for dangling (#18968) 2024-02-04 14:36:13 -08:00
Anis Eleuch 6ae97aedc9
xl: Disable rename2 in decommissioning/rebalance (#18964)
Always disable rename2 optimization in decom/rebalance
2024-02-03 14:03:30 -08:00
Harshavardhana 80ca120088
remove checkBucketExist check entirely to avoid fan-out calls (#18917)
Each Put, List, Multipart operations heavily rely on making
GetBucketInfo() call to verify if bucket exists or not on
a regular basis. This has a large performance cost when there
are tons of servers involved.

We did optimize this part by vectorizing the bucket calls,
however its not enough, beyond 100 nodes and this becomes
fairly visible in terms of performance.
2024-01-30 12:43:25 -08:00
Harshavardhana 708cebe7f0
add necessary protection err, fileInfo slice reads and writes (#18854)
protection was in place. However, it covered only some
areas, so we re-arranged the code to ensure we could hold
locks properly.

Along with this, remove the DataShardFix code altogether,
in deployments with many drive replacements, this can affect
and lead to quorum loss.
2024-01-24 01:08:23 -08:00
Harshavardhana dd2542e96c
add codespell action (#18818)
Original work here, #18474,  refixed and updated.
2024-01-17 23:03:17 -08:00
Harshavardhana 21d60eab7c
remove all older unused APIs (#18769) 2024-01-17 20:41:23 -08:00
Harshavardhana 38637897ba
fix: listing SSE encrypted multipart objects (#18786)
GetActualSize() was heavily relying on o.Parts()
to be non-empty to figure out if the object is multipart or not, 
However, we have many indicators of whether an object is multipart 
or not.

Blindly assuming that o.Parts == nil is not a multipart, is an 
incorrect expectation instead, multipart must be obtained via

- Stored metadata value indicating this is a multipart encrypted object.

- Rely on <meta>-actual-size metadata to get the object's actual size.
  This value is preserved for additional reasons such as these.

- ETag != 32 length
2024-01-15 00:57:49 -08:00
Shubhendu e31081d79d
Heal buckets at node level (#18612)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2024-01-09 20:34:04 -08:00
Harshavardhana a50ea92c64
feat: introduce list_quorum="auto" to prefer quorum drives (#18084)
NOTE: This feature is not retro-active; it will not cater to previous transactions
on existing setups. 

To enable this feature, please set ` _MINIO_DRIVE_QUORUM=on` environment
variable as part of systemd service or k8s configmap. 

Once this has been enabled, you need to also set `list_quorum`. 

```
~ mc admin config set alias/ api list_quorum=auto` 
```

A new debugging tool is available to check for any missing counters.
2023-12-29 15:52:41 -08:00
Harshavardhana 196e7e072b
allow bitrot files to be healed in MRF (#18618)
bitrot scanMode was ignored in MRF,
allow it to heal relevant content if
needed when seen as an error.
2023-12-08 12:26:01 -08:00
Harshavardhana e30c0e7ca3 Revert "Heal buckets at node level (#18504)"
This reverts commit 708296ae1b.
2023-12-05 22:34:46 -08:00
Shubhendu 708296ae1b
Heal buckets at node level (#18504) 2023-12-05 02:17:35 -08:00
Klaus Post 5f971fea6e
Fix Mux Connect Error (#18567)
`OpMuxConnectError` was not handled correctly.

Remove local checks for single request handlers so they can 
run before being registered locally.

Bonus: Only log IAM bootstrap on startup.
2023-12-01 00:18:04 -08:00
Anis Eleuch b7d11141e1
rename Force to Immediate for clarity (#18540) 2023-11-28 22:35:16 -08:00
Klaus Post e6b0fc465b
tweak healing to include version-id in healing result (#18225) 2023-11-22 12:30:31 -08:00
Harshavardhana a4cfb5e1ed
return errors if dataDir is missing during HeadObject() (#18477)
Bonus: allow replication to attempt Deletes/Puts when
the remote returns quorum errors of some kind, this is
to ensure that MinIO can rewrite the namespace with the
latest version that exists on the source.
2023-11-20 21:33:47 -08:00
Klaus Post 7926df0b80
Fix globalDeploymentID race (#18275)
globalDeploymentID was being read while it was being set.

Fixes race:

```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
  github.com/minio/minio/cmd.connectLoadInitFormats()
      github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
  github.com/minio/minio/cmd.waitForFormatErasure()
      github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...

Previous read at 0x0000079605a0 by goroutine 105:
  github.com/minio/minio/cmd.newContext()
      github.com/minio/minio/cmd/utils.go:817 +0x31e
  github.com/minio/minio/cmd.adminMiddleware.func1()
      github.com/minio/minio/cmd/admin-router.go:110 +0x96
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
  github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
      github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
...
```
2023-10-18 08:06:57 -07:00
Harshavardhana c34bdc33fb
make sure to set Versioned field to ensure rename2 is not called (#18141)
without this the rename2() can rename the previous dataDir
causing issues for different versions of the object, only
latest version is preserved due to this bug.

Added healing code to ensure recovery of such content.
2023-09-29 09:08:24 -07:00
Aditya Manthramurthy 1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Harshavardhana 18b3655c99
with xlv2 format we never had to fill in checksumInfo() (#17963)
- this PR avoids sending a large ChecksumInfo slice
  when its not needed

- also for a file with XLV2 format there is no reason
  to allocate Checksum slice while reading
2023-09-01 13:45:58 -07:00
Harshavardhana 9458485e43
avoid double logging from healing (#17950) 2023-08-30 18:46:04 -07:00
Harshavardhana 45fb375c41
allow healing to prefer local disks over remote (#17788) 2023-08-03 02:18:18 -07:00
Kaan Kabalak 21fbe88e1f
Print certain log messages once per error (#17484) 2023-06-24 20:29:13 -07:00
Harshavardhana 1f8b9b4bd5
fix: do not listAndHeal() inline with PutObject() (#17499)
there is a possibility that slow drives can actually add latency
to the overall call, leading to a large spike in latency.

this can happen if there are other parallel listObjects()
calls to the same drive, in-turn causing each other to sort
of serialize.

this potentially improves performance and makes PutObject()
also non-blocking.
2023-06-24 19:31:04 -07:00
Aditya Manthramurthy 5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Harshavardhana 1443b5927a
allow quorum fileInfo to pick same parityBlocks (#17454)
Bonus: allow replication to proceed for 503 errors such as
with error code SlowDownRead
2023-06-18 18:20:15 -07:00
Harshavardhana 64de61d15d
fallback on etags if they match when mtime is not same (#17424)
on "unversioned" buckets there are situations
when successive concurrent I/O can lead to
an inconsistent state() with mtime while the
etag might be the same for the object on disk.

in such a scenario it is possible for us to
allow reading of the object since etag matches
and if etag matches we are guaranteed that we
have enough copies the object will be readable
and same.

This PR allows fallback in such scenarios.
2023-06-17 19:18:20 -07:00
Anis Eleuch ae95384dd8
Revert "heal: Update object parity with the latest configured SC (#17187)" (#17404) 2023-06-12 11:54:51 -07:00
Anis Eleuch e2b7a08c10
heal: Update object parity with the latest configured SC (#17187) 2023-05-15 21:32:13 -07:00
Praveen raj Mani 72802a5972
Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069) 2023-04-25 22:57:40 -07:00
Krishnan Parthasarathi fae9000304
heal: Pick maximally occuring modTime in quorum (#17071) 2023-04-25 10:13:57 -07:00
Krishnan Parthasarathi 25f7a8e406
Indicate RenameData is called by healObject (#16997) 2023-04-09 10:25:37 -07:00
Harshavardhana 6c11dbffd5
add crash protection from backend modifications (#16846) 2023-03-20 09:08:42 -07:00
ferhat elmas 714283fae2
cleanup ignored static analysis (#16767) 2023-03-06 08:56:10 -08:00
Klaus Post 9acf1024e4
Remove bloom filter (#16682)
Removes the bloom filter since it has so limited usability, often gets saturated anyway and adds a bunch of complexity to the scanner.

Also removes a tiny bit of CPU by each write operation.
2023-02-24 09:03:31 +05:30
Klaus Post fd6622458b
Add detailed scanner trace output and notifications (#16668) 2023-02-21 09:33:33 -08:00
Harshavardhana 72daccd468
fix: scanner in healing cycle must use actual size (#16589) 2023-02-10 06:53:03 -08:00
Poorna b22b39de96
Avoid dangling deletes if disk not found (#16401) 2023-01-12 22:20:19 -08:00
Harshavardhana a15a2556c3
converge listBuckets() as a peer call (#16346) 2023-01-03 23:39:40 -08:00
Anis Elleuch acc9c033ed
debug: Add X-Amz-Request-ID to lock/unlock calls (#16309) 2022-12-23 19:49:07 -08:00
Klaus Post 70986b6e6e
Add version id to healresult (#16193) 2022-12-08 07:49:10 -08:00
Aditya Manthramurthy a30cfdd88f
Bump up madmin-go to v2 (#16162) 2022-12-06 13:46:50 -08:00
Klaus Post cc1d8f0057
Check for abandoned data when healing (#16122) 2022-11-28 10:20:55 -08:00
Harshavardhana fd6f6fc8df
cleanup stale parent multipart directories (#15980) 2022-11-01 08:00:02 -07:00
Harshavardhana 136d41775f
remove numAvailableDisks check as it doesn't serve any purpose (#15954) 2022-10-27 09:05:24 -07:00
Harshavardhana 2d9b5a65f1
verify RenameData() versions to be consistent (#15649)
xl.meta gets written and never rolled back, however
we definitely need to validate the state that is
persisted on the disk, if there are inconsistencies

- more than write quorum we should return an error
  to the client

- if write quorum was achieved however there are
  inconsistent xl.meta's we should simply trigger
  an MRF on them
2022-09-05 16:51:37 -07:00