3021 Commits

Author SHA1 Message Date
Harshavardhana
29e7058ebf background delete operations and delete serially every 10mins
additionally introduce the MINIO_DELETE_CLEANUP_INTERVAL environment
variable to control this interval; choose a lower value if higher
speed is necessary. Supports time.Duration format

export MINIO_DELETE_CLEANUP_INTERVAL=1m

This would let MinIO run the delete cleanup every minute, once
the previous cycle completes.
2021-03-09 16:49:39 -08:00
Harshavardhana
f864931ab4 delete dangling objects automatically 2021-03-07 00:08:30 -08:00
Harshavardhana
96b1377863 add additional logs 2021-03-07 00:04:11 -08:00
Harshavardhana
bff2f9c733 set http2 for KES communication 2021-02-18 21:43:26 -08:00
Klaus Post
5f41f6043d Avoid synchronizing usage writes (#11560)
If the periodic `case <-t.C:` save gets held up for a long time it will end up
synchronizing all disk writes for saving the caches.

We add jitter to per set writes so they don't sync up and don't hold a
lock for the write, since it isn't needed anyway.

If an outage prevents writes for a long while we also add individual
waits for each disk in case there was a queue.

Furthermore limit the number of buffers kept to 2GiB, since this could get
huge in large clusters. This will not act as a hard limit but should be enough
for normal operation.
2021-02-18 21:34:18 -08:00
Ritesh H Shukla
21718705b8
turn off http2 for TLS setups for now (#11523) (#11569)
due to lots of issues with x/net/http2, as
well as the bundled h2_bundle.go in the Go
runtime, HTTP2 should be avoided for now.

https://github.com/golang/go/issues/23559
https://github.com/golang/go/issues/42534
https://github.com/golang/go/issues/43989
https://github.com/golang/go/issues/33425
https://github.com/golang/go/issues/29246

With such a collection of issues present, it
makes sense to remove HTTP2 support for now
2021-02-17 19:06:26 -08:00
Harshavardhana
53e0c16976 add bucket name to the log 2021-02-08 23:00:48 -08:00
Harshavardhana
fb78283c0a add GOMAXPROCS back 2021-02-08 22:29:10 -08:00
Harshavardhana
f07c9c58e7 fix: handle setIndexes properly 2021-02-08 22:25:06 -08:00
Harshavardhana
bc89e47066 remove GOMAXPROCS requirement 2021-02-08 21:54:00 -08:00
Harshavardhana
0615d85384 heal sets with optional prefix input 2021-02-05 11:15:42 -08:00
Harshavardhana
42157eb218 listing also match sets index for proper quorum 2021-02-01 22:48:08 -08:00
Harshavardhana
fa1cd6dcce heal multiple buckets in parallel 2021-02-01 22:45:34 -08:00
Harshavardhana
745a4b31ba add support for concurrent heals 2021-01-29 21:59:49 -08:00
Harshavardhana
5151c429e4 fix: add api level throttler for LIST calls 2021-01-28 22:59:15 -08:00
Klaus Post
dc1a46e5d2 crawler: Stream bucket usage cache data (#11068)
Stream bucket caches to storage and through RPC calls.
2021-01-25 21:27:28 -08:00
Harshavardhana
8724d49116 implement Heal sets API to heal erasure sets independently 2021-01-24 19:05:56 -08:00
Harshavardhana
123cfa7573 re-route requests if IAM is not initialized (#10850) 2020-11-08 18:35:33 -08:00
Klaus Post
2439d4fb3c Don't retain context in locker (#10515)
Use the context for internal timeouts, but disconnect it from outgoing
calls so we always receive the results and cancel it remotely.
2020-11-04 10:08:58 -08:00
Harshavardhana
6bd9057bb1 initialize IAM after etcd has initialized 2020-11-03 08:49:27 -08:00
Harshavardhana
2d878b7081 allow requests to be proxied when server is booting up (#10790)
when the server is booting up there is a possibility
that users might see '503' because the object layer
is not initialized; in that case the request is proxied
to the first neighboring peer which is online.
2020-10-31 19:38:23 -07:00
Harshavardhana
0570c21671 fix: replaced drive properly by healing the entire drive
Bonus fixes, we do not need reload format anymore
as the replaced drive is healed locally we only need
to ensure that drive heal reloads the drive properly.

We preserve the UUID of the original order, this means
that the replacement in `format.json` doesn't mean that
the drive needs to be reloaded into memory anymore.

fixes #10791
2020-10-31 00:30:14 -07:00
Klaus Post
2c0a81bc91 Optimize decryptObjectInfo (#10726)
`decryptObjectInfo` is a significant bottleneck when listing objects.

Reduce the allocations for a significant speedup.

https://github.com/minio/sio/pull/40

```
λ benchcmp before.txt after.txt
benchmark                          old ns/op     new ns/op     delta
Benchmark_decryptObjectInfo-32     24260928      808656        -96.67%

benchmark                          old MB/s     new MB/s     speedup
Benchmark_decryptObjectInfo-32     0.04         1.24         31.00x

benchmark                          old allocs     new allocs     delta
Benchmark_decryptObjectInfo-32     75112          48996          -34.77%

benchmark                          old bytes     new bytes     delta
Benchmark_decryptObjectInfo-32     287694772     4228076       -98.53%
```
2020-10-31 00:19:53 -07:00
Klaus Post
b0698b4b98 rest client: Expect context timeouts for locks (#10782)
Add option for rest clients to not mark a remote offline for context timeouts.

This can be used if context timeouts are expected on the call.
2020-10-29 10:15:35 -07:00
Harshavardhana
7ec6214e6e fix: A possible crash when fi.Erasure.Distribution is empty (#10779) 2020-10-28 21:00:36 -07:00
Krishna Srinivas
f53c5a020e
fix: heal object shards with ec.index and ec.distribution mismatches (#10773)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-10-28 00:10:20 -07:00
Harshavardhana
5b30bbda92
fix: add more protection distribution to match EcIndex (#10772)
allows for stricter validation in picking up the right
set of disks for reconstruction.
2020-10-28 00:09:15 -07:00
Shireesh Anjal
858e2a43df
Remove logging info from OBDInfoHandler (#10727)
A lot of logging data is counterproductive. A better implementation with
precise useful log data can be introduced later.
2020-10-27 17:41:48 -07:00
Kaloyan Raev
df9894e275
avoid caching http ranges in background goroutine (#10724) 2020-10-26 23:04:48 -07:00
Krishna Srinivas
592f2f23a3
fix: heal rejects objects with disk re-ordering issue (#10766) 2020-10-26 18:48:47 -07:00
Krishna Srinivas
c49a80db41
fix: use meta.Erasure.Index for GetObject() to reconstruct object (#10764) 2020-10-26 16:19:42 -07:00
Poorna Krishnamoorthy
46275c6547
cache: rename function declarations (#10763) 2020-10-26 15:41:24 -07:00
Poorna Krishnamoorthy
0994ed9783
cache: fix call in GetObjectNInfo (#10762)
Fixes: #10751
2020-10-26 12:30:40 -07:00
Anis Elleuch
eb95353cb1
fix: Get/HeadObject return 404 on non quorum objects (#10753) 2020-10-26 10:30:46 -07:00
Harshavardhana
029758cb20
fix: retain the previous UUID for newly replaced drives (#10759)
only newly replaced drives get the new `format.json`,
this avoids disks reloading their in-memory reference
format, ensures that drives are online without
reloading the in-memory reference format.

keeping the reference format intact means UUIDs
never change once they are formatted.
2020-10-26 10:29:29 -07:00
Harshavardhana
646d6917ed
turn-off checking for updates completely if MINIO_UPDATE=off (#10752) 2020-10-24 22:39:44 -07:00
Harshavardhana
d9db7f3308
expire lockers if lockers are offline (#10749)
lockers currently might leave stale lockers,
in unknown ways waiting for downed lockers.

locker check interval is high enough to safely
cleanup stale locks.
2020-10-24 13:23:16 -07:00
Harshavardhana
6a8c62f9fd
make sure to preserve UUID from reference format (#10748)
reference format should be source of truth
for inconsistent drives which reconnect,
add them back to their original position

remove automatic fix for existing offline
disk uuids
2020-10-24 13:23:08 -07:00
Anis Elleuch
00124c56d9
erasure: Commit data before xl.meta in RenameData() (#10734)
This will reduce the chance of having an updated xl.meta without data.
2020-10-23 21:54:58 -07:00
Anis Elleuch
2c32c2149e
tests: Avoid running TestNSRace in short test mode (#10735) 2020-10-23 21:23:12 -07:00
Harshavardhana
734f258878
fix: slow down auto healing more aggressively (#10730)
Bonus fixes

- logging improvements to ensure that we don't use
  `go logger.LogIf` to avoid runtime.Caller missing
  the function name. log where necessary.
- remove unused code at erasure sets
2020-10-22 13:36:24 -07:00
Anis Elleuch
0e0c53bba4
tests: Lower expectation in addr selection in rand cache dialer (#10739)
Test TestDialContextWithDNSCacheRand was failing sometimes because it depends
on a random selection of addresses when testing random DNS resolution from cache.

Lower addr selection expectation to 10%
2020-10-22 09:35:32 -07:00
Poorna Krishnamoorthy
5cc23ae052
validate if iam store is initialized (#10719)
Fixes panic - regression from d6d770c1b16670771640d606690f05d63c5dbea4
2020-10-20 21:28:24 -07:00
Harshavardhana
d6d770c1b1 initialize object layer right after config has loaded 2020-10-19 22:04:59 -07:00
Harshavardhana
b07df5cae1
initialize IAM as soon as object layer is initialized (#10700)
Allow requests to come in for users as soon as object
layer and config are initialized, this allows users
to be authenticated sooner and would succeed automatically
on servers which are yet to fully initialize.
2020-10-19 09:54:40 -07:00
Harshavardhana
c107728676
fix: s3 gateway DNS cache initialization (#10706)
fixes #10705
2020-10-19 01:34:23 -07:00
Anis Elleuch
284a2b9021
ilm: Send delete marker creation event when appropriate (#10696)
Before this commit, the crawler ILM would always send an object delete event
notification, even when this was wrong.
2020-10-16 21:22:12 -07:00
Ritesh H Shukla
0b53e30ecb
Clean up monitor on delete bucket (#10698) 2020-10-16 17:59:31 -07:00
Harshavardhana
bd2131ba34
add DNS cache support to avoid DNS flooding (#10693)
Go stdlib resolver doesn't support caching DNS
resolutions, since we compile with CGO disabled
we are more prone to DNS flooding for all network
calls to resolve for DNS from the DNS server.

Under various containerized environments such as
VMWare this becomes a problem because there are
no DNS caches available and we may end up overloading
the kube-dns resolver under concurrent I/O.

To circumvent this issue, implement a DNSCache resolver
which resolves DNS names and caches the results for around 10 seconds,
with invalidation attempted every 3 seconds.
2020-10-16 14:49:05 -07:00
ebozduman
1aec168c84
fix: azure gateway should reject bucket names with "." (#10635) 2020-10-16 09:30:18 -07:00