Commit Graph

166 Commits

Author SHA1 Message Date
Klaus Post
974073a2e5
directio: Check if buffers are set. (#13440)
Check if directio buffers have actually been fetched and prevent errors on double Close. Return error on Read after Close.

Fixes

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xf8582f]

goroutine 210 [running]:
github.com/minio/minio/internal/ioutil.(*ODirectReader).Read(0xc0054f8320, {0xc0014560b0, 0xa8, 0x44d012})
	github.com/minio/minio/internal/ioutil/odirect_reader.go:88 +0x10f
io.ReadAtLeast({0x428c5c0, 0xc0054f8320}, {0xc0014560b0, 0xa8, 0xa8}, 0xa8)
	io/io.go:328 +0x9a
io.ReadFull(...)
	io/io.go:347
github.com/minio/minio/internal/ioutil.ReadFile({0xc001bf60e0, 0x6})
	github.com/minio/minio/internal/ioutil/read_file.go:48 +0x19b
github.com/minio/minio/cmd.(*FSObjects).scanBucket.func1({{0xc00444e1e0, 0x4d}, 0x0, {0xc0040cf240, 0xe}, {0xc0040cf24f, 0x18}, {0xc0040cf268, 0x18}, 0x0, ...})
	github.com/minio/minio/cmd/fs-v1.go:366 +0x1ea
github.com/minio/minio/cmd.(*folderScanner).scanFolder.func1({0xc00474a6a8, 0xc0065d6793}, 0x0)
	github.com/minio/minio/cmd/data-scanner.go:494 +0xb15
github.com/minio/minio/cmd.readDirFn({0xc002803e80, 0x34}, 0xc000670270)
	github.com/minio/minio/cmd/os-readdir_unix.go:172 +0x638
github.com/minio/minio/cmd.(*folderScanner).scanFolder(0xc002deeb40, {0x42dc9d0, 0xc00068cbc0}, {{0xc001c6e2d0, 0x27}, 0xc0023db8e0, 0x1}, 0xc0001c7ab0)
	github.com/minio/minio/cmd/data-scanner.go:427 +0xa8f
github.com/minio/minio/cmd.(*folderScanner).scanFolder.func2({{0xc001c6e2d0, 0x27}, 0xc0023db8e0, 0x27})
	github.com/minio/minio/cmd/data-scanner.go:549 +0xd0
github.com/minio/minio/cmd.(*folderScanner).scanFolder(0xc002deeb40, {0x42dc9d0, 0xc00068cbc0}, {{0xc0013fa9e0, 0xe}, 0x0, 0x1}, 0xc000670dd8)
	github.com/minio/minio/cmd/data-scanner.go:623 +0x205d
github.com/minio/minio/cmd.scanDataFolder({_, _}, {_, _}, {{{0xc0013fa9e0, 0xe}, 0x802, {0x210f15d2, 0xed8f903b8, 0x5bc0e80}, ...}, ...}, ...)
	github.com/minio/minio/cmd/data-scanner.go:333 +0xc51
github.com/minio/minio/cmd.(*FSObjects).scanBucket(_, {_, _}, {_, _}, {{{0xc0013fa9e0, 0xe}, 0x802, {0x210f15d2, 0xed8f903b8, ...}, ...}, ...})
	github.com/minio/minio/cmd/fs-v1.go:364 +0x305
github.com/minio/minio/cmd.(*FSObjects).NSScanner(0x42dc9d0, {0x42dc9d0, 0xc00068cbc0}, 0x0, 0xc003bcfda0, 0x802)
	github.com/minio/minio/cmd/fs-v1.go:307 +0xa16
github.com/minio/minio/cmd.runDataScanner({0x42dc9d0, 0xc00068cbc0}, {0x436a6c0, 0xc000bfcf50})
	github.com/minio/minio/cmd/data-scanner.go:150 +0x749
created by github.com/minio/minio/cmd.initDataScanner
	github.com/minio/minio/cmd/data-scanner.go:73 +0xb0
```
2021-10-14 10:19:17 -07:00
Krishna Srinivas
03a2a74697
Support speedtest autotune on the server side (#13086) 2021-09-10 17:43:34 -07:00
Harshavardhana
e124d88788
optimize listing operation concurrency (#12728)
- remove use of getOnlineDisks() instead rely on fallbackDisks()
  when disk return errors like diskNotFound, unformattedDisk
  use other fallback disks to list from, instead of paying the
  price for checking getOnlineDisks()

- optimize getDiskID() further to avoid large write locks when
  looking formatLastCheck time window

This new change allows for a more relaxed fallback for listing
allowing for more tolerance and also eventually gain more
consistency in results even if using '3' disks by default.
2021-07-24 22:03:38 -07:00
Anis Elleuch
23ef25b57a
profiling: Return goroutines with sleep duration (#12775)
Add a new goroutine file which has another printing format. We need it
to see how much time each goroutine was blocked. Easier to detect stops.

Co-authored-by: Anis Elleuch <anis@min.io>
2021-07-23 13:16:53 -07:00
Poorna Krishnamoorthy
a69c2a2fb3
Change replication to use read lock instead of writelock (#12581)
Fixes #12573

This PR also adding audit logging for replication activity
2021-06-28 23:58:08 -07:00
Poorna Krishnamoorthy
d00783c923
Use rate.Limiter for bandwidth monitoring (#12506)
Bonus: fixes a hang when bandwidth caps are enabled for
synchronous replication
2021-06-24 18:29:30 -07:00
Harshavardhana
cdeccb5510
feat: Deprecate embedded browser and import console (#12460)
This feature also changes the default port where
the browser is running, now the port has moved
to 9001 and it can be configured with

```
--console-address ":9001"
```
2021-06-17 20:27:04 -07:00
Harshavardhana
1f262daf6f
rename all remaining packages to internal/ (#12418)
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`
2021-06-01 14:59:40 -07:00
Harshavardhana
81d5688d56
move the dependency to minio/pkg for common libraries (#12397) 2021-05-28 15:17:01 -07:00
Andreas Auernhammer
d8eb7d3e15
kms: replace KES client implementation with minio/kes (#12207)
This commit replaces the custom KES client implementation
with the KES SDK from https://github.com/minio/kes

The SDK supports multi-server client load-balancing and
requests retry out of the box. Therefore, this change reduces
the overall complexity within the MinIO server and there
is no need to maintain two separate client implementations.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-10 18:15:11 -07:00
Harshavardhana
1aa5858543
move madmin to github.com/minio/madmin-go (#12239) 2021-05-06 08:52:02 -07:00
Harshavardhana
069432566f update license change for MinIO
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Harshavardhana
bb1198c2c6
revert CreateFile waitForResponse (#12124)
instead use expect continue timeout, and have
higher response header timeout, the new higher
timeout satisfies worse case scenarios for total
response time on a CreateFile operation.

Also set the "expect" continue header to satisfy
expect continue timeout behavior.

Some clients seem to cause CreateFile body to be
truncated, leading to no errors which instead
fails with ObjectNotFound on a PUT operation,
this change avoids such failures appropriately.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 10:18:18 -07:00
Poorna Krishnamoorthy
c9bf6007b4
Use custom transport for remote targets (#12080) 2021-04-16 18:58:26 -07:00
Harshavardhana
a334554f99
fix: add helper for expected path.Clean behavior (#12068)
current usage of path.Clean returns "." for empty strings
instead we need `""` string as-is, make relevant changes
as needed.
2021-04-15 16:32:13 -07:00
Harshavardhana
feafccf007
handle trimming '/' if present in the object names (#11765)
- MultipleDeletes should handle '/' prefix for objectnames
- Trimming the slash alone is enough for ListObjects()
  prefix and markers

fixes #11769
2021-03-11 13:57:03 -08:00
Harshavardhana
691035832a
fix: normalize object layer inputs (#11534)
Cases where we have applications making request
for `//` in object names make sure that all
are normalized to `/` and all such requests that
are prefixed '/' are removed. To ensure a
consistent view from all operations.
2021-03-09 12:58:22 -08:00
Harshavardhana
9ccc483df6
[feat]: change erasure coding default block size from 10MiB to 1MiB (#11721)
major performance improvements in range GETs to avoid large
read amplification when ranges are tiny and random

```
-------------------
Operation: GET
Operations: 142014 -> 339421
Duration: 4m50s -> 4m56s
* Average: +139.41% (+1177.3 MiB/s) throughput, +139.11% (+658.4) obj/s
* Fastest: +125.24% (+1207.4 MiB/s) throughput, +132.32% (+612.9) obj/s
* 50% Median: +139.06% (+1175.7 MiB/s) throughput, +133.46% (+660.9) obj/s
* Slowest: +203.40% (+1267.9 MiB/s) throughput, +198.59% (+753.5) obj/s
```

TTFB from 10MiB BlockSize
```
* First Access TTFB: Avg: 81ms, Median: 61ms, Best: 20ms, Worst: 2.056s
```

TTFB from 1MiB BlockSize
```
* First Access TTFB: Avg: 22ms, Median: 21ms, Best: 8ms, Worst: 91ms
```

Full object reads however do see a slight change which won't be
noticeable in real world, so not doing any comparisons

TTFB still had improvements with full object reads with 1MiB

```
* First Access TTFB: Avg: 68ms, Median: 35ms, Best: 11ms, Worst: 1.16s
```

v/s

TTFB with 10MiB
```
* First Access TTFB: Avg: 388ms, Median: 98ms, Best: 20ms, Worst: 4.156s
```

This change should affect all new uploads, previous uploads should
continue to work with business as usual. But dramatic improvements can
be seen with these changes.
2021-03-06 14:09:34 -08:00
Andreas Auernhammer
f14cc6c943
etag: add FromContentMD5 to parse content-md5 as ETag (#11688)
This commit adds the `FromContentMD5` function to
parse a client-provided content-md5 as ETag.

Further, it also adds multipart ETag computation
for future needs.
2021-03-03 12:58:28 -08:00
Anis Elleuch
e8d8dfa3ae
Add metric for internode RPC calls errors (#11669) 2021-03-01 12:31:33 -08:00
Nitish Tiwari
bbd1244a88
Add support for mTLS for Audit log target (#11645) 2021-03-01 09:19:13 -08:00
Klaus Post
c5b3a675fa
Block profiling tweaks (#11612)
The base profiles contains no valuable data, don't record them.

Reduce block rate by 2 orders of magnitude, should still capture just as valuable data with less CPU strain.
2021-02-27 09:22:14 -08:00
Harshavardhana
79b6a43467
fix: avoid timed value for network calls (#11531)
additionally simply timedValue to have RWMutex
to avoid concurrent calls to DiskInfo() getting
serialized, this has an effect on all calls that
use GetDiskInfo() on the same disks.

Such as getOnlineDisks, getOnlineDisksWithoutHealing
2021-02-12 18:17:52 -08:00
Harshavardhana
2a7b123895
turn off http2 for TLS setups for now (#11523)
due to lots of issues with x/net/http2, as
well as the bundled h2_bundle.go in the go
runtime should be avoided for now.

https://github.com/golang/go/issues/23559
https://github.com/golang/go/issues/42534
https://github.com/golang/go/issues/43989
https://github.com/golang/go/issues/33425
https://github.com/golang/go/issues/29246

With collection of such issues present, it
make sense to remove HTTP2 support for now
2021-02-11 15:53:04 -08:00
Harshavardhana
6cd255d516
fix: allow updated domain names in federation (#11365)
additionally also disallow overlapping domain names
2021-01-28 11:44:48 -08:00
Harshavardhana
f903cae6ff
Support variable server pools (#11256)
Current implementation requires server pools to have
same erasure stripe sizes, to facilitate same SLA
and expectations.

This PR allows server pools to be variadic, i.e they
do not have to be same erasure stripe sizes - instead
they should have SLA for parity ratio.

If the parity ratio cannot be guaranteed by the new
server pool, the deployment is rejected i.e server
pool expansion is not allowed.
2021-01-16 12:08:02 -08:00
Harshavardhana
8565cefe4e fix: allow HTTP2.0 to be always configured 2020-12-22 16:32:58 -08:00
Harshavardhana
5c451d1690
update x/net/http2 to address few bugs (#11144)
additionally also configure http2 healthcheck
values to quickly detect unstable connections
and let them timeout.

also use single transport for proxying requests
2020-12-21 21:42:38 -08:00
Harshavardhana
790833f3b2 Revert "Support variable server sets (#10314)"
This reverts commit aabf053d2f.
2020-12-01 12:02:29 -08:00
Harshavardhana
aabf053d2f
Support variable server sets (#10314) 2020-11-25 16:28:47 -08:00
Harshavardhana
bd2131ba34
add DNS cache support to avoid DNS flooding (#10693)
Go stdlib resolver doesn't support caching DNS
resolutions, since we compile with CGO disabled
we are more probe to DNS flooding for all network
calls to resolve for DNS from the DNS server.

Under various containerized environments such as
VMWare this becomes a problem because there are
no DNS caches available and we may end up overloading
the kube-dns resolver under concurrent I/O.

To circumvent this issue implement a DNSCache resolver
which resolves DNS and caches them for around 10secs
with every 3sec invalidation attempted.
2020-10-16 14:49:05 -07:00
Harshavardhana
2760fc86af
Bump default idleConnsPerHost to control conns in time_wait (#10653)
This PR fixes a hang which occurs quite commonly at higher concurrency
by allowing following changes

- allowing lower connections in time_wait allows faster socket open's
- lower idle connection timeout to ensure that we let kernel
  reclaim the time_wait connections quickly
- increase somaxconn to 4096 instead of 2048 to allow larger tcp
  syn backlogs.

fixes #10413
2020-10-12 14:19:46 -07:00
Harshavardhana
736e58dd68
fix: handle concurrent lockers with multiple optimizations (#10640)
- select lockers which are non-local and online to have
  affinity towards remote servers for lock contention

- optimize lock retry interval to avoid sending too many
  messages during lock contention, reduces average CPU
  usage as well

- if bucket is not set, when deleteObject fails make sure
  setPutObjHeaders() honors lifecycle only if bucket name
  is set.

- fix top locks to list out always the oldest lockers always,
  avoid getting bogged down into map's unordered nature.
2020-10-08 12:32:32 -07:00
Harshavardhana
1f9abbee4d
make sure to release locks upon timeout (#10596)
fixes #10418
2020-09-29 15:18:34 -07:00
Harshavardhana
37a5d5d7a0
reduce timeouts between servers for faster disconnects (#10562) 2020-09-24 20:10:07 -07:00
Krishna Srinivas
230fc0d186
Support for "directory" objects (#10499) 2020-09-19 08:39:41 -07:00
Klaus Post
b7438fe4e6
Copy metadata before spawning goroutine + prealloc maps (#10458)
In `(*cacheObjects).GetObjectNInfo` copy the metadata before spawning a goroutine.

Clean up a few map[string]string copies as well, reducing allocs and simplifying the code.

Fixes #10426
2020-09-10 11:37:22 -07:00
Harshavardhana
c13afd56e8
Remove MaxConnsPerHost settings to avoid potential hangs (#10438)
MaxConnsPerHost can potentially hang a call without any
way to timeout, we do not need this setting for our proxy
and gateway implementations instead IdleConn settings are
good enough.

Also ensure to use NewRequestWithContext and make sure to
take the disks offline only for network errors.

Fixes #10304
2020-09-08 14:22:04 -07:00
Anis Elleuch
46ee8659b4
fix write quorum calculation for bucket operations (#10364)
When the number of disks is odd, the calculation of quorum 
for bucket operations were not correct, fix it.
2020-08-27 12:55:32 -07:00
Harshavardhana
1e2ebc9945
feat: time to bring back http2.0 support (#10230)
Bonus move our CI/CD to go1.14
2020-08-10 09:02:29 -07:00
Harshavardhana
0b8255529a
fix: proxies set keep-alive timeouts to be system dependent (#10199)
Split the DialContext's one for internode and another
for all other external communications especially
proxy forwarders, gateway transport etc.
2020-08-04 14:55:53 -07:00
Klaus Post
968342c732
Remove usage of go-ieproxy for windows (#10009)
There is a potential for deadlock on Windows 10
refer https://github.com/mattn/go-ieproxy/issues/17 

remove this dependency for now.
2020-07-10 12:08:14 -07:00
Harshavardhana
72e0745e2f
fix: migrate to go.etcd.io import path (#9987)
with the merge of https://github.com/etcd-io/etcd/pull/11823
etcd v3.5.0 will now have a properly imported versioned path

this fixes our pending migration to newer repo
2020-07-07 19:04:29 -07:00
Klaus Post
aa4d1021eb
Remove timeout from putobject and listobjects (#9986)
Use a separate client for these calls that can take a long time.

Add request context to these so they are canceled when the client 
disconnects instead except for ListObject which doesn't have any equivalent.
2020-07-07 12:19:57 -07:00
Harshavardhana
e59ee14f40
Tune tcp keep-alives with new kernel timeout options (#9963)
For more deeper understanding https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
2020-07-03 10:03:41 -07:00
Harshavardhana
4915433bd2
Support bucket versioning (#9377)
- Implement a new xl.json 2.0.0 format to support,
  this moves the entire marshaling logic to POSIX
  layer, top layer always consumes a common FileInfo
  construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
  for object placements.

Fixes #2111
2020-06-12 20:04:01 -07:00
Klaus Post
95814359bd
cache disk info to avoid repeated calls (#9682)
This value is requested on every upload when there are multiple zones.

Since this will result in an RPC call to every remote disk this scales 
quite badly in a distributed setup. Load every 1second interval.

2 servers, localhost only. In large distributed setups much bigger 
gains can be expected.

```
Operations: 21743 -> 22454
* Average: +3.28% (+0.0 MiB/s) throughput, +3.28% (+11.9) obj/s
* Fastest: +3.37% (+0.0 MiB/s) throughput, +3.37% (+13.0) obj/s
* 50% Median: +3.03% (+0.0 MiB/s) throughput, +3.03% (+11.2) obj/s
* Slowest: +8.03% (+0.0 MiB/s) throughput, +8.03% (+22.8) obj/s
```

For easy management of this a generic helper has been added.
2020-05-26 12:52:24 -07:00
Harshavardhana
6ac48a65cb
fix: use unused cacheMetrics code in prometheus (#9588)
remove all other unusued/deadcode
2020-05-13 08:15:26 -07:00
Anis Elleuch
6d76efb9bb
Add support of TCP fast open in internode calls (#9486) 2020-05-08 14:33:23 -07:00
Harshavardhana
60d415bb8a
deprecate/remove global WORM mode (#9436)
global WORM mode is a complex piece for which
the time has passed, with the advent of S3 compatible
object locking and retention implementation global
WORM is sort of deprecated, this has been mentioned
in our documentation for some time, now the time
has come for this to go.
2020-04-24 16:37:05 -07:00