Commit Graph

3265 Commits

Author SHA1 Message Date
Harshavardhana
5982965839
fix: re-use bytes.Buffer using sync.Pool (#11156) 2020-12-22 23:22:37 -08:00
Harshavardhana
8565cefe4e fix: allow HTTP2.0 to be always configured 2020-12-22 16:32:58 -08:00
Andreas Auernhammer
8cdf2106b0
refactor cmd/crypto code for SSE handling and parsing (#11045)
This commit refactors the code in `cmd/crypto`
and separates SSE-S3, SSE-C and SSE-KMS.

This commit should not cause any behavior change
except for:
  - `IsRequested(http.Header)`

which now returns the requested type {SSE-C, SSE-S3,
SSE-KMS} and does not consider SSE-C copy headers.

However, SSE-C copy headers alone are anyway not valid.
2020-12-22 09:19:32 -08:00
Harshavardhana
35fafb837b
fix: issues with handling delete markers in metacache (#11150)
Additional cases handled

- fix address situations where healing is not
  triggered on failed writes and deletes.

- consider object exists during listing when
  metadata can be successfully decoded.
2020-12-22 09:16:43 -08:00
Harshavardhana
274bbad5cb
fix: select always online peers for remote listing (#11153)
always find the right set of online peers for remote listing,
this may have an effect on listing if the server is down - we
should do this to avoid always performing transient operations
on bucket->peerClient that is permanently or down for a long
period.
2020-12-22 09:16:07 -08:00
Harshavardhana
5c451d1690
update x/net/http2 to address few bugs (#11144)
additionally also configure http2 healthcheck
values to quickly detect unstable connections
and let them timeout.

also use single transport for proxying requests
2020-12-21 21:42:38 -08:00
Poorna Krishnamoorthy
c987313431
Encrypt remote target if kms is configured (#11034)
Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-12-21 16:21:33 -08:00
Anis Elleuch
2ecaab55a6
admin: ServerInfo returns info without object layer initialized (#11142) 2020-12-21 09:35:19 -08:00
Harshavardhana
3e792ae2a2
fix: change defaults for DNS cache dialer (#11145) 2020-12-21 09:33:29 -08:00
Harshavardhana
4cc500a041
normalize users with double // in accessKeys (#11143)
Bonus fix, use constant time compare for secret keys  in web-handlers.go:SetAuth()
2020-12-20 10:09:51 -08:00
Harshavardhana
d8e28830cf
fix: allow STS creds for admin accounts to add users (#11138)
Allow rotating creds with privileges to add users

fixes https://github.com/minio/console/issues/529
2020-12-19 13:24:21 -08:00
Harshavardhana
3e16ec457a
fix: support user/groups with '/' character (#11127)
NOTE: user/groups with `//` shall be normalized to `/`

fixes #11126
2020-12-19 09:36:37 -08:00
Harshavardhana
e5d378931d
fix: delimiter based listing was broken without marker (#11136)
with missing nextMarker with delimiter based listing,
top level prefixes beyond 4500 or max-keys value
wouldn't be sent back for client to ask for the next
batch.

reproduced at a customer deployment, create prefixes
as shown below

```
for year in $(seq 2017 2020)
do
    for month in {01..12}
    do for day in {01..31}
       do
           mc -q cp file myminio/testbucket/dir/day_id=$year-$month-$day/;
       done
    done
done
```

Then perform

```
aws s3api --profile minio --endpoint-url http://localhost:9000 list-objects \
    --bucket testbucket --prefix dir/ --delimiter / --max-keys 1000
```

You shall see missing NextMarker, this would disallow listing beyond max-keys
requested and also disallow beyond 4500 (maxKeyObjectList) prefixes being listed
because client wouldn't know the NextMarker available.

This PR addresses this situation properly by making the implementation
more spec compatible. i.e NextMarker in-fact can be either an object, a prefix
with delimiter depending on the input operation.

This issue was introduced after the list caching changes and has been present
for a while.
2020-12-19 09:36:04 -08:00
Anis Elleuch
e63a10e505
Profiling does not required object layer to be initialized (#11133) 2020-12-18 11:51:15 -08:00
Anis Elleuch
5434088c51
replication: Ensure to always use nano precision source modtime (#11135) 2020-12-18 11:37:28 -08:00
Harshavardhana
a773cf48d8
fix: overlapping object and prefix rejected (#11130)
fixes #11129
2020-12-18 08:51:09 -08:00
Harshavardhana
f714840da7
add _MINIO_SERVER_DEBUG env for enabling debug messages (#11128) 2020-12-17 16:52:47 -08:00
Harshavardhana
7c9ef76f66
fix: timer deadlock on expired timers (#11124)
issue was introduced in #11106 the following
pattern

<-t.C // timer fired

if !t.Stop() {
   <-t.C // timer hangs
}

Seems to hang at the last `t.C` line, this
issue happens because a fired timer cannot be
Stopped() anymore and t.Stop() returns `false`
leading to confusing state of usage.

Refactor the code such that use timers appropriately
with exact requirements in place.
2020-12-17 12:35:02 -08:00
Anis Elleuch
cffdb01279
azure/s3 gateways: Pass ETag during GET call to avoid data corruption (#11024)
Both Azure & S3 gateways call for object information before returning
the stream of the object, however, the object content/length could be
modified meanwhile, which means it can return a corrupted object.

Use ETag to ensure that the object was not modified during the GET call
2020-12-17 09:11:14 -08:00
Harshavardhana
b390a2a0b9
fix: reuser timers in erasure set hotpaths (#11106)
reuser timers in

 - connectDisks() monitoring
 - healMRFRoutine() channel timeouts
2020-12-16 14:33:05 -08:00
Harshavardhana
90158f1e33
fix: avoid logging for Heal APIs in FS mode (#11121)
fixes #11120
2020-12-16 09:46:13 -08:00
Harshavardhana
c606c76323
fix: prioritized latest buckets for crawler to finish the scans faster (#11115)
crawler should only ListBuckets once not for each serverPool,
buckets are same across all pools, across sets and ListBuckets
always returns an unified view, once list buckets returns
sort it by create time to scan the latest buckets earlier
with the assumption that latest buckets would have lesser
content than older buckets allowing them to be scanned faster
and also to be able to provide more closer to latest view.
2020-12-15 17:34:54 -08:00
Klaus Post
e7d3b49a20
metacache: Make very small requests transient (#11109) 2020-12-15 11:25:36 -08:00
Harshavardhana
5df61ab96b
fix: remove gorilla/rpc/ deps fully after our fork (#11108) 2020-12-15 11:18:06 -08:00
Poorna Krishnamoorthy
3456b03b12
Ignore ObjectNotFound errors in delete api while enforcing locking (#11114)
AWS does not report this or version not found as errors in the response.
2020-12-15 11:15:49 -08:00
Klaus Post
f6fb27e8f0
Don't copy interesting ids, clean up logging (#11102)
When searching the caches don't copy the ids, instead inline the loop.

```
Benchmark_bucketMetacache_findCache-32    	   19200	     63490 ns/op	    8303 B/op	       5 allocs/op
Benchmark_bucketMetacache_findCache-32    	   20338	     58609 ns/op	     111 B/op	       4 allocs/op
```

Add a reasonable, but still the simplistic benchmark.

Bonus - make nicer zero alloc logging
2020-12-14 13:13:33 -08:00
Harshavardhana
8368ab76aa
fix: remove the requirement for healing buckets in ListBucketsHeal (#11098)
With new refactor of bucket healing, healing bucket happens
automatically including its metadata, there is no need to
redundant heal buckets also in ListBucketsHeal remove
it.
2020-12-14 12:07:07 -08:00
Harshavardhana
3e83643320
lifecycle improvements and additional debug logging (#11096)
Bonus change fix browser assets
2020-12-13 12:05:54 -08:00
Harshavardhana
2eb52ca5f4
fix: heal bucket metadata right before healing bucket (#11097)
optimization mainly to avoid listing the entire
`.minio.sys/buckets/.minio.sys` directory, this
can get really huge and comes in the way of startup
routines, contents inside `.minio.sys/buckets/.minio.sys`
are rather transient and not necessary to be healed.
2020-12-13 11:57:08 -08:00
Anis Elleuch
f164085227
xl: Always set root disk to true in test environment (#11094)
Tests environments (go test or manual testing) should always consider
the passed disks are root disks and should not rely on disk.IsRootDisk()
function. The reason is that this latter can return a false negative
when called in a busy system. However, returning a false negative will
only occur in a testing environment and not in a production, so we can
accept this trade-off for now.
2020-12-12 16:10:07 -08:00
Harshavardhana
48191dd748
return NoSuchVersion if invalid version-id is specified (#11091) 2020-12-11 20:44:08 -08:00
Anis Elleuch
c4f29d24da
metacache: Ask all disks when drive count is 4 (#11087) 2020-12-11 17:54:31 -08:00
Harshavardhana
db7890660e
fix: a crash when disk is nil, safe access on erasureDisks (#11089)
fixes #11088
2020-12-11 16:58:36 -08:00
Poorna Krishnamoorthy
9adc33efbb
Return version-id header in DeleteObject response (#11090)
even when the object version is non-existent

To make this consistent with aws behavior.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-12-11 16:58:15 -08:00
Poorna Krishnamoorthy
8f65aba04b
ignore NoSuchVersion error in DeleteObjects API (#11086)
Currently, the error response reports NoSuchVersion
for a non-existent version-id, whereas AWS ignores it.
2020-12-11 12:39:09 -08:00
Harshavardhana
3a0082f0f1
fix: TTFB prometheus metrics calculation (#11082)
until now metrics was reporting entire call
duration instead of ttfb's this PR fixes it
2020-12-10 23:02:25 -08:00
Klaus Post
4bca62a0bd
crawler: Stream bucket usage cache data (#11068)
Stream bucket caches to storage and through RPC calls.
2020-12-10 13:03:22 -08:00
Klaus Post
82e2be4239
metacache: Speed up cleanup operation (#11078)
Perform cleanup operations on copied data. Avoids read locking
data while determining which caches to keep.

Also, reduce the log(N*N) operation to log(N*M) where M caches 
with the same root or below when checking potential replacements.
2020-12-10 12:30:28 -08:00
Harshavardhana
4550ac6fff
fix: refactor locks to apply them uniquely per node (#11052)
This refactor is done for few reasons below

- to avoid deadlocks in scenarios when number
  of nodes are smaller < actual erasure stripe
  count where in N participating local lockers
  can lead to deadlocks across systems.

- avoids expiry routines to run 1000 of separate
  network operations and routes per disk where
  as each of them are still accessing one single
  local entity.

- it is ideal to have since globalLockServer
  per instance.

- In a 32node deployment however, each server
  group is still concentrated towards the
  same set of lockers that partipicate during
  the write/read phase, unlike previous minio/dsync
  implementation - this potentially avoids send
  32 requests instead we will still send at max
  requests of unique nodes participating in a
  write/read phase.

- reduces overall chattiness on smaller setups.
2020-12-10 07:28:37 -08:00
Klaus Post
e65ed2e44f
listcache: Add path index (#11063)
Add a root path index.

```
Before:
Benchmark_bucketMetacache_findCache-32    	   10000	    730737 ns/op

With excluded prints:
Benchmark_bucketMetacache_findCache-32    	   10000	    207100 ns/op

With the root path:
Benchmark_bucketMetacache_findCache-32    	  705765	      1943 ns/op
```

Benchmark used (not linear):

```Go
func Benchmark_bucketMetacache_findCache(b *testing.B) {
	bm := newBucketMetacache("", false)
	for i := 0; i < b.N; i++ {
		bm.findCache(listPathOptions{
			ID:           mustGetUUID(),
			Bucket:       "",
			BaseDir:      "prefix/" + mustGetUUID(),
			Prefix:       "",
			FilterPrefix: "",
			Marker:       "",
			Limit:        0,
			AskDisks:     0,
			Recursive:    false,
			Separator:    slashSeparator,
			Create:       true,
			CurrentCycle: 0,
			OldestCycle:  0,
		})
	}
}
```

Replaces #11058
2020-12-09 08:37:43 -08:00
Anis Elleuch
d90044b847
federation: Redirect Lifecycle PUT request by bucket name (#11062)
The bucket forwarder handler considers MakeBucket to be always local but
it mistakenly thinks that PUT bucket lifecycle to be a MakeBucket call.

Fix the check of the MakeBucket call by ensuring that the query is empty
in the PUT url.
2020-12-09 07:25:26 -08:00
Harshavardhana
d8c1f93de6
reject mixed drive situations with drives on root disks (#11057)
till now we used to match the inode number of the root
drive and the drive path minio would use, if they match
we knew that its a root disk.

this may not be true in all situations such as running
inside a container environment where the container might
be mounted from a different partition altogether, root
disk detection might fail.
2020-12-09 00:27:02 -08:00
Anis Elleuch
a51488cbaa
s3: Fix reading GET with partNumber specified (#11032)
partNumber was miscalculting the start and end of parts when partNumber
query is specified in the GET request. This commit fixes it and also
fixes the ContentRange header in that case.
2020-12-08 13:12:42 -08:00
Harshavardhana
dc819afa44 fix: auto update crawler meta version
PR 038bcd9079 introduced
version '3', we need to make sure that we do not
print an unexpected error instead log a message to
indicate we will auto update the version.
2020-12-08 10:40:51 -08:00
Harshavardhana
4a564336fe Revert "Add metrics for nodes online and offline (#11050)"
This reverts commit f60bbdf86b.
2020-12-08 09:23:35 -08:00
Ritesh H Shukla
f60bbdf86b
Add metrics for nodes online and offline (#11050) 2020-12-08 01:06:27 -08:00
Poorna Krishnamoorthy
f3beb1236a
Add cache usage, total capacity to prometheus metrics (#11026) 2020-12-07 16:35:11 -08:00
Poorna Krishnamoorthy
934bed47fa
Add transition event notification (#11047)
This is a MinIO specific extension to allow monitoring of transition events.
2020-12-07 13:53:28 -08:00
Ritesh H Shukla
038bcd9079
Add replication capacity metrics support in crawler (#10786) 2020-12-07 13:47:48 -08:00
Harshavardhana
ce93b2681b
fix: re-use er.getDisks() properly in certain calls (#11043) 2020-12-07 10:04:07 -08:00
Harshavardhana
8d036ed6d8
fix: allow sub-admin to modify password for other users (#11039)
fixes #11037
2020-12-06 20:36:34 -08:00
Harshavardhana
9c53cc1b83
fix: heal multiple buckets in bulk (#11029)
makes server startup, orders of magnitude
faster with large number of buckets
2020-12-05 13:00:44 -08:00
Harshavardhana
3514e89eb3
support envs as well for new crawler sub-system (#11033) 2020-12-04 21:54:24 -08:00
Klaus Post
a896125490
Add crawler delay config + dynamic config values (#11018) 2020-12-04 09:32:35 -08:00
Harshavardhana
e083471ec4
use argon2 with sync.Pool for better memory management (#11019) 2020-12-03 19:23:19 -08:00
Harshavardhana
80d31113e5
fix: etcd import paths again depend on v3.4.14 release (#11020)
Due to botched upstream renames of project repositories
and incomplete migration to go.mod support, our current
dependency version of `go.mod` had bugs i.e it was
using commits from master branch which didn't have
the required fixes present in release-3.4 branches

which leads to some rare bugs

https://github.com/etcd-io/etcd/pull/11477 provides
a workaround for now and we should migrate to this.

release-3.5 eventually claims to fix all of this
properly until then we cannot use /v3 import right now
2020-12-03 11:35:18 -08:00
Ritesh H Shukla
7e2b79984e
Stream bucket bandwidth measurements (#11014) 2020-12-03 11:34:42 -08:00
Harshavardhana
951b6b203b skip metacache entries healing to speed up startup 2020-12-02 21:30:54 -08:00
Harshavardhana
44e23b7f4f fix: startup being slow - wait only if IOCount > 0 2020-12-02 21:06:17 -08:00
Harshavardhana
96c0ce1f0c
add support for tuning healing to make healing more aggressive (#11003)
supports `mc admin config set <alias> heal sleep=100ms` to
enable more aggressive healing under certain times.

also optimize some areas that were doing extra checks than
necessary when bitrotscan was enabled, avoid double sleeps
make healing more predictable.

fixes #10497
2020-12-02 11:12:00 -08:00
ebozduman
303be1866d
Adds "x-amz-usr-agent" and "x-id" params to be used in authentication of presignedURL (#10792) 2020-12-02 02:02:49 -08:00
Harshavardhana
4ec45753e6 rename server sets to server pools 2020-12-01 13:50:33 -08:00
Klaus Post
e6ea5c2703
crawler: Missing folder heal check per set (#10876) 2020-12-01 12:07:39 -08:00
Harshavardhana
790833f3b2 Revert "Support variable server sets (#10314)"
This reverts commit aabf053d2f.
2020-12-01 12:02:29 -08:00
Harshavardhana
7cbca43eb1
fix: allow admins to create users (#11005)
PR #10978 introduced a regression, root
credential should be allowed to create users
2020-11-30 21:53:23 -08:00
Poorna Krishnamoorthy
2f564437ae
Disallow writeback caching with cache_after (#11002)
fixes #10974
2020-11-30 20:53:27 -08:00
Harshavardhana
bdd094bc39
fix: avoid sending errors on missing objects on locked buckets (#10994)
make sure multi-object delete returned errors that are AWS S3 compatible
2020-11-28 21:15:45 -08:00
Harshavardhana
e6fa410778
fix: allow accountInfo, addUser and getUserInfo implicit (#10978)
- accountInfo API that returns information about
  user, access to buckets and the size per bucket
- addUser - user is allowed to change their secretKey
- getUserInfo - returns user info if the incoming
  is the same user requesting their information
2020-11-27 17:23:57 -08:00
Harshavardhana
aabf053d2f
Support variable server sets (#10314) 2020-11-25 16:28:47 -08:00
Anis Elleuch
91130e884b
Avoid sending errors in gob in storage requests (#10977) 2020-11-25 12:42:48 -08:00
Poorna Krishnamoorthy
2ff655a745
Refactor replication, ILM handling in DELETE API (#10945) 2020-11-25 11:24:50 -08:00
Klaus Post
0422eda6a2
metacache: Always close block writer (#10973)
In some cases a writer could be left behind unclosed, leaking compression blocks.

Always close and set compression concurrency to 2 which should be fine to keep up.
2020-11-25 09:37:30 -08:00
Harshavardhana
31e6f60847
fix: improve error handling in metacache (#10965) 2020-11-25 01:11:22 -08:00
Poorna Krishnamoorthy
3ad41fe89d
Add admin API to edit remote bucket target credentials (#10848) 2020-11-24 19:09:05 -08:00
Klaus Post
a75fafdbe2
Remove msgp workaround (#10964)
The error in `github.com/philhofer/fwd` was quickly fixed through 
https://github.com/philhofer/fwd/pull/22 - update the dependency 
and remove the workaround.
2020-11-24 11:58:10 -08:00
Klaus Post
a58b7874ef
Temporary workaround for msgp skipping (#10960)
Due to https://github.com/philhofer/fwd/issues/20 when skipping a metadata entry that is >2048 bytes and the buffer is full (2048 bytes) the skip will fail with `io.ErrNoProgress`.

Enlarge the buffer so we temporarily make this much more unlikely.

If it still happens we will have to rewrite the skips to reads.

Fixes #10959
2020-11-23 18:51:59 -08:00
Harshavardhana
6990de9c94
fix: dangling object delete shall return object doesn't exist (#10961)
dangling object when deleted means object doesn't exist
anymore, so we should return appropriate errors, this
allows crawler heal to ensure that it removes the tracker
for dangling objects.
2020-11-23 18:50:53 -08:00
Anis Elleuch
75a8e81f8f
azure: Specify different Azure storage in the shell env (#10943)
AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY are used in 
azure CLI to specify the azure blob storage access & secret keys. With this commit, 
it is possible to set them if you want the gateway's own credentials to be
different from the Azure blob credentials.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-11-23 16:45:56 -08:00
Harshavardhana
519c0077a9
fix: do not return an error for successfully deleted dangling objects (#10938)
dangling objects when removed `mc admin heal -r` or crawler
auto heal would incorrectly return error - this can interfere
with usage calculation as the entry size for this would be
returned as `0`, instead upon success use the resultant
object size to calculate the final size for the object
and avoid reporting this in the log messages

Also do not set ObjectSize in healResultItem to be '-1'
this has an effect on crawler metrics calculating 1 byte
less for objects which seem to be missing their `xl.meta`
2020-11-23 09:12:17 -08:00
Harshavardhana
734d07a532
fix: all hosts local and port same should be local erasure setup (#10951)
this is needed to avoid initializing notification peers
that can lead to races in many sub-systems

fixes #10950
2020-11-23 09:07:50 -08:00
Harshavardhana
df93102235
fix: unwrapping issues with os.Is* functions (#10949)
reduces  3 stat calls, reducing the
overall startup time significantly.
2020-11-23 08:36:49 -08:00
Poorna Krishnamoorthy
39f3d5493b
Show Delete replication status header (#10946)
X-Minio-Replication-Delete-Status header shows the
status of the replication of a permanent delete of a version.

All GETs are disallowed and return 405 on this object version.
In the case of replicating delete markers.

X-Minio-Replication-DeleteMarker-Status shows the status 
of replication, and would similarly return 405.

Additionally, this PR adds reporting of delete marker event completion
and updates documentation
2020-11-21 23:48:50 -08:00
Klaus Post
692ff41ef7
Unwrap network errors (#10934)
Alternative to #10927

Instead of having an upstream fix, do unwrap when checking network errors.

'As' will also work when destination is an interface as checked by the tests.
2020-11-20 22:55:35 -08:00
Harshavardhana
86409fa93d
add audit/admin trace support for browser requests (#10947)
To support this functionality we had to fork
the gorilla/rpc package with relevant changes
2020-11-20 22:52:17 -08:00
Shireesh Anjal
7bc47a14cc
Rename OBD to Health (#10842)
Also, Remove thread stats and openfds from the health report 
as we already have process stats and numfds
2020-11-20 12:52:53 -08:00
Harshavardhana
73e308079a
fix: handle errors appropriately as they are wrapped (#10917) 2020-11-20 10:43:07 -08:00
Poorna Krishnamoorthy
08b24620c0 Display storage-class of transitioned object in HEAD 2020-11-20 09:17:31 -08:00
Harshavardhana
95675b0c9a
fix: do not crash PutObjectTags when node is down (#10940)
fixes #10939
2020-11-20 09:10:48 -08:00
Poorna Krishnamoorthy
251c1ef6da Add support for replication of object tags, retention metadata (#10880) 2020-11-19 18:56:09 -08:00
Poorna Krishnamoorthy
0fa430c1da validate service type of target in replication/ilm transition config (#10928) 2020-11-19 18:47:33 -08:00
Poorna Krishnamoorthy
f60b6eb82e fix validation for deletemarker replication on object locked bucket (#10892) 2020-11-19 18:47:19 -08:00
Poorna Krishnamoorthy
1ebf6f146a Add support for ILM transition (#10565)
This PR adds transition support for ILM
to transition data to another MinIO target
represented by a storage class ARN. Subsequent
GET or HEAD for that object will be streamed from
the transition tier. If PostRestoreObject API is
invoked, the transitioned object can be restored for
duration specified to the source cluster.
2020-11-19 18:47:17 -08:00
Harshavardhana
8f7fe0405e fix: delete marker replication should support directories (#10878)
allow directories to be replicated as well, along with
their delete markers in replication.

Bonus fix to fix bloom filter updates for directories
to be preserved.
2020-11-19 18:47:12 -08:00
Harshavardhana
9a34fd5c4a Revert "Revert "Add delete marker replication support (#10396)""
This reverts commit 267d7bf0a9.
2020-11-19 18:43:58 -08:00
Harshavardhana
f794fe79e3
fix: network shutdown was not handle properly (#10927)
fixes a regression introduced in #10859, due
to the error returned by rest.Client being typed
i.e *rest.NetworkError - IsNetworkHostDown function
didn't work as expected to detect network issues.

This in-turn aggravated the situations when nodes
are disconnected leading to performance loss.
2020-11-19 13:53:49 -08:00
Harshavardhana
0f9e125cf3
fix: check for gateway backend online without http request (#10924)
fixes #10921
2020-11-19 10:38:02 -08:00
Harshavardhana
d778d9493f
remove MinIO release tag as part of HTTP Server string (#10929) 2020-11-19 09:16:02 -08:00
Harshavardhana
70d2c2ccc9
skip files that are not erasure objects or directories (#10926)
without this change WalkDir reports errors while
trying to read `format.json/xl.meta` which is a
replicated file
2020-11-19 09:15:09 -08:00
Harshavardhana
9dea7020f0
allow prefix filtering for WalkDir to be optional (#10923) 2020-11-18 12:03:16 -08:00
Klaus Post
990d074f7d
metacache: Allow prefix filtering (#10920)
Do listings with prefix filter when bloom filter is dirty.

This will forward the prefix filter to the lister which will make it 
only scan the folders/objects with the specified prefix.

If we have a clean bloom filter we try to build a more generally 
useful cache so in that case, we will list all objects/folders.
2020-11-18 10:44:18 -08:00
Klaus Post
e413f05397
Save listing error async (#10922)
Since the RPC call may have to time out save an error state async 
to not hold up the listing returning.

Fixes #10919
2020-11-18 10:28:22 -08:00
Harshavardhana
d1b1fee080
fix: save healing tracker right before healing (#10915)
this change avoids a situation where accidentally
if the user deleted the healing tracker or drives
were replaced again within the 10sec window.
2020-11-18 09:34:46 -08:00
Harshavardhana
9738d605e4
increase readdir per block memory to facilitate faster WalkDir (#10908) 2020-11-18 09:21:02 -08:00
Klaus Post
10099357b6
listcache: Wrap returned errors (#10882)
To give an indication of where they happen
2020-11-17 09:11:59 -08:00
Harshavardhana
80b8ce89a4
remove context deadline from Delete calls (#10901) 2020-11-17 09:09:45 -08:00
Poorna Krishnamoorthy
0b766288ef
fix: send replication completed event notification (#10902) 2020-11-15 22:16:41 -08:00
Rafael Bodill
598ca0569c
fix: global in-place update boolean check (#10900) 2020-11-15 13:34:12 -08:00
Poorna Krishnamoorthy
d295ce5708
Fix disk cache usage percent for prometheus (#10898)
Fixes: #10895

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-11-14 19:18:00 -08:00
Klaus Post
b5a3d79bce
listobjectversions: Add shortcut for Veeam blocks (#10893)
Add shortcut for `APN/1.0 Veeam/1.0 Backup/10.0`

It requests unique blocks with a specific prefix. We skip 
scanning the parent directory for more objects matching the prefix.
2020-11-13 16:58:20 -08:00
Harshavardhana
17a5ff51ff
fix: move context timeout closer to network for Delete calls (#10897)
allowing for disconnects to be limited to the drive
themselves instead of disconnecting all drives.
2020-11-13 16:56:45 -08:00
Harshavardhana
0bcb1b679d
fix: disallow update if dates are same (#10890)
fixes #10889
2020-11-12 14:18:59 -08:00
Klaus Post
a3017c724e
Sort directory objects correctly (#10886)
Decode dir objects when listing and sort them correctly.
2020-11-12 13:09:34 -08:00
Harshavardhana
267d7bf0a9 Revert "Add delete marker replication support (#10396)"
This reverts commit 50c10a5087.

PR is moved to origin/dev branch
2020-11-12 11:43:14 -08:00
cksac
be83dfc52a
fix: HDFS list bucket when subpath is provided (#10884) 2020-11-12 11:26:51 -08:00
Harshavardhana
ca88ca753c
ignore typed errors correctly in list cache layer (#10879)
bonus write bucket metadata cache with enough quorum

Possible fix for #10868
2020-11-12 09:28:56 -08:00
Klaus Post
f86d3538f6
Allow deeper sleep (#10883)
Allow each crawler operation to sleep up to 10 seconds on very heavily loaded systems.

This will of course make minimum crawler speed less, but should be more effective at stopping.
2020-11-12 09:17:56 -08:00
Klaus Post
1c3590078d
Skip 0 byte stream writes (#10875)
Don't send a packet when receiving 0 bytes or there is an error recorded
2020-11-11 18:07:40 -08:00
Harshavardhana
aa158228f9
fix: simplify healing metadata objects per set (#10867) 2020-11-11 10:58:16 -08:00
Klaus Post
8747834c69
DeletedObjects: Return objects on lock failure (#10874)
Return objects when locking fails.

<details>
<summary>Panic</summary>

```
: 2020/11/10 04:15:55 http: panic serving 10.10.62.153:44858: runtime error: index out of range [0] with length 0
: goroutine 363537270 [running]:
: net/http.(*conn).serve.func1(0xc019232780)
:         net/http/server.go:1801 +0x147
: panic(0x1cadd60, 0xc001719260)
:         runtime/panic.go:975 +0x47a
: github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP.func1(0xc0121d1200, 0x210cda0, 0xc0141940e0)
:         github.com/minio/minio/cmd/generic-handlers.go:781 +0x1a8
: panic(0x1cadd60, 0xc001719260)
:         runtime/panic.go:969 +0x1b9
: github.com/minio/minio/cmd.objectAPIHandlers.DeleteMultipleObjectsHandler(0x1e71ce8, 0x1e71cc8, 0x2108420, 0xc0192328c0, 0xc0121d1400)
:         github.com/minio/minio/cmd/bucket-handlers.go:465 +0x2490
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.httpTraceAll.func1(0x2108420, 0xc0192328c0, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-utils.go:353 +0x158
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.collectAPIStats.func1(0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-utils.go:380 +0xed
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.maxClients.func1(0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-api.go:132 +0x33b
: net/http.HandlerFunc.ServeHTTP(0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400)
:         net/http/server.go:2042 +0x44
: github.com/minio/minio/cmd.redirectHandler.ServeHTTP(0x20e2180, 0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:192 +0x156
: github.com/minio/minio/cmd.customHeaderHandler.ServeHTTP(0x20e1060, 0xc0141a22b0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:751 +0x162
: github.com/minio/minio/cmd.securityHeaderHandler.ServeHTTP(0x20e0fc0, 0xc0141a22c0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:766 +0x1d6
: github.com/minio/minio/cmd.bucketForwardingHandler.ServeHTTP(0xc0121c7a40, 0x20e1120, 0xc0141a22d0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:624 +0xbf
: github.com/minio/minio/cmd.requestValidityHandler.ServeHTTP(0x20e0f20, 0xc01814d280, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:608 +0x42a
: github.com/minio/minio/cmd.httpStatsHandler.ServeHTTP(0x20e10c0, 0xc0141a2300, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:536 +0xe4
: github.com/minio/minio/cmd.requestSizeLimitHandler.ServeHTTP(0x20e0fe0, 0xc0141a2310, 0x50004000000, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:68 +0xd4
: github.com/minio/minio/cmd.requestHeaderSizeLimitHandler.ServeHTTP(0x20e10a0, 0xc01814d2a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:93 +0x1b7
: github.com/minio/minio/cmd.crossDomainPolicy.ServeHTTP(0x20e1080, 0xc0141a2320, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/crossdomain-xml-handler.go:51 +0x82
: github.com/minio/minio/cmd.browserRedirectHandler.ServeHTTP(0x20e0fa0, 0xc0141a2330, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:276 +0x68
: github.com/minio/minio/cmd.minioReservedBucketHandler.ServeHTTP(0x20e0f00, 0xc0141a2340, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:344 +0xb8
: github.com/minio/minio/cmd.cacheControlHandler.ServeHTTP(0x20e1020, 0xc0141a2350, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:303 +0x1ce
: github.com/minio/minio/cmd.timeValidityHandler.ServeHTTP(0x20e0f40, 0xc0141a2360, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:414 +0x3ca
: github.com/minio/minio/cmd.resourceHandler.ServeHTTP(0x20e1160, 0xc0141a2370, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:516 +0xab
: github.com/minio/minio/cmd.authHandler.ServeHTTP(0x20e1100, 0xc0141a2380, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/auth-handler.go:502 +0x2e7
: github.com/minio/minio/cmd.sseTLSHandler.ServeHTTP(0x20e0ee0, 0xc0141a2390, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:802 +0x79
: github.com/minio/minio/cmd.reservedMetadataHandler.ServeHTTP(0x20e1140, 0xc0141a23a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:139 +0x1b7
: github.com/gorilla/mux.(*Router).ServeHTTP(0xc00073fb00, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/gorilla/mux@v1.8.0/mux.go:210 +0xd3
: github.com/rs/cors.(*Cors).Handler.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/rs/cors@v1.7.0/cors.go:219 +0x1b9
: net/http.HandlerFunc.ServeHTTP(0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2042 +0x44
: github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP(0x20e2180, 0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/minio/minio/cmd/generic-handlers.go:784 +0x85
: github.com/minio/minio/cmd/http.(*Server).Start.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/minio/minio/cmd/http/server.go:101 +0x258
: net/http.HandlerFunc.ServeHTTP(0xc000dc4080, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2042 +0x44
: net/http.serverHandler.ServeHTTP(0xc000764c60, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2843 +0xa3
: net/http.(*conn).serve(0xc019232780, 0x2114720, 0xc03381f6c0)
:         net/http/server.go:1925 +0x8ad
: created by net/http.(*Server).Serve
:         net/http/server.go:2969 +0x36c
```
</details>
2020-11-11 09:14:32 -08:00
Poorna Krishnamoorthy
50c10a5087
Add delete marker replication support (#10396)
Delete marker replication is implemented for V2
configuration specified in AWS spec (though AWS
allows it only in the V1 configuration).

This PR also brings in a MinIO only extension of
replicating permanent deletes, i.e. deletes specifying
version id are replicated to target cluster.
2020-11-10 15:24:14 -08:00
Steven Reitsma
4683a623dc
fix: negative STS IAM token TTL value (#10866) 2020-11-10 12:24:01 -08:00
Klaus Post
06899210a7
Reduce health check output (#10859)
This will make the health check clients 'silent'.
Use `IsNetworkOrHostDown` determine if network is ok so it mimics the functionality in the actual client.
2020-11-10 09:28:23 -08:00
Harshavardhana
cbdab62c1e
fix: heal user/metadata right away upon server startup (#10863)
this is needed such that we make sure to heal the
users, policies and bucket metadata right away as
we do listing based on list cache which only lists
'3' sufficiently good drives, to avoid possibly
losing access to these users upon upgrade make
sure to heal them.
2020-11-10 09:02:06 -08:00
Harshavardhana
8df6112204
fix: avoid divide by zero error single node distributed setup (#10862) 2020-11-09 20:40:39 -08:00
Harshavardhana
97692bc772
re-route requests if IAM is not initialized (#10850) 2020-11-07 21:03:06 -08:00
Steven Reitsma
54120107ce
fix: infinite loop in cleanupStaleUploads of encrypted MPUs (#10845)
fixes #10588
2020-11-06 11:53:42 -08:00
Klaus Post
9bf5990ea9
metadata: Invalidate cache if unreadable and not updating (#10844)
If a scanning server shuts down unexpectedly we may have "successful" caches that are incomplete on a set.

In this case mark the cache with an error so it will no longer be handed out.
2020-11-06 08:54:09 -08:00
Steven Reitsma
74f7cf24ae
fix: s3 gateway SSE pagination (#10840)
Fixes #10838
2020-11-05 15:04:03 -08:00
Harshavardhana
fb28aa847b
fix: add missing deleted key element in multiObjectDelete (#10839)
fixes #10832
2020-11-05 12:47:46 -08:00
Klaus Post
0724205f35
metacache: Add option for life extension (#10837)
Add `MINIO_API_EXTEND_LIST_CACHE_LIFE` that will extend 
the life of generated caches for a while.

This changes caches to remain valid until no updates have been 
received for the specified time plus a fixed margin.

This also changes the caches from being invalidated when the *first* 
set finishes until the *last* set has finished plus the specified time 
has passed.
2020-11-05 11:49:56 -08:00
Harshavardhana
b72cac4cf3
fix: dangling objects on actual namespace (#10822) 2020-11-05 11:48:55 -08:00
Klaus Post
bd77f29fc4
Don't replace caches that are receiving updates (#10834)
Keep caches while they are receiving updates.
Move update code to separate function.
2020-11-05 07:34:08 -08:00
Klaus Post
d1e1205036
metacache: Always close the s2 writer (#10836)
The s2 writer could be leaked if there was an error.

Make sure it is always closed.
2020-11-05 07:30:14 -08:00
Harshavardhana
71753e21e0
add missing TTL for STS credentials on etcd (#10828) 2020-11-04 13:06:05 -08:00
Harshavardhana
fde3299bf3
re-use optimized readdir for isDirEmpty() (#10829)
reduces effective memory usage by an order
of magnitude, also increases performance for
small objects
2020-11-04 13:05:21 -08:00
Harshavardhana
1a1f00fa15
fix: use internode data for DisksInfo, VolsInfo in message pack (#10821)
Similar to #10775 for fewer memory allocations, since we use
getOnlineDisks() extensively for listing we should optimize it
further.

Additionally, remove all unused walkers from the storage layer
2020-11-04 10:10:54 -08:00
Bill Thorp
4a1efabda4
Context based AccessKey passing (#10615)
A new field called AccessKey is added to the ReqInfo struct and populated.
Because ReqInfo is added to the context, this allows the AccessKey to be
accessed from 3rd-party code, such as a custom ObjectLayer.

Co-authored-by: Harshavardhana <harsha@minio.io>
Co-authored-by: Kaloyan Raev <kaloyan@storj.io>
2020-11-04 09:13:34 -08:00
Klaus Post
3b88a646ec
Add remote online/offline information (#10825)
Log information about remote clients being marked offline.

This will help to identify root causes of failures.
2020-11-04 08:27:32 -08:00
Klaus Post
2294e53a0b
Don't retain context in locker (#10515)
Use the context for internal timeouts, but disconnect it from outgoing 
calls so we always receive the results and cancel it remotely.
2020-11-04 08:25:42 -08:00
Klaus Post
f0819cce75
Keep transient lists while they are updating (#10826)
On extremely long running listings keep the transient list 15 minutes after last update instead of using start time.

Also don't do overlap checks on transient lists.
2020-11-04 08:01:33 -08:00
Klaus Post
1e11b4629f
Add remote Diskinfo caching (#10824)
Add 1 second remote disk info cache.

Should decrease need for remote calls a great deal due to how actively it is used now.
2020-11-04 08:00:18 -08:00
Harshavardhana
5c72a34fa8
fix: honor delimiter as per AWS S3 spec (#10823) 2020-11-04 07:56:58 -08:00
Klaus Post
b9277c8030
metacache: Add trashcan (#10820)
Add trashcan that keeps recently updated lists after bucket deletion.
All caches were deleted once a bucket was deleted, so caches still running would report errors. Now they are canceled.
Fix `.minio.sys` not being transient.
2020-11-03 12:47:52 -08:00
Harshavardhana
8c76e1353e
initialize IAM after etcd has initialized (#10819) 2020-11-03 12:12:30 -08:00
Harshavardhana
ad382799b1
use list cache for Walk() with webUI and quota (#10814)
bring list cache optimizations for web UI
object listing, also FIFO quota enforcement
through list cache as well.
2020-11-03 08:53:48 -08:00
Harshavardhana
68de5a6f6a
fix: IAM store fallback to list users and policies from disk (#10787)
Bonus fixes, remove package retry it is harder to get it
right, also manage context remove it such that we don't have
to rely on it anymore instead use a simple Jitter retry.
2020-11-02 17:52:13 -08:00
Harshavardhana
4ea31da889
fix: move list quorum ENV to config (#10804) 2020-11-02 17:21:56 -08:00
Klaus Post
0a796505c1
metacache: Check only one disk for updates (#10809)
Check only one disk for updates.

This will reduce IO while waiting for lists to finish.
2020-11-02 17:20:27 -08:00
Klaus Post
37749f4623
Optimize FileInfo(Version) transfer (#10775)
File Info decoding, in particular, is showing up as a major 
allocator and time consumer for internode data transfers

Switch to message pack for cross-server transfers:

```
MSGP:

Size: 945 bytes

BenchmarkEncodeFileInfoMsgp-32    	 1558444	       866 ns/op	   1.16 MB/s	       0 B/op	       0 allocs/op
BenchmarkDecodeFileInfoMsgp-32    	  479968	      2487 ns/op	   0.40 MB/s	     848 B/op	      18 allocs/op

GOB:

Size: 1409 bytes

BenchmarkEncodeFileInfoGOB-32    	  333339	      3237 ns/op	   0.31 MB/s	     576 B/op	      19 allocs/op
BenchmarkDecodeFileInfoGOB-32    	   20869	     57837 ns/op	   0.02 MB/s	   16439 B/op	     428 allocs/op
```
2020-11-02 17:07:52 -08:00
Klaus Post
86e0d272f3
Reduce WriteAll allocs (#10810)
WriteAll saw 127GB allocs in a 5 minute timeframe for 4MiB buffers 
used by `io.CopyBuffer` even if they are pooled.

Since all writers appear to write byte buffers, just send those 
instead and write directly. The files are opened through the `os` 
package so they have no special properties anyway.

This removes the alloc and copy for each operation.

REST sends content length so a precise alloc can be made.
2020-11-02 16:14:31 -08:00