Commit Graph

5752 Commits

Author SHA1 Message Date
ebozduman b154581b65
fix: partially defined cred env vars cause "minio gateway s3" to fail (#12228)
Both credential env vars not needed to start s3 gateway
2021-06-10 22:28:09 -07:00
Anis Elleuch ba5fb2365c
feat: support of ZIP list/get/head as S3 extension (#12267)
When enabled, it is possible to list/get files
inside a zip file without uncompressing it.

Signed-off-by: Anis Elleuch <anis@min.io>
2021-06-10 08:17:03 -07:00
Harshavardhana a93aa2eac1
fix: upon failure attempt an undo for all calls in DeleteBucket() (#12480)
its possible that, version might exist on second pool such that
upon deleteBucket() might have deleted the bucket on pool1 successfully
since it doesn't have any objects, undo such operations properly in
all any error scenario.

Also delete bucket metadata from pool layer rather than sets layer.
2021-06-09 17:13:00 -07:00
Harshavardhana 0980554725
fix: getServerPoolsAvailableSpace() shouldn't crash (#12478)
if one of the disk is offline then DiskInfo can be `nil`
and crash in server pool.
2021-06-09 11:14:47 -07:00
Anis Elleuch 8e9e028c0c
fix: safe update of the audit objectErasureMap (#12477)
objectErasureMap in the audit holds information about the objects
involved in the current S3 operation such as pool index, set an index,
and disk endpoints. One user saw a crash due to a concurrent update of
objectErasureMap information. Use sync.Map to prevent a crash.
2021-06-09 10:51:19 -07:00
Harshavardhana af6366e102 fix: allow GetBucketLifecycle in NAS gateway 2021-06-09 08:48:07 -07:00
Harshavardhana 66d549c05d
remove support for deprecated MINIO_KMS_MASTER_KEY (#12463) 2021-06-08 18:50:14 -07:00
Anis Elleuch 6c8be64cdb
rest: healthcheck should not update failure metrics (#12458)
Otherwise, we can see high numbers of networking issues when a node is
down.
2021-06-08 14:09:26 -07:00
Klaus Post 9a2102f5ed
Always get actual size in CopyObjectPart (#12466)
Always use `GetActualSize` to get the part size, not just when encrypted.

Fixes mint test io.minio.MinioClient.uploadPartCopy, 
error "Range specified is not valid for source object".
2021-06-08 09:51:55 -07:00
Harshavardhana 542fe4ea2e
fix: legacy objects with 10MiB blockSize should use right buffers (#12459)
healing code was using incorrect buffers to heal older
objects with 10MiB erasure blockSize, incorrect calculation
of such buffers can lead to incorrect premature closure of
io.Pipe() during healing.

fixes #12410
2021-06-07 10:06:06 -07:00
Harshavardhana dd2831c1a0
fix: remove parent dirs in RenameData upon failure (#12452)
- it is possible that during I/O failures we might
  leave partially written directories, make sure
  we purge them after.

- rename current data-dir (null) versionId only after
  the newer xl.meta has been written fully.

- attempt removal once for minioMetaTmpBucket/uuid/
  as this folder is empty if all previous operations
  were successful, this allows avoiding recursive os.Remove()
2021-06-07 09:35:08 -07:00
Klaus Post 403f4b9c84
Improve disk usage calculation (#12376)
- for single pool setups usage is not checked.
- for pools, only check the "set" in which it would be placed.
- keep a minimum number of inodes (when we know it).
- ignore for `.minio.sys`.
2021-06-07 08:13:15 -07:00
Anis Elleuch 810af07529
xl: Avoid multi-disks node to exit when one disk fails (#12423)
It makes sense that a node that has multiple disks starts when one
disk fails, returning an i/o error for example. This commit will make this
faulty tolerance available in this specific use case.
2021-06-05 09:10:32 -07:00
Poorna Krishnamoorthy f199afcd6c
tiering: add aws role support for s3 (#12424)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-06-04 12:47:00 -07:00
Harshavardhana 36b2f6d11d
fix: etcd IAM encryption fails due to incorrect kms.Context (#12431)
Due to incorrect KMS context constructed, we need to add
additional fallbacks and also fix the original root cause
to fix already migrated deployments.

Bonus remove double migration is avoided in gateway mode
for etcd, instead do it once in iam.Init(), also simplify
the migration by not migrating STS users instead let the
clients regenerate them.
2021-06-04 11:15:13 -07:00
Klaus Post d524544494
Fix nil disk check in parity upgrade feature (#12444)
Fixes #12443
2021-06-04 09:38:19 -07:00
Harshavardhana c0e79e28b2
fix: close the channel appropriately for dataUsageEntry (#12432)
Bonus: initialize dataScanner routines after server
config has initialized.

fixes #12430
2021-06-03 19:18:59 -07:00
Anis Elleuch 3109441258
s3: Return correct error XML tag in case of copy object (#12427)
In Copy Object S3 API, the server does not return correct bucket &
object names when the source bucket/object does not exist, this commit
fixes it.
2021-06-03 17:25:31 -07:00
Aditya Manthramurthy 30a3921d3e
[Tiering] Support remote tiers with object versioning (#12342)
- Adds versioning support for S3 based remote tiers that have versioning
enabled. This ensures that when reading or deleting we specify the specific
version ID of the object. In case of deletion, this is important to ensure that
the object version is actually deleted instead of simply being marked for
deletion.

- Stores the remote object's version id in the tier-journal. Tier-journal file
version is not bumped up as serializing the new struct version is
compatible with old journals without the remote object version id.

- `storageRESTVersion` is bumped up as FileInfo struct now includes a
`TransitionRemoteVersionID` member.

- Azure and GCS support for this feature will be added subsequently.

Co-authored-by: Krishnan Parthasarathi <krisis@users.noreply.github.com>
2021-06-03 14:26:51 -07:00
Shireesh Anjal fb140c146b
Redact sensitive values from config in health data (#12421)
The health api returns the server configuration details. Redact
sensitive values from the config values like URLs and credentials.
2021-06-03 08:15:44 -07:00
Harshavardhana 7a3b5235bf remove deprecated kms_vault unused key name 2021-06-03 00:10:11 -07:00
Poorna Krishnamoorthy dbea8d2ee0
Add support for existing object replication. (#12109)
Also adding an API to allow resyncing replication when
existing object replication is enabled and the remote target
is entirely lost. With the `mc replicate reset` command, the
objects that are eligible for replication as per the replication
config will be resynced to target if existing object replication
is enabled on the rule.
2021-06-01 19:59:11 -07:00
Harshavardhana 1f262daf6f
rename all remaining packages to internal/ (#12418)
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`
2021-06-01 14:59:40 -07:00
Harshavardhana bf87c4b1e4
fix: no need to proxy if IAM not initialized (#12416)
IAM not initialized doesn't mean we can't still
read the content from the disk, we should just
allow the request to go-through if object layer
is initialized.
2021-06-01 12:23:13 -07:00
Harshavardhana 7148c2490a
avoid metrics not meant for single drive mode (#12415)
fixes #12414
2021-06-01 12:22:42 -07:00
Bala FA 120951d9e9
Refactor health data structure (#11914)
This feature comes with simplified data structures and versioning support.

Signed-off-by: Bala.FA <bala.gluster@gmail.com>
2021-06-01 08:55:49 -07:00
Anis Elleuch 8347db8be3
sts: Map parent user to the STS access key policy (#12411) 2021-06-01 08:37:42 -07:00
Poorna Krishnamoorthy 3690de0c6b
Drop Pending size and count from replication metrics (#12378)
Real-time metrics calculated in-memory rely on the initial
replication metrics saved with data usage. However, this can
lag behind the actual state of the cluster at the time of server 
restart leading to inaccurate Pending size/counts reported to
Prometheus. Dropping the Pending metrics as this can be more 
reliably monitored by applications with replication notifications.

Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-05-31 20:26:52 -07:00
Harshavardhana fdc2020b10
move to iam, bucket policy from minio/pkg (#12400) 2021-05-29 21:16:42 -07:00
Harshavardhana 3350dbc50d
always indent and reply policy JSON (#12399) 2021-05-29 09:22:22 -07:00
Harshavardhana 81d5688d56
move the dependency to minio/pkg for common libraries (#12397) 2021-05-28 15:17:01 -07:00
Poorna Krishnamoorthy 547bb7d0a1
replication: Init worker kill channel correctly (#12379)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-05-28 13:28:37 -07:00
Harshavardhana 4444ba13a4
support ldap:username for policy substitution (#12390)
LDAPusername is the simpler form of LDAPUser (userDN),
using a simpler version is convenient from policy
conditions point of view, since these are unique id's
used for LDAP login.
2021-05-28 10:33:07 -07:00
Harshavardhana fa8e3151bc
fix: move to new etcd imports (#12391) 2021-05-28 10:31:42 -07:00
Harshavardhana 89bb9f17d7
fix: when parityDrives hits > len(storageDisks)/2, keep maxParity (#12387)
Additionally move out `x-minio-internal-erasure-upgraded` from HTTP headers
list, as its an internal header, rename elsewhere accordingly.
2021-05-27 13:38:04 -07:00
Klaus Post acc452b7ce
Add more erasure codes on degraded systems. (#11852)
In cases where a cluster is degraded, we do not uphold our consistency 
guarantee and we will write fewer erasure codes and rely on healing 
to recreate the missing shards.

In some cases replacing known bad disks in practice take days.
We want to change the behavior of a known degraded system to keep
the erasure code promise of the storage class for each object.

This will create the objects with the same confidence as a fully 
functional cluster. The tradeoff will be that objects created 
during a partial outage will take up slightly more space.

This means that when the storage class is EC:4, there should 
always be written 4 parity shards, even if some disks are unavailable.

When an object is created on a set, the disks are immediately 
checked. If any disks are unavailable additional parity shards 
will be made for each offline disk, up to 50% of the number of disks.

We add an internal metadata field with the actual and intended 
erasure code level, this can optionally be picked up later by 
the scanner if we decide that data like this should be re-sharded.
2021-05-27 11:38:09 -07:00
Harshavardhana be541dba8a
feat: introduce listUsers, listPolicies for any bucket (#12372)
Bonus change LDAP settings such as user, group mappings
are now listed as part of `mc admin user list` and
`mc admin group list`

Additionally this PR also deprecates the `/v2` API
that is no longer in use.
2021-05-27 10:15:02 -07:00
Harshavardhana b5ebfd35b4
fix: always prefer DataBlocks present in FileInfo (#12386) 2021-05-27 10:11:50 -07:00
Anis Elleuch 530b703902
audit/logger: Increase http request timeout (#12385)
A configured audit logger or HTTP logger is validated during MinIO
server startup. Relax the timeout to 10 seconds in that case, otherwise,
both loggers won't be used.

1 second could be too low for a busy HTTP endpoint.
2021-05-27 09:54:10 -07:00
Andreas Auernhammer e8a12cbfdd
etag: compute ETag as MD5 for compressed single-part objects (#12375)
This commit fixes a bug causing the MinIO server to compute
the ETag of a single-part object as MD5 of the compressed
content - not as MD5 of the actual content.

This usually does not affect clients since the MinIO appended
a `-1` to indicate that the ETag belongs to a multipart object.
However, this behavior was problematic since:
 - A S3 client being very strict should reject such an ETag since
   the client uploaded the object via single-part API but got
   a multipart ETag that is not the content MD5.
 - The MinIO server leaks (via the ETag) that it compressed the
   object.

This commit addresses both cases. Now, the MinIO server returns
an ETag equal to the content MD5 for single-part objects that got
compressed.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-27 08:18:41 -07:00
Anis Elleuch e63908c391
Update bloom module (#12383)
To fix dependency import issues when importing madmin-go v0.7.1
2021-05-27 08:02:39 -07:00
Harshavardhana b251ae5f3d fix: update default values for listing, replication workers 2021-05-26 11:55:46 -07:00
Anis Elleuch 0e80b5fe63
tests: Add test for upload of the same object inlined and not inlined (#12374)
Upload an object smaller than small file threshold and upload another file bigger
than small file threshold and tries to read it.
2021-05-26 08:09:23 -07:00
Harshavardhana 225d8c51fd
fix: missing path in admin trace (#12373)
PR #12360 introduced a change which seems to have
added a regression, the RawPath in r.URL seems to
be empty, if it is fallback to r.URL.Path instead.
2021-05-26 08:04:12 -07:00
Klaus Post 3fff50120b
Revert heal locks (#12365)
A lot of healing is likely to be on non-existing objects and 
locks are very expensive and will slow down scanning 
significantly.

In cases where all are valid or, all are broken allow 
rejection without locking.

Keep the existing behavior, but move the check for 
dangling objects to after the lock has been acquired.

```
	_, err = getLatestFileInfo(ctx, partsMetadata, errs)
	if err != nil {
		return er.purgeObjectDangling(ctx, bucket, object, versionID, partsMetadata, errs, []error{}, opts)
	}
```

Revert "heal: Hold lock when reading xl.meta from disks (#12362)"

This reverts commit abd32065aa
2021-05-25 17:02:06 -07:00
Harshavardhana 4840974d7a
fix: inline data upon overwrites should be readable (#12369)
This PR fixes two bugs

- Remove fi.Data upon overwrite of objects from inlined-data to non-inlined-data
- Workaround for an existing bug on disk with latest releases to ignore fi.Data
  and instead read from the disk for non-inlined-data
- Addtionally add a reserved metadata header to indicate data is inlined for
  a given version.
2021-05-25 16:33:06 -07:00
Harshavardhana 4fd1378242
fix: lint errors after upgrading golangci-lint (#12368) 2021-05-25 14:17:33 -07:00
Harshavardhana ed4941a5f3
fix: calculate dataBlocks properly in healing (#12364) 2021-05-25 09:34:27 -07:00
Harshavardhana cacdeca8cc
fix: return error for unexpected quorum in pickValidFileInfo (#12363) 2021-05-24 18:31:56 -07:00
Anis Elleuch abd32065aa
heal: Hold lock when reading xl.meta from disks (#12362)
Lock is hold in healObject() after reading xl.meta from disks the first
time. This commit will held the lock since the beginning of HealObject()

Co-authored-by: Anis Elleuch <anis@min.io>
2021-05-24 13:39:38 -07:00
Harshavardhana 2baabd455b docs: fix per tenant limits docs formatting 2021-05-24 09:37:17 -07:00
Harshavardhana ebf75ef10d
fix: remove all unused code (#12360) 2021-05-24 09:28:19 -07:00
Klaus Post f01820a4ee
fix: invalid multipart offset when compressed+encrypted. (#12340)
Fixes `testSSES3EncryptedGetObjectReadSeekFunctional` mint test.

```
{
  "args": {
    "bucketName": "minio-go-test-w53hbpat649nhvws",
    "objectName": "6mdswladz4vfpp2oit1pkn3qd11te5"
  },
  "duration": 7537,
  "error": "We encountered an internal error, please try again.: cause(The requested range \"bytes 251717932 -> -116384170 of 135333762\" is not satisfiable.)",
  "function": "GetObject(bucketName, objectName)",
  "message": "CopyN failed",
  "name": "minio-go: testSSES3EncryptedGetObjectReadSeekFunctional",
  "status": "FAIL"
}
```

Compressed files always start at the beginning of a part so no additional offset should be added.
2021-05-21 14:07:16 -07:00
Harshavardhana 0287711dc9
fix: implement readMetadata common function for re-use (#12353)
Previous PR #12351 added functions to read from the reader
stream to reduce memory usage, use the same technique in
few other places where we are not interested in reading the
data part.
2021-05-21 11:41:25 -07:00
Klaus Post 9d1b6fb37d
Add XL reader without data (#12351)
Add XL metadata reader that reads metadata only on larger files.

Use for scanning and listing for now.
2021-05-21 09:10:54 -07:00
Harshavardhana 32d8a48d4e
reduce memory usage in metacache reader (#12334) 2021-05-20 09:00:11 -07:00
Harshavardhana 6060b755c6
fix: migrate users properly from older releases to newer (#12333) 2021-05-19 19:25:44 -07:00
Krishnan Parthasarathi cfa94cc35c
Simplify remote tier validation in lifecycle rule validation (#12329) 2021-05-19 18:51:23 -07:00
Klaus Post 2ca9c533ef
feat: implement in-progress partial bucket updates (#12279) 2021-05-19 14:38:30 -07:00
Anis Elleuch 866593fd94
heal: Ignore disks with non quorum modtime and dataDir (#12328) 2021-05-19 12:04:08 -07:00
Harshavardhana ecb5525c91
fix: muxing order for rejected APIs (#12321) 2021-05-19 09:21:34 -07:00
Klaus Post c2c803dd30
Fix list entry deduplication (#12325)
File infos would always be the same.

Add numversions as a final tiebreaker.
2021-05-19 09:21:18 -07:00
Harshavardhana 4f5d75f22b
fix: speed up drive mux registration (#12319)
in setups with lots of drives the server
startup is slow, initialize all local drives
in parallel before registering with muxer.

this speeds up when there are multiple pools
and large collection of drives.
2021-05-18 17:25:00 -07:00
Harshavardhana bb7fbcdc09
fix: generating service accounts for group only LDAP accounts (#12318)
fixes #12315
2021-05-18 15:19:20 -07:00
Andreas Auernhammer 82c53ac260
sse-kms: set KMS key ID response header (#12316)
This commit adds the `X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id`
response header to the GET, HEAD, PUT and Download API.

Based on AWS documentation [1] AWS S3 returns the KMS key ID as part
of the response headers.

[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-kms-encryption.html

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-18 14:21:20 -07:00
Harshavardhana a70e0da19e
use direntPool, direntNamePool for reusable buffers (#12314)
- in readDirFn re-use buffers from direntPool()
- in readDirN use separate dirent name buffer direntNamePool()
2021-05-18 10:29:50 -07:00
Harshavardhana c6b7dc012a
fix: use key.Ciphertext for DecryptKey in KeyStatus (#12313)
enhance GlobalKMS.Stat() for kes to actually perform
a network call to check Version() of kes and also
implicitly that its reachable.
2021-05-18 07:22:31 -07:00
Harshavardhana 2daba018d6
reduce allocations on multi-disk clusters (#12311)
multi-disk clusters initialize buffer pools
per disk, this is perhaps expensive and perhaps
not useful, for a running server instance. As this
may disallow re-use of buffers across sets,

this change ensures that buffers across sets
can be re-used at drive level, this can reduce
quite a lot of memory on large drive setups.
2021-05-17 17:49:48 -07:00
Harshavardhana d610578d84
fix: de-couple IAM migration and loading context from lock context (#12312)
fixes #12307
2021-05-17 16:50:47 -07:00
Harshavardhana a096a92c63
add io.ErrUnexpectedEOF for config retriable errors (#12309)
fixes #12307
2021-05-17 15:13:14 -07:00
Harshavardhana 3d9873106d
feat: distributed setup can start now with default credentials (#12303)
In lieu of new changes coming for server command line, this
change is to deprecate strict requirement for distributed setups
to provide root credentials.

Bonus: remove MINIO_WORM warning from April 2020, it is time to
remove this warning.
2021-05-17 08:45:22 -07:00
Klaus Post cde6469b88
Fix hanging erasure writes (#12253)
However, this slice is also used for closing the writers, so close is never called on these.

Furthermore when an error is returned from a write it is now reported to the reader.

bonus: remove unused heal param from `newBitrotWriter`.

* Remove copy, now that we don't mutate.
2021-05-17 08:32:28 -07:00
Klaus Post 55375fa7f6
Update probabilities for bloom filter. (#12305)
See https://github.com/minio/minio/discussions/12285

Results in M=958506 K=7 and 119840 bytes per filter when serialized compared to 26176 bytes before.
2021-05-17 08:31:04 -07:00
Harshavardhana f1e479d274
remove more duplicate bloom filter trackers (#12302)
At some places bloom filter tracker was getting
updated for `.minio.sys/tmp` bucket, there is no
reason to update bloom filters for those.

And add a missing bloom filter update for MakeBucket()

Bonus: purge unused function deleteEmptyDir()
2021-05-17 08:25:48 -07:00
Harshavardhana 2ab9dc7609 do not update bloomFilters for temporary objects 2021-05-15 19:54:07 -07:00
Harshavardhana 4d876d03e8 fix: do not fail upon faulty/non-writable drives
gracefully start the server, if there are other drives
available - print enough information for administrator
to notice the errors in console.

Bonus: for really large streams use larger buffer for
writes.
2021-05-15 12:57:18 -07:00
Harshavardhana d84261aa6d
fix: ensure proper usage of DataDir (#12300)
- GetObject() should always use a common dataDir to
  read from when it starts reading, this allows the
  code in erasure decoding to have sane expectations.

- Healing should always heal on the common dataDir, this
  allows the code in dangling object detection to purge
  dangling content.

These both situations can happen under certain types of
retries during PUT when server is restarting etc, some
namespace entries might be left over.
2021-05-14 16:50:47 -07:00
Harshavardhana 5b18c57a54
fix: for deleteBucket delete on dnsStore first (#12298)
attempt a delete on remote DNS store first before
attempting locally, because removing at DNS store
is cheaper than deleting locally, in case of
errors locally we can cheaply recreate the
bucket on dnsStore instead of.
2021-05-14 12:40:54 -07:00
Andreas Auernhammer a1f70b106f
sse: add support for SSE-KMS bucket configurations (#12295)
This commit adds support for SSE-KMS bucket configurations.
Before, the MinIO server did not support SSE-KMS, and therefore,
it was not possible to specify an SSE-KMS bucket config.

Now, this is possible. For example:
```
mc encrypt set sse-kms some-key <alias>/my-bucket
```

Further, this commit fixes an issue caused by not supporting
SSE-KMS bucket configuration and switching to SSE-KMS as default
SSE method.

Before, the server just checked whether an SSE bucket config was
present (not which type of SSE config) and applied the default
SSE method (which was switched from SSE-S3 to SSE-KMS).

This caused objects to get encrypted with SSE-KMS even though a
SSE-S3 bucket config was present.

This issue is fixed as a side-effect of this commit.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-14 00:59:05 -07:00
Poorna Krishnamoorthy 951acf561c
Add support for syncing replica modifications (#11104)
when bidirectional replication is set up.

If ReplicaModifications is enabled in the replication
configuration, sync metadata updates to source if
replication rules are met. By default, if this
configuration is unset, MinIO automatically sync's
metadata updates on replica back to the source.
2021-05-13 19:20:45 -07:00
Harshavardhana 397391c89f
fix: parentUser mapped policy for OIDC creds (#12293)
missing parentUser for OIDC STS creds can lead
to fail to authenticate, this PR attempts to
fix the parentUser policy map for distributed
setups.
2021-05-13 16:21:06 -07:00
Andreas Auernhammer 9cd9f5a0b3
check that we can reach KES server and that the default key exists (#12291)
This commit adds a check to the MinIO server setup that verifies
that MinIO can reach KES, if configured, and that the default key
exists. If the default key does not exist it will create it
automatically.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-13 11:13:31 -07:00
Harshavardhana 5c0a7189c7
fix: LDAP authentication with groups only (#12283)
fixes #12282
2021-05-12 21:25:07 -07:00
Harshavardhana 57aed841dd
do not return error for usage-cache version v4 (#12276) 2021-05-12 08:07:02 -07:00
Klaus Post 229d83bb75
feat: add dynamic usage cache (#12229)
A cache structure will be kept with a tree of usages.
The cache is a tree structure where each keeps track 
of its children.

An uncompacted branch contains a count of the files 
only directly at the branch level, and contains link to 
children branches or leaves.

The leaves are "compacted" based on a number of properties.
A compacted leaf contains the totals of all files beneath it.

A leaf is only scanned once every dataUsageUpdateDirCycles,
rarer if the bloom filter for the path is clean and no lifecycles 
are applied. Skipped leaves have their totals transferred from 
the previous cycle.

A clean leaf will be included once every healFolderIncludeProb 
for partial heal scans. When selected there is a one in 
healObjectSelectProb that any object will be chosen for heal scan.

Compaction happens when either:

- The folder (and subfolders) contains less than dataScannerCompactLeastObject objects.
- The folder itself contains more than dataScannerCompactAtFolders folders.
- The folder only contains objects and no subfolders.
- A bucket root will never be compacted.

Furthermore, if a has more than dataScannerCompactAtChildren recursive 
children (uncompacted folders) the tree will be recursively scanned and the 
branches with the least number of objects will be compacted until the limit 
is reached.

This ensures that any branch will never contain an unreasonable amount 
of other branches, and also that small branches with few objects don't 
take up unreasonable amounts of space.

Whenever a branch is scanned, it is assumed that it will be un-compacted
before it hits any of the above limits. This will make the branch rebalance 
itself when scanned if the distribution of objects has changed.

TLDR; With current values: No bucket will ever have more than 10000 
child nodes recursively. No single folder will have more than 2500 child 
nodes by itself. All subfolders are compacted if they have less than 500 
objects in them recursively.

We accumulate the (non-deletemarker) version count for paths as well, 
since we are changing the structure anyway.
2021-05-11 18:36:15 -07:00
Harshavardhana fe21aa356c
fix: if targetUser empty use parentUser for serviceAccounts (#12275) 2021-05-11 13:02:00 -07:00
Anis Elleuch 56d4d7b8b1
MRF: Better detection of non stable disks (#12252)
MRF does not detect when a node is disconnected and reconnected quickly
this change will ensure that MRF is alerted by comparing the last disk
reconnection timestamp with the last MRF check time.

Signed-off-by: Anis Elleuch <anis@min.io>

Co-authored-by: Klaus Post <klauspost@gmail.com>
2021-05-11 09:19:15 -07:00
Harshavardhana e84f533c6c
add missing wait groups for certain io.Pipe() usage (#12264)
wait groups are necessary with io.Pipes() to avoid
races when a blocking function may not be expected
and a Write() -> Close() before Read() races on each
other. We should avoid such situations..

Co-authored-by: Klaus Post <klauspost@gmail.com>
2021-05-11 09:18:37 -07:00
Anis Elleuch 0b34dfb479
lock: Timeout Unlock RPC call (#12213)
RPC unlock call needs to be timed out otherwise this can block
indefinitely.

Signed-off-by: Anis Elleuch <anis@min.io>
2021-05-11 02:11:29 -07:00
Harshavardhana b81fada834
use json unmarshal/marshal from jsoniter in hotpaths (#12269) 2021-05-11 02:02:32 -07:00
Andreas Auernhammer d8eb7d3e15
kms: replace KES client implementation with minio/kes (#12207)
This commit replaces the custom KES client implementation
with the KES SDK from https://github.com/minio/kes

The SDK supports multi-server client load-balancing and
requests retry out of the box. Therefore, this change reduces
the overall complexity within the MinIO server and there
is no need to maintain two separate client implementations.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-10 18:15:11 -07:00
Andreas Auernhammer c03a06cca8
config: enforce AES-GCM in FIPS mode (#12265)
This commit enforces the usage of AES-256
for config and IAM data en/decryption in FIPS
mode.

Further, it improves the implementation of
`fips.Enabled` by making it a compile time
constant. Now, the compiler is able to evaluate
the any `if fips.Enabled { ... }` at compile time
and eliminate unused code.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-10 08:24:11 -07:00
Harshavardhana 2d79d6d847
fix: do not niladic p.writers upon failure (#12255)
p.writers is a verbatim value of bitrotWriter
backed by a pipe() that should never be nil'ed,
instead use the captured errors to skip the writes.

additionally detect also short writes, and reject
them as errors.
2021-05-10 08:20:23 -07:00
Harshavardhana 8b52d70012
fix: IAM not initialized then checkKeyValid() should return 503s (#12260)
currently GetUser() returns 403 when IAM is not initialized
this can lead to applications crashing, instead return 503
so that the applications can retry and backoff.

fixes #12078
2021-05-09 08:14:19 -07:00
Harshavardhana 39d681a04a update fsSimpleRenameFile contrib 2021-05-08 22:31:41 -07:00
Harshavardhana 764721e2c6
add root_disk threshold detection (#12259)
as there is no automatic way to detect if there
is a root disk mounted on / or /var for the container
environments due to how the root disk information
is masked inside overlay root inside container.

this PR brings an environment variable to set
root disk size threshold manually to detect the
root disks in such situations.
2021-05-08 15:40:29 -07:00
Andreas Auernhammer adaae26bbc
sse-kms: fix single-part object decryption (#12257)
This commit fixes a bug in the single-part object decryption
that is triggered in case of SSE-KMS. Before, it was assumed
that the encryption is either SSE-C or SSE-S3. In case of SSE-KMS
the SSE-C branch was executed. This lead to an invalid SSE-C
algorithm error.

This commit fixes this by inverting the `if-else` logic.
Now, the SSE-C branch only gets executed when SSE-C headers
are present.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-07 14:40:57 -07:00
Andreas Auernhammer 0ba8c0a19b
sse-kms: fix assignment to potential nil map (#12250)
This commit fixes a bug introduced by af0c65b.
When there is no / an empty client-provided SSE-KMS
context the `ParseMetadata` may return a nil map
(`kms.Context`).

When unsealing the object key we must check that
the context is nil before assigning a key-value pair.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-07 09:16:49 -07:00
Anis Elleuch cb0b36f8c2
svcacct: Fix updating service account and add missing check (#12251)
UpdateServiceAccount ignores updating fields when not passed from upper
layer, such as empty policy, empty account status, and empty secret key.

This PR will check for a secret key only if it is empty and add more
check on the value of the account status.

Signed-off-by: Anis Elleuch <anis@min.io>
2021-05-07 09:13:30 -07:00
Klaus Post 254698f126
fix: minor allocation improvements in xlMetaV2 (#12133) 2021-05-07 09:11:05 -07:00
Krishnan Parthasarathi 0bab1c1895
Heal restored object contents on disk (#12238) 2021-05-06 16:06:57 -07:00
Harshavardhana d495cb68d3
fix: crash in prometherus metrics collector (#12244)
node_health metrics crashes in gateway mode,
in gateway mode ignore node health metrics.

fixes #12243
2021-05-06 15:43:34 -07:00
Andreas Auernhammer af0c65be93
add SSE-KMS support and use SSE-KMS for auto encryption (#12237)
This commit adds basic SSE-KMS support.
Now, a client can specify the SSE-KMS headers
(algorithm, optional key-id, optional context)
such that the object gets encrypted using the
SSE-KMS method. Further, auto-encryption now
defaults to SSE-KMS.

This commit does not try to do any refactoring
and instead tries to implement SSE-KMS as a minimal
change to the code base. However, refactoring the entire
crypto-related code is planned - but needs a separate
effort.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-06 15:24:01 -07:00
Nitish Tiwari 776589f0da Add free inode metric for Prometheus (#12225) 2021-05-06 12:50:48 -07:00
Harshavardhana 361940706d
fix: avoid races in NewMultipartUpload under multiple pools (#12233)
It is possible in some scenarios that in multiple pools,
two concurrent calls for the same object as a multipart operation
can lead to duplicate entries on two different pools.

This PR fixes this

- hold locks to serialize multiple callers so that we don't race.
- make sure to look for existing objects on the namespace as well
  not just for existing uploadIDs
2021-05-06 10:45:33 -07:00
Harshavardhana 1aa5858543
move madmin to github.com/minio/madmin-go (#12239) 2021-05-06 08:52:02 -07:00
Harshavardhana f4623ea8dc fix: validate secret key before updating service accounts 2021-05-05 16:41:47 -07:00
Harshavardhana b8833c2947 do not change targetUser after permission validation
for service accounts make sure that targetUser is
always the one that is presented/validated from
the incoming request, not the parentUser.
2021-05-05 16:13:52 -07:00
Anis Elleuch af1b6e3458
iam: Do not create service accounts for non existant IAM users (#12236)
When running MinIO server without LDAP/OpenID, we should error out when
the code tries to create a service account for a non existant regular
user.

Bonus: refactor the check code to be show all cases more clearly

Signed-off-by: Anis Elleuch <anis@min.io>

Co-authored-by: Anis Elleuch <anis@min.io>
2021-05-05 16:04:50 -07:00
Harshavardhana 0eeb0a4e04 Revert "add SSE-KMS support and use SSE-KMS for auto encryption (#11767)"
This reverts commit 26f1fcab7d.
2021-05-05 15:20:46 -07:00
Andreas Auernhammer 26f1fcab7d
add SSE-KMS support and use SSE-KMS for auto encryption (#11767)
This commit adds basic SSE-KMS support.
Now, a client can specify the SSE-KMS headers
(algorithm, optional key-id, optional context)
such that the object gets encrypted using the
SSE-KMS method. Further, auto-encryption now
defaults to SSE-KMS.

This commit does not try to do any refactoring
and instead tries to implement SSE-KMS as a minimal
change to the code base. However, refactoring the entire
crypto-related code is planned - but needs a separate
effort.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
Co-authored-by: Klaus Post <klauspost@gmail.com>
2021-05-05 11:24:14 -07:00
Harshavardhana 3a0e7347ca
support startTLS with serverName TLSConfig (#12219)
fixes #12216
2021-05-04 20:13:24 -07:00
Harshavardhana 67001e3ce9
fix: allow root credentials to generate STS, service accounts (#12210) 2021-05-04 11:58:19 -07:00
Harshavardhana 804a23a06d update docs to remove _OLD credential references
also update the docs about config, IAM on encryption.
2021-05-04 10:27:51 -07:00
Nitish Tiwari c8aa56ccd7
Add node cpu & memory metrics to Prometheus cluster endpoint (#12214) 2021-05-04 10:17:10 -07:00
Harshavardhana ff36baeaa7
fix: attempt to drain the ReadFileStream for connection pooling (#12208)
avoid time_wait build up with getObject requests if there are
pending callers and they timeout, can lead to time_wait states

Bonus share the same buffer pool with erasure healing logic,
additionally also fixes a race where parallel readers were
never cleanup during Encode() phase, because pipe.Reader end
was never closed().

Added closer right away upon an error during Encode to make
sure to avoid racy Close() while stream was still being
Read().
2021-05-04 10:12:08 -07:00
Krishnan Parthasarathi 860bf1bab2
Add IsRemote method on FileInfo, ObjectInfo (#12209)
Provides a convenient method to know if an object's contents are in its remote
tier.
2021-05-04 08:40:42 -07:00
Andreas Auernhammer 4815f92fa8
fix MINIO_KMS_SECRET_KEY env. variable parsing (#12200)
This commit fixes a bug when parsing the env. variable
`MINIO_KMS_SECRET_KEY`. Before, the env. variable
name - instead of its value - was parsed. This (obviously)
did not work properly.

This commit fixes this.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-04-30 18:47:30 -07:00
Harshavardhana 0d3ddf7286
fix: improve NewObjectReader implementation for careful cleanup usage (#12199)
cleanup functions should never be cleaned before the reader is
instantiated, this type of design leads to situations where order
of lockers and places for them to use becomes confusing.

Allow WithCleanupFuncs() if the caller wishes to add cleanupFns
to be run upon close() or an error during initialization of the
reader.

Also make sure streams are closed before we unlock the resources,
this allows for ordered cleanup of resources.
2021-04-30 18:37:58 -07:00
Harshavardhana 3524c00090 fix: nats testdata relocation fix 2021-04-30 17:22:56 -07:00
Harshavardhana f7a87b30bf Revert "deprecate embedded browser (#12163)"
This reverts commit 736d8cbac4.

Bring contrib files for older contributions
2021-04-30 08:50:39 -07:00
Harshavardhana 64f6020854
fix: cleanup locking, cancel context upon lock timeout (#12183)
upon errors to acquire lock context would still leak,
since the cancel would never be called. since the lock
is never acquired - proactively clear it before returning.
2021-04-29 20:55:21 -07:00
Harshavardhana 0faa4e6187
fix: make sure failed requests only to failed queue (#12196)
failed queue should be used for retried requests to
avoid cascading the failures into incoming queue, this
would allow for a more fair retry for failed replicas.

Additionally also avoid taking context in queue task
to avoid confusion, simplifies its usage.
2021-04-29 18:20:39 -07:00
Poorna Krishnamoorthy 90112b5644
Update ReplicationStatus if metadata not updated correctly (#12191)
There can be situations where replication completed but the
`X-Amz-Replication-Status` metadata update failed such as
when the server returns 503 under high load. This object version will
continue to be picked up by the scanner and replicateObject would perform
no action since the versions match between source and target.
The metadata would never reflect that replication was successful
without this fix, leading to repeated re-queuing.
2021-04-29 16:46:26 -07:00
Harshavardhana c4b21ac7fa
fix: remove healthcheck routine for replication targets (#12192)
Bonus also fix a racy lookup on arnsMap() without a
read lock, hold read locks to avoid such race.

moving the healthcheck logic to minio-go
2021-04-29 16:41:28 -07:00
Andreas Auernhammer e5ec1325fc
docs: add QuickStart section to KMS encryption of IAM data (#12190)
This commit enhances the docs about IAM encryption.
It adds a quick-start section that explains how to
get started quickly with `MINIO_KMS_SECRET_KEY`
instead of setting up KES.

It also removes the startup message that gets printed
when the server migrates IAM data to plaintext.
We will point this out in the release notes.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-04-29 14:20:28 -07:00
Harshavardhana c5a80ca5d5
support service accounts for OpenID connect properly (#12178)
OpenID connect generated service accounts do not work
properly after console logout, since the parentUser state
is lost - instead use sub+iss claims for parentUser, according
to OIDC spec both the claims provide the necessary stability
across logins etc.
2021-04-29 13:01:42 -07:00
Harshavardhana 8cd89e10ea Revert "fix: remove deprecated MINIO_ACCESS_KEY, MINIO_SECRET_KEY envs (#12173)"
This reverts commit b0baaeaa3d.
2021-04-29 10:56:53 -07:00
Harshavardhana 091845df39
fix: return quorum error upon decode failures (#12184) 2021-04-29 10:00:03 -07:00
Harshavardhana 336c8ac99f
fix: do not heal when disks are down (#12186)
HeadObject() was erroneously attempting
a heal when disks are down, avoid it.
2021-04-29 09:54:16 -07:00
Harshavardhana b3c8a1864f
fix: optimize ListBuckets for anonymous users (#12182)
anonymous users are never allowed to listBuckets(),
we do not need to further validate the policy, we can
simply reject if credentials are empty.
2021-04-28 21:37:02 -07:00
Poorna Krishnamoorthy 632252ff1d
fix: change SetRemoteTarget API to allow editing remote target granularly (#12175)
Currently, only credentials could be updated with
`mc admin bucket remote edit`. 

Allow updating synchronous replication flag, path, 
bandwidth and healthcheck duration on buckets, and
a flag to disable proxying in active-active replication.
2021-04-28 15:26:20 -07:00
Harshavardhana 77f9c71133 Revert "redirect to console project for browser (#12172)"
This reverts commit 301669cf7b.

fixes #12179
2021-04-28 12:22:15 -07:00
Krishnan Parthasarathi 0c9d095deb
ilm: Close warmBackend GetObject reader (#12174) 2021-04-27 22:42:18 -07:00
Harshavardhana b0baaeaa3d
fix: remove deprecated MINIO_ACCESS_KEY, MINIO_SECRET_KEY envs (#12173) 2021-04-27 22:41:24 -07:00
Harshavardhana 301669cf7b
redirect to console project for browser (#12172) 2021-04-27 16:39:41 -07:00
Anis Elleuch 9e797532dc
lock: Always cancel the returned Get(R)Lock context (#12162)
* lock: Always cancel the returned Get(R)Lock context

There is a leak with cancel created inside the locking mechanism. The
cancel purpose was to cancel operations such erasure get/put that are
holding non-refreshable locks.

This PR will ensure the created context.Cancel is passed to the unlock
API so it will cleanup and avoid leaks.

* locks: Avoid returning nil cancel in local lockers

Since there is no Refresh mechanism in the local locking mechanism, we
do not generate a new context or cancel. Currently, a nil cancel
function is returned but this can cause a crash. Return a dummy function
instead.
2021-04-27 16:12:50 -07:00
Harshavardhana 736d8cbac4
deprecate embedded browser (#12163)
https://github.com/minio/console takes over the functionality for the
future object browser development

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-27 10:52:12 -07:00
Harshavardhana cf335f6c63
service accounts should use LDAP user DN to assign credentials (#12166)
LDAP DN should be used when allowing setting service accounts
for LDAP users instead of just simple user,

Bonus root owner should be allowed full access
to all service account APIs.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-27 10:04:08 -07:00
Harshavardhana c8050bc079
fix: sleeper behavior in data scanner (#12164)
do not apply healReplication() for ILM
expired, transitioned objects
2021-04-27 08:24:44 -07:00
Harshavardhana edda244066 move pkg/rpc, pkg/csvparser, pkg/argon2 to contrib
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-26 18:24:40 -07:00
Poorna Krishnamoorthy 4be0f92067
Fix multipart restore to remove part match (#12161)
Part ETags are not available after multipart finalizes, removing this
check as not useful.

Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2021-04-26 18:24:06 -07:00
Harshavardhana 26544848ea
remove legacy master_key support by June (#12153)
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-26 16:02:05 -07:00
Harshavardhana 2966823818
use jsoniter for json marshal/unmarshal in KMS (#12146)
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-26 16:01:52 -07:00
Harshavardhana d501c5e38b
add missing responseBody drain (#12147)
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-26 08:59:54 -07:00
Harshavardhana d825d92499 rename production to release directory, rebuild assets 2021-04-25 16:51:29 -07:00
Andreas Auernhammer f7feff8665
avoid parsing MINIO_KMS_MASTER_KEY as base64 (#12149)
This commit reverts a change that added support for
parsing base64-encoded keys set via `MINIO_KMS_MASTER_KEY`.

The env. variable `MINIO_KMS_MASTER_KEY` is deprecated and
should ONLY support parsing existing keys - not the new format.

Any new deployment should use `MINIO_KMS_SECRET_KEY`. The legacy
env. variable `MINIO_KMS_MASTER_KEY` will be removed at some point
in time.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-04-25 11:04:31 -07:00
Harshavardhana 4eb9b6eaf8
preserve metadata multipart restore (#12139)
avoid re-read of xl.meta instead just use
the success criteria from PutObjectPart()
and check the ETag matches per Part, if
they match then the parts have been
successfully restored as is.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-24 19:07:27 -07:00
Harshavardhana f420996dfa
fix: allow parsing keys in both new and old format (#12144)
Bonus fix fallback to decrypt previously
encrypted content as well using older master
key ciphertext format.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-24 19:05:25 -07:00
Poorna Krishnamoorthy 5d954ea228
fix: versionID and MTime for restored object (#12145)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-04-24 19:04:35 -07:00
Harshavardhana 25d3c73162
add HEAD for cluster healthcheck (#12140)
fixes #12130

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 22:47:39 -07:00
Harshavardhana 82dc6aff1c
add support for configurable replication MRF workers (#12125)
just like replication workers, allow failed replication
workers to be configurable in situations like DR failures
etc to catch up on replication sooner when DR is back
online.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 21:58:45 -07:00
Poorna Krishnamoorthy 014e419151
fix: ensure pending replication queued to MRF queue (#12138)
Signed-off-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-04-23 16:52:57 -07:00
Harshavardhana 799691eded
fix: reload LDAP users properly with latest mapping (#12137)
peer nodes would not update if policy is unset on
a user, until policies reload every 5minutes. Make
sure to reload the policies properly, if no policy
is found make sure to delete such users and groups

fixes #12074

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 15:11:01 -07:00
Harshavardhana cbfdf97abf Use CompleteMultipartUpload in RestoreTransitionedObject
Signed-off-by: Krishnan Parthasarathi <kp@minio.io>
2021-04-23 11:58:53 -07:00
Krishnan Parthasarathi 3831027c54 fix: compiler errors in restoreTransitionedObject (#12120) 2021-04-23 11:58:53 -07:00
Harshavardhana 4d53054f8c update internode API for FileInfo change
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Krishnan Parthasarathi c829e3a13b Support for remote tier management (#12090)
With this change, MinIO's ILM supports transitioning objects to a remote tier.
This change includes support for Azure Blob Storage, AWS S3 compatible object
storage incl. MinIO and Google Cloud Storage as remote tier storage backends.

Some new additions include:

 - Admin APIs remote tier configuration management

 - Simple journal to track remote objects to be 'collected'
   This is used by object API handlers which 'mutate' object versions by
   overwriting/replacing content (Put/CopyObject) or removing the version
   itself (e.g DeleteObjectVersion).

 - Rework of previous ILM transition to fit the new model
   In the new model, a storage class (a.k.a remote tier) is defined by the
   'remote' object storage type (one of s3, azure, GCS), bucket name and a
   prefix.

* Fixed bugs, review comments, and more unit-tests

- Leverage inline small object feature
- Migrate legacy objects to the latest object format before transitioning
- Fix restore to particular version if specified
- Extend SharedDataDirCount to handle transitioned and restored objects
- Restore-object should accept version-id for version-suspended bucket (#12091)
- Check if remote tier creds have sufficient permissions
- Bonus minor fixes to existing error messages

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
Co-authored-by: Krishna Srinivas <krishna@minio.io>
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Harshavardhana 069432566f update license change for MinIO
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Klaus Post e0d3a8c1f4
Alloc less for metacache decompression (#12134)
Network streams are limited to 16K blocks. Don't alloc more upfront.

Signed-off-by: Klaus Post <klauspost@gmail.com>
2021-04-23 10:27:42 -07:00
Harshavardhana bb1198c2c6
revert CreateFile waitForResponse (#12124)
instead use expect continue timeout, and have
higher response header timeout, the new higher
timeout satisfies worse case scenarios for total
response time on a CreateFile operation.

Also set the "expect" continue header to satisfy
expect continue timeout behavior.

Some clients seem to cause CreateFile body to be
truncated, leading to no errors which instead
fails with ObjectNotFound on a PUT operation,
this change avoids such failures appropriately.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 10:18:18 -07:00
Anis Elleuch c9dfa0d87b
audit: Add field to know who triggered the operation (#12129)
This is for now needed to know if an external S3 request deleted a file
or it was the scanner.

Signed-off-by: Anis Elleuch <anis@min.io>
2021-04-23 09:51:12 -07:00
Harshavardhana d0d67f9de0
feat: allow prometheus for only authorized users (#12121)
allow restrictions on who can access Prometheus
endpoint, additionally add prometheus as part of
diagnostics canned policy.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-22 18:55:30 -07:00
Andreas Auernhammer 3455f786fa kms: encrypt IAM/config data with the KMS (#12041)
This commit changes the config/IAM encryption
process. Instead of encrypting config data
(users, policies etc.) with the root credentials
MinIO now encrypts this data with a KMS - if configured.

Therefore, this PR moves the MinIO-KMS configuration (via
env. variables) to a "top-level" configuration.
The KMS configuration cannot be stored in the config file
since it is used to decrypt the config file in the first
place.

As a consequence, this commit also removes support for
Hashicorp Vault - which has been deprecated anyway.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-04-22 09:51:09 -07:00
Harshavardhana a7acfa6158
fix: pick valid FileInfo additionally based on dataDir (#12116)
* fix: pick valid FileInfo additionally based on dataDir

historically we have always relied on modTime
to be consistent and same, we can now add additional
reference to look for the same dataDir value.

A dataDir is the same for an object at a given point in
time for a given version, let's say a `null` version
is overwritten in quorum we do not by mistake pick
up the fileInfo's incorrectly.

* make sure to not preserve fi.Data

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-21 19:06:08 -07:00
Anis Elleuch cebada2cc7
svcacct: Always search for parent user policy svcacct implied policy (#12117)
InfoServiceAccount admin API does not correctly calculate the policy for
a given service account in case if the policy is implied. Fix it.

Signed-off-by: Anis Elleuch <anis@min.io>
2021-04-21 18:12:02 -07:00
Harshavardhana 38a9f87a56 Revert "svc: Disallow creating services accounts by root (#12062)"
This reverts commit 150f3677d6.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-21 11:59:23 -07:00
Harshavardhana 4a41222310
fix: newMultipartUpload should go to same pool (#12106)
avoid potential for duplicates under multi-pool
setup, additionally also make sure CompleteMultipart
is using a more optimal API for uploadID lookup
and never delete the object there is a potential
to create a delete marker during complete multipart.

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-21 10:57:36 -07:00
Klaus Post 6235bd825b
Grab read lock while reading usage cache (#12111)
Signed-off-by: Klaus Post <klauspost@gmail.com>
2021-04-21 08:39:00 -07:00
Harshavardhana 2ef824bbb2
collapse two distinct calls into single RenameData() call (#12093)
This is an optimization by reducing one extra system call,
and many network operations. This reduction should increase
the performance for small file workloads.
2021-04-20 10:44:39 -07:00
Klaus Post 3d685b7fff
fix: zip error races in WebDownload (#12086)
When an error is reported it is ignored and zipping continues with the next object.

However, if there is an error it will write a response to `writeWebErrorResponse(w, err)`, but responses are still being built.

Fixes #12082

Bonus: Exclude common compressed image types.
2021-04-19 08:44:18 -07:00
Poorna Krishnamoorthy c9bf6007b4
Use custom transport for remote targets (#12080) 2021-04-16 18:58:26 -07:00
Harshavardhana 7a0a5bdc0d
remove legacy path for LDAP during policy map removal (#12081)
Thanks to @Alevsk for noticing this nuanced behavior
change between releases from 03-04 to 03-20, make sure
that we handle the legacy path removal as well.
2021-04-16 18:18:55 -07:00
Harshavardhana 0a9d8dfb0b
fix: crash in single drive mode for lifecycle (#12077)
also make sure to close the channel on the producer
side, not in a separate go-routine, this can lead
to races between a writer and a closer.

fixes #12073
2021-04-16 14:09:25 -07:00
Harshavardhana a334554f99
fix: add helper for expected path.Clean behavior (#12068)
current usage of path.Clean returns "." for empty strings
instead we need `""` string as-is, make relevant changes
as needed.
2021-04-15 16:32:13 -07:00
Poorna Krishnamoorthy d30c5d1cf0
Avoid metadata update for incoming replication failure (#12054)
This is an optimization to save IOPS. The replication
failures will be re-queued once more to re-attempt
replication. If it still does not succeed, the replication
status is set as `FAILED` and will be caught up on
scanner cycle.
2021-04-15 16:32:00 -07:00
Harshavardhana 75ac4ea840
remove possible double locks in bandwidth monitor (#12067)
additionally reject bandwidth limits with synchronous replication for now.
2021-04-15 16:20:45 -07:00
Anis Elleuch b6f5785a6d
svc: Display the correct policy of a particular service account (#12064)
For InfoServiceAccount API, calculating the policy before showing it to
the user was not correctly done (only UX issue, not a security issue)

This commit fixes it.
2021-04-15 14:47:58 -07:00
Harshavardhana 39dd9b6483
fix: do not return an error on expired credentials (#12057)
policy might have an associated mapping with an expired
user key, do not return an error during DeletePolicy
for such situations - proceed normally as its an
expected situation.
2021-04-15 08:51:01 -07:00
Andreas Auernhammer 885c170a64
introduce new package pkg/kms (#12019)
This commit introduces a new package `pkg/kms`.
It contains basic types and functions to interact
with various KMS implementations.

This commit also moves KMS-related code from `cmd/crypto`
to `pkg/kms`. Now, it is possible to implement a KMS-based
config data encryption in the `pkg/config` package.
2021-04-15 08:47:33 -07:00
Harshavardhana 1456f9f090
fix: preserve shared dataDir during suspend overwrites (#12058)
CopyObject() when shares dataDir needs to be preserved,
and upon versioning suspended overwrites should still
preserve the dataDir.
2021-04-15 08:44:05 -07:00
Anis Elleuch 150f3677d6
svc: Disallow creating services accounts by root (#12062) 2021-04-15 08:43:44 -07:00
Anis Elleuch 291d2793ca
ldap: Create services accounts for LDAP and STS temp accounts (#11808) 2021-04-14 22:51:14 -07:00
Harshavardhana b70c298c27
update findDataDir to skip inline data (#12050) 2021-04-14 22:44:27 -07:00
Harshavardhana 94e1bacd16
STS call should be rejected for missing policies (#12056)
fixes #12055
2021-04-14 22:35:42 -07:00
Andreas Auernhammer 97aa831352
add new pkg/fips for FIPS 140-2 (#12051)
This commit introduces a new package `pkg/fips`
that bundles functionality to handle and configure
cryptographic protocols in case of FIPS 140.

If it is compiled with `--tags=fips` it assumes
that a FIPS 140-2 cryptographic module is used
to implement all FIPS compliant cryptographic
primitives - like AES, SHA-256, ...

In "FIPS mode" it excludes all non-FIPS compliant
cryptographic primitives from the protocol parameters.
2021-04-14 08:29:56 -07:00
ebozduman b4eeeb8449
PutObjectRetention : return matching error XML as AWS S3 (#11973) 2021-04-14 00:01:53 -07:00
Harshavardhana e85b28398b
fix: pre-allocate certain slices with expected capacity (#12044)
Avoids append() based tiny allocations on known
allocated slices repeated access.
2021-04-12 13:45:06 -07:00
Anis Elleuch 8ab111cfb6
scanner: Shuffle disks to scan (#12036)
Ensure random association between disk and bucket in each crawling
iteration to ensure that ILM applies correctly to objects not present in
all disks.
2021-04-12 07:55:40 -07:00
Harshavardhana 641150f2a2
change updateVersion to only update keys, no deletes (#12032)
there are situations where metadata can have keys
with empty values, preserve existing behavior
2021-04-10 09:13:12 -07:00
sgandon 0ddc4f0075
fix: allow S3 gateway passthrough for SSE-S3 header on copy object (#12029) 2021-04-09 08:56:09 -07:00
Harshavardhana 928ee1a7b2
remove null version dataDir upon overwrites (#12023) 2021-04-08 19:55:44 -07:00
Harshavardhana 8f98e3acfa fix build with fips tags 2021-04-08 19:31:10 -07:00
Harshavardhana 89d58bec16
avoid frequent DNS lookups for baremetal setups (#11972)
bump up the DNS cache for baremetal setups upto 10 minutes
2021-04-08 17:51:59 -07:00
Klaus Post f0ca0b3ca9
Add metadata checksum (#12017)
- Add 32-bit checksum (32 LSB part of xxhash64) of the serialized metadata.

This will ensure that we always reject corrupted metadata.

- Add automatic repair of inline data, so the data structure can be used.

If data was corrupted, we remove all unreadable entries to ensure that operations 
can succeed on the object. Since higher layers add bitrot checks this is not a big problem.

Cannot downgrade to v1.1 metadata, but since that isn't released, no need for a major bump.
2021-04-08 17:29:54 -07:00
Harshavardhana 0e4794ea50
fix: allow S3 gateway passthrough for SSE-S3 header (#12020)
only in case of S3 gateway we have a case where we
need to allow for SSE-S3 headers as passthrough,

If SSE-C headers are passed then they are rejected
if KMS is not configured.
2021-04-08 16:40:38 -07:00
Harshavardhana 16ce7fb70c
fix: legacy object should be overwritten for metadataOnly updates (#12012) 2021-04-08 14:29:27 -07:00
Harshavardhana 641e564b65
fips build tag uses relevant binary link for updates (#12014)
This code is necessary for `mc admin update` command
to work with fips compiled binaries, with fips tags
the releaseInfo will automatically point to fips
specific binaries.
2021-04-08 09:51:11 -07:00
Harshavardhana 835d2cb9a3
handle dns.ErrBucketConflict as BucketAlreadyExists (#12013) 2021-04-08 08:24:55 -07:00
Andreas Auernhammer cda570992e set SSE headers in put-part response (#12008)
This commit fixes a bug in the put-part
implementation. The SSE headers should be
set as specified by AWS - See:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html

Now, the MinIO server should set SSE-C headers,
like `x-amz-server-side-encryption-customer-algorithm`.

Fixes #11991
2021-04-07 15:05:00 -07:00
Harshavardhana 0b33fa50ae
fix: calculate correct content-range with partNumber query (#11992)
fixes #11989
fixes #11824
2021-04-07 14:37:10 -07:00
Harshavardhana 4223ebab8d
fix: remove auto-close GetObjectReader (#12009)
locks can get relinquished when Read() sees io.EOF
leading to prematurely closing of the readers

concurrent writes on the same object can have
undesired consequences here when these locks
are relinquished.
2021-04-07 13:29:27 -07:00
Klaus Post 48c5e7e5b6
Add runtime mem stats to server info (#11995)
Adds information about runtime+gc memory use.
2021-04-07 10:40:51 -07:00
Klaus Post d267d152ba
healing: re-read metadata after lock (#12004)
Do no use potentially wrong metadata from before acquiring lock.

Plus remove unused NoLock option.
2021-04-07 10:39:48 -07:00
Klaus Post d2ac2f758e
odirectReader: handle EOF correctly (#11998)
EOF may be sent along with data so queue it up and 
return it when the buffer is empty.

Also, when reading data without direct io don't add a buffer 
that only results in extra memcopy.
2021-04-07 08:32:59 -07:00
Klaus Post 788a8bc254
Fix disk info race (#11984)
Protect updated members in xlStorage.

```
WARNING: DATA RACE
Write at 0x00c004b4ee78 by goroutine 1491:
  github.com/minio/minio/cmd.(*xlStorage).GetDiskID()
      d:/minio/minio/cmd/xl-storage.go:590 +0x1078
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).checkDiskStale()
      d:/minio/minio/cmd/xl-storage-disk-id-check.go:195 +0x84
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).StatVol()
      d:/minio/minio/cmd/xl-storage-disk-id-check.go:284 +0x16a
  github.com/minio/minio/cmd.erasureObjects.getBucketInfo.func1()
      d:/minio/minio/cmd/erasure-bucket.go:100 +0x1a5
  github.com/minio/minio/pkg/sync/errgroup.(*Group).Go.func1()
      d:/minio/minio/pkg/sync/errgroup/errgroup.go:122 +0xd7

Previous read at 0x00c004b4ee78 by goroutine 1087:
  github.com/minio/minio/cmd.(*xlStorage).CheckFile.func1()
      d:/minio/minio/cmd/xl-storage.go:1699 +0x384
  github.com/minio/minio/cmd.(*xlStorage).CheckFile()
      d:/minio/minio/cmd/xl-storage.go:1726 +0x13c
  github.com/minio/minio/cmd.(*xlStorageDiskIDCheck).CheckFile()
      d:/minio/minio/cmd/xl-storage-disk-id-check.go:446 +0x23b
  github.com/minio/minio/cmd.erasureObjects.parentDirIsObject.func1()
      d:/minio/minio/cmd/erasure-common.go:173 +0x194
  github.com/minio/minio/pkg/sync/errgroup.(*Group).Go.func1()
      d:/minio/minio/pkg/sync/errgroup/errgroup.go:122 +0xd7
```
2021-04-06 11:33:42 -07:00
Klaus Post 111c02770e
Fix data race when connecting disks (#11983)
Multiple disks from the same set would be writing concurrently.

```
WARNING: DATA RACE
Write at 0x00c002100ce0 by goroutine 166:
  github.com/minio/minio/cmd.(*erasureSets).connectDisks.func1()
      d:/minio/minio/cmd/erasure-sets.go:254 +0x82f

Previous write at 0x00c002100ce0 by goroutine 129:
  github.com/minio/minio/cmd.(*erasureSets).connectDisks.func1()
      d:/minio/minio/cmd/erasure-sets.go:254 +0x82f

Goroutine 166 (running) created at:
  github.com/minio/minio/cmd.(*erasureSets).connectDisks()
      d:/minio/minio/cmd/erasure-sets.go:210 +0x324
  github.com/minio/minio/cmd.(*erasureSets).monitorAndConnectEndpoints()
      d:/minio/minio/cmd/erasure-sets.go:288 +0x244

Goroutine 129 (finished) created at:
  github.com/minio/minio/cmd.(*erasureSets).connectDisks()
      d:/minio/minio/cmd/erasure-sets.go:210 +0x324
  github.com/minio/minio/cmd.(*erasureSets).monitorAndConnectEndpoints()
      d:/minio/minio/cmd/erasure-sets.go:288 +0x244
```
2021-04-06 11:33:10 -07:00
Poorna Krishnamoorthy 40409437cd
Add initial usage in GetBucketReplicationMetrics API (#11985) 2021-04-06 11:32:52 -07:00
iternity-dotcom 02f797a23b
remove redundant GetBucketLifecycleHandler call (#11982) 2021-04-06 09:21:37 -07:00
Andreas Auernhammer d5d2fc9850
bitrot: add selftest for server startup (#11917)
This commit adds a self-test for all bitrot algorithms:
 - SHA-256
 - BLAKE2b
 - HighwayHash

The self-test computes an incremental checksum of pseudo-random
messages. If a bitrot algorithm implementation stops working on
some CPU architecture or with a certain Go version this self-test
will prevent the server from starting and silently corrupting data.

For additional context see: minio/highwayhash#19
2021-04-06 08:38:22 -07:00
Poorna Krishnamoorthy 075bccda42
Fix cluster bucket stats API for prometheus (#11970)
Metrics calculation was accumulating inital usage across all nodes
rather than using initial usage only once.

Also fixing:
- bug where all  peer traffic was going to the same node.
- reset counters when replication status changes from
PENDING -> FAILED
2021-04-06 08:36:54 -07:00
Klaus Post 0276652f26
Fix Access Key requests (#11979)
Fix accessing claims when auth error is unchecked.

Only replaced when unchecked and when clearly without side effects.

Fixes #11959
2021-04-06 08:35:46 -07:00
Harshavardhana abb55bd49e
fix: properly close leaking bandwidth monitor channel (#11967)
This PR fixes

- close leaking bandwidth report channel leakage
- remove the closer requirement for bandwidth monitor
  instead if Read() fails remember the error and return
  error for all subsequent reads.
- use locking for usage-cache.bin updates, with inline
  data we cannot afford to have concurrent writes to
  usage-cache.bin corrupting xl.meta
2021-04-05 16:07:53 -07:00
Poorna Krishnamoorthy bb6561fe55
fix: route for replication-metrics API (#11968) 2021-04-05 13:36:39 -07:00
Harshavardhana 5cce9361bc
fix: avoid an extra rename when there is no dataDir (#11964)
also perform globalSync() in defer when enabled
for RenameData(), to ensure all calls are flushed
to disk.
2021-04-05 08:52:28 -07:00
Harshavardhana 09ee303244
add cluster support for realtime bucket stats (#11963)
implementation in #11949 only catered from single
node, but we need cluster metrics by capturing
from all peers. introduce bucket stats API that
will be used for capturing in-line bucket usage
as well eventually
2021-04-04 15:34:33 -07:00
Harshavardhana d46386246f
api: Introduce metadata update APIs to update only metadata (#11962)
Current implementation heavily relies on readAllFileInfo
but with the advent of xl.meta inlined with data, we cannot
easily avoid reading data when we are only interested is
updating metadata, this leads to invariably write
amplification during metadata updates, repeatedly reading
data when we are only interested in updating metadata.

This PR ensures that we implement a metadata only update
API at storage layer, that handles updates to metadata alone
for any given version - given the version is valid and
present.

This helps reduce the chattiness for following calls..

- PutObjectTags
- DeleteObjectTags
- PutObjectLegalHold
- PutObjectRetention
- ReplicateObject (updates metadata on replication status)
2021-04-04 13:32:31 -07:00
Poorna Krishnamoorthy 47c09a1e6f
Various improvements in replication (#11949)
- collect real time replication metrics for prometheus.
- add pending_count, failed_count metric for total pending/failed replication operations.

- add API to get replication metrics

- add MRF worker to handle spill-over replication operations

- multiple issues found with replication
- fixes an issue when client sends a bucket
 name with `/` at the end from SetRemoteTarget
 API call make sure to trim the bucket name to 
 avoid any extra `/`.

- hold write locks in GetObjectNInfo during replication
  to ensure that object version stack is not overwritten
  while reading the content.

- add additional protection during WriteMetadata() to
  ensure that we always write a valid FileInfo{} and avoid
  ever writing empty FileInfo{} to the lowest layers.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2021-04-03 09:03:42 -07:00
Harshavardhana bf106453b8
add policy conditions support for signatureVersion and authType (#11947)
https://docs.aws.amazon.com/AmazonS3/latest/API/bucket-policy-s3-sigv4-conditions.html

fixes #11944
2021-04-02 09:34:15 -07:00
Harshavardhana 434e5c0cfe
allow preserving legacyXLv1 with inline data format (#11951)
current master breaks this important requirement
we need to preserve legacyXLv1 format, this is simply
ignored and overwritten causing a myriad of issues
by leaving stale files on the namespace etc.

for now lets still use the two-phase approach of
writing to `tmp` and then renaming the content to
the actual namespace.
2021-04-01 22:12:03 -07:00
Harshavardhana 204c610d84
do not use dataDir to reference inline data use versionID (#11942)
versionID is the one that needs to be preserved and as
well as overwritten in case of replication, transition
etc - dataDir is an ephemeral entity that changes
during overwrites - make sure that versionID is used
to save the object content.

this would break things if you are already running
the latest master, please wipe your current content
and re-do your setup after this change.
2021-04-01 13:09:23 -07:00
Harshavardhana f966fbc4a3
make sure to preserve checksumInfo to lookup older hash (#11940)
upgrading from 2yr old releases is expected to work,
the issue was we were missing checksum info to be
passed down to newBitrotReader() for whole bitrot
calculation
2021-03-31 21:14:08 -07:00
Harshavardhana 3c571472e0
avoid network read errors crashing CreateFile call (#11939)
Thanks to @dvaldivia for reproducing this
2021-03-31 18:44:45 -07:00
Harshavardhana f60eaabfcd
fix: notify parent user in notification events (#11934)
fixes #11885
2021-03-31 13:21:10 -07:00
Harshavardhana 18dee6a333
add stringer for ErrorCodes (#11933) 2021-03-31 09:30:52 -07:00
Klaus Post 4dcce17eb9
Determine small objects on shard size (#11935)
Use shard size to determine when to inline data.

For unversioned objects, use 128K/shard and for versioned 16K thresholds.
2021-03-31 09:19:14 -07:00
Klaus Post 0d8c74358d
Add erasure and compression self-tests (#11918)
Ensure that we don't use potentially broken algorithms for critical functions, whether it be a runtime problem or implementation problem for a specific platform.
2021-03-31 09:11:37 -07:00
Anis Elleuch 6b484f45c6
crawling: Apply lifecycle then decide healing action (#11563)
It is inefficient to decide to heal an object before checking its
lifecycle for expiration or transition. This commit will just reverse
the order of action: evaluate lifecycle and heal only if asked and
lifecycle resulted a NoneAction.
2021-03-31 02:15:08 -07:00
Ritesh H Shukla 3ddd8b04d1
fix: handle unsupported APIs more granularly (#11674) 2021-03-30 23:19:36 -07:00
Harshavardhana 8e6e287729
fix: delete/delete marker replication versions consistent (#11932)
replication didn't work as expected when deletion of
delete markers was requested in DeleteMultipleObjects
API, this is due to incorrect lookup elements being
used to look for delete markers.
2021-03-30 17:15:36 -07:00
Harshavardhana 014edd3462
allow configuring scanner cycles dynamically (#11931)
This allows us to speed up or slow down sleeps
between multiple scanner cycles, helps in testing
as well as some deployments might want to run
scanner more frequently.

This change is also dynamic can be applied on
a running cluster, subsequent cycles pickup
the newly set value.
2021-03-30 13:59:02 -07:00
Steven Reitsma e9fede88b3
fix: multi delete when using S3 Gateway with SSE (#11929) 2021-03-30 13:09:48 -07:00
Harshavardhana edf053c5c9
disksWithAllParts should use parts if present (#11923) 2021-03-30 01:51:00 -07:00
Klaus Post 2623338dc5
Inline small file data in xl.meta file (#11758) 2021-03-29 17:00:55 -07:00
Anis Elleuch f5831174e6
iam: Use 'on' for enabled accounts for consistency (#11913)
This commit does not fix any bug, just ensure consistency.
2021-03-29 09:32:36 -07:00
Harshavardhana d93c6cb9c7
use Access() instead of Lstat() for frequent use (#11911)
using Lstat() is causing tiny memory allocations,
that are usually wasted and never used, instead
we can simply uses Access() call that does 0
memory allocations.
2021-03-29 08:07:23 -07:00
Harshavardhana 7c5b35d20f trace: enhance trace experience further 2021-03-27 13:19:14 -07:00
Anis Elleuch 07ab4d1250
trace: Add prefix to func names of OS & Storage (#11912) 2021-03-27 10:07:07 -07:00
Anis Elleuch d8b5adfd10
trace: Add storage & OS tracing (#11889) 2021-03-26 23:24:07 -07:00
Poorna Krishnamoorthy 95096e31a7
Improve error message from SetRemoteTargetHandler (#11909) 2021-03-26 18:58:13 -07:00
Harshavardhana d8bda2dd92
[feat] Add targz transparent extract support (#11849)
This feature brings in support for auto extraction
of objects onto MinIO's namespace from an incoming
tar gzipped stream, the only expected metadata sent
by the client is to set `snowball-auto-extract`.

All the contents from the tar stream are saved as
folders and objects on the namespace.

fixes #8715
2021-03-26 17:15:09 -07:00
Harshavardhana df42b128db
fix: service accounts policy enforcement regression (#11910)
service accounts were not inheriting parent policies
anymore due to refactors in the PolicyDBGet() from
the latest release, fix this behavior properly.
2021-03-26 13:55:42 -07:00
Anis Elleuch 2c296652f7
Simplify access to local node name (#11907)
The local node name is heavily used in tracing, create a new global 
variable to store it. Multiple goroutines can access it since it won't be
changed later.
2021-03-26 11:37:58 -07:00
Klaus Post 9efcb9e15c
Fix listPathRaw/WalkDir cancelation (#11905)
In #11888 we observe a lot of running, WalkDir calls.

There doesn't appear to be any listerners for these calls, so they should be aborted.

Ensure that WalkDir aborts when upstream cancels the request.

Fixes #11888
2021-03-26 11:18:30 -07:00
Anis Elleuch 8d5456c15a
Fix error returned by HealObject in some cases (#11906)
The background healing can return NoSuchUpload error, the reason is that
healing code can return errFileNotFound with three parameters. Simplify
the code by returning exact errUploadNotFound error in multipart code.

Also ensure that a typed error is always returned whatever the number of
parameters because it is better than showing internal error.
2021-03-26 11:17:23 -07:00
Harshavardhana cf87303094
do not call LocalStorageInfo on gateways (#11903)
fixes https://github.com/minio/mc/issues/3665
2021-03-25 15:26:22 -07:00
Harshavardhana 90d8ec6310
fix: reject duplicate keys in PostPolicyJSON document (#11902)
fixes #11894
2021-03-25 13:57:57 -07:00
Klaus Post b383522743
fix error could not read /proc ion windows. (#11868)
Bonus: Prealloc reasonable sizes for metrics.
2021-03-25 12:58:43 -07:00
Aditya Manthramurthy b4d8bcf644
Converge PolicyDBGet functions in IAM (#11891) 2021-03-25 00:38:15 -07:00
Harshavardhana d7f32ad649 xl: avoid sending Delete() remote call for fully successful runs
an optimization to avoid extra syscalls in PutObject(),
adds up to our PutObject response times.
2021-03-24 17:32:12 -07:00
Aditya Manthramurthy 906d68c356
Fix LDAP policy application on user policy (#11887) 2021-03-24 12:29:25 -07:00
Klaus Post 749e9c5771
metrics: Add canceled requests (#11881)
Add metric for canceled requests
2021-03-24 10:25:27 -07:00
Harshavardhana 410e84d273 xl: add checks for minioTmpMetaBucket in CreateFile 2021-03-24 09:36:10 -07:00
Harshavardhana 75741dbf4a
xl: remove cleanupDir instead use Delete() (#11880)
use a single call to remove directly at disk
instead of doing recursively at network layer.
2021-03-24 09:08:05 -07:00
Anis Elleuch fad7b27f15
metrics: Change type of minio_s3_requests_waiting_total to gauge (#11884) 2021-03-24 09:06:37 -07:00
Harshavardhana 79564656eb
xl: CreateFile shouldn't prematurely timeout (#11878)
For large objects taking more than '3 minutes' response
times in a single PUT operation can timeout prematurely
as 'ResponseHeader' timeout hits for 3 minutes. Avoid
this by keeping the connection active during CreateFile
phase.
2021-03-24 09:05:03 -07:00
Harshavardhana 21cfc4aa49 Revert "xl: CreateFile shouldn't prematurely timeout (#11854)"
This reverts commit 922c7b57f5.
2021-03-23 23:47:45 -07:00
Harshavardhana e80239a661 simplify OS instrumentation remove functions for global variables 2021-03-23 22:32:44 -07:00
Ritesh H Shukla 6a2ed44095 fix: optionally enable tracing posix calls 2021-03-23 22:23:08 -07:00
Aditya Manthramurthy 8adfeb0d84
fix: AccountInfo API for LDAP users (#11874)
Also, ensure admin APIs auth additionally validates groups
2021-03-23 17:39:20 -07:00
Harshavardhana d23485e571
fix: LDAP groups handling and group mapping (#11855)
comprehensively handle group mapping for LDAP
users across IAM sub-subsytem.
2021-03-23 15:15:51 -07:00
Harshavardhana da70e6ddf6
avoid healObjects recursively healing at empty path (#11856)
baseDirFromPrefix(prefix) for object names without
parent directory incorrectly uses empty path, leading
to long listing at various paths that are not useful
for healing - avoid this listing completely if "baseDir"
returns empty simple use the "prefix" as is.

this improves startup performance significantly
2021-03-23 07:57:07 -07:00
Harshavardhana 922c7b57f5
xl: CreateFile shouldn't prematurely timeout (#11854)
For large objects taking more than '3 minutes' response
times in a single PUT operation can timeout prematurely
as 'ResponseHeader' timeout hits for 3 minutes. Avoid
this by keeping the connection active during CreateFile
phase.
2021-03-22 18:25:05 -07:00
Harshavardhana 726d80dbb7
fix: merge duplicate keys in post policy (#11843)
some SDKs might incorrectly send duplicate
entries for keys such as "conditions", Go
stdlib unmarshal for JSON does not support
duplicate keys - instead skips the first
duplicate and only preserves the last entry.

This can lead to issues where a policy JSON
while being valid might not properly apply
the required conditions, allowing situations
where POST policy JSON would end up allowing
uploads to unauthorized buckets and paths.

This PR fixes this properly.
2021-03-20 22:16:30 -07:00
Ritesh H Shukla 23b03dadb8
Add process uptime metric (#11844) 2021-03-20 21:23:27 -07:00
Andreas Auernhammer 7b3719c17b
crypto: simplify Context encoding (#11812)
This commit adds a `MarshalText` implementation
to the `crypto.Context` type.
The `MarshalText` implementation replaces the
`WriteTo` and `AppendTo` implementation.

It is slightly slower than the `AppendTo` implementation
```
goos: darwin
goarch: arm64
pkg: github.com/minio/minio/cmd/crypto
BenchmarkContext_AppendTo/0-elems-8         	381475698	         2.892 ns/op	       0 B/op	       0 allocs/op
BenchmarkContext_AppendTo/1-elems-8         	17945088	        67.54 ns/op	       0 B/op	       0 allocs/op
BenchmarkContext_AppendTo/3-elems-8         	 5431770	       221.2 ns/op	      72 B/op	       2 allocs/op
BenchmarkContext_AppendTo/4-elems-8         	 3430684	       346.7 ns/op	      88 B/op	       2 allocs/op
```
vs.
```
BenchmarkContext/0-elems-8         	135819834	         8.658 ns/op	       2 B/op	       1 allocs/op
BenchmarkContext/1-elems-8         	13326243	        89.20 ns/op	     128 B/op	       1 allocs/op
BenchmarkContext/3-elems-8         	 4935301	       243.1 ns/op	     200 B/op	       3 allocs/op
BenchmarkContext/4-elems-8         	 2792142	       428.2 ns/op	     504 B/op	       4 allocs/op
goos: darwin
```

However, the `AppendTo` benchmark used a pre-allocated buffer. While
this improves its performance it does not match the actual usage of
`crypto.Context` which is passed to a `KMS` and always encoded into
a newly allocated buffer.

Therefore, this change seems acceptable since it should not impact the
actual performance but reduces the overall code for Context marshaling.
2021-03-20 02:48:48 -07:00
Harshavardhana 9a6487319a
remove MINIO_IO_DEADLINE support (#11841)
this feature in actual deployment was found
to be not that useful, remove support for this
for now.
2021-03-20 02:47:04 -07:00
Aditya Manthramurthy 94ff624242
Fix querying LDAP group/user policy (#11840) 2021-03-20 02:37:52 -07:00
Anis Elleuch 98ff91b484
xl: Reduce usage of isDirEmpty() (#11838)
When an object is removed, its parent directory is inspected to check if
it is empty to remove if that is the case.

However, we can use os.Remove() directly since it is only able to remove
a file or an empty directory.
2021-03-19 15:42:01 -07:00
Anis Elleuch 4d86384dc7
xl: Remove non needed check for empty dir (#11835)
RenameData renames xl.meta and data dir and removes the parent directory
if empty, however, there is a duplicate check for empty dir, since the
parent dir of xl.meta is always the same as the data-dir.
2021-03-19 12:26:53 -07:00
Ritesh H Shukla b5dcaaccb4
Introduce metrics caching for performant metrics (#11831) 2021-03-19 00:04:29 -07:00
Harshavardhana b92a220db1
fix: handle weird drives sporadic read O_DIRECT behavior (#11832)
on freshReads if drive returns errInvalidArgument, we
should simply turn-off DirectIO and read normally, there
are situations in k8s like environments where the drives
behave sporadically in a single deployment and may not
have been implemented properly to handle O_DIRECT for
reads.
2021-03-18 20:16:50 -07:00
Harshavardhana 51a8619a79
[feat] Add configurable deadline for writers (#11822)
This PR adds deadlines per Write() calls, such
that slow drives are timed-out appropriately and
the overall responsiveness for Writes() is always
up to a predefined threshold providing applications
sustained latency even if one of the drives is slow
to respond.
2021-03-18 14:09:55 -07:00
Anis Elleuch 14d89eaae4
mrf: Enhance behavior for better results (#11788)
MRF was starting to heal when it receives a disk connection event, which
is not good when a node having multiple disks reconnects to the cluster.

Besides, MRF needs Remove healing option to remove stale files.
2021-03-18 11:19:02 -07:00
Harshavardhana add3cd4e44
allow configuring delete cleanup interval from default 10minutes (#11818) 2021-03-17 15:15:58 -07:00
Harshavardhana 60b0f2324e
storage write call path optimizations (#11805)
- write in o_dsync instead of o_direct for smaller
  objects to avoid unaligned double Write() situations
  that may arise for smaller objects < 128KiB
- avoid fallocate() as its not useful since we do not
  use Append() semantics anymore, fallocate is not useful
  for streaming I/O we can save on a syscall
- createFile() doesn't need to validate `bucket` name
  with a Lstat() call since createFile() is only used
  to write at `minioTmpBucket`
- use io.Copy() when writing unAligned writes to allow
  usage of ReadFrom() from *os.File providing zero
  buffer writes().
2021-03-17 09:38:38 -07:00
Anis Elleuch 0eb146e1b2
add additional metrics per disk API latency, API call counts #11250)
```
mc admin info --json
```

provides these details, for now, we shall eventually 
expose this at Prometheus level eventually. 

Co-authored-by: Harshavardhana <harsha@minio.io>
2021-03-16 20:06:57 -07:00
Andreas Auernhammer e197800f90
s3v4: read and verify S3 signature v4 chunks separately (#11801)
This commit fixes a security issue in the signature v4 chunked
reader. Before, the reader returned unverified data to the caller
and would only verify the chunk signature once it has encountered
the end of the chunk payload.

Now, the chunk reader reads the entire chunk into an in-memory buffer,
verifies the signature and then returns data to the caller.

In general, this is a common security problem. We verifying data
streams, the verifier MUST NOT return data to the upper layers / its
callers as long as it has not verified the current data chunk / data
segment:
```
func (r *Reader) Read(buffer []byte) {
   if err := r.readNext(r.internalBuffer); err != nil {
      return err
   }
   if err := r.verify(r.internalBuffer); err != nil {
      return err
   }
   copy(buffer, r.internalBuffer)
}
```
2021-03-16 13:33:40 -07:00
Klaus Post 771dea175c
erasure pools enable faster checks for file not found (#11799)
For operations that require the object to exist make it possible to 
detect if the file isn't found in *any* pool.

This will allow these to return the error early without having to re-check.
2021-03-16 11:02:20 -07:00
Harshavardhana 6160188bf3
fix: erasure index based reading based on actual ParityBlocks (#11792)
in some setups with ordering issues in drive configuration,
we should rely on expected parityBlocks instead of `len(disks)/2`
2021-03-15 20:03:13 -07:00
Steve Wills 642ba3f2d6
fix: runtime issue on FreeBSD due to missing O_NOATIME/O_DSYNC support (#11790)
See also:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253937
2021-03-15 14:02:36 -07:00
Harshavardhana afbd3e41eb
add missing principalId in web notifications (#11777)
fixes #11561
2021-03-13 10:52:43 -08:00
Poorna Krishnamoorthy 5e003549cc
Replication: Enforce DeleteMarker disable setting (#11720)
This PR also enforces DeleteReplication
disable setting
2021-03-13 10:28:35 -08:00
Nitish Tiwari 7fa3e4106b
Add consoleAdmin as a default canned policy (#11770) 2021-03-12 12:51:43 -08:00
Philip Brown 75db500e85
cmd/os-readdir_other.go - return nil with err (#11772) 2021-03-12 07:22:25 -08:00
Harshavardhana feafccf007
handle trimming '/' if present in the object names (#11765)
- MultipleDeletes should handle '/' prefix for objectnames
- Trimming the slash alone is enough for ListObjects()
  prefix and markers

fixes #11769
2021-03-11 13:57:03 -08:00
Anis Elleuch f92b7a5621
Browser: Shared link has content-disposition header (#11712)
The shared link will be automatically downloadable when the user opens
the shared link in a browser.
2021-03-10 23:02:16 -08:00
Poorna Krishnamoorthy c25e75f0b5
Fix redact LDAP password properly (#11762)
fixes #11742 previous pull request #11750 fixed only the web trace
2021-03-10 11:05:38 -08:00
Harshavardhana 777344a594
add release build-arg to docker multiarch builds (#11754)
additional paths to ignore for healing
2021-03-10 09:38:35 -08:00
Poorna Krishnamoorthy 878bc6c72b
Redact LDAP password if any in request trace (#11750)
Fixes: #11742
2021-03-09 14:43:16 -08:00
Klaus Post fdc2f69218
truncate xl.meta files upon rewrites #11749)
If the destination files exist and is larger - junk data will be left at the end of the file.
2021-03-09 14:42:24 -08:00
Anis Elleuch 0d124095ea
lc: Return expiration header only when version id is unspecified (#11718)
Follow S3 specification to return Expiration header in HEAD/GET call
only when version-id is not passed in the request.
2021-03-09 13:19:08 -08:00
Harshavardhana 691035832a
fix: normalize object layer inputs (#11534)
Cases where we have applications making request
for `//` in object names make sure that all
are normalized to `/` and all such requests that
are prefixed '/' are removed. To ensure a
consistent view from all operations.
2021-03-09 12:58:22 -08:00
Anis Elleuch eac66e67ec
Use maximum parity for config files (#11740)
Some deployments have low parity (EC:2), but we really do not need to
save our config data with the same parity configuration.

N/2 would be better to keep MinIO configurations intact when unexpected
a number of drives fail.
2021-03-09 10:19:47 -08:00
Anis Elleuch 57f3ed22d4
erasure: Reduce the interval of cleaning up .trash folder (#11741)
Reduce from 30 to 10 minutes.
2021-03-09 09:45:38 -08:00
Poorna Krishnamoorthy 2f29719e6b
resize replication worker pool dynamically after config update (#11737) 2021-03-09 02:56:42 -08:00
Andreas Auernhammer 209fe61dcc
vault: disable Hashicorp Vault with opt-in (#11711)
This commit disables the Hashicorp Vault
support but provides a way to temp. enable
it via the `MINIO_KMS_VAULT_DEPRECATION=off`

Vault support has been deprecated long ago
and this commit just requires users to take
action if they maintain a Vault integration.
2021-03-09 00:02:35 -08:00
Harshavardhana 8ecffdb7a7 Revert "Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734)""
This reverts commit 806df164b2.
2021-03-08 16:12:17 -08:00
Harshavardhana 806df164b2 Revert "heal: Heal bucket metadata when a fresh disk is inserted (#11734)"
This reverts commit 64662a49ff.
2021-03-08 14:43:24 -08:00
Klaus Post 4ac9ed4248
CopyObject: Do not remove crypto info when compressed (#11702)
Removing crypto info makes it impossible to copy encrypted+compressed objects.

Disable destination compression when encrypted.
2021-03-08 12:57:54 -08:00
Klaus Post 3ff5f55dcb
Fetch fileinfo concurrently (#11700)
For non-erasure setups fetch up to 10 fileinfos concurrently.

Fixes #11625
2021-03-08 11:30:43 -08:00
Max Xu 097e5eba9f
feat: remove go-bindata-assetfs in favor of embed by upgrading to go1.16 (#11733) 2021-03-08 11:26:43 -08:00
Anis Elleuch 64662a49ff
heal: Heal bucket metadata when a fresh disk is inserted (#11734)
Replacing disk with a fresh one never heals bucket metadata (policy,
notification, etc..). This commit fixes the issue.
2021-03-08 10:54:13 -08:00
Harshavardhana 78e867e145
ignore healing .trash, .metacache amd .multipart paths (#11725) 2021-03-07 09:38:31 -08:00
Harshavardhana 9ccc483df6
[feat]: change erasure coding default block size from 10MiB to 1MiB (#11721)
major performance improvements in range GETs to avoid large
read amplification when ranges are tiny and random

```
-------------------
Operation: GET
Operations: 142014 -> 339421
Duration: 4m50s -> 4m56s
* Average: +139.41% (+1177.3 MiB/s) throughput, +139.11% (+658.4) obj/s
* Fastest: +125.24% (+1207.4 MiB/s) throughput, +132.32% (+612.9) obj/s
* 50% Median: +139.06% (+1175.7 MiB/s) throughput, +133.46% (+660.9) obj/s
* Slowest: +203.40% (+1267.9 MiB/s) throughput, +198.59% (+753.5) obj/s
```

TTFB from 10MiB BlockSize
```
* First Access TTFB: Avg: 81ms, Median: 61ms, Best: 20ms, Worst: 2.056s
```

TTFB from 1MiB BlockSize
```
* First Access TTFB: Avg: 22ms, Median: 21ms, Best: 8ms, Worst: 91ms
```

Full object reads however do see a slight change which won't be
noticeable in real world, so not doing any comparisons

TTFB still had improvements with full object reads with 1MiB

```
* First Access TTFB: Avg: 68ms, Median: 35ms, Best: 11ms, Worst: 1.16s
```

v/s

TTFB with 10MiB
```
* First Access TTFB: Avg: 388ms, Median: 98ms, Best: 20ms, Worst: 4.156s
```

This change should affect all new uploads, previous uploads should
continue to work with business as usual. But dramatic improvements can
be seen with these changes.
2021-03-06 14:09:34 -08:00
Anis Elleuch abce040088
fix: Remove repetitive IAM ready message (#11723)
"IAM initialization complete" is printed each 5 minutes, avoid this by
printing it only during the first initialization of IAM.
2021-03-06 09:27:46 -08:00
Anis Elleuch 558762bdf6
iam: Return a slice of policies for a group (#11722)
A group can have multiple policies, a user subscribed to readwrite &
diagnostics can perform S3 operations & admin operations as well.
However, the current code only returns one policy for one group.
2021-03-06 09:27:06 -08:00
Harshavardhana d971061305
use listPathRaw for HealObjects() instead of expensive WalkVersions() (#11675) 2021-03-06 09:25:48 -08:00
Andreas Auernhammer 509bcc01ad
fips: do not use SHA-3 when building a FIPS-140 2 binary (#11710)
This commit disables SHA-3 for OpenID when building a
FIPS-140 2 compatible binary. While SHA-3 is a
crypto. hash function accepted by NIST there is no
FIPS-140 2 compliant implementation available when
using the boringcrypto Go branch.

Therefore, SHA-3 must not be used when building
a FIPS-140 2 binary.
2021-03-05 20:43:42 -08:00
Krishnan Parthasarathi bcf9825082
Data usage should account for transitioned objects (#11717) 2021-03-05 14:15:53 -08:00
sgandon 124816f6a6
fix : IAM Intialization failing with a large number of users/policies (#11701) 2021-03-05 08:36:16 -08:00
Klaus Post fa9cf1251b
Imporve healing and reporting (#11312)
* Provide information on *actively* healing, buckets healed/queued, objects healed/failed.
* Add concurrent healing of multiple sets (typically on startup).
* Add bucket level resume, so restarts will only heal non-healed buckets.
* Print summary after healing a disk is done.
2021-03-04 14:36:23 -08:00
Harshavardhana d73d756a80
fix: incorrect errors thrown by lint (#11699)
fixes #11698
2021-03-04 14:27:38 -08:00
Aditya Manthramurthy 7488c77e7c
Test LDAP connection configuration at startup (#11684) 2021-03-04 12:17:36 -08:00
Harshavardhana 786585009e
fix: capture disks when entire peer is offline (#11697)
currently when one of the peer is down, the
drives from that peer are reported as '0/0'
offline instead we should capture/filter the
drives from the peer and populate it appropriately
such that `mc admin info` displays correct info.
2021-03-04 10:07:05 -08:00
Anis Elleuch 7be7109471
locking: Add Refresh for better locking cleanup (#11535)
Co-authored-by: Anis Elleuch <anis@min.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2021-03-03 18:36:43 -08:00
Klaus Post c3217bd6eb
Use actual size for buffer selection (#11687)
For compressed inputs, this will be -1, but the object may be small.
2021-03-03 16:28:10 -08:00
Andreas Auernhammer f14cc6c943
etag: add FromContentMD5 to parse content-md5 as ETag (#11688)
This commit adds the `FromContentMD5` function to
parse a client-provided content-md5 as ETag.

Further, it also adds multipart ETag computation
for future needs.
2021-03-03 12:58:28 -08:00
Harshavardhana 2c198ae7b6
fix: prometheus metrics disks_online count when disks are down (#11689)
prometheus metrics was using total disks instead
of online disk count, when disks were down, this
PR fixes this and also adds a new metric for
total_disk_count
2021-03-03 11:18:41 -08:00
Poorna Krishnamoorthy 690434514d
Avoid notification event for replicas (#11683)
Creating notification events for replica creation
is not particularly useful to send as the notification
event generated at source already includes replication
completion events.

For applications using replica cluster as failover, avoiding
duplicate notifications for replica event will allow seamless
failover.
2021-03-03 11:13:31 -08:00
Harshavardhana 039f59b552
fix: missing user policy enforcement in PostPolicyHandler (#11682) 2021-03-03 08:47:08 -08:00
Harshavardhana c6a120df0e
fix: Prometheus metrics to re-use storage disks (#11647)
also re-use storage disks for all `mc admin server info`
calls as well, implement a new LocalStorageInfo() API
call at ObjectLayer to lookup local disks storageInfo

also fixes bugs where there were double calls to StorageInfo()
2021-03-02 17:28:04 -08:00
Klaus Post cd9e30c0f4
IAM: Block while loading users (#11671)
While starting up a request that needs all IAM data will start another load operation if the first on startup hasn't finished. This slows down both operations.

Block these requests until initial load has completed.

Blocking calls will be ListPolicies, ListUsers, ListServiceAccounts, ListGroups - and the calls that eventually trigger these. These will wait for the initial load to complete.

Fixes issue seen in #11305
2021-03-02 17:08:25 -08:00
Harshavardhana f96d4cf7d3 fix: do not deny admins to change other passwords
fixes a regression from #11680
2021-03-02 17:02:32 -08:00
Harshavardhana 879599b0cf
fix: enforce deny if present for implicit permissions (#11680)
Implicit permissions for any user is to be allowed to
change their own password, we need to restrict this
further even if there is an implicit allow for this
scenario - we have to honor Deny statements if they
are specified.
2021-03-02 15:35:50 -08:00
Harshavardhana b1bb3f7016
[feat]: implement GetBucketPolicyStatus API (#11673)
additionally also add more APIs in notImplemented
list, adjust routing rules appropriately
2021-03-01 23:10:33 -08:00
Anis Elleuch e8d8dfa3ae
Add metric for internode RPC calls errors (#11669) 2021-03-01 12:31:33 -08:00
Nitish Tiwari bbd1244a88
Add support for mTLS for Audit log target (#11645) 2021-03-01 09:19:13 -08:00
Klaus Post 10bdb78699
fix: listObjectVersions Include object in marker (#11562)
ListObjectVersions would skip past the object in the marker when version id is specified. 
Make `listPath` return the object with the marker and truncate it if not needed.

Avoid having to parse unintended objects to find a version marker.
2021-03-01 08:12:02 -08:00
Shireesh Anjal 289b22d911
fix: pool number not added for one server (#11670)
The previous code was iterating over replies from peers and assigning
pool numbers to them, thus missing to add it for the local server.

Fixed by iterating over the server properties of all the servers
including the local one.
2021-03-01 08:09:43 -08:00
Harshavardhana 0b9c17443e
update gopsutil to use the v3 API (#11638) 2021-03-01 00:15:46 -08:00
Bala FA 23f7ab40b3
Add PoolNumber field to madmin.ServerProperties (#11327) 2021-02-28 21:26:28 -08:00
Harshavardhana 2f4af09c01 fix: alow changes to readAllData to decrement activeCount() 2021-02-28 20:09:23 -08:00
Harshavardhana 37960cbc2f
fix: avoid writing more content on network with O_DIRECT reads (#11659)
There was an io.LimitReader was missing for the 'length'
parameter for ranged requests, that would cause client to
get truncated responses and errors.

fixes #11651
2021-02-28 15:33:03 -08:00
cbows c67d1bf120
add unauthenticated lookup-bind mode to LDAP identity (#11655)
Closes #11646
2021-02-28 12:57:31 -08:00
Klaus Post c5b3a675fa
Block profiling tweaks (#11612)
The base profiles contains no valuable data, don't record them.

Reduce block rate by 2 orders of magnitude, should still capture just as valuable data with less CPU strain.
2021-02-27 09:22:14 -08:00
Harshavardhana b690304eed
use faster way for siphash (#11640) 2021-02-26 16:53:06 -08:00
Harshavardhana 9171d6ef65
rename all references from crawl -> scanner (#11621) 2021-02-26 15:11:42 -08:00
Harshavardhana 6386b45c08
[feat] use rename instead of recursive deletes (#11641)
most of the delete calls today spend time in
a blocking operation where multiple calls need
to be recursively sent to delete the objects,
instead we can use rename operation to atomically
move the objects from the namespace to `tmp/.trash`

we can schedule deletion of objects at this
location once in 15, 30mins and we can also add
wait times between each delete operation.

this allows us to make delete's faster as well
less chattier on the drives, each server runs locally
a groutine which would clean this up regularly.
2021-02-26 09:52:27 -08:00
Andreas Auernhammer 1f659204a2
remove GetObject from ObjectLayer interface (#11635)
This commit removes the `GetObject` method
from the `ObjectLayer` interface.

The `GetObject` method is not longer used by
the HTTP handlers implementing the high-level
S3 semantics. Instead, they use the `GetObjectNInfo`
method which returns both, an object handle as well
as the object metadata.

Therefore, it is no longer necessary that a concrete
`ObjectLayer` implements `GetObject`.
2021-02-26 09:52:02 -08:00
Harshavardhana f9f6fd0421
fix: service account permissions generated from LDAP user (#11637)
service accounts generated from LDAP parent user
did not inherit correct permissions, this PR fixes
this fully.
2021-02-25 13:49:59 -08:00
Klaus Post 85620dfe93
use bucket in path in distribution hash (#11634)
Use bucket in erasure distribution hash.

For the rare cases where objects with the same names are uploaded to many buckets.
2021-02-25 10:11:31 -08:00
Harshavardhana a8e4f64ff3 Revert "fix: remove persistence layer for metacache store in memory (#11538)"
This reverts commit b23659927c.
2021-02-24 22:24:51 -08:00
Krishnan Parthasarathi ca5c6e3160
fix: translate empty versionID string to null version where appropriate (#11629)
We store the null version as empty string. We should translate it to null
version for bucket with version suspended too.
2021-02-24 18:39:10 -08:00
Harshavardhana b23659927c
fix: remove persistence layer for metacache store in memory (#11538)
store the cache in-memory instead of disks to avoid large
write amplifications for list heavy workloads, store in
memory instead and let it auto expire.
2021-02-24 15:51:41 -08:00
Andreas Auernhammer c1a49be639
use crypto/sha256 for FIPS 140-2 compliance (#11623)
This commit replaces the usage of
github.com/minio/sha256-simd with crypto/sha256
of the standard library in all non-performance
critical paths.

This is necessary for FIPS 140-2 compliance which
requires that all crypto. primitives are implemented
by a FIPS-validated module.

Go can use the Google FIPS module. The boringcrypto
branch of the Go standard library uses the BoringSSL
FIPS module to implement crypto. primitives like AES
or SHA256.

We only keep github.com/minio/sha256-simd when computing
the content-SHA256 of an object. Therefore, this commit
relies on a build tag `fips`.

When MinIO is compiled without the `fips` flag it will
use github.com/minio/sha256-simd. When MinIO is compiled
with the fips flag (go build --tags "fips") then MinIO
uses crypto/sha256 to compute the content-SHA256.
2021-02-24 09:00:15 -08:00
Klaus Post 03172b89e2
Ensure cache has finished deserializing (#11620)
Make sure that response has been fully deserialized before returning.
2021-02-24 02:59:49 -08:00
Harshavardhana b517c791e9
[feat]: use DSYNC for xl.meta writes and NOATIME for reads (#11615)
Instead of using O_SYNC, we are better off using O_DSYNC
instead since we are only ever interested in data to be
persisted to disk not the associated filesystem metadata.

For reads we ask customers to turn off noatime, but instead
we can proactively use O_NOATIME flag to avoid atime updates
upon reads.
2021-02-24 00:14:16 -08:00
Petr Tichý 14aef52004
remove Content-MD5 on Range requests (#11611)
This removes the Content-MD5 response header on Range requests in Azure
Gateway mode. The partial content MD5 doesn't match the full object MD5
in metadata.
2021-02-23 19:32:56 -08:00
Andreas Auernhammer d4b822d697
pkg/etag: add new package for S3 ETag handling (#11577)
This commit adds a new package `etag` for dealing
with S3 ETags.

Even though ETag is often viewed as MD5 checksum of
an object, handling S3 ETags correctly is a surprisingly
complex task. While it is true that the ETag corresponds
to the MD5 for the most basic S3 API operations, there are
many exceptions in case of multipart uploads or encryption.

In worse, some S3 clients expect very specific behavior when
it comes to ETags. For example, some clients expect that the
ETag is a double-quoted string and fail otherwise.
Non-AWS compliant ETag handling has been a source of many bugs
in the past.

Therefore, this commit adds a dedicated `etag` package that provides
functionality for parsing, generating and converting S3 ETags.
Further, this commit removes the ETag computation from the `hash`
package. Instead, the `hash` package (i.e. `hash.Reader`) should
focus only on computing and verifying the content-sha256.

One core feature of this commit is to provide a mechanism to
communicate a computed ETag from a low-level `io.Reader` to
a high-level `io.Reader`.

This problem occurs when an S3 server receives a request and
has to compute the ETag of the content. However, the server
may also wrap the initial body with several other `io.Reader`,
e.g. when encrypting or compressing the content:
```
   reader := Encrypt(Compress(ETag(content)))
```
In such a case, the ETag should be accessible by the high-level
`io.Reader`.

The `etag` provides a mechanism to wrap `io.Reader` implementations
such that the `ETag` can be accessed by a type-check.
This technique is applied to the PUT, COPY and Upload handlers.
2021-02-23 12:31:53 -08:00
Harshavardhana aa7244a9a4
fix: make sure to convert the error properly in HealBucket() (#11610)
server startup code expects the object layer to properly
convert error into a proper type, so that in situations when
servers are coming up and quorum is not available servers
wait on each other.
2021-02-23 09:23:11 -08:00
Harshavardhana 2a79ea0332
isServerResolvable its sufficient to check server is reachable (#11609)
using isServerResolvable for expiration can lead to chicken
and egg problems, a lock might expire knowingly when server
is booting up causing perpetual locks getting expired.
2021-02-22 16:29:53 -08:00
Aditya Manthramurthy 02e7de6367
LDAP config: fix substitution variables (#11586)
- In username search filter and username format variables we support %s for
replacing with the username.

- In group search filter we support %s for username and %d for the full DN of
the username.
2021-02-22 13:20:36 -08:00
Harshavardhana da676ac298
remove network calls for getLocalDisks (#11603) 2021-02-22 13:19:44 -08:00
Harshavardhana 18ec933085
fix: for containers use root-disk detection cleverly (#11593)
root-disk implemented currently had issues where root
disk partitions getting modified might race and provide
incorrect results, to avoid this lets rely again back on
DeviceID and match it instead.

In-case of containers `/data` is one such extra entity that
needs to be verified for root disk, due to how 'overlay'
filesystem works and the 'overlay' presents a completely
different 'device' id - using `/data` as another entity
for fallback helps because our containers describe 'VOLUME'
parameter that allows containers to automatically have a
virtual `/data` that points to the container root path this
can either be at `/` or `/var/lib/` (on different partition)
2021-02-22 10:32:21 -08:00
Harshavardhana c31d2c3fdc
fix: CrawlAndGetDataUsage close pipe() before using a new one (#11600)
also additionally make sure errors during deserializer closes
the reader with right error type such that Write() end
actually see the final error, this avoids a waitGroup usage
and waiting.
2021-02-22 10:04:32 -08:00
Harshavardhana 8778828a03
fix: read metadata in O_DIRECT if configured and supported (#11594)
reduce the page-cache pressure completely by moving
the entire read-phase of our operations to O_DIRECT,
primarily this is going to be very useful for chatty
metadata operations such as listing, scanner, ilm, healing
like operations to avoid filling up the page-cache upon
repeated runs.
2021-02-22 01:36:17 -08:00
Sarasa Kisaragi 48b212dd8e
Fix HDFS wrong filepath if subpath provided (#11574) 2021-02-20 15:32:18 -08:00
Harshavardhana be7de911c4
fix: update minio-go to fix an issue with S3 gateway (#11591)
since we have changed our default envs to MINIO_ROOT_USER,
MINIO_ROOT_PASSWORD this was not supported by minio-go
credentials package, update minio-go to v7.0.10 for this
support. This also addresses few bugs related to users
had to specify AWS_ACCESS_KEY_ID as well to authenticate
with their S3 backend if they only used MINIO_ROOT_USER.
2021-02-20 11:10:21 -08:00
Harshavardhana 8cad407e0b
fix: Bring support for symlink on regular files on NAS (#11383)
fixes #11203
2021-02-20 00:30:12 -08:00
Poorna Krishnamoorthy 85d2187c20
fix: ETag mismatch for large upload in replica (#11587) 2021-02-20 00:22:17 -08:00
Anis Elleuch 98d3f94996
metrics: Add the number of requests in the waiting queue (#11580)
We can use this metric to check if there are too many S3 clients in the
queue and could explain why some of those S3 clients are timing out.

```
minio_s3_requests_waiting_total{server="127.0.0.1:9000"} 9981
```

If max_requests is 10000 then there is a strong possibility that clients
are timing out because of the queue deadline.
2021-02-20 00:21:55 -08:00
mailsmail 173284903b
fix incorrect http range in SelectObjectContentHandler (#11585) 2021-02-19 17:55:28 -08:00
Poorna Krishnamoorthy 2dce5d9442
fix: delete marker permanent delete replication (#11581) 2021-02-18 16:35:37 -08:00
Anis Elleuch f28b063091
heal: Use healDeleteDangling global const in self healing (#11579)
A small fix, use healDeleteDangling constant instead of 'true' in the
self-healing code.
2021-02-18 15:16:20 -08:00
Klaus Post c5b2a8441b
fix: faster healing when disk is replaced. (#11520) 2021-02-18 11:06:54 -08:00
Klaus Post 8a6b13c239
Avoid synchronizing usage writes (#11560)
If the periodic `case <-t.C:` save gets held up for a long time it will end up 
synchronize all disk writes for saving the caches.

We add jitter to per set writes so they don't sync up and don't hold a 
lock for the write, since it isn't needed anyway.

If an outage prevents writes for a long while we also add individual 
waits for each disk in case there was a queue.

Furthermore limit the number of buffers kept to 2GiB, since this could get 
huge in large clusters. This will not act as a hard limit but should be enough 
for normal operation.
2021-02-18 00:38:37 -08:00
Poorna Krishnamoorthy 8e8a792d9d
Allow delete marker replication from replica (#11566)
in the case of active-active replication.

This PR also has the following changes:

- add docs on replication design
- fix corner case of completing versioned delete on a delete marker
  when the target is down and `mc rm --vid` is performed repeatedly. Instead
  the version should still be retained in the `PENDING|FAILED` state until
  replication sync completes.
- remove `s3:Replication:OperationCompletedReplication` and
   `s3:Replication:OperationFailedReplication` from ObjectCreated 
  events type
2021-02-18 00:33:51 -08:00
Harshavardhana 95e0acbb26
fix: allow accountInfo with creds with parentUsers (#11568) 2021-02-17 20:57:17 -08:00
Poorna Krishnamoorthy 55037e6e54
lifecycle:Fix args passed to determine expiry header (#11567) 2021-02-17 19:25:19 -08:00
Harshavardhana 289e1d8b2a
fix: reduce crawler memory usage by orders of magnitude (#11556)
currently crawler waits for an entire readdir call to
return until it processes usage, lifecycle, replication
and healing - instead we should pass the applicator all
the way down to avoid building any special stack for all
the contents in a single directory.

This allows for

- no need to remember the entire list of entries per directory
  before applying the required functions
- no need to wait for entire readdir() call to finish before
  applying the required functions
2021-02-17 15:34:42 -08:00
Harshavardhana ffea6fcf09
fix: rename crawler as scanner in config (#11549) 2021-02-17 12:04:11 -08:00
Klaus Post 11b2220696
Don't autoheal if disks are healing (#11558)
Don't spawn automatic healing ops if a disk is healing.
2021-02-17 10:18:12 -08:00
Harshavardhana aa8450a2a1
fix: parallelize getPoolIdx() for object lookup (#11547) 2021-02-16 19:36:15 -08:00
Harshavardhana 7d4a2d2b68
fix: multiple pool reads parallelize when possible (#11537) 2021-02-16 02:43:47 -08:00
Anis Elleuch c4e12dc846
fix: in MultiDelete API return MalformedXML upon empty input (#11532)
To follow S3 spec
2021-02-13 09:48:25 -08:00
Harshavardhana a94a9c37fa
fix: support IAM policy handling for wildcard actions (#11530)
This PR fixes

- allow 's3:versionid` as a valid conditional for
  Get,Put,Tags,Object locking APIs
- allow additional headers missing for object APIs
- allow wildcard based action matching
2021-02-12 23:05:09 -08:00
Harshavardhana 79b6a43467
fix: avoid timed value for network calls (#11531)
additionally simply timedValue to have RWMutex
to avoid concurrent calls to DiskInfo() getting
serialized, this has an effect on all calls that
use GetDiskInfo() on the same disks.

Such as getOnlineDisks, getOnlineDisksWithoutHealing
2021-02-12 18:17:52 -08:00
Shireesh Anjal 928de04f7a
fix: osinfos incomplete in case of warnings (#11505)
The function used for getting host information
(host.SensorsTemperaturesWithContext) returns warnings in some cases.

Returning with error in such cases means we miss out on the other useful
information already fetched (os info).

If the OS info has been succesfully fetched, it should always be
included in the output irrespective of whether the other data (CPU
sensors, users) could be fetched or not.
2021-02-12 17:57:57 -08:00
Poorna Krishnamoorthy 93fd248b52
fix: save ModTime properly in disk cache (#11522)
fix #11414
2021-02-11 19:25:47 -08:00
Harshavardhana 2a7b123895
turn off http2 for TLS setups for now (#11523)
due to lots of issues with x/net/http2, as
well as the bundled h2_bundle.go in the go
runtime should be avoided for now.

https://github.com/golang/go/issues/23559
https://github.com/golang/go/issues/42534
https://github.com/golang/go/issues/43989
https://github.com/golang/go/issues/33425
https://github.com/golang/go/issues/29246

With collection of such issues present, it
make sense to remove HTTP2 support for now
2021-02-11 15:53:04 -08:00
Harshavardhana b3c56b53fb
fix: metacache should only rename entries during cleanup (#11503)
To avoid large delays in metacache cleanup, use rename
instead of recursive delete calls, renames are cheaper
move the content to minioMetaTmpBucket and then cleanup
this folder once in 24hrs instead.

If the new cache can replace an existing one, we should
let it replace since that is currently being saved anyways,
this avoids pile up of 1000's of metacache entires for
same listing calls that are not necessary to be stored
on disk.
2021-02-11 10:22:03 -08:00
Poorna Krishnamoorthy f24d8127ab
fix: DeleteMultipleObjectsHandler to process deleted objects correctly (#11515)
DeleteMarkerVersionID which is returned by the lower layer should 
not be used in the key to lookup ObjectToDelete map
2021-02-10 23:41:41 -08:00
Harshavardhana 7875d472bc
avoid notification for non-existent delete objects (#11514)
Skip notifications on objects that might have had
an error during deletion, this also avoids unnecessary
replication attempt on such objects.

Refactor some places to make sure that we have notified
the client before we

- notify
- schedule for replication
- lifecycle etc.
2021-02-10 22:00:42 -08:00
Harshavardhana 711adb9652 remove ipv6 fallbackdelay leave it as default 2021-02-10 17:35:09 -08:00
Poorna Krishnamoorthy e6b4ea7618
More fixes for delete marker replication (#11504)
continuation of PR#11491 for multiple server pools and
bi-directional replication.

Moving proxying for GET/HEAD to handler level rather than
server pool layer as this was also causing incorrect proxying 
of HEAD.

Also fixing metadata update on CopyObject - minio-go was not passing
source version ID in X-Amz-Copy-Source header
2021-02-10 17:25:04 -08:00
Aditya Manthramurthy 466e95bb59
Return group DN instead of group name in LDAP STS (#11501)
- Additionally, check if the user or their groups has a policy attached during
the STS call.

- Remove the group name attribute configuration value.
2021-02-10 16:52:49 -08:00
Harshavardhana 881f98e511
fix: use getPoolIdx in DeleteObjects() (#11513)
filter out relevant objects for each pool to
avoid calling, further delete operations on
subsequent pools where some of these objects
might not exist.

This is mainly useful to avoid situations
during bi-directional bucket replication.
2021-02-10 14:25:43 -08:00
Harshavardhana cbf4bb62e0
fix: getPoolIdx decouple from top level options (#11512)
top-level options shouldn't be passed down for
GetObjectInfo() while verifying the objects in
different pools, this is to make sure that
we always get the value from the pool where
the object exists.
2021-02-10 11:45:02 -08:00
Anis Elleuch 682482459d
Change the default object content-type to binary/octet-stream (#11508) 2021-02-10 08:56:37 -08:00
Krishnan Parthasarathi b87fae0049
Simplify PutObjReader for plain-text reader usage (#11470)
This change moves away from a unified constructor for plaintext and encrypted
usage. NewPutObjReader is simplified for the plain-text reader use. For
encrypted reader use, WithEncryption should be called on an initialized PutObjReader.

Plaintext:
func NewPutObjReader(rawReader *hash.Reader) *PutObjReader

The hash.Reader is used to provide payload size and md5sum to the downstream
consumers. This is different from the previous version in that there is no need
to pass nil values for unused parameters.

Encrypted:
func WithEncryption(encReader *hash.Reader,
key *crypto.ObjectKey) (*PutObjReader, error)

This method sets up encrypted reader along with the key to seal the md5sum
produced by the plain-text reader (already setup when NewPutObjReader was
called).

Usage:
```
  pReader := NewPutObjReader(rawReader)
  // ... other object handler code goes here

  // Prepare the encrypted hashed reader
  pReader, err = pReader.WithEncryption(encReader, objEncKey)

```
2021-02-10 08:52:50 -08:00
Shireesh Anjal 5a18d437ce
fix: drive hw info incomplete when smartinfo fails (#11509)
Collection of SMART information doesn't work in certain scenarios e.g.
in a container based setup. In such cases, instead of returning an error
(without any data), we should only set the error on the smartinfo
struct, so that other important drive hw info like device, mountpoint,
etc is retained in the output.
2021-02-10 08:48:14 -08:00
Poorna Krishnamoorthy 93eb549a83
fix: duplicate delete marker attempts in bi-directional replication (#11491) 2021-02-09 15:11:43 -08:00
Harshavardhana fe3c39b583
use the new errgroup API whereever applicable (#11466)
start using the new errgroup concurrency control
API introduced in #11457
2021-02-09 12:08:25 -08:00
Harshavardhana 84d400487f
fix: accountInfo API to cater for federated setups (#11484)
when MinIO is deployed in a federated setup, use etcd 
based listing of buckets to provide appropriate filtering 
of buckets per user.
2021-02-09 09:53:07 -08:00
Shireesh Anjal 3afa499885
fix: empty buckets/objects nodes in new setup (#11493) 2021-02-09 09:52:38 -08:00
Krishna Srinivas 876b79b8d8
read-health check endpoint returns success if cluster can serve read requests (#11310) 2021-02-09 01:00:44 -08:00
Ritesh H Shukla 3d74efa6b1
fux: copy object for encrypted objects (#11490) 2021-02-08 19:58:17 -08:00
Harshavardhana 68d299e719
fix: case-insensitive lookups for metadata (#11489)
continuation of #11487, with more changes
2021-02-08 18:12:28 -08:00
Poorna Krishnamoorthy f9c5636c2d
fix: lookup metdata case insensitively (#11487)
while setting replication options
2021-02-08 16:19:05 -08:00
Klaus Post 9b10118d34
Metacache add abs entry limit (#11483)
Add an absolute limit to the number of metacaches for a bucket.

Delete excess caches if they haven't been handed out in an hour.
2021-02-08 11:36:16 -08:00
Harshavardhana 0e3211f4ad
fix: server upgrades should have more descriptive error messages (#11476)
during rolling upgrade, provide a more descriptive error
message and discourage rolling upgrade in such situations,
allowing users to take action.

additionally also rename `slashpath -> pathutil` to avoid
a slighly mis-pronounced usage of `path` package.
2021-02-08 10:15:12 -08:00
Harshavardhana 2e4d9124ad
honor region specified for remote targets (#11480)
fixes #11472
2021-02-08 08:54:27 -08:00
Harshavardhana 6fef4c21b9
fix: align atomic variables for 32bit arch (#11475)
fixes #11474
2021-02-08 08:51:12 -08:00
Poorna Krishnamoorthy 8e1bbd989a
replication:alloc UserDefined map before use (#11478) 2021-02-07 22:01:10 -08:00
Sarasa Kisaragi 152d7cd95b
HDFS support keytab (#11473) 2021-02-07 17:29:47 -08:00
Harshavardhana 0d057c777a remove restriction for multi pool distribution algo 2021-02-06 16:19:05 -08:00
Anis Elleuch 275f7a63e8
lc: Apply DeleteAction correctly to objects (#11471)
When lifecycle decides to Delete an object and not a version in a
versioned bucket, the code should create a delete marker and not
removing the scanned version.

This commit fixes the issue.
2021-02-06 16:10:33 -08:00
Shireesh Anjal 97fe57bba9
Remove Connections from SysProcess struct (#11373)
The connections info of the processes takes up a huge amount of space,
and is not important for adding any useful health checks. Removing it
will significantly reduce the size of the subnet health report.
2021-02-05 21:32:28 -08:00
Harshavardhana 88c1bb0720
fix: improper ticker usage in goroutines (#11468)
- lock maintenance loop was incorrectly sleeping
  as well as using ticker badly, leading to
  extra expiration routines getting triggered
  that could flood the network.

- multipart upload cleanup should be based on
  timer instead of ticker, to ensure that long
  running jobs don't get triggered twice.

- make sure to get right lockers for object name
2021-02-05 19:23:48 -08:00
Harshavardhana 1fdafaf72f
fix: listing for directory object when delimiter is present (#11463)
When you have heirarchy of prefixes with directory objects
our current master would list directory objects as prefixes
when delimiter is present, this is inconsistent with AWS S3

```
aws s3api list-objects --endpoint-url http://localhost:9000 \
    --profile minio --bucket testbucket-v --prefix new/ --delimiter /
{
    "CommonPrefixes": [
        {
            "Prefix": "new/"
        },
        {
            "Prefix": "new/new/"
        }
    ]
}
```

Instead this PR fixes this to behave like AWS S3

```
aws s3api list-objects --endpoint-url http://localhost:9000 \
      --profile minio --bucket testbucket-v --prefix new/ --delimiter /
{
    "Contents": [
        {
            "Key": "new/",
            "LastModified": "2021-02-05T06:27:42.660Z",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "Size": 0,
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "",
                "ID": "02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4"
            }
        }
    ],
    "CommonPrefixes": [
        {
            "Prefix": "new/new/"
        }
    ]
}
```
2021-02-05 16:24:40 -08:00
Ritesh H Shukla 5fe4bb6b36
Reduce redundant crawler logging (#11448) 2021-02-05 15:51:11 -08:00
Harshavardhana 99b733d44c
fix: deletion of delete marker regression (#11465)
fixes #11440
fixes #11451
fixes #11454
2021-02-05 15:06:23 -08:00
Klaus Post b4ac05523b
Add parallel bucket healing during startup (#11457)
Replaces #11449

Does concurrent healing but limits concurrency to 50 buckets.

Aborts on first error.

`errgroup.Group` is extended to facilitate this in a generic way.
2021-02-05 13:04:26 -08:00
Anis Elleuch c7eacba41c
health-info: Add tags to errors (#11412)
We use multiple libraries in health info, but the returned error does
not indicate exactly what library call is failing, hence adding named
tags to returned errors whenever applicable.
2021-02-05 12:37:15 -08:00
Anis Elleuch 1887c25279
xl: Fix feeding NumVersions & SuccessorModTime to lifecycle (#11462)
After recent refactor where lifecycle started to rely on ObjectInfo to
make decisions, it turned out there are some issues calculating
Successor Modtime and NumVersions, hence the lifecycle is not working as
expected in a versioning bucket in some cases.

This commit fixes the behavior.
2021-02-05 11:59:08 -08:00
Harshavardhana c9b0f595b9
support directory objects in listing in certain scenarios (#11452)
When a directory object is presented as a `prefix`
param our implementation tend to only list objects
present common to the `prefix` than the `prefix` itself,
to mimic AWS S3 like flat key behavior this PR ensures
that if `prefix` is directory object, it should be
automatically considered to be part of the eventual
listing result.

fixes #11370
2021-02-05 10:12:25 -08:00
Harshavardhana 8bb580abfc
fix: use getObjectNInfo to avoid bytes.Buffer usage (#11428)
few places were still using legacy call GetObject()
which was mainly designed for client response writer,
use GetObjectNInfo() for internal calls instead.
2021-02-05 09:57:30 -08:00
Harshavardhana da55a05587
fix aggressive expiration detection (#11446)
for some flaky networks this may be too fast of a value
choose a defensive value, and let this be addressed
properly in a new refactor of dsync with renewal logic.

Also enable faster fallback delay to cater for misconfigured
IPv6 servers

refer
 - https://golang.org/pkg/net/#Dialer
 - https://tools.ietf.org/html/rfc6555
2021-02-04 16:56:40 -08:00
Harshavardhana 3fc4d6f620
update dependenices for relevant projects (#11445)
- minio-go -> v7.0.8
- ldap/v3 -> v3.2.4
- reedsolomon -> v1.9.11
- sio-go -> v0.3.1
- msgp -> v1.1.5
- simdjson-go, md5-simd, highwayhash
2021-02-04 13:49:52 -08:00
Ritesh H Shukla 67a8f37df0
fix: disk usage capacity metric reporting (#11435) 2021-02-04 12:26:58 -08:00
ArthurMa df0c678167
fix: ldap config parsing issue for UserDNSearchFilter (#11437) 2021-02-04 11:07:29 -08:00
Harshavardhana f108873c48
fix: replication metadata comparsion and other fixes (#11410)
- using miniogo.ObjectInfo.UserMetadata is not correct
- using UserTags from Map->String() can change order
- ContentType comparison needs to be removed.
- Compare both lowercase and uppercase key names.
- do not silently error out constructing PutObjectOptions
  if tag parsing fails
- avoid notification for empty object info, failed operations
  should rely on valid objInfo for notification in all
  situations
- optimize copyObject implementation, also introduce a new 
  replication event
- clone ObjectInfo() before scheduling for replication
- add additional headers for comparison
- remove strings.EqualFold comparison avoid unexpected bugs
- fix pool based proxying with multiple pools
- compare only specific metadata

Co-authored-by: Poorna Krishnamoorthy <poornas@users.noreply.github.com>
2021-02-03 20:41:33 -08:00
Andreas Auernhammer 871b450dbd
crypto: add support for decrypting SSE-KMS metadata (#11415)
This commit refactors the SSE implementation and add
S3-compatible SSE-KMS context handling.

SSE-KMS differs from SSE-S3 in two main aspects:
 1. The client can request a particular key and
    specify a KMS context as part of the request.
 2. The ETag of an SSE-KMS encrypted object is not
    the MD5 sum of the object content.

This commit only focuses on the 1st aspect.

A client can send an optional SSE context when using
SSE-KMS. This context is remembered by the S3 server
such that the client does not have to specify the
context again (during multipart PUT / GET / HEAD ...).
The crypto. context also includes the bucket/object
name to prevent renaming objects at the backend.

Now, AWS S3 behaves as following:
 - If the user does not provide a SSE-KMS context
   it does not store one - resp. does not include
   the SSE-KMS context header in the response (e.g. HEAD).
 - If the user specifies a SSE-KMS context without
   the bucket/object name then AWS stores the exact
   context the client provided but adds the bucket/object
   name internally. The response contains the KMS context
   without the bucket/object name.
 - If the user specifies a SSE-KMS context with
   the bucket/object name then AWS again stores the exact
   context provided by the client. The response contains
   the KMS context with the bucket/object name.

This commit implements this behavior w.r.t. SSE-KMS.
However, as of now, no such object can be created since
the server rejects SSE-KMS encryption requests.

This commit is one stepping stone for SSE-KMS support.

Co-authored-by: Harshavardhana <harsha@minio.io>
2021-02-03 15:19:08 -08:00
Harshavardhana f71e192343
avoid listing an empty dir without __XLDIR__ (#11427)
```
minio server /tmp/disk{1...4}
mc mb myminio/testbucket/
mkdir -p /tmp/disk{1..4}/testbucket/test-prefix/
```

This would end up being listed in the current
master, this PR fixes this situation.

If a directory is a leaf dir we should it
being listed, since it cannot be deleted anymore
with DeleteObject, DeleteObjects() API calls
because we natively support directories now.

Avoid listing it and let healing purge this folder
eventually in the background.
2021-02-03 14:06:54 -08:00
Anis Elleuch b3f81e75f6
xl: Make it clear when to create delete marker for a non existant object (#11423) 2021-02-03 10:33:43 -08:00
Klaus Post a71e0483c9
Fix nil disks in getOnlineDisksWithHealing (#11419)
If a disk is skipped when nil it is still returned.
2021-02-02 17:04:37 -08:00
Klaus Post 4a9d9c8585
Update colinmarc/hdfs (#11417)
Updates needed dependency as well.

Fixes #11416
2021-02-02 15:37:30 -08:00
Harshavardhana c885777ac6
Add support for TCP_QUICKACK (#11369)
TCP_QUICKACK is a setting that allows TCP endpoints
to acknowledge the receipt of data instantly in situations
where they would normally wait to see if more data
would be arriving.

https://assets.extrahop.com/whitepapers/TCP-Optimization-Guide-by-ExtraHop.pdf
2021-02-02 09:44:18 -08:00
Poorna Krishnamoorthy fe3aca70c3
Make number of replication workers configurable. (#11379)
MINIO_API_REPLICATION_WORKERS env.var and
`mc admin config set api` allow number of replication
workers to be configurable. Defaults to half the number
of cpus available.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2021-02-02 16:45:06 +05:30
Ritesh H Shukla c4848f9b4f
Add process start time to cluster metrics. (#11405) 2021-02-01 23:02:18 -08:00
Andreas Auernhammer 838d4dafbd
gateway: don't use encrypted ETags for If-Match (#11400)
This commit fixes a bug in the S3 gateway that causes
GET requests to fail when the object is encrypted by the
gateway itself.

The gateway was not able to GET the object since it always
specified a `If-Match` pre-condition checking that the object
ETag matches an expected ETag - even for encrypted ETags.

The problem is that an encrypted ETag will never match the ETag
computed by the backend causing the `If-Match` pre-condition
to fail.

This commit fixes this by not sending an `If-Match` header when
the ETag is encrypted. This is acceptable because:
  1. A gateway-encrypted object consists of two objects at the backend
     and there is no way to provide a concurrency-safe implementation
     of two consecutive S3 GETs in the deployment model of the S3
     gateway.
     Ref: S3 gateways are self-contained and isolated - and there may
          be multiple instances at the same time (no lock across
          instances).
  2. Even if the data object changes (concurrent PUT) while gateway
     A has download the metadata object (but not issued the GET to
     the data object => data race) then we don't return invalid data
     to the client since the decryption (of the currently uploaded data)
     will fail - given the metadata of the previous object.
2021-02-01 23:02:08 -08:00
Anis Elleuch e96fdcd5ec
tagging: Add event notif for PUT object tagging (#11366)
An optimization to avoid double calling for during PutObject tagging
2021-02-01 13:52:51 -08:00
Anis Elleuch 6ef678663e
xl: Create a delete-marker when no other version exists (#11362)
Currently, it is not possible to create a delete-marker when xl.meta
does not exist (no version is created for that object yet). This makes a
problem for replication and mc mirroring with versioning enabled.

This also follows S3 specification.
2021-02-01 13:23:50 -08:00
Harshavardhana f737a027cf fix: regression introduced in federated listing buckets
regression was introduced in
6cd255d516 fix it properly.
2021-02-01 12:06:58 -08:00
Anis Elleuch 65aa2bc614
ilm: Remove object in HEAD/GET if having an applicable ILM rule (#11296)
Remove an object on the fly if there is a lifecycle rule with delete
expiry action for the corresponding object.
2021-02-01 09:52:11 -08:00
Andreas Auernhammer 33554651e9
crypto: deprecate native Hashicorp Vault support (#11352)
This commit deprecates the native Hashicorp Vault
support and removes the legacy Vault documentation.

The native Hashicorp Vault documentation is marked as
outdated and deprecated for over a year now. We give
another 6 months before we start removing Hashicorp Vault
support and show a deprecation warning when a MinIO server
starts with a native Vault configuration.
2021-01-29 17:55:37 -08:00
Poorna Krishnamoorthy c82aef0a56
fix ObjectInfo returned by CopyObject (#11377)
erasure CopyObject was returning old metadata
2021-01-29 14:49:18 -08:00
Harshavardhana 1e53bf2789
fix: allow expansion with newer constraints for older setups (#11372)
currently we had a restriction where older setups would
need to follow previous style of "stripe" count being same
expansion, we can relax that instead newer pools can be
expanded for older setups with newer constraints of
common parity ratio.
2021-01-29 11:40:55 -08:00
Ritesh H Shukla c8489a8f0c
fix: log notification errors only once (#11350) 2021-01-28 13:40:31 -08:00
Klaus Post 2680772d4b
Don't mark remotes online when shutting down (#11368)
Shutting down will mark remotes online when the shutdown has 
started since the context is canceled.

For example:

```
API: SYSTEM()
Time: 16:21:31 CET 01/28/2021
DeploymentID: 313b0065-c5a1-4aa3-9233-07223e77a730
Error: Storage resources are insufficient for the write operation .minio.sys/tmp/ced455c4-3d27-4bdd-95fc-b4707a179b8a/fd934ef3-8fc8-4330-abc1-f039fbbb9700/part.1 (cmd.InsufficientWriteQuorum)
       1: d:\minio\minio\cmd\data-usage.go:56:cmd.storeDataUsageInBackend()
Exiting on signal: INTERRUPT
Client http://127.0.0.1:9002/minio/lock/v5 online
Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d3/v24 online
Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d2/v24 online
Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d1/v24 online
Client http://127.0.0.1:9002/minio/peer/v12 online
Client http://127.0.0.1:9002/minio/storage/data/distxl/s2/d4/v24 online
```

Use a fresh context for health checks.
2021-01-28 13:38:12 -08:00
Harshavardhana 567f7bdd05 fix: verify overlapping domains when > 1 2021-01-28 13:08:53 -08:00
Harshavardhana 6cd255d516
fix: allow updated domain names in federation (#11365)
additionally also disallow overlapping domain names
2021-01-28 11:44:48 -08:00
Aditya Manthramurthy e79829b5b3
Bind to lookup user after user auth to lookup ldap groups (#11357) 2021-01-27 17:31:21 -08:00
Poorna Krishnamoorthy fd3f02637a
fix: replication regression due to proxying requests (#11356)
In PR #11165 due to incorrect proxying for 2 
way replication even when the object was not 
yet replicated

Additionally, fix metadata comparisons when
deciding to do full replication vs metadata copy.

fixes #11340
2021-01-27 11:22:34 -08:00
Harshavardhana e019f21bda
fix: trigger heal if one of the parts are not found (#11358)
Previously we added heal trigger when bit-rot checks
failed, now extend that to support heal when parts
are not found either. This healing gets only triggered
if we can successfully decode the object i.e read
quorum is still satisfied for the object.
2021-01-27 10:21:14 -08:00
Anis Elleuch e9ac7b0fb7
heal: Remove empty directories (#11354)
Since the introduction of __XLDIR__, an empty directory does not have a
meaning anymore in erasure mode. Make healing removes it wherever it
finds it.
2021-01-27 02:19:28 -08:00
Harshavardhana 1debd722b5 rename last remaining Zone->Pool 2021-01-26 20:47:42 -08:00
massintha azamoum e7f6051f19
Send bucket name to peers when bucket notification is enabled (#11351) 2021-01-26 13:48:28 -08:00
Harshavardhana 6717295e18 fix: rename audit log docs and datastructure 2021-01-26 13:39:55 -08:00
Anis Elleuch 00cff1aac5
audit: per object send pool number, set number and servers per operation (#11233) 2021-01-26 13:21:51 -08:00
Harshavardhana 9722531817 fix: purge LDAP deprecated keys 2021-01-26 09:53:29 -08:00
Harshavardhana 5c6bfae4c7
fix: load credentials from etcd directly when possible (#11339)
under large deployments loading credentials might be
time consuming, while this is okay and we will not
respond quickly for `mc admin user list` like queries
but it is possible to support `mc admin user info`

just like how we handle authentication by fetching
the user directly from persistent store.

additionally support service accounts properly,
reloaded from etcd during watch() - this was missing

This PR is also half way remedy for #11305
2021-01-25 20:01:49 -08:00
Aditya Manthramurthy 5f51ef0b40
Add LDAP Lookup-Bind mode (#11318)
This change allows the MinIO server to be configured with a special (read-only)
LDAP account to perform user DN lookups.

The following configuration parameters are added (along with corresponding
environment variables) to LDAP identity configuration (under `identity_ldap`):

- lookup_bind_dn / MINIO_IDENTITY_LDAP_LOOKUP_BIND_DN
- lookup_bind_password / MINIO_IDENTITY_LDAP_LOOKUP_BIND_PASSWORD
- user_dn_search_base_dn / MINIO_IDENTITY_LDAP_USER_DN_SEARCH_BASE_DN
- user_dn_search_filter / MINIO_IDENTITY_LDAP_USER_DN_SEARCH_FILTER

This lookup-bind account is a service account that is used to lookup the user's
DN from their username provided in the STS API. When configured, searching for
the user DN is enabled and configuration of the base DN and filter for search is
required. In this "lookup-bind" mode, the username format is not checked and must
not be specified. This feature is to support Active Directory setups where the
DN cannot be simply derived from the username.

When the lookup-bind is not configured, the old behavior is enabled: the minio
server performs LDAP lookups as the LDAP user making the STS API request and the
username format is checked and configuring it is required.
2021-01-25 14:26:10 -08:00
Harshavardhana 7e266293e6
fix: notify bucket replication after replication/ilm (#11343) 2021-01-25 14:04:41 -08:00
Harshavardhana eb6871ecd9
fix: LoginSTS should be an inline implementation (#11337)
STS tokens can be obtained by using local APIs
once the remote JWT token is presented, current
code was not validating the incoming token in the
first place and was incorrectly making a network
operation using that token.

For the most part this always works without issues,
but under adversarial scenarios it exposes client
to hand-craft a request that can reach internal
services without authentication.

This kind of proxying should be avoided before
validating the incoming token.
2021-01-25 10:15:03 -08:00
Harshavardhana 9cdd981ce7
fix: expire locks only on participating lockers (#11335)
additionally also add a new ForceUnlock API, to
allow forcibly unlocking locks if possible.
2021-01-25 10:01:27 -08:00
Anis Elleuch bd8020aba8
heal: Decode object name in healing result (#11348)
The user can see __XLDIR__ prefix in mc admin heal when the command
heals an empty object with a trailing slash. This commit decodes the
name of the object before sending it back to the upper level.
2021-01-25 09:53:37 -08:00
Harshavardhana 09bc49bd51
fix: healBucket across sets should capture results properly (#11341)
healing `.minio.sys/config` returns incorrect quorum errors
across sets, healing of the buckets.
2021-01-25 09:45:09 -08:00
Harshavardhana 82f0471d1b
honor maxWait heal config when maxIO hits (#11338) 2021-01-25 07:53:12 -08:00
Harshavardhana 6a95f412c9
avoid double CORS headers in federation (#11334)
CORS proxying adds double headers one
by the receiving server, one by proxied
server. Remove them before proxying
when 'Origin' header is found.
2021-01-23 18:27:23 -08:00
Ritesh H Shukla 7575c24037
Add open FD and FD limit to cluster metrics (#11328) 2021-01-22 18:30:16 -08:00
Harshavardhana 43f973c4cf
fix: check for O_DIRECT support for reads and writes (#11331)
In-case user enables O_DIRECT for reads and backend does
not support it we shall proceed to turn it off instead
and print a warning. This validation avoids any unexpected
downtimes that users may incur.
2021-01-22 15:38:21 -08:00
Harshavardhana 1b453728a3
initialize forwarder after init() to avoid crashes (#11330)
DNSCache dialer is a global value initialized in
init(), whereas `go` keeps `var =` before `init()`
, also we don't need to keep proxy routers as
global entities - register the forwarder as
necessary to avoid crashes.
2021-01-22 15:37:41 -08:00
Harshavardhana a6c146bd00
validate storage class across pools when setting config (#11320)
```
mc admin config set alias/ storage_class standard=EC:3
```

should only succeed if parity ratio is valid for all
server pools, if not we should fail proactively.

This PR also needs to bring other changes now that
we need to cater for variadic drive counts per pool.

Bonus fixes also various bugs reproduced with

- GetObjectWithPartNumber()
- CopyObjectPartWithOffsets()
- CopyObjectWithMetadata()
- PutObjectPart,PutObject with truncated streams
2021-01-22 12:09:24 -08:00
Klaus Post 2167ba0111
Feed correct part number to sio (#11326)
When offsets were specified we relied on the first part number to be correct.

Recalculate based on offset.
2021-01-21 08:43:03 -08:00
Klaus Post 4e6d717f39
Compress profiling data (#11313)
Trace data can be rather large and compresses fine.

Compress profile data in zip files:

```
277.895.314 before.profiles.zip
152.800.318 after.profiles.zip
```
2021-01-20 15:49:53 -08:00
Poorna Krishnamoorthy 845e251fa9
fix: crash in notificationsys when peers online is 0 (#11307)
Check if the number of peers online > 0 before using peerClient
2021-01-20 13:13:05 -08:00
Harshavardhana d1a8f0b786
fix possible crashes on deleteMarker replication (#11308)
Delete marker can have `metaSys` set to nil, that
can lead to crashes after the delete marker has
been healed.

Additionally also fix isObjectDangling check
for transitioned objects, that do not have parts
should be treated similar to Delete marker.
2021-01-20 13:12:12 -08:00
Klaus Post dac19d7272
Clarify root disk error (#11314)
Make it clearer what the problem is and how to resolve it.
2021-01-20 13:11:42 -08:00
Harshavardhana 7624c8b9bb
fix: honor storage class uniformity for multiple pools (#11309) 2021-01-20 01:41:18 -08:00
Klaus Post 19fb1086b2
select: Fix leak on compressed files (#11302)
Properly close gzip reader when done reading

fixes #11300
2021-01-19 17:51:46 -08:00
Harshavardhana a5e23a40ff
fix: allow delayed etcd updates to have fallbacks (#11151)
fixes #11149
2021-01-19 10:05:41 -08:00
Harshavardhana 1ad2b7b699
fix: add stricter validation for erasure server pools (#11299)
During expansion we need to validate if

- new deployment is expanded with newer constraints
- existing deployment is expanded with older constraints
- multiple server pools rejected if they have different
  deploymentID and distribution algo
2021-01-19 10:01:31 -08:00
Harshavardhana b5049d541f
fix: reduce an extra readdir() attempted on non-legacy setups (#11301)
to verify moving content and preserving legacy content,
we have way to detect the objects through readdir()
this path is not necessary for most common cases on
newer setups, avoid readdir() to save multiple system
calls.

also fix the CheckFile behavior for most common
use case i.e without legacy format.
2021-01-19 10:01:06 -08:00
Harshavardhana e0055609bb
fix: crawler to skip healing the drives in a set being healed (#11274)
If an erasure set had a drive replacement recently, we don't
need to attempt healing on another drive with in the same erasure
set - this would ensure we do not double heal the same content
and also prioritizes usage for such an erasure set to be calculated
sooner.
2021-01-19 02:40:52 -08:00
Klaus Post e8ce348da1
crypto: Escape JSON text (#10794)
Escape the JSON keys+values from the context.

We do not add the HTML escapes, since that is an extra escape level not mandatory for JSON.
2021-01-19 01:39:04 -08:00
Ritesh H Shukla b4add82bb6
Updated Prometheus metrics (#11141)
* Add metrics for nodes online and offline
* Add cluster capacity metrics
* Introduce v2 metrics
2021-01-18 20:35:38 -08:00
Harshavardhana 3ca6330661
fix: optimize parentDirIsObject by moving isObject to storage layer (#11291)
For objects with `N` prefix depth, this PR reduces `N` such network
operations by converting `CheckFile` into a single bulk operation.

Reduction in chattiness here would allow disks to be utilized more
cleanly, while maintaining the same functionality along with one
extra volume check stat() call is removed.

Update tests to test multiple sets scenario
2021-01-18 12:25:22 -08:00
Aditya Manthramurthy 3163a660aa
Fix support for multiple LDAP user formats (#11276)
Fixes support for using multiple base DNs for user search in the LDAP directory
allowing users from different subtrees in the LDAP hierarchy to request
credentials.

- The username in the produced credentials is now the full DN of the LDAP user
to disambiguate users in different base DNs.
2021-01-17 21:54:32 -08:00
Harshavardhana 0dadfd1b3d
fix: do not compute usage for not found lifecycle operations (#11288)
Currently we would proceed to apply incorrect lifecycle policies
for non-existent objects, this PR handles them appropriately.
2021-01-17 13:58:41 -08:00
Harshavardhana 4315f93421
fix: make sure parentDirIsObject is used at set level (#11280)
parentDirIsObject is not using set level understanding
to check for parent objects, without this it can lead to
objects that can actually reside on a separate set as
objects and would conflict.
2021-01-17 01:11:48 -08:00
Harshavardhana ddb5d7043a fix: standard storage class is allowed to be '0' 2021-01-16 17:32:25 -08:00
Harshavardhana f903cae6ff
Support variable server pools (#11256)
Current implementation requires server pools to have
same erasure stripe sizes, to facilitate same SLA
and expectations.

This PR allows server pools to be variadic, i.e they
do not have to be same erasure stripe sizes - instead
they should have SLA for parity ratio.

If the parity ratio cannot be guaranteed by the new
server pool, the deployment is rejected i.e server
pool expansion is not allowed.
2021-01-16 12:08:02 -08:00
Poorna Krishnamoorthy 7090bcc8e0
fix: doc links and delete replication permissions enforcement (#11285) 2021-01-15 15:22:55 -08:00
Harshavardhana c222bde14b
fix: use common logging implementation for DNSCache (#11284) 2021-01-15 14:04:56 -08:00
Poorna Krishnamoorthy feaf8dfb9a
Fix replication status reported on completion (#11273)
Fixes: #11272
2021-01-13 11:52:28 -08:00
Harshavardhana 628ef081d1
fix: preserve cache calculated previously while moving from v2 to v3 (#11269)
This ensures that all the prometheus monitoring and usage
trackers to avoid alerts configured, although we cannot
support v1 to v2 here - we can v2 to v3.
2021-01-13 09:58:08 -08:00
Harshavardhana 44dff36ff7
listing with prefix prefixed with '/' should be ignored (#11268)
fixes #11265
2021-01-13 09:44:11 -08:00
Poorna Krishnamoorthy b97d53b29c
fix remote target healthcheck (#11267) 2021-01-12 20:48:04 -08:00
Harshavardhana 1a5775e2e8
enable small and large file optimization (#11260)
- for large objects we found that 1MiB block for
  r/w respectively.
- for small objects we found that 128KiB block for
  r/w respectively.
2021-01-12 10:20:39 -08:00
Anis Elleuch e2579b1f5a
azure: Use default upload parameters to avoid consuming too much memory (#11251)
A lot of memory is consumed when uploading small files in parallel, use
the default upload parameters and add MINIO_AZURE_UPLOAD_CONCURRENCY for
users to tweak.
2021-01-11 22:48:09 -08:00
Poorna Krishnamoorthy 7824e19d20
Allow synchronous replication if enabled. (#11165)
Synchronous replication can be enabled by setting the --sync
flag while adding a remote replication target.

This PR also adds proxying on GET/HEAD to another node in a
active-active replication setup in the event of a 404 on the current node.
2021-01-11 22:36:51 -08:00
Harshavardhana 317305d5f9
fix: regression in adding new replication targets (#11257) 2021-01-11 09:08:42 -08:00
Harshavardhana e4e117faab
fix: enable xl.json to xl.meta only if legacy drive is found (#11255)
another optimization is renameLegacyMetadata() never needs
to validate bucket with os.Stat() again, leading to reduction
in one extra syscall.
2021-01-11 02:27:04 -08:00
Klaus Post 51dad1d130
Fix missing GetObjectNInfo Closure (#11243)
Review for missing Close of returned value from `GetObjectNInfo`.

This was often obscured by the stuff that auto-unlocks when reaching EOF.
2021-01-08 10:12:26 -08:00
Harshavardhana 4593b146be
fix: print errors only when metacache status has errors (#11248) 2021-01-08 16:52:19 +05:30
Harshavardhana f21d650ed4
fix: readData in bulk call using messagepack byte wrappers (#11228)
This PR refactors the way we use buffers for O_DIRECT and
to re-use those buffers for messagepack reader writer.

After some extensive benchmarking found that not all objects
have this benefit, and only objects smaller than 64KiB see
this benefit overall.

Benefits are seen from almost all objects from

1KiB - 32KiB

Beyond this no objects see benefit with bulk call approach
as the latency of bytes sent over the wire v/s streaming
content directly from disk negate each other with no
remarkable benefits.

All other optimizations include reuse of msgp.Reader,
msgp.Writer using sync.Pool's for all internode calls.
2021-01-07 19:27:31 -08:00
Harshavardhana a4f6705874
expire stale locks when owner is down (#11247)
fixes #11246
2021-01-07 19:16:18 -08:00
Poorna Krishnamoorthy b35b537e3f
Pass versionID to checkReplicateDelete in web handler (#11244) 2021-01-07 15:28:27 -08:00
Harshavardhana 5c52d5ffc7
fix: treat errVolumeNotFound as EOF error in listPathRaw (#11238) 2021-01-07 09:52:53 -08:00
Harshavardhana f0808bb2e5
fix: getObject fd leaks in transition and replication code (#11237) 2021-01-06 16:13:10 -08:00
Harshavardhana a6dee21092
initialize IAM store before Init() to avoid any crash (#11236) 2021-01-06 13:40:20 -08:00
Anis Elleuch 6f781c5e7a
heal: Reduce whitespace ticker to 5 seconds (#11234)
30 seconds white spaces is long for some setups which time out when no
read activity in short time, reduce the subnet health white space ticker
to 5 seconds, since it has no cost at all.
2021-01-06 13:29:50 -08:00
Harshavardhana f8ca859790
fix: server/gateway banner formatting (#11230) 2021-01-06 10:38:07 -08:00
Harshavardhana 76e2713ffe
fix: use buffers only when necessary for io.Copy() (#11229)
Use separate sync.Pool for writes/reads

Avoid passing buffers for io.CopyBuffer()
if the writer or reader implement io.WriteTo or io.ReadFrom
respectively then its useless for sync.Pool to allocate
buffers on its own since that will be completely ignored
by the io.CopyBuffer Go implementation.

Improve this wherever we see this to be optimal.

This allows us to be more efficient on memory usage.
```
   385  // copyBuffer is the actual implementation of Copy and CopyBuffer.
   386  // if buf is nil, one is allocated.
   387  func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
   388  	// If the reader has a WriteTo method, use it to do the copy.
   389  	// Avoids an allocation and a copy.
   390  	if wt, ok := src.(WriterTo); ok {
   391  		return wt.WriteTo(dst)
   392  	}
   393  	// Similarly, if the writer has a ReadFrom method, use it to do the copy.
   394  	if rt, ok := dst.(ReaderFrom); ok {
   395  		return rt.ReadFrom(src)
   396  	}
```

From readahead package
```
// WriteTo writes data to w until there's no more data to write or when an error occurs.
// The return value n is the number of bytes written.
// Any error encountered during the write is also returned.
func (a *reader) WriteTo(w io.Writer) (n int64, err error) {
	if a.err != nil {
		return 0, a.err
	}
	n = 0
	for {
		err = a.fill()
		if err != nil {
			return n, err
		}
		n2, err := w.Write(a.cur.buffer())
		a.cur.inc(n2)
		n += int64(n2)
		if err != nil {
			return n, err
		}
```
2021-01-06 09:36:55 -08:00
Harshavardhana b5d291ea88
fix: rename remaining zone -> pool (#11231) 2021-01-06 09:35:47 -08:00
Klaus Post eb9172eecb
Allow Compression + encryption (#11103) 2021-01-05 20:08:35 -08:00
Poorna Krishnamoorthy 64bddf47d8
Pass deletemarker correctly to replicate opts (#11227)
fixes: #11180
2021-01-05 14:12:37 -08:00
Harshavardhana 4ed45ce543
fix: healing buckets during pool expansion (#11224)
fixes #11209
2021-01-05 13:24:22 -08:00
Klaus Post ad511b0eb8
tests: Fix occasional data race (#11223)
CI tests could trigger a data race.

Servers are generally not expected to reinitialize, so tests could trigger data races when reinitializing and async operations are running.

We add the option to safely reset global vars instead of overwriting.

Fixes races like:

```
WARNING: DATA RACE
Read at 0x00000477ab18 by goroutine 1159:
  github.com/minio/minio/cmd.FileInfo.ToObjectInfo()
      /home/runner/work/minio/minio/cmd/erasure-metadata.go:105 +0x16d
  github.com/minio/minio/cmd.erasureObjects.putObject()
      /home/runner/work/minio/minio/cmd/erasure-object.go:748 +0x13f8
  github.com/minio/minio/cmd.(*erasureObjects).listPath.func3.2()
      /home/runner/work/minio/minio/cmd/metacache-set.go:682 +0x7d3
  github.com/minio/minio/cmd.newMetacacheBlockWriter.func1.2()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:777 +0x1c4
  github.com/minio/minio/cmd.newMetacacheBlockWriter.func1()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:806 +0x614

Previous write at 0x00000477ab18 by goroutine 1269:
  [failed to restore the stack]

Goroutine 1159 (running) created at:
  github.com/minio/minio/cmd.newMetacacheBlockWriter()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:760 +0x112
  github.com/minio/minio/cmd.(*erasureObjects).listPath.func3()
      /home/runner/work/minio/minio/cmd/metacache-set.go:672 +0xe22

Goroutine 1269 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1095 +0x537
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1339 +0xa6
  testing.tRunner()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1050 +0x1eb
  testing.runTests()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1337 +0x594
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1252 +0x2ff
  github.com/minio/minio/cmd.TestMain()
      /home/runner/work/minio/minio/cmd/test-utils_test.go:120 +0x44e
  main.main()
      _testmain.go:1408 +0x223
==================
==================
WARNING: DATA RACE
Read at 0x00000477aae8 by goroutine 1159:
  github.com/minio/minio/cmd.(*BucketVersioningSys).Enabled()
      /home/runner/work/minio/minio/cmd/bucket-versioning.go:26 +0x52
  github.com/minio/minio/cmd.FileInfo.ToObjectInfo()
      /home/runner/work/minio/minio/cmd/erasure-metadata.go:105 +0x197
  github.com/minio/minio/cmd.erasureObjects.putObject()
      /home/runner/work/minio/minio/cmd/erasure-object.go:748 +0x13f8
  github.com/minio/minio/cmd.(*erasureObjects).listPath.func3.2()
      /home/runner/work/minio/minio/cmd/metacache-set.go:682 +0x7d3
  github.com/minio/minio/cmd.newMetacacheBlockWriter.func1.2()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:777 +0x1c4
  github.com/minio/minio/cmd.newMetacacheBlockWriter.func1()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:806 +0x614

Previous write at 0x00000477aae8 by goroutine 1269:
  [failed to restore the stack]

Goroutine 1159 (running) created at:
  github.com/minio/minio/cmd.newMetacacheBlockWriter()
      /home/runner/work/minio/minio/cmd/metacache-stream.go:760 +0x112
  github.com/minio/minio/cmd.(*erasureObjects).listPath.func3()
      /home/runner/work/minio/minio/cmd/metacache-set.go:672 +0xe22

Goroutine 1269 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1095 +0x537
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1339 +0xa6
  testing.tRunner()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1050 +0x1eb
  testing.runTests()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1337 +0x594
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.14.13/x64/src/testing/testing.go:1252 +0x2ff
  github.com/minio/minio/cmd.TestMain()
      /home/runner/work/minio/minio/cmd/test-utils_test.go:120 +0x44e
  main.main()
      _testmain.go:1408 +0x223
==================
```
2021-01-05 10:45:26 -08:00
Harshavardhana cb0eaeaad8
feat: migrate to ROOT_USER/PASSWORD from ACCESS/SECRET_KEY (#11185) 2021-01-05 10:22:57 -08:00
Harshavardhana d0027c3c41
do not use large buffers if not necessary (#11220)
without this change, there is a performance
regression for small objects GETs, this makes
the overall speed to go back to pre '59d363'
commit days.
2021-01-04 18:51:52 -08:00
Anis Elleuch cb7fc99368
handlers: Avoid initializing a struct in each handler call (#11217) 2021-01-04 09:54:22 -08:00
Harshavardhana a4383051d9
remove/deprecate crawler disable environment (#11214)
with changes present to automatically throttle crawler
at runtime, there is no need to have an environment
value to disable crawling. crawling is a fundamental
piece for healing, lifecycle and many other features
there is no good reason anyone would need to disable
this on a production system.

* Apply suggestions from code review
2021-01-04 09:43:31 -08:00
Harshavardhana e7ae49f9c9
fix: calculate prometheus disks_offline/disks_total correctly (#11215)
fixes #11196
2021-01-04 09:42:09 -08:00
Anis Elleuch 153d4be032
tracing: NumSubscribers() to use atomic instead of mutex (#11219)
globalSubscribers.NumSubscribers() is heavily used in S3 requests and it
uses mutex, use atomic.Load instead since it is faster

Co-authored-by: Anis Elleuch <anis@min.io>
2021-01-04 09:40:30 -08:00
Anis Elleuch dfd99b6d8f
handlers: Little bit more optimizations (#11211) 2021-01-04 00:01:06 -08:00
Harshavardhana c4b1d394d6
erasure: avoid io.Copy in hotpaths to reduce allocation (#11213) 2021-01-03 16:27:34 -08:00
Harshavardhana c4131c2798
feat: Small object optimization read data in single bulk call (#11207) 2021-01-03 11:27:57 -08:00
Anis Elleuch c9d502e6fa
parentDirIsObject() to return quickly with inexistant parent (#11204)
Rewrite parentIsObject() function. Currently if a client uploads
a/b/c/d, we always check if c, b, a are actual objects or not.

The new code will check with the reverse order and quickly quit if 
the segment doesn't exist.

So if a, b, c in 'a/b/c' does not exist in the first place, then returns
false quickly.
2021-01-02 12:01:29 -08:00
Anis Elleuch 677e80c0f8
xl: Remove check-dir in ReadVersion (#11200)
The only purpose of check-dir flag in
ReadVersion is to return 404 when
an object has xl.meta but without data.

This is causing an extract call to the disk 
which can be penalizing in case of busy system
where disks receive many concurrent access.
2021-01-02 10:35:57 -08:00
Harshavardhana aa85af4d1a fix: missing CopyObjectPart maxClients reorder 2021-01-01 23:07:37 -08:00
Anis Elleuch ae731d232f
trace: Reorder http/trace maxClients wrapping for correct tracing (#11202)
mc admin trace does not show the correct handler name in the output: it
is printing `maxClients' for all handlers. The reason is that the wrong
order of handler wrapping.
2021-01-01 23:06:07 -08:00
Anis Elleuch a317d220ed
xl-storage: Do not stat bucket assuming the object exists (#11201)
In HEAD/GET, only STAT the bucket if the 
object does not exist to return the correct
error response.
2021-01-01 09:44:36 -08:00
Harshavardhana 3e1221a01c
fix: log once updating dataUsageCache versions (#11190)
also reduce usage of *bytes.Buffer for
reading `usage-cache.bin`
2020-12-31 09:45:09 -08:00
Ritesh H Shukla 36fc2f98ed
fix: admin trace throttled requests (#11192) 2020-12-30 21:04:55 -08:00
Ritesh H Shukla 556524c715
Reduce logging when peer is offline (#11184) 2020-12-30 14:38:54 -08:00
Harshavardhana cc457f1798
fix: enhance logging in crawler use console.Debug instead of logger.Info (#11179) 2020-12-29 01:57:28 -08:00
Harshavardhana ca0d31b09a
fix: re-arrange handlers to handle requests on /minio (#11177)
fixes #11175
2020-12-28 17:10:33 -08:00
Harshavardhana 445a9bd827
fix: heal optimizations in crawler to avoid multiple healing attempts (#11173)
Fixes two problems

- Double healing when bitrot is enabled, instead heal attempt
  once in applyActions() before lifecycle is applied.

- If applyActions() is successful and getSize() returns proper
  value, then object is accounted for and should be removed
  from the oldCache namespace map to avoid double heal attempts.
2020-12-28 10:31:00 -08:00
Harshavardhana d8d25a308f
fix: use HealObject for cleaning up dangling objects (#11171)
main reason is that HealObjects starts a recursive listing
for each object, this can be a really really long time on
large namespaces instead avoid recursive listing just
perform HealObject() instead at the prefix.

delete's already handle purging dangling content, we
don't need to achieve this by doing recursive listing,
this in-turn can delay crawling significantly.
2020-12-27 15:42:20 -08:00
Harshavardhana c19e6ce773
avoid a crash in crawler when lifecycle is not initialized (#11170)
Bonus for static buffers use bytes.NewReader instead of
bytes.NewBuffer, to use a more reader friendly implementation
2020-12-26 22:58:06 -08:00
Harshavardhana 59d3639396
fix: inherit heal opts globally, including bitrot settings (#11166)
Bonus re-use ReadFileStream internal io.Copy buffers, fixes
lots of chatty allocations when reading metacache readers
with many sustained concurrent listing operations

```
   17.30GB  1.27% 84.80%    35.26GB  2.58%  io.copyBuffer
```
2020-12-24 23:04:03 -08:00
Harshavardhana 027e17468a
fix: discarding results do not attempt in-memory metacache writer (#11163)
Optimizations include

- do not write the metacache block if the size of the
  block is '0' and it is the first block - where listing
  is attempted for a transient prefix, this helps to
  avoid creating lots of empty metacache entries for
  `minioMetaBucket`

- avoid the entire initialization sequence of cacheCh
  , metacacheBlockWriter if we are simply going to skip
  them when discardResults is set to true.

- No need to hold write locks while writing metacache
  blocks - each block is unique, per bucket, per prefix
  and also is written by a single node.
2020-12-24 15:02:02 -08:00
Harshavardhana 45ea161f8d
webUI: change listing to 1000 keys from browser UI (#11159)
gateway implementations do not handle maxKeys being
`-1` properly unlike MinIO implementation, handle it
by setting an appropriate value.

fixes #11158
2020-12-23 19:58:15 -08:00
Harshavardhana 6a66f142d4
fix: strict quorum in list should list on all drives (#11157)
current implementation was incorrect, it in-fact
assumed only read quorum number of disks. in-fact
that value is only meant for read quorum good entries
from all online disks.

This PR fixes this behavior properly.
2020-12-23 09:26:40 -08:00
Harshavardhana 5982965839
fix: re-use bytes.Buffer using sync.Pool (#11156) 2020-12-22 23:22:37 -08:00
Harshavardhana 8565cefe4e fix: allow HTTP2.0 to be always configured 2020-12-22 16:32:58 -08:00
Andreas Auernhammer 8cdf2106b0
refactor cmd/crypto code for SSE handling and parsing (#11045)
This commit refactors the code in `cmd/crypto`
and separates SSE-S3, SSE-C and SSE-KMS.

This commit should not cause any behavior change
except for:
  - `IsRequested(http.Header)`

which now returns the requested type {SSE-C, SSE-S3,
SSE-KMS} and does not consider SSE-C copy headers.

However, SSE-C copy headers alone are anyway not valid.
2020-12-22 09:19:32 -08:00
Harshavardhana 35fafb837b
fix: issues with handling delete markers in metacache (#11150)
Additional cases handled

- fix address situations where healing is not
  triggered on failed writes and deletes.

- consider object exists during listing when
  metadata can be successfully decoded.
2020-12-22 09:16:43 -08:00
Harshavardhana 274bbad5cb
fix: select always online peers for remote listing (#11153)
always find the right set of online peers for remote listing,
this may have an effect on listing if the server is down - we
should do this to avoid always performing transient operations
on bucket->peerClient that is permanently or down for a long
period.
2020-12-22 09:16:07 -08:00
Harshavardhana 5c451d1690
update x/net/http2 to address few bugs (#11144)
additionally also configure http2 healthcheck
values to quickly detect unstable connections
and let them timeout.

also use single transport for proxying requests
2020-12-21 21:42:38 -08:00
Poorna Krishnamoorthy c987313431
Encrypt remote target if kms is configured (#11034)
Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-12-21 16:21:33 -08:00
Anis Elleuch 2ecaab55a6
admin: ServerInfo returns info without object layer initialized (#11142) 2020-12-21 09:35:19 -08:00
Harshavardhana 3e792ae2a2
fix: change defaults for DNS cache dialer (#11145) 2020-12-21 09:33:29 -08:00
Harshavardhana 4cc500a041
normalize users with double // in accessKeys (#11143)
Bonus fix, use constant time compare for secret keys  in web-handlers.go:SetAuth()
2020-12-20 10:09:51 -08:00
Harshavardhana d8e28830cf
fix: allow STS creds for admin accounts to add users (#11138)
Allow rotating creds with privileges to add users

fixes https://github.com/minio/console/issues/529
2020-12-19 13:24:21 -08:00
Harshavardhana 3e16ec457a
fix: support user/groups with '/' character (#11127)
NOTE: user/groups with `//` shall be normalized to `/`

fixes #11126
2020-12-19 09:36:37 -08:00
Harshavardhana e5d378931d
fix: delimiter based listing was broken without marker (#11136)
with missing nextMarker with delimiter based listing,
top level prefixes beyond 4500 or max-keys value
wouldn't be sent back for client to ask for the next
batch.

reproduced at a customer deployment, create prefixes
as shown below

```
for year in $(seq 2017 2020)
do
    for month in {01..12}
    do for day in {01..31}
       do
           mc -q cp file myminio/testbucket/dir/day_id=$year-$month-$day/;
       done
    done
done
```

Then perform

```
aws s3api --profile minio --endpoint-url http://localhost:9000 list-objects \
    --bucket testbucket --prefix dir/ --delimiter / --max-keys 1000
```

You shall see missing NextMarker, this would disallow listing beyond max-keys
requested and also disallow beyond 4500 (maxKeyObjectList) prefixes being listed
because client wouldn't know the NextMarker available.

This PR addresses this situation properly by making the implementation
more spec compatible. i.e NextMarker in-fact can be either an object, a prefix
with delimiter depending on the input operation.

This issue was introduced after the list caching changes and has been present
for a while.
2020-12-19 09:36:04 -08:00
Anis Elleuch e63a10e505
Profiling does not required object layer to be initialized (#11133) 2020-12-18 11:51:15 -08:00
Anis Elleuch 5434088c51
replication: Ensure to always use nano precision source modtime (#11135) 2020-12-18 11:37:28 -08:00
Harshavardhana a773cf48d8
fix: overlapping object and prefix rejected (#11130)
fixes #11129
2020-12-18 08:51:09 -08:00
Harshavardhana f714840da7
add _MINIO_SERVER_DEBUG env for enabling debug messages (#11128) 2020-12-17 16:52:47 -08:00
Harshavardhana 7c9ef76f66
fix: timer deadlock on expired timers (#11124)
issue was introduced in #11106 the following
pattern

<-t.C // timer fired

if !t.Stop() {
   <-t.C // timer hangs
}

Seems to hang at the last `t.C` line, this
issue happens because a fired timer cannot be
Stopped() anymore and t.Stop() returns `false`
leading to confusing state of usage.

Refactor the code such that use timers appropriately
with exact requirements in place.
2020-12-17 12:35:02 -08:00
Anis Elleuch cffdb01279
azure/s3 gateways: Pass ETag during GET call to avoid data corruption (#11024)
Both Azure & S3 gateways call for object information before returning
the stream of the object, however, the object content/length could be
modified meanwhile, which means it can return a corrupted object.

Use ETag to ensure that the object was not modified during the GET call
2020-12-17 09:11:14 -08:00
Harshavardhana b390a2a0b9
fix: reuser timers in erasure set hotpaths (#11106)
reuser timers in

 - connectDisks() monitoring
 - healMRFRoutine() channel timeouts
2020-12-16 14:33:05 -08:00
Harshavardhana 90158f1e33
fix: avoid logging for Heal APIs in FS mode (#11121)
fixes #11120
2020-12-16 09:46:13 -08:00
Harshavardhana c606c76323
fix: prioritized latest buckets for crawler to finish the scans faster (#11115)
crawler should only ListBuckets once not for each serverPool,
buckets are same across all pools, across sets and ListBuckets
always returns an unified view, once list buckets returns
sort it by create time to scan the latest buckets earlier
with the assumption that latest buckets would have lesser
content than older buckets allowing them to be scanned faster
and also to be able to provide more closer to latest view.
2020-12-15 17:34:54 -08:00
Klaus Post e7d3b49a20
metacache: Make very small requests transient (#11109) 2020-12-15 11:25:36 -08:00
Harshavardhana 5df61ab96b
fix: remove gorilla/rpc/ deps fully after our fork (#11108) 2020-12-15 11:18:06 -08:00
Poorna Krishnamoorthy 3456b03b12
Ignore ObjectNotFound errors in delete api while enforcing locking (#11114)
AWS does not report this or version not found as errors in the response.
2020-12-15 11:15:49 -08:00
Klaus Post f6fb27e8f0
Don't copy interesting ids, clean up logging (#11102)
When searching the caches don't copy the ids, instead inline the loop.

```
Benchmark_bucketMetacache_findCache-32    	   19200	     63490 ns/op	    8303 B/op	       5 allocs/op
Benchmark_bucketMetacache_findCache-32    	   20338	     58609 ns/op	     111 B/op	       4 allocs/op
```

Add a reasonable, but still the simplistic benchmark.

Bonus - make nicer zero alloc logging
2020-12-14 13:13:33 -08:00
Harshavardhana 8368ab76aa
fix: remove the requirement for healing buckets in ListBucketsHeal (#11098)
With new refactor of bucket healing, healing bucket happens
automatically including its metadata, there is no need to
redundant heal buckets also in ListBucketsHeal remove
it.
2020-12-14 12:07:07 -08:00
Harshavardhana 3e83643320
lifecycle improvements and additional debug logging (#11096)
Bonus change fix browser assets
2020-12-13 12:05:54 -08:00
Harshavardhana 2eb52ca5f4
fix: heal bucket metadata right before healing bucket (#11097)
optimization mainly to avoid listing the entire
`.minio.sys/buckets/.minio.sys` directory, this
can get really huge and comes in the way of startup
routines, contents inside `.minio.sys/buckets/.minio.sys`
are rather transient and not necessary to be healed.
2020-12-13 11:57:08 -08:00
Anis Elleuch f164085227
xl: Always set root disk to true in test environment (#11094)
Tests environments (go test or manual testing) should always consider
the passed disks are root disks and should not rely on disk.IsRootDisk()
function. The reason is that this latter can return a false negative
when called in a busy system. However, returning a false negative will
only occur in a testing environment and not in a production, so we can
accept this trade-off for now.
2020-12-12 16:10:07 -08:00
Harshavardhana 48191dd748
return NoSuchVersion if invalid version-id is specified (#11091) 2020-12-11 20:44:08 -08:00
Anis Elleuch c4f29d24da
metacache: Ask all disks when drive count is 4 (#11087) 2020-12-11 17:54:31 -08:00
Harshavardhana db7890660e
fix: a crash when disk is nil, safe access on erasureDisks (#11089)
fixes #11088
2020-12-11 16:58:36 -08:00
Poorna Krishnamoorthy 9adc33efbb
Return version-id header in DeleteObject response (#11090)
even when the object version is non-existent

To make this consistent with aws behavior.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-12-11 16:58:15 -08:00
Poorna Krishnamoorthy 8f65aba04b
ignore NoSuchVersion error in DeleteObjects API (#11086)
Currently, the error response reports NoSuchVersion
for a non-existent version-id, whereas AWS ignores it.
2020-12-11 12:39:09 -08:00
Harshavardhana 3a0082f0f1
fix: TTFB prometheus metrics calculation (#11082)
until now metrics was reporting entire call
duration instead of ttfb's this PR fixes it
2020-12-10 23:02:25 -08:00
Klaus Post 4bca62a0bd
crawler: Stream bucket usage cache data (#11068)
Stream bucket caches to storage and through RPC calls.
2020-12-10 13:03:22 -08:00
Klaus Post 82e2be4239
metacache: Speed up cleanup operation (#11078)
Perform cleanup operations on copied data. Avoids read locking
data while determining which caches to keep.

Also, reduce the log(N*N) operation to log(N*M) where M caches 
with the same root or below when checking potential replacements.
2020-12-10 12:30:28 -08:00
Harshavardhana 4550ac6fff
fix: refactor locks to apply them uniquely per node (#11052)
This refactor is done for few reasons below

- to avoid deadlocks in scenarios when number
  of nodes are smaller < actual erasure stripe
  count where in N participating local lockers
  can lead to deadlocks across systems.

- avoids expiry routines to run 1000 of separate
  network operations and routes per disk where
  as each of them are still accessing one single
  local entity.

- it is ideal to have since globalLockServer
  per instance.

- In a 32node deployment however, each server
  group is still concentrated towards the
  same set of lockers that partipicate during
  the write/read phase, unlike previous minio/dsync
  implementation - this potentially avoids send
  32 requests instead we will still send at max
  requests of unique nodes participating in a
  write/read phase.

- reduces overall chattiness on smaller setups.
2020-12-10 07:28:37 -08:00
Klaus Post e65ed2e44f
listcache: Add path index (#11063)
Add a root path index.

```
Before:
Benchmark_bucketMetacache_findCache-32    	   10000	    730737 ns/op

With excluded prints:
Benchmark_bucketMetacache_findCache-32    	   10000	    207100 ns/op

With the root path:
Benchmark_bucketMetacache_findCache-32    	  705765	      1943 ns/op
```

Benchmark used (not linear):

```Go
func Benchmark_bucketMetacache_findCache(b *testing.B) {
	bm := newBucketMetacache("", false)
	for i := 0; i < b.N; i++ {
		bm.findCache(listPathOptions{
			ID:           mustGetUUID(),
			Bucket:       "",
			BaseDir:      "prefix/" + mustGetUUID(),
			Prefix:       "",
			FilterPrefix: "",
			Marker:       "",
			Limit:        0,
			AskDisks:     0,
			Recursive:    false,
			Separator:    slashSeparator,
			Create:       true,
			CurrentCycle: 0,
			OldestCycle:  0,
		})
	}
}
```

Replaces #11058
2020-12-09 08:37:43 -08:00
Anis Elleuch d90044b847
federation: Redirect Lifecycle PUT request by bucket name (#11062)
The bucket forwarder handler considers MakeBucket to be always local but
it mistakenly thinks that PUT bucket lifecycle to be a MakeBucket call.

Fix the check of the MakeBucket call by ensuring that the query is empty
in the PUT url.
2020-12-09 07:25:26 -08:00
Harshavardhana d8c1f93de6
reject mixed drive situations with drives on root disks (#11057)
till now we used to match the inode number of the root
drive and the drive path minio would use, if they match
we knew that its a root disk.

this may not be true in all situations such as running
inside a container environment where the container might
be mounted from a different partition altogether, root
disk detection might fail.
2020-12-09 00:27:02 -08:00
Anis Elleuch a51488cbaa
s3: Fix reading GET with partNumber specified (#11032)
partNumber was miscalculting the start and end of parts when partNumber
query is specified in the GET request. This commit fixes it and also
fixes the ContentRange header in that case.
2020-12-08 13:12:42 -08:00
Harshavardhana dc819afa44 fix: auto update crawler meta version
PR 038bcd9079 introduced
version '3', we need to make sure that we do not
print an unexpected error instead log a message to
indicate we will auto update the version.
2020-12-08 10:40:51 -08:00
Harshavardhana 4a564336fe Revert "Add metrics for nodes online and offline (#11050)"
This reverts commit f60bbdf86b.
2020-12-08 09:23:35 -08:00
Ritesh H Shukla f60bbdf86b
Add metrics for nodes online and offline (#11050) 2020-12-08 01:06:27 -08:00
Poorna Krishnamoorthy f3beb1236a
Add cache usage, total capacity to prometheus metrics (#11026) 2020-12-07 16:35:11 -08:00
Poorna Krishnamoorthy 934bed47fa
Add transition event notification (#11047)
This is a MinIO specific extension to allow monitoring of transition events.
2020-12-07 13:53:28 -08:00
Ritesh H Shukla 038bcd9079
Add replication capacity metrics support in crawler (#10786) 2020-12-07 13:47:48 -08:00
Harshavardhana ce93b2681b
fix: re-use er.getDisks() properly in certain calls (#11043) 2020-12-07 10:04:07 -08:00
Harshavardhana 8d036ed6d8
fix: allow sub-admin to modify password for other users (#11039)
fixes #11037
2020-12-06 20:36:34 -08:00
Harshavardhana 9c53cc1b83
fix: heal multiple buckets in bulk (#11029)
makes server startup, orders of magnitude
faster with large number of buckets
2020-12-05 13:00:44 -08:00
Harshavardhana 3514e89eb3
support envs as well for new crawler sub-system (#11033) 2020-12-04 21:54:24 -08:00
Klaus Post a896125490
Add crawler delay config + dynamic config values (#11018) 2020-12-04 09:32:35 -08:00
Harshavardhana e083471ec4
use argon2 with sync.Pool for better memory management (#11019) 2020-12-03 19:23:19 -08:00
Harshavardhana 80d31113e5
fix: etcd import paths again depend on v3.4.14 release (#11020)
Due to botched upstream renames of project repositories
and incomplete migration to go.mod support, our current
dependency version of `go.mod` had bugs i.e it was
using commits from master branch which didn't have
the required fixes present in release-3.4 branches

which leads to some rare bugs

https://github.com/etcd-io/etcd/pull/11477 provides
a workaround for now and we should migrate to this.

release-3.5 eventually claims to fix all of this
properly until then we cannot use /v3 import right now
2020-12-03 11:35:18 -08:00
Ritesh H Shukla 7e2b79984e
Stream bucket bandwidth measurements (#11014) 2020-12-03 11:34:42 -08:00
Harshavardhana 951b6b203b skip metacache entries healing to speed up startup 2020-12-02 21:30:54 -08:00
Harshavardhana 44e23b7f4f fix: startup being slow - wait only if IOCount > 0 2020-12-02 21:06:17 -08:00
Harshavardhana 96c0ce1f0c
add support for tuning healing to make healing more aggressive (#11003)
supports `mc admin config set <alias> heal sleep=100ms` to
enable more aggressive healing under certain times.

also optimize some areas that were doing extra checks than
necessary when bitrotscan was enabled, avoid double sleeps
make healing more predictable.

fixes #10497
2020-12-02 11:12:00 -08:00
ebozduman 303be1866d
Adds "x-amz-usr-agent" and "x-id" params to be used in authentication of presignedURL (#10792) 2020-12-02 02:02:49 -08:00
Harshavardhana 4ec45753e6 rename server sets to server pools 2020-12-01 13:50:33 -08:00
Klaus Post e6ea5c2703
crawler: Missing folder heal check per set (#10876) 2020-12-01 12:07:39 -08:00
Harshavardhana 790833f3b2 Revert "Support variable server sets (#10314)"
This reverts commit aabf053d2f.
2020-12-01 12:02:29 -08:00
Harshavardhana 7cbca43eb1
fix: allow admins to create users (#11005)
PR #10978 introduced a regression, root
credential should be allowed to create users
2020-11-30 21:53:23 -08:00
Poorna Krishnamoorthy 2f564437ae
Disallow writeback caching with cache_after (#11002)
fixes #10974
2020-11-30 20:53:27 -08:00
Harshavardhana bdd094bc39
fix: avoid sending errors on missing objects on locked buckets (#10994)
make sure multi-object delete returned errors that are AWS S3 compatible
2020-11-28 21:15:45 -08:00
Harshavardhana e6fa410778
fix: allow accountInfo, addUser and getUserInfo implicit (#10978)
- accountInfo API that returns information about
  user, access to buckets and the size per bucket
- addUser - user is allowed to change their secretKey
- getUserInfo - returns user info if the incoming
  is the same user requesting their information
2020-11-27 17:23:57 -08:00
Harshavardhana aabf053d2f
Support variable server sets (#10314) 2020-11-25 16:28:47 -08:00
Anis Elleuch 91130e884b
Avoid sending errors in gob in storage requests (#10977) 2020-11-25 12:42:48 -08:00
Poorna Krishnamoorthy 2ff655a745
Refactor replication, ILM handling in DELETE API (#10945) 2020-11-25 11:24:50 -08:00
Klaus Post 0422eda6a2
metacache: Always close block writer (#10973)
In some cases a writer could be left behind unclosed, leaking compression blocks.

Always close and set compression concurrency to 2 which should be fine to keep up.
2020-11-25 09:37:30 -08:00
Harshavardhana 31e6f60847
fix: improve error handling in metacache (#10965) 2020-11-25 01:11:22 -08:00
Poorna Krishnamoorthy 3ad41fe89d
Add admin API to edit remote bucket target credentials (#10848) 2020-11-24 19:09:05 -08:00
Klaus Post a75fafdbe2
Remove msgp workaround (#10964)
The error in `github.com/philhofer/fwd` was quickly fixed through 
https://github.com/philhofer/fwd/pull/22 - update the dependency 
and remove the workaround.
2020-11-24 11:58:10 -08:00
Klaus Post a58b7874ef
Temporary workaround for msgp skipping (#10960)
Due to https://github.com/philhofer/fwd/issues/20 when skipping a metadata entry that is >2048 bytes and the buffer is full (2048 bytes) the skip will fail with `io.ErrNoProgress`.

Enlarge the buffer so we temporarily make this much more unlikely.

If it still happens we will have to rewrite the skips to reads.

Fixes #10959
2020-11-23 18:51:59 -08:00
Harshavardhana 6990de9c94
fix: dangling object delete shall return object doesn't exist (#10961)
dangling object when deleted means object doesn't exist
anymore, so we should return appropriate errors, this
allows crawler heal to ensure that it removes the tracker
for dangling objects.
2020-11-23 18:50:53 -08:00
Anis Elleuch 75a8e81f8f
azure: Specify different Azure storage in the shell env (#10943)
AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY are used in 
azure CLI to specify the azure blob storage access & secret keys. With this commit, 
it is possible to set them if you want the gateway's own credentials to be
different from the Azure blob credentials.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-11-23 16:45:56 -08:00
Harshavardhana 519c0077a9
fix: do not return an error for successfully deleted dangling objects (#10938)
dangling objects when removed `mc admin heal -r` or crawler
auto heal would incorrectly return error - this can interfere
with usage calculation as the entry size for this would be
returned as `0`, instead upon success use the resultant
object size to calculate the final size for the object
and avoid reporting this in the log messages

Also do not set ObjectSize in healResultItem to be '-1'
this has an effect on crawler metrics calculating 1 byte
less for objects which seem to be missing their `xl.meta`
2020-11-23 09:12:17 -08:00
Harshavardhana 734d07a532
fix: all hosts local and port same should be local erasure setup (#10951)
this is needed to avoid initializing notification peers
that can lead to races in many sub-systems

fixes #10950
2020-11-23 09:07:50 -08:00
Harshavardhana df93102235
fix: unwrapping issues with os.Is* functions (#10949)
reduces  3 stat calls, reducing the
overall startup time significantly.
2020-11-23 08:36:49 -08:00
Poorna Krishnamoorthy 39f3d5493b
Show Delete replication status header (#10946)
X-Minio-Replication-Delete-Status header shows the
status of the replication of a permanent delete of a version.

All GETs are disallowed and return 405 on this object version.
In the case of replicating delete markers.

X-Minio-Replication-DeleteMarker-Status shows the status 
of replication, and would similarly return 405.

Additionally, this PR adds reporting of delete marker event completion
and updates documentation
2020-11-21 23:48:50 -08:00
Klaus Post 692ff41ef7
Unwrap network errors (#10934)
Alternative to #10927

Instead of having an upstream fix, do unwrap when checking network errors.

'As' will also work when destination is an interface as checked by the tests.
2020-11-20 22:55:35 -08:00
Harshavardhana 86409fa93d
add audit/admin trace support for browser requests (#10947)
To support this functionality we had to fork
the gorilla/rpc package with relevant changes
2020-11-20 22:52:17 -08:00
Shireesh Anjal 7bc47a14cc
Rename OBD to Health (#10842)
Also, Remove thread stats and openfds from the health report 
as we already have process stats and numfds
2020-11-20 12:52:53 -08:00
Harshavardhana 73e308079a
fix: handle errors appropriately as they are wrapped (#10917) 2020-11-20 10:43:07 -08:00
Poorna Krishnamoorthy 08b24620c0 Display storage-class of transitioned object in HEAD 2020-11-20 09:17:31 -08:00
Harshavardhana 95675b0c9a
fix: do not crash PutObjectTags when node is down (#10940)
fixes #10939
2020-11-20 09:10:48 -08:00
Poorna Krishnamoorthy 251c1ef6da Add support for replication of object tags, retention metadata (#10880) 2020-11-19 18:56:09 -08:00
Poorna Krishnamoorthy 0fa430c1da validate service type of target in replication/ilm transition config (#10928) 2020-11-19 18:47:33 -08:00
Poorna Krishnamoorthy f60b6eb82e fix validation for deletemarker replication on object locked bucket (#10892) 2020-11-19 18:47:19 -08:00
Poorna Krishnamoorthy 1ebf6f146a Add support for ILM transition (#10565)
This PR adds transition support for ILM
to transition data to another MinIO target
represented by a storage class ARN. Subsequent
GET or HEAD for that object will be streamed from
the transition tier. If PostRestoreObject API is
invoked, the transitioned object can be restored for
duration specified to the source cluster.
2020-11-19 18:47:17 -08:00
Harshavardhana 8f7fe0405e fix: delete marker replication should support directories (#10878)
allow directories to be replicated as well, along with
their delete markers in replication.

Bonus fix to fix bloom filter updates for directories
to be preserved.
2020-11-19 18:47:12 -08:00
Harshavardhana 9a34fd5c4a Revert "Revert "Add delete marker replication support (#10396)""
This reverts commit 267d7bf0a9.
2020-11-19 18:43:58 -08:00
Harshavardhana f794fe79e3
fix: network shutdown was not handle properly (#10927)
fixes a regression introduced in #10859, due
to the error returned by rest.Client being typed
i.e *rest.NetworkError - IsNetworkHostDown function
didn't work as expected to detect network issues.

This in-turn aggravated the situations when nodes
are disconnected leading to performance loss.
2020-11-19 13:53:49 -08:00
Harshavardhana 0f9e125cf3
fix: check for gateway backend online without http request (#10924)
fixes #10921
2020-11-19 10:38:02 -08:00
Harshavardhana d778d9493f
remove MinIO release tag as part of HTTP Server string (#10929) 2020-11-19 09:16:02 -08:00
Harshavardhana 70d2c2ccc9
skip files that are not erasure objects or directories (#10926)
without this change WalkDir reports errors while
trying to read `format.json/xl.meta` which is a
replicated file
2020-11-19 09:15:09 -08:00
Harshavardhana 9dea7020f0
allow prefix filtering for WalkDir to be optional (#10923) 2020-11-18 12:03:16 -08:00
Klaus Post 990d074f7d
metacache: Allow prefix filtering (#10920)
Do listings with prefix filter when bloom filter is dirty.

This will forward the prefix filter to the lister which will make it 
only scan the folders/objects with the specified prefix.

If we have a clean bloom filter we try to build a more generally 
useful cache so in that case, we will list all objects/folders.
2020-11-18 10:44:18 -08:00
Klaus Post e413f05397
Save listing error async (#10922)
Since the RPC call may have to time out save an error state async 
to not hold up the listing returning.

Fixes #10919
2020-11-18 10:28:22 -08:00
Harshavardhana d1b1fee080
fix: save healing tracker right before healing (#10915)
this change avoids a situation where accidentally
if the user deleted the healing tracker or drives
were replaced again within the 10sec window.
2020-11-18 09:34:46 -08:00
Harshavardhana 9738d605e4
increase readdir per block memory to facilitate faster WalkDir (#10908) 2020-11-18 09:21:02 -08:00
Klaus Post 10099357b6
listcache: Wrap returned errors (#10882)
To give an indication of where they happen
2020-11-17 09:11:59 -08:00
Harshavardhana 80b8ce89a4
remove context deadline from Delete calls (#10901) 2020-11-17 09:09:45 -08:00
Poorna Krishnamoorthy 0b766288ef
fix: send replication completed event notification (#10902) 2020-11-15 22:16:41 -08:00
Rafael Bodill 598ca0569c
fix: global in-place update boolean check (#10900) 2020-11-15 13:34:12 -08:00
Poorna Krishnamoorthy d295ce5708
Fix disk cache usage percent for prometheus (#10898)
Fixes: #10895

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
2020-11-14 19:18:00 -08:00
Klaus Post b5a3d79bce
listobjectversions: Add shortcut for Veeam blocks (#10893)
Add shortcut for `APN/1.0 Veeam/1.0 Backup/10.0`

It requests unique blocks with a specific prefix. We skip 
scanning the parent directory for more objects matching the prefix.
2020-11-13 16:58:20 -08:00
Harshavardhana 17a5ff51ff
fix: move context timeout closer to network for Delete calls (#10897)
allowing for disconnects to be limited to the drive
themselves instead of disconnecting all drives.
2020-11-13 16:56:45 -08:00
Harshavardhana 0bcb1b679d
fix: disallow update if dates are same (#10890)
fixes #10889
2020-11-12 14:18:59 -08:00
Klaus Post a3017c724e
Sort directory objects correctly (#10886)
Decode dir objects when listing and sort them correctly.
2020-11-12 13:09:34 -08:00
Harshavardhana 267d7bf0a9 Revert "Add delete marker replication support (#10396)"
This reverts commit 50c10a5087.

PR is moved to origin/dev branch
2020-11-12 11:43:14 -08:00
cksac be83dfc52a
fix: HDFS list bucket when subpath is provided (#10884) 2020-11-12 11:26:51 -08:00
Harshavardhana ca88ca753c
ignore typed errors correctly in list cache layer (#10879)
bonus write bucket metadata cache with enough quorum

Possible fix for #10868
2020-11-12 09:28:56 -08:00
Klaus Post f86d3538f6
Allow deeper sleep (#10883)
Allow each crawler operation to sleep up to 10 seconds on very heavily loaded systems.

This will of course make minimum crawler speed less, but should be more effective at stopping.
2020-11-12 09:17:56 -08:00
Klaus Post 1c3590078d
Skip 0 byte stream writes (#10875)
Don't send a packet when receiving 0 bytes or there is an error recorded
2020-11-11 18:07:40 -08:00
Harshavardhana aa158228f9
fix: simplify healing metadata objects per set (#10867) 2020-11-11 10:58:16 -08:00
Klaus Post 8747834c69
DeletedObjects: Return objects on lock failure (#10874)
Return objects when locking fails.

<details>
<summary>Panic</summary>

```
: 2020/11/10 04:15:55 http: panic serving 10.10.62.153:44858: runtime error: index out of range [0] with length 0
: goroutine 363537270 [running]:
: net/http.(*conn).serve.func1(0xc019232780)
:         net/http/server.go:1801 +0x147
: panic(0x1cadd60, 0xc001719260)
:         runtime/panic.go:975 +0x47a
: github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP.func1(0xc0121d1200, 0x210cda0, 0xc0141940e0)
:         github.com/minio/minio/cmd/generic-handlers.go:781 +0x1a8
: panic(0x1cadd60, 0xc001719260)
:         runtime/panic.go:969 +0x1b9
: github.com/minio/minio/cmd.objectAPIHandlers.DeleteMultipleObjectsHandler(0x1e71ce8, 0x1e71cc8, 0x2108420, 0xc0192328c0, 0xc0121d1400)
:         github.com/minio/minio/cmd/bucket-handlers.go:465 +0x2490
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.httpTraceAll.func1(0x2108420, 0xc0192328c0, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-utils.go:353 +0x158
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.collectAPIStats.func1(0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-utils.go:380 +0xed
: net/http.HandlerFunc.ServeHTTP(...)
:         net/http/server.go:2042
: github.com/minio/minio/cmd.maxClients.func1(0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/handler-api.go:132 +0x33b
: net/http.HandlerFunc.ServeHTTP(0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400)
:         net/http/server.go:2042 +0x44
: github.com/minio/minio/cmd.redirectHandler.ServeHTTP(0x20e2180, 0xc00271d590, 0x2108420, 0xc019232820, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:192 +0x156
: github.com/minio/minio/cmd.customHeaderHandler.ServeHTTP(0x20e1060, 0xc0141a22b0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:751 +0x162
: github.com/minio/minio/cmd.securityHeaderHandler.ServeHTTP(0x20e0fc0, 0xc0141a22c0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:766 +0x1d6
: github.com/minio/minio/cmd.bucketForwardingHandler.ServeHTTP(0xc0121c7a40, 0x20e1120, 0xc0141a22d0, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:624 +0xbf
: github.com/minio/minio/cmd.requestValidityHandler.ServeHTTP(0x20e0f20, 0xc01814d280, 0x21083e0, 0xc01814d2e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:608 +0x42a
: github.com/minio/minio/cmd.httpStatsHandler.ServeHTTP(0x20e10c0, 0xc0141a2300, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:536 +0xe4
: github.com/minio/minio/cmd.requestSizeLimitHandler.ServeHTTP(0x20e0fe0, 0xc0141a2310, 0x50004000000, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:68 +0xd4
: github.com/minio/minio/cmd.requestHeaderSizeLimitHandler.ServeHTTP(0x20e10a0, 0xc01814d2a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:93 +0x1b7
: github.com/minio/minio/cmd.crossDomainPolicy.ServeHTTP(0x20e1080, 0xc0141a2320, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/crossdomain-xml-handler.go:51 +0x82
: github.com/minio/minio/cmd.browserRedirectHandler.ServeHTTP(0x20e0fa0, 0xc0141a2330, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:276 +0x68
: github.com/minio/minio/cmd.minioReservedBucketHandler.ServeHTTP(0x20e0f00, 0xc0141a2340, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:344 +0xb8
: github.com/minio/minio/cmd.cacheControlHandler.ServeHTTP(0x20e1020, 0xc0141a2350, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:303 +0x1ce
: github.com/minio/minio/cmd.timeValidityHandler.ServeHTTP(0x20e0f40, 0xc0141a2360, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:414 +0x3ca
: github.com/minio/minio/cmd.resourceHandler.ServeHTTP(0x20e1160, 0xc0141a2370, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:516 +0xab
: github.com/minio/minio/cmd.authHandler.ServeHTTP(0x20e1100, 0xc0141a2380, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/auth-handler.go:502 +0x2e7
: github.com/minio/minio/cmd.sseTLSHandler.ServeHTTP(0x20e0ee0, 0xc0141a2390, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:802 +0x79
: github.com/minio/minio/cmd.reservedMetadataHandler.ServeHTTP(0x20e1140, 0xc0141a23a0, 0x210cda0, 0xc0141940e0, 0xc0121d1400)
:         github.com/minio/minio/cmd/generic-handlers.go:139 +0x1b7
: github.com/gorilla/mux.(*Router).ServeHTTP(0xc00073fb00, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/gorilla/mux@v1.8.0/mux.go:210 +0xd3
: github.com/rs/cors.(*Cors).Handler.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/rs/cors@v1.7.0/cors.go:219 +0x1b9
: net/http.HandlerFunc.ServeHTTP(0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2042 +0x44
: github.com/minio/minio/cmd.criticalErrorHandler.ServeHTTP(0x20e2180, 0xc0009aece0, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/minio/minio/cmd/generic-handlers.go:784 +0x85
: github.com/minio/minio/cmd/http.(*Server).Start.func1(0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         github.com/minio/minio/cmd/http/server.go:101 +0x258
: net/http.HandlerFunc.ServeHTTP(0xc000dc4080, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2042 +0x44
: net/http.serverHandler.ServeHTTP(0xc000764c60, 0x210cda0, 0xc0141940e0, 0xc0121d1200)
:         net/http/server.go:2843 +0xa3
: net/http.(*conn).serve(0xc019232780, 0x2114720, 0xc03381f6c0)
:         net/http/server.go:1925 +0x8ad
: created by net/http.(*Server).Serve
:         net/http/server.go:2969 +0x36c
```
</details>
2020-11-11 09:14:32 -08:00
Poorna Krishnamoorthy 50c10a5087
Add delete marker replication support (#10396)
Delete marker replication is implemented for V2
configuration specified in AWS spec (though AWS
allows it only in the V1 configuration).

This PR also brings in a MinIO only extension of
replicating permanent deletes, i.e. deletes specifying
version id are replicated to target cluster.
2020-11-10 15:24:14 -08:00
Steven Reitsma 4683a623dc
fix: negative STS IAM token TTL value (#10866) 2020-11-10 12:24:01 -08:00
Klaus Post 06899210a7
Reduce health check output (#10859)
This will make the health check clients 'silent'.
Use `IsNetworkOrHostDown` determine if network is ok so it mimics the functionality in the actual client.
2020-11-10 09:28:23 -08:00
Harshavardhana cbdab62c1e
fix: heal user/metadata right away upon server startup (#10863)
this is needed such that we make sure to heal the
users, policies and bucket metadata right away as
we do listing based on list cache which only lists
'3' sufficiently good drives, to avoid possibly
losing access to these users upon upgrade make
sure to heal them.
2020-11-10 09:02:06 -08:00
Harshavardhana 8df6112204
fix: avoid divide by zero error single node distributed setup (#10862) 2020-11-09 20:40:39 -08:00
Harshavardhana 97692bc772
re-route requests if IAM is not initialized (#10850) 2020-11-07 21:03:06 -08:00
Steven Reitsma 54120107ce
fix: infinite loop in cleanupStaleUploads of encrypted MPUs (#10845)
fixes #10588
2020-11-06 11:53:42 -08:00
Klaus Post 9bf5990ea9
metadata: Invalidate cache if unreadable and not updating (#10844)
If a scanning server shuts down unexpectedly we may have "successful" caches that are incomplete on a set.

In this case mark the cache with an error so it will no longer be handed out.
2020-11-06 08:54:09 -08:00
Steven Reitsma 74f7cf24ae
fix: s3 gateway SSE pagination (#10840)
Fixes #10838
2020-11-05 15:04:03 -08:00
Harshavardhana fb28aa847b
fix: add missing deleted key element in multiObjectDelete (#10839)
fixes #10832
2020-11-05 12:47:46 -08:00
Klaus Post 0724205f35
metacache: Add option for life extension (#10837)
Add `MINIO_API_EXTEND_LIST_CACHE_LIFE` that will extend 
the life of generated caches for a while.

This changes caches to remain valid until no updates have been 
received for the specified time plus a fixed margin.

This also changes the caches from being invalidated when the *first* 
set finishes until the *last* set has finished plus the specified time 
has passed.
2020-11-05 11:49:56 -08:00
Harshavardhana b72cac4cf3
fix: dangling objects on actual namespace (#10822) 2020-11-05 11:48:55 -08:00
Klaus Post bd77f29fc4
Don't replace caches that are receiving updates (#10834)
Keep caches while they are receiving updates.
Move update code to separate function.
2020-11-05 07:34:08 -08:00
Klaus Post d1e1205036
metacache: Always close the s2 writer (#10836)
The s2 writer could be leaked if there was an error.

Make sure it is always closed.
2020-11-05 07:30:14 -08:00
Harshavardhana 71753e21e0
add missing TTL for STS credentials on etcd (#10828) 2020-11-04 13:06:05 -08:00
Harshavardhana fde3299bf3
re-use optimized readdir for isDirEmpty() (#10829)
reduces effective memory usage by an order
of magnitude, also increases performance for
small objects
2020-11-04 13:05:21 -08:00
Harshavardhana 1a1f00fa15
fix: use internode data for DisksInfo, VolsInfo in message pack (#10821)
Similar to #10775 for fewer memory allocations, since we use
getOnlineDisks() extensively for listing we should optimize it
further.

Additionally, remove all unused walkers from the storage layer
2020-11-04 10:10:54 -08:00
Bill Thorp 4a1efabda4
Context based AccessKey passing (#10615)
A new field called AccessKey is added to the ReqInfo struct and populated.
Because ReqInfo is added to the context, this allows the AccessKey to be
accessed from 3rd-party code, such as a custom ObjectLayer.

Co-authored-by: Harshavardhana <harsha@minio.io>
Co-authored-by: Kaloyan Raev <kaloyan@storj.io>
2020-11-04 09:13:34 -08:00
Klaus Post 3b88a646ec
Add remote online/offline information (#10825)
Log information about remote clients being marked offline.

This will help to identify root causes of failures.
2020-11-04 08:27:32 -08:00
Klaus Post 2294e53a0b
Don't retain context in locker (#10515)
Use the context for internal timeouts, but disconnect it from outgoing 
calls so we always receive the results and cancel it remotely.
2020-11-04 08:25:42 -08:00
Klaus Post f0819cce75
Keep transient lists while they are updating (#10826)
On extremely long running listings keep the transient list 15 minutes after last update instead of using start time.

Also don't do overlap checks on transient lists.
2020-11-04 08:01:33 -08:00
Klaus Post 1e11b4629f
Add remote Diskinfo caching (#10824)
Add 1 second remote disk info cache.

Should decrease need for remote calls a great deal due to how actively it is used now.
2020-11-04 08:00:18 -08:00
Harshavardhana 5c72a34fa8
fix: honor delimiter as per AWS S3 spec (#10823) 2020-11-04 07:56:58 -08:00
Klaus Post b9277c8030
metacache: Add trashcan (#10820)
Add trashcan that keeps recently updated lists after bucket deletion.
All caches were deleted once a bucket was deleted, so caches still running would report errors. Now they are canceled.
Fix `.minio.sys` not being transient.
2020-11-03 12:47:52 -08:00
Harshavardhana 8c76e1353e
initialize IAM after etcd has initialized (#10819) 2020-11-03 12:12:30 -08:00
Harshavardhana ad382799b1
use list cache for Walk() with webUI and quota (#10814)
bring list cache optimizations for web UI
object listing, also FIFO quota enforcement
through list cache as well.
2020-11-03 08:53:48 -08:00
Harshavardhana 68de5a6f6a
fix: IAM store fallback to list users and policies from disk (#10787)
Bonus fixes, remove package retry it is harder to get it
right, also manage context remove it such that we don't have
to rely on it anymore instead use a simple Jitter retry.
2020-11-02 17:52:13 -08:00
Harshavardhana 4ea31da889
fix: move list quorum ENV to config (#10804) 2020-11-02 17:21:56 -08:00
Klaus Post 0a796505c1
metacache: Check only one disk for updates (#10809)
Check only one disk for updates.

This will reduce IO while waiting for lists to finish.
2020-11-02 17:20:27 -08:00
Klaus Post 37749f4623
Optimize FileInfo(Version) transfer (#10775)
File Info decoding, in particular, is showing up as a major 
allocator and time consumer for internode data transfers

Switch to message pack for cross-server transfers:

```
MSGP:

Size: 945 bytes

BenchmarkEncodeFileInfoMsgp-32    	 1558444	       866 ns/op	   1.16 MB/s	       0 B/op	       0 allocs/op
BenchmarkDecodeFileInfoMsgp-32    	  479968	      2487 ns/op	   0.40 MB/s	     848 B/op	      18 allocs/op

GOB:

Size: 1409 bytes

BenchmarkEncodeFileInfoGOB-32    	  333339	      3237 ns/op	   0.31 MB/s	     576 B/op	      19 allocs/op
BenchmarkDecodeFileInfoGOB-32    	   20869	     57837 ns/op	   0.02 MB/s	   16439 B/op	     428 allocs/op
```
2020-11-02 17:07:52 -08:00
Klaus Post 86e0d272f3
Reduce WriteAll allocs (#10810)
WriteAll saw 127GB allocs in a 5 minute timeframe for 4MiB buffers 
used by `io.CopyBuffer` even if they are pooled.

Since all writers appear to write byte buffers, just send those 
instead and write directly. The files are opened through the `os` 
package so they have no special properties anyway.

This removes the alloc and copy for each operation.

REST sends content length so a precise alloc can be made.
2020-11-02 16:14:31 -08:00
Harshavardhana 8527f22df1
optimize request URL encoding for internode (#10811)
this reduces allocations in order of magnitude

Also, revert "erasure: delete dangling objects automatically (#10765)" 
affects list caching should be investigated.
2020-11-02 15:15:12 -08:00
Anis Elleuch b456292295
erasure: delete dangling objects automatically (#10765) 2020-11-02 10:49:30 -08:00
Poorna Krishnamoorthy 03fdbc3ec2
Add async caching commit option in diskcache (#10742)
Add store and a forward option for a single part
uploads when an async mode is enabled with env
MINIO_CACHE_COMMIT=writeback 

It defaults to `writethrough` if unspecified.
2020-11-02 10:00:45 -08:00
Harshavardhana 4c773f7068
re-use remote transports in Peer,Storage,Locker clients (#10788)
use one transport for internode communication
2020-11-02 07:43:11 -08:00
Harshavardhana 5412d730c1
simplify monitoring doesn't need to be canceled (#10803)
connect disks monitoring doesn't need to be canceled
upon drive replacement, since we only need to replace
the newly replaced drive.
2020-10-31 14:10:12 -07:00
Klaus Post fe9f23e632
Recreate bucket metacache if corrupted (#10800)
If bucket metadata cannot be read, clean up existing and create a new.
2020-10-31 10:26:16 -07:00
Klaus Post 422898d9b3
Clean up metadata cache when deleting bucket (#10802)
Metadata caches were left behind when deleting a bucket.
2020-10-31 09:46:18 -07:00
Harshavardhana b686bb9c83
fix: replaced drive properly by healing the entire drive (#10799)
Bonus fixes, we do not need reload format anymore
as the replaced drive is healed locally we only need
to ensure that drive heal reloads the drive properly.

We preserve the UUID of the original order, this means
that the replacement in `format.json` doesn't mean that
the drive needs to be reloaded into memory anymore.

fixes #10791
2020-10-31 01:34:48 -07:00
Harshavardhana 5e5cdc581d
remove unnecessary logging and move to log once (#10798)
the current master logs way too much when a node
is down, instead log once and move on.
2020-10-30 14:55:50 -07:00
Harshavardhana 02cfa774be
allow requests to be proxied when server is booting up (#10790)
when server is booting up there is a possibility
that users might see '503' because object layer
when not initialized, then the request is proxied
to neighboring peers first one which is online.
2020-10-30 12:20:28 -07:00
Krishna Srinivas 3a2f89b3c0
fix: add support for O_DIRECT reads for erasure backends (#10718) 2020-10-30 11:04:29 -07:00
Klaus Post 6135f072d2
Fix invalidated metacaches (#10784)
* Fix caches having EOF marked as a failure.
* Simplify cache updates.
* Provide context for checkMetacacheState failures.
* Log 499 when the client disconnects.
2020-10-30 09:33:16 -07:00
Klaus Post e63a44b734
rest client: Expect context timeouts for locks (#10782)
Add option for rest clients to not mark a remote offline for context timeouts.

This can be used if context timeouts are expected on the call.
2020-10-29 09:52:11 -07:00
Klaus Post 6b14c4ab1e
Optimize decryptObjectInfo (#10726)
`decryptObjectInfo` is a significant bottleneck when listing objects.

Reduce the allocations for a significant speedup.

https://github.com/minio/sio/pull/40

```
λ benchcmp before.txt after.txt
benchmark                          old ns/op     new ns/op     delta
Benchmark_decryptObjectInfo-32     24260928      808656        -96.67%

benchmark                          old MB/s     new MB/s     speedup
Benchmark_decryptObjectInfo-32     0.04         1.24         31.00x

benchmark                          old allocs     new allocs     delta
Benchmark_decryptObjectInfo-32     75112          48996          -34.77%

benchmark                          old bytes     new bytes     delta
Benchmark_decryptObjectInfo-32     287694772     4228076       -98.53%
```
2020-10-29 09:34:20 -07:00
Harshavardhana 4bf90ca67f
fix: handle a crash when AskDisks is set to -1 (#10777) 2020-10-29 09:25:43 -07:00
Harshavardhana e0655e24f2
fix: A possible crash when fi.Erasure.Distribution is empty (#10779) 2020-10-28 19:24:01 -07:00
Klaus Post bfc36aed89
Add update retry limit and compare error by string instead (#10776) 2020-10-28 13:19:53 -07:00
Kaloyan Raev be7f67268d
fix: Do not cleanup range files in cache SaveMetadata when total hits are false (#10728) 2020-10-28 09:23:17 -07:00
Klaus Post a982baff27
ListObjects Metadata Caching (#10648)
Design: https://gist.github.com/klauspost/025c09b48ed4a1293c917cecfabdf21c

Gist of improvements:

* Cross-server caching and listing will use the same data across servers and requests.
* Lists can be arbitrarily resumed at a constant speed.
* Metadata for all files scanned is stored for streaming retrieval.
* The existing bloom filters controlled by the crawler is used for validating caches.
* Concurrent requests for the same data (or parts of it) will not spawn additional walkers.
* Listing a subdirectory of an existing recursive cache will use the cache.
* All listing operations are fully streamable so the number of objects in a bucket no 
  longer dictates the amount of memory.
* Listings can be handled by any server within the cluster.
* Caches are cleaned up when out of date or superseded by a more recent one.
2020-10-28 09:18:35 -07:00
Krishna Srinivas f53c5a020e
fix: heal object shards with ec.index and ec.distribution mismatches (#10773)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-10-28 00:10:20 -07:00
Harshavardhana 5b30bbda92
fix: add more protection distribution to match EcIndex (#10772)
allows for more stricter validation in picking up the right
set of disks for reconstruction.
2020-10-28 00:09:15 -07:00
Shireesh Anjal 858e2a43df
Remove logging info from OBDInfoHandler (#10727)
A lot of logging data is counterproductive. A better implementation with
precise useful log data can be introduced later.
2020-10-27 17:41:48 -07:00
Kaloyan Raev df9894e275
avoid caching http ranges in background goroutine (#10724) 2020-10-26 23:04:48 -07:00
Krishna Srinivas 592f2f23a3
fix: heal rejects objects with disk re-ordering issue (#10766) 2020-10-26 18:48:47 -07:00
Krishna Srinivas c49a80db41
fix: use meta.Erasure.Index for GetObject() to reconstruct object (#10764) 2020-10-26 16:19:42 -07:00
Poorna Krishnamoorthy 46275c6547
cache: rename function declarations (#10763) 2020-10-26 15:41:24 -07:00
Poorna Krishnamoorthy 0994ed9783
cache: fix call in GetObjectNInfo (#10762)
Fixes: #10751
2020-10-26 12:30:40 -07:00
Anis Elleuch eb95353cb1
fix: Get/HeadObject return 404 on non quorum objects (#10753) 2020-10-26 10:30:46 -07:00
Harshavardhana 029758cb20
fix: retain the previous UUID for newly replaced drives (#10759)
only newly replaced drives get the new `format.json`,
this avoids disks reloading their in-memory reference
format, ensures that drives are online without
reloading the in-memory reference format.

keeping reference format in-tact means UUIDs
never change once they are formatted.
2020-10-26 10:29:29 -07:00
Harshavardhana 646d6917ed
turn-off checking for updates completely if MINIO_UPDATE=off (#10752) 2020-10-24 22:39:44 -07:00
Harshavardhana d9db7f3308
expire lockers if lockers are offline (#10749)
lockers currently might leave stale lockers,
in unknown ways waiting for downed lockers.

locker check interval is high enough to safely
cleanup stale locks.
2020-10-24 13:23:16 -07:00
Harshavardhana 6a8c62f9fd
make sure to preserve UUID from reference format (#10748)
reference format should be source of truth
for inconsistent drives which reconnect,
add them back to their original position

remove automatic fix for existing offline
disk uuids
2020-10-24 13:23:08 -07:00
Anis Elleuch 00124c56d9
erasure: Commit data before xl.meta in RenameData() (#10734)
This will reduce the chance to have updated xl.meta without data.
2020-10-23 21:54:58 -07:00
Anis Elleuch 2c32c2149e
tests: Avoid running TestNSRace in short test mode (#10735) 2020-10-23 21:23:12 -07:00
Harshavardhana 734f258878
fix: slow down auto healing more aggressively (#10730)
Bonus fixes

- logging improvements to ensure that we don't use
  `go logger.LogIf` to avoid runtime.Caller missing
  the function name. log where necessary.
- remove unused code at erasure sets
2020-10-22 13:36:24 -07:00
Anis Elleuch 0e0c53bba4
tests: Lower expectation in addr selection in rand cache dialer (#10739)
Test TestDialContextWithDNSCacheRand was failing sometimes because it depends
on a random selection of addresses when testing random DNS resolution from cache.

Lower addr selection exception to 10%
2020-10-22 09:35:32 -07:00
Poorna Krishnamoorthy 5cc23ae052
validate if iam store is initialized (#10719)
Fixes panic - regression from d6d770c1b1
2020-10-20 21:28:24 -07:00
Harshavardhana d6d770c1b1 initialize object layer right after config has loaded 2020-10-19 22:04:59 -07:00
Harshavardhana b07df5cae1
initialize IAM as soon as object layer is initialized (#10700)
Allow requests to come in for users as soon as object
layer and config are initialized, this allows users
to be authenticated sooner and would succeed automatically
on servers which are yet to fully initialize.
2020-10-19 09:54:40 -07:00
Harshavardhana c107728676
fix: s3 gateway DNS cache initialization (#10706)
fixes #10705
2020-10-19 01:34:23 -07:00
Anis Elleuch 284a2b9021
ilm: Send delete marker creation event when appropriate (#10696)
Before this commit, the crawler ILM will always send object delete event
notification though this is wrong.
2020-10-16 21:22:12 -07:00
Ritesh H Shukla 0b53e30ecb
Clean up monitor on delete bucket (#10698) 2020-10-16 17:59:31 -07:00
Harshavardhana bd2131ba34
add DNS cache support to avoid DNS flooding (#10693)
Go stdlib resolver doesn't support caching DNS
resolutions, since we compile with CGO disabled
we are more probe to DNS flooding for all network
calls to resolve for DNS from the DNS server.

Under various containerized environments such as
VMWare this becomes a problem because there are
no DNS caches available and we may end up overloading
the kube-dns resolver under concurrent I/O.

To circumvent this issue implement a DNSCache resolver
which resolves DNS and caches them for around 10secs
with every 3sec invalidation attempted.
2020-10-16 14:49:05 -07:00
ebozduman 1aec168c84
fix: azure gateway should reject bucket names with "." (#10635) 2020-10-16 09:30:18 -07:00
Klaus Post 21a549a83b
fix: keep MRF channel open to avoid random CI crash (#10686)
There doesn't seem to be any benefit to closing the channel, so just keep 
it open and let it die with the server.
2020-10-16 09:08:51 -07:00
Ritesh H Shukla 8a16a1a1a9
fix: misc fixes for bandwidth reporting amd monitoring (#10683)
* Set peer for fetch bandwidth
* Fix the limit for bandwidth that is reported.
* Reduce CPU burn from bandwidth management.
2020-10-16 09:07:50 -07:00
Harshavardhana ad726b49b4
rename zones to serverSets to avoid terminology conflict (#10679)
we are bringing in availability zones, we should avoid
zones as per server expansion concept.
2020-10-15 14:28:50 -07:00
Anis Elleuch db2241066b
heal: Enable removing dangling delete markers (#10688) 2020-10-15 13:06:40 -07:00
Harshavardhana f1cc16e788
fix: background heal rely on getOnlineDisks() (#10687) 2020-10-15 13:06:23 -07:00
Klaus Post 3820a905e0
in getOnlineDisks wait for disks to be populated (#10685) 2020-10-15 06:37:10 -07:00
Harshavardhana 2042d4873c
rename crawler config option to heal (#10678) 2020-10-14 13:51:51 -07:00
Harshavardhana f9be783f3e
fix: allow crawler to crawl on disks without usage constraints (#10677)
additionally also change the resolution usage wise
return of disks, allows to small byte level differences
to be masked.
2020-10-14 12:12:10 -07:00
Harshavardhana 71b97fd3ac
fix: connect disks pre-emptively during startup (#10669)
connect disks pre-emptively upon startup, to ensure we have
enough disks are connected at startup rather than wait
for them.

we need to do this to avoid long wait times for server to
be online when we have servers come up in rolling upgrade
fashion
2020-10-13 18:28:42 -07:00
Klaus Post 03991c5d41
crawler: Remove waitForLowActiveIO (#10667)
Only use dynamic delays for the crawler. Even though the max wait was 1 second the number 
of waits could severely impact crawler speed.

Instead of relying on a global metric, we use the stateless local delays to keep the crawler 
running at a speed more adjusted to current conditions.

The only case we keep it is before bitrot checks when enabled.
2020-10-13 13:45:08 -07:00
飞雪无情 614060764d
fix: use the correct Action type for policy.Args and iampolicy.Args (#10650) 2020-10-12 15:18:22 -07:00
Harshavardhana a3ba8188d7 fix: allow locker to be niladic 2020-10-12 14:23:44 -07:00
Harshavardhana 2760fc86af
Bump default idleConnsPerHost to control conns in time_wait (#10653)
This PR fixes a hang which occurs quite commonly at higher concurrency
by allowing following changes

- allowing lower connections in time_wait allows faster socket open's
- lower idle connection timeout to ensure that we let kernel
  reclaim the time_wait connections quickly
- increase somaxconn to 4096 instead of 2048 to allow larger tcp
  syn backlogs.

fixes #10413
2020-10-12 14:19:46 -07:00
Ritesh H Shukla 8ceb2a93fd
fix: peer replication bandwidth monitoring in distributed setup (#10652) 2020-10-12 09:04:55 -07:00
Ritesh H Shukla c2f16ee846
Add basic bandwidth monitoring for replication. (#10501)
This change tracks bandwidth for a bucket and object

- [x] Add Admin API
- [x] Add Peer API
- [x] Add BW throttling
- [x] Admin APIs to set replication limit
- [x] Admin APIs for fetch bandwidth
2020-10-09 20:36:00 -07:00
Harshavardhana 6484453fc6
optionally allow strict quorum listing (#10649)
```
export MINIO_API_LIST_STRICT_QUORUM=on
```

would enable listing in quorum if necessary
2020-10-09 15:40:46 -07:00
Harshavardhana a0d0645128
remove safeMode behavior in startup (#10645)
In almost all scenarios MinIO now is
mostly ready for all sub-systems
independently, safe-mode is not useful
anymore and do not serve its original
intended purpose.

allow server to be fully functional
even with config partially configured,
this is to cater for availability of actual
I/O v/s manually fixing the server.

In k8s like environments it will never make
sense to take pod into safe-mode state,
because there is no real access to perform
any remote operation on them.
2020-10-09 09:59:52 -07:00
Harshavardhana 253194e491
do not hold write locks - if objects don't exist (#10644) 2020-10-08 17:47:21 -07:00
Harshavardhana 736e58dd68
fix: handle concurrent lockers with multiple optimizations (#10640)
- select lockers which are non-local and online to have
  affinity towards remote servers for lock contention

- optimize lock retry interval to avoid sending too many
  messages during lock contention, reduces average CPU
  usage as well

- if bucket is not set, when deleteObject fails make sure
  setPutObjHeaders() honors lifecycle only if bucket name
  is set.

- fix top locks to list out always the oldest lockers always,
  avoid getting bogged down into map's unordered nature.
2020-10-08 12:32:32 -07:00
Poorna Krishnamoorthy 907a171edd
Generalize error messages for remote targets (#10638)
This is to allow remote targets to be generalized
for replication/ILM transition

Also adding a field in BucketTarget to identify
a remote target with a label.
2020-10-08 10:54:11 -07:00
Andreas Auernhammer ed6d2a100f
logger: avoid writing audit log response header twice (#10642)
This commit fixes a misuse of the `http.ResponseWriter.WriteHeader`.
A caller should **either** call `WriteHeader` exactly once **or**
write to the response writer and causing an implicit 200 OK.

Writing the response headers more than once causes a `http: superfluous
response.WriteHeader call` log message. This commit fixes this
by preventing a 2nd `WriteHeader` call being forwarded to the underlying
`ResponseWriter`.

Updates #10587
2020-10-08 09:29:10 -07:00
Harshavardhana effe131090
fix: allow read unlocks to be defensive about split brains (#10637) 2020-10-07 09:15:01 -07:00
Harshavardhana 18063bf25c
fix: cleanup old directory handling code (#10633)
we don't need them anymore, remove legacy code.
2020-10-06 12:03:57 -07:00
Poorna Krishnamoorthy dbbed6f7f0
update minio-go dependency (#10634) 2020-10-06 08:37:09 -07:00
Poorna Krishnamoorthy 7fbfdceba3
Fix replication slowness (#10632)
- Increase channel buffer length
- Avoid blocking wait on replicaCh
2020-10-05 14:45:42 -07:00
Shireesh Anjal f1418a50f0
add NVMe drive info [model num, serial num, drive temp. etc.] (#10613)
* add NVMe drive info [model num, serial num, drive temp. etc.]
* Ignore fuse partitions
* Add the nvme logic only for linux
* Move smart/nvme structs to a separate file

Co-authored-by: wlan0 <sidharthamn@gmail.com>
2020-10-04 10:18:46 -07:00
Krishna Srinivas 045e30f2c1
Set LastModified time from source for bucket replication (#10627) 2020-10-02 18:32:22 -07:00
Harshavardhana c6a9a94f94
fix: optimize ServerInfo() handler to avoid reading config (#10626)
fixes #10620
2020-10-02 16:19:44 -07:00
Harshavardhana 8e7c00f3d4
add missing request-id from DeleteObject events (#10623)
fixes #10621
2020-10-02 13:36:13 -07:00
Harshavardhana 23e8390997
fix: Allow Walk to honor load balanced drives (#10610) 2020-10-01 20:24:34 -07:00
Anis Elleuch 71403be912
fix: consider partNumber in GET/HEAD requests (#10618) 2020-10-01 15:41:12 -07:00
Harshavardhana f28d02b7f2
fix: simplify obd how we calculate transferred bytes (#10617) 2020-10-01 14:34:51 -07:00
Harshavardhana e0cb814f3f
fail if port is not accessible (#10616)
throw proper error when port is not accessible
for the regular user, this is possibly a regression.

```
ERROR Unable to start the server: Insufficient permissions to use specified port
   > Please ensure MinIO binary has 'cap_net_bind_service=+ep' permissions
   HINT:
     Use 'sudo setcap cap_net_bind_service=+ep /path/to/minio' to provide sufficient permissions
```
2020-10-01 13:23:31 -07:00
Harshavardhana 98a08e1644
fix: protect updating latencies/throughput slices in obd (#10611)
Additionally close the transferChan upon function exit.
2020-10-01 09:50:08 -07:00
Klaus Post 3047121255
dataupdate: Bump to force rescan (#10609)
After #10594 let's invalidate the bloom filters to force the next cycles to go through all data.

There is a small chance that the linked PR could have caused missing bloom filter data.

This will invalidate the current bloom filters and make the crawler go through everything.
2020-09-30 16:10:40 -07:00
Ritesh H Shukla 5a7f92481e
fix: client errors for DNS service creation errors (#10584) 2020-09-30 14:09:41 -07:00
Anis Elleuch 0d45c38782
List v1/versions routes based on source IP if found (#10603)
Routing using on source IP if found. This should distribute
the listing load for V1 and versioning on multiple nodes
evenly between different clients.

If source IP is not found from the http request header, then falls back
to bucket name instead.
2020-09-30 13:38:27 -07:00
Poorna Krishnamoorthy 56d1b227cf
Handle changes to versioning config for replication (#10598)
Disallow versioning suspension on a bucket with
pre-existing replication configuration

If versioning is suspended on the target,replication
should fail.
2020-09-30 13:36:37 -07:00
Lenin Alevski bea87a5a20
fix: reading multiple TLS certificates when deployed in K8S (#10601)
Ignore all regular files, CAs directory and any 
directory that starts with `..` inside the
`.minio/certs` folder
2020-09-30 08:21:30 -07:00
Harshavardhana 2b4eb87d77
pick disks which are common maximally used (#10600)
further optimization to ensure that good disks
are always used for listing, other than healing
we only use disks that are maximally used.
2020-09-29 22:54:02 -07:00
Harshavardhana 1f9abbee4d
make sure to release locks upon timeout (#10596)
fixes #10418
2020-09-29 15:18:34 -07:00
Klaus Post fdf0ae9167
exit data update tracker only upon context completion (#10594)
The data update tracker saver would exit if data wasn't updated for between cycles.
2020-09-29 13:23:53 -07:00
Harshavardhana 00eb6f6bc9
cache DiskInfo at storage layer for performance (#10586)
`mc admin info` on busy setups will not move HDD
heads unnecessarily for repeated calls, provides
a better responsiveness for the call overall.

Bonus change allow listTolerancePerSet be N-1
for good entries, to avoid skipping entries
for some reason one of the disk went offline.
2020-09-29 09:54:41 -07:00
Harshavardhana 66174692a2
add '.healing.bin' for tracking currently healing disk (#10573)
add a hint on the disk to allow for tracking fresh disk
being healed, to allow for restartable heals, and also
use this as a way to track and remove disks.

There are more pending changes where we should move
all the disk formatting logic to backend drives, this
PR doesn't deal with this refactor instead makes it
easier to track healing in the future.
2020-09-28 19:39:32 -07:00
飞雪无情 209680e89f
Remove redundant http.HandlerFunc type conversion. (#10576) 2020-09-28 13:33:49 -07:00
飞雪无情 27d9bd04e5
Handling unhandled errors in the InfoCannedPolicy method. (#10575) 2020-09-27 10:24:04 -07:00
Harshavardhana bebcf4f004 unlock() only if locking was successful 2020-09-25 19:36:47 -07:00
Harshavardhana eafa775952
fix: add lock ownership to expire locks (#10571)
- Add owner information for expiry, locking, unlocking a resource
- TopLocks returns now locks in quorum by default, provides
  a way to capture stale locks as well with `?stale=true`
- Simplify the quorum handling for locks to avoid from storage
  class, because there were challenges to make it consistent
  across all situations.
- And other tiny simplifications to reset locks.
2020-09-25 19:21:52 -07:00
Harshavardhana 66b4a862e0
fix: network failure err check should ignore context canceled errors (#10567)
context canceled errors bubbling up from the network
layer has the potential to be misconstrued as network
errors, taking prematurely a server offline and triggering
a health check routine avoid this potential occurrence.
2020-09-25 14:35:47 -07:00
Anis Elleuch 9603489dd3
federation: Honor range with UploadObjectPart to a different cluster (#10570)
Use gr & length instead of srcInfo.Reader & srcInfo.Size because 
they don't honor range header
2020-09-25 12:06:42 -07:00
Anis Elleuch b302c8a5f4
heal: Fix periodic healing cleanup (#10569)
isEnded() was incorrectly calculating if the current healing sequence is
ended or not. h.currentStatus.Items could be empty if healing is very
slow and mc admin heal consumed all items.
2020-09-25 10:29:00 -07:00
Praveen raj Mani b880796aef
Set the maximum open connections limit in PG and MySQL target configs (#10558)
As the bulk/recursive delete will require multiple connections to open at an instance,
The default open connections limit will be reached which results in the following error

```FATAL:  sorry, too many clients already```

By setting the open connections to a reasonable value - `2`, We ensure that the max open connections
will not be exhausted and lie under bounds.

The queries are simple inserts/updates/deletes which is operational and sufficient with the
the maximum open connection limit is 2.

Fixes #10553

Allow user configuration for MaxOpenConnections
2020-09-24 22:20:30 -07:00
Harshavardhana 37a5d5d7a0
reduce timeouts between servers for faster disconnects (#10562) 2020-09-24 20:10:07 -07:00
Harshavardhana 3cac262dd1
report heal drives properly, also from global state (#10561)
It is possible the heal drives are not reported from
the maintenance check because the background heal
state simply relied on the `format.json` for capturing
unformatted drives. It is possible that drives might
be still healing - make sure that applications which
rely on cluster health check respond back this detail.
2020-09-24 15:36:47 -07:00
poornas e6ab4db6b8
Fix minimum replication workers started (#10560)
This PR also fixes GetReplicationConfiguration permission
in web-handlers.go to use bucket as resource
2020-09-24 12:25:41 -07:00
Harshavardhana ca989eb0b3
avoid ListBuckets returning quorum errors when node is down (#10555)
Also, revamp the way ListBuckets work make few portions
of the healing logic parallel

- walk objects for healing disks in parallel
- collect the list of buckets in parallel across drives
- provide consistent view for listBuckets()
2020-09-24 09:53:38 -07:00
飞雪无情 d778d034e7
Remove redundant mgmtQueryKey type. (#10557)
Remove redundant type conversion.
2020-09-24 08:40:21 -07:00
Harshavardhana f7f9517b6a fix: host extraction without port 2020-09-23 12:10:14 -07:00
Harshavardhana 90cff10e2b avoid crash if disks are not initialized 2020-09-23 12:00:29 -07:00
Harshavardhana 81caf35926
fix: reduce healthcheck interval for storage rest client (#10544) 2020-09-23 10:43:42 -07:00
poornas 5726cef3ca
validate bucket exists in ListRemoteTargets api (#10552) 2020-09-23 10:37:54 -07:00
Harshavardhana 8b74a72b21
fix: rename READY deadline to CLUSTER deadline ENV (#10535) 2020-09-23 09:14:33 -07:00
Klaus Post eec69d6796
Fix stale context for bucket retrieval (#10551)
The provided context gets captured by the closure making all subsequent calls fail.
2020-09-23 08:30:31 -07:00
Harshavardhana 0537a21b79
avoid concurrenct use of rand.NewSource (#10543) 2020-09-22 15:34:27 -07:00
poornas 4c54ed8748
Close replica channel only once (#10542)
Also enforce s3:GetReplicationConfiguration permission check as a
bucket level resource.
2020-09-22 12:47:24 -07:00
Anis Elleuch 4c81201f95
fix: healing delete marker on versioned buckets (#10530)
Healing was not working correctly in the distributed mode because
errFileVersionNotFound was not properly converted in storage rest
client.

Besides, fixing the healing delete marker is not working as expected.
2020-09-21 15:16:16 -07:00
Harshavardhana cd8d511d3d move versionsOrder struct to xl-storage-utils 2020-09-21 14:24:42 -07:00
Harshavardhana 17e17da00d
add parallel workers to perform replication in parallel (#10525)
set the concurrency for replication be to runtime.NumCPU()/2
2020-09-21 13:43:29 -07:00
Harshavardhana a5da9120f3
fix: [fs] an error upon rwPool.Write() just attempt rwPool.Create() (#10533)
On some NFS clients looks like errno is incorrectly set,
which leads to incorrect errors thrown upwards.
2020-09-21 12:54:23 -07:00
poornas aa12d75d75
fix crawler to detect lifecycle on bucket even if filter nil (#10532) 2020-09-21 11:41:07 -07:00
Harshavardhana 6fcbdd5607
remove unused putObjectDir code (#10528) 2020-09-21 09:41:39 -07:00
Harshavardhana 3831cc9e3b
fix: [fs] CompleteMultipart use trie structure for partMatch (#10522)
performance improves by around 100x or more

```
go test -v -run NONE -bench BenchmarkGetPartFile
goos: linux
goarch: amd64
pkg: github.com/minio/minio/cmd
BenchmarkGetPartFileWithTrie
BenchmarkGetPartFileWithTrie-4          1000000000               0.140 ns/op           0 B/op          0 allocs/op
PASS
ok      github.com/minio/minio/cmd      1.737s
```

fixes #10520
2020-09-21 01:18:13 -07:00
Krishna Srinivas 230fc0d186
Support for "directory" objects (#10499) 2020-09-19 08:39:41 -07:00
Harshavardhana 7f9498f43f
fix: ignore faulty drives and continue (#10511)
drives might return different types of errors
handle them individually, and for some errors
just log an error and continue
2020-09-18 12:09:05 -07:00
Harshavardhana 1cf322b7d4
change leader locker only for crawler (#10509) 2020-09-18 11:15:54 -07:00
Klaus Post 0b1c824618
Fix incorrect request start time (#10516)
Log request start time BEFORE starting processing the request
2020-09-18 09:30:52 -07:00
Klaus Post c851e022b7
Tweaks to dynamic locks (#10508)
* Fix cases where minimum timeout > default timeout.
* Add defensive code for too small/negative timeouts.
* Never set timeout below the maximum value of a request.
* Protect against (unlikely) int64 wraps.
* Decrease timeout slower.
* Don't re-lock before copying.
2020-09-18 09:18:18 -07:00
Klaus Post 5ad032826a
Add a reasonable if unable to get total RAM (#10506)
Though unlikely we shouldn't skip initializing the API if we cannot get RAM.

Add 16GiB as a default and log the error.
2020-09-18 02:03:02 -07:00
Harshavardhana 84bf4624a4
fix: make sure to preserve metadata during overwrite in FS mode (#10512)
This bug was introduced in 14f0047295
almost 3yrs ago, as a side affect of removing stale `fs.json`
but we in-fact end up removing existing good `fs.json` for an
existing object, leading to some form of a data loss.

fixes #10496
2020-09-18 00:16:16 -07:00
Harshavardhana 4a36cd7035
fix: improve performance ListObjectParts in FS mode (#10510)
from 20s for 10000 parts to less than 1sec

Without the patch
```
~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \
       list-parts --bucket testbucket --key test \
       --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a

real    0m20.394s
user    0m0.589s
sys     0m0.174s
```

With the patch
```
~ time aws --endpoint-url=http://localhost:9000 --profile minio s3api \
       list-parts --bucket testbucket --key test \
       --upload-id c1cd1f50-ea9a-4824-881c-63b5de95315a

real    0m0.891s
user    0m0.624s
sys     0m0.182s
```

fixes #10503
2020-09-17 18:51:16 -07:00
Klaus Post 03490c811b
Fix obd goroutine leak (#10504)
The gouroutine collecting transfer stats never exits. Add missing channel close.
2020-09-17 10:10:20 -07:00
Harshavardhana ed78854cea fix: list across all drives to avoid stale disks 2020-09-16 21:17:10 -07:00
Harshavardhana e60834838f
fix: background disk heal, to reload format consistently (#10502)
It was observed in VMware vsphere environment during a
pod replacement, `mc admin info` might report incorrect
offline nodes for the replaced drive. This issue eventually
goes away but requires quite a lot of time for all servers
to be in sync.

This PR fixes this behavior properly.
2020-09-16 21:14:35 -07:00
Harshavardhana d616d8a857
serialize replication and feed it through task model (#10500)
this allows for eventually controlling the concurrency
of replication and overally control of throughput
2020-09-16 16:04:55 -07:00
Anis Elleuch 24cab7f9df
ilm: Remove a 'null' version if not latest (#10494)
If the ILM document requires removing noncurrent versions, the 
the server should be able to remove 'null' versions as well. 
'null' versions are created when versioning is not enabled 
or suspended.
2020-09-16 10:21:50 -07:00
Harshavardhana 02c1a08a5b
fix: make sure to lock CopyObject for in-place updates (#10492) 2020-09-15 20:44:48 -07:00
Ritesh H Shukla 5c47ce456e
Run replication in the background (#10491) 2020-09-15 18:44:58 -07:00
Anis Elleuch 8ea55f9dba
obd: Add console log to OBD output (#10372) 2020-09-15 18:02:54 -07:00
poornas 80e3dce631
azure: update content-md5 to metadata after upload (#10482)
Fixes #10453
2020-09-15 16:31:47 -07:00
Harshavardhana 80fab03b63
fix: S3 gateway doesn't support full passthrough for encryption (#10484)
The entire encryption layer is dependent on the fact that
KMS should be configured for S3 encryption to work properly
and we only support passing the headers as is to the backend
for encryption only if KMS is configured.

Make sure that this predictability is maintained, currently
the code was allowing encryption to go through and fail
at later to indicate that KMS was not configured. We should
simply reply "NotImplemented" if KMS is not configured, this
allows clients to simply proceed with their tests.
2020-09-15 13:57:15 -07:00
Harshavardhana 730d2dc7be
fix: allow CopyObject/PutObjecTags on pre-existing content (#10485)
fixes #10475
2020-09-15 09:18:41 -07:00
Harshavardhana 0ee9678190
fix: add missing delete marker created filter (#10481) 2020-09-14 21:32:52 -07:00
Klaus Post 34859c6d4b
Preallocate (safe) slices when we know the size (#10459) 2020-09-14 20:44:18 -07:00
Klaus Post b1c99e88ac
reduce CPU usage upto 50% in readdir (#10466) 2020-09-14 17:19:54 -07:00
Harshavardhana 0104af6bcc
delayed locks until we have started reading the body (#10474)
This is to ensure that Go contexts work properly, after some
interesting experiments I found that Go net/http doesn't
cancel the context when Body is non-zero and hasn't been
read till EOF.

The following gist explains this, this can lead to pile up
of go-routines on the server which will never be canceled
and will die at a really later point in time, which can
simply overwhelm the server.

https://gist.github.com/harshavardhana/c51dcfd055780eaeb71db54f9c589150

To avoid this refactor the locking such that we take locks after we
have started reading from the body and only take locks when needed.

Also, remove contextReader as it's not useful, doesn't work as expected
context is not canceled until the body reaches EOF so there is no point
in wrapping it with context and putting a `select {` on it which
can unnecessarily increase the CPU overhead.

We will still use the context to cancel the lockers etc.
Additional simplification in the locker code to avoid timers
as re-using them is a complicated ordeal avoid them in
the hot path, since locking is very common this may avoid
lots of allocations.
2020-09-14 15:57:13 -07:00
Harshavardhana 34ea1d2167
fix: return correct error code for MetadataTooLarge (#10470)
fixes #10469
2020-09-13 21:26:35 -07:00
Harshavardhana 9d95937018 update KMS docs indicating deprecation of AUTO_ENCRYPTION env 2020-09-13 16:23:28 -07:00
Klaus Post fa01e640f5
Continous healing: add optional bitrot check (#10417) 2020-09-12 00:08:12 -07:00
Harshavardhana f355374962
add support for configurable remote transport deadline (#10447)
configurable remote transport timeouts for some special cases
where this value needs to be bumped to a higher value when
transferring large data between federated instances.
2020-09-11 23:03:08 -07:00
Harshavardhana bda0fe3150
fix: allow LDAP identity to support form body POST (#10468)
similar to other STS APIs
2020-09-11 23:02:32 -07:00
Harshavardhana b70995dd60 Revert "ilm: Remove null version if not latest with proper config (#10467)"
This reverts commit 4b6264da7d.
2020-09-11 18:15:49 -07:00
Anis Elleuch 4b6264da7d
ilm: Remove null version if not latest with proper config (#10467) 2020-09-11 14:20:09 -07:00
Harshavardhana 48919de301
fix: for defer'ed deleteObject use internal context (#10463) 2020-09-11 06:39:19 -07:00
Harshavardhana eb2934f0c1
simplify webhook DNS further generalize for gateway (#10448)
continuation of the changes from eaaf05a7cc
this further simplifies, enables this for gateway deployments as well
2020-09-10 14:19:32 -07:00
Klaus Post b7438fe4e6
Copy metadata before spawning goroutine + prealloc maps (#10458)
In `(*cacheObjects).GetObjectNInfo` copy the metadata before spawning a goroutine.

Clean up a few map[string]string copies as well, reducing allocs and simplifying the code.

Fixes #10426
2020-09-10 11:37:22 -07:00
Anis Elleuch ce6cef6855
erasure: Call Walk() from all disks (#10445)
It does not make sense to call Walk() in only N/2 disks and then
requires N/2 quorum, just keep it N/2+1 

The commit fixes this behavior.
2020-09-10 09:27:52 -07:00
Klaus Post 493c714663
Remove erasureSets and erasureObjects from ObjectLayer (#10442) 2020-09-10 09:18:19 -07:00
Harshavardhana e959c5d71c
fix: server panic in FS mode (#10455)
fixes #10454
2020-09-10 09:16:26 -07:00
Harshavardhana 4a2928eb49
generate missing object delete bucket notifications (#10449)
fixes #10381
2020-09-09 18:23:08 -07:00
Anis Elleuch af88772a78
lifecycle: NoncurrentVersionExpiration considers noncurrent version age (#10444)
From https://docs.aws.amazon.com/AmazonS3/latest/dev/intro-lifecycle-rules.html#intro-lifecycle-rules-actions

```
When specifying the number of days in the NoncurrentVersionTransition
and NoncurrentVersionExpiration actions in a Lifecycle configuration,
note the following:

It is the number of days from when the version of the object becomes
noncurrent (that is, when the object is overwritten or deleted), that
Amazon S3 will perform the action on the specified object or objects.

Amazon S3 calculates the time by adding the number of days specified in
the rule to the time when the new successor version of the object is
created and rounding the resulting time to the next day midnight UTC.
For example, in your bucket, suppose that you have a current version of
an object that was created at 1/1/2014 10:30 AM UTC. If the new version
of the object that replaces the current version is created at 1/15/2014
10:30 AM UTC, and you specify 3 days in a transition rule, the
transition date of the object is calculated as 1/19/2014 00:00 UTC.
```
2020-09-09 18:11:24 -07:00
Harshavardhana 9109148474
add support for new UA values for update an check (#10451) 2020-09-09 17:21:39 -07:00
Nitish Tiwari eaaf05a7cc
Add Kubernetes operator webook server as DNS target (#10404)
This PR adds a DNS target that ensures to update an entry
into Kubernetes operator when a bucket is created or deleted.

See minio/operator#264 for details.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-09-09 12:20:49 -07:00
Harshavardhana 958661cbb5
skip subdomain from bucket DNS which start with `minio.domain` (#10390)
extend host matcher to reject the host match
2020-09-09 09:57:37 -07:00
Harshavardhana 6a0372be6c
cleanup tmpDir any older entries automatically just like multipart (#10439)
also consider multipart uploads, temporary files in `.minio.sys/tmp`
as stale beyond 24hrs and clean them up automatically
2020-09-08 15:55:40 -07:00
Harshavardhana c13afd56e8
Remove MaxConnsPerHost settings to avoid potential hangs (#10438)
MaxConnsPerHost can potentially hang a call without any
way to timeout, we do not need this setting for our proxy
and gateway implementations instead IdleConn settings are
good enough.

Also ensure to use NewRequestWithContext and make sure to
take the disks offline only for network errors.

Fixes #10304
2020-09-08 14:22:04 -07:00
Harshavardhana 96997d2b21
allow ctrl+c to be consistent at early startup (#10435)
fixes #10431
2020-09-08 09:10:55 -07:00
Klaus Post 86a3319d41
Ignore config values from unknown subsystems (#10432) 2020-09-08 08:57:04 -07:00
Harshavardhana 9f60e84ce1
always copy UserDefined metadata map (#10427)
fixes #10426
2020-09-07 09:25:28 -07:00
Harshavardhana 572b1721b2
set max API requests automatically based on RAM (#10421) 2020-09-04 19:37:37 -07:00
Harshavardhana b0e1d4ce78
re-attach offline drive after new drive replacement (#10416)
inconsistent drive healing when one of the drive is offline
while a new drive was replaced, this change is to ensure
that we can add the offline drive back into the mix by
healing it again.
2020-09-04 17:09:02 -07:00
Harshavardhana eb19c8af40
Bump response header timeout for proxying list request (#10420) 2020-09-04 16:07:40 -07:00
Klaus Post 2d58a8d861
Add storage layer contexts (#10321)
Add context to all (non-trivial) calls to the storage layer. 

Contexts are propagated through the REST client.

- `context.TODO()` is left in place for the places where it needs to be added to the caller.
- `endWalkCh` could probably be removed from the walkers, but no changes so far.

The "dangerous" part is that now a caller disconnecting *will* propagate down,  so a 
"delete" operation will now be interrupted. In some cases we might want to disconnect 
this functionality so the operation completes if it has started, leaving the system in a cleaner state.
2020-09-04 09:45:06 -07:00
poornas 0037951b6e
improve error message when remote target missing (#10412) 2020-09-04 08:48:38 -07:00
Andreas Auernhammer fbd1c5f51a
certs: refactor cert manager to support multiple certificates (#10207)
This commit refactors the certificate management implementation
in the `certs` package such that multiple certificates can be
specified at the same time. Therefore, the following layout of
the `certs/` directory is expected:
```
certs/
 │
 ├─ public.crt
 ├─ private.key
 ├─ CAs/          // CAs directory is ignored
 │   │
 │    ...
 │
 ├─ example.com/
 │   │
 │   ├─ public.crt
 │   └─ private.key
 └─ foobar.org/
     │
     ├─ public.crt
     └─ private.key
   ...
```

However, directory names like `example.com` are just for human
readability/organization and don't have any meaning w.r.t whether
a particular certificate is served or not. This decision is made based
on the SNI sent by the client and the SAN of the certificate.

***

The `Manager` will pick a certificate based on the client trying
to establish a TLS connection. In particular, it looks at the client
hello (i.e. SNI) to determine which host the client tries to access.
If the manager can find a certificate that matches the SNI it
returns this certificate to the client.

However, the client may choose to not send an SNI or tries to access
a server directly via IP (`https://<ip>:<port>`). In this case, we
cannot use the SNI to determine which certificate to serve. However,
we also should not pick "the first" certificate that would be accepted
by the client (based on crypto. parameters - like a signature algorithm)
because it may be an internal certificate that contains internal hostnames. 
We would disclose internal infrastructure details doing so.

Therefore, the `Manager` returns the "default" certificate when the
client does not specify an SNI. The default certificate the top-level
`public.crt` - i.e. `certs/public.crt`.

This approach has some consequences:
 - It's the operator's responsibility to ensure that the top-level
   `public.crt` does not disclose any information (i.e. hostnames)
   that are not publicly visible. However, this was the case in the
   past already.
 - Any other `public.crt` - except for the top-level one - must not
   contain any IP SAN. The reason for this restriction is that the
   Manager cannot match a SNI to an IP b/c the SNI is the server host
   name. The entire purpose of SNI is to indicate which host the client
   tries to connect to when multiple hosts run on the same IP. So, a
   client will not set the SNI to an IP.
   If we would allow IP SANs in a lower-level `public.crt` a user would
   expect that it is possible to connect to MinIO directly via IP address
   and that the MinIO server would pick "the right" certificate. However,
   the MinIO server cannot determine which certificate to serve, and
   therefore always picks the "default" one. This may lead to all sorts
   of confusing errors like:
   "It works if I use `https:instance.minio.local` but not when I use
   `https://10.0.2.1`.

These consequences/limitations should be pointed out / explained in our
docs in an appropriate way. However, the support for multiple
certificates should not have any impact on how deployment with a single
certificate function today.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-09-03 23:33:37 -07:00
Harshavardhana 1c6781757c
add missing ListBucketVersions from policy actions (#10414) 2020-09-03 18:25:06 -07:00
Harshavardhana b4e3956e69
update KES docs to talk about 'mc encrypt' command (#10400)
add a deprecation notice for KMS_AUTO_ENCRYPTION
2020-09-03 12:43:45 -07:00
Harshavardhana 8a291e1dc0
Cluster healthcheck improvements (#10408)
- do not fail the healthcheck if heal status
  was not obtained from one of the nodes,
  if many nodes fail then report this as a
  catastrophic error.
- add "x-minio-write-quorum" value to match
  the write tolerance supported by server.
- admin info now states if a drive is healing
  where madmin.Disk.Healing is set to true
  and madmin.Disk.State is "ok"
2020-09-02 22:54:56 -07:00
Klaus Post 650dccfa9e
cache: Only start at high watermark (#10403)
Currently, cache purges are triggered as soon as the low watermark is exceeded.
To reduce IO this should only be done when reaching the high watermark.
This simplifies checks and reduces all calls for a GC to go through
`dcache.diskSpaceAvailable(size)`. While a comment claims that 
`dcache.triggerGC <- struct{}{}` was non-blocking I don't see how 
that was possible. Instead, we add a 1 size to the queue channel 
and use channel  semantics to avoid blocking when a GC has 
already been requested.

`bytesToClear` now takes the high watermark into account to it will 
not request any bytes to be cleared until that is reached.
2020-09-02 17:48:44 -07:00
Andreas Auernhammer 9a703befe6
crypto: reduce retry delay when retrying KES requests (#10394)
This commit reduces the retry delay when retrying a request
to a KES server by:
 - reducing the max. jitter delay from 3s to 1.5s
 - skipping the random delay when there are more KES endpoints
   available.

If there are more KES endpoints we can directly retry to the request
by sending it to the next endpoint - as pointed out by @krishnasrinivas
2020-09-02 11:04:10 -07:00
Klaus Post 9a1615768d
Fix flaky TestXLStorageVerifyFile (#10398)
`TestXLStorageVerifyFile` would fail 1 in 256 if the first random character was 'a'.

Instead write 256 bytes which has 1 in 256^256 probability.
2020-09-02 09:42:24 -07:00
Harshavardhana 37da0c647e
fix: delete marker compatibility behavior for suspended bucket (#10395)
- delete-marker should be created on a suspended bucket as `null`
- delete-marker should delete any pre-existing `null` versioned
  object and create an entry `null`
2020-09-02 00:19:03 -07:00
Harshavardhana 2acb530ccd
update rulesguard with new rules (#10392)
Co-authored-by: Nitish Tiwari <nitish@minio.io>
Co-authored-by: Praveen raj Mani <praveen@minio.io>
2020-09-01 16:58:13 -07:00
Klaus Post 3e1fb17b70
heal: Check for truncated files (#10399)
When checking parts we already do a stat for each part.

Since we have the on disk size check if it is at least what we expect.

When checking metadata check if metadata is 0 bytes.
2020-09-01 12:06:45 -07:00
Klaus Post a89d6b8e3d
Fix common Windows failure (#10397)
The `getNonLoopBackIP` may grab an IP from an interface that
doesn't allow binding (on Windows), so this test consistently fails.

We exclude that specific error.
2020-09-01 10:11:15 -07:00
Klaus Post 1c085f7d1a
Fix crash on Windows when crawling (#10385)
* readDirN: Check if file is directory

`syscall.FindNextFile` crashes if the handle is a file.

`errFileNotFound` matches 'unix' functionality: d19b434ffc/cmd/os-readdir_unix.go (L106)

Fixes #10384
2020-09-01 09:33:16 -07:00
Harshavardhana 4b6585d249
support 'ldap:user' variable replacement properly (#10391)
also update `ldap.go` examples with latest
minio-go changes

Fixes #10367
2020-09-01 12:26:22 +05:30
Harshavardhana 9ffad7fceb discard empty endpoint in crypto kes
introduced in 18725679c4
2020-08-31 19:35:43 -07:00
Andreas Auernhammer 18725679c4
crypto: allow multiple KES endpoints (#10383)
This commit addresses a maintenance / automation problem when MinIO-KES
is deployed on bare-metal. In orchestrated env. the orchestrator (K8S)
will make sure that `n` KES servers (IPs) are available via the same DNS
name. There it is sufficient to provide just one endpoint.
2020-08-31 18:10:52 -07:00
Anis Elleuch ba8a8ad818
ListObjectsV1 requests unnecessarily fail with offline nodes (#10386)
ListObjectsV1 requests are actually redirected to a specific node, 
depending on the bucket name. The purpose of this behavior was
to optimize listing.

However, the current code sends a Bad Gateway error if the
target node is offline, which is a bad behavior because it means
that the list request will fail, although this is unnecessary since
we can still use the current node to list as well (the default behavior
without using proxying optimization)

Currently, you can see mint fails when there is one offline node, after
this PR, mint will always succeed.
2020-08-31 12:37:31 -07:00
Harshavardhana 102ad60dee
simplify removing temporary files (#10389) 2020-08-31 12:35:40 -07:00
Gaige B Paulsen 859ef52886
update for smartos build (solaris too) (#10378) 2020-08-31 10:19:25 -07:00
Harshavardhana e730da1438
fix: referesh JWKS public keys upon failure (#10368)
fixes #10359
2020-08-28 08:15:12 -07:00
Anis Elleuch 46ee8659b4
fix write quorum calculation for bucket operations (#10364)
When the number of disks is odd, the calculation of quorum 
for bucket operations were not correct, fix it.
2020-08-27 12:55:32 -07:00
Harshavardhana a359e36e35
tolerate listing with only readQuorum disks (#10357)
We can reduce this further in the future, but this is a good
value to keep around. With the advent of continuous healing,
we can be assured that namespace will eventually be
consistent so we are okay to avoid the necessity to
a list across all drives on all sets.

Bonus Pop()'s in parallel seem to have the potential to
wait too on large drive setups and cause more slowness
instead of gaining any performance remove it for now.

Also, implement load balanced reply for local disks,
ensuring that local disks have an affinity for

- cleanupStaleMultipartUploads()
2020-08-26 19:29:35 -07:00
Jorge Israel Peña 0a2e6d58a5
hdfs gateway handle listing single files (#10362) 2020-08-26 16:03:53 -07:00
Klaus Post 1b119557c2
getDisksInfo: Attribute failed disks to correct endpoint (#10360)
If DiskInfo calls failed the information returned was used anyway 
resulting in no endpoint being set.

This would make the drive be attributed to the local system since 
`disk.Endpoint == disk.DrivePath` in that case.

Instead, if the call fails record the endpoint and the error only.
2020-08-26 10:11:26 -07:00
Harshavardhana 7778fef6bb
update continous heal metrics appropriately for scanned items (#10352)
bonus make sure to ignore objectNotFound, and versionNotFound
errors properly at all layers, since HealObjects() returns
objectNotFound error if the bucket or prefix is empty.
2020-08-26 08:53:33 -07:00
飞雪无情 ea1803417f
Use constants for gateway names to avoid bugs caused by spelling. (#10355) 2020-08-26 08:52:46 -07:00
Harshavardhana d19b434ffc
fix: bring back delayed leaf detection in listing (#10346) 2020-08-25 12:26:48 -07:00
Klaus Post 17a1eda702
Disregard healing disks in crawling (#10349)
When crawling never use a disk we know is healing.

Most of the change involves keeping track of the original endpoint on xlStorage
and this also fixes DiskInfo.Endpoint never being populated.

Heal master will print `data-crawl: Disk "http://localhost:9001/data/mindev/data2/xl1" is 
Healing, skipping` once on a cycle (no more often than every 5m).
2020-08-25 10:55:15 -07:00
Daniel Valdivia 7d1734d033
indicate through HTTP header cluster healing in progress (#10342) 2020-08-24 15:20:50 -07:00
Harshavardhana 03ec6adfd0
fix: KES http2.0 communication support (#10341) 2020-08-24 14:37:53 -07:00
Harshavardhana 309b10f201 keep crawler cycle at 5 minutes 2020-08-24 14:05:16 -07:00
Klaus Post c097ce9c32
continous healing based on crawler (#10103)
Design: https://gist.github.com/klauspost/792fe25c315caf1dd15c8e79df124914
2020-08-24 13:47:01 -07:00
Harshavardhana caad314faa
add ruleguard support, fix all the reported issues (#10335) 2020-08-24 12:11:20 -07:00
Klaus Post bc2ebe0021
Only enforce quota on success (#10339)
We should only enforce quotas if no error has been returned.

firstErr is safe to access since all goroutines have exited at this point.

If `firstErr` hasn't been set by something else return the context error if cancelled.
2020-08-24 10:15:46 -07:00
Harshavardhana 11aa393ba7
Allow region errors to be dynamic (#10323)
remove other FIXMEs as we are not planning to fix these, 
instead we will add dynamism case by case basis.

fixes #10250
2020-08-23 22:06:22 -07:00
Praveen raj Mani d0c910a6f3
Support https and basic-auth for elasticsearch notification target (#10332) 2020-08-23 09:43:48 -07:00
kannappanr d15a5ad4cc
S3 Gateway: Check for encryption headers properly (#10309) 2020-08-22 11:41:49 -07:00
Harshavardhana 95411228db
add missing cleanupStaleMultipartUploads (#10325)
fixes #10319
2020-08-21 21:39:54 -07:00
ebozduman 23774353b7
get_object() returns NoSuchKey error when object is a prefix (#10315) 2020-08-21 13:08:01 -07:00
poornas a2a5ec93d3
fix: use global context for filling cache in the background (#10308) 2020-08-20 14:23:24 -07:00
Harshavardhana 27a774cbe9
fix: FS mode should reject putBucketVersioning (#10307) 2020-08-20 13:18:06 -07:00
Klaus Post 8e6787a302
Fix TestDataUpdateTracker hanging (#10302)
Keep dataUpdateTracker while goroutine is starting.

This will ensure the object is updated one `start` returns

Tested with

```
λ go test -cpu=1,2,4,8 -test.run TestDataUpdateTracker -count=1000
PASS
ok      github.com/minio/minio/cmd      8.913s
```

Fixes #10295
2020-08-20 13:17:42 -07:00
Harshavardhana 59352d0ac2
load all blocking metadata in background (#10298)
most of this metadata already has fallbacks
and there is no good reason to load them
in blocking fashion
2020-08-20 10:38:53 -07:00
Harshavardhana 75d44b3bae
add disk for more context in bitrot errors (#10296) 2020-08-20 09:41:15 -07:00
Klaus Post 95ae6c4b49
Fix missing unlock in *healSequence.hasEnded() (#10305)
The background healing sequence would always hang when this function is called.
2020-08-20 08:48:09 -07:00
KevinSmile 0ebb73ee2e
use const instead of literals (#10292) 2020-08-19 16:43:52 -07:00
Harshavardhana c8b84a0e9e
Add nancy vulnerability scanner (#10289) 2020-08-19 14:25:21 -07:00
Ritesh H Shukla 3acb5cff45
Update code comment (#10287) 2020-08-19 14:24:58 -07:00
Harshavardhana 74116204ce
handle fresh setup with mixed drives (#10273)
fresh drive setups when one of the drive is
a root drive, we should ignore such a root
drive and not proceed to format.

This PR handles this properly by marking
the disks which are root disk and they are
taken offline.
2020-08-18 14:37:26 -07:00
Harshavardhana e4a44f6224
fix: commonPrefixes behavior in ListObjectVersions (#10286)
```
$ aws s3api --profile minio --endpoint-url http://localhost:9003 \
    list-object-versions --bucket testbucket \
    --delimiter / --prefix Veeam/Archive/

{
    "CommonPrefixes": [
        {
            "Prefix": "Veeam/Archive/003/"
        }
    ]
}
```

Also add coverage tests similar to ListObjects to
catch errors in future, skip these tests in FS mode
2020-08-18 12:19:44 -07:00
poornas 0272973175
Fix regression in web ui for retention (#10285)
Fixes: #10283 regression from PR #9259
2020-08-18 12:09:42 -07:00
Harshavardhana d2a3f92452
fix: health handler for lockers (#10280) 2020-08-18 07:27:41 -07:00
Harshavardhana ede86845e5
docs: Add policy variables for resource and conditions (#10278)
Bonus fix adds LDAP policy variable and clarifies the
usage of policy variables for temporary credentials.

fixes #10197
2020-08-17 17:39:55 -07:00
Harshavardhana e57c742674
use single dynamic timeout for most locked API/heal ops (#10275)
newDynamicTimeout should be allocated once, in-case
of temporary locks in config and IAM we should
have allocated timeout once before the `for loop`

This PR doesn't fix any issue as such, but provides
enough dynamism for the timeout as per expectation.
2020-08-17 11:29:58 -07:00
Klaus Post bb5976d727
healbucket: Send object version ID (#10263)
Based on our previous conversations I assume we should send the version
 id when healing an object.

Maybe we should even list object versions and heal all?
2020-08-17 08:25:44 -07:00
Harshavardhana f7c1a59de1
add validation logs for configured Logger/Audit HTTP targets (#10274)
extra logs in-case of misconfiguration of audit/logger targets
2020-08-16 10:25:00 -07:00
Anis Elleuch 51ba1dac49
listing: Fix result when prefix is an object with a slash (#10267)
In a non recursive mode, issuing a list request where prefix
is an existing object with a slash and delimiter is a slash will
return entries in the object directory (data dir IDs)

```
$ aws s3api --profile minioadmin --endpoint-url http://localhost:9000 \
        list-objects-v2 --bucket testbucket --prefix code_of_conduct.md/ --delimiter '/'
{
    "CommonPrefixes": [
        {
            "Prefix":
"code_of_conduct.md/ec750fe0-ea7e-4b87-bbec-1e32407e5e47/"
        }
    ]
}
```

This commit adds a fast exit track in Walk() in this specific case.
2020-08-14 20:13:24 -07:00
Harshavardhana a4463dd40f
fix: storageClass shouldn't set the value upon failure (#10271) 2020-08-14 19:48:04 -07:00
Harshavardhana 83a82d818e
allow lock tolerance to match storage-class drive tolerance (#10270) 2020-08-14 18:17:14 -07:00
Harshavardhana 1d1c4430b2
decrypt ETags in parallel around 500 at a time (#10261)
Listing speed-up gained from 10secs for
just 400 entries to 2secs for 400 entries
2020-08-14 11:56:35 -07:00
Harshavardhana 43e6d1ce2d
fix: missing proxy request by bucket for ListVersions (#10260) 2020-08-13 16:31:58 -07:00
Harshavardhana 30da442a85
rootDisk on containers can have different device Id (#10259)
use `/etc/hosts` instead of `/` to check for common
device id, if the device is same for `/etc/hosts`
and the --bind mount to detect root disks.

Bonus enhance healthcheck logging by adding maintenance
tags, for all messages.
2020-08-13 15:21:20 -07:00
Harshavardhana 038d91feaa
fix: add public certs automatically as part of global CAs (#10256) 2020-08-13 09:46:50 -07:00
Harshavardhana e7ba78beee
use GlobalContext instead of context.Background when possible (#10254) 2020-08-13 09:16:01 -07:00
Harshavardhana b32d0a5b60 use the correct endpoints for offline drives 2020-08-12 19:17:49 -07:00
poornas 79e21601b0
fix: web handlers to enforce replication (#10249)
This PR also preserves source ETag for replication
2020-08-12 17:32:24 -07:00
Harshavardhana 34253aa595
feat: cache env value in-case network is not reachable (#10251) 2020-08-12 16:53:15 -07:00
Harshavardhana 79ed7ce451
fs: listObjects shouldn't take FS locks while listing (#10248) 2020-08-12 15:23:14 +05:30
Harshavardhana 0dd3a08169
move the certPool loader function into pkg/certs (#10239) 2020-08-11 08:29:50 -07:00
Klaus Post f8f290e848
security: Remove insecure custom headers (#10244)
Background: https://github.com/google/security-research/security/advisories/GHSA-76wf-9vgp-pj7w

Remove these custom headers from incoming and outgoing requests.
2020-08-11 08:29:29 -07:00
Harshavardhana 1e2ebc9945
feat: time to bring back http2.0 support (#10230)
Bonus move our CI/CD to go1.14
2020-08-10 09:02:29 -07:00
Harshavardhana 2a9819aff8
fix: refactor background heal for cluster health (#10225) 2020-08-07 19:43:06 -07:00
Harshavardhana 6c6137b2e7
add cluster maintenance healthcheck drive heal affinity (#10218) 2020-08-07 13:22:53 -07:00
Anis Elleuch 9138b2b503
Avoid duplicate headers when proxying S3 listing requests (#10220) 2020-08-07 04:10:16 -07:00
Harshavardhana 77509ce391
Support looking up environment remotely (#10215)
adds a feature where we can fetch the MinIO
command-line remotely, this
is primarily meant to add some stateless
nature to the MinIO deployment in k8s
environments, MinIO operator would run a
webhook service endpoint
which can be used to fetch any environment
value in a generalized approach.
2020-08-06 18:03:16 -07:00
poornas adcaa6f9de
fix: Change ListBucketTargets handler (#10217)
to list all targets across a tenant.
Also fixing some validations.
2020-08-06 17:10:21 -07:00
poornas 121164db56
fix: relax some replication validations (#10210)
Also inherit storage class from source object
if replication configuration does not have a storage
class specified for destination bucket.
2020-08-05 20:01:20 -07:00
Harshavardhana a20d4568a2
fix: make sure to use uniform drive count calculation (#10208)
It is possible in situations when server was deployed
in asymmetric configuration in the past such as

```
minio server ~/fs{1...4}/disk{1...5}
```

Results in setDriveCount of 10 in older releases
but with fairly recent releases we have moved to
having server affinity which means that a set drive
count ascertained from above config will be now '4'

While the object layer make sure that we honor
`format.json` the storageClass configuration however
was by mistake was using the global value obtained
by heuristics. Which leads to prematurely using
lower parity without being requested by the an
administrator.

This PR fixes this behavior.
2020-08-05 13:31:12 -07:00
Harshavardhana e656beb915
feat: allow service accounts to be generated with OpenID STS (#10184)
Bonus also fix a bug where we did not purge relevant
service accounts generated by rotating credentials
appropriately, service accounts should become invalid
as soon as its corresponding parent user becomes invalid.

Since service account themselves carry parent claim always
we would never reach this problem, as the access get
rejected at IAM policy layer.
2020-08-05 13:08:40 -07:00
poornas 88daaef76b
Validate object lock when setting replication config. (#10200)
Check if object lock is enabled on
destination bucket while setting replication
configuration on a object lock enabled bucket.
2020-08-04 23:02:27 -07:00
Harshavardhana 0b8255529a
fix: proxies set keep-alive timeouts to be system dependent (#10199)
Split the DialContext's one for internode and another
for all other external communications especially
proxy forwarders, gateway transport etc.
2020-08-04 14:55:53 -07:00
Harshavardhana 019fe69a57
fix: reduce an extra system call for writes instead fail later (#10187) 2020-08-04 12:09:41 -07:00
Anis Elleuch 6ae30b21c9
fix ILM should not remove a protected version (#10189) 2020-08-03 23:04:40 -07:00
Harshavardhana b16781846e
allow server to start even with corrupted/faulty disks (#10175) 2020-08-03 18:17:48 -07:00
Harshavardhana 5ce82b45da
add CopyObject optimization when source and destination are same (#10170)
when source and destination are same and versioning is enabled
on the destination bucket - we do not need to re-create the entire
object once again to optimize on space utilization.

Cases this PR is not supporting

- any pre-existing legacy object will not
  be preserved in this manner, meaning a new
  dataDir will be created.

- key-rotation and storage class changes
  of course will never re-use the dataDir
2020-08-03 16:21:10 -07:00
Harshavardhana e99bc177c0
fix: allow FS mode situations when conflicting files exist (#10185)
conflicting files can exist on FS at
`.minio.sys/buckets/testbucket/policy.json/`, this is an
expected valid scenario for FS mode allow it to work,
i.e ignore and move forward
2020-08-03 13:20:49 -07:00
Harshavardhana b68bc75dad
fix: quorum calculation mistake with reduced parity (#10186)
With reduced parity our write quorum should be same
as read quorum, but code was still assuming

```
readQuorum+1
```

In all situations which is not necessary.
2020-08-03 12:15:08 -07:00
Harshavardhana d61eac080b
fix: connection_string should override other params (#10180)
closes #9965
2020-08-03 09:16:00 -07:00
poornas a8dd7b3eda
Refactor replication target management. (#10154)
Generalize replication target management so
that remote targets for a bucket can be
managed with ARNs. `mc admin bucket remote`
command will be used to manage targets.
2020-07-30 19:55:22 -07:00
Harshavardhana 25a55bae6f
fix: avoid buffering of server sent events by proxies (#10164) 2020-07-30 19:45:12 -07:00
Harshavardhana fe157166ca
fix: Pass context all the way down to the network call in lockers (#10161)
Context timeout might race on each other when timeouts are lower
i.e when two lock attempts happened very quickly on the same resource
and the servers were yet trying to establish quorum.

This situation can lead to locks held which wouldn't be unlocked
and subsequent lock attempts would fail.

This would require a complete server restart. A potential of this
issue happening is when server is booting up and we are trying
to hold a 'transaction.lock' in quick bursts of timeout.
2020-07-29 23:15:34 -07:00
Adam Brown f7259adf83
Update LastUpdate timestamp before save (#10152) 2020-07-28 13:20:50 -07:00
Harshavardhana 6669560cb9
turn-off bucket usage metrics in gateway mode (#10150)
closes #10147
2020-07-28 13:04:26 -07:00
poornas b46ab7e921
Rename replication target handler (#10142)
Rename replication target handler to a generic bucket target handler
2020-07-28 11:50:47 -07:00
Harshavardhana 27266f8a54
fix: if OPA set do not enforce policy claim (#10149) 2020-07-28 11:47:57 -07:00
poornas 1b6ba0d062
Add validation in cache for offline drives (#10146)
closes #10144
2020-07-28 10:06:52 -07:00
Harshavardhana f200a7fb6a
fix: speed up OBD tests avoid unnecessary memory allocation (#10141)
replace dummy buffer with nullReader{} instead,
to avoid large memory allocations in memory
constrainted environments. allows running
obd tests in such environments.
2020-07-27 14:51:59 -07:00
Harshavardhana 47e304d03c
fix: add missing content-disposition from CORS handler (#10137) 2020-07-27 09:03:38 -07:00
Harshavardhana 9108abf204
fix: allow shareable URLs with rotating creds (#10135)
closes #8935
2020-07-27 09:02:53 -07:00
Harshavardhana 6529dcb3b5
fix: gateway Walk() implementation to list correct contents (#10131)
closes #10122
2020-07-26 22:56:05 -07:00
Harshavardhana abbf6ce6cc
simplify JWKS decoding in OpenID and more tests (#10119)
add tests for non-compliant Azure AD behavior
with "nonce" to fail properly and treat it as
expected behavior for non-standard JWT tokens.
2020-07-25 08:42:41 -07:00
Harshavardhana 5ffc733eec
fix: enforce bucket quota from browser uploads (#10129) 2020-07-24 21:16:54 -07:00
Harshavardhana 35212b673e
add unformatted disk as part of the error list (#10128)
these errors should be ignored for quorum
error calculation to ensure that we don't
prematurely return unformatted disk error
as part of API calls
2020-07-24 13:16:11 -07:00
Harshavardhana 57ff9abca2
Apply quota usage cache invalidation per second (#10127)
Allow faster lookups for quota check enforcement
2020-07-24 12:24:21 -07:00
Jorge Israel Peña 4752323e1c
Use hdfs.Readdir() to optimize HDFS directory listings (#10121)
Currently, listing directories on HDFS incurs a per-entry remote Stat() call
penalty, the cost of which can really blow up on directories with many
entries (+1,000) especially when considered in addition to peripheral
calls (such as validation) and the fact that minio is an intermediary to the
client (whereas other clients listed below can query HDFS directly).

Because listing directories this way is expensive, the Golang HDFS library
provides the [`Client.Open()`] function which creates a [`FileReader`] that is
able to batch multiple calls together through the [`Readdir()`] function.

This is substantially more efficient for very large directories.

In one case we were witnessing about +20 seconds to list a directory with 1,500
entries, admittedly large, but the Java hdfs ls utility as well as the HDFS
library sample ls utility were much faster.

Hadoop HDFS DFS (4.02s):

    λ ~/code/minio → use-readdir
    » time hdfs dfs -ls /directory/with/1500/entries/
    …
    hdfs dfs -ls   5.81s user 0.49s system 156% cpu 4.020 total

Golang HDFS library (0.47s):

    λ ~/code/hdfs → master
    » time ./hdfs ls -lh /directory/with/1500/entries/
    …
    ./hdfs ls -lh   0.13s user 0.14s system 56% cpu 0.478 total

mc and minio **without** optimization (16.96s):

    λ ~/code/minio → master
    » time mc ls myhdfs/directory/with/1500/entries/
    …
    ./mc ls   0.22s user 0.29s system 3% cpu 16.968 total

mc and minio **with** optimization (0.40s):

    λ ~/code/minio → use-readdir
    » time mc ls myhdfs/directory/with/1500/entries/
    …
    ./mc ls   0.13s user 0.28s system 102% cpu 0.403 total

[`Client.Open()`]: https://godoc.org/github.com/colinmarc/hdfs#Client.Open
[`FileReader`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader
[`Readdir()`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader.Readdir
2020-07-24 11:31:51 -07:00
Klaus Post 11593c6cc4
Usage: Reset merged info when updating (#10126)
When merging multiple buckets reset between each update.

Avoids merging the same usage metrics multiple times resulting 
in duplicate data entries.
2020-07-24 11:02:10 -07:00
Harshavardhana 10025bda45
fix: add missing response headers to CORS handler (#10124) 2020-07-24 00:46:51 -07:00
Harshavardhana 3a73f1ead5
refactor server update behavior (#10107) 2020-07-23 08:03:31 -07:00
poornas b9be841fd2
Add missing validation for replication API conditions (#10114) 2020-07-22 17:39:40 -07:00
Anis Elleuch 456b2ef6eb
Avoid healing to be stuck with many concurrent event listeners (#10111)
If there are many listeners to bucket notifications or to the trace
subsystem, healing fails to work properly since it suspends itself when
the number of concurrent connections is above a certain threshold.

These connections are also continuous and not costly (*no disk access*),
it is okay to just ignore them in waitForLowHTTPReq().
2020-07-22 13:16:55 -07:00
poornas c43da3005a
Add support for server side bucket replication (#9882) 2020-07-21 17:49:56 -07:00
Harshavardhana a880283593
Send the lower level error directly from GetDiskID() (#10095)
this is to detect situations of corruption disk
format etc errors quickly and keep the disk online
in such scenarios for requests to fail appropriately.
2020-07-21 13:54:06 -07:00
Harshavardhana eb6bf454f1
fix: copyObject encryption from unencrypted object (#10102)
This is a continuation of #10085
2020-07-21 12:25:01 -07:00
Harshavardhana ec06089eda
fix: re-implement cluster healthcheck (#10101) 2020-07-20 18:31:22 -07:00
Harshavardhana 0c4be55936
fix: fix lockup in merge-walk pool (#10098)
Fixes two different types of problems

- continuation of the problem seen in FS #9992
  as not fixed for erasure coded deployments,
  reproduced this issue with spark and its fixed now

- another issue was leaking walk go-routines which
  would lead to high memory usage and crash the system
  this is simply because all the walks which were purged
  at the top limit had leaking end walkers which would
  consume memory endlessly.

closes #9966
closes #10088
2020-07-20 17:28:26 -07:00
Harshavardhana 11d21d5d1b
fix: pass around the correct drives per set (#10097)
this is a precursor change before adding parity
based SLA across zones instead of same stripe size
2020-07-20 16:38:40 -07:00
Harshavardhana 2955aae8e4
feat: Add notification support for bucketCreates and removal (#10075) 2020-07-20 12:52:49 -07:00
Harshavardhana 9fd836e51f
add dnsStore interface for upcoming operator webhook (#10077) 2020-07-20 12:28:48 -07:00
Anis Elleuch 518f44908c
fs: Close object fs.json before deletion (#10092)
NFS fails when deleting a file while it is already opened. The reason is
that the object fs.json meta file is opened but not closed before
removal.
2020-07-20 08:52:24 -07:00
Harshavardhana e2c71717f8
add different TCP timeouts for internal and incoming (#10090)
closes #10086
2020-07-19 17:16:12 -07:00
Harshavardhana 7764c542f2
allow claims to be optional in STS (#10078)
not all claims need to be present for
the JWT claim, let the policies not
exist and only apply which are present
when generating the credentials

once credentials are generated then
those policies should exist, otherwise
the request will fail.
2020-07-19 15:34:01 -07:00
Harshavardhana d53e560ce0
fix: copyObject key rotation issue (#10085)
- copyObject in-place decryption failed
  due to incorrect verification of headers
- do not decode ETag when object is encrypted
  with SSE-C, so that pre-conditions don't fail
  prematurely.
2020-07-18 17:36:32 -07:00
Harshavardhana 17747db93f
fix: support healing older content (#10076)
This PR adds support for healing older
content i.e from 2yrs, 1yr. Also handles
other situations where our config was
not encrypted yet.

This PR also ensures that our Listing
is consistent and quorum friendly,
such that we don't list partial objects
2020-07-17 17:41:29 -07:00
Harshavardhana 3fe27c8411
fix: In federated setup dial all hosts to figure out online host (#10074)
In federated NAS gateway setups, multiple hosts in srvRecords
was picked at random which could mean that if one of the
host was down the request can indeed fail and if client
retries it would succeed. Instead allow server to figure
out the current online host quickly such that we can
exclude the host which is down.

At the max the attempt to look for a downed node is to
300 millisecond, if the node is taking longer to respond
than this value we simply ignore and move to the node,
total attempts are equal to number of srvRecords if no
server is online we simply fallback to last dialed host.
2020-07-17 14:25:47 -07:00
Harshavardhana 14b1c9f8e4
fix: return Range errors after If-Matches (#10045)
closes #7292
2020-07-17 13:01:22 -07:00
Klaus Post d84fc58cac
fix: CheckParts endpoint call to correct API (#10073)
CheckParts is calling the wrong endpoint, so instead of 
checking parts, it is writing metadata.
2020-07-17 10:17:59 -07:00
Harshavardhana 187c3f62df
fix: heal replaced drives properly (#10069)
healing was not working properly when drives were
replaced, due to the error check in root disk
calculation this PR fixes this behavior

This PR also adds additional fix for missing
metadata entries from .minio.sys as part of
disk healing as well.

Added code to ignore and print more context
sensitive errors for better debugging.

This PR is continuation of fix in 7b14e9b660
2020-07-17 10:08:04 -07:00
Harshavardhana 4bfc50411c
fix: return versionId in tagging APIs (#10068) 2020-07-16 22:38:58 -07:00
Harshavardhana d3c81a6e93
add missing available space from metrics (#10065) 2020-07-16 14:43:48 -07:00
Harshavardhana 7342b5355f
fix: obtain correct location string with DNS style buckets (#10060)
closes #10054
2020-07-16 13:28:29 -07:00
Harshavardhana 7b14e9b660
fix: diskInfo should check diskID only if disk is online (#10058)
closes #10057
2020-07-16 07:30:05 -07:00
Harshavardhana cd849bc2ff
update STS docs with new values (#10055)
Co-authored-by: Poorna <poornas@users.noreply.github.com>
2020-07-15 14:36:14 -07:00
Klaus Post 00d3cc4b69
Enforce quota checks after crawl (#10036)
Enforce bucket quotas when crawling has finished. 
This ensures that we will not do quota enforcement on old data.

Additionally, delete less if we are closer to quota than we thought.
2020-07-14 18:59:05 -07:00
Harshavardhana 14ff7f5fcf
add hdfs sub-path support (#10046)
for users who don't have access to HDFS rootPath '/'
can optionally specify `minio gateway hdfs hdfs://namenode:8200/path`
for which they have access to, allowing all writes to be
performed at `/path`.

NOTE: once configured in this manner you need to make
sure command line is correctly specified, otherwise
your data might not be visible

closes #10011
2020-07-14 15:49:10 -07:00
Harshavardhana 369a876ebe
fix: handle array policies in JWT claim (#10041)
PR #10014 was not complete as only handled
policy claims partially.
2020-07-14 10:26:47 -07:00
Anis Elleuch 778e9c864f
Move dependency from minio-go v6 to v7 (#10042) 2020-07-14 09:38:05 -07:00
Harshavardhana a2616b8227
allow turning off secure ciphers (#10038)
this PR to allow legacy support for big-data
applications which run older Java versions
which do not support the secure ciphers
currently defaulted in MinIO. This option
allows optionally to turn them off such
that client and server can negotiate the
best ciphers themselves.

This env is purposefully not documented,
meant as a last resort when client
application cannot be changed easily.
2020-07-13 14:20:21 -07:00
Harshavardhana e7d7d5232c
fix: admin info output and improve overall performance (#10015)
- admin info node offline check is now quicker
- admin info now doesn't duplicate the code
  across doing the same checks for disks
- rely on StorageInfo to return appropriate errors
  instead of calling locally.
- diskID checks now return proper errors when
  disk not found v/s format.json missing.
- add more disk states for more clarity on the
  underlying disk errors.
2020-07-13 09:51:07 -07:00
Harshavardhana 1d65ef3201
fix: deletes on older format properly (#10029)
while we handle all situations for writes and reads
on older format, what we didn't cater for properly
yet was delete where we only ended up deleting
just `xl.meta` - instead we should allow all the
deletes to go through for older format without
versioning enabled buckets.
2020-07-13 09:01:17 -07:00
Harshavardhana 37c14207d6
fix: cors handling again for not just OPTIONS request (#10025)
CORS is notorious requires specific headers to be
handled appropriately in request and response,
using cors package as part of handlerFunc() for
options method lacks the necessary control this
package needs to add headers.
2020-07-12 10:56:57 -07:00
Harshavardhana 3b9fbf80ad
fix: make sure to use new restClient for healthcheck (#10026)
Without instantiating a new rest client we can
have a recursive error which can lead to
healthcheck returning always offline, this can
prematurely take the servers offline.
2020-07-11 22:19:38 -07:00
Harshavardhana 143f9371c6 fix: loading users regression
additionally also move to latest gorilla/mux master
to fix the DNS style bucket routing regression

resolves #10022
resolves #10023
2020-07-11 14:03:27 -07:00
Harshavardhana 3f1902face
fix: cors should be available on all paths (#10020) 2020-07-11 13:49:24 -07:00
Harshavardhana c0adb52213
sync to disk only upon successful legacy metadata rename (#10018) 2020-07-11 09:37:34 -07:00
Harshavardhana 2d17c16d93
fix: make sure to honor versioning from browser UI deletes (#10016) 2020-07-10 22:21:04 -07:00
Harshavardhana 36d36fab0b
fix: add virtual host style workaround for gorilla/mux issue (#10010)
gorilla/mux broke their recent release 1.7.4 which we
upgraded to, we need the current workaround to ensure
that our regex matches appropriately.

An upstream PR is sent, we should remove the
workaround once we have a new release.
2020-07-10 15:21:32 -07:00
Harshavardhana ba756cf366
fix: extract array type for policy claim if present (#10014) 2020-07-10 14:48:44 -07:00
Benjamin Sodenkamp c00d410e61
Added bucket name param to ToJSONError call (#9961)
when called with InvalidBucketName error.
The user is shown a more specific error
when the param is present.
2020-07-10 12:10:39 -07:00
Klaus Post 968342c732
Remove usage of go-ieproxy for windows (#10009)
There is a potential for deadlock on Windows 10
refer https://github.com/mattn/go-ieproxy/issues/17 

remove this dependency for now.
2020-07-10 12:08:14 -07:00
Harshavardhana 5c15656c55
support bootstrap client to use healthcheck restClient (#10004)
- reduce locker timeout for early transaction lock
  for more eagerness to timeout
- reduce leader lock timeout to range from 30sec to 1minute
- add additional log message during bootstrap phase
2020-07-10 09:26:21 -07:00
kannappanr efe9fe6124
azure: Return success when deleting non-existent object (#9981) 2020-07-10 08:30:23 -07:00
Klaus Post c850905e43
fix: threadwalk lockup under high load (#9992)
Main issue is that `t.pool[params]` should be `t.pool[oldest]`.

We add a bit more safety features for the code.

* Make writes to the endTimerCh non-blocking in all cases
   so multiple releases cannot lock up.
* Double check expectations.
* Shift down deletes with copy instead of truncating slice.
* Actually delete the oldest if we are above total limit.
* Actually delete the oldest found and not the current.
* Unexport the mutex so nobody from the outside can meddle with it.
2020-07-09 07:02:18 -07:00
Andreas Auernhammer a317a2531c
admin: new API for creating KMS master keys (#9982)
This commit adds a new admin API for creating master keys.
An admin client can send a POST request to:
```
/minio/admin/v3/kms/key/create?key-id=<keyID>
```

The name / ID of the new key is specified as request
query parameter `key-id=<ID>`.

Creating new master keys requires KES - it does not work with
the native Vault KMS (deprecated) nor with a static master key
(deprecated).

Further, this commit removes the `UpdateKey` method from the `KMS`
interface. This method is not needed and not used anymore.
2020-07-08 18:50:43 -07:00
Harshavardhana 2743d4ca87
fix: Add support for preserving mtime for replication (#9995)
This PR is needed for bucket replication support
2020-07-08 17:36:56 -07:00
Harshavardhana 6136a963c8
fix: bump the response header timeout for forwarder as well (#9994)
continuation of #9986, add more place where the lower timeout
comes into effect.
2020-07-08 10:55:24 -07:00
Anis Elleuch fa211f6a10
heal: Fix healing delete markers (#9989) 2020-07-07 20:54:09 -07:00
Harshavardhana 72e0745e2f
fix: migrate to go.etcd.io import path (#9987)
with the merge of https://github.com/etcd-io/etcd/pull/11823
etcd v3.5.0 will now have a properly imported versioned path

this fixes our pending migration to newer repo
2020-07-07 19:04:29 -07:00
Klaus Post aa4d1021eb
Remove timeout from putobject and listobjects (#9986)
Use a separate client for these calls that can take a long time.

Add request context to these so they are canceled when the client 
disconnects instead except for ListObject which doesn't have any equivalent.
2020-07-07 12:19:57 -07:00
Harshavardhana 93e7e4a0e5
fix: cors handling after gorilla mux update (#9980)
fixes #9979
2020-07-06 20:55:19 -07:00
Anis Elleuch c2f7cd1104
Consider errFileVersionNotFound during healing assessment (#9977)
Healing an object which has multiple versions was not working because
the healing code forgot to consider errFileVersionNotFound error as a
use case that needs healing
2020-07-06 08:09:48 -07:00
Anis Elleuch 4cf80f96ad
fix: lifecycle XML parsing errors with Versioning (#9974) 2020-07-05 09:08:42 -07:00
Anis Elleuch d4af132fc4
lifecycle: Expiry should not delete versions (#9972)
Currently, lifecycle expiry is deleting all object versions which is not
correct, unless noncurrent versions field is specified.

Also, only delete the delete marker if it is the only version of the
given object.
2020-07-04 20:56:02 -07:00
Harshavardhana c087a05b43
fix: simplify data structure before release (#9968)
- additionally upgrade to msgp@v1.1.2
- change StatModTime,StatSize fields as
  simple Size/ModTime
- reduce 50000 entries per List batch to 10000
  as client needs to wait too long to see the
  first batch some times which is not desired
  and it is worth we write the data as soon
  as we have it.
2020-07-04 12:25:53 -07:00
Harshavardhana cdb0e6ffed
support proper values for listMultipartUploads/listParts (#9970)
object KMS is configured with auto-encryption,
there were issues when using docker registry -
this has been left unnoticed for a while.

This PR fixes an issue with compatibility.

Additionally also fix the continuation-token implementation
infinite loop issue which was missed as part of #9939

Also fix the heal token to be generated as a client
facing value instead of what is remembered by the
server, this allows for the server to be stateless 
regarding the token's behavior.
2020-07-03 19:27:13 -07:00
Harshavardhana 03b84091fc
auto enable versioning with object locking (#9967)
this is to preserve versioning for object-locked
buckets from current release code.
2020-07-03 15:30:06 -07:00
Anis Elleuch 2be20588bf
Reroute requests based token heal/listing (#9939)
When manual healing is triggered, one node in a cluster will 
become the authority to heal. mc regularly sends new requests 
to fetch the status of the ongoing healing process, but a load 
balancer could land the healing request to a node that is not 
doing the healing request.

This PR will redirect a request to the node based on the node 
index found described as part of the client token. A similar
technique is also used to proxy ListObjectsV2 requests
by encoding this information in continuation-token
2020-07-03 11:53:03 -07:00
Harshavardhana e59ee14f40
Tune tcp keep-alives with new kernel timeout options (#9963)
For more deeper understanding https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
2020-07-03 10:03:41 -07:00
Anis Elleuch 21a37e3393
fix: ListObjectVersions should return ordered Version & DeleteMarker (#9959)
The S3 specification says that versions are ordered in the response of
list object versions.

mc snapshot needs this to know which version comes first especially when
two versions have the same exact last-modified field.
2020-07-03 09:15:44 -07:00
Harshavardhana 810a4f0723
fix: return proper errors Get/HeadObject for deleteMarkers (#9957) 2020-07-02 16:17:27 -07:00
Krishna Srinivas 4c266df863
fix: proxy ListObjects request to one of the server based on hash(bucket) (#9881) 2020-07-02 10:56:22 -07:00
Klaus Post abd999f64a
fix: list object versions in distributed setup (#9958)
Remove calls to `WalkVersions` was calling the wrong endpoint, 
so unless quorum could be reached with local disks no results 
would ever be returned.
2020-07-02 10:29:50 -07:00
Benjamin Sodenkamp 648cb13e02
Added 'close' to results channel in Walk() (#9956) 2020-07-01 14:29:45 -07:00
Harshavardhana 174f428571
add additional fdatasync before close() on writes (#9947) 2020-07-01 10:57:23 -07:00
Harshavardhana 5388ae4acb
make sure to delete data-usage cache upon bucket deletes (#9952) 2020-07-01 10:55:28 -07:00
kannappanr 5089a7167d
Handle empty retention in get/put object retention (#9948)
Fixes #9943
2020-06-30 16:44:24 -07:00
Harshavardhana c0ac25bfff
fix: readiness needs to be like liveness (#9941)
Readiness as no reasoning to be cluster scope
because that is not how the k8s networking works
for pods, all the pods to a deployment are not
sharing the network in a singleton. Instead they
are run as local scopes to themselves, with
readiness failures the pod is potentially taken
out of the network to be resolvable - this
affects the distributed setup in myriad of
different ways.

Instead readiness should behave like liveness
with local scope alone, and should be a dummy
implementation.

This PR all the startup times and overal k8s
startup time dramatically improves.

Added another handler called as `/minio/health/cluster`
to understand the cluster scope health.
2020-06-30 11:28:27 -07:00
Klaus Post 27a1f3ed2b
fs: Check if cache root was added (#9945)
Fixes #9942
2020-06-30 09:32:36 -07:00
Harshavardhana 91817d0d1a
fix: implement generic Walk for gateway (#9938)
Walk() functionality was missing on gateway
implementations leading to missing functionality
for the browser UI such as remove multiple objects,
download as zip file etc.

This PR brings a generic implementation across
all gateway's, it is not required to repeat the
same code in all gateway's
2020-06-29 17:07:23 -07:00
poornas 55a3b071ea
Allow optionally to disable range caching. (#9908)
The default behavior is to cache each range requested
to cache drive. Add an environment variable
`MINIO_RANGE_CACHE` - when set to off, it disables
range caching and instead downloads entire object
in the background.

Fixes #9870
2020-06-29 13:25:29 -07:00
Harshavardhana a38ce29137
fix: simplify background heal and trigger heal items early (#9928)
Bonus fix during versioning merge one of the PR was missing
the offline/online disk count fix from #9801 port it correctly
over to the master branch from release.

Additionally, add versionID support for MRF

Fixes #9910
Fixes #9931
2020-06-29 13:07:26 -07:00
Harshavardhana 4bba2cd034
fix: disallow versioning to be suspended with object lock (#9930) 2020-06-28 08:15:15 -07:00
Harshavardhana f7f12b8604
fix: crash in storage rest client due to spurious query params (#9924)
regression got introduced in dee3cf2d7f
when the DeleteVersion API was changed, but the corresponding query
params were left in-tact.
2020-06-26 16:49:49 -07:00
Praveen raj Mani cf5d051afc
update notification rulesMap when reloading bucketMetadata (#9917) 2020-06-26 13:17:31 -07:00
Harshavardhana 2f681bed57
fix: pop entries from each drives in parallel (#9918) 2020-06-25 23:20:12 -07:00
Praveen raj Mani b1705599e1
Fix config leaks and deprecate file-based config setters in NAS gateway (#9884)
This PR has the following changes

- Removing duplicate lookupConfigs() calls.
- Deprecate admin config APIs for NAS gateways. This will avoid repeated reloads of the config from the disk.
- WatchConfigNASDisk will be removed
- Migration guide for NAS gateways users to migrate to ENV settings.

NOTE: THIS PR HAS A BREAKING CHANGE

Fixes #9875

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-06-25 15:59:28 +05:30
Harshavardhana f4b2ed2a92
fix: filter list buckets operation with ListObjects perm (#9907)
fix regression introduced in #9305
2020-06-23 23:21:11 -07:00
Harshavardhana dee3cf2d7f
fix: preserve modTime for DeleteMarker on remote disks (#9905) 2020-06-23 10:20:31 -07:00
Harshavardhana 21058c34d0
add some description of xl.meta (#9901) 2020-06-22 17:27:54 -07:00
Harshavardhana 5b1e6c7dbc
Add check for object statTime non-negative (#9899) 2020-06-22 14:33:58 -07:00
Harshavardhana e92434c2e7
fix: support client customized scopes for OpenID (#9880)
Fixes #9238
2020-06-22 12:08:50 -07:00
Klaus Post cae09d8b84
crawler: Wait max 1 second (#9894)
Add 1-second timeout to crawler wait.

This will make the crawler able to run, albeit very, 
very slowly on high load servers.
2020-06-22 11:57:22 -07:00
Harshavardhana c54e3b4ea3
Add support for minioreleaser a fork for goreleaser (#9890)
This is to support building containers for multiple
platforms, rpms and debs all in a single build process

https://github.com/harshavardhana/minioreleaser
2020-06-22 08:26:40 -07:00
Klaus Post 972d876ca9
Do not select zones with <5% free after upload (#9877)
Looking into full disk errors on zoned setup. We don't take the
5% space requirement into account when selecting a zone.

The interesting part is that even considering this we don't
know the size of the object the user wants to upload when
they do multipart uploads.

It seems quite defensive to always upload multiparts to
the zone where there is the most space since all load will
be directed to a part of the cluster.

In these cases we make sure it can at least hold a 1GiB file
and we disadvantage fuller zones more by subtracting the
expected size before weighing.
2020-06-20 06:36:44 -07:00
Harshavardhana b8cb21c954
allow more than N number of locks in TopLocks (#9883) 2020-06-20 06:33:01 -07:00
Harshavardhana 67062840c1
fix: perform CopyObject under more conditions (#9879)
- x-amz-storage-class specified CopyObject
  should proceed regardless, its not a precondition
- sourceVersionID is specified CopyObject should
  proceed regardless, its not a precondition
2020-06-19 13:53:45 -07:00
Harshavardhana 9626a981bc
fix: Preserve old data appropriately (#9873)
This PR fixes all the below scenarios
and handles them correctly.

- existing data/bucket is replaced with
  new content, no versioning enabled old
  structure vanishes.

- existing data/bucket - enable versioning
  before uploading any data, once versioning
  enabled upload new content, old content
  is preserved.

- suspend versioning on the bucket again, now
  upload content again the old content is purged
  since that is the default "null" version.

Additionally sync data after xl.json -> xl.meta
rename(), to avoid any surprises if there is a
crash during this rename operation.
2020-06-19 10:58:17 -07:00
Harshavardhana b912c8f035
fix: generate new version when replacing metadata in CopyObject (#9871) 2020-06-19 08:44:51 -07:00
Harshavardhana fa13fe2184
allow loading some from config and some values from ENVs (#9872)
A regression perhaps introduced in #9851
2020-06-18 17:31:56 -07:00
Harshavardhana 85a1956e5c
Avoid duplicate object holding locks (#9867)
Fixes #9866
2020-06-18 10:25:07 -07:00
Harshavardhana 7ed1077879
Add a custom healthcheck function for online status (#9858)
- Add changes to ensure remote disks are not
  incorrectly taken online if their order has
  changed or are incorrect disks.
- Bring changes to peer to detect disconnection
  with separate Health handler, to avoid a
  rather expensive call GetLocakDiskIDs()
- Follow up on the same changes for Lockers
  as well
2020-06-17 14:49:26 -07:00
Harshavardhana 94424e14d7
fix: rename legacy xl.json to xl.meta properly in ListDir() (#9863) 2020-06-17 13:58:38 -07:00
Harshavardhana e79874f58e
[feat] Preserve version supplied by client (#9854)
Just like GET/DELETE APIs it is possible to preserve
client supplied versionId's, of course the versionIds
have to be uuid, if an existing versionId is found
it is overwritten if no object locking policies
are found.

- PUT /bucketname/objectname?versionId=<id>
- POST /bucketname/objectname?uploads=&versionId=<id>
- PUT /bucketname/objectname?verisonId=<id> (with x-amz-copy-source)
2020-06-17 11:13:41 -07:00
Klaus Post 8aae8b1d27
Put an upper limit on walk pool sizes (#9848)
Fixes potentially infinite allocations, especially in FS mode, 
since lookups live up to 30 minutes. Limit walk pool sizes to 50 
max parameter entries and 4 concurrent operations with the same
parameters.

Fixes #9835
2020-06-17 09:52:07 -07:00
Klaus Post 1813ff9dfa
Re-add missing bucket bloom filters (#9861) 2020-06-17 08:54:41 -07:00
Harshavardhana 4ac31ea82b
fix: find current location of object multi-zones (#9840)
PutObject on multiple-zone with versioning would not
overwrite the correct location of the object if the
object has delete marker, leading to duplicate objects
on two zones.

This PR fixes by adding affinity towards delete marker
when GetObjectInfo() returns error, use the zone index
which has the delete marker.
2020-06-17 08:33:14 -07:00
Harshavardhana 67ca157329
fix: content-md5 is not mandatory for PutBucketVersioning (#9852) 2020-06-17 07:59:08 -07:00
Harshavardhana f5e1b3d09e
fix: initialize config once per startup (#9851) 2020-06-16 20:15:21 -07:00
Klaus Post 3ba4804d6c
Move online status to REST client (#9808) 2020-06-16 18:59:32 -07:00
Harshavardhana 216de230e2
remove unnecessary log for setMaxResources (#9856)
fixes #9855
2020-06-16 18:57:29 -07:00
ebozduman a91cfa03e7
extend the HINT on backend ownership and its contents (#9846) 2020-06-16 15:32:29 -07:00
Harshavardhana 087aaaf894
fix: save deleteMarker properly, precision upto UnixNano() (#9843) 2020-06-16 07:54:27 -07:00
Harshavardhana cbb7a09376
Allow etcd, cache setup to exit when starting gateway mode (#9842)
- Initialize etcd once per call
- Fail etcd, cache setup pro-actively for gateway setups
- Support deleting/updating bucket notification,
  tagging, lifecycle, sse-encryption
2020-06-15 22:09:39 -07:00
Harshavardhana 1a956424e0 Add logs when quorum is lost during readiness checks (#9839) 2020-06-15 13:11:22 -07:00
Harshavardhana f9aa239973
fix: export prometheus metrics for cache GC triggers (#9815)
Bonus change to use channel to serialize triggers,
instead of using atomic variables. More efficient
mechanism for synchronization.

Co-authored-by: Nitish Tiwari <nitish@minio.io>
2020-06-15 09:05:35 -07:00
Anis Elleuch 2073b79633
fix: Remove unnecessary debug log line (#9834) 2020-06-15 08:55:33 -07:00
Anis Elleuch 63e9005f01 fix: Avoid updating object tags on failed disks (#9819) 2020-06-14 10:53:07 -07:00
Harshavardhana d55f4336ae
preserve context per request for local locks (#9828)
In the Current bug we were re-using the context
from previously granted lockers, this would
lead to lock timeouts for existing valid
read or write locks, leading to premature
timeout of locks.

This bug affects only local lockers in FS
or standalone erasure coded mode. This issue
is rather historical as well and was present
in lsync for some time but we were lucky to
not see it.

Similar changes are done in dsync as well
to keep the code more familiar

Fixes #9827
2020-06-14 07:43:10 -07:00
ethan ho 535efd34a0
Fix peer server update failure (#9824)
When updating all servers following the constructions of mc update,
only the endpoint server will be updated successfully.
All the other peer servers' updating failed due to the error below:
--------------------------------------------------------------------------
parsing time "2006-01-02T15:04:05Z07:00" as "<release version>": cannot parse "-01-02T15:04:05Z07:00" as "0-" 
--------------------------------------------------------------------------
2020-06-13 07:12:49 -07:00
Harshavardhana 4915433bd2
Support bucket versioning (#9377)
- Implement a new xl.json 2.0.0 format to support,
  this moves the entire marshaling logic to POSIX
  layer, top layer always consumes a common FileInfo
  construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
  for object placements.

Fixes #2111
2020-06-12 20:04:01 -07:00
Klaus Post 43d6e3ae06
merge object lifecycle checks into usage crawler (#9579) 2020-06-12 10:28:21 -07:00
kannappanr 225b812b5e
Update minio-go library to latest (#9813) 2020-06-12 10:18:42 -07:00
Harshavardhana 96ed0991b5
fix: optimize IAM users load, add fallback (#9809)
Bonus fix, load service accounts properly
when service accounts were generated with
LDAP
2020-06-11 14:11:30 -07:00
Harshavardhana a42df3d364
Allow idiomatic usage of middlewares in gorilla/mux (#9802)
Historically due to lack of support for middlewares
we ended up writing wrapped handlers for all
middlewares on top of the gorilla/mux, this causes
multiple issues when we want to let's say

- Overload r.Body with some custom implementation
  to track the incoming Reads()
- Add other sort of top level checks to avoid
  DDOSing the server with large incoming HTTP
  bodies.

Since 1.7.x release gorilla/mux provides proper
use of middlewares, which are honored by the muxer
directly. This makes sure that Go can honor its
own internal ServeHTTP(w, r) implementation where
Go net/http can wrap into its own customer readers.

This PR as a side-affect fixes rare issues of client
hangs which were reported in the wild but never really
understood or fixed in our codebase.

Fixes #9759
Fixes #7266
Fixes #6540
Fixes #5455
Fixes #5150

Refer https://github.com/boto/botocore/pull/1328 for
one variation of the same issue in #9759
2020-06-11 08:19:55 -07:00
Harshavardhana ff94b1b0a9
isEndpointConnected should take local disk inputs (#9803)
PR #9801 while it is correct, the loop isEndpointConnected()
was changed to rely on endpoint.String() which has the host
information as well, which is not correct value as input to
detect if the disk is down or up, if endpoint is local use
its local path value instead.
2020-06-11 08:05:25 -07:00
Andreas Auernhammer b1845c6c83
kes: try to auto. create master key if not present (#9790)
This commit changes the data key generation such that
if a MinIO server/nodes tries to generate a new DEK
but the particular master key does not exist - then
MinIO asks KES to create a new master key and then
requests the DEK again.

From now on, a SSE-S3 master key must not be created
explicitly via: `kes key create <key-name>`.
Instead, it is sufficient to just set the env. var.
```
export MINIO_KMS_KES_KEY_NAME=<key-name>
```

However, the MinIO identity (mTLS client certificate)
must have the permission to access the `/v1/key/create/`
API. Therefore, KES policy for MinIO must look similar to:
```
[
  /v1/key/create/<key-name-pattern>
  /v1/key/generate/<key-name-pattern>
  /v1/key/decrypt/<key-name-pattern>
]
```
However, in our guides we already suggest that.
See e.g.: https://github.com/minio/kes/wiki/MinIO-Object-Storage#kes-server-setup

***

The ability to create master keys on request may also be
necessary / useful in case of SSE-KMS.
2020-06-11 02:00:47 -07:00
Harshavardhana 62b1da3e2c
fix offline disk calculation (#9801)
Current code was relying on globalEndpoints as
the source of secondary truth to obtain
the missing endpoints list when the disk
is offline, this is problematic

- there is no way to know if the getDisks()
  returned endpoints total is same as the
  ones list of globalEndpoints and it
  belongs to a particular set.
- there is no order guarantee as getDisks()
  is ordered as per format.json, globalEndpoints
  may not be, so potentially end up including
  incorrect endpoints.

To fix this bring getEndpoints() just like getDisks()
to ensure that consistently ordered endpoints are
always available for us to ensure that returned values
are consistent with what each erasure set would observe.
2020-06-10 17:10:31 -07:00
poornas d26b24f670
avoid storing X-Amz-Tagging-Directive in metadata (#9800) 2020-06-10 14:29:24 -07:00
kannappanr 2c372a9894
Send Partscount only when partnumber is specified (#9793)
Fixes #9789
2020-06-10 09:22:15 -07:00
poornas 3d3b75fb8d
Avoid overwriting object tags when changing lock (#9794) 2020-06-10 08:16:30 -07:00
Klaus Post 142b057be8
Check object names on windows (#9798)
Uploading files with names that could not be written to disk 
would result in "reduce your request" errors returned.

Instead check explicitly for disallowed characters and reject 
files with `Object name contains unsupported characters.`
2020-06-10 08:14:22 -07:00
Harshavardhana 4790868878
allow background IAM load to speed up startup (#9796)
Also fix healthcheck handler to run success
only if object layer has initialized fully
for S3 API access call.
2020-06-09 19:19:03 -07:00
Harshavardhana 342ade03f6
deprecate listDir usage for healing (#9792)
listDir was incorrectly used for healing which
is slower, instead use Walk() to heal the entire
set.
2020-06-09 17:09:19 -07:00
P R 9407dbf387
display proper used space based on disk usage (#9551)
Fixes #9346
2020-06-09 15:05:39 -07:00
Harshavardhana 423aeb0d81
allow large buffer to list more entries per directory (#9785) 2020-06-09 09:44:50 -07:00
Anis Elleuch 790323ac37
lifecycle: Fix object expiration date (#9791)
re-use PredictExpiryTime() in ComputeAction()
2020-06-09 09:40:53 -07:00
Harshavardhana febe9cc26a
fix: avoid timer leaks in dsync/lsync (#9781)
At a customer setup with lots of concurrent calls
it can be observed that in newRetryTimer there
were lots of tiny alloations which are not
relinquished upon retries, in this codepath
we were only interested in re-using the timer
and use it wisely for each locker.

```
(pprof) top
Showing nodes accounting for 8.68TB, 97.02% of 8.95TB total
Dropped 1198 nodes (cum <= 0.04TB)
Showing top 10 nodes out of 79
      flat  flat%   sum%        cum   cum%
    5.95TB 66.50% 66.50%     5.95TB 66.50%  time.NewTimer
    1.16TB 13.02% 79.51%     1.16TB 13.02%  github.com/ncw/directio.AlignedBlock
    0.67TB  7.53% 87.04%     0.70TB  7.78%  github.com/minio/minio/cmd.xlObjects.putObject
    0.21TB  2.36% 89.40%     0.21TB  2.36%  github.com/minio/minio/cmd.(*posix).Walk
    0.19TB  2.08% 91.49%     0.27TB  2.99%  os.statNolog
    0.14TB  1.59% 93.08%     0.14TB  1.60%  os.(*File).readdirnames
    0.10TB  1.09% 94.17%     0.11TB  1.25%  github.com/minio/minio/cmd.readDirN
    0.10TB  1.07% 95.23%     0.10TB  1.07%  syscall.ByteSliceFromString
    0.09TB  1.03% 96.27%     0.09TB  1.03%  strings.(*Builder).grow
    0.07TB  0.75% 97.02%     0.07TB  0.75%  path.(*lazybuf).append
```
2020-06-08 11:28:40 -07:00
Praveen raj Mani 2ce2e88adf
Support mTLS Authentication in Webhooks (#9777) 2020-06-08 05:55:44 -07:00
Harshavardhana c7599d323b
fix: throw error if symmetry cannot be obtained (#9780)
For example `{1...17}/{1...52}` symmetrical
distribution of drives cannot be obtained

- Because 17 is a prime number
- Is not divisible by any pre-defined setCounts i.e
  from 1 to 16
2020-06-06 22:13:48 -07:00
Harshavardhana d93bdea433
fix remove LDAPPassword from audit logs (#9773)
the previous fix for #9707 was not correct,
fix this properly passing the right filter
keys to be filtered from the audit
log output.

Fixes #9767
2020-06-04 22:07:55 -07:00
Harshavardhana 5e529a1c96
simplify context timeout for readiness (#9772)
additionally also add CORS support to restrict
for specific origin, adds a new config and
updated the documentation as well
2020-06-04 14:58:34 -07:00
Harshavardhana 5686a7e273
fix NAS gateway support for policy/notification (#9765)
Fixes #9764
2020-06-03 13:18:54 -07:00
Harshavardhana 566e0e2048
allow deleting of dropped multiparts (#9753)
bonus change trigger MRF heal when single
offline disk is found, break out early.
2020-06-02 15:27:03 -07:00
Anis Elleuch 3aad09be28
heal: Fix passing healing opts (#9756)
Manual healing (as background healing) creates a heal task with a
possiblity to override healing options, such as deep or normal mode.

Use a pointer type in heal opts so nil would mean use the default
healing options.
2020-06-02 09:07:16 -07:00
Harshavardhana f0358acb32
concurrently load bucket metadata (#9749) 2020-06-01 22:32:53 -07:00
Anis Elleuch fd0de4ab32
azure: Show better message when credentials are wrong (#9748) 2020-06-01 18:23:48 -07:00
Anis Elleuch 73a308502f
Relax content-md5 requirement in set encryption handler (#9750)
aws cli fails to set a bucket encryption configuration to MinIO server.
The reason is that aws cli does not send MD5-Content header. It seems
that MD5-Content is not required anymore.

This commit also returns Not Implemented header early to help mint tests
to ignore testing this API in gateway modes.
2020-06-01 18:08:19 -07:00
Anis Elleuch bd59f150b8
azure: Implement CopyPart API (#9747) 2020-06-01 11:12:18 -07:00
Harshavardhana f90422a890
fix prometheus calculation of offline disks per instance (#9744)
This was a regression introduced in 9baeda7 for prometheus
calculation of offline disks which should be local to
an instance.

fixes #9742
2020-06-01 07:35:40 -07:00
Harshavardhana 8befedef14
simplify FS multipart cleanup (#9740)
fixes #9671
2020-05-30 13:56:31 -07:00
Nathan Brown 2af3004409
Use registry to check Atime support on Windows (#9741) 2020-05-30 09:47:42 -07:00
Harshavardhana 38ee40d59c
move to upstream code colinmarc/hdfs (#9738)
- supports SASL based authentication now
- upgrades to new changes in gokrb library
- implement force delete feature

Fixes #8206
2020-05-29 18:38:50 -07:00
kannappanr d583f1ac0e
check if container is empty before invoking DeleteContainer (#9733) 2020-05-29 13:24:39 -07:00
Harshavardhana 2bcb02f628
Avoid '\n' from constant strings (#9737)
Fixes #9736
2020-05-29 11:40:57 -07:00
Klaus Post 167ddf9c9c
Workaround for Windows Docker Engine 19.03.8 (#9735)
Add workaround for issue preventing servers from starting on 
Windows Docker Engine 19.03.8

Fixes #9726
2020-05-29 07:05:19 -07:00
Anton Huck f833e41e69
IAM: Fix nil panic due to uninit. iamGroupPolicyMap. Fixes #9730 (#9734) 2020-05-29 06:13:54 -07:00
Harshavardhana 41688a936b
fix: CopyObject behavior on expanded zones (#9729)
CopyObject was not correctly figuring out the correct
destination object location and would end up creating
duplicate objects on two different zones, reproduced
by doing encryption based key rotation.
2020-05-28 14:36:38 -07:00
Harshavardhana b2db8123ec
Preserve errors returned by diskInfo to detect disk errors (#9727)
This PR basically reverts #9720 and re-implements it differently
2020-05-28 13:03:04 -07:00
Harshavardhana b330c2c57e
Introduce simpler GetMultipartInfo call for performance (#9722)
Advantages avoids 100's of stats which are needed for each
upload operation in FS/NAS gateway mode when uploading a large
multipart object, dramatically increases performance for
multipart uploads by avoiding recursive calls.

For other gateway's simplifies the approach since
azure, gcs, hdfs gateway's don't capture any specific
metadata during upload which needs handler validation
for encryption/compression.

Erasure coding was already optimized, additionally
just avoids small allocations of large data structure.

Fixes #7206
2020-05-28 12:36:20 -07:00
kannappanr 7214a0160a
allow bucket policy to set/removed in NAS gateway (#9706) 2020-05-28 08:31:16 -07:00
Anis Elleuch 375b79f11b
storage: Implement GetDiskID request in REST server side (#9720)
GetDiskID() in storage rest client does not really issue a REST request
to the remote disk, but returns an in-memory value instead.

However, GetDiskID() should return an error when format.json is not
found or for other similar issues (unmounted disks, etc..)

GetDiskID() is only called when formatting disks and getting storage
informatio, hence this commit should not have a performance degradation.
2020-05-28 08:17:42 -07:00
Harshavardhana 3da1869d5e
Avoid double reads on metadata during GetObject() (#9719)
Overall TTFB can see a dramatic improvement with
this change - did not do any benchmark as such
but the change itself is self-explanatory
2020-05-27 16:14:26 -07:00
Harshavardhana 7cedc5369d
fix: send valid claims in AuditLogs for browser requests (#9713)
Additionally also fix STS logs to filter out LDAP
password to be sent out in audit logs.

Bonus fix handle the reload of users properly by
making sure to preserve the newer users during the
reload to be not invalidated.

Fixes #9707
Fixes #9644
Fixes #9651
2020-05-27 12:38:44 -07:00
Harshavardhana 53aaa5d2a5
Export bucket usage counts as part of bucket metrics (#9710)
Bonus fixes in quota enforcement to use the
new datastructure and use timedValue to cache
a value/reload automatically avoids one less
global variable.
2020-05-27 06:45:43 -07:00
P R 9d39fb3604
add copyobject tagging replace directive for gateway (#9711) 2020-05-26 17:32:53 -07:00
Klaus Post 4a007e3767
Prefer local disks when fetching data blocks (#9563)
If the requested server is part of the set this will always read 
from the local disk, even if the disk contains a parity shard. 
In default setup there is a 50% chance that at least 
one shard that otherwise would have been fetched remotely 
will be read locally instead.

It basically trades RPC call overhead for reed-solomon. 
On distributed localhost this seems to be fairly break-even, 
with a very small gain in throughput and latency. 
However on networked servers this should be a bigger

1MB objects, before:

```
Operation: GET. Concurrency: 32. Hosts: 4.

Requests considered: 76257:
 * Avg: 25ms 50%: 24ms 90%: 32ms 99%: 42ms Fastest: 7ms Slowest: 67ms
 * First Byte: Average: 23ms, Median: 22ms, Best: 5ms, Worst: 65ms

Throughput:
* Average: 1213.68 MiB/s, 1272.63 obj/s (59.948s, starting 14:45:44 CEST)
```

After:
```
Operation: GET. Concurrency: 32. Hosts: 4.

Requests considered: 78845:
 * Avg: 24ms 50%: 24ms 90%: 31ms 99%: 39ms Fastest: 8ms Slowest: 62ms
 * First Byte: Average: 22ms, Median: 21ms, Best: 6ms, Worst: 57ms

Throughput:
* Average: 1255.11 MiB/s, 1316.08 obj/s (59.938s, starting 14:43:58 CEST)
```

Bonus fix: Only ask for heal once on an object.
2020-05-26 16:47:23 -07:00
Klaus Post 95814359bd
cache disk info to avoid repeated calls (#9682)
This value is requested on every upload when there are multiple zones.

Since this will result in an RPC call to every remote disk this scales 
quite badly in a distributed setup. Load every 1second interval.

2 servers, localhost only. In large distributed setups much bigger 
gains can be expected.

```
Operations: 21743 -> 22454
* Average: +3.28% (+0.0 MiB/s) throughput, +3.28% (+11.9) obj/s
* Fastest: +3.37% (+0.0 MiB/s) throughput, +3.37% (+13.0) obj/s
* 50% Median: +3.03% (+0.0 MiB/s) throughput, +3.03% (+11.2) obj/s
* Slowest: +8.03% (+0.0 MiB/s) throughput, +8.03% (+22.8) obj/s
```

For easy management of this a generic helper has been added.
2020-05-26 12:52:24 -07:00
Harshavardhana d0ae69087c
fix: add proper errors for disks with preexisting content (#9703) 2020-05-26 09:32:33 -07:00
Harshavardhana 7ea026ff1d
fix: reply back user-metadata in lower case form (#9697)
some clients such as veeam expect the x-amz-meta to
be sent in lower cased form, while this does indeed
defeats the HTTP protocol contract it is harder to
change these applications, while these applications
get fixed appropriately in future.

x-amz-meta is usually sent in lowercased form
by AWS S3 and some applications like veeam
incorrectly end up relying on the case sensitivity
of the HTTP headers.

Bonus fixes

 - Fix the iso8601 time format to keep it same as
   AWS S3 response
 - Increase maxObjectList to 50,000 and use
   maxDeleteList as 10,000 whenever multi-object
   deletes are needed.
2020-05-25 16:51:32 -07:00
Harshavardhana 6e0575a53d
Revert "Disable crawler in FS/NAS gateway mode (#9695)" (#9702)
This reverts commit eba423bb9d.

Additionally also address the FS crawler to properly
calculate the sizes for encrypted/compressed content.
2020-05-25 11:32:53 -07:00
Harshavardhana eba423bb9d
Disable crawler in FS/NAS gateway mode (#9695)
No one really uses FS for large scale accounting
usage, neither we crawl in NAS gateway mode. It is
worthwhile to simply disable this feature as its
not useful for anyone.

Bonus disable bucket quota ops as well in, FS
and gateway mode
2020-05-25 00:17:52 -07:00
Erkki Eilonen 301de169e9
in cache build ranges metadata as needed (#9698) 2020-05-25 00:17:03 -07:00
Harshavardhana 0c71ce3398
fix size accounting for encrypted/compressed objects (#9690)
size calculation in crawler was using the real size
of the object instead of its actual size i.e either
a decrypted or uncompressed size.

this is needed to make sure all other accounting
such as bucket quota and mcs UI to display the
correct values.
2020-05-24 11:19:17 -07:00
Krishna Srinivas 7d19ab9f62
readiness returns error quickly if any of the set is down (#9662)
This PR adds a new configuration parameter which allows readiness
check to respond within 10secs, this can be reduced to a lower value
if necessary using 

```
mc admin config set api ready_deadline=5s
```

 or

```
export MINIO_API_READY_DEADLINE=5s
```
2020-05-23 17:38:39 -07:00
P R 3f6d624c7b
add gateway object tagging support (#9124) 2020-05-23 11:09:35 -07:00
Harshavardhana c138272d63
reject object lock requests on existing buckets (#9684)
a regression was introduced fix it to ensure that we
do not allow object locking settings on existing buckets
without object locking
2020-05-23 10:01:01 -07:00
Harshavardhana 7dbfea1353
avoid net/http ErrorLog for consistent logging experience (#9672)
net/http exposes ErrorLog but it is log.Logger
instance not an interface which can be overridden,
because of this reason the logging is interleaved
sometimes with TLS with messages like this on the
server

```
http: TLS handshake error from 139.178.70.188:63760: EOF
```

This is bit problematic for us as we need to have
consistent logging view for allow --json or --quiet
flags.

With this PR we ensure that this format is adhered to.
2020-05-22 21:59:18 -07:00
Sidhartha Mani c121d27f31
progressively report obd results (#9639) 2020-05-22 17:56:45 -07:00
Anis Elleuch 43c19a6b82
nas: ensure loading of bucket notifications during startup (#9681) 2020-05-22 11:55:30 -07:00
Harshavardhana e45c90060f
remove references for deprecated dockerfiles and deployment styles (#9675) 2020-05-22 08:40:59 -07:00
Harshavardhana d15042470e
add missing signature v2 query params (#9670) 2020-05-21 18:51:23 -07:00
Anis Elleuch cdf4815a6b
Add x-amz-expiration header in some S3 responses (#9667)
x-amz-expiration is described in the S3 specification as a header which
indicates if the object in question will expire any time in the future.
2020-05-21 14:12:52 -07:00
kannappanr fade056244
filter all encryption headers in gateway (#9661)
fixes #9655
2020-05-21 11:07:50 -07:00
Harshavardhana a546047c95
keep bucket metadata fields to be consistent (#9660)
added bonus reload bucket metadata always after
a successful MakeBucket, current we were only
doing it with object locking enabled.
2020-05-21 11:03:59 -07:00
ebozduman 2896e780ae
fixes misleading assume role error msgs (#9642) 2020-05-21 09:09:18 -07:00
Harshavardhana baa30f4289
reload bucket metadata outside the locker (#9659) 2020-05-20 14:11:13 -07:00
Harshavardhana 189c861835
fix: remove LDAP groups claim and store them on server (#9637)
Groups information shall be now stored as part of the
credential data structure, this is a more idiomatic
way to support large LDAP groups.

Avoids the complication of setups where LDAP groups
can be in the range of 150+ which may lead to excess
HTTP header size > 8KiB, to reduce such an occurrence
we shall save the group information on the server as
part of the credential data structure.

Bonus change support multiple mapped policies, across
all types of users.
2020-05-20 11:33:35 -07:00
Harshavardhana 6656fa3066
simplify further bucket configuration properly (#9650)
This PR is a continuation from #9586, now the
entire parsing logic is fully merged into
bucket metadata sub-system, simplify the
quota API further by reducing the remove
quota handler implementation.
2020-05-20 10:18:15 -07:00
Praveen raj Mani 0cc2ed04f5
humanize `timeToFirstByte` and `timeToResponse` upto nanoseconds (#9641) 2020-05-19 18:34:02 -07:00
Anis Elleuch 9baeda781a
fix storage info output with unordered endpoints arguments (#9610)
Shuffling arguments that we pass to MinIO server are supported. However,
when that happens, Prometheus returns wrong information about disks usage
and online/offline status.

The commit fixes the issue by avoiding relying on xl.endpoints since
it is not ordered.
2020-05-19 14:27:20 -07:00
Harshavardhana bd032d13ff
migrate all bucket metadata into a single file (#9586)
this is a major overhaul by migrating off all
bucket metadata related configs into a single
object '.metadata.bin' this allows us for faster
bootups across 1000's of buckets and as well
as keeps the code simple enough for future
work and additions.

Additionally also fixes #9396, #9394
2020-05-19 13:53:54 -07:00
Harshavardhana d31eaddba3
fix: avoid double body reads in SelectObject call (#9638)
Bonus fix handle encryption headers in response
properly for both notification and response to
the client.
2020-05-19 02:01:08 -07:00
poornas 3202f78f0f
Fix cache metadata update for range GET (#9636)
This was inadvertently deleting cached ranges
because HTTPRangeSpec was not being passed down

fixes #9597
2020-05-18 18:33:43 -07:00
Harshavardhana 6de410a0aa
fix: possiblity of double write lockers on same resource (#9616)
To avoid this issue with refCounter refactor the code
such that

- locker() always increases refCount upon success
- unlocker() always decrements refCount upon success
  (as a special case removes the resource if the
  refCount is zero)

By these two assumptions we are able to see that we
are never granted two write lockers in any situation.

Thanks to @vcabbage for writing a nice reproducer.
2020-05-18 17:33:35 -07:00
Klaus Post 1847f17f50
Set Deployment ID before starting handlers (#9635)
Global handler ID is added to response headers, so initialize it before the server starts.

Fixes #9634
2020-05-18 11:35:05 -07:00
Harshavardhana 1bc32215b9
enable full linter across the codebase (#9620)
enable linter using golangci-lint across
codebase to run a bunch of linters together,
we shall enable new linters as we fix more
things the codebase.

This PR fixes the first stage of this
cleanup.
2020-05-18 09:59:45 -07:00
Anis Elleuch 96009975d6
relax validation when loading lifecycle document from the backend (#9612) 2020-05-18 08:33:43 -07:00
Harshavardhana de9b391db3
fix: Disable presigned without appropriate policy (#9621)
Fixes #9590
2020-05-17 23:38:52 -07:00
kannappanr a62572fb86
Check for address flags in all positions (#9615)
Fixes #9599
2020-05-17 08:46:23 -07:00
poornas 011a2c0b78
Add docs for bucket quota feature (#9503)
This PR also adds a check to not enforce
bucket quota for server-side metadata copy
of an object onto itself.
2020-05-16 19:27:33 -07:00
Harshavardhana 814ddc0923
add missing admin actions, enhance AccountUsageInfo (#9607) 2020-05-15 18:16:45 -07:00
Harshavardhana d348ec0f6c
avoid double listObjectParts calls improves performance (#9606)
this PR is to avoid double calls across multiple calls
in APIs

- CopyObjectPart
- PutObjectPart
2020-05-15 08:06:45 -07:00
Harshavardhana b730bd1396
fix: possible race in FS local lockMap (#9598) 2020-05-14 23:59:07 -07:00
Klaus Post 56e0c6adf8
Track if bloom filter is dirty (#9601)
Only save bloom filter on cycles and updates.

Fixes #9600
2020-05-14 21:46:36 -07:00
Anis Elleuch f44a960dcd
tests: Fix one multi-delete test failure in Windows CI (#9602)
There is a disparency of behavior under Linux & Windows about
the returned error when trying to rename a non existant path.

err := os.Rename("/path/does/not/exist", "/tmp/copy")

Linux:
  isSysErrNotDir(err) = false
  os.IsNotExist(err) = true

Windows:
  isSysErrNotDir(err) = true
  os.IsNotExist(err) = true

ENOTDIR in Linux is returned when the destination path
of the rename call contains a file in one of the middle
segments of the path (e.g. /tmp/file/dst, where /tmp/file
is an actual file not a directory)

However, as shown above, Windows has more scenarios when
it returns ENOTDIR. For example, when the source path contains
an inexistant directory in its path.

In that case, we want errFileNotFound returned and not
errFileAccessDenied, so this commit will add a further check to close
the disparency between Windows & Linux.
2020-05-14 18:09:30 -07:00
kannappanr 6c1bbf918d
do not add quotes around etag, if already present (#9603) 2020-05-14 17:43:54 -07:00
Anis Elleuch 48e614b167
honor lifecycle expiration with tag rule (#9604) 2020-05-14 16:21:03 -07:00
poornas fe8d33452b
Allow writes for bucket exceeding FIFO quota (#9575)
the quota will be enforced while
deleting oldest entries in FIFO manner.
2020-05-14 15:18:24 -07:00
Klaus Post 216fa57b88
merge nested hash readers (#9582)
The `ioutil.NopCloser(reader)` was hiding nested hash readers.

We make it an `io.Closer` so it can be attached without wrapping 
and allows for nesting, by merging the requests.
2020-05-14 14:01:31 -07:00
Klaus Post ee9077db7d
fix: windows tests for all cases (#9594)
Replaces #9299
2020-05-13 23:55:38 -07:00
Harshavardhana 9c85928740
add formatting message for zones in ordinals (#9596)
Unlike the message
> Formatting 2 zone, 1 set(s), 6 drives per set.

It is more readable as ordinal
> Formatting 2nd zone, 1 set(s), 6 drives per set.
2020-05-13 20:25:29 -07:00
Harshavardhana 6ac48a65cb
fix: use unused cacheMetrics code in prometheus (#9588)
remove all other unusued/deadcode
2020-05-13 08:15:26 -07:00
Krishna Srinivas 94f1a1dea3
add option for O_SYNC writes for standalone FS backend (#9581) 2020-05-12 19:24:59 -07:00
Anis Elleuch c045ae15e7
fix: avoid undoing bucket creation and return the first err instead (#9578) 2020-05-12 15:20:42 -07:00
Harshavardhana 1756b7c6ff
fix: LDAP derivative accounts parentUser validation is not needed (#9573)
* fix: LDAP derivative accounts parentUser validation is not needed

fixes #9435

* Update cmd/iam.go

Co-authored-by: Lenin Alevski <alevsk.8772@gmail.com>

Co-authored-by: Lenin Alevski <alevsk.8772@gmail.com>
2020-05-12 09:21:08 -07:00
Klaus Post e25ace2151
Forward RPC errors from crawler (#9569)
The `keepHTTPResponseAlive` would cause errors to be 
returned with status OK.

- Add '32' as a filler byte until a response is ready
- '0' to indicate the response is ready to be consumed
- '1' to indicate response has an error which needs
to be returned to the caller

Clear out 'file not found' errors from dir walker, since it may be 
in a folder that has been deleted since it was scanned.
2020-05-11 20:41:38 -07:00
poornas a8e5a86fa0
Remove brittle tests for cache (#9570) 2020-05-11 15:41:10 -07:00
Harshavardhana f8edc233ab
support multiple policies for temporary users (#9550) 2020-05-11 13:04:11 -07:00
Harshavardhana 337c2a7cb4
add audit logging for all admin calls (#9568)
- add ServiceRestart/ServiceStop actions
- audit log appropriately in all admin handlers

fixes #9522
2020-05-11 10:34:08 -07:00
Harshavardhana b5ed42c845
ignore policy/group missing errors appropriately (#9559) 2020-05-09 13:59:12 -07:00
Klaus Post d9e7cadacf
Update reed+solomon (#9562)
Only create encoder when strictly needed.
2020-05-09 09:54:20 -07:00
Anis Elleuch 6d76efb9bb
Add support of TCP fast open in internode calls (#9486) 2020-05-08 14:33:23 -07:00
Harshavardhana a1de9cec58
cleanup object-lock/bucket tagging for gateways (#9548)
This PR is to ensure that we call the relevant object
layer APIs for necessary S3 API level functionalities
allowing gateway implementations to return proper
errors as NotImplemented{}

This allows for all our tests in mint to behave
appropriately and can be handled appropriately as
well.
2020-05-08 13:44:44 -07:00
Anis Elleuch 6885c72f32
disable check for DirectIO in standalone FS mode (#9558) 2020-05-08 12:07:51 -07:00
poornas 0f1389e992
Fix azure gateway handling of ETag for CopyObject (#9544)
fixes #9428
2020-05-08 11:30:35 -07:00
Harshavardhana 9dda1fd624
Remove B2 gateway implementation (#9547)
S3 is now natively supported by B2 cloud storage provider
there is no reason to use specialized gateway for B2 anymore,
our current S3 gateway with caching would work with B2.

Resolves #8584
2020-05-07 19:00:30 -07:00
Harshavardhana 2dc46cb153
Report correct error when O_DIRECT is not supported (#9545)
fixes #9537
2020-05-07 16:12:16 -07:00
remche 0674c0075e
add LDAP StartTLS support (#9472) 2020-05-07 15:08:33 -07:00
Harshavardhana 0dd626ec67
fix: requests without bucket should route to the original router (#9541)
requests in federated setups for STS type calls which are
performed at '/' resource should be routed by the muxer,
the assumption is simply such that requests without a bucket
in a federated setup cannot be proxied, so serve them at
current server.
2020-05-07 11:49:04 -07:00
P R 7e3ea77fdf
Checking for access denied in web browser request. (#9523)
Fixes #9485
2020-05-06 21:31:44 -07:00
Harshavardhana 7290d23b26
Apply partNumber checks only on multipart objects (#9528) 2020-05-06 16:58:09 -07:00
Harshavardhana 4c9de098b0
heal buckets during init and make sure to wait on quorum (#9526)
heal buckets properly during expansion, and make sure
to wait for the quorum properly such that healing can
be retried.
2020-05-06 14:25:05 -07:00
Harshavardhana a2ccba69e5
add kes retries upto two times with jitter backoff (#9527)
KES calls are not retried and under certain situations
when KES is under high load, the request should be
retried automatically.
2020-05-06 11:44:06 -07:00
Harshavardhana 8eb99d3a87
fix: complete multipart upload respond with ETag quoted (#9525)
Fixes #9517
2020-05-05 17:47:54 -07:00
Bala FA 3773874cd3
add bucket tagging support (#9389)
This patch also simplifies object tagging support
2020-05-05 14:18:13 -07:00
Harshavardhana 6c62b1a2ea fix broken retry tests 2020-05-04 22:01:39 -07:00
Harshavardhana b768645fde
fix: unexpected logging with bucket metadata conversions (#9519) 2020-05-04 20:04:06 -07:00
Harshavardhana 7b58dcb28c
fix: return context error from context reader (#9507) 2020-05-04 14:33:49 -07:00
Harshavardhana fea4a1e68e
fix logical error in path length handling for windows (#9520)
fixes #9515
2020-05-04 13:11:56 -07:00
Andreas Auernhammer a9e83dd42c
crypto: remove dead code (#9516)
This commit removes some crypto-related code
that is not used anywhere anymore.
2020-05-04 11:41:18 -07:00
Andreas Auernhammer 145f501a21
use HTTP/2 when connecting to KES (#9514)
This commit makes the KES client use HTTP/2
when establishing a connection to the KES server.

This is necessary since the next KES server release
will require HTTP/2.
2020-05-04 10:17:13 -07:00
Harshavardhana 9b3b04ecec
allow retries for bucket encryption/policy quorum reloads (#9513)
We should allow quorum errors to be send upwards
such that caller can retry while reading bucket
encryption/policy configs when server is starting
up, this allows distributed setups to load the
configuration properly.

Current code didn't facilitate this and would have
never loaded the actual configs during rolling,
server restarts.
2020-05-04 09:42:58 -07:00
Anis Elleuch 3e063cca5c
Show the cause error in startup when directio is not supported (#9497)
This commit tries to create a file using direct i/o in the startup
so the server returns quickly and avoid cryptic other errors.
2020-05-04 08:48:03 -07:00
Harshavardhana 27d716c663
simplify usage of mutexes and atomic constants (#9501) 2020-05-03 22:35:40 -07:00
ebozduman fbd15cb7b7
Fixes browser delete issue for anon and authorized users (#9440) 2020-05-03 14:01:28 -07:00
Egor Rudinsky f7c91eff54
Share button for public objects (#9162) 2020-05-01 23:55:53 -07:00
Dmitry Gadeev a6bdc086a2
fix: use source scheme retrieved from X-Forwarded headers (#9483) 2020-05-01 23:53:01 -07:00
Bala FA 83ccae6c8b
Store bucket created time as a metadata (#9465)
Fixes #9459
2020-05-01 09:53:14 -07:00
Harshavardhana 28f9c477a8
fix: assume parentUser correctly for serviceAccounts (#9504)
ListServiceAccounts/DeleteServiceAccount didn't work properly
with STS credentials yet due to incorrect Parent user.
2020-05-01 08:05:14 -07:00
Harshavardhana 09571d03a5
avoid unnecessary logging in IAM (#9502) 2020-05-01 18:11:17 +05:30
Harshavardhana 71ce63f79c
fix: background heal to call HealFormat only if needed (#9491)
In large setups this avoids unnecessary data transfer
across nodes and potential locks.

This PR also optimizes heal result channel, which should
be avoided for each queueHealTask as its expensive
to create/close channels for large number of objects.
2020-04-30 20:23:00 -07:00
Harshavardhana 5205c9591f
print proper certinfo on console when starting up (#9479)
also potentially fix a race in certs.go implementation
while accessing tls.Certificate concurrently.
2020-04-30 16:15:29 -07:00
poornas 9a547dcbfb
Add API's for managing bucket quota (#9379)
This PR allows setting a "hard" or "fifo" quota
restriction at the bucket level. Buckets that
have reached the FIFO quota configured, will
automatically be cleaned up in FIFO manner until
bucket usage drops to configured quota.
If a bucket is configured with a "hard" quota
ceiling, all further writes are disallowed.
2020-04-30 15:55:54 -07:00
Anis Elleuch 27632ca6ec
audit: Merge ResponseWriter with RecordAPIStats (#9496)
ResponseWriter & RecordAPIStats has similar role, merge them.

This commit will also fix wrong auditing for STS and Web and others
since they are using ResponseWriter instead of the RecordAPIStats.
2020-04-30 11:27:19 -07:00
Anis Elleuch d090a17ed0
fix: Audit tests on the correct response writer type (#9445) 2020-04-29 22:17:36 -07:00
Harshavardhana c2529260e7
fix: crash observed when position of drives different (#9490)
allocate the disk slice properly before populating
disk by its ID and its position.

Fixes #9416
2020-04-29 13:42:37 -07:00
P R 5dd9cf4398
fix: CopyObject with REPLACE directive deletes existing tags (#9478)
Fixes #9477
2020-04-29 10:26:37 +05:30
Harshavardhana ab77b216d1
fix: remove restrictions on windows for NAME_MAX (#9469)
Fixes #9393
2020-04-28 17:32:46 -07:00
Anis Elleuch c3c3e9087b
config: More fixes in parsing Audit & Logger env variables (#9474)
- Add support of missed legacy Logger webhook
- Disable enabling Audit or logger if _ENABLE
  if not explicitly set to "on".
2020-04-28 15:20:40 -07:00
Anis Elleuch 7ad6bc955f
show a notice when mixed rootfs & mounted disks is detected (#9471)
A user can incorrectly mounts a newly fresh disk. MinIO will detect
that it is writing with a rootfs disk and will mark it down. However,
it is hard for the user to understand what's going on.

This commit will just print a notice so it will be easy to spot
such use case.
2020-04-28 14:55:01 -07:00
Harshavardhana 7a5271ad96
fix: re-use connections in webhook/elasticsearch (#9461)
- elasticsearch client should rely on the SDK helpers
  instead of pure HTTP calls.
- webhook shouldn't need to check for IsActive() for
  all notifications, failure should be delayed.
- Remove DialHTTP as its never used properly

Fixes #9460
2020-04-28 13:57:56 -07:00
Harshavardhana 1b122526aa
fix: add service account support for AssumeRole/LDAPIdentity creds (#9451)
allow generating service accounts for temporary credentials
which have a designated parent, currently OpenID is not yet
supported.

added checks to ensure that service account cannot generate
further service accounts for itself, service accounts can
never be a parent to any credential.
2020-04-28 12:49:56 -07:00
Anis Elleuch a3b266761e
Fix audit loading from the env and consider enable env variable (#9467)
Audit was not working properly when enabled from the environment
caused by a typo in the code.

This commit fixes that but also consider the following variables:
  `MINIO_LOGGER_WEBHOOK_ENABLE_*` and 
`MINIO_AUDIT_WEBHOOK_ENABLE_*` so the user can use 
this latter to temporarily disable a logger or audit configuration.
2020-04-28 16:10:51 +05:30
Harshavardhana 498389123e
avoid unnecessary logging on fresh/newly replaced drives (#9470)
data usage tracker and crawler seem to be logging
non-actionable information on console, which is not
useful and is fixed on its own in almost all deployments,
lets keep this logging to minimal.
2020-04-28 01:16:57 -07:00
Harshavardhana bc61417284
calculate automatic node based symmetry (#9446)
it is possible in many screnarios that even
if the divisible value is optimal, we may
end up with uneven distribution due to number
of nodes present in the configuration.

added code allow for affinity towards various
ellipses to figure out optimal value across
ellipses such that we can always reach a
symmetric value automatically.

Fixes #9416
2020-04-27 14:39:57 -07:00
Harshavardhana 97d952e61c
fix: ensure buckets are preserved if one set returns error (#9468)
the bucket should be deleted if it can be successfully
deleted on all sets, if not we should ensure to
restore those buckets properly.
2020-04-27 14:18:02 -07:00
Klaus Post 073aac3d92
add data update tracking using bloom filter (#9208)
By monitoring PUT/DELETE and heal operations it is possible
to track changed paths and keep a bloom filter for this data. 

This can help prioritize paths to scan. The bloom filter can identify
paths that have not changed, and the few collisions will only result
in a marginal extra workload. This can be implemented on either a
bucket+(1 prefix level) with reasonable performance.

The bloom filter is set to have a false positive rate at 1% at 1M 
entries. A bloom table of this size is about ~2500 bytes when serialized.

To not force a full scan of all paths that have changed cycle bloom
filters would need to be kept, so we guarantee that dirty paths have
been scanned within cycle runs. Until cycle bloom filters have been
collected all paths are considered dirty.
2020-04-27 10:06:21 -07:00
Harshavardhana eff4127efd Revert "Write files in O_SYNC for fs backend to protect against machine crashes (#9434)"
This reverts commit 4843affd0e.
2020-04-27 09:22:05 -07:00
Harshavardhana b1c0c32ba6
fix: ignore symlinks in backend filesystems (#9457)
fixes #9419
2020-04-27 06:30:12 -07:00
Harshavardhana f14bf25cb9
optimize Listen bucket notification implementation (#9444)
this commit avoids lots of tiny allocations, repeated
channel creates which are performed when filtering
the incoming events, unescaping a key just for matching.

also remove deprecated code which is not needed
anymore, avoids unexpected data structure transformations
from the map to slice.
2020-04-27 06:25:05 -07:00
Harshavardhana f216670814
use context specific to the etcd call (#9458) 2020-04-26 21:42:41 -07:00
Harshavardhana 6ecc98fddb
fix: crash in metrics handler when some disks are offline (#9450)
Fixes #9449
2020-04-25 19:48:07 -07:00
Krishna Srinivas 4843affd0e
Write files in O_SYNC for fs backend to protect against machine crashes (#9434) 2020-04-25 01:18:54 -07:00
Harshavardhana 558785a4bb
fix: config Set/Get decrypt/encrypt using authenticated credentials (#9447)
we have policy available for sub-admin users to set/get/delete
config, but we incorrectly decrypt the content using admin secret
key which in-fact should be the credential authenticating the
request.
2020-04-24 22:36:48 -07:00
Harshavardhana 60d415bb8a
deprecate/remove global WORM mode (#9436)
global WORM mode is a complex piece for which
the time has passed, with the advent of S3 compatible
object locking and retention implementation global
WORM is sort of deprecated, this has been mentioned
in our documentation for some time, now the time
has come for this to go.
2020-04-24 16:37:05 -07:00
BigUstad 45e22cf8aa
fix: selectObject to return error when object does not exist (#9423) 2020-04-24 13:51:48 -07:00
Anis Elleuch 20766069a8
add list/delete API service accounts admin API (#9402) 2020-04-24 12:10:09 -07:00
Harshavardhana 957ecb1b64
use optimal memory while purging cache (#9426)
re-implement the cache purging routine to
avoid using ioutil.ReadDir which can lead
to high allocations when there are cache
directories with lots of content, or
when cache is installed in memory constrainted
environments.

Instead rely on a callback function where we
are not using memory no-more than 8KiB per
cycle.

Precursor for this change refer #9425, original
issue pointed by Caleb Case <caleb@storj.io>
2020-04-23 12:26:13 -07:00
Boaz ac5061df2c
fix: make azure gateway chunk size configurable (#9292) 2020-04-23 02:04:13 -07:00
Anis Elleuch 4cd6ca02c7
fix: Add missing return in admin requests auth (#9422) 2020-04-22 13:42:01 -07:00
Egon Elbre a5efcbab51
fix: cacheReader.Close in all paths that don't return it. (#9418) 2020-04-22 12:13:57 -07:00
Egon Elbre 85be7b39ac
Call cleanup funcs when skip fails (#9417) 2020-04-22 10:06:56 -07:00
Nitish Tiwari ebf3dda449
Update server startup example to showcase local erasure code (#9407) 2020-04-21 23:59:13 -07:00
poornas 582953260b
Increase response header timeout for gateway (#9400)
fixes: #9295
2020-04-21 19:21:27 -07:00
Praveen raj Mani 322385f1b6
fix: only show active/available ARNs in server startup banner (#9392) 2020-04-21 09:38:32 -07:00
Anis Elleuch a69c98e394
fix: Correct typo when registering peer Delete User API (#9403) 2020-04-21 08:35:19 -07:00
Harshavardhana 282c9f790a
fix: validate partNumber in queryParam as part of preConditions (#9386) 2020-04-20 22:01:59 -07:00
Anis Elleuch 2eeb0e6a0b
heal: Fix heal buckets result reporting (#9397)
healBucket() was not properly collecting results after healing
buckets. This commit adds After drives information correctly.
2020-04-20 13:48:54 -07:00
Harshavardhana 3ff5bf2369
fix: convert storage class into azure tiers (#9381) 2020-04-19 13:42:56 -07:00
Harshavardhana 69ee28a082
remove OSS gateway due to lack of licensing (#9390)
OSS go sdk lacks licensing terms in their
repository, and there has been no activity

On the issue here https://github.com/aliyun/aliyun-oss-go-sdk/issues/245

This PR is to ensure we remove any dependency code which
lacks explicit license file in their repo.
2020-04-18 22:12:51 -07:00
Sidhartha Mani 3e78ea8acc
improve obd tests and optimize network (#9378)
- keep long running obd network tests alive
- fix error - wrong number of parents in process OBD info
- ensure that osinfo does not error out when inside containers
- remove limit on max number of connections per client transport

The generic client transport uses a default limit of 64 conns per transport.
This could end up limiting and throttling usage, and artificially slowing
down the performance of MinIO even on hardware capable of doing better.
2020-04-18 11:06:11 -07:00
Praveen raj Mani c79358c67e
notification queue limit has no maxLimit (#9380)
New value defaults to 100K events by default,
but users can tune this value upto any value
they seem necessary.

* increase the limit to maxint64 while validating
2020-04-18 01:20:56 -07:00
Klaus Post c4464e36c8
fix: limit HTTP transport tuables to affordable values (#9383)
Close connections pro-actively in transient calls
2020-04-17 11:20:56 -07:00
Harshavardhana d92db198d1
Add target parsing code for config (#9375)
This code is helper for mcs project
2020-04-16 17:43:14 -07:00
Harshavardhana 8bae956df6
allow copyObject to rotate storageClass of objects (#9362)
Added additional mint tests as well to verify, this
functionality.

Fixes #9357
2020-04-16 17:42:44 -07:00
Harshavardhana c82fa2c829
fix: load LDAP users appropriately (#9360)
This PR also fixes issues when

deletePolicy, deleteUser is idempotent so can lead to
issues when client can prematurely timeout, so a retry
call error response should be ignored when call returns
http.StatusNotFound

Fixes #9347
2020-04-16 16:22:34 -07:00
Harshavardhana a51280fd20
allow config help in gateway mode (#9356)
allow `mc admin config set mygateway/ audit_webhook --env`
to fetch the documentation as needed, this is just to
ensure that our users can still access the relevant
ENV docs while running in gateway mode.
2020-04-16 14:49:12 -07:00
Klaus Post bd437c1c17
set server base context on gateway http server (#9365) 2020-04-16 11:54:12 -07:00
Harshavardhana 69fb68ef0b
fix simplify code to start using context (#9350) 2020-04-16 10:56:18 -07:00
Harshavardhana bde0f444db
fix support OBDAdminAction is valid action (#9354) 2020-04-15 12:16:40 -07:00
Klaus Post f19cbfad5c
fix: use per test context (#9343)
Instead of GlobalContext use a local context for tests.
Most notably this allows stuff created to be shut down 
when tests using it is done. After PR #9345 9331 CI is 
often running out of memory/time.
2020-04-14 17:52:38 -07:00
Harshavardhana 5c11a46412 update minio-go/parquet-go to latest 2020-04-14 16:53:29 -07:00
Anis Elleuch 8a94aebdb8
config: Add api requests max & deadline configs (#9273)
Add two new configuration entries, api.requests-max and
api.requests-deadline which have the same role of
MINIO_API_REQUESTS_MAX and MINIO_API_REQUESTS_DEADLINE.
2020-04-14 12:46:37 -07:00
Sidhartha Mani ec11e99667
implement configurable timeout for OBD tests (#9324) 2020-04-14 11:48:32 -07:00
Harshavardhana 37d066b563
fix: deprecate requirement of session token for service accounts (#9320)
This PR fixes couple of behaviors with service accounts

- not need to have session token for service accounts
- service accounts can be generated by any user for themselves
  implicitly, with a valid signature.
- policy input for AddNewServiceAccount API is not fully typed
  allowing for validation before it is sent to the server.
- also bring in additional context for admin API errors if any
  when replying back to client.
- deprecate GetServiceAccount API as we do not need to reply
  back session tokens
2020-04-14 11:28:56 -07:00
Praveen raj Mani bfec5fe200
fix: fetchLambdaInfo should return consistent results (#9332)
- Introduced a function `FetchRegisteredTargets` which will return
  a complete set of registered targets irrespective to their states,
  if the `returnOnTargetError` flag is set to `False`
- Refactor NewTarget functions to return non-nil targets
- Refactor GetARNList() to return a complete list of configured targets
2020-04-14 11:19:25 -07:00
Bala FA 525287f4b6
remove queue only if index is within the range (#9341)
Fixes minio/mc#3155
2020-04-14 11:06:23 -07:00
Harshavardhana 9054ce73b2
fix: deprecate skyring/uuid and use maintained google/uuid (#9340) 2020-04-14 02:40:05 -07:00
Harshavardhana d079adc167
fix: remove initGlobalContext writes in tests (#9331)
since we do not close GlobalContext, we do not
need to reinitialize it inside test code
2020-04-13 23:21:01 -07:00
Harshavardhana a9d401ac10
fix: update docs to mention erasure guide (#9339) 2020-04-14 11:38:14 +05:30
kannappanr 1fa65c7f2f
fix: object lock behavior when default lock config is enabled (#9305) 2020-04-13 14:03:23 -07:00
Harshavardhana 4314ee1670
fix: remove unusued PerfInfoHandler code (#9328)
- Removes PerfInfo admin API as its not OBDInfo
- Keep the drive path without the metaBucket in OBD
  global latency map.
- Remove all the unused code related to PerfInfo API
- Do not redefined global mib,gib constants use
  humanize.MiByte and humanize.GiByte instead always
2020-04-12 19:37:09 -07:00
Harshavardhana 7d636a7c13
enable --compat flag by default (#9326)
if needed use --no-compat to disable md5sum while
verifying any performance numbers.

bring back --compat behavior as default to avoid
additional documentation and confusing behavior,
as we are working towards improving md5sum to
be faster on AVX instructions, enabling this
should be hardly a problem in future versions
of MinIO.

fixes #8012
fixes #7859
fixes #7642
2020-04-12 18:08:27 -07:00
Harshavardhana bf9d51cf14
fix: add missing copyright headers in some files (#9321) 2020-04-12 13:55:22 -07:00
Harshavardhana 29e0727b58
fix: regression in CopyObject not preserving ETag in --compat (#9322)
issue found after `git bisect` to commit db41953618
2020-04-11 20:20:30 -07:00
Anis Elleuch c434dff0a4
posix: Add missing error return in RenameFile() (#9319)
Although it should not happen in most cases.
2020-04-11 11:15:30 -07:00
Taras Parkhomenko b2a8cb4aba
Add SHA-3 support (#9308) 2020-04-10 14:59:52 -07:00
Harshavardhana b412a222ae
Add missing comment key from key list (#9313)
Continuing from previous PR #9304, comment
is a special key is not present in the
default KV list. Add it explicitly when
tokenizing fields as it may be possible that
some clients might try to set comments.
2020-04-10 11:44:28 -07:00
Sidhartha Mani 9f81d014f1
fix: drive names in output of parallel obd test (#9312) 2020-04-09 22:44:17 -07:00
Harshavardhana 3184205519
fix: config to support keys with special values (#9304)
This PR adds context-based `k=v` splits based
on the sub-system which was obtained, if the
keys are not provided an error will be thrown
during parsing, if keys are provided with wrong
values an error will be thrown. Keys can now
have values which are of a much more complex
form such as `k="v=v"` or `k=" v = v"`
and other variations.

additionally, deprecate unnecessary postgres/mysql
configuration styles, support only

- connection_string for Postgres
- dsn_string for MySQL

All other parameters are removed.
2020-04-09 21:45:17 -07:00
Andreas Auernhammer db41953618
avoid unnecessary KMS requests during single-part PUT (#9220)
This commit fixes a performance issue caused
by too many calls to the external KMS - i.e.
for single-part PUT requests.

In general, the issue is caused by a sub-optimal
code structure. In particular, when the server
encrypts an object it requests a new data encryption
key from the KMS. With this key it does some key
derivation and encrypts the object content and
ETag.

However, to behave S3-compatible the MinIO server
has to return the plaintext ETag to the client
in case SSE-S3.
Therefore, the server code used to decrypt the
(previously encrypted) ETag again by requesting
the data encryption key (KMS decrypt API) from
the KMS.

This leads to 2 KMS API calls (1 generate key and
1 decrypt key) per PUT operation - while only
one KMS call is necessary.

This commit fixes this by fetching a data key only
once from the KMS and keeping the derived object
encryption key around (for the lifetime of the request).

This leads to a significant performance improvement
w.r.t. to PUT workloads:
```
Operation: PUT
Operations: 161 -> 239
Duration: 28s -> 29s
* Average: +47.56% (+25.8 MiB/s) throughput, +47.56% (+2.6) obj/s
* Fastest: +55.49% (+34.5 MiB/s) throughput, +55.49% (+3.5) obj/s
* 50% Median: +58.24% (+32.8 MiB/s) throughput, +58.24% (+3.3) obj/s
* Slowest: +1.83% (+0.6 MiB/s) throughput, +1.83% (+0.1) obj/s
```
2020-04-09 17:01:45 -07:00
Harshavardhana f44cfb2863
use GlobalContext whenever possible (#9280)
This change is throughout the codebase to
ensure that all codepaths honor GlobalContext
2020-04-09 09:30:02 -07:00
Anis Elleuch 1b45be0d60
lifecycle: Disallow delete when the object is locked (#9272) 2020-04-09 09:28:57 -07:00
Aditya Manthramurthy 6bb693488c
Fix policy setting error in LDAP setups (#9303)
Fixes #8667

In addition to the above, if the user is mapped to a policy or 
belongs in a group, the user-info API returns this information, 
but otherwise, the API will now return a non-existent user error.
2020-04-09 01:04:08 -07:00
Harshavardhana e20e08d700
fix: remove the sleep from listing operations (#9287)
make rest of the Walk() function more predictable,
it was observed that in nominal deployments even
without much workload the drives are generally
slow for respond for readdir operations, for the
sleepDuration factor of 10 this can cause
unexpected slowness in the Listing calls, while
it is good for all other I/O, it may simply slow
down Listing immensely which is not useful.

fixes #9261
2020-04-08 19:42:57 -07:00
Harshavardhana ac07df2985
start watcher after all creds have been loaded (#9301)
start watcher after all creds have been loaded
to avoid any conflicting locks that might get
deadlocked.

Deprecate unused peer calls for LoadUsers()
2020-04-08 19:00:39 -07:00
Anis Elleuch e51e465543
delete: Use physical Dir() for proper prefix cleanup in Windows (#9297)
In FS mode under Windows, removing an object will not automatically.
remove parent empty prefixes.

The reason is that path.Dir() was used, however filepath.Dir() is
more appropriate since filepath is physical (meaning it operates
on OS filesystem paths)

This is not caught because failure for Windows CI is not caught.
2020-04-08 11:32:58 -07:00
Pontus Leitzler a973402821
add object api check in fs-v1 before returning ready (#9285)
fs-v1 in server mode only checks to see if the path exist, so that it
returns ready before it is indeed ready.

This change adds a check to ensure that the global object api is
available too before reporting ready.

Fixes #9283
2020-04-08 08:53:20 -07:00
César Nieto 3ea1be3c52
allow delete of a group with no policy set (#9288) 2020-04-08 06:03:57 -07:00
Harshavardhana 2642e12d14
fix: change policies API to return and take struct (#9181)
This allows for order guarantees in returned values
can be consumed safely by the caller to avoid any
additional parsing and validation.

Fixes #9171
2020-04-07 19:30:59 -07:00
Harshavardhana e7276b7b9b
fix: make single locks for both IAM and object-store (#9279)
Additionally add context support for IAM sub-system
2020-04-07 14:26:39 -07:00
Harshavardhana e375341c33
fix: allow any 127.0.0.x as bind IPs (#9281)
It is some times common and convenient to use
just local IPs for testing purposes, 127.0.0.x
are special IPs regardless of being available on
an interface they can be bound to on all operating
systems.

Allow this behavior to work for minio server

fixes #9274
2020-04-07 09:40:20 -07:00
Harshavardhana 2c20716f37
fix: Avoid force delete in compliance/worm mode (#9276)
also, bring in an additional policy to ensure that
force delete bucket is only allowed with the right
policy for the user, just DeleteBucketAction
policy action is not enough.
2020-04-06 17:51:05 -07:00
Harshavardhana 91f21ddc47
fix: ignore lost+found properly while reading disks (#9278)
Fixes #9277
2020-04-06 16:51:18 -07:00
Harshavardhana 43a3778b45
fix: support object-remaining-retention-days policy condition (#9259)
This PR also tries to simplify the approach taken in
object-locking implementation by preferential treatment
given towards full validation.

This in-turn has fixed couple of bugs related to
how policy should have been honored when ByPassGovernance
is provided.

Simplifies code a bit, but also duplicates code intentionally
for clarity due to complex nature of object locking
implementation.
2020-04-06 13:44:16 -07:00
Harshavardhana 4714958e99
fix: possible connection leaks in sets init, heal (#9263) 2020-04-03 18:06:31 -07:00
Harshavardhana ab66b23194
fix: allow listBuckets with listBuckets permission (#9253) 2020-04-02 12:35:22 -07:00
Harshavardhana 73f9d8a636
set default storage class always (#9250)
gateway implementations might not respond
back with right storage class which is
an AWS S3 concept, add default storage
if its empty.
2020-04-02 00:23:09 -07:00
Krishna Srinivas 541a778d7b
fix: do not exit on bootstrap Verify() to allow for rolling upgrades (#9235) 2020-04-01 21:40:03 -07:00
Harshavardhana d49f2ec19c
fix: use specified authToken for audit/logger HTTP targets (#9249)
We were not using the auth token specified
even when config supports it.
2020-04-01 20:53:07 -07:00
ebozduman 8dd63a462f
fix: ETag returned by OSS endpoint (#9243) 2020-04-01 19:51:12 -07:00
poornas 336460f67e
fix: gateway_s3_bytes_sent metric for all API methods (#9242)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-04-01 12:52:31 -07:00
Bala FA 95e89f1712
proactive deep heal object when a bitrot is detected (#9192) 2020-04-01 12:14:00 -07:00
Harshavardhana d8af244708
Add numeric/date policy conditions (#9233)
add new policy conditions

- NumericEquals
- NumericNotEquals
- NumericLessThan
- NumericLessThanEquals
- NumericGreaterThan
- NumericGreaterThanEquals
- DateEquals
- DateNotEquals
- DateLessThan
- DateLessThanEquals
- DateGreaterThan
- DateGreaterThanEquals
2020-04-01 00:04:25 -07:00
Sidhartha Mani c8243706b4
Add Parallel NetOBD tests to saturate all nodes at once (#9241) 2020-03-31 17:08:28 -07:00
Harshavardhana 30707659b5
[feature] allow for an odd number of erasure packs (#9221)
Too many deployments come up with an odd number
of hosts or drives, to facilitate even distribution
among those setups allow for odd and prime numbers
based packs.
2020-03-31 09:32:16 -07:00
poornas 90c365a174
fix: allow overwriting objects under lock after retention period (#9232)
fixes #9230
2020-03-31 09:15:42 -07:00
Sidhartha Mani 7b732b566f
[Bugfix] Fix Net tests being omitted (#9234) 2020-03-31 01:15:21 -07:00
Harshavardhana ba52a925f9
fix: delete dangling directories properly (#9222) 2020-03-30 09:48:24 -07:00
ebozduman fdda5f98c6
Makes mandatory dsn_string parameter optional (#8931) 2020-03-28 22:20:02 -07:00
Ingmar Runge fa4d627b57
B2 gateway S3 compat: return MD5 hash as ETag from PutObject (#9183)
- B2 does actually return an MD5 hash for newly uploaded objects
  so we can use it to provide better compatibility with S3 client
  libraries that assume the ETag is the MD5 hash such as boto.
- depends on change in blazer library.
- new behaviour is only enabled if MinIO's --compat mode is active.
- behaviour for multipart uploads is unchanged (works fine as is).
2020-03-28 13:59:55 -07:00
Bala FA 2c3e34f001
add force delete option of non-empty bucket (#9166)
passing HTTP header `x-minio-force-delete: true` would 
allow standard S3 API DeleteBucket to delete a non-empty
bucket forcefully.
2020-03-27 21:52:59 -07:00
Anis Elleuch 7f8f1ad4e3
fix: cleanup lifecycle unused code (#9219) 2020-03-27 18:57:50 -07:00
Harshavardhana 6f992134a2
fix: startup load time by reusing storageDisks (#9210) 2020-03-27 14:48:30 -07:00
Sidhartha Mani 0c80bf45d0
Implement oboard diagnostics admin API (#9024)
- Implement a graph algorithm to test network bandwidth from every 
  node to every other node
- Saturate any network bandwidth adaptively, accounting for slow 
  and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos
2020-03-26 21:07:39 -07:00
Anis Elleuch b207520d98
Fix lifecycle GET: AWS SDK complaints on empty config (#9201) 2020-03-25 21:06:03 -07:00
Krishna Srinivas ef6304c5c2
Improve connectDisks() performance (#9203) 2020-03-24 23:26:13 -07:00
Nitish Tiwari 6b984410d5
Add support for self-healing related metrics in Prometheus (#9079)
Fixes #8988

Co-authored-by: Anis Elleuch <vadmeste@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-24 22:40:45 -07:00
Harshavardhana 813e0fc1a8
fix: optimize isConnected to avoid url.String() conversions (#9202)
Stringifying in a loop can tax the system, avoid this
and convert the endpoints to strings early on and
remember them for the lifetime of the server.
2020-03-24 18:53:24 -07:00
Harshavardhana 6f6a2214fc
Add rate limiter for S3 API layer (#9196)
- total number of S3 API calls per server
- maximum wait duration for any S3 API call

This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.

This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
2020-03-24 12:43:40 -07:00
Anis Elleuch 791821d590
sa: Allow empty policy to indicate parent user's policy is inherited (#9185) 2020-03-23 14:17:18 -07:00
Harshavardhana 9a951da881
honor the credentials of user admin for encrypt/decrypt (#9194)
Fixes #9193
2020-03-23 14:06:00 -07:00
Harshavardhana ff932ca2a0
fix: log only catastrophic errors in prepare storage (#9189) 2020-03-23 07:32:18 -07:00
poornas 818d3bcaf5
fix: deprecate TestDiskCache test from unit tests (#9187) 2020-03-22 23:46:36 -07:00
Krishna Srinivas 45b1c66195
fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186) 2020-03-22 19:23:47 -07:00
Harshavardhana da04cb91ce
optimize listObjects to list only from 3 random disks (#9184) 2020-03-22 16:33:49 -07:00
Harshavardhana cfc9cfd84a
fix: various optimizations, idiomatic changes (#9179)
- acquire since leader lock for all background operations
  - healing, crawling and applying lifecycle policies.

- simplify lifecyle to avoid network calls, which was a
  bug in implementation - we should hold a leader and
  do everything from there, we have access to entire
  name space.

- make listing, walking not interfere by slowing itself
  down like the crawler.

- effectively use global context everywhere to ensure
  proper shutdown, in cache, lifecycle, healing

- don't read `format.json` for prometheus metrics in
  StorageInfo() call.
2020-03-22 12:16:36 -07:00
Harshavardhana ea18e51f4d
Support multiple LDAP OU's, smAccountName support (#9139)
Fixes #8532
2020-03-21 22:47:26 -07:00
Harshavardhana 3d3beb6a9d
Add response header timeouts (#9170)
- Add conservative timeouts upto 3 minutes
  for internode communication
- Add aggressive timeouts of 30 seconds
  for gateway communication

Fixes #9105
Fixes #8732
Fixes #8881
Fixes #8376
Fixes #9028
2020-03-21 22:10:13 -07:00
poornas 27b8f18cce
Fix storage info message on startup (#9177) 2020-03-21 10:02:20 -07:00
Harshavardhana b4bfdc92cc fix: admin console logger changes to log.Info 2020-03-20 15:14:14 -07:00
Harshavardhana ae654831aa
Add madmin package context support (#9172)
This is to improve responsiveness for all
admin API operations and allowing callers
to cancel any on-going admin operations,
if they happen to be waiting too long.
2020-03-20 15:00:44 -07:00
Stephen N 1ffa983a9d
added support for SASL/SCRAM on Kafka bucket notifications. (#9168)
fixes #9167
2020-03-20 11:10:27 -07:00
Nitish Tiwari ecf1566266
Add an option to allow plaintext connection to LDAP/AD Server (#9151) 2020-03-19 19:20:51 -07:00
Harshavardhana b1a2169dcc
fix: data usage crawler env handling, usage-cache.bin location (#9163)
canonicalize the ENVs such that we can bring these ENVs 
as part of the config values, as a subsequent change.

- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json`
  (avoid conflicts with a bucket named `usage.json` )
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin`
  (avoid conflicts with a bucket named `usage.bin`
2020-03-19 09:47:47 -07:00
Harshavardhana d45a1808f2
fix: Walk() should require quorum number of disks only (#9164) 2020-03-18 20:56:07 -07:00
Anis Elleuch db2155551a
heal: Pass scan mode to HealObjects to deep scan full quorum objects (#9159)
As an optimization of the healing, HealObjects() avoid sending an
object to the background healing subsystem when the object is
present in all disks.

However, HealObjects() should have checked the scan type, if this
deep, always pass the object to the healing subsystem.
2020-03-18 17:50:00 -07:00
Harshavardhana 09d35d3b4c
fix: sts to return appropriate errors (#9161) 2020-03-18 17:25:45 -07:00
Anis Elleuch 5b9342d35c
xl: Tree walking should not quit when one disk returns empty (#9160)
Currently, a tree walking, needed to a list objects in a specific
set quits listing as long as it finds no entries in a disk, which
is wrong.

This affected background healing, because the latter is using
tree walk directly. If one object does not exist in the first
disk for example, it will be seemed like the object does not
exist at all and no healing work is needed.

This commit fixes the behavior.
2020-03-18 16:58:05 -07:00
Klaus Post 8d98662633
re-implement data usage crawler to be more efficient (#9075)
Implementation overview: 

https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b
2020-03-18 16:19:29 -07:00
Anis Elleuch 7fdeb44372
info: Initialize boot time early so uptime will always be correct (#9154) 2020-03-17 16:37:28 -07:00
poornas 59dced8237
Print node status even in --quiet mode (#9149) 2020-03-17 15:25:00 -07:00
Anis Elleuch 496f4a7dc7
Add service account type in IAM (#9029) 2020-03-17 10:36:13 -07:00
kannappanr 8b880a246a
fix: deleteObjectTagging should 204 on success (#9150) 2020-03-16 23:21:24 -07:00
Klaus Post eeb5942b6b
fix: remote profile names and extension (#9145)
Remote profiles are not formatted correctly:

```
profile-172.31.91.126_9000-cpu.pprof
profile-172.31.91.126_9000-goroutines-before.txt
profile-172.31.91.126_9000-goroutines.txt
profiling-172.31.80.49_9000-cpu.pprof.pprof
profiling-172.31.80.49_9000-goroutines-before.txt.pprof
profiling-172.31.80.49_9000-goroutines.txt.pprof
profiling-172.31.86.101_9000-cpu.pprof.pprof
profiling-172.31.86.101_9000-goroutines-before.txt.pprof
profiling-172.31.86.101_9000-goroutines.txt.pprof
profiling-172.31.91.191_9000-cpu.pprof.pprof
profiling-172.31.91.191_9000-goroutines-before.txt.pprof
profiling-172.31.91.191_9000-goroutines.txt.pprof
```

`profiling` -> `profile`, remove extra extension.
2020-03-16 11:39:53 -07:00
Harshavardhana c9212819af
fix: lock maintenance should honor quorum (#9138)
The staleness of a lock should be determined by
the quorum number of entries returning stale,
this allows for situations when locks are held
when nodes are down - we don't accidentally
clear locks unintentionally when they are valid
and correct.

Also lock maintenance should be run by all servers,
not one server, stale locks need to be run outside
the requirement for holding distributed locks.

Thanks @klauspost for reproducing this issue
2020-03-15 11:55:52 -07:00
poornas 10fd53d6bb
Fix: admin config set API for notifications (#9085)
Filter out targets set via env when
validating incoming config change against
configured notification targets

Fixes #9066
2020-03-14 00:01:15 -07:00
Krishna Srinivas 2e9fed1a14
non-empty dirs should not be listed as objects (#9129) 2020-03-13 17:43:00 -07:00
Kody A Kantor 06e30b5aa1
Skip building directio on platforms that don't support Direct IO (#9059) 2020-03-12 18:57:41 -07:00
Harshavardhana a54cdb9587
fix: Send x-amz-mp-parts-count for multiparted objects (#9116)
Some AWS SDKs latently rely on this value some times
to calculate the right number of parts during a parallel
GetObject request, this is feature used along with
content-range - we should support this as well.
2020-03-12 12:37:27 -07:00
Harshavardhana cfd12914e1
fix: crash in serverInfo handler when ldap is configured (#9123) 2020-03-11 23:13:32 -07:00
Anis Elleuch fdf65aa9b9
heal: Add info about the next background healing round (#9122)
- avoid setting last heal activity when starting self-healing

This can be confusing to users thinking that the self healing
cycle was already performed.

- add info about the next background healing round
2020-03-11 23:00:31 -07:00
Harshavardhana 69b2aacf5a
fix return proper error for OperationTimedout (#9117)
OperationTimedout error occurs when locking
timesout, trying to acquire a lock. This
error should be returned appropriately to
the client with http status "408" (request timedout)

This translation was broken, fix it.
2020-03-11 14:11:04 -07:00
Anis Elleuch 0af62d35a0
xl: Implement posix.DeletePrefixes to enhance delete perf (#9100)
Bulk delete API was using cleanupObjectsBulk() which calls posix
listing and delete API to remove objects internal files in the
backend (xl.json and parts) one by one.

Add DeletePrefixes in the storage API to remove the content
of a directory in a single call.

Also use a remove goroutine for each disk to accelerate removal.
2020-03-11 08:56:36 -07:00
Nitish Tiwari 7c32f3f554
Fix the URL for MinIO update when using custom download server (#9111)
Co-authored-by: Nitish Tiwari <nitish@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-11 20:09:20 +05:30
Harshavardhana 5ab9cc029d
fix: crash observed for anonymous deletes from UI (#9107) 2020-03-09 21:21:35 -07:00
Harshavardhana 6a00eb10bf
fix: allow set drive count of proper divisible values (#9101)
Currently the code assumed some orthogonal requirements
which led situations where when we have a setup where
we have let's say for example 168 drives, the final
set_drive_count chosen was 14. Indeed 168 drives are
divisible by 12 but this wasn't allowed due to an
unexpected requirement to have 12 to be a perfect modulo
of 14 which is not possible. This assumption was incorrect.

This PR fixes this old assumption properly, also adds
few tests and some negative tests as well. Improvements
are seen in error messages as well.
2020-03-08 13:30:25 -07:00
Harshavardhana 792ee48d2c
add additional logging during server formatting (#9102) 2020-03-08 12:12:07 -07:00
Harshavardhana 88ae0f1196
Improve delete performance by reducing the number of calls (#9092)
- Remove the requirement to honor storage class for deletes
- Improve `posix.DeleteFileBulk` code to Stat the volumeDir
  only once per call, rather than for all object paths.
2020-03-06 13:44:24 -08:00
Anis Elleuch 23a0415eb7
profiling: Fix crash when enabling goroutines profiling (#9097)
This commit replaces 'goroutines' with 'goroutine' when passing it
to pprof library when activating goroutine type profiling
2020-03-06 13:22:47 -08:00
Anis Elleuch 75a0661213
data-usage: Fix the calculation of the next crawling round (#9096)
This commit fixes a simple typo miscalculated the waiting time
until the next round of data crawling to compute the data usage.
2020-03-06 11:34:12 -08:00
kannappanr 07a7f329e7
xl: Fix counting offline disks in StorageInfo (#9082)
Recent modification in the code led to incorrect calculation
of offline disks.

This commit saves the endpoint list in a xlObjects then we know
the name of each disk.
2020-03-04 16:18:32 -08:00
kannappanr c7ca791c58
fix: lock expiry on zoned setups (#9084)
lock ownership is limited to endpoints on first zone,
as we do not hold locks on other zones in an expanded
setup. current code unintentionally expired active locks
when it couldn't see ownership from the secondary zone
which leads to unexpected bugs as locking fails to work
as expected.
2020-03-04 16:06:17 -08:00
kannappanr d9be8bc693
Add env. variable to disable data usage crawling (#9086) 2020-03-04 15:51:03 -08:00
poornas 9fc7537f2a
Enforce md5sum checks for object retention APIs (#9030)
this PR enforces md5sum verification for following
API's to be compatible with AWS S3 spec
 - PutObjectRetention
 - PutObjectLegalHold

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-04 07:04:12 -08:00
Klaus Post f1b2462193
Add goroutine profiles (#9078)
Allow downloading goroutine dump to help detect leaks
or overuse of goroutines.

Extensions are now type dependent.

Change `profiling` -> `profile` prefix, since that is what they are 
not the abstract concept.
2020-03-04 06:58:12 -08:00
poornas c93157019f
Allow gc to run in parallel on cache drives (#9051) 2020-03-03 06:42:26 +03:00
Harshavardhana e3b44c3829
Remove partName, partETag requirement (#9044)
This is a precursor change before versioning,
removes/deprecates the requirement of remembering
partName and partETag which are not useful after
a multipart transaction has finished.

This PR reduces the overall size of the backend
JSON for large file uploads.
2020-03-03 03:29:30 +03:00
poornas 978bd4e2c4
check cacheControl not nil before access (#9055)
Fixes: #9053
2020-02-27 10:57:00 -08:00
poornas 5d25b10f72
Fix panic in StorageInfo call (#9050) 2020-02-26 15:29:50 -08:00
poornas eac02c04f7
Fix sporadic failure in TestDiskCacheMaxUse (#9049) 2020-02-26 13:31:15 -08:00
Harshavardhana 1330e59307
accessKeyId missing should return appropriate error in AssumeRole (#9048)
For a non-existent user server would return STS not initialized
```
aws --profile harsha --endpoint-url http://localhost:9000 \
      sts assume-role \
      --role-arn arn:xxx:xxx:xxx:xxxx \
      --role-session-name anything
```

instead return an appropriate error as expected by STS API

Additionally also format the `trace` output for STS APIs
2020-02-26 12:26:47 -08:00
Harshavardhana 2dd14c0b89
print version with proper indentation (#9047)
currently version is printed as

> VERSION:
> DEVELOPMENT.2020-02-26T14-30-02Z

this is what we want

> VERSION:
>   DEVELOPMENT.2020-02-26T14-30-02Z
>
2020-02-26 23:09:08 +05:30
Harshavardhana 6f66f1a910
close channel upon error in Walk()'er (#9042) 2020-02-25 19:58:58 -08:00
Harshavardhana 23a8411732
Add a generic Walk()'er to list a bucket, optinally prefix (#9026)
This generic Walk() is used by likes of Lifecyle, or
KMS to rotate keys or any other functionality which
relies on this functionality.
2020-02-25 21:22:28 +05:30
Harshavardhana ece0d4ac53
simplify recordAPIStats wrapper for ResponseWriters (#9034) 2020-02-24 09:45:32 -08:00
Harshavardhana 4c92bec619
allow rolling upgrades, remove same MinIO version requirement (#9033)
Upgrades between releases are failing due to strict
rule to avoid rolling upgrades, it is enough to
bump up APIs between versions to allow for quorum
failure and wait times. Authentication failures are
catastrophic in nature which leads to server not
be able to upgrade properly.

Fixes #9021
Fixes #8968
2020-02-24 10:32:30 +05:30
Harshavardhana dcd63b4146
fix: avoid double ListBuckets() loading object lock (#9031) 2020-02-24 06:39:11 +05:30
poornas 224b4f13b8
Add cache eviction low and high watermarks (#8958)
To allow better control the cache eviction process.

Introduce MINIO_CACHE_WATERMARK_LOW and 
MINIO_CACHE_WATERMARK_HIGH env. variables to specify 
when to stop/start cache eviction process. 

Deprecate MINIO_CACHE_EXPIRY environment variable. Cache 
gc sweeps at 30 minute intervals whenever high watermark is
reached to clear least recently accessed entries in the cache
until sufficient space is cleared to reach the low watermark.

Garbage collection uses an adaptive file scoring approach based
on last access time, with greater weights assigned to larger
objects and those with more hits to find the candidates for eviction.

Thanks to @klauspost for this file scoring algorithm

Co-authored-by: Klaus Post <klauspost@minio.io>
2020-02-23 19:03:39 +05:30
Harshavardhana 51a9d1bdb7
Avoid unnecessary allocations for XML parsing (#9017) 2020-02-23 09:06:46 +05:30
Klaus Post b2db1e96e2
Remove crawler concurrency (#9023)
Only have one crawler per disk. Removes locking, but keep
fastwalk itself able to run concurrently.
2020-02-21 20:50:16 +05:30
Harshavardhana ab7d3cd508
fix: Speed up multi-object delete by taking bulk locks (#8974)
Change distributed locking to allow taking bulk locks
across objects, reduces usually 1000 calls to 1.

Also allows for situations where multiple clients sends
delete requests to objects with following names

```
{1,2,3,4,5}
```

```
{5,4,3,2,1}
```

will block and ensure that we do not fail the request
on each other.
2020-02-21 11:29:57 +05:30
Anis Elleuch d4dcf1d722
metrics: Use StorageInfo() instead to have consistent info (#9006)
Metrics used to have its own code to calculate offline disks.
StorageInfo() was avoided because it is an expensive operation
by sending calls to all nodes.

To make metrics & server info share the same code, a new
argument `local` is added to StorageInfo() so it will only
query local disks when needed.

Metrics now calls StorageInfo() as server info handler does
but with the local flag set to false.

Co-authored-by: Praveen raj Mani <praveen@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-20 09:21:33 +05:30
poornas 02a59a04d1
Fix error messages returned by (Put)GetObjectLegalHold (#9013)
fiixing some minor discrepancies between aws s3 responses
vs minio server
2020-02-19 08:15:48 +05:30
Harshavardhana 16a6e68d7b
fix: indicate PutBucketEncryption as a valid policy action (#9009) 2020-02-18 10:32:53 -08:00
Praveen raj Mani 1b427ddb69
Support for Kafka version in the config (#9001)
Add a field for the Kafka version in the config. The user can explicitly 
set the version of the Kafka cluster.

Fixes #8768
2020-02-17 07:56:33 +05:30
Harshavardhana 712e82344c
acl: Support PUT calls with success for 'private' ACL's (#9000)
Add dummy calls which respond success when ACL's
are set to be private and fails, if user tries
to change them from their default 'private'

Some applications such as nuxeo may have an
unnecessary requirement for this operation,
we support this anyways such that don't have
to fully implement the functionality just that
we can respond with success for default ACLs
2020-02-16 11:37:52 +05:30
poornas 716a52f261
Fix hang in cache copyobject call (#8993)
Avoid GetObjectNInfo call from cache in CopyObjectHandler
- in the case of server side copy with metadata replacement,
the reader returned from cache is never consumed, but the net
effect of GetObjectNInfo from cache layer, is cache holding a
write lock to fill the cache. Subsequent stat operation on cache in
CopyObject is not able to acquire a read lock, thus causing the hang.

Fixes #8991
2020-02-13 15:32:26 -08:00
Harshavardhana d1144c2c7e
reference format obtained doesn't need further validation (#8964)
we don't need to validateFormats again once we have obtained
reference format, because it is possible that at this stage
another server is doing a disk heal during startup, once
in a while due to delays we get false positives and our
server doesn't start.

Format in quorum as reference format can be assumed as valid
and we proceed further, until and unless HealFormat re-inits
the disks after a successful heal.

Also use separate port for healing tests to avoid any
conflicts with regular build testing.

Fixes #8884
2020-02-13 14:01:41 -08:00
Harshavardhana 9ecd66007f
fix: reduce the load on CPU when loading users/policies (#8984)
Trying to be conservative by slowing ourselves down
on a regular basis.
2020-02-13 06:36:23 -08:00
Anis Elleuch 6b9805e891
fix: Avoid crash when there is an error testing a target notif (#8986)
RegisterNotificationTargets() cleans up all connections
that it makes to notification targets when an error occurs
during its execution.

However there is a typo in the code that makes the function to always
try to access to a nil pointer in the defer code since the function
in question will always return nil in the case of any error.

This commit fixes the typo in the code.
2020-02-13 11:26:23 +05:30
poornas 013773065c
Save metadata correctly in cache.json on PUT (#8985)
fixes #8979
2020-02-13 08:49:32 +05:30
Anis Elleuch 7d6766adc6
fix: erroneous high value for gateway received bytes metrics (#8978)
http.Request.ContentLength can be negative, which affects
the gateway_s3_bytes_received value in Prometheus output.

The commit only increases the value of the total received bytes
in gateway mode when r.ContentLength is greater than zero.
2020-02-12 10:15:00 +05:30
Harshavardhana c56c2f5fd3
fix routing issue for esoteric characters in gorilla/mux (#8967)
First step is to ensure that Path component is not decoded
by gorilla/mux to avoid routing issues while handling
certain characters while uploading through PutObject()

Delay the decoding and use PathUnescape() to escape
the `object` path component.

Thanks to @buengese and @ncw for neat test cases for us
to test with.

Fixes #8950
Fixes #8647
2020-02-12 09:08:02 +05:30
Nitish Tiwari 7e819d00ea
Fix Error Code for ObjectTagging Parsing (#8971)
Also add Mint tests
2020-02-11 17:42:28 -08:00
Nitish Tiwari 63be4709b7
Add metrics support for Azure & GCS Gateway (#8954)
We added support for caching and S3 related metrics in #8591. As
a continuation, it would be helpful to add support for Azure & GCS
gateway related metrics as well.
2020-02-11 21:08:01 +05:30
astorath 6b1f2fc133
fix: using correct response on get_bucket_lifecycle_configuration (#8962) 2020-02-08 16:46:59 +05:30
poornas 9b4d46a6ed
evict cached entry for server side copy (#8947)
Fixes #8942
2020-02-07 14:36:46 -08:00
Anis Elleuch 502e652b7a
fix: Avoid closing target in RegisterNotificationTargets (#8966)
This will prevent a double target Close() call when fetchLambdaInfo()
is executed (mc admin info)

This fixes a crash when mc admin info is called.
2020-02-07 14:35:56 -08:00
Nitish Tiwari 15e2ea2c96
Fix an issue where MinIO was logging every error twice (#8953)
The logging subsystem was initialized under init() method in
both gateway-main.go and server-main.go which are part of
same package. This created two logging targets and hence
errors were logged twice. This PR moves the init() method
to common-main.go
2020-02-07 13:48:07 +05:30
Klaus Post d0cea7adea
Fix stream read IO count (#8961)
Streams are returning a readcloser and returning would 
decrement io count instantly, fix it.


change maxActiveIOCount to 3, meaning it will pause
crawling if 3 operations are running.
2020-02-07 09:43:55 +05:30
Klaus Post 2165d45d3f
Time getSize and use to estimate latency (#8959)
Remove the random sleep. This is running in 4 goroutines, 
so mostly doing nothing.

We use the getSize latency to estimate system load, 
meaning when there is little load on the system and 
we get the result fast we sleep a little.

If it took a long time we have high load and release
ourselves longer.

We are sleeping inside the mutex so this affects all
goroutines doing IO.
2020-02-07 09:05:55 +05:30
Anis Elleuch 6d5d77f62c
usage typo: Fix creating .minio.sys/background-ops bucket (#8957)
Due to a typo in the code, a cluster was not correctly creating
`background-ops` in all disks and nodes print the following error:

minio3_1  | API: SYSTEM()
minio3_1  | Time: 19:32:45 UTC 02/06/2020
minio3_1  | DeploymentID: d67c20fa-4a1e-41f5-b319-7e3e90f425d8
minio3_1  | Error: Bucket not found: .minio.sys/background-ops
minio3_1  |        2: cmd/data-usage.go:109:cmd.runDataUsageInfo()
minio3_1  |        1: cmd/data-usage.go:56:cmd.runDataUsageInfoUpdateRoutine()

This commit fixes the typo.
2020-02-06 13:12:36 -08:00
Harshavardhana 49df290270 Add metadata parsing to be inside mutex to slow down (#8952)
Adding mutex slows down the crawler to avoid large
spikes in CPU, also add millisecond interval jitter
in calculation of disk usage to slow down the spikes
further.
2020-02-06 00:22:11 -08:00
Nitish Tiwari e5951e30d0
Add support for Object Tagging in LifeCycle configuration (#8880)
Fixes #8870

Co-Authored-By: Krishnan Parthasarathi <krisis@users.noreply.github.com>
2020-02-06 13:20:10 +05:30
Harshavardhana c2c5b09bb1
Avoid object names with '//' to avoid hash inconsistencies (#8946)
This is to fix a situation where an object name incorrectly
is sent with '//' in its path heirarchy, we should reject
such object names because they may be hashed to a set where
the object might not originally belong because, this can
cause situations where once object is uploaded we cannot
delete it anymore.

Fixes #8873
2020-02-06 08:29:38 +05:30
Andreas Auernhammer 086fbb745e
fix and improve KMS server info (#8944)
This commit fixes typos in the displayed server info
w.r.t. the KMS and removes the update status.

For more information about why the update status
is removed see: PR #8943
2020-02-06 06:18:34 +05:30
Andreas Auernhammer 4f37c8ccf2
refine the KMS admin API (#8943)
This commit removes the `Update` functionality
from the admin API. While this is technically
a breaking change I think this will not cause
any harm because:
 - The KMS admin API is not complete, yet.
   At the moment only the status can be fetched.
 - The `mc` integration hasn't been merged yet.
   So no `mc` client could have used this API
   in the past.

The `Update`/`Rewrap` status is not useful anymore.
It provided a way to migrate from one master key version
to another. However, KES does not support the concept of
key versions. Instead, key migration should be implemented
as migration from one master key to another.

Basically, the `Update` functionality has been implemented just
for Vault.
2020-02-05 22:47:35 +05:30
Krishnan Parthasarathi 026265f8f7
Add support for bucket encryption feature (#8890)
- pkg/bucket/encryption provides support for handling bucket 
  encryption configuration
- changes under cmd/ provide support for AES256 algorithm only

Co-Authored-By: Poorna  <poornas@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-05 15:12:34 +05:30
Anis Elleuch 52bdbcd046
Add new admin API to return Accounting Usage (#8689) 2020-02-04 18:20:39 -08:00
poornas 301c50b721
Add canned `diagnostics` policy for admin users (#8937) 2020-02-04 17:58:38 -08:00
Harshavardhana e9c111c8d0
Avoid unnecessary statPart() calls in PutObjectPart (#8905)
Assume `xl.json` as the source of truth for all operations.
2020-02-04 10:04:37 +05:30
poornas 278a165674
Allow caching based on a configurable number of hits. (#8891)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-04 09:10:01 +05:30
Anis Elleuch e934c3e2a2
usage: Fix buckets count calculation when no object is present (#8929)
XL crawling wrongly returns a zero buckets count when
there are no objects uploaded in the server yet. The reason is 
data of the crawler of posix returns invalid result when all 
disks has zero objects.

A simple fix is to always pick the crawling result of the first 
disk but choose over the result of the disk which has the most 
objects in it.
2020-02-04 06:57:47 +05:30
Harshavardhana 2d295a31de
Avoid select inside a recursive function to avoid CPU spikes (#8923)
Additionally also allow configurable go-routines
2020-02-03 16:45:59 -08:00
Harshavardhana 9bbf5cb74f
fix: Avoid re-reading bucket names from etcd (#8924)
This helps improve performance when there are
1000+ bucket entries on etcd, improves the
startup time significantly.
2020-02-03 13:54:20 +05:30
Harshavardhana 680e493065
fix a crash in base64 buffer pool (#8925)
looks like 1024 buffer size is not enough in
all situations, use 8192 instead which
can satisfy all the rare situations that
may arise in base64 decoding.
2020-02-03 08:42:32 +05:30
poornas 1ea2449269
NAS gateway: fix notification initialization (#8920)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-02 15:22:07 +05:30
Harshavardhana 7ce63b3078
fix: multi-delete API write quorum failures (#8926)
multi-delete API failed with write quorum errors
under following situations

- list of files requested for delete doesn't exist
  anymore can lead to quorum errors and failure
- due to usage of query param for paths, for really
  long paths MinIO server rejects these requests as
  malformed as unexpected.

This was reproduced with warp
2020-02-01 18:11:29 -08:00
Anis Elleuch 7432b5c9b2
Use user CAs in checkEndpoint() call (#8911)
The server info handler makes a http connection to other
nodes to check if they are up but does not load the custom
CAs in ~/.minio/certs/CAs.

This commit fix it.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-02 07:15:29 +05:30
Harshavardhana d76160c245
Initialize only one retry timer for all sub-systems (#8913)
Also make sure that we create buckets on all zones
successfully, do not run quick heal buckets if not
running with expansion.
2020-02-02 06:37:43 +05:30
poornas 5d838edcef
Fix panic in ServerInfoHandler when (#8915)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-01 17:50:04 +05:30
poornas c9116e6bd7
trace - log request body (#8917) 2020-02-01 02:39:49 -08:00
Harshavardhana d7dc9aaf52
fix: remove response header timeout (#8919)
Adding respone header timeout seems to have
premature timeout like consequences which
leads to potential disconnections.
2020-02-01 08:31:55 +05:30
Harshavardhana bfe8a9bccc
jwt: Simplify JWT parsing (#8802)
JWT parsing is simplified by using a custom claim
data structure such as MapClaims{}, also writes
a custom Unmarshaller for faster unmarshalling.

- Avoid as much reflections as possible
- Provide the right types for functions as much
  as possible
- Avoid strings.Join, strings.Split to reduce
  allocations, rely on indexes directly.
2020-01-31 08:29:22 +05:30
Klaus Post 9990464cd5
Fix recursive deep scan of buckets (#8900) 2020-01-30 17:20:07 +05:30
poornas 881e983ed9
Fix Retention, ObjectLock, LegalHold struct namespaces correctly. (#8909)
Reverts #8903 to allow structs to be unmarshalled 
even if the namespace is missing.
2020-01-30 09:58:05 +05:30
Harshavardhana f98616dce7
heal: Optimize heal listing by avoiding batches (#8901)
Also limit the heal per object if there is incoming
requests by suspending heal for longer periods of time.
2020-01-29 12:05:44 +05:30
Ashish Kumar Sinha 5bd0e95eef
Set default namespace for necessary structs (#8903) 2020-01-29 10:19:38 +05:30
Harshavardhana 0cbebf0f57 Rename pkg/{tagging,lifecycle} to pkg/bucket sub-directory (#8892)
Rename to allow for more such features to come in a more
proper hierarchical manner.
2020-01-27 14:12:34 -08:00
poornas 2232e095d5 Make admin permissions more granular for admin handlers. (#8888) 2020-01-26 20:47:52 -06:00
poornas a78e5d4763 Add missing error check in cache GetObjectNInfo (#8889) 2020-01-24 15:49:16 -08:00
Harshavardhana cf37c7997e Heal bucket only on missing drives in quorum (#8883)
MakeVol shouldn't be called in heal bucket
when bucket doesn't really exist in quorum.
2020-01-24 15:38:07 -08:00
Harshavardhana 1ffbb5c24c fix racy tests when editing xl.getDisks (#8879) 2020-01-23 11:50:09 -08:00
Harshavardhana b9c48e0ab0 fix return appropriate error for MakeBucket in federation (#8878) 2020-01-22 08:25:28 -08:00
Harshavardhana fe5d599802 fix: STS creds without "aud" should be honored with STS checks (#8868)
Fixes #8865
2020-01-22 15:09:46 +05:30
Aditya Manthramurthy 55063906b5 Fix group add/remove membership bug (#8877) 2020-01-21 19:00:41 -08:00
Klaus Post c7178d2066 Profiling: Add base, fix memory profiling (#8850)
For 'snapshot' type profiles, record a 'before' profile that can be used 
as `go tool pprof -base=before ...` to compare before and after.

"Before" profiles are included in the zipped package.

[`runtime.MemProfileRate`](https://golang.org/pkg/runtime/#pkg-variables) 
should not be updated while the application is running, so we set it at startup.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-01-21 15:49:25 -08:00
Harshavardhana f14f60a487 fix: Avoid double usage calculation on every restart (#8856)
On every restart of the server, usage was being
calculated which is not useful instead wait for
sufficient time to start the crawling routine.

This PR also avoids lots of double allocations
through strings, optimizes usage of string builders
and also avoids crawling through symbolic links.

Fixes #8844
2020-01-21 14:07:49 -08:00
Harshavardhana e2b3c083aa
fix: close and drain the response body always (#8847) 2020-01-21 02:46:58 -08:00
Harshavardhana 86252ec7e1
fix: document _ENABLE for all notification targets (#8864)
Fixes #8863
2020-01-20 16:48:19 -08:00
Nitish Tiwari 61c17c8933 Add ObjectTagging Support (#8754)
This PR adds support for AWS S3 ObjectTagging API as explained here
https://docs.aws.amazon.com/AmazonS3/latest/dev/object-tagging.html
2020-01-20 08:45:59 -08:00
Forest Lovewood dd93eee1e3 Implement bucket caching for b2 gateway (#8820)
fixes #8739 #6806
2020-01-20 22:13:38 +05:30
Harshavardhana 88286cf8d0 fix: support pre-sign signature for STS tokens (#8826)
Fixes #8391
2020-01-18 17:04:50 -08:00
Klaus Post 8cb6184f1d Fix erasure block allocation (#8851)
Small blocks are undersized when file size isn't divisible by the 
shard could leading to allocation in *reedsolomon.Split()*
2020-01-18 14:21:58 -08:00
Harshavardhana 09ee145e9c gw/hdfs: indicate hdfs gateway is production ready (#8848) 2020-01-18 07:25:03 -08:00
Harshavardhana 23e46f9dba
log formatting only the first time (#8846) 2020-01-17 15:39:07 -08:00
Harshavardhana fc5213258e
posix: Do not take disk offline on I/O errors (#8836)
Choosing maxAllowedIOError is arbitrary and
prone to errors, when drives might be perfectly
capable of taking I/O with only few locations
return I/O error. This is a hindrance of sort
where backend filesystems like ZFS can automatically
fix and handle these scenarios.

The added problem with current approach that we
take the drive offline, making it virtually impossible
to bring it online without restart the server which
is not desirable on a busy cluster. Remove this state
such that let the backend return error appropriately
to caller and let the caller decide what to do with
the error.
2020-01-17 13:34:43 -08:00
Anis Elleuch 017067e11f data-usage: Avoid crawling duplicated call (#8843)
This fix will also picks 3 and not 4 disks from a single erasure set.
2020-01-17 09:59:37 -08:00
Harshavardhana 2bb69033e5 http: fail appropriately and return standard Go error (#8837)
return http.ErrServerClosed with proper body when
server is shutting down, allowing more context instead
of just returning '503' which doesn't mean the same
thing.
2020-01-17 05:48:39 -08:00
Harshavardhana fca4ee84c9
gw/hdfs: listing should list directories properly (#8827)
Fixes #8822
2020-01-16 17:11:25 -08:00
poornas 60e60f68dd Add support for object locking with legal hold. (#8634) 2020-01-16 15:41:56 -08:00
Harshavardhana c6b218e5df
fix: readiness should return 200 OK with first zone online (#8834) 2020-01-16 13:49:25 -08:00
Anis Elleuch c18fbdb29a posix: Remove a non needed nil check in DiskInfo() (#8830)
posix.DiskInfo() returns errFaultyDisk when posix is nil,
but there is no way that this would happen any time, therefore
removing un-needed code.
2020-01-16 11:27:50 -08:00
Harshavardhana b1ad99edbf
fix: avoid crash copy map before reading (#8825)
code of this form is always racy, when the
map itself is being written to as well

```
func (r Map) retMap() map[string]string {
     .. lock ..
     return r.internalMap
}

func (r Map) addMap(k, v string) {
     .. lock ..
     r.internalMap[k] = v
}
```

Anyone reading from `retMap()` is not protected
because of locking and we need to make sure
to avoid code in this manner. Always safe to
copy the map and return.
2020-01-16 01:35:30 -08:00
Anis Elleuch 935546d5ca xl: Implement MRF healing (#8470) 2020-01-15 18:30:32 -08:00
Harshavardhana 64fde1ab95
xl/zones: return errNoHealRequired when no heal is required (#8821)
Zone abstraction of object layer was returning `nil`
incorrectly under situations where disk healing is
not required. Returning `nil` is considered as healing
successful, which leads to unexpected ReloadFormat()
peer notification calls during startup.

This PR fixes this behavior properly for zones.
2020-01-15 17:19:13 -08:00
Anis Elleuch 069876e262 xl: All nodes create meta volumes in its local disks (#8786)
Meta volumes directories, tmp/, background-ops/, etc..
undr .minio.sys are created when disks are formatted
but also when the cluster is started.

However using MakeVolBulk() is not appropriate in the
case of a user migrating from a version which does not
have .minio.sys/background-ops/. The reason is that
MakeVolBulk() exits early when an error is occured:
errVolumeExists in this case, which is expected since
some directories such as tmp/ already exist.

This commit will avoid use MakeVolBulk and use MakeVol
instead.

Also the PR will make each node creates meta volumes
in its local disks and stop relying on the first disk
since the first node could be offline.
2020-01-15 12:36:52 -08:00
Harshavardhana 442e1698cb
heal: Avoid spinning up object healing during startup (#8819)
auto-heal disks, metadata and buckets in background but
not objects, let the auto heal kick in for objects after
the cluster has been up for a while.
2020-01-15 01:08:39 -08:00
poornas d76518eeb9 Remove TestPutObjectPartDiskNotFound unit test (#8815) 2020-01-14 18:46:33 -08:00
Harshavardhana 0879a4f743 rest/storage: Remove racy LastError usage (#8817)
instead perform a liveness check call to
verify if server is online and print relevant
errors.

Also introduce a StorageErr string error type
instead of errors.New() deprecate usage of
VerifyFileError, DeleteFileError for gob,
change in datastructure also requires bump in
storage REST version to v13.

Fixes #8811
2020-01-14 18:45:17 -08:00
Harshavardhana 9be7066715
fix: Hold locks before closing all drives (#8818)
Fixes #8813
2020-01-14 17:13:58 -08:00
Klaus Post d8660b30cc Reduce MemProfileRate (#8814)
Enabling the memory profiling has a significant impact on performance.

Reduce the profiling rate by 2 orders of magnitude. It is still 128x smaller than default so it should be plenty.
2020-01-14 16:18:45 -08:00
poornas 30922148fb Fix bug preventing overwrite of object if (#8796)
object lock config is enabled for a bucket.

Creating a bucket with object lock configuration
enabled does not automatically cause WORM protection
to be applied. PUT operation needs to specifically
request object locking or bucket has to have default
retention settings configured.

Fixes regression introduced in #8657
2020-01-13 17:29:31 -08:00
Klaus Post 37b32199e3 Validate XL sets on format (#8779)
When formatting a set validate if a host failure will likely lead to data loss.

While we don't know what config will be set in the future 
evaluate to our best knowledge, assuming default settings.
2020-01-13 13:09:10 -08:00
Klaus Post 627fdfeab7 Fix Windows console printing (#8805)
Print to console which does translation and not directly to stdout.

Fixes #8804
2020-01-13 13:05:51 -08:00
poornas 9199033db7 Set X-Cache and X-Cache-Lookup headers for cache (#8794)
X-Cache sets cache status of HIT if object is
served from the disk cache, or MISS otherwise.
X-Cache-Lookup is set to HIT if object was found
in the cache even if not served (for e.g. if cache
 entry was invalidated by ETag verification)
2020-01-10 20:21:13 -08:00
Klaus Post 2bf6cf0e15 Enable multiple concurrent profile types (#8792) 2020-01-10 17:19:58 -08:00
Harshavardhana 686d4656de
fix: set appropriate defaults when new keys added (#8795)
A new key was added in identity_openid recently
required explicitly for client to set the optional
value without that it would be empty, handle this
appropriately.

Fixes #8787
2020-01-10 16:57:18 -08:00
Harshavardhana 5aa5dcdc6d
lock: improve locker initialization at init (#8776)
Use reference format to initialize lockers
during startup, also handle `nil` for NetLocker
in dsync and remove *errorLocker* implementation

Add further tuning parameters such as

 - DialTimeout is now 15 seconds from 30 seconds
 - KeepAliveTimeout is not 20 seconds, 5 seconds
   more than default 15 seconds
 - ResponseHeaderTimeout to 10 seconds
 - ExpectContinueTimeout is reduced to 3 seconds
 - DualStack is enabled by default remove setting
   it to `true`
 - Reduce IdleConnTimeout to 30 seconds from
   1 minute to avoid idleConn build up

Fixes #8773
2020-01-10 02:35:06 -08:00
Praveen raj Mani 4cd1bbb50a This PR fixes two things (#8772)
- Stop spawning store replay routines when testing the notification targets
- Properly honor the target.Close() to clean the resources used

Fixes #8707

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-01-09 19:45:44 +05:30
Harshavardhana abc1c1070a Add custom policy claim name (#8764)
In certain organizations policy claim names
can be not just 'policy' but also things like
'roles', the value of this field might also
be *string* or *[]string* support this as well

In this PR we are still not supporting multiple
policies per STS account which will require a
more comprehensive change.
2020-01-08 17:21:58 -08:00
poornas fd56aa42a6 Fix error message wording for PutObjectLockConfig (#8759)
Co-Authored-By: kannappanr <30541348+kannappanr@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-01-08 15:36:23 -08:00
Klaus Post 3d318bae76 init: Use constant time retries (#8769)
Exponential backoff does not seem like a good fit for
this function since we can expect a few roundtrips on
initial startup.

This retry loop get slow pretty quickly with initial
wait being 1 second and each try being double the
wait until 30 seconds is reached.

Instead simply try 2 times per second.
2020-01-08 13:37:34 -08:00
Harshavardhana aa2e89bfe3 Use jsoniter whenever applicable instead of encoding/json (#8766)
This PR adds jsoniter package to replace encoding/json
in places where faster json unmarshal is necessary
whenever input JSON is large enough.

Some benchmarking comparison between jsoniter and enconding/json

benchmark                            old MB/s     new MB/s     speedup
BenchmarkParseUnmarshal/N10-4        110.02       331.17       3.01x
BenchmarkParseUnmarshal/N100-4       125.74       524.09       4.17x
BenchmarkParseUnmarshal/N500-4       131.68       542.60       4.12x
BenchmarkParseUnmarshal/N1000-4      133.93       514.88       3.84x
BenchmarkParseUnmarshal/N5000-4      122.10       415.36       3.40x
BenchmarkParseUnmarshal/N10000-4     132.13       403.90       3.06x
2020-01-08 17:01:42 +05:30
Harshavardhana 60813bef29
Allow proper setCount SLAs across zones (#8752)
Fixes scenario where zones are appropriately
handled, along with supporting overriding set
count. The new fix also ensures that we handle
the various setup types properly.

Update documentation to properly indicate the
behavior.

Fixes #8750

Co-authored-by: Nitish Tiwari <nitish@minio.io>
2020-01-07 09:13:44 -08:00
Harshavardhana b123be5612 fix: browser should listBuckets from etcd in global federation (#8760) 2020-01-07 09:03:00 +05:30
Harshavardhana 933c60bc3a Add crypto context errors (#8740)
Currently when connections to vault fail, client
perpetually retries this leads to assumptions that
the server has issues and masks the problem.

Re-purpose *crypto.Error* type to send appropriate
errors back to the client.
2020-01-06 16:15:22 -08:00
ebozduman 796cca4166 Creates zipped files with correct mod times for objects (#8693) 2020-01-06 12:43:00 -08:00
Klaus Post fe379f9428 Copy metadata on update (#8755)
Fixes #8706

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-01-06 10:15:44 -08:00
Harshavardhana ae0b165431 fix: --anonymous flag shouldn't print any keys (#8753)
Fixes #8744
2020-01-06 22:12:47 +05:30
George Xie 7f31d933a8 fixes some typos, for CREDITS change (#8743) 2020-01-03 17:49:01 -08:00
Harshavardhana 6695fd6a61
Add more context aware error for policy parsing errors (#8726)
In existing functionality we simply return a generic
error such as "MalformedPolicy" which indicates just
a generic string "invalid resource" which is not very
meaningful when there might be multiple types of errors
during policy parsing. This PR ensures that we send
these errors back to client to indicate the actual
error, brings in two concrete types such as

 - iampolicy.Error
 - policy.Error

Refer #8202
2020-01-03 11:28:52 -08:00
Harshavardhana b00cda8ad4 Avoid running lock maintenance from all nodes (#8737)
Co-Authored-By: Krishnan Parthasarathi <krisis@users.noreply.github.com>
2020-01-03 23:11:07 +05:30
Anis Elleuch d861edfc00 xl: Print the correct err msg when access to the backend is forbidden (#8735)
minio server /data{1..4} shows an error about inability to bind a port, though
the real problem is /data{1..4} cannot be created because of the lack of
permissions.

This commit fix the behavior.
2020-01-03 21:15:26 +05:30
Harshavardhana cb935980a5 Fix version to be release-tag (#8730) 2020-01-02 20:18:32 +05:30
Praveen raj Mani 157721f694 Fix readiness to return 200 for read-only mode (#8728)
- We should declare a cluster ready even if read quorum is achieved (atleast n/2 disks are online).
- Such that, all the zones should have enough read quorum. Thus making the cluster ready for reads.
2020-01-02 05:05:01 -08:00
Harshavardhana 0b7bd024fb Fix dependencies graph for minio source compilation (#8717)
We had messy cyclical dependency problem with `mc`
due to dependencies in pkg/console, moved the pkg/console
to minio for more control and also to avoid any further
cyclical dependencies of `mc` clobbering up the
dependencies on server.

Fixes #8659
2019-12-31 09:36:13 +05:30
Harshavardhana 3af70b36fd Disallow creating buckets even with different domains (#8716)
If two distinct clusters are started with different domains
along with single common domain, this situation was leading
to conflicting buckets getting created on different clusters

To avoid this do not prematurely error out if the key has no
entries, let the caller decide on which entry matches and
which entry is valid. This allows support for MINIO_DOMAIN
with one common domain, but each cluster may have their own
domains.

Fixes #8705
2019-12-30 17:11:47 -08:00
Harshavardhana 669c9da85d Disable federated buckets when etcd is namespaced (#8709)
This is to ensure that when we have multiple tenants
deployed all sharing the same etcd for global bucket
should avoid listing each others buckets, this leads
to information leak which should be avoided unless
etcd is not namespaced for IAM assets in which case
it can be assumed that its a federated setup.

Federated setup and namespaced IAM assets on etcd
is not supported since namespacing is only useful
when you wish to separate the tenants as isolated
instances of MinIO.

This PR allows a new type of behavior, primarily
driven by the usecase of m3(mkube) multi-tenant
deployments with global bucket support.
2019-12-29 08:56:45 -08:00
Praveen raj Mani 5d09233115 Fix Readiness check (#8681)
- Remove goroutine-check in Readiness check
- Bring in quorum check for readiness

Fixes #8385

Co-authored-by: Harshavardhana <harsha@minio.io>
2019-12-28 22:24:43 +05:30
Anis Elleuch c31e67dcce Better error when the server is unable to write in the backend (#8697) 2019-12-25 22:05:54 -08:00
Harshavardhana 99ad445260
Avoid double for loops in notification init (#8691) 2019-12-24 13:49:48 -08:00
Harshavardhana 54431b3953 Change replica set detection for localhost on single endpoint (#8692) 2019-12-24 11:31:32 -08:00
Harshavardhana f68a7005c0 Improve disk formatting stage for large disk sets (#8690) 2019-12-23 16:31:03 -08:00
Harshavardhana 725172e13b
fix: Do not need safe-mode for unreachable targets upon restart (#8686) 2019-12-21 22:35:50 -08:00
Harshavardhana a3c8ef79a4 fix: remove extra newline from GetConfig() output (#8678) 2019-12-20 14:47:14 -08:00
Aditya Manthramurthy 01468d5a75 Fix user and policy deletion IAM commands (#8683) 2019-12-20 14:42:08 -08:00
Harshavardhana 8f1243986e
fix: listenBucket should filter events based on bucket (#8677)
Currently all bucket events are sent to all watchers
with matching prefix and event names, this becomes
problematic and prone to performance issues, fix this
situation by filtering based on buckets as well.
2019-12-20 11:45:03 -08:00
Harshavardhana 586614c73f fix: temp credentials shouldn't allow policy/group changes (#8675)
This PR fixes the issue where we might allow policy changes
for temporary credentials out of band, this situation allows
privilege escalation for those temporary credentials. We
should disallow any external actions on temporary creds
as a practice and we should clearly differentiate which
are static and which are temporary credentials.

Refer #8667
2019-12-19 14:21:21 -08:00
Harshavardhana d140074773 fix: replica set deployment for multi tenants (#8673)
Changes in IP underneath are dynamic in replica sets
with multiple tenants, so deploying in that fashion
will not work until we wait for atleast one participatory
server to be local.

This PR also ensures that multi-tenant zone expansion also
works in replica set k8s deployments.

Introduces a new ENV `KUBERNETES_REPLICA_SET` check to call
appropriate code paths.
2019-12-19 13:45:56 -08:00
Harshavardhana 39face27cf Simplify k8s replicated set deployment (#8666)
Continuation from #8629 which basically broke
zone deployments on k8s statefulset environment
due to incorrect assumptions which made it work
on replicated set.

Fix this properly such that this container works
for both replicated set and stateful set deployment
2019-12-18 17:05:24 -08:00
Andreas Auernhammer e047ac52b8 remove github.com/minio/kes as a dependency (#8665)
This commit removes github.com/minio/kes as
a dependency and implements the necessary
client-side functionality without relying
on the KES project.

This resolves the licensing issue since
KES is licensed under AGPL while MinIO
is licensed under Apache.
2019-12-18 15:10:57 -08:00
poornas 04de3ea4bd Change cache purge routine granularity to hours (#8660)
With this PR,cache eviction will continue until
no LRU entries older than an hour can be cache
evicted or sufficient percentage of disk space
has been reclaimed.
2019-12-18 13:49:10 -08:00
Amol Umbarkar e6ce9da087 fix BucketForward Handler for federated setup (#8646)
fixes #8595
2019-12-18 14:06:03 +05:30
Harshavardhana c9c0d5eec2 Allow CNAME records when specified as MINIO_PUBLIC_IPS (#8662)
This is necessary for `m3` global bucket support
2019-12-18 11:02:45 +05:30
Harshavardhana 63c3114657 fix: doc notifications formatting issues (#8661) 2019-12-17 17:34:17 -08:00
Harshavardhana 9bb0869b73
fix: populate buckets on etcd after config has loaded (#8658) 2019-12-17 13:50:07 -08:00
Harshavardhana 5f2318567e
Allow metadata updates on meta bucket even in WORM mode (#8657)
This ensures that we can update the

- .minio.sys is updated for accounting/data usage purposes
- .minio.sys is updated to indicate if backend is encrypted
  or not.
2019-12-17 10:13:12 -08:00
kannappanr 16ac4a3c64 PutBucketLifeCycleConfiguration: Return 200 instead of 204 (#8656) 2019-12-17 07:39:49 -08:00
Harshavardhana c8d82588c2 Fix crash in console logger and also handle bucket DNS updates (#8654)
Also fix listenBucketNotification bugs seen by minio-js
listen bucket notification API.
2019-12-16 20:30:57 -08:00
Harshavardhana 1dc5f2d0af
Remove safe mode for invalid entries in config (#8650)
The approach is that now safe mode is only invoked when
we cannot read the config or under some catastrophic
situations, but not under situations when config entries
are invalid or unreachable. This allows for maximum
availability for MinIO and not fail on our users unlike
most of our historical releases.
2019-12-14 17:27:57 -08:00
Harshavardhana c10ecacf91 Always use SourceIP for host target filtering (#8649) 2019-12-14 11:12:59 -08:00
poornas 1cf3e3b7b5 PutBucket: Case-insensitive validation of x-amz-bucket-object-lock-enabled (#8648)
Fix: case insensitive validation of x-amz-bucket-object-lock-enabled header in PutBucket handler
2019-12-13 15:51:28 -08:00
Andreas Auernhammer c3d4c1f584 add minio/keys KMS integration (#8631)
This commit adds support for the minio/kes KMS.
See: https://github.com/minio/kes

In particular you can configure it as KMS by:
 - `export MINIO_KMS_KES_ENDPOINT=`  // Server URL
 - `export MINIO_KMS_KES_KEY_FILE=`  // TLS client private key
 - `export MINIO_KMS_KES_CERT_FILE=` // TLS client certificate
 - `export MINIO_KMS_KES_CA_PATH=`   // Root CAs issuing server cert
 - `export MINIO_KMS_KES_KEY_NAME=`  // The name of the (default)
master key
2019-12-13 12:57:11 -08:00
Harshavardhana 471a3a650a
fix: Don't allow to set unconfigured notification ARNs (#8643)
Fixes #8642
2019-12-13 12:36:45 -08:00
Harshavardhana cc02bf0442
Remove old ListenBucketNotification API (#8645) 2019-12-13 11:33:11 -08:00
Harshavardhana 39e8e4f4aa
Allow empty target KVS for notification targets (#8644)
This is allowed with enable=off arg value
2019-12-12 17:02:14 -08:00
poornas 80558e839d Clear cache if reverting to backend (#8637)
Clear cached entry before reverting to backend for
encrypted objects or those under retention to avoid
stale objects remaining in cache.
2019-12-12 15:11:27 -08:00
Harshavardhana ca62ac65d4
Reject mandatory KVS if not set for any sub-sys (#8641) 2019-12-12 14:55:07 -08:00
Harshavardhana f5abe4e1f1
Support ListenBucketNotificationV2 streaming (#8622) 2019-12-12 10:01:23 -08:00
Klaus Post 3211cb5df6 Add encryption buffer (#8626)
Quite hard to measure difference:

```
λ warp cmp put-before.csv.zst put-after2.csv.zst
Operation: PUT
Operations: 340 -> 353
* Average: +4.11% (+22.7 MB/s) throughput, +4.11% (+0.2) obj/s
* 50% Median: +1.58% (+7.3 MB/s) throughput, +1.58% (+0.1) obj/s
```

Difference is likely bigger on Intel platforms due to higher syscall costs.
2019-12-12 10:01:15 -08:00
Ashish Kumar Sinha abc266caa1 Add bucket and object count along with total object size (#8639) 2019-12-12 09:58:59 -08:00
Harshavardhana c364f0af6c Start using custom HTTP transport for webhook endpoints (#8630)
Use a more performant http transport for webhook
endpoints with proper connection pooling, appropriate
timeouts etc.
2019-12-12 06:53:50 -08:00
Anis Elleuch 555969ee42 Add data usage collect with its new admin API (#8553)
Admin data usage info API returns the following

(Only FS & XL, for now)

- Number of buckets
- Number of objects
- The total size of objects
- Objects histogram
- Bucket sizes
2019-12-12 06:02:37 -08:00
Ashish Kumar Sinha e2c5d29017 Bucket,Object count & Usage removed if set to default (#8638) 2019-12-11 21:56:47 -08:00
Harshavardhana fa00a84709
Avoid crashes on peers if IAMSys is not initialized (#8636) 2019-12-11 20:46:57 -08:00
kannappanr d266b3a066
Admin Info: Modify Uptime to return seconds (#8635) 2019-12-11 17:56:02 -08:00
Ashish Kumar Sinha 24fb1bf258 New Admin Info (#8497) 2019-12-11 14:27:03 -08:00
Harshavardhana 8b803491af
fix: CacheOpts parsing tests (#8632) 2019-12-11 13:26:18 -08:00
Harshavardhana 10b2f15f6f Add randomize sleep times for lock checkers (#8628) 2019-12-11 10:57:05 -08:00
Harshavardhana 3e9ab5f4a9
Fix k8s replica set deployment (#8629)
In replica sets, hosts resolve to localhost
IP automatically until the deployment fully
comes up. To avoid this issue we need to
wait for such resolution.
2019-12-10 20:28:22 -08:00
Krishna Srinivas 3b67f629a4 Retry peer notification of events (#8621) 2019-12-09 05:29:37 -08:00
poornas 3c30e4503d Cache only the range requested for range GETs (#8599) 2019-12-08 13:58:04 -08:00
poornas 8390bc26db Fix cache hit metrics. (#8617) 2019-12-07 23:14:33 +05:30
Nitish Tiwari 24ad59316d
Use atomic.Uint64 for gateway metrics count instead of mutex (#8615) 2019-12-07 11:21:52 +05:30
poornas be0c8b1ec0 Add support for missing Cache-Control directives (#8619)
no-cache, only-if-cached and no-store directives are
being enforced in this PR.
2019-12-07 07:49:36 +05:30
Harshavardhana 476111968a Update help messages with new wording (#8616)
Final update to all messages across sub-systems
after final review, the only change here is that
NATS now has TLS and TLSSkipVerify to be consistent
for all other notification targets.
2019-12-06 13:53:51 -08:00
Harshavardhana 97deba2a7c GetKVS should add new keys automatically, preserve order (#8612) 2019-12-06 16:13:10 +05:30
Nitish Tiwari 3df7285c3c Add Support for Cache and S3 related metrics in Prometheus endpoint (#8591)
This PR adds support below metrics

- Cache Hit Count
- Cache Miss Count
- Data served from Cache (in Bytes)
- Bytes received from AWS S3
- Bytes sent to AWS S3
- Number of requests sent to AWS S3

Fixes #8549
2019-12-05 23:16:06 -08:00
Aleksandr Petruhin d2dc964cb5 Support TLS auth for Kafka notification target (#8609) 2019-12-05 15:31:46 -08:00
Harshavardhana d8e3de0cae Ensure comment is always a valid key (#8604)
Also fix LDAP leaky connection
2019-12-05 18:17:42 +05:30
Harshavardhana c9940d8c3f Final changes to config sub-system (#8600)
- Introduces changes such as certain types of
  errors that can be ignored or which need to 
  go into safe mode.
- Update help text as per the review
2019-12-04 15:32:37 -08:00
Harshavardhana 794eb54da8 Export command prints turned-off sub-sys as comments (#8594)
This PR also tries to

- Preserve the order of keys printed in export command
- Fix cache to be enabled with _STATE env to keep
  backward compatibility
2019-12-03 10:50:20 -08:00
Harshavardhana 2ab8d5e47f Enable build verification with race (#8583) 2019-12-02 15:54:26 -08:00
Clemens Wolff 947bc8c7d3 Update Azure Gateway to azure-storage-blob SDK (#8537)
The azure-sdk-for-go/storage package has been in maintenance-
only mode since February 2018 (see [1]) and will be deprecated in the future.
2019-12-02 09:32:19 -08:00
Harshavardhana 5d3d57c12a
Start using error wrapping with fmt.Errorf (#8588)
Use fatih/errwrap to fix all the code to use
error wrapping with fmt.Errorf()
2019-12-02 09:28:01 -08:00
Harshavardhana 0bfd20a8e3
Add client_id support for OpenID (#8579)
- One click OpenID authorization on Login page
- Add client_id help, config keys etc

Thanks to @egorkaru @ihostage for the
original work and testing.
2019-11-29 21:37:42 -08:00
Klaus Post db3dbcce3a Print goroutines when shutdown hangs (#8574) 2019-11-29 19:40:08 +05:30
Harshavardhana b21835f195 Honor DurationSeconds properly for WebIdentity (#8581)
Also cleanup code to add various constants for
verbatim strings across the code base.

Fixes #8482
2019-11-29 18:57:54 +05:30
Klaus Post c7844fb1fb posix: cache disk ID for a short while (#8564)
`*posix.getDiskID()` takes up to 30% of all CPU due to the `os.Stat` call on `GET` calls.

Before:
```
Operation: GET - Concurrency: 12
* Average: 1333.97 MB/s, 1365.99 obj/s, 1365.98 ops ended/s (4m59.975s)
* First Byte: Average: 7.801487ms, Median: 7.9974ms, Best: 1.9822ms, Worst: 110.0021ms

Aggregated, split into 299 x 1s time segments:
* Fastest: 1453.50 MB/s, 1488.38 obj/s, 1492.00 ops ended/s (1s)
* 50% Median: 1360.47 MB/s, 1393.12 obj/s, 1393.00 ops ended/s (1s)
* Slowest: 978.68 MB/s, 1002.17 obj/s, 1004.00 ops ended/s (1s)
```

After:
```
Operation: GET - Concurrency: 12
* Average: 1706.07 MB/s, 1747.02 obj/s, 1747.01 ops ended/s (4m59.985s)
* First Byte: Average: 5.797886ms, Median: 5.9959ms, Best: 996.3µs, Worst: 84.0007ms

Aggregated, split into 299 x 1s time segments:
* Fastest: 1830.03 MB/s, 1873.96 obj/s, 1872.00 ops ended/s (1s)
* 50% Median: 1735.04 MB/s, 1776.68 obj/s, 1776.00 ops ended/s (1s)
* Slowest: 994.94 MB/s, 1018.82 obj/s, 1018.00 ops ended/s (1s)
```

TLDR; `os.Stat` is not free.
2019-11-29 02:57:14 -08:00
Harshavardhana 2ff8132e2d Fix the regression introduced in #8580 2019-11-27 16:13:07 -08:00
Harshavardhana 30e80d0a86
Add ReadFrom,WriteTo helpers for server config (#8580) 2019-11-27 09:36:08 -08:00
Harshavardhana 5d65428b29
Handle localhost distributed setups properly (#8577)
Fixes an issue reported by @klauspost and @vadmeste

This PR also allows users to expand their clusters
from single node XL deployment to distributed mode.
2019-11-26 11:42:10 -08:00
Harshavardhana 78eb3b78bb
Repurpose Get/SetConfig as import/export support (#8578) 2019-11-26 10:08:25 -08:00
Harshavardhana 720442b1a2
Add lock expiry handler to expire state locks (#8562) 2019-11-25 16:39:43 -08:00
Harshavardhana e542084c37
Add etcd path prefix for all IAM assets (#8569)
Currently, we use the top-level prefix "config/"
for all our IAM assets, instead of to provide
tenant-level separation bring 'path_prefix'
to namespace the access properly.

Fixes #8567
2019-11-25 16:33:34 -08:00
poornas f931fc7bfb Fix retention enforcement in Compliance mode (#8556)
In compliance mode, the retention date can be extended with 
governance bypass permissions
2019-11-25 10:58:39 -08:00
Harshavardhana 0a56e33ce1 Preserve client sent config appropriately (#8566) 2019-11-22 13:46:05 -08:00
Harshavardhana c3771df641
Add bootstrap REST handler for verifying server config (#8550) 2019-11-22 12:45:13 -08:00
Klaus Post 890b493a2e Use random file name for write check (#8563)
Since there may be multiple writes going on concurrently
Use a random file name for the write check to avoid collisions.
2019-11-22 09:50:17 -08:00
Harshavardhana f96e902f63 Do not rely on quorum for StorageInfo() (#8557)
StorageInfo() call is supposed to give each
server/disk information independently, rely
on this appropriately so that `mc admin info server`
gets correct information all the time.
2019-11-21 22:08:41 -08:00
Sergey Morgunov 06bd1e582a Log in with OIDC not work with MINIO_DOMAIN (#8558) (#8559) 2019-11-21 17:45:15 -08:00
Harshavardhana fb43d64dc3
Fix healing on multiple zones (#8555)
It is expected in zone healing underlying
callers should return appropriate errors
2019-11-21 13:18:32 -08:00
Harshavardhana fd0fa4e5c5 Add NTP retention time (#8548) 2019-11-21 18:22:35 +05:30
Harshavardhana 4e9de58675 Avoid pointer based copy, instead use Clone() (#8547)
This PR adds functional test to test expanded
cluster syntax.
2019-11-21 17:54:51 +05:30
Harshavardhana 9565641b9b
Enhance ListObjectsV2 API to return UserDefined metadata (#8539) 2019-11-21 01:54:49 -08:00
poornas 4da68cfcfc Handle indexes correctly in DeleteMultipleObjectsHandler (#8544)
Regression from #8509 which changes objectsToDelete entry
from a list to map. This will cause index out of range
panic if object is not selected for delete.
2019-11-20 17:51:10 -08:00
Harshavardhana 5ac4b517c9
Order all keys in config (#8541)
New changes

- return default values when sub-sys is
  not configured.
- state is hidden parameter now
- remove worm mode to be saved in config
2019-11-20 15:10:24 -08:00
poornas ca96560d56 Add object retention at the per object (#8528)
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIS

This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
2019-11-20 13:18:09 -08:00
Nitish Tiwari cc1a84b62e Fix heal result item output to properly count drives and sets (#8543) 2019-11-20 10:10:26 -08:00
Harshavardhana 8392d2f510 Preserve same deploymentID on all zones (#8542) 2019-11-20 15:39:30 +05:30
Harshavardhana 347b29d059 Implement bucket expansion (#8509) 2019-11-19 17:42:27 -08:00
Harshavardhana 3a34d98db8
Initialize local nsLocker for gateway instances (#8540) 2019-11-19 16:45:35 -08:00
Harshavardhana 7cdb67680e
Add help with order of keys (#8535) 2019-11-19 13:48:13 -08:00
poornas 929951fd49 Add support for multiple admins (#8487)
Also define IAM policies for administering
MinIO server
2019-11-19 02:03:18 -08:00
Harshavardhana 13a3d17321
Do not add comments after migration (#8530)
Also filter out empty comments from being
printed.
2019-11-16 14:57:36 -08:00
Harshavardhana a8e156d6a5
Fix cache locking to use local namespace locking (#8529) 2019-11-16 13:44:28 -08:00
svistoi c9be601988 NATS TLS specify CA and client TLS authentication (#8389)
- added ability to specify CA for self-signed certificates
- added option to authenticate using client certificates
- added unit tests for nats connections
2019-11-15 09:13:23 -08:00
poornas 13e2b97ad9 Fix regression in caching on single PUT (#8526)
Regression caused by #8120
2019-11-15 15:46:27 +05:30
Ville Skyttä 95e5d7a9c3 Improve access and secret key validation error, sync with implementation (#8516) 2019-11-14 14:47:35 -08:00
Harshavardhana 32c200fe12 Fix console logger crash in gateway mode (#8525)
This PR also fixes config migration only
for credentials and region which are valid
and set.

Also fix implicit `state="on"` behavior
2019-11-14 14:19:57 -08:00
Klaus Post 1dd38750f7 Remove read-ahead for small files (#8522)
We should only read ahead if we are reading big files. We enable it for files >= 16MB.

Benchmark on 64KB objects.

Before:

```
Operation: GET
Errors: 0
Average: 59.976s, 87.13 MB/s, 1394.07 ops ended/s.
Fastest: 1s, 90.99 MB/s, 1455.00 ops ended/s.
50% Median: 1s, 87.53 MB/s, 1401.00 ops ended/s.
Slowest: 1s, 81.39 MB/s, 1301.00 ops ended/s.
```

After:

```
Operation: GET
Errors: 0
Average: 59.992s, 207.99 MB/s, 3327.85 ops ended/s.
Fastest: 1s, 219.20 MB/s, 3507.00 ops ended/s.
50% Median: 1s, 210.54 MB/s, 3368.00 ops ended/s.
Slowest: 1s, 179.14 MB/s, 2865.00 ops ended/s.
```

The 64KB buffer is actually a small disadvantage for this case, but I believe it will be better in general than no buffer.
2019-11-14 12:58:41 -08:00
Harshavardhana 26a866a202
Fix review comments and new changes in config (#8515)
- Migrate and save only settings which are enabled
- Rename logger_http to logger_webhook and
  logger_http_audit to audit_webhook
- No more pretty printing comments, comment
  is a key=value pair now.
- Avoid quotes on values which do not have space in them
- `state="on"` is implicit for all SetConfigKV unless
  specified explicitly as `state="off"`
- Disabled IAM users should be disabled always
2019-11-13 17:38:05 -08:00
Anis Elleuch 60690a7e1d fs: Fix setting new deployment ID in format when not present (#8517)
The code does not properly set a new deployemnt ID when not present
in format.json: it loops twice without releasing write lock on format.json
causing an infinite locking error on the same file.

This commit fixes and simplifies a little the code.
2019-11-13 12:18:23 -08:00
Harshavardhana e9b2bf00ad Support MinIO to be deployed on more than 32 nodes (#8492)
This PR implements locking from a global entity into
a more localized set level entity, allowing for locks
to be held only on the resources which are writing
to a collection of disks rather than a global level.

In this process this PR also removes the top-level
limit of 32 nodes to an unlimited number of nodes. This
is a precursor change before bring in bucket expansion.
2019-11-13 12:17:45 -08:00
Harshavardhana 069b8ee8ff Add restrictions of object retention to AWS S3 limits (#8514)
This PR also fixes issues related

 - Peer notification handler was missing "/"
 - Missing prometheus metrics for retention APIs
2019-11-13 08:21:41 -08:00
Bala FA fb48ca5020 Add Get/Put Bucket Lock Configuration API support (#8120)
This feature implements [PUT Bucket object lock configuration][1] and
[GET Bucket object lock configuration][2]. After object lock
configuration is set, existing and new objects are set to WORM for
specified duration. Currently Governance mode works exactly like
Compliance mode.

Fixes #8101

[1] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUTObjectLockConfiguration.html
[2] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGETObjectLockConfiguration.html
2019-11-12 14:50:18 -08:00
Harshavardhana 2dad14974e
Handle port as json.Number for DNS records in etcd (#8513) 2019-11-12 14:42:35 -08:00
Harshavardhana d97d53bddc
Honor etcd legacy v/s new config settings properly (#8510)
This PR also fixes issues related to

- Add proper newline for `mc admin config get` output
  for more than one targets
- Fixes issue of temporary user credentials to have
  consistent output
- Fixes a crash when setting a key with empty values
- Fixes a parsing issue with `mc admin config history`
- Fixes gateway ENV handling for etcd server and gateway
2019-11-12 03:16:25 -08:00
Harshavardhana 1027afa853
Indicate backend encrypted only if encryption is requested (#8508) 2019-11-11 18:42:10 -08:00
Harshavardhana aa04f97f95 Config migration should handle plain-text (#8506)
This PR fixes issues found in config migration

 - StorageClass migration error when rrs is empty
 - Plain-text migration of older config
 - Do not run in safe mode with incorrect credentials
 - Update logger_http documentation for _STATE env

Refer more reported issues at #8434
2019-11-11 12:01:21 -08:00
Kanagaraj M 4082764d48 fix loading config from openid config_url (#8503) 2019-11-11 09:31:46 -08:00
Harshavardhana 822eb5ddc7 Bring in safe mode support (#8478)
This PR refactors object layer handling such
that upon failure in sub-system initialization
server reaches a stage of safe-mode operation
wherein only certain API operations are enabled
and available.

This allows for fixing many scenarios such as

 - incorrect configuration in vault, etcd,
   notification targets
 - missing files, incomplete config migrations
   unable to read encrypted content etc
 - any other issues related to notification,
   policies, lifecycle etc
2019-11-09 09:27:23 -08:00
Harshavardhana 26863009c0 Load certs even if they are symlinks (#8494) 2019-11-08 11:59:20 +05:30
Harshavardhana 1e457dda7e Enhance config restore to carry previous set content as well (#8483)
This PR brings support for `history` list to
list in the following agreed format

```
~ mc admin config history list -n 2 myminio
RestoreId: df0ebb1e-69b0-4043-b9dd-ab54508f2897
Date: Mon, 04 Nov 2019 17:27:27 GMT

region name="us-east-1" state="on"
region name="us-east-1" state="on"
region name="us-east-1" state="on"
region name="us-east-1" state="on"

RestoreId: ecc6873a-0ed3-41f9-b03e-a2a1bab48b5f
Date: Mon, 04 Nov 2019 17:28:23 GMT

region name=us-east-1 state=off
```

This PR also moves the help templating and coloring to
fully `mc` side instead than `madmin` API.
2019-11-05 06:18:26 -08:00
Anis Elleuch 26ed9e81b1 lifecycle: Send delete notification when deleting objects (#8416) 2019-11-04 15:52:03 -08:00
Harshavardhana 4e63e0e372 Return appropriate errors API versions changes across REST APIs (#8480)
This PR adds code to appropriately handle versioning issues
that come up quite constantly across our API changes. Currently
we were also routing our requests wrong which sort of made it
harder to write a consistent error handling code to appropriately
reject or honor requests.

This PR potentially fixes issues

 - old mc is used against new minio release which is incompatible
   returns an appropriate for client action.
 - any older servers talking to each other, report appropriate error
 - incompatible peer servers should report error and reject the calls
   with appropriate error
2019-11-04 09:30:59 -08:00
Harshavardhana 07a556a10b Avoid ListBuckets() call instead rely on simple HTTP GET (#8475)
This is to avoid making calls to backend and requiring
gateways to allow permissions for ListBuckets() operation
just for Liveness checks, we can avoid this and make
our liveness checks to be more performant.
2019-11-01 16:58:10 -07:00
Harshavardhana d28bcb4f84 Migrate all backend at .minio.sys/config to encrypted backend (#8474)
- Supports migrating only when the credential ENVs are set,
  so any FS mode deployments which do not have ENVs set will
  continue to remain as is.
- Credential ENVs can be rotated using MINIO_ACCESS_KEY_OLD
  and MINIO_SECRET_KEY_OLD envs, in such scenarios it allowed
  to rotate the encrypted content to a new admin key.
2019-11-01 15:53:16 -07:00
Praveen raj Mani fa325665b1 Do not append the endpoint for fs/xl disks in StorageInfo (#8472) 2019-10-31 09:13:54 -07:00
Andreas Auernhammer eac518b178 admin API: change returned HTTP error in hardware info (#8471)
This commit replaces the returned error message by
the hardware info handler from `Method-Not-Allowed`
to `Bad-Request` since the current HTTP error is not
correct according to the HTTP spec.

In particular:
```
The origin server MUST generate an Allow header field
in a 405 response containing a list of the target
resource's currently supported methods.
```
From: https://tools.ietf.org/html/rfc7231#section-6.5.5
2019-10-30 23:41:18 -07:00
Harshavardhana 9e7a3e6adc Extend further validation of config values (#8469)
- This PR allows config KVS to be validated properly
  without being affected by ENV overrides, rejects
  invalid values during set operation

- Expands unit tests and refactors the error handling
  for notification targets, returns error instead of
  ignoring targets for invalid KVS

- Does all the prep-work for implementing safe-mode
  style operation for MinIO server, introduces a new
  global variable to toggle safe mode based operations
  NOTE: this PR itself doesn't provide safe mode operations
2019-10-30 23:39:09 -07:00
Harshavardhana 599aae5ba6 Move all List calls to honor new maxObjectList limit (#8459) 2019-10-30 13:20:01 -07:00
Anis Elleuch 8cc5ecec23 xl: Fix locking in xl HealObject (#8455)
Move locking to the correct location, before loading
object data.
2019-10-30 11:40:57 -07:00
Harshavardhana 47b13cdb80 Add etcd part of config support, add noColor/json support (#8439)
- Add color/json mode support for get/help commands
- Support ENV help for all sub-systems
- Add support for etcd as part of config
2019-10-30 00:04:39 -07:00
Harshavardhana 1f481c0967 Return appropriate error if user,group,policy doesn't exist (#8465)
Fixes https://github.com/minio/mc/issues/2944
2019-10-29 19:50:26 -07:00
Anis Elleuch 4cec0501ca heal: Remove daily sweeper code (#8462)
This has no effect on the functional change of the server
2019-10-29 14:13:05 -07:00
cc 1b6de05a51 refine NextMarker comments (#8450) 2019-10-28 13:18:12 -07:00
Harshavardhana a2825702f8
Increase maximum 1000 List keys to 10000 (#8444) 2019-10-28 10:36:15 -07:00
Anis Elleuch a49d4a9cb2 xl: Rewrite auto-healing and implement auto new-disk healer (#8114)
The new auto healing model selects one node always responsible
for auto-healing the whole cluster, erasure set by erasure set.
If that node dies, another node will be elected as a leading
operator to perform healing.

This code also adds a goroutine which checks each 10 minutes
if there are any new unformatted disks and performs its healing
in that case, only the erasure set which has the new disk will
be healed.
2019-10-28 10:27:49 -07:00
Nitish Tiwari 10b526ed86 Fix panic during trace requests (#8448)
While Tracing requests on server, type assertion on logger.ResponseWriter
caused nil pointer exception because of recordAPIStats{} being
used as ResponseWriter. This PR avoids the type assertion and
initializes a new logger.ResponseWriter.

Fixes regression introduced in #8003
2019-10-27 08:49:16 -07:00
Krishna Srinivas 980bf78b4d Detect underlying disk mount/unmount (#8408) 2019-10-25 10:37:53 -07:00
Harshavardhana 8aaaa46be9 Fix typo in prometheus getBucketLocation metrics (#8438) 2019-10-23 18:54:22 -07:00
Harshavardhana ee4a6a823d Migrate config to KV data format (#8392)
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support MinIO browser to support proper claims, using STS tokens
- env support for all config parameters, legacy envs are also
  supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
  - ClearHistory
  - RestoreHistory
  - ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
  fixes encountered during testing.
2019-10-22 22:59:13 -07:00
Praveen raj Mani 8836d57e3c The prometheus metrics refractoring (#8003)
The measures are consolidated to the following metrics

- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.

And this PR also modifies the current StorageInfo queries

- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.

NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR

Fixes #7873
2019-10-22 21:01:14 -07:00
poornas f01d53b20f cache: do not evict entry on ErrClosedPipe (#8432)
Fixes: #8431. If client prematurely closes the read end of the pipe,
cache entry should not be evicted.
2019-10-22 15:04:25 -07:00
Harshavardhana 40fcd3dc48 Deprecate listDirFactory in HealObjects, rely on ListObjectsHeal (#8419) 2019-10-22 03:13:04 +05:30
poornas 1b74ce3924 Ensure actual object size is sent in notification (#8418)
Fixes: #8407
2019-10-20 23:48:19 -07:00
Anis Elleuch 7bf093c06a xl: Fix isObject() to consider not found disks (#8411)
xl.isObject() returns 'nil' for not found disks when
calculating the existance of xl.json for a given object,
which what StatFile() is also doing (setting nil) if
xl.json exists.

This commit avoids this confusion by setting errDiskNotFound
error when the storage disk is not found.
2019-10-18 23:19:11 +05:30
Kaan Kabalak 140a7eadb4 Fix browser showing compressed instead of actual object size (#8412)
Fixes #8930
2019-10-18 18:21:52 +05:30
Harshavardhana fce2d6ddd1
Remote update should be on by default (#8413)
Fixes a regression introduced in PR #8351
2019-10-17 20:54:02 -07:00
Ashish Kumar Sinha 18cb15559d Add network hardware info (#8358)
peerRESTVersion changed to v6
2019-10-17 04:09:49 -07:00
poornas 3adc311c1c Fix regression in admin router when no route matches (#8409)
When `mc admin user add` is attempted in gateway mode without
etcd setup, NoSuchBucket error is returned instead of MethodNotAllowed.
Regression from commit - e48005ddc7
2019-10-16 20:39:23 -07:00
Anis Elleuch ee05280721 fs: Remove stale background append temporary file (#8404)
Background append creates a temporary file which appends
uploaded parts as long as they are available, but when a
client stops the upload, the temporary file is not removed
by any way.

This commit removes the temporary file when the server does
its regular removing stale multipart uploads.
2019-10-17 00:27:52 +05:30
poornas c4e2af8ca3 Remove cache env from server help message (#8405) 2019-10-16 23:22:57 +05:30
Harshavardhana 5afb1b6747
Add support for {jwt:sub} substitutions for policies (#8393)
Fixes #8345
2019-10-16 08:59:59 -07:00
Harshavardhana f2cc97a44c
Improve MQTT token registration retry (#8397) 2019-10-15 19:39:14 -07:00
Harshavardhana d48fd6fde9
Remove unusued params and functions (#8399) 2019-10-15 18:35:41 -07:00
Harshavardhana 68a519a468
Use errgroups instead of sync.WaitGroup as needed (#8354) 2019-10-14 09:44:51 -07:00
poornas d7060c4c32 Allow logging targets to be configured to receive `minio` (#8347)
specific errors, `application` errors or `all` by default.

console logging on server by default lists all logs -
enhance admin console API to accept `type` as query parameter to
subscribe to application/minio logs.
2019-10-11 18:50:54 -07:00
Harshavardhana bd10640846 Fix logger and audit http endpoint config lookup 2019-10-11 18:33:25 -07:00
Harshavardhana 175b07d6e4
Fix queueStore stops working with concurrent PUT/DELETE requests (#8381)
- This PR fixes situation to avoid underflow, this is possible
  because of disconnected operations in replay/sendEvents
- Hold right locks if Del() operation is performed in Get()
- Remove panic in the code and use loggerOnce
- Remove Timer and instead use Ticker instead for proper ticks
2019-10-11 17:46:03 -07:00
Ashish Kumar Sinha 1c90485b56 Remove duplicate cpu hardware info (#8384) 2019-10-12 00:15:43 +05:30
Harshavardhana 36e12a6038 Assume local endpoints appropriately in k8s deployments (#8375)
On Kubernetes/Docker setups DNS resolves inappropriately
sometimes where there are situations same endpoints with
multiple disks come online indicating either one of them
is local and some of them are not local. This situation
can never happen and its only a possibility in orchestrated
deployments with dynamic DNS. Following code ensures that we
treat if one of the endpoint says its local for a given host
it is true for all endpoints for the same host. Following code
ensures that this assumption is true and it works in all
scenarios and it is safe to assume for a given host.

This PR also adds validation such that we do not crash the
server if there are bugs in the endpoints list in dsync
initialization.

Thanks to Daniel Valdivia <hola@danielvaldivia.com> for
reproducing this, this fix is needed as part of the
https://github.com/minio/m3 project.
2019-10-10 10:14:17 +05:30
Harshavardhana 6a4ef2e48e Initialize configs correctly, move notification config (#8367)
This PR also removes deprecated tests, adds checks
to avoid races reproduced on CI/CD.
2019-10-09 11:41:15 +05:30
Harshavardhana d2a8be6fc2 gateway/hdfs: Fix isObjectDir to behave correctly (#8368) 2019-10-09 04:20:43 +05:30
Harshavardhana 290ad0996f Move etcd, logger, crypto into their own packages (#8366)
- Deprecates _MINIO_PROFILER, `mc admin profile` does the job
- Move ENVs to common location in cmd/config/
2019-10-08 11:17:56 +05:30
Harshavardhana 3b8adf7528 Move storageclass config handling into cmd/config/storageclass (#8360)
Continuation of the changes done in PR #8351 to refactor,
add tests and move global handling into a more idiomatic
style for Go as packages.
2019-10-07 11:20:24 +05:30
Harshavardhana e85df07518 Add prometheus auth-type to turn-off authentication (#8356)
Also this PR moves the original doc from cookbook to
MinIO repo under docs/metrics/prometheus/

Fixes #8323
2019-10-04 23:48:59 +05:30
Harshavardhana 589e32a4ed Refactor config and split them in packages (#8351)
This change is related to larger config migration PR
change, this is a first stage change to move our
configs to `cmd/config/` - divided into its subsystems
2019-10-04 23:05:33 +05:30
Ashish Kumar Sinha 74008446fe CPU hardware info (#8187) 2019-10-03 20:18:38 +05:30
Praveen raj Mani e48005ddc7 Add more context to rpc version mismatch errors (#8271)
Fixes #5665
2019-10-03 00:08:12 -07:00
Harshavardhana 90bfa6260a Fix LDAP TLS support to use custom CAs (#8352) 2019-10-03 01:44:57 +05:30
Harshavardhana 8b80eca184 List buckets only once per sub-system initialization (#8333)
Current master repeatedly calls ListBuckets() during
initialization of multiple sub-systems

Use single ListBuckets() call for each sub-system as
follows

- LifeCycle
- Policy
- Notification
2019-10-02 05:35:02 +05:30
Harshavardhana fb1374f2f7 Rename iam/validator -> iam/openid and add tests (#8340)
Refactor as part of config migration
2019-10-02 03:37:20 +05:30
Harshavardhana ff5bf51952 admin/heal: Fix deep healing to heal objects under more conditions (#8321)
- Heal if the part.1 is truncated from its original size
- Heal if the part.1 fails while being verified in between
- Heal if the part.1 fails while being at a certain offset

Other cleanups include make sure to flush the HTTP responses
properly from storage-rest-server, avoid using 'defer' to
improve call latency. 'defer' incurs latency avoid them
in our hot-paths such as storage-rest handlers.

Fixes #8319
2019-10-02 01:42:15 +05:30
Anis Elleuch 61927d228c listV2: Continuation and NextContinuation tokens are encoded with base64 (#8337)
Minio V2 listing uses object names/prefixes as continuation tokens. This
is problematic when object names contain some characters that are forbidden
in XML documents. This PR will use base64 encoded form of continuation
and next continuation tokens to address that corner case.
2019-10-02 01:39:29 +05:30
Yao Zongyou 6a19d7b25a skip checking error also on Mac in TestCheckPortAvailability (#8343) 2019-10-01 23:12:18 +05:30
Harshavardhana f45977d371
Fix error handling in DeleteFileBulk storage handler (#8327)
errors.errorString() cannot be marshalled by gob
encoder, so using a slice of []error would fail
to be encoded. This leads to no errors being
generated instead gob.Decoder on the storage-client
would see an io.EOF

To avoid such bugs introduce a typed error for
handling such translations and register this type
for gob encoding support.
2019-09-30 19:01:28 -07:00
Harshavardhana 4ec9b349d0
azure: Fix upload corruption with PutObject() on certain sizes (#8330)
On objects bigger than 100MiB can have a corrupted object
stored due to partial blockListing attempted right after
each blocks uploaded. Simplify this code to ensure that
all the blocks successfully uploaded are committed right
away.

This PR also updates the azure-sdk-go to latest release.
2019-09-30 18:42:18 -07:00
poornas 5c2af3f792 Add more context to error messages in STS handlers(#8304) 2019-10-01 02:35:19 +05:30
Ashish Kumar Sinha fa5a1cebd9 support space character in access key (#8335) 2019-10-01 02:25:37 +05:30
Harshavardhana 127641731a
Parallelize initialization of storageDisks (#8288) 2019-09-27 16:47:12 -07:00
Harshavardhana 4155f4e49b
trace: Print either Transfer-Encoding or Content-Length (#8314)
If Transfer-Encoding is set client would have
never set Content-Length as its considered
malformed HTTP request
2019-09-27 10:19:27 -07:00
Bala FA 2a2ff96ee1 change `ReadPerf` into `ReadThroughput` in NetPerfInfo. (#8316)
Previously `ReadPerf` was in time.Duration is changed to `ReadThroughput` in uint64.
2019-09-27 00:01:18 +05:30
Harshavardhana fd53057654 Add InfoCannedPolicy API to fetch only necessary policy (#8307)
This PR adds
- InfoCannedPolicy() API for efficiency in fetching policies
- Send group memberships for LDAPUser if available
2019-09-26 23:53:13 +05:30
Klaus Post ff726969aa Switch to Snappy -> S2 compression (#8189) 2019-09-25 23:08:24 -07:00
Harshavardhana c8fbc94329
Fix writing 'format.json' and make it atomic (#8296)
- Choose a unique uuid such that under situations of duplicate
  mounts we do not append to an existing json entry.
- Avoid AppendFile instead use WriteAll() to write the entire
  byte array atomically.
2019-09-24 18:47:26 -07:00
Anis Elleuch a790877c01 s3: Encode continuation & next continuation tokens when asked (#8292)
When url encoding is passed in v2 listing handler, continuationToken
and nextContinuationToken needs to be encoded. The reason is that
both represents an object name/prefix in Minio server and it could
contain a character unsupported by XML specification.
2019-09-24 05:30:53 +05:30
Harshavardhana 77dc2031a2 Fix LDAP responseXML to be named appropriately (#8285)
This PR additionally also adds support for missing

- Session policy support for AD/LDAP
- Add API request/response parameters detail
- Update example to take ldap username,
  password input from the command line
- Fixes session policy handling for
  ClientGrants and WebIdentity
2019-09-24 03:51:16 +05:30
Harshavardhana 975134e42b
Add checks in DiskInfo() to protect against changing mounts (#8286) 2019-09-23 15:16:55 -07:00
Andreas Auernhammer cb7d23cb17 remove SSE-S3 key rotation in CopyObject (#8278)
This commit removes the SSE-S3 key rotation functionality
from CopyObject since there will be a dedicated Admin-API
for this purpose.

Also update the security documentation to link to mc and
the admin documentation.
2019-09-24 02:05:04 +05:30
poornas 2e02e1889b Cleanup ResponseWriter function for audit and trace (#8283) 2019-09-24 02:04:28 +05:30
ebozduman dbf7b1e573 starts-with policy condition support issue (#7937) 2019-09-22 14:20:49 -07:00
Harshavardhana 26985ac632 Fix all failing tests with -race 2019-09-22 11:01:46 -07:00
Praveen raj Mani ad75683bde Authorize prometheus endpoint with bearer token (#7640) 2019-09-22 20:27:12 +05:30
poornas 4925bc3e80 log server startup messages to admin console api (#8264) 2019-09-22 13:54:32 +05:30
Andreas Auernhammer ffded5a930 make the crypto error type a native go type (#8267)
This commit makes the `crypto.Error` type a native go (string)
type. That allows us to define error values as constants instead
of variables.

For reference see:
 - https://twitter.com/_aead_/status/1118170258215514115?s=20
 - https://dave.cheney.net/2016/04/07/constant-errors
2019-09-22 01:12:51 -07:00
Andreas Auernhammer 2b51fe9f26 make SSE request header check comprehensive (#8276)
This commit refactors the SSE header check
by moving it into the `crypto` package, adds
a unit test for it and makes the check comprehensive.
2019-09-21 03:26:12 +05:30
Harshavardhana 4780fa5a58 Remove setting net.Conn Deadlines as its not needed anymore (#8269)
This commit fixes a bug introduced in af6c6a2b35.

Setting deadlines in Go results in arbitrary hangs as reported here
https://github.com/golang/go/issues/34385

Fixes https://github.com/minio/minio/issues/7852
2019-09-20 23:37:24 +05:30
Andreas Auernhammer b823d6d7bd remove the unused code for decrypting `io.Writer` (#8277)
This commit removes unused code for decrypting
`io.Writer` since the actual implementation only
decrypts `io.Reader`
2019-09-20 14:51:07 +05:30
Andreas Auernhammer a9d724120f remove TLS 1.3 opt-in code (#8275)
This commit removes the TLS 1.3 opt-in code.
Since TLS 1.3 is opt-out for >= Go 1.13 this
code is not needed anymore.
2019-09-20 01:51:44 +05:30
Andreas Auernhammer e34369c860 prepare SSE-S3 metadata parsing for K/V data key store (#8259)
This commit allows the MinIO server to parse the metadata if:
 - either the `X-Minio-Internal-Server-Side-Encryption-S3-Key-Id`
   and the `X-Minio-Internal-Server-Side-Encryption-S3-Kms-Sealed-Key`
   entries are present.
 - or *both* headers are not present.

This is in service to support a K/V data key store.
2019-09-19 04:08:09 +05:30
Praveen raj Mani 456ce4cc92 Add rootCAs support to Kafka & MQTT (#8236)
Fixes #8211
2019-09-18 23:43:04 +05:30
Harshavardhana cb01516a26 In HDFS gateway fix non-empty folder behavior (#8254)
To be compatible with our FS and Erasure coded
mode deployments, make sure that we do not send
200 OK for folders which have files inside.

Fixes #8143
2019-09-18 01:59:59 +05:30
Krishnan Parthasarathi 31bee6b6ed Remove size query parameter from PerfInfo handler (#8258) 2019-09-18 01:59:12 +05:30
poornas 04b92124c5 fs/xl: Log warning if cache config specified (#8251)
in non-gateway mode.
2019-09-16 19:55:52 -07:00
Harshavardhana 5392eee250 Avoid recursion and use a simple loop to merge entries (#8239)
This avoids stack overflows when there are
lot of entries to be skipped, this PR also
optimizes the code to reuse the buffers.
2019-09-17 06:08:37 +05:30
Harshavardhana 14b137aa66 posix/readDir should populate name for DT_UKNOWN (#8240)
In commit a8296445ad we changed the code to handle
some corner cases on ARM and other platforms, this
PR just avoids the return for unknown filetypes
prematurely and let the name be populated appropriately.

This fixes bug for older XFS implementations such as
in Ubuntu 14.04
2019-09-17 03:04:01 +05:30
Andreas Auernhammer 3064da7b08 return error during part listing when no quorum (#8241)
This commit fixes a subtle bug that (probably)
caused an issue affecting encrypted multipart objects.

When a cluster has no quorum this bug causes `ListObjectParts`
to return nil as error instead of a quorum error.

Thanks to @harshavardhana for detecting this.
2019-09-17 02:57:34 +05:30
poornas 76df027264 Allow caching only in gateway mode. (#8232)
This PR changes cache on PUT behavior to background fill the cache
after PutObject completes. This will avoid concurrency issues as in #8219.

Added cleanup of partially filled cache to prevent cache corruption
- Fixes #8208
2019-09-17 02:54:04 +05:30
Harshavardhana 9ac12cf898
Remove unusued Set/GetConfigKeys API (#8235) 2019-09-13 16:34:34 -07:00
Harshavardhana e7f491a14b Use optimized sha256-simd whenever possible (#8227)
Avoid using `crypto/sha256` and use always
`github.com/minio/sha256-simd`
2019-09-14 00:39:39 +05:30
Praveen raj Mani 8700945cdf Handle connection failures on webhook/url pings (#8204)
Properly handle connection failures while replaying events

Fixes #8194
2019-09-12 16:44:51 -07:00
Harshavardhana ff6aabd9c0 Honor standard HTTP headers for sourceIP (#8233)
Behind load balancers we should be tracing sourceIP
preserved by load balancers.
2019-09-13 03:59:59 +05:30
Krishnan Parthasarathi 6ba323b009 Add ability to test drive speeds on a MinIO setup (#7664)
- Extends existing Admin API to measure disk performance
2019-09-13 03:22:30 +05:30
Anis Elleuch e7b3f39064 xl: Fix verifying non streaming highway algo with a dist setup (#8230)
VerifyFile in the distributed setup does not work with
the non streaming highway hash. The reason is that the
internode mux router did not expect `storageRESTBitrotHash`
parameter.
2019-09-12 13:08:02 -07:00
Harshavardhana 9fa727d154 Provide a friendlier error when an update fails (#8228)
Add upgrading documentation as well
2019-09-13 01:33:42 +05:30
Harshavardhana 73e4e99942 Hosts should be skipped, when calculating local info (#8191)
endpoint.IsLocal will not have .Host entries so
using them to skip double entries will never work.

change the code such that we look for endpoint.Host
outside of endpoint.IsLocal logic to skip double
hosts appropriately.

Move these functions to their appropriate file.
2019-09-12 23:36:12 +05:30
Alex Pardoe a87fc7d09b Use the B2 'list' endpoint to determine file ID (#8169)
- More effective deletion and checking for existence.
- Rever Dockerfile.
- Add a 'GOPROXY' to the Dockerfile to workaround Apache issues.
2019-09-12 22:48:47 +05:30
Harshavardhana 475df52a19 Fix etcd watch regression in IAM subsystem (#8224)
Fixes #8223
2019-09-12 07:24:25 +05:30
Anis Elleuch 3f258062d8 bitrot: Verify file size inside storage interface (#7932) 2019-09-12 02:19:53 +05:30
Harshavardhana 53e4887e02 Simplify and cleanup metadata r/w functions (#8146) 2019-09-11 22:52:12 +05:30
Harshavardhana a7be313230 Start using new errors package (#8207) 2019-09-11 22:51:43 +05:30
Harshavardhana e12f52e2c6 Enhancements to daily-sweeper routine to reduce CPU load (#8209)
- ListObjectsHeal should list only objects
  which need healing, not the entire namespace.
- DeleteObjects() to be used to delete 1000s of
  objects in bulk instead of serially.
2019-09-11 00:38:44 +05:30
Aditya Manthramurthy a0456ce940 LDAP STS API (#8091)
Add LDAP based users-groups system

This change adds support to integrate an LDAP server for user
authentication. This works via a custom STS API for LDAP. Each user
accessing the MinIO who can be authenticated via LDAP receives
temporary credentials to access the MinIO server.

LDAP is enabled only over TLS.

User groups are also supported via LDAP. The administrator may
configure an LDAP search query to find the group attribute of a user -
this may correspond to any attribute in the LDAP tree (that the user
has access to view). One or more groups may be returned by such a
query.

A group is mapped to an IAM policy in the usual way, and the server
enforces a policy corresponding to all the groups and the user's own
mapped policy.

When LDAP is configured, the internal MinIO users system is disabled.
2019-09-10 04:42:29 +05:30
Harshavardhana b52a3e523c Avoid using fastjson parser pool, move back to jsoniter (#8190)
It looks like from implementation point of view fastjson
parser pool doesn't behave the same way as expected
when dealing many `xl.json` from multiple disks.

The fastjson parser pool usage ends up returning incorrect
xl.json entries for checksums, with references pointing
to older entries. This led to the subtle bug where checksum
info is duplicated from a previous xl.json read of a different
file from different disk.
2019-09-06 04:21:27 +05:30
poornas 259a5d825b cache - fix corruption when client prematurely terminates request (#8155) 2019-09-05 23:33:32 +05:30
poornas 29f64355ce Allow caching on single PutObject (#8100) 2019-09-05 19:50:16 +05:30
Nitish Tiwari 496fba3e9a
Return 200 OK for liveness checks while distributed cluster starts (#8176)
With this PR, liveness check responds with 200 OK with "server-not-
initialized" header while objectLayer gets initialized. The header
is removed as objectLayer is initialized. This is to allow
MinIO distributed cluster to get started when running on an
orchestration platforms like Docker Swarm.

This PR also updates sample Swarm yaml files to use correct values
for healthcheck fields.

Fixes #8140
2019-09-05 14:50:56 +05:30
Andreas Auernhammer 810a44e951 KMS Admin-API: add route and handler for KMS key info (#7955)
This commit adds an admin API route and handler for
requesting status information about a KMS key.

Therefore, the client specifies the KMS key ID (when
empty / not set the server takes the currently configured
default key-ID) and the server tries to perform a dummy encryption,
re-wrap and decryption operation. If all three succeed we know that
the server can access the KMS and has permissions to generate, re-wrap
and decrypt data keys (policy is set correctly).
2019-09-05 01:49:44 +05:30
Praveen raj Mani 341d61e3d8 Fix for web-uploads in federated mode (#8175)
Fixes #8173
2019-09-04 23:14:02 +05:30
poornas 8a71b0ec5a Add admin API to send console log messages (#7784)
Utilized by mc admin console command.
2019-09-03 23:40:48 +05:30
Anis Elleuch b3c19e2d4b storage: Expect empty param in REST requests (#8167)
Empty parameter was forgotten to be added to restQueries() function,
scanning with deep parameter wasn't working properly for distributed
setup.
2019-08-31 13:51:25 +05:30
Bala FA fa3546bb03 Add NetPerfInfo() API in madmin (#8112) 2019-08-31 08:27:53 +05:30
Harshavardhana 42e716a094
formatsToDrivesInfo should return drives with correct order (#8157)
This is a defensive change to avoid any future issues,
from this part of the code. New change also ensures
to populate UUID if present for the right disk.
2019-08-30 14:11:18 -07:00
Andreas Auernhammer 6b2ed0fc47 fix `DownloadZIP` for encrypted objects (#8159)
This commit fixes the web ZIP download handler for
encrypted objects. The decryption logic has moved into
`getObjectNInfo`. So trying to decrypt the (already decrypted)
content again in the ZIP handler obviously causes an error.

This commit fixes this by removing the decryption logic from the
the handler.

Fixes #7965
2019-08-30 10:46:09 -07:00
Harshavardhana 0cd0f6c255
Avoid error modification during IAM migration (#8156)
The underlying errors are important, for IAM
requirements and should wait appropriately at
the caller level, this allows for distributed
setups to run properly and not fail prematurely
during startup.

Also additionally fix the onlineDisk counting
2019-08-30 10:41:02 -07:00
Aditya Manthramurthy 847a3ea0a2 Add unit tests and refactor to improve coverage (#7617) 2019-08-29 13:53:27 -07:00
Aditya Manthramurthy 1f3d270de8 Fix delete policy routing (#8145) 2019-08-29 07:07:43 +05:30
Aditya Manthramurthy eb18c82976 Remove policy query param from being rejected for objects (#8144) 2019-08-28 16:58:40 -07:00
Krishna Srinivas 2ab0681c0c Do not ignore Lock()'s return value (#8142) 2019-08-28 16:12:57 -07:00
Harshavardhana 83d4c5763c
Decouple ServiceUpdate to ServerUpdate to be more native (#8138)
The change now is to ensure that we take custom URL as
well for updating the deployment, this is required for
hotfix deliveries for certain deployments - other than
the community release.

This commit changes the previous work d65a2c6725
with newer set of requirements.

Also deprecates PeerUptime()
2019-08-28 15:04:43 -07:00
Harshavardhana d65a2c6725
Implement cluster-wide in-place updates (#8070)
This PR is a breaking change and also deprecates
`minio update` command, from this release onwards
all users are advised to just use `mc admin update`
2019-08-27 11:37:47 -07:00
Harshavardhana 70136fb55b
Look for network errors appropriately for RemoteStorageAPI (#8128)
net.Error is very unreliable in providing better error
handling, we need to ensure that we always have a fallback
option in case of network failures.

This fixes an important issue in our distributed server
setups when one of the servers is down, all deployments
out there are recommended to upgrade after this fix is
merged to ensure that availability is not lost.

Fixes #8127
Fixes #8016
Fixes #7964
2019-08-25 13:32:49 -07:00
Harshavardhana d6dd98e597
Avoid data-race in getDisksInfo call (#8126) 2019-08-23 17:03:15 -07:00
Krishna Srinivas c38ada1a26 write() to disk in 4MB blocks for better performance (#7888) 2019-08-23 15:36:46 -07:00
poornas 48bc3f1d53 Allow cached content to be encrypted (#8001)
If MINIO_CACHE_ENCRYPTION_MASTER_KEY is set,
automatically encrypt all cached content on disk.
2019-08-23 10:13:22 -07:00
Praveen raj Mani e211f6f52e Parallelize the DiskInfo calls in xl.StorageInfo() (#8115) 2019-08-22 20:02:40 -07:00
Harshavardhana f13f421e84
Allow CopyObject in pathStyle across federated instances (#8064)
Fixes #7976
2019-08-21 22:02:39 -10:00
Aditya Manthramurthy cd03bfb3cf Fix ignoring claims in list buckets call (#8118) 2019-08-21 19:20:11 -10:00
poornas 2e19619e79 browser: Avoid logging BucketNotEmpty error (#8110) 2019-08-21 10:01:46 -10:00
Harshavardhana 2fa98b1d6a Convert errAuthentication as AccessDenied appropriately (#8105)
Fixes #8062
2019-08-21 09:13:15 +05:30
kannappanr 99a4298938 Use a non-strict invalid bucket name check in Get and Delete object (#8073) 2019-08-20 17:40:52 -10:00
Harshavardhana 069badc7e9
Allow CopyObjectPart to work in federated setups (#8066)
Fixes #8065
2019-08-20 07:19:22 -10:00
Harshavardhana c601cb2f1e
Add listBucketObjectsVersions implementation (#8093)
This API implementation simply behaves like listObjects()
but returns back single version for each object, this
implementation should be considered dummy it is only
meant for some applications which rely on this.
2019-08-19 11:02:54 -10:00
Harshavardhana 9ca7470ccc
Avoid using jsoniter, move to fastjson (#8063)
This is to avoid using unsafe.Pointer type
code dependency for MinIO, this causes
crashes on ARM64 platforms

Refer #8005 collection of runtime crashes due
to unsafe.Pointer usage incorrectly. We have
seen issues like this before when using
jsoniter library in the past.

This PR hopes to fix this using fastjson
2019-08-19 08:35:52 -10:00
Harshavardhana b3ca304c01
Avoid excessive listing attempts in the daily sweep (#8081)
Add better dynamic timeouts for locks, also
add jitters before launching daily sweep to ensure
that not all the servers in distributed setup
are not trying to hold locks to begin the sweep
round.

Also, add enough delay for incoming requests based
on totalSetCount*totalDriveCount.

A possible fix for #8071
2019-08-19 08:22:32 -10:00
Bala FA 60f52f461f add network read performance collection support. (#8038)
ReST API on /minio/admin/v1/performance?perfType=net[?size=N] 
returns

```
{
  "PEER-1": [
             {
	       "addr": ADDR,
	       "readPerf": DURATION,
	       "error": ERROR,
	     },
	     ...
	   ],
  ...
  ...
  "PEER-N": [
             {
	       "addr": ADDR,
	       "readPerf": DURATION,
	       "error": ERROR,
	     },
	     ...
	   ]
}
```
2019-08-19 08:26:32 +05:30
Harshavardhana a15bb19d37
Allow audit logging to work while tracing (#8077)
It is observed that when `mc admin trace` is being
used due to ResponseWriter wrapper, we loose information
about statusCode,statusText for audit logging.

This PR fixes this behavior
2019-08-15 16:17:46 -07:00
Harshavardhana 6e7962bf35
Return if paths are empty in DeleteFileBulk (#8085)
This avoids a network call, also fixes an issue
when empty paths are passed the underlying call
fails with "405 Method Not Allowed".

This is reproducible when you are deleting a
non-existent object.

Fixes #8083
2019-08-15 13:15:49 -07:00
Aditya Manthramurthy 825e29f301 Check if user or group is disabled when evaluating policy (#8078) 2019-08-14 16:59:16 -07:00
Krishnan Parthasarathi bbb56739bd Add User-Agent header with MinIO release details in http logs (#7843)
This would allow http log target server to distinguish between log
messages across different versions of MinIO deployments.
2019-08-14 11:43:43 -07:00
Nitish Tiwari 1cd801b2e9 Fix DeleteObjects() to remove renamed objects inside (#8072) 2019-08-14 11:15:25 -07:00
Aditya Manthramurthy bf9b619d86 Set the policy mapping for a user or group (#8036)
Add API to set policy mapping for a user or group

Contains a breaking Admin APIs change.

- Also enforce all applicable policies
- Removes the previous /set-user-policy API

 Bump up peerRESTVersion

Add get user info API to show groups of a user
2019-08-13 13:41:06 -07:00
maihde 0ed6daab59 fix: #8051 so that stale DNS entries are cleaned-up (#8053) 2019-08-13 08:49:26 -07:00
Harshavardhana bf8ec8ad73
Cleanup ui-errors and print proper error messages (#8068)
* Cleanup ui-errors and print proper error messages

Change HELP to HINT instead, handle more error
cases when starting up MinIO. One such is related
to #8048

* Apply suggestions from code review
2019-08-12 21:25:34 -07:00
Harshavardhana 8ce424bacd Enhance audit logging to capture responseTimes (#8067)
Audit logging requires to have

- timeToFirstByte
- timeToResponse

timing information
2019-08-12 20:32:34 -07:00
Anis Elleuch cea3e3f7a6 browser: Add user-agent header filter to gorilla mux route (#8040)
When a peer client which higher version sends a request to a peer
server with lower version, the returned status code is 200 OK instead
of 405 code. The reason is that the peer client request reaches the
browser handler, which registers itself by '/minio' route but without
any other constraints. Adding filtering by user agent header to the
browser route so internal requests to old endpoints versions return
405 error code.
2019-08-12 17:05:30 -07:00
Harshavardhana af36c92cab
With ListBuckets() access-list only buckets the user has access (#8037)
This is a behavior change from AWS S3, but it is done with
better judgment on our end to allow the listing of buckets only
which user has access to.

The advantage is this declutters the UI for users and only
lists bucket which they have access to.

Precursor for this feature to be applicable is a policy
must have the following actions

```
s3:ListAllMyBuckets
```
and
```
s3:ListBucket
```

enabled in the policy.
2019-08-12 10:27:38 -07:00
Jakob Ackermann 1b258da108 [web-router] update the white list for favicons (#8024) 2019-08-11 22:17:02 -07:00
Andreas Auernhammer 35427a017d fix type conversion in `UpdateKey` for Vault (#8058)
This commit fixes a type conversion in the `UpdateKey`
implementation of Vault.
2019-08-11 22:20:25 +05:30
Harshavardhana 5a28ef0d47 Bump readiness check upto 10000 go-routines (#8057)
Most of our current workloads reach this value
regularly, it doesn't make sense to keep 1000
go-routine limit.
2019-08-10 18:13:14 +05:30
poornas 3385bf3da8 Rewrite cache implementation to cache only on GET (#7694)
Fixes #7458
Fixes #7573 
Fixes #7938 
Fixes #6934
Fixes #6265 
Fixes #6630 

This will allow the cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.

- All cached content is automatically bitrot protected.

- Avoid ETag verification if a cache-control header
is set and the cached content is still valid.

- This PR changes the cache backend format, and all existing
content will be migrated to the new format. Until the data is
migrated completely, all content will be served from the backend.
2019-08-09 17:09:08 -07:00
Anis Elleuch 1ce8d2c476 Add bucket lifecycle expiry feature (#7834) 2019-08-09 10:02:41 -07:00
Harshavardhana a8296445ad
Safely use unsafe.Pointer to avoid crashes on ARM (#8027)
Refactor the Dirent parsing code such that when we
calculate offsets are correct based on the platform
This PR fixes a silent potential crash on ARM
architecture.
2019-08-09 08:54:11 -07:00
Aditya Manthramurthy 5d2b5ee6a9 Refactor IAM to use new IAMStorageAPI (#7999) 2019-08-08 15:10:04 -07:00
kannappanr 930943f058
Fix IAM users migration regression in etcd (#8029)
PR #8008 did not migrate user data stored in etcd.
This PR fixes that.
2019-08-06 17:06:31 -07:00
Harshavardhana e6d8e272ce
Use const slashSeparator instead of "/" everywhere (#8028) 2019-08-06 12:08:58 -07:00
Harshavardhana b52b90412b Avoid data-transfer in distributed locking (#8004) 2019-08-05 11:45:30 -07:00
Harshavardhana 843f481eb3 Allow "tmp" directory to be not available (#8021)
Also additionally add more context to the errors
generated by filesystem, to facilitate better
debugging.
2019-08-05 11:41:29 -07:00
Andreas Auernhammer f6d0645a3c fix DoS vulnerability in the content SHA-256 processing (#8026)
This commit fixes a DoS issue that is caused by an incorrect
SHA-256 content verification during STS requests.

Before that fix clients could write arbitrary many bytes
to the server memory. This commit fixes this by limiting the
request body size.
2019-08-05 10:06:40 -07:00
Aditya Manthramurthy 414a7eca83 Add IAM groups support (#7981)
This change adds admin APIs and IAM subsystem APIs to:

- add or remove members to a group (group addition and deletion is
  implicit on add and remove)

- enable/disable a group

- list and fetch group info
2019-08-02 14:25:00 -07:00
maihde 5cd9f10a02 Support Federation on a single machine (#8009)
When checking if federation is necessary, the code compares
the SRV record stored in etcd against the list of endpoints
that the MinIO server is exposing.  If there is an intersection
in this list the request is forwarded.

The SRV record includes both the host and the port, but the
intersection check previously only looked at the IP address.  This
would prevent federation from working in situations where the endpoint
IP is the same for multiple MinIO servers.  Some examples of where this
can occur are:
 - running mulitiple copies of MinIO on the same host
 - using multiple MinIO servers behind a NAT with port-forwarding
2019-08-02 12:40:51 -07:00
Praveen raj Mani b976521c83 Ignore faulty disks in xl-sets Storage info (#7878) 2019-08-02 12:17:26 -07:00
Andreas Auernhammer a6f4cf61f2 add `UpdateKey` method to KMS interface (#7974)
This commit adds a new method `UpdateKey` to the KMS
interface.

The purpose of `UpdateKey` is to re-wrap an encrypted
data key (the key generated & encrypted with a master key by e.g.
Vault).
For example, consider Vault with a master key ID: `master-key-1`
and an encrypted data key `E(dk)` for a particular object. The
data key `dk` has been generated randomly when the object was created.
Now, the KMS operator may "rotate" the master key `master-key-1`.
However, the KMS cannot forget the "old" value of that master key
since there is still an object that requires `dk`, and therefore,
the `D(E(dk))`.
With the `UpdateKey` method call MinIO can ask the KMS to decrypt
`E(dk)` with the old key (internally) and re-encrypted `dk` with
the new master key value: `E'(dk)`.

However, this operation only works for the same master key ID.
When rotating the data key (replacing it with a new one) then
we perform a `UnsealKey` operation with the 1st master key ID
and then a `GenerateKey` operation with the 2nd master key ID.

This commit also updates the KMS documentation and removes
the `encrypt` policy entry (we don't use `encrypt`) and
add a policy entry for `rewarp`.
2019-08-01 15:47:47 -07:00
Anis Elleuch c5ac901e8d xl: Fix healing empty directories (#8013)
After some extensive refactors, it turned out empty directories
are not healed and heal status is also not reported correctly.

This commit fixes it and adds the appropriate unit tests
2019-08-01 14:13:06 -07:00
Aditya Manthramurthy 4101d4917c Fix IAM users migration regression (#8008) 2019-08-01 12:31:04 -07:00
Harshavardhana 123cccaed1 Honor connection pooling while tracing (#7979)
This PR fixes relying on r.Context().Done()
by setting

```
Connection: "close"
```

HTTP Header, this has detrimental issues for
client side connection pooling. Since this
header explicitly tells clients to turn-off
connection pooling. This causing pro-active
connections to be closed leaving many conn's
in TIME_WAIT state. This can be observed with
`mc admin trace -a` when running distributed
setup.

This PR also fixes tracing filtering issue
when bucket names have `minio` as prefixes,
trace was erroneously ignoring them.
2019-07-31 11:08:39 -07:00
Anis Elleuch cbd02c58be federation: Avoid printing context canceled error (#7997)
Golang proactively prints this error
        `http: proxy error: context canceled`

when a request arrived to the current deployment and
redirected to another deployment in a federated setup.

Since this error can confuse users, this commit will
just hide it.
2019-07-31 11:08:10 -07:00
Aditya Manthramurthy c71895f225 Listen for PolicyDB events from etcd and fix etcd watch handling (#7992) 2019-07-30 18:50:49 -07:00
Praveen raj Mani 63e0a81760 Ignore stale notification queues in notification.xml (#7673)
Allow renaming/editing a notification config. By replying with 
a successful GetBucketNotification response, without checking 
for any missing config ARN in targetList.

Fixes #7650
2019-07-30 14:19:06 +05:30
Harshavardhana 8d47ef503c Fix crash observed in OPA initialization (#7990)
Related to #7982, this PR refactors the code
such that we validate the OPA or JWKS in a
common place.

This is also a refactor which is already done
in the new config migration change. Attempt
to avoid any network I/O during Unmarshal of
JSON from disk, instead do it later when
updating the in-memory data structure.
2019-07-29 15:58:25 -07:00
Harshavardhana 54eded2e6f Do not assume all HTTP errors as Network errors (#7983)
In situations such as when client uploading data,
prematurely disconnects from server such as pressing
ctrl-c before uploading all the data. Under this
situation in distributed setup we prematurely
disconnect disks causing a reconnect loop. This has
an adverse affect we end up leaving a lot of files
in temporary location which ideally should have been
cleaned up when Put() prematurely fails.

This is also a regression which got introduced in #7610
2019-07-29 14:48:18 -07:00
Harshavardhana 94c88890b8 Add additional logging for OPA connections (#7982) 2019-07-28 08:33:25 +05:30
Harshavardhana e871e27562 Refactor and simplify etcd helpers used in IAM subsystem (#7980) 2019-07-26 13:42:54 -07:00
Harshavardhana 007a52b546
Add common validation for compression and encryption (#7978) 2019-07-26 02:41:16 -07:00
Harshavardhana d744865dc6 Enable config for NAS gateway mode (#7948)
Starting with #7751 we don't store config
in etcd anymore, allow NAS to honor config
on disk.
2019-07-25 17:41:25 -07:00
Harshavardhana e40c29e834 Fail appropriately if the disk has I/O errors (#7972)
If the disk has I/O errors, we should simply ignore
such a disk and not be bothered about it - until
it is replaced.
2019-07-25 13:35:27 -07:00
Praveen raj Mani b0cea1c0f3 Enable event persistence in AMQP (#7565) 2019-07-25 11:20:24 -07:00
Harshavardhana 6f2b4675fa
Add krb5 support for HDFS gateway (#7933) 2019-07-24 18:05:48 -07:00
Aditya Manthramurthy 7bdaf9bc50 Update on-disk storage format for users system (#7949)
- Policy mapping is now at `config/iam/policydb/users/myuser1.json`
  and includes version.

- User identity file is now versioned.

- Migrate old data to the new format.
2019-07-24 17:34:23 -07:00
Praveen raj Mani 55d4eee6f1 Enable event persistence in MySQL and PostgreSQL (#7629) 2019-07-24 10:18:29 -07:00
Harshavardhana ac82798d0a Remove uneeded calls on FS (#7967) 2019-07-24 15:59:13 +05:30
Praveen raj Mani c9349747ca Enable event-persistence in NATS and NATS-Streaming (#7612) 2019-07-23 10:37:25 -07:00
Praveen raj Mani 2b9b907f9c Enable event persistence in Redis (#7601) 2019-07-23 10:22:08 -07:00
Daryl Finlay 9389a55e5d Cancel PutObjectPart on upload abort (#7940)
Calling ListMultipartUploads fails if an upload is aborted while a
part is being uploaded because the directory for the upload exists
(since fsRenameFile ends up calling os.MkdirAll) but the meta JSON file
doesn't. To fix this we make sure an upload hasn't been aborted during
PutObjectPart by checking the existence of the directory for the upload
while moving the temporary part file into it.
2019-07-22 22:36:15 -07:00
Christian Muehlhaeuser 38bc3a45db Fixed tautological conditions (#7959)
We already check for err being equal to nil above, no need
to check again.
2019-07-22 17:06:08 -07:00
Christian Muehlhaeuser c5faba55c1 Comment: Typo Fix (#7958) 2019-07-21 05:55:09 +01:00
poornas 0373a1699b Add error filter to admin trace API (#7923)
This allows MinIO to have the ability to send back only error trace
2019-07-20 01:38:26 +01:00
Krishnan Parthasarathi 559a59220e Add initial support for bucket lifecycle (#7563)
This PR is based off @sinhaashish's PR for object lifecycle
management, which includes support only for,
- Expiration of object
- Filter using object prefix (_not_ object tags)

N B the code for actual expiration of objects will be included in a
subsequent PR.
2019-07-19 21:20:33 +01:00
poornas 041a812ba0 trace api: add call stats to trace (#7915)
Stats such as call latency, bytes received and sent have been added
2019-07-18 23:29:17 +01:00
Krishnan Parthasarathi fbfc9a61ec Add node address information to logs (#7941) 2019-07-18 09:58:37 -07:00
Anis Elleuch 28661c0413 heal: Trigger auto-heal once each month instead of 24 hours (#7934) 2019-07-16 00:03:42 +01:00
Harshavardhana 04a152be12 Redirect to browser only if browser is enabled (#7914) 2019-07-15 20:01:17 +01:00
Harshavardhana bce3f8237d Allow users to give anonymous access (#7926)
Current code already allows users to GetPolicy/SetPolicy
there was a missing code in ListAllBucketPolicies to allow
access, this fixes this behavior.

Fixes #7913
2019-07-15 20:00:41 +01:00
Harshavardhana 16a45e5aff
Fix dynamic help vars for sub-commands (#7925)
The fix in #7646 introduced a regression which
was left unnoticed, the fix didn't work for
sub-commands unfortunately. This fixes it
by moving v1.21.0 version of the minio/cli
package.

Fixes #7924
2019-07-12 23:32:27 -07:00
Anis Elleuch 000a60f238 xl: Heal empty parts (#7860)
posix.VerifyFile() doesn't know how to check if a file
is corrupted if that file is empty. We do have the part
size in xl.json so we pass it to VerifyFile to return
an error so healing empty parts can work properly.
2019-07-13 00:29:44 +01:00
Praveen raj Mani bf278ca36f Enable event persistence in NSQ (#7579) 2019-07-12 10:41:57 +01:00
Ashish Kumar Sinha 97f2bc26b9 Add validations for object name length and prefix (#7746)
fixes #7717
2019-07-12 10:08:12 +05:30
Praveen raj Mani bba562235b Enable persistent event store in elasticsearch (#7564) 2019-07-12 08:23:20 +05:30
Krishnan Parthasarathi ffd7b7059c Pass on web-handler arguments properly to log entries (#7894) 2019-07-11 22:37:13 +01:00
Harshavardhana 5c0acbc6fc
Add text/event-stream for long running http connections (#7909)
When MinIO is behind a proxy, proxies end up killing
clients when no data is seen on the connection, adding
the right content-type ensures that proxies do not come
in the way.
2019-07-11 13:19:25 -07:00
poornas 20a15567b8 Fix atime support check for disk cache (#7891)
- add a sleep between Stat operations to
accurately detect atime
2019-07-10 23:41:11 +01:00
Krishnan Parthasarathi 94f67ad224 Log error response even if a handler doesn't logBody (#7867) 2019-07-10 11:49:02 -07:00
ebozduman 36ee110563 Regression fix to bring back checkPolicyCond function call (#7897)
Fixes #7895
2019-07-10 10:48:43 +05:30
mzukowski-reef 9d49688c87 Switch to kurin/blazer from minio/blazer fork for b2 gateway (#7879) 2019-07-09 08:14:02 -07:00
Anis Elleuch 8e09374cb8 Avoid go-prompt to show colored prompt properly in Windows (#7890)
Update prompt shows some weird characters under Windows, the reason
is that go-prompt is used to show a yes/no prompt, since go-prompt
does not seem to have a way to support color/fatih, this PR will
implements its own yes/no prompt with the correct text coloration.
2019-07-09 01:46:04 +01:00
Krishna Srinivas 58d90ed73c Avoid network transfer for bitrot verification during healing (#7375) 2019-07-08 13:51:18 -07:00
Anis Elleuch e857b6741d Add one log in health checker liveness code (#7861) 2019-07-06 16:38:39 -07:00
poornas 0505ef83b5 Fix host address returned in admin API calls (#7846) 2019-07-05 20:41:35 -07:00
Krishna Srinivas a2e904b966 Support any string as delimiter for listing (#7882) 2019-07-05 14:06:12 -07:00
Praveen raj Mani bb871a7c31 Enable event persistence in webhook (#7614) 2019-07-05 15:21:41 +05:30
mizuno-keyence 09103991ea [Bugfix] duplicating flag registration (#7853) 2019-07-03 14:31:19 -07:00
Harshavardhana c43f745449
Ensure that we use constants everywhere (#7845)
This allows for canonicalization of the strings
throughout our code and provides a common space
for all these constants to reside.

This list is rather non-exhaustive but captures
all the headers used in AWS S3 API operations
2019-07-02 22:34:32 -07:00
Anis Elleuch 9610a74c19 auto-heal: Use fast scan instead of the deep one (#7868) 2019-07-02 18:53:08 -07:00
kannappanr 70b350c383
Remove DeploymentID from response headers (#7815)
Response headers need not contain deployment ID.
2019-07-01 12:22:01 -07:00
Krishna Srinivas 338e9a9be9 Put object client disconnect (#7824)
Fail putObject  and postpolicy in case client prematurely disconnects
Use request's context to cancel lock requests on client disconnects
2019-06-28 22:09:17 -07:00
iliul d3f9f8be88 golint: fix redundant code logic (#7842)
Signed-off-by: Lei Liu <liul.stone@gmail.com>
2019-06-27 15:18:33 +05:30
Krishna Srinivas 183ec094c4 Simplify HTTP trace related code (#7833) 2019-06-26 22:41:12 -07:00
Harshavardhana c1d2b3d5c3 Handle HEAD/GET requests for virtual DNS requests (#7839)
r.URL.Path is empty when HEAD bucket with virtual
DNS requests come in since bucket is now part of
r.Host, we should use our domain names and fetch
the right bucket/object names.

This fixes an really old issue in our federation
setups.
2019-06-26 18:21:54 -07:00
Praveen raj Mani be72609d1f Expose version info in prometheus (#7812)
Fixes #7795
2019-06-26 10:36:54 -07:00
Anis Elleuch 48f2c98052 admin: Add Background heal status info API (#7774)
This API returns the information related to the self healing routine.

For the moment, it returns:
- The total number of objects that are scanned
- The last time when an item was scanned
2019-06-25 16:42:24 -07:00
Kanagaraj M 286c663495 list objects in browser ordered by last modified (#7805)
- return all objects in web-handlers listObjects response
- added local pagination to object list ui
- also fixed infinite loader and removed unused fields
2019-06-25 16:31:50 -07:00
Kanagaraj M 48cb271a46 include ip address while doing checkPortAvailability (#7818)
While checking for port availability, ip address should be included.
When a machine has multiple ip addresses, multiple minio instances
or some other applications can be run on same port but different
ip address.

Fixes #7685
2019-06-24 15:02:39 -07:00
Harshavardhana 90ca73af13 Allow trace even if server is not initialized (#7822) 2019-06-21 16:47:51 -07:00
Daniel Valdivia a04b6561a0 Fix a typo on the comment for ListenBucketNotification (#7821) 2019-06-21 11:58:52 -07:00
Harshavardhana 1af6e8cb72
Add support for session policies in STS APIs (#7747)
This PR adds support for adding session policies
for further restrictions on STS credentials, useful
in situations when applications want to generate
creds for multiple interested parties with different
set of policy restrictions.

This session policy is not mandatory, but optional.

Fixes #7732
2019-06-20 15:28:33 -07:00
Andreas Auernhammer 98d3913a1e enable SSE-KMS pass-through on S3 gateway (#7788)
This commit relaxes the restriction that the MinIO gateway
does not accept SSE-KMS headers. Now, the S3 gateway allows
SSE-KMS headers for PUT and MULTIPART PUT requests and forwards them
to the S3 gateway backend (AWS). This is considered SSE pass-through
mode.

Fixes #7753
2019-06-19 17:37:08 -07:00
Harshavardhana cd7d5b59e5
Add DeleteUser() to generate events in etcd (#7804)
Fixes a regression introduced in 6d89435356

Fixes #7797
2019-06-18 15:44:23 -07:00
poornas 299ef9b188 Trace: Replace function name with API prefix (#7794)
This change is required for `Admin Trace`
2019-06-18 13:55:13 -07:00
Harshavardhana b30c436715 [notify] Make sure to return when quorum is missing (#7799)
Fixes a regression introduced in 510ec153b9
2019-06-18 09:23:33 -07:00
Cody Maloney 7b8beecc81 Move lock to not surround pieces which don't use any internal members. (#7779)
Previously the read/write lock applied both for gateway use cases as
well the object store use case. Nothing from sys is touched or looked
at in the gateway usecase though, so we don't need to lock. Don't lock
to make the gateway policy getting a little more efficient, particularly
as where this is called from (checkRequestAuthType) is quite common.
2019-06-15 10:11:10 -07:00
Praveen raj Mani 510ec153b9 Refreshing notification system should not erase the rules-map of other buckets (#7758)
Fixes #7707
2019-06-15 03:14:27 -07:00
Harshavardhana 4a4048fe27 Migrate minio etcd config to backend config (#7751)
etcd when used in federated setups, currently
mandates that all clusters should have same
config.json, which is too restrictive and makes
federation a restrictive environment.

This change makes it apparent that each cluster
needs to be independently managed if necessary
from `mc admin info` command line.

Each cluster with in federation can have their
own root credentials and as well as separate
regions. This way buckets get further restrictions
and allows for root creds to be not common
across clusters/data centers.

Existing data in etcd gets migrated to backend
on each clusters, upon start. Once done
users can change their config entries
independently.
2019-06-15 03:07:54 -07:00
Harshavardhana c22439c82e Update minio-go v6.0.29 (#7778)
Bring improved retry logic
2019-06-12 18:09:21 -07:00
Harshavardhana 38224a4c1a Ignore errors reading fs.json (#7777) 2019-06-12 16:42:03 -07:00
Harshavardhana b4ab778cb2 Fix user IAM policy regression, reload policy appropriately (#7770)
Introduce in commit 7e4c9a9e1e

Fixes #7769
2019-06-12 14:49:45 -07:00
Krishna Srinivas 0394a8f013 Send Content-Length in the response headers (#7771)
curl using http1.0 would hang sometimes when Content-Length is missing in response headers
fixes #7661
2019-06-11 21:04:52 -07:00
Harshavardhana 91ceae23d0 Add support for customizable user (#7569) 2019-06-10 20:27:42 +05:30
kannappanr 1008c2c069
Do not display error logs if user does not have listbuckets privilege (#7370)
Fixes #7367
2019-06-09 13:15:57 -07:00
Anis Elleuch 7abadfccc2 Add self-healing feature (#7604)
- Background Heal routine receives heal requests from a channel, either to
heal format, buckets or objects
- Daily sweeper lists all objects in all buckets, these objects
don't necessarly have read quorum so they can be removed if
these objects are unhealable
- Heal daily ops receives objects from the daily sweeper
and send them to the heal routine.
2019-06-08 22:14:07 -07:00
poornas 97090aa16c Add admin API to send trace notifications to registered (#7128)
Remove current functionality to log trace to file
using MINIO_HTTP_TRACE env, and replace it with
mc admin trace command on mc client.
2019-06-08 15:54:41 -07:00
Harshavardhana cb1566c6e6 S3 Gateway: Handle restricted access credentials (#7757) 2019-06-07 15:49:13 -07:00
Harshavardhana 6d89435356 Reload a specific user or policy on peers (#7705)
Fixes #7587
2019-06-06 17:46:22 -07:00
Harshavardhana a69f74533c Add region as part of error XML (#7752) 2019-06-05 16:28:21 -07:00
Harshavardhana 97be455f63 Fix build failure in web-handlers.go 2019-06-03 16:44:09 -07:00
Krishnan Parthasarathi 74efbb4153 Add deploymentID to web handler logs (#7712) 2019-06-03 15:40:04 -07:00
Harshavardhana 0cfd5a21ba
[gateway] Remove policy reload, instead read policy from backend (#7727)
Inconsistencies can arise after applying bucket policies in
gateway mode, since all gateway instances do not share a
common shared state. This is by design to keep gateway as
shared nothing architecture.

This PR fixes such inconsistencies by reloading policy
if any from the backend.

Fixes #7723
2019-06-03 11:06:13 -07:00
Harshavardhana 1cfd4a48d9 Add specific headers in CORS, along with wildcard (#7726)
Fixes #7492
2019-05-31 09:23:55 -07:00
Harshavardhana 993a79d9c6
Disable http2 until we have upstream bugs fixed (#7711)
We should revert this PR in future once we
have upstream bugs fixed regarding http2 behavior
2019-05-30 19:49:33 -07:00
Kanagaraj M 900cc27b51 validate keys before updating for IAM user (#7720)
New secretkey should be validated before updating
it on the config.

Fixes #7715
2019-05-30 05:14:35 -07:00
Praveen raj Mani a73da7755e Remove senstive encryption entries from event data (#7719)
Fixes #7716
2019-05-29 22:29:37 -07:00
Harshavardhana 2c0b3cadfc Update go mod with sem versions of our libraries (#7687) 2019-05-29 16:35:12 -07:00
Praveen raj Mani 763fce909b Enable event persistence in kafka (#7633) 2019-05-29 13:19:48 -07:00
Kanagaraj M da8214845a allow users to change password through browser (#7683)
Allow IAM users to change the password using
browser UI.
2019-05-29 13:18:46 -07:00
Krishna Srinivas 74e2fe0879 Return "SlowDown" to S3 clients for network related errors (#7610)
Consider errors returned by httpClient.Do() as network errors. This is because
the http clients returns different types of errors and it is hard to catch
all the error types.
2019-05-29 10:21:47 -07:00
Harshavardhana 7e4c9a9e1e Properly watch for users, policies, temp users (#7701)
Users were not reloaded properly when etcd was
configured in gateway, server modes.

This PR fixes this issue.
2019-05-28 11:18:53 +05:30
Nitish Tiwari 46ced81f41
Fix Gateway startup sequence to populate etcd (if set) with bucket info (#7686) 2019-05-24 08:41:52 +05:30
Dee Koder e252114f06 Revert "cache: Rewrite to cache only on download (#7575)" (#7684)
This reverts commit a13b58f630.
2019-05-22 14:54:15 -07:00
Harshavardhana 39b3e4f9b3 Avoid using io.ReadFull() for WriteAll and CreateFile (#7676)
With these changes we are now able to peak performances
for all Write() operations across disks HDD and NVMe.

Also adds readahead for disk reads, which also increases
performance for reads by 3x.
2019-05-22 13:47:15 -07:00
Anis Elleuch 158b8c2e86 sets: Correctly set IsTruncated in listing (#7675)
IsTruncated should not be set to true if there is no further
possible entries beyond maxKeys.

This commit will also move wide testing on object API from xl
to xl sets.
2019-05-22 13:36:16 -07:00
Harshavardhana 59e1d94770 Remove stale entry spurious logging (#7663)
The problem in current code was we were removing
an entry from a lock lockerMap without considering
the fact that different entry for same resource is
a possibility due the nature of locks that can be
acquired in parallel before we decide if the lock
is considered stale

A sequence of events is as follows

 - Lock("resource")
 - lockMaintenance(finds a long lived lock in this "resource")
 - Owner node rebooted which now retruns Expired() as true for
   this "resource"
 - Unlock("resource") which succeeded in quorum
 - Now by this time application retried and acquired a new
   Lock() on the same "resource"
 - Now that we have Expired() true from the previous call,
   we proceed to purge the entry from the local lockMap()
   local lockMap reports a different entry for the expired
   UID which results in a spurious log entry.

This PR removes this logging as this situation is an
expected scenario.
2019-05-22 12:21:36 -07:00
Andrei Mikhalenia 59e847aebe Signature v4: Allow signed headers from GET parameters 2019-05-21 21:00:02 -07:00
poornas a13b58f630 cache: Rewrite to cache only on download (#7575)
This will allow cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.

Fixes: #7458, #7573, #6265, #6630
2019-05-22 08:30:27 +05:30
Harshavardhana 16c648b109 Remove "Connection" close instead reduce MaxConns per host (#7654)
This is necessary to avoid connection build up between servers
unexpectedly for example in a situation where 16 servers are
talking to each other and one server now allows a maximum of

15*4096 = 61440 idle connections

Will be kept in pool. Such a large pool is perhaps inefficient for
many reasons and also affects overall system resources.

This PR also reduces idleConnection timeout from 120 secs to 60 secs.
2019-05-17 12:52:25 +05:30
Krishnan Parthasarathi c871456269 File must be sync'd before closing (#7657)
- group sync and close action into a single defer statement to avoid
  evaluation order related bugs in future.
2019-05-16 18:30:51 -07:00
Harshavardhana 55aa20595f
Remove an empty entry being added into XML marshal (#7656) 2019-05-16 13:09:21 -07:00
poornas 707ed2b302 gcs: use MD5Sum as ETag if present in object attrs (#7643)
Fixes: 7642
2019-05-16 12:00:12 -07:00
ebozduman 67d508b214 Adjusts help content dynamically according to OS (#7646) 2019-05-15 14:02:44 +05:30
Harshavardhana 0022c9d210 Add connection close proactively for Walk() http/rpc (#7645) 2019-05-14 16:10:51 -07:00
Anis Elleuch 9b4a81ee60 xl: Avoid possible race during bulk Multi Delete (#7644)
errs was passed to many goroutines but they are all allowed
to update errs if any error happens during deletion, which
can cause a data race.

This commit will avoid issuing bulk delete operations in parallel
to avoid the warning race.
2019-05-14 14:43:22 -07:00
Harshavardhana b3f22eac56 Offload listing to posix layer (#7611)
This PR adds one API WalkCh which
sorts and sends list over the network

Each disk walks independently in a sorted manner.
2019-05-14 13:49:10 -07:00
Krishna Srinivas a343d14f19 Simplify putObject by not breaking the stream into parts (#7199)
We broke into parts previously as we had checksum for the entire file
to tax less on memory and to have better TTFB. We dont need to now,
after the streaming-bitrot change.
2019-05-14 12:33:18 -07:00
Anis Elleuch 9c90a28546 Implement bulk delete (#7607)
Bulk delete at storage level in Multiple Delete Objects API

In order to accelerate bulk delete in Multiple Delete objects API,
a new bulk delete is introduced in storage layer, which will accept
a list of objects to delete rather than only one. Consequently,
a new API is also need to be added to Object API.
2019-05-13 12:25:49 -07:00
Praveen raj Mani d9a7f80f68 Remove duplicate checkPutObjectArgs in PutObject and (#7396)
Fixes #7384
2019-05-13 10:12:06 -07:00
Harshavardhana 3eb7a8bde8
Sync before Close() to avoid random I/O (#7638) 2019-05-11 15:03:10 -07:00
Harshavardhana 72929ec05b
Turn off md5sum optionally if content-md5 is not set (#7609)
This PR also brings --compat option to run MinIO in strict
S3 compatibility mode, MinIO by default will now try to run
high performance mode.
2019-05-08 18:35:40 -07:00
Aditya Manthramurthy 589df3d5e7 Deadcode removal (#7627) 2019-05-07 13:49:15 -07:00
kannappanr c422f7f412 Fix: Handle regression caused by gorilla mux v1.7.0 (#7625)
PR #7595 fixed part of the regression, but did not handle the
scenario, where in docker, the internal port is different from
the port on the host.

This PR modifies the regular expression such that all the
scenarios are handled.

Fixes #7619
2019-05-07 10:36:00 -07:00
Anis Elleuch 08b9244c48 Fix listing empty directory in recursive mode (#7613)
After recent listing refactor, recursive list doesn't return empty
directories, this commit will fix the behavior and add unit tests
so it won't happen again.
2019-05-06 07:52:42 -07:00
kannappanr 4b858b562a
Compression: Handle auto encryption when size is unknown (#7600)
When size is unknown and auto encryption is enabled,
and compression is set to true, putobject API is failing.

Moving adding the SSE-S3 header as part of the request to before
checking if compression can be done, otherwise the size is set to -1
and that seems to cause problems.
2019-05-02 08:28:18 -07:00
poornas 033f3a4d51 gcs: check error on object writer close (#7606)
Fixes #7605. Object metadata should be written to storage
only when the object was written successfully
2019-05-02 08:27:10 -07:00
poornas cf2a436bc8 Show SlowDown error message if backend is busy (#7521)
or if there are too many open file descriptors.
2019-05-02 07:09:57 -07:00
Harshavardhana 64998fc4ab Remove delayIsLeaf requirement simplify ListObjects further (#7593) 2019-05-02 10:36:57 +05:30
Krishna Srinivas 43be00ea1b Remove logs from bitrot-streaming.go as erasure layer is already logging (#7603) 2019-05-01 21:46:00 -07:00
Harshavardhana c5f26d5cdd Fix hdfsReader fd leak upon GetObject() (#7596)
Also migrate to minio/hdfs/v3@v3.0.0
2019-05-01 14:43:21 -07:00
Praveen raj Mani c113d4e49c Posix CreateFile should work for compressed lengths (#7584) 2019-04-30 16:27:31 -07:00
kannappanr a436f2baa5 Change order of trace source in error log (#7599)
Change the order of trace source that gets
printed on the console.
2019-04-29 14:56:30 -07:00
kannappanr 781012517d Fix: Handle regression caused by gorilla mux v1.7.0 (#7595)
gorilla/mux#383 broke the compatibility with the existing code.
This PR handles that scenario.
2019-04-29 22:03:27 +05:30
Harshavardhana 091b9b661f Complain if we detect sub-optimal ordering in distributed setup (#7576)
Fixes #6156
2019-04-29 10:10:50 +05:30
Harshavardhana af6c6a2b35
Remove timeout conn on net.Dialer (#7590)
This PR also removes conn_bug_21133 workaround
which is not valid anymore, all we need is deadline
connection with server in place

Fixes #7503
2019-04-27 15:14:16 -07:00
Krishna Srinivas b93ef73f9b Fix divide by 0 error when directio.AlignSize is 0 (#7591) 2019-04-26 16:08:15 -07:00
Harshavardhana 83ca1a8d64 Use etcd watch to reload IAM users (#7551)
Currently we used to reload users every five minutes,
regardless of etcd is configured or not. But with etcd
configured we can do this more asynchronously to trigger
a refresh by using the watch API

Fixes #7515
2019-04-26 18:48:50 +05:30
Anis Elleuch 27ef1262bf xl: Use random UUID during complete multipart upload (#7527)
One user has seen this following error log:

API: CompleteMultipartUpload(bucket=vertica, object=perf-dss-v03/cc2/02596813aecd4e476d810148586c2a3300d00000013557ef_0.gt)
Time: 15:44:07 UTC 04/11/2019
RequestID: 159475EFF4DEDFFB
RemoteHost: 172.26.87.184
UserAgent: vertica-v9.1.1-5
Error: open /data/.minio.sys/tmp/100bb3ec-6c0d-4a37-8b36-65241050eb02/xl.json: file exists
       1: cmd/xl-v1-metadata.go:448:cmd.writeXLMetadata()
       2: cmd/xl-v1-metadata.go:501:cmd.writeUniqueXLMetadata.func1()

This can happen when CompleteMultipartUpload fails with write quorum,
the S3 client will retry (since write quorum is 500 http response),
however the second call of CompleteMultipartUpload will fail because
this latter doesn't truly use a random uuid under .minio.sys/tmp/
directory but pick the upload id.

This commit fixes the behavior to choose a random uuid for generating
xl.json
2019-04-25 07:33:26 -07:00
Harshavardhana ae002aa724 Deprecate updating admin credentials using API calls (#7570)
Root credentials are not allowed to change in all of our
distributed setup deployments, this PR simply removes
that behavior.
2019-04-24 12:54:44 -07:00
Krishna Srinivas a3ec71bc28 Use O_DIRECT while writing to disk (#7479)
- Use O_DIRECT while writing to disk
- Remove MINIO_DRIVE_SYNC option
2019-04-23 21:25:06 -07:00
Harshavardhana 35d19a4ae2 Fix STS AssumeRole route conflict with MultipartUpload (#7574)
Since AssumeRole API was introduced we have a wrong route
match which results in certain clients failing to upload objects
using multipart because, multipart POST conflicts with STS POST
AssumeRole API.

Write a proper matcher function which verifies the route more
appropriately such that both can co-exist.
2019-04-23 15:55:41 -07:00
Harshavardhana f767a2538a
Optimize listing with leaf check offloaded to posix (#7541)
Other listing optimizations include

- remove double sorting while filtering object entries
- improve error message when upload-id is not in quorum
- use jsoniter for full unmarshal json, instead of gjson
- remove unused code
2019-04-23 14:54:28 -07:00
kannappanr 0c75395abe Fix: Allow deleting multiple objects anonymously if policy supports it (#7439)
Fixes #5683
2019-04-22 20:24:43 +05:30
Roman Kalashnikov 188cf1d5ce Add more friendly error message for policy object (#7412) 2019-04-22 01:23:54 -07:00
Praveen raj Mani d96584ef58 Allow server to start if one of local nodes in docker/kubernetes setup is resolved (#7452)
Allow server to start if one of the local nodes in docker/kubernetes setup is successfully resolved

- The rule is that we need atleast one local node to work. We dont need to resolve the
  rest at that point.

- In a non-orchestrational setup, we fail if we do not have atleast one local node up
  and running.

- In an orchestrational setup (docker-swarm and kubernetes), We retry with a sleep of 5
  seconds until any one local node shows up.

Fixes #6995
2019-04-19 10:26:44 -07:00
poornas 2c096c569f do not try to delete non-existent object in cache (#7560)
handle cache cleanup correctly when backend object was deleted.

Fixes: #7558
2019-04-18 13:53:22 -07:00
Chris Hoffman 816459d10f Azure gateway complete multipart ETag (#7500)
Compute md5 for azure multipart upload that matches s3 behavior
Reuse complete multipart md5 function in azure gateway
2019-04-17 23:50:25 -07:00
kannappanr d2f42d830f
Lock: Use REST API instead of RPC (#7469)
In distributed mode, use REST API to acquire and manage locks instead
of RPC.

RPC has been completely removed from MinIO source.

Since we are moving from RPC to REST, we cannot use rolling upgrades as the
nodes that have not yet been upgraded cannot talk to the ones that have
been upgraded.

We expect all minio processes on all nodes to be stopped and then the
upgrade process to be completed.

Also force http1.1 for inter-node communication
2019-04-17 23:16:27 -07:00
Harshavardhana 4c048963dc etcd: Handle create buckets with common prefixes properly (#7556)
common prefixes in bucket name if already created
are disallowed when etcd is configured due to the
prefix matching issue. Make sure that when we look
for bucket we are only interested in exact bucket
name not the prefix.
2019-04-17 17:29:49 -07:00
Harshavardhana 620e462413 Implement S3-HDFS gateway (#7440)
- [x] Support bucket and regular object operations
- [x] Supports Select API on HDFS
- [x] Implement multipart API support
- [x] Completion of ListObjects support
2019-04-17 09:52:08 -07:00
poornas 1d49295943 Close CacheReader before clearing cache entry if object is deleted (#7555)
Fixes: #7549
2019-04-17 11:24:50 +05:30
Krishnan Parthasarathi 35ef5eb236 Don't exit background append if backend specific files show up (#7519) 2019-04-12 15:51:32 -07:00
Anis Elleuch 60d6887992 Make Encoding URL more compliant to S3 spec (#7360)
There is no written specification about how to encode key names
when url encoding type is passed.

However, this change will encode URLs as url.QueryEscape() does
while considering AWS S3 exceptions.
2019-04-12 12:02:37 -07:00
Andreas Auernhammer 012e4b42f9 http: opt-in to TLS 1.3 (#7483)
This commit enables TLS 1.3 on the server. For Go 1.12 TLS 1.3 is
enabled by an explicit opt-in.
2019-04-11 20:46:15 -07:00
poornas a74cb93666 Worm: Permit key-rotation of S3 encrypted objects (#7429)
Fixes : #7399
2019-04-10 11:31:50 -07:00
Andreas Auernhammer 849e06a316 crypto: add unit test for vault config verification (#7413)
This commit adds a unit test for the vault
config verification (which covers also `IsEmpty()`).

Vault-related code is hard to test with unit tests
since a Vault service would be necessary. Therefore
this commit only adds tests for a fraction of the code.

Fixes #7409
2019-04-10 11:05:53 -07:00
Praveen raj Mani 47ca411163 Enhance the event store interface to support channeling (#7343)
- Avoids code duplication across the other targets. By having a
  centralized function call.

- Reduce the room for race.
2019-04-10 18:16:01 +05:30
Aditya Manthramurthy ddb0d646aa Use passed lock-type in GetObjectNInfo cache implementation (#7505) 2019-04-09 14:49:45 -07:00
kannappanr 5ecac91a55
Replace Minio refs in docs with MinIO and links (#7494) 2019-04-09 11:39:42 -07:00
kannappanr 188ac8e369 Browser: Allow users to do s3 operations, if policy allows (#7487)
Fixes #7472
2019-04-09 20:47:41 +05:30
Harshavardhana a2e344bf30 Preserve ETag case for S3 compatibility (#7498)
Most hadoop distributions hortonworks, cloudera all
depend on aws-sdk-java 1.7.x to 1.10.x - the releases
which have bugs related case sensitive check for
ETag header. Go changes the case of the headers set
to be canonical but only preserves them when set
through a direct map.

This fixes most compatibility issues we have had
in the past supporting older hadoop distributions.
2019-04-08 16:54:46 -07:00
poornas 10a607154d Fix ListObjectsV2 for gateway encryption mode (#7491)
Fixes #7468 by setting NextContinuationToken only if list is
truncated
2019-04-08 15:12:00 -07:00
Harshavardhana 0188009c7e Expose total and available disk space (#7453) 2019-04-05 09:51:50 +05:30
Harshavardhana c90999df98 Valid if bucket names are internal (#7476)
This commit fixes a privilege escalation issue against
the S3 and web handlers. An authenticated IAM user
can:

- Read from or write to the internal '.minio.sys'
bucket by simply sending a properly signed
S3 GET or PUT request. Further, the user can
- Read from or write to the internal '.minio.sys'
bucket using the 'Upload'/'Download'/'DownloadZIP'
API by sending a "browser" request authenticated
with its JWT token.
2019-04-03 23:10:37 -07:00
Andreas Auernhammer 9a740736a4 fix privilege escalation against inter-node communication (#7474)
This commit fixes another privilege escalation issue
abusing the inter-node communication of distributed
servers to obtain/modify the server configuration.

The inter-node communication is authenticated using
JWT-Tokens. Further, IAM users accessing the cluster
via the web UI also get a JWT token and the browser
will add this "user" JWT token to each the request.

Now, a user can extract that JWT token an can craft
HTTP POST requests for the inter-node communication
API endpoint. Since the server accepts ANY valid
JWT token it also accepts inter-node commands from
an authenticated user such that the user can execute
arbitrary commands bypassing the IAM policy engine
and impersonate other users, change its own IAM policy
or extract the admin access/secret key.

This is fixed by only accepting "admin" JWT tokens
(tokens containing the admin access key - and therefore
were generated with the admin secret key). Consequently,
only the admin user can execute such inter-node commands.
2019-04-03 12:16:19 -07:00
Harshavardhana 313a3a286a Migrate to go1.12 to simplify our cmd/http package (#7302)
Simplify the cmd/http package overall by removing
custom plain text v/s tls connection detection, by
migrating to go1.12 and choose minimum version
to be go1.12

Also remove all the vendored deps, since they
are not useful anymore.
2019-04-02 18:28:39 -07:00
Anis Elleuch 4c23e6fa55 rpc: Avoid using Pool since it conflicts with http2 (#7467)
A race is detected between a bytes.Buffer generated with cmd/rpc.Pool
and http2 module. An issue is raised in golang (https://github.com/golang/go/issues/31192).

Meanwhile, this commit disables Pool in RPC code and it generates a
new 1kb of bytes.Buffer for each RPC call.
2019-04-02 13:34:21 -07:00
Krishna Srinivas ef791764e0 Do no access nsLockMap.lockMap when using dsync (#7464)
There is no need to access nsLockMap.lockMap when using dsync
2019-04-02 12:27:20 -07:00
Anis Elleuch 53011606a5 Show 401 unauthorized msg when nodes are started with different creds (#7433)
Before this commit, nodes wait indefinitely without showing any
indicate error message when a node is started with different access
and secret keys.

This PR will show '401 Unauthorized' in this case.
2019-04-02 12:25:34 -07:00
Krishnan Parthasarathi 93a9078b23 Assign deploymentID for first minio server in distributed setup (#7427)
- Pass local endpoints to functions fixing formatXL during startup
2019-04-02 10:50:13 -07:00
Ashish Kumar Sinha a4bdcba503 Add check for extra input field (#7437)
fixes #6559
2019-04-02 12:15:32 +05:30
poornas 023866642c canonicalize ETag correctly (#7442)
Fixes #7441 
Trim extra quotes prefixing/suffixing ETag in
CompleteMultipartUpload request.
2019-04-01 12:19:52 -07:00
Harshavardhana 619611933a
Remove policy nesting errors (#7449)
Policy nesting has been supported for a while
now, we should remove references of code and
docs indicating nesting is not allowed anymore.
2019-03-31 08:42:43 -07:00
Harshavardhana 6df05e489d Set Read/Write timeouts only for net.Conn not http.Server (#7431)
Fixes #7425
2019-03-27 22:10:06 +05:30
Harshavardhana 4a698c731b HealObjects should remove objects without quorum (#7407)
This PR adds a way to list objects without quorum
such that they can purged by `mc admin heal --remove`
2019-03-26 14:57:44 -07:00
Harshavardhana 9629de8230 Add proper context based logging when bitrot stream calls fail (#7415) 2019-03-26 13:59:33 -07:00
Harshavardhana 0250f7de67 Cleanup stale multipart uploads older than 3 days (#7424)
Fixes #6627
2019-03-25 13:41:05 -07:00
kannappanr 7154b8a568
Error log: Correct error type in anonymous mode (#7414)
Currently message is set to error type value.
Message field is not used in error logs. it is used only in the case of info logs.

This PR sets error message field to store error type correctly.
2019-03-25 13:40:08 -07:00
Anis Elleuch 8689ec258b Don't decrypt ETag in validation when source is SSEC multipart (#7423)
Copying an encrypted SSEC object when this latter is uploaded using
multipart mechanism was failing because ETag in case of encrypted
multipart upload is not encrypted.

This PR fixes the behavior.
2019-03-25 12:17:31 -07:00
Krishnan Parthasarathi aac9e2a7dd Return deploymentID in ServerInfo REST call (#7422)
Makes deploymentID information uniform in distributed setup
2019-03-25 11:55:28 -07:00
Harshavardhana e0a87e96de
Populate host value from GetSourceIP directly (#7417) 2019-03-25 11:45:42 -07:00
Praveen raj Mani 89e45d0695 Restart process should use the current process' pid (#7373)
This fixes varying pids for server-respawns. And avoids duplicate process
creating multiple pids when the server restart signal is triggered with
service restart enabled.

Fixes #7350
2019-03-20 22:20:30 -07:00
poornas 8e1e701d35 Azure:ETag returned by ListObjects to be consistent with GetObjectInfo (#7301) 2019-03-20 18:11:46 -07:00
kannappanr 87cf51d5ab
unused code: Remove LoadCredentials function (#7369)
It is required to set the environment variable in the case of distributed
minio. LoadCredentials is used to notify peers of the change and will not work if
environment variable is set. so, this function will never be called.
2019-03-20 18:09:57 -07:00
Harshavardhana a9032b52b8 Change storageRESTTimeout to 1minute (#7398) 2019-03-20 13:20:09 -07:00
Harshavardhana c184038b6a Add proper custom errors object creations (#7387)
In scenario 1

```
- bucket/object-prefix
- bucket/object-prefix/object
```

Server responds with `XMinioParentIsObject`

In scenario 2

```
- bucket/object-prefix/object
- bucket/object-prefix
```

Server responds with `XMinioObjectExistsAsDirectory`

Fixes #6566
2019-03-20 13:06:53 -07:00
poornas 12b79d9f3b Remove duplicate error in switch case. (#7381)
Fixes: #7380

crypto.ErrInvalidCustomerKey was being handled twice in toAPIErrorCode()
2019-03-19 17:21:05 -07:00
Krishnan Parthasarathi 8a77a298f2 Add deploymentID to ServerInfo (#7372) 2019-03-19 16:12:24 +05:30
Harshavardhana 328eb74cbb Fix regression in peer clients in TLS setups (#7391)
Regression was introduced in eb69c4f946
2019-03-19 09:44:49 +05:30
zy 73be3ed0ca format import style (#7383) 2019-03-18 13:07:58 -07:00
Kirill Motkov 3d29ab4059 Rewrite if-else chains to switch statements (#7382) 2019-03-18 07:46:20 -07:00
Harshavardhana 6702d23d52 Simplify ReadFileStream closer, make sure to flush all HTTP responses (#7374) 2019-03-18 10:50:26 +05:30
poornas 1011d21416 Fix credential parsing in signature v4 (#7377)
Fixes #7376
2019-03-16 22:45:42 -07:00
Kirill Motkov 85c5acc088 fix staticcheck warning (#7378) 2019-03-16 22:44:43 -07:00
kannappanr eb69c4f946
Use REST api for inter node communication (#7205) 2019-03-14 16:27:31 -07:00
Anis Elleuch facbd653ba Add normal/deep type of heal scanning (#7251)
Healing scan used to read all objects parts to check for bitrot
checksum. This commit will add a quicker way of healing scan
by only checking if parts are actually present in disks or not.
2019-03-14 13:08:51 -07:00
Harshavardhana 233824bf92 Configure http2 with higher maxconcurrent streams (#7363)
This value is needed for Minio's internode communication,
read the meaning of this value as per the HTTP 2.0 spec

https://http2.github.io/http2-spec/#rfc.section.5.1.2
2019-03-14 11:57:35 -07:00
Harshavardhana 7079abc931 Implement HealObjects API to simplify healing (#7351) 2019-03-13 17:35:09 -07:00
Harshavardhana 285c09fe6b Support buckets with '.' with etcd+coreDNS (#7353)
Fixes #7340
2019-03-12 17:57:08 -07:00
kannappanr ce4563370c
Distributed: Allow healing if all disks are on root partitions (#7358)
If all the disks are on root partitions in distributed mode, consider it
to be a test setup and allow healing to proceed.

Fixes #7346
2019-03-12 16:47:06 -07:00
Harshavardhana 0b96ad4fdc
http2 throws custom error Content-Length shorter handle it (#7334)
We should internally handle when http2 input stream has smaller
content than its content-length header

Upstream issue reported https://github.com/golang/go/issues/30648

This a change which we need to handle internally until Go fixes it
correctly, till now our code doesn't expect a custom error to be returned.
2019-03-07 16:11:28 -08:00
Anis Elleuch b05825ffe8 s3: Fix precondition failed in CopyObjectPart when src is encrypted (#7276)
CopyObject precondition checks into GetObjectReader
in order to perform SSE-C pre-condition checks using the
last 32 bytes of encrypted ETag rather than the decrypted
ETag

This also necessitates moving precondition checks for
gateways to gateway layer rather than object handler check
2019-03-06 12:38:41 -08:00
kannappanr 39ddb78c75
CORS: Expose all headers on response (#7331)
Fixes #7289
2019-03-06 11:58:53 -08:00
Harshavardhana 12eb71828b Fix posix tests for SimpleCI (#7328) 2019-03-05 19:53:01 -08:00
Praveen raj Mani c0a1369b73 Construct dynamic XML error responses for postpolicyform validation (#7321)
Fixes #7314
2019-03-05 12:10:47 -08:00
kannappanr c57159a0fe
fs mode: List already existing buckets with capital letters (#7244)
if a bucket with `Captialized letters` is created, `InvalidBucketName` error
will be returned. 
In the case of pre-existing buckets, it will be listed.

Fixes #6938
2019-03-05 10:42:32 -08:00
Kale Blankenship ef132c5714 Replace snappy.Writer/io.Pipe with snappyCompressReader. (#7316)
Prevents deferred close functions from being called while still
  attempting to copy reader to snappyWriter.
 Reduces code duplication when compressing objects.
2019-03-05 08:35:37 -08:00
Aditya Manthramurthy c54b0c0ca1 Fix a race in tests (#7326) 2019-03-05 21:34:17 +05:30
Aditya Manthramurthy e8e9cd3e74 Close GlobalServiceDoneCh when quitting (#7322)
This change allows indefinitely running go-routines to cleanup
gracefully.

This channel is now closed at the beginning of each test so that
long-running go-routines quit and a new one is assigned.
2019-03-04 14:33:14 -08:00
poornas 2564147ab4 Filter Expires header from user metadata (#7269)
Instead save it as a struct field in ObjectInfo as it is
a standard HTTP header - Fixes minio/mc#2690
2019-02-28 11:01:25 -08:00
Harshavardhana c3ca954684 Implement AssumeRole API for Minio users (#7267)
For actual API reference read here

https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html

Documentation is added and updated as well at docs/sts/assume-role.md

Fixes #6381
2019-02-27 17:46:55 -08:00
Harshavardhana ce588d1489 Improve ListObjects performance by listing in parallel (#7270)
The side affect of this change memory
increase, but this is a trade-off between
performance and actual memory usage.

For all practical scenarios this should be
an adequate change.
2019-02-27 14:39:22 -08:00
poornas 8022a6efd9 Return ETag for 0-byte object prefixes (#7291)
Fixes: #7290
2019-02-26 15:09:14 -08:00
Praveen raj Mani 78d116c487 Event persistence for MQTT (#7268)
- The events will be persisted in queueStore if `queueDir` is set.
- Else, if queueDir is not set events persist in memory.

The events are replayed back when the mqtt broker is back online.
2019-02-25 18:01:13 -08:00
Anis Elleuch 6584c7ea2b s3: Encode StartAfter when encoding type is passed (#7281)
In ObjectList V2, StartAfter needs to be encoded when encoding-type
is specified.
2019-02-24 18:50:28 -08:00
Anis Elleuch 5efbe8a1b3 s3: Add support of encodingType parameter (#7265)
This commit honors encoding-type parameter in object listing,
parts listing and multipart uploads listing.
2019-02-24 11:44:24 +05:30
Harshavardhana 7923b83953 Support multiple-domains in MINIO_DOMAIN (#7274)
Fixes #7173
2019-02-23 08:48:01 +05:30
Harshavardhana bedcb7442a Write xml.Header first instead of spaces to handle XML parsers (#7253)
Clients like AWS SDK Java and AWS cli XML parsers are
unable to handle on `\r\n` characters to avoid these
errors send XML header first and write white space characters
instead.

Also handle cases to avoid double WriteHeader calls
2019-02-21 11:50:15 +05:30
Harshavardhana 91576d416d Fix GetLocalPeer usage in perf handlers (#7249)
GetLocalPeer usage should be fixed and used only
once per call for not all local endpoints.
2019-02-20 16:04:55 -08:00
Krishna Srinivas 6dd26b8231 Detect change in underlying mounted disks (#7229) 2019-02-20 13:32:29 -08:00
poornas e098852a80 Revert PR #7241 to fix vault renewal (#7259)
- Current implementation was spawning renewer goroutines
without waiting for the lease duration to end. Remove vault renewer
and call vault.RenewToken directly and manage reauthentication if
lease expired.
2019-02-20 12:23:59 -08:00
Krishna Srinivas ce960565b1 Validate and reject unusual requests (#7258) 2019-02-19 21:02:41 -08:00
poornas 755e675d5c Fix: send decrypted size to notification event (#7248) 2019-02-19 14:14:26 +05:30
Harshavardhana b6c00405ec Do not pro-actively return false in isObjectDir() (#7246)
We should change the logic for both isObject()
and isObjectDir() leaf detection to be done
with quorum, due to how our directory navigation
works - this allows for properly deleting all
the dangling directories or objects if any.
2019-02-15 16:21:19 -08:00
Harshavardhana 8f62935448 Add proper requestID for STS errors (#7245) 2019-02-14 17:54:33 -08:00
Harshavardhana 396d78352d Support HTTP/2.0 (#7204)
Fixes #6704
2019-02-14 17:53:46 -08:00
Harshavardhana a51781e5cf Use context to fill in more details about error XML (#7232) 2019-02-13 16:07:21 -08:00
Krishna Srinivas 90213ff1b2 Detect peer reboots to invalidate current storage REST clients (#7227) 2019-02-13 15:29:46 -08:00
Andreas Auernhammer 6f764a8efd crypto: fix nil pointer dereference of vault secret (#7241)
This commit fixes a nil pointer dereference issue
that can occur when the Vault KMS returns e.g. a 404
with an empty HTTP response. The Vault client SDK
does not treat that as error and returns nil for
the error and the secret.

Further it simplifies the token renewal and
re-authentication mechanism by using a single
background go-routine.

The control-flow of Vault authentications looks
like this:
1. `authenticate()`: Initial login and start of background job
2. Background job starts a `vault.Renewer` to renew the token
3. a) If this succeeds the token gets updated
   b) If this fails the background job tries to login again
4. If the login in 3b. succeeded goto 2. If it fails
   goto 3b.
2019-02-13 15:25:32 -08:00
Harshavardhana df35d7db9d Introduce staticcheck for stricter builds (#7035) 2019-02-13 18:29:36 +05:30
Harshavardhana 4ba77a916d Select should return early errors as XML (#7230)
Currently, we were sending errors in Select binary format,
which is incompatible with AWS S3 behavior, errors in binary
are  sent after HTTP status code is already 200 OK - i.e it
happens during the evaluation of the record reader.
2019-02-13 13:18:11 +05:30
Anis Elleuch f9fecf0e76 storage: Increase the timeout of storage REST requests (#7218)
This commit increases storage REST requests to 5 minutes, this includes
the opening TCP connection, and sending/receiving data. This will reduce
clients receiving errors when the server is under high load.
2019-02-12 23:27:33 -08:00
Krishna Srinivas 14544d8d84 Validate incoming requests (#7234) 2019-02-12 13:24:14 -08:00
Harshavardhana fef5416b3c Support unknown gateway errors and convert at handler layer (#7219)
Different gateway implementations due to different backend
API errors, might return different unsupported errors at
our handler layer. Current code posed a problem for us because
this information was lost and we would convert it to InternalError
in this situation all S3 clients end up retrying the request.

To avoid this unexpected situation implement a way to support
this cleanly such that the underlying information is not lost
which is returned by gateway.
2019-02-12 14:55:52 +05:30
Harshavardhana 9f87283cd5 Revert and bring back B2 gateway implementation (#7224)
This PR is simply a revert of 3265112d04
just for B2 gateway.
2019-02-12 12:44:22 +05:30
Harshavardhana b8955fe577 Fix DummyHandlers to authorize and send/validate correct XMLs (#7223) 2019-02-11 17:58:26 -08:00
Harshavardhana 082f777281 Revamp bucket metadata healing (#7208)
Bucket metadata healing in the current code was executed multiple
times each time for a given set. Bucket metadata just like
objects are hashed in accordance with its name on any given set,
to allow hashing to play a role we should let the top level
code decide where to navigate.

Current code also had 3 bucket metadata files hardcoded, whereas
we should make it generic by listing and navigating the .minio.sys
to heal such objects.

We also had another bug where due to isObjectDangling changes
without pre-existing bucket metadata files, we were erroneously
reporting it as grey/corrupted objects.

This PR fixes all of the above items.
2019-02-11 09:23:13 +05:30
John Liu 9600e2b35e Comment Typo: Changed 'jason' to 'json` (#7216) 2019-02-10 05:49:00 -08:00
poornas 40b8d11209 Move metadata into ObjectOptions for NewMultipart and PutObject (#7060) 2019-02-09 11:01:06 +05:30
Sidhartha Mani c1b3f1994b remove unnecessary buffer while discarding stream (#7214) 2019-02-08 19:29:51 -08:00
ebozduman dd52e5ebe9 Implements dummy tagging handlers for Terraform (#7040) 2019-02-08 16:18:13 -08:00
Praveen raj Mani 8af1f0cc7b Improved error message for user and access key conflict (#7190) 2019-02-07 17:25:58 -08:00
Harshavardhana 85e939636f Fix JSON parser handling for certain objects (#7162)
This PR also adds some comments and simplifies
the code. Primary handling is done to ensure
that we make sure to honor cached buffer.

Added unit tests as well

Fixes #7141
2019-02-07 08:04:42 +05:30
poornas d203e7e1cc azure gateway: return MD5Sum as ETag for S3 API compatibility (#6884)
Fixes #6872.

This PR refactors multipart upload implementation to use a per
part metadata file which is cleaned up at the end of the upload
2019-02-06 16:58:43 -08:00
Harshavardhana 817269475f Make sure to drain body upon an error (#7197)
Also cleanup redundant code and use it at a common place
2019-02-06 12:07:03 -08:00
Krishna Srinivas 2d168b532b Allow format.json healing on dev/test setup (single node XL, all root disks) (#7170) 2019-02-06 11:44:19 -08:00
Krishna Srinivas 3dfbe0f68c Send white spaces to client till completeMultipart() process completes (#7198) 2019-02-05 20:58:09 -08:00
Harshavardhana 30135eed86 Redo how to handle stale dangling files (#7171)
foo.CORRUPTED should never be created because when
multiple sets are involved we would hash the file
to wrong a location, this PR removes the code.

But allows DeleteBucket() to work properly to delete
dangling buckets/objects. Also adds another option
to Healing where a user needs to specify `--remove`
such that all dangling objects will be deleted with
user confirmation.
2019-02-05 17:58:48 -08:00
Harshavardhana e4081aee62 Added support for reading body in STS API (#7188)
STS API supports both URL query params and reading
from a body.
2019-02-05 15:47:11 -08:00
kannappanr df418a2783
Create Cors handler with permissive configuration (#7186)
Create new Cors handler allowing all origins with all standard
methods with any header and credentials.

Fixes #7181
2019-02-05 14:06:52 -08:00
kannappanr 9a65f6dc97 Remove duplicate code in object-handlers.go (#7176)
removed duplicate code in CompleteMultipartUploadHandler
and CopyObjectPartHandler.
2019-02-05 13:36:38 -08:00
Harshavardhana ea6d61ab1f Use loadCachedConfigs appropriately to load ENVs (#7187) 2019-02-04 10:31:11 +05:30
Krishna Srinivas 6f08edfb36 Use O_EXCL when creating file as we never overwrite an existing file (#7189) 2019-02-01 19:01:06 -08:00
Anis Elleuch de2c106386 xl: ListObjectParts uses the latest valid xl meta (#7184)
ListObjectParts is using xl.readXLMetaParts which picks the first
xl meta found in any disk, which is an inconsistent information.

E.g.: In a middle of a multipart upload, one node can go offline
and get back later with an outdated multipart information.
2019-02-01 08:58:41 -08:00
Harshavardhana 32a6dd1dd6 Remove sporadic tests which fail on windows (#7178) 2019-01-31 16:48:47 -08:00
Harshavardhana 432aec73d9 Return proper errors for invalid bodies (#7179) 2019-01-31 07:19:09 -08:00
Anis Elleuch 36dae04671 CopyObjectPart: remove duplicated etag decryption (#7174) 2019-01-30 19:33:31 -08:00
Krishna Srinivas b18c0478e7 Only heal on disks where we are sure that healing is needed (#7148) 2019-01-30 10:53:57 -08:00
Anis Elleuch 2d9860e875 heal: Fix healing empty directories (#7154)
This commit fixes the computation of Before/After healing state
for empty directories.

Issues before the commit:
- Before state doesn't reflect the real status (no StatVol() called)
- For any MakeVol() error, healObjectDir is exited directly, which is
  wrong.
2019-01-30 10:51:56 -08:00
kannappanr d3553f8dfc
Bucket Heal: Do not add empty endpoint entry (#7172)
Currently during a heal of a bucket, if one disk is offline an empty endpoint entry is added.
Then another entry with the missing endpoint is also added.

This results in more entries than disks being added.

Code that adds empty endpoint has been removed.
2019-01-30 10:40:43 -08:00
Harshavardhana e1ae90c12b Make sure to pass the right username for correct ConditionValues (#7169)
Without passing proper username value would result in AccessDenied
errors when policies with `{aws:username}` substitutions are used.

Fixes #7165
2019-01-30 14:21:09 +05:30
Sidhartha Mani 34e7259f95 Add Historic CPU and memory stats (#7136)
Collect historic cpu and mem stats.  Also, use actual values 
instead of formatted strings while returning to the client. The string 
formatting prevents values from being processed by the server or 
by the client without parsing it. 

This change will allow the values to be processed (eg. 
compute rolling-average over the lifetime of the minio server)
and offloads the formatting to the client.
2019-01-30 12:47:32 +05:30
poornas 3467460456 Fix vault client to autorenew or reauthenticate (#7161)
Switch to Vault API's Renewer for token renewal.If
token can no longer be renewed, reauthenticate to
get a fresh token.
2019-01-29 16:57:23 +05:30
Harshavardhana 64b5701971 Support AWS envs creds for non-aws endpoints in S3 gateway (#7156)
We made a change previously in #7111 which moved support
for AWS envs only for AWS S3 endpoint. Some users requested
that this be added back to Non-AWS endpoints as well as
they require separate credentials for backend authentication
from security point of view.
2019-01-29 16:05:20 +05:30
Praveen raj Mani fad59da29d `clientID` removed in the MQTT config (#7157)
More than one client can't use the same clientID for MQTT connection. 
This causes problem in distributed deployments where config is shared 
across nodes, as each Minio instance tries to connect to MQTT using the
same clientID.

This commit removes the clientID field in config, and allows
MQTT client to create random clientID for each node.
2019-01-29 15:00:15 +05:30
Aditya Manthramurthy 2786055df4 Add new SQL parser to support S3 Select syntax (#7102)
- New parser written from scratch, allows easier and complete parsing
  of the full S3 Select SQL syntax. Parser definition is directly
  provided by the AST defined for the SQL grammar.

- Bring support to parse and interpret SQL involving JSON path
  expressions; evaluation of JSON path expressions will be
  subsequently added.

- Bring automatic type inference and conversion for untyped
  values (e.g. CSV data).
2019-01-28 17:59:48 -08:00
Harshavardhana 0a28c28a8c Avoid code which looks at local files when etcd is configured (#7144)
This situation happens only in gateway nas which supports
etcd based `config.json` to support all FS mode features.

The issue was we would try to migrate something which doesn't
exist when etcd is configured which leads to inconsistent
server configs in memory.

This PR fixes this situation by properly loading config after
initialization, avoiding backend disk config migration to be
done only if etcd is not configured.
2019-01-28 13:31:35 -08:00
Harshavardhana 526546d588 Remove '.minio.sys/tmp' files in background (#7124)
If it does happen that we have a lot files in '.minio.sys/tmp',
minio startup might block deleting this folder. Rename and
delete in background instead to allow Minio to start serving
requests.
2019-01-25 13:33:28 -08:00
Aditya Manthramurthy 2053b3414f Reduce heal parallelism (#7155)
To avoid a large number of concurrent connections between minio
servers and to reduce CPU pressure, it is better to limit the number
of objects healed in parallel to number_of_CPUs.
2019-01-25 13:11:17 -08:00
kannappanr ce870466ff
Top Locks command implementation (#7052)
API to list locks used in distributed XL mode
2019-01-24 07:22:14 -08:00
Harshavardhana 964e354d06 Fix liveness check for NAS gateway (#7142)
Current master throws '503' unavailable for liveness check
```
~ curl -v http://localhost:9000/minio/health/live
> GET /minio/health/live HTTP/1.1
...
...
< HTTP/1.1 503 Service Unavailable
```

With this fix liveness check returns error appropriately
```
~ curl -v http://localhost:9000/minio/health/live
> GET /minio/health/live HTTP/1.1
...
...
< HTTP/1.1 200 OK
```
2019-01-24 19:14:05 +05:30
kannappanr 8ee8ad777c
logger: do not interpret encoded url as format string (#7110)
Error logger currently interprets encoded url in the error as a format string.
2019-01-24 00:30:00 -08:00
Krishna Srinivas 82af0be1aa Healing process should not heal root disk (#7089) 2019-01-23 15:29:29 -08:00
Harshavardhana bd25f31100 Use IAM creds only if endpoint is S3 (#7111)
Requirements like being able to run minio gateway in ec2
pointing to a Minio deployment wouldn't work properly
because IAM creds take precendence on ec2.

Add checks such that we only enable AWS specific features
if our backend URL points to actual AWS S3 not S3 compatible
endpoints.
2019-01-23 11:12:33 -08:00
Harshavardhana ee7dcc2903 Handle errs returned with etcd properly for config init and migration (#7134)
Returning unexpected errors can cause problems for config handling,
which is what led gateway deployments with etcd to misbehave and
had stopped working properly
2019-01-23 11:10:59 -08:00
Anis Elleuch dc2348daa5 heal: Preserve deployment ID from reference format.json (#7126)
Deployment ID is not copied into new formats after healing format. Although,
this is not critical since a new deployment ID will be generated and set in the
next cluster restart, it is still much better if we don't change the deployment
id of a cluster for a better tracking.
2019-01-22 18:32:06 -08:00
Aditya Manthramurthy 042d7f25e4 Fix regexp matcher of browser assets and paths (#7083)
Fix regexp matcher for special assets for the browser to clash with
less of the object namespace.

Assets should now be loaded with the /minio/ prefix. Previously,
favicon.ico (and others) could be loaded at any path matching
/minio/*/favicon.ico. This clashes with a large part of the object
namespace. With this change, /minio/favicon.ico will serve the favicon
but not /minio/mybucket/favicon.ico

Fixes #7077
2019-01-22 10:58:28 -08:00
Andreas Auernhammer 8c1b649b2d load system CAs before trying to load custom CAs (#7133)
This changes causes `getRootCAs` to always load system-wide CAs.
Any additional custom CAs (at `certs/CA/`) are added to the certificate pool
of system CAs.

The previous behavior was incorrect since all no system-wide CAs were
loaded if either there were CAs under `certs/CA` or the `certs/CA`
directory didn't exist at all.
2019-01-22 09:18:06 -08:00
Nitish Tiwari 0bb65f84bb
Add example for IPv6 for address flag (#7127) 2019-01-22 15:55:27 +05:30
Harshavardhana 8e0910ab3e Fix build issues on BSDs in pkg/cpu (#7116)
Also add a cross compile script to test always cross
compilation for some well known platforms and architectures
, we support out of box compilation of these platforms even
if we don't make an official release build.

This script is to avoid regressions in this area when we
add platform dependent code.
2019-01-22 09:27:23 +05:30
Harshavardhana 5353edcc38
Support policy variable replacement (#7085)
This PR supports iam and bucket policies to have
policy variable replacements in resource and
condition key values.

For example
- ${aws:username}
- ${aws:userid}
2019-01-21 10:27:14 +05:30
Harshavardhana 3265112d04 Remove gateway implementations for manta, sia and b2 (#7115) 2019-01-20 08:10:58 -08:00
Harshavardhana 4fdacb8b14
Add policy conditions support for Listing operations on browser (#7106)
Fixes https://github.com/minio/minio/issues/7095
2019-01-20 12:50:01 +05:30
Krishna Srinivas 267f183fc8 Do not do StorageInfo() and ListBuckets() for FS/Erasure in health check handler (#7090)
Health checking programs very frequently use /minio/health/live 
to check health, hence we can avoid doing StorageInfo() and 
ListBuckets() for FS/Erasure backend.
2019-01-20 10:28:36 +05:30
Harshavardhana 3d22a9d84f Support rootCAs for notification targets (#7108)
Add support for RootCAs for notification targets
mqtt and webhook
2019-01-20 09:57:18 +05:30
Krishna Srinivas 51ec61ee94 Fix healing whole file bitrot (#7123)
* Use 0-byte file for bitrot verification of whole-file-bitrot files

Also pass the right checksum information for bitrot verification

* Copy xlMeta info from latest meta except []checksums and []Parts while healing
2019-01-20 07:58:40 +05:30
Harshavardhana 74c2048ea9 Add proper contexts with timeouts for etcd operations (#7097)
This fixes an issue of perceived hang when incorrect
unreachable URLs are specified in MINIO_ETCD_ENDPOINTS
variable.

Fixes #7096
2019-01-18 09:36:45 -08:00
Krishna Srinivas 730ac5381c Simplify parallelReader.Read() (#7109)
Simplify parallelReader.Read() which also fixes previous 
implementation where it was returning before all the parallel 
reading go-routines had terminated which caused race conditions.
2019-01-18 21:18:24 +05:30
Alex Simenduev 6dd8a83c5a change credential chain order in s3 gateway to mimic official docs (#7091) 2019-01-17 10:31:51 -08:00
Krishna Srinivas 98c950aacd Streaming bitrot verification support (#7004) 2019-01-17 18:28:18 +05:30
Harshavardhana 8766c5eb22 Add version as part of Server: header (#7100)
This was agreed after discussing with @abperiasamy, we
borrowed the idea from Apache's own documentation.
2019-01-16 13:38:41 -08:00
kannappanr e0d22359e7
Fix lint warnings (#7099) 2019-01-16 12:49:20 -08:00
Harshavardhana 6dd13e68c2 Support V2 signatures when autoencryption is enabled (#7084)
When auto-encryption is turned on, we pro-actively add SSEHeader
for all PUT, POST operations. This is unusual for V2 signature
calculation because V2 signature doesn't have a pre-defined set
of signed headers in the request like V4 signature. According to
V2 we should canonicalize all incoming supported HTTP headers.

Make sure to validate signatures before we mutate http headers
2019-01-16 12:12:06 -08:00
Harshavardhana 633001c8ba Inherit certsDir from configDir if latter is set (#7098)
This is to ensure backward compatibility for all existing
deployments which use custom config dir to point to their
certs directory.
2019-01-16 12:04:32 -08:00
Bala FA e23a42305c Rebase minio/parquet-go and fix null handling. (#7067) 2019-01-16 21:52:04 +05:30
Krishna Srinivas 63d2583e91 Avoid holding write lock on config in situations where it is not needed (#7082)
This is to allow the cluster to come up when N/2 number of disks is available.
2019-01-16 13:59:21 +05:30
Harshavardhana e8791ae274 Remove Minio server arch, version from `Server:` header (#7074) 2019-01-15 13:16:11 +05:30
Praveen raj Mani 6571641735 Persist offline mqtt events in the `queueDir` and replay (#7037) 2019-01-14 12:39:00 +05:30
Harshavardhana 8757c963ba
Migrate all Peer communication to common Notification subsystem (#7031)
Deprecate the use of Admin Peers concept and migrate all peer
communication to Notification subsystem. This finally allows
for a common subsystem for all peer notification in case of
distributed server deployments.
2019-01-14 12:14:20 +05:30
Nick Craig-Wood 9c26fe47b0 Fix server side copy of files with `?` in - fixes #7058 (#7059)
Before this change the CopyObjectHandler and the CopyObjectPartHandler
both looked for a `versionId` parameter on the `X-Amz-Copy-Source` URL
for the version of the object to be copied on the URL unescaped version
of the header.  This meant that files that had question marks in were
truncated after the question mark so that files with `?` in their
names could not be server side copied.

After this change the URL unescaping is done during the parsing of the
`versionId` parameter which fixes the problem.

This change also introduces the same logic for the
`X-Amz-Copy-Source-Version-Id` header field which was previously
ignored, namely returning an error if it is present and not `null`
since minio does not currently support versions.

S3 Docs:
- https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html
- https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html
2019-01-10 13:10:10 -08:00
Sidhartha Mani f3f47d8cd3 Add ServerCPULoadInfo() and ServerMemUsageInfo() admin API (#7038) 2019-01-09 19:04:19 -08:00
poornas ed1275a063 Fix copy from encrypted multipart to single encrypted part (#7056)
When source is encrypted multipart object and the parts are not
evenly divisible by DARE package block size, target encrypted size
will not necessarily be the same as encrypted source object.
2019-01-09 15:17:21 -08:00
kannappanr a7d407fa42
Display message on failure to get lock on format.json in fs mode on startup (#6538)
Retry to see if the lock is free. Retry time will increase binomially.
2019-01-09 10:13:04 -08:00
Anis Elleuch 4e6e05f8e0 virtual host: Fix making new buckets (#7054)
This commit removes old code preventing PUT requests with '/' as a path,
because this is not needed anymore after the introduction of the virtual
host style in Minio server code.

'PUT /' when global domain is not configured already returns 405 Method
Not Allowed http error.
2019-01-09 11:59:41 +05:30
Bala FA b0deea27df Refactor s3select to support parquet. (#7023)
Also handle pretty formatted JSON documents.
2019-01-08 16:53:04 -08:00
kannappanr c59206bcd3
GCS ListMultipartUploads: Don't return on first uploadid (#7014)
ListMultipartUploads code returns only the first uploadid.

Fixes #7011
2019-01-08 11:03:28 -08:00
Harshavardhana 7f2d439baa Avoid printing in S3 tests (#7043) 2019-01-07 22:32:30 +05:30
poornas 5a80cbec2a Add double encryption at S3 gateway. (#6423)
This PR adds pass-through, single encryption at gateway and double
encryption support (gateway encryption with pass through of SSE
headers to backend).

If KMS is set up (either with Vault as KMS or using
MINIO_SSE_MASTER_KEY),gateway will automatically perform
single encryption. If MINIO_GATEWAY_SSE is set up in addition to
Vault KMS, double encryption is performed.When neither KMS nor
MINIO_GATEWAY_SSE is set, do a pass through to backend.

When double encryption is specified, MINIO_GATEWAY_SSE can be set to
"C" for SSE-C encryption at gateway and backend, "S3" for SSE-S3
encryption at gateway/backend or both to support more than one option.

Fixes #6323, #6696
2019-01-05 14:16:42 -08:00
Harshavardhana 2d19011a1d Add support for AssumeRoleWithWebIdentity (#6985) 2019-01-04 13:48:12 -08:00
Harshavardhana e82dcd195c Deprecate config-dir bring in certs-dir for TLS configuration (#7033)
This PR is to provide indication that config-dir will be removed
in future and all users should migrate to new --certs-dir option

Fixes #7016
Fixes #7032
2019-01-02 10:05:16 -08:00
Nitish Tiwari fcb56d864c Add ServerDrivesPerfInfo() admin API (#6969)
This is part of implementation for mc admin health command. The
ServerDrivesPerfInfo() admin API returns read and write speed
information for all the drives (local and remote) in a given Minio
server deployment.

Part of minio/mc#2606
2018-12-31 09:46:44 -08:00
Harshavardhana b5280ba243
Migrate to Go version 1.11.4 (#7026) 2018-12-28 14:04:39 -08:00
Harshavardhana 4e4f855b30
Add support for new policy conditions (#7024)
This PR implements following condition types

- StringEqualsIgnoreCase and StringNotEqualsIgnoreCase
- BinaryEquals
2018-12-26 17:39:30 -08:00
Harshavardhana fb8d0d7cf7
Add support for hostname lookups instead of IPs in MINIO_PUBLIC_IPS (#7018)
DNS names will be resolved to their respective IPs if specified
in MINIO_PUBLIC_IPS.

Fixes #6862
2018-12-23 03:08:21 -08:00
Harshavardhana a536cf5dc0 Buffconn should buffer upto maxHeaderBytes to avoid ErrBufferFull (#7017)
It can happen with erroneous clients which do not send `Host:`
header until 4k worth of header bytes have been read. This can lead
to Peek() method of bufio to fail with ErrBufferFull.

To avoid this we should make sure that Peek buffer is as large as
our maxHeaderBytes count.
2018-12-23 12:03:04 +05:30
Anis Elleuch 632022971b s3: Don't set NextMarker when listing is not truncated (#7012)
Setting NextMarker when IsTruncated is not set seems to be confusing
AWS C++ SDK, this commit will avoid setting any string in NextMarker.
2018-12-20 13:30:25 -08:00
kannappanr 7881791a91
CopyObject:Set Content-Type to application/octet-stream if it is not set (#6958) 2018-12-19 14:31:45 -08:00
Harshavardhana d2f8f8c7ee Fix ETag handling with auto-encryption with CopyObject conditions (#7000)
minio-java tests were failing under multiple places when
auto encryption was turned on, handle all the cases properly

This PR fixes

 - CopyObject should decrypt ETag before it does if-match
 - CopyObject should not try to preserve metadata of source
   when rotating keys, unless explicitly asked by the user.
 - We should not try to decrypt Compressed object etag, the
   potential case was if user sets encryption headers along
   with compression enabled.
2018-12-19 14:12:53 -08:00
kannappanr 8c32311b80 Change lock name to include names instead of hash. (#6886)
Previously lockname included the hash of the bucket, object
and uploadid.

This is a part of fix required to list oldest locks on the server.
2018-12-19 13:57:51 -08:00
Ashish Kumar Sinha 9bb88e610e Deletion of subfolders of multipart (#6961)
Delete subfolders under multipart folder upon completion of CompleteMultipartUpload, AbortMultipartUpload and cleanupStaleMultipartUploads functions
2018-12-19 11:27:10 -08:00
Harshavardhana d1e41695fe Add support for federation on browser (#6891) 2018-12-19 18:43:47 +05:30
Anis Elleuch 99b843a64e Add anonymous flag to prevent logging sensitive information (#6899) 2018-12-18 16:08:11 -08:00
Harshavardhana 4f31a9a33b Reload users upon AddUser on peers (#6975)
Also migrate ReloadFormat to notification subsystem,
remove GetConfig() we do not use this API anymore
2018-12-18 14:39:21 -08:00
Harshavardhana e7c902bbbc
Return proper errors when admin API is not initialized (#6988)
Especially in gateway IAM admin APIs are not enabled
if etcd is not enabled, we should enable admin API though
but only enable IAM and Config APIs with etcd configured.
2018-12-18 13:03:26 -08:00
poornas 7da0336ac8 Make sure env are loaded before gateway layer initialization (#6989) 2018-12-18 10:42:09 -08:00
Harshavardhana 3be616de3f Send deployment ID in notification event response elements (#6991) 2018-12-18 10:05:26 -08:00
Harshavardhana c5bf22fd90 Turn off printing IPv6 endpoints when listening on all interfaces (#6986)
By default when we listen on all interfaces, we print all the
endpoints that at local to all interfaces including IPv6
addresses. Remove IPv6 addresses in endpoint list to be
printed in endpoints unless explicitly specified with '--address'
2018-12-18 21:56:30 +05:30
poornas 7c9f934875 Disallow SSE requests when object layer has encryption disabled (#6981) 2018-12-14 21:39:59 -08:00
Andreas Auernhammer d264d2c899 add auto-encryption feature (#6523)
This commit adds an auto-encryption feature which allows
the Minio operator to ensure that uploaded objects are
always encrypted.

This change adds the `autoEncryption` configuration option
as part of the KMS conifguration and the ENV. variable
`MINIO_SSE_AUTO_ENCRYPTION:{on,off}`.

It also updates the KMS documentation according to the
changes.

Fixes #6502
2018-12-14 13:35:48 -08:00
Harshavardhana bebaff269c Support IPv6 in minio command line (#6947)
Fixes #6946
2018-12-14 13:07:46 +05:30
Harshavardhana 52b159b1db Allow versionId to be null for Delete,CopyObjectPart (#6972) 2018-12-14 11:34:37 +05:30
Nitish Tiwari 324834e4da Remove duplicate switch case (#6966)
Fixes #6948
2018-12-13 21:58:48 -08:00
Harshavardhana c2ed1347d9 Do not list objects unless specified in policy (#6970)
Currently we use GetObject to check if we are allowed to list,
this might be a security problem since there are many users now
who actively disable a publicly readable listing, anyone who
can guess the browser URL can list the objects.

This PR turns off this behavior and provides a more expected way
based on the policies.

This PR also additionally improves the Download() object
implementation to use a more streamlined code.

These are precursor changes to facilitate federation and web
identity support in browser.
2018-12-14 09:45:09 +05:30
Anis Elleuch 50f6f9fe58 S3 api: Ignore encoding in xml body (#6953)
One user reported having discovered the following error:

API: SYSTEM()
Time: 20:06:17 UTC 12/06/2018
Error: xml: encoding "US-ASCII" declared but Decoder.CharsetReader is nil
1: cmd/handler-utils.go:43:cmd.parseLocationConstraint()
2: cmd/auth-handler.go:250:cmd.checkRequestAuthType()
3: cmd/bucket-handlers.go:411:cmd.objectAPIHandlers.PutBucketHandler()
4: cmd/api-router.go100cmd.(objectAPIHandlers).PutBucketHandler-fm()
5: net/http/server.go:1947:http.HandlerFunc.ServeHTTP()

Hence, adding support of different xml encoding. Although there
is no clear specification about it, even setting "GARBAGE" as an xml
encoding won't change the behavior of AWS, hence the encoding seems
to be ignored.

This commit will follow that behavior and will ignore encoding field
and consider all xml as utf8 encoded.
2018-12-13 12:09:50 -08:00
Harshavardhana 6f7c99a333 Allow versionId to be null for Copy,Get,Head API calls (#6942)
Fixes #6935
2018-12-12 11:43:44 -08:00
Andreas Auernhammer 21d8c0fd13 refactor vault configuration and add master-key KMS (#6488)
This refactors the vault configuration by moving the
vault-related environment variables to `environment.go`
(Other ENV should follow in the future to have a central
place for adding / handling ENV instead of magic constants
and handling across different files)

Further this commit adds master-key SSE-S3 support.
The operator can specify a SSE-S3 master key using
`MINIO_SSE_MASTER_KEY` which will be used as master key
to derive and encrypt per-object keys for SSE-S3
requests.

This commit is also a pre-condition for SSE-S3
auto-encyption support.

Fixes #6329
2018-12-12 12:20:29 +05:30
Kale Blankenship 79b9a9ce46 Provide actual size in events instead of compressed size. (#6950)
Previous behavior did not check if the object was compressed and
incorrectly reported the stored size rather than the actual object
size.
2018-12-11 17:30:15 -08:00
Harshavardhana b9b353db4b Add env to support synchronous ops for all calls (#6877) 2018-12-11 16:22:56 -08:00
poornas 11a9b317a3 Disable ListenBucket notifications for NAS gateway (#6954) 2018-12-11 16:16:09 -08:00
Praveen raj Mani 9af7d627ac Preserve the compression headers while copying (#6952)
Fixes #6951
2018-12-11 12:05:41 -08:00
Harshavardhana 76d9d54603 Filter listing buckets based on user level access (#6940)
Fixes #6701
2018-12-10 22:57:22 +05:30
Harshavardhana 4c7c571875 Support JSON to CSV and CSV to JSON output format conversion (#6910)
This PR implements one of the pending items in issue #6286
in S3 API a user can request CSV output for a JSON document
and a JSON output for a CSV document. This PR refactors
the code a little bit to bring this feature.
2018-12-07 14:55:32 -08:00
Harshavardhana 3e124315c8 Increase the keep alive timeout to 30 secs (#6924)
Go by default uses a 3 * minute, we should
atleast use 30 secs as 10 secs is too aggressive.
2018-12-06 22:56:16 +05:30
Anis Elleuch 40852801ea fix: Better check of RPC type requests (#6927)
guessIsRPCReq() considers all POST requests as RPC but doesn't
check if this is an object operation API or not, which is actually
confusing bucket forwarder handler when it receives a new multipart
upload API which is a POST http request.

Due to this bug, users having a federated setup are not able to
upload a multipart object using an endpoint which doesn't actually
contain the specified bucket that will store the object.

Hence this commit will fix the described issue.
2018-12-05 14:28:48 -08:00
poornas f6980c4630 fix ConfigSys and NotificationSys initialization for NAS (#6920) 2018-12-05 14:03:42 -08:00
Harshavardhana 8fcc787cba Register notFound handler only once per root router (#6926)
registering notFound handler more than once causes
gorilla mux to return error for all registered paths
greater than > 8. This might be a bug in the gorilla/mux
but we shouldn't be using it this way. NotFound handler
should be only registered once per root router.

Fixes #6915
2018-12-05 11:54:12 -08:00
Praveen raj Mani 4e6d3c093f Errors in notification config should not crash the server (#6881)
Fixes #6870
2018-12-04 18:27:12 -08:00
Anis Elleuch 61145361fd Fallback to non-loopback IF addresses for Domain IPs (#6918)
When MINIO_PUBLIC_IPS is not specified and no endpoints are passed
as arguments, fallback to the address of non loop-back interfaces.

This is useful so users can avoid setting MINIO_PUBLIC_IPS in docker
or orchestration scripts, ince users naturally setup an internal
network that connects all instances.
2018-12-04 17:35:22 -08:00
Harshavardhana 6add646130 Fix logging of initialization errors in distributed mode (#6914) 2018-12-04 10:25:56 -08:00
Harshavardhana 20e61fb362 Redirect browser requests only if browser is enabled (#6909)
This PR fixes an issue introduced in PR #6848, when
browser is disabled we shouldn't re-direct the requests
returning AccessDenied.

Fixes #6907
2018-12-04 13:08:24 +05:30
Bala FA 18ced1102c handle post policy only if it is set. (#6852)
Previously policy in post form is assumed to be set always.  This is
fixed by doing the check when policy is set.
2018-12-03 12:01:28 -08:00
Harshavardhana d6af3c1237 Add bucket notification support for NAS gateway (#6908)
Fixes #6885
2018-12-03 14:02:14 +05:30
Andreas Auernhammer 5549a44566 rename vault namespace env variable to be more idiomatic (#6905)
This commit renames the env variable for vault namespaces
such that it begins with `MINIO_SSE_`. This is the prefix
for all Minio SSE related env. variables (like KMS).
2018-12-01 05:28:49 -08:00
Praveen raj Mani e7af31c2ff Removed `clientID` from NATS-Streaming Config (#6391)
clientID must be a unique `UUID` for each connections. Now, the
server generates it, rather considering the config.

Removing it as it is non-beneficial right now.

Fixes #6364
2018-11-30 10:46:17 +05:30
Harshavardhana 26120d7838 Ignore permission errors on config-dir (#6894) 2018-11-29 18:14:05 -08:00
Harshavardhana bef7c01c58 Choose right users in federation mode for CopyObject (#6895) 2018-11-29 17:35:11 -08:00
kannappanr d85199e9de
Vendorize minio-go (#6883)
Fixes #6873
2018-11-29 11:13:03 -08:00
Harshavardhana 8608a84c23 Ignore hidden directory .snapshot for NetApp volumes (#6889) 2018-11-29 11:39:21 +05:30
Harshavardhana b7226f4c82 Add transaction lock when migrating configs (#6878)
When migrating configs it happens often that some
servers fail to start due to version mismatch etc.

Hold a transaction lock such that all servers get
serialized.
2018-11-28 17:38:23 -08:00
Pontus Leitzler f930ffe9e2 Fix gcs context (#6869) 2018-11-28 13:49:51 +05:30