Commit Graph

2573 Commits

Author SHA1 Message Date
Harshavardhana
957ecb1b64
use optimal memory while purging cache (#9426)
re-implement the cache purging routine to
avoid using ioutil.ReadDir which can lead
to high allocations when there are cache
directories with lots of content, or
when cache is installed in memory constrainted
environments.

Instead rely on a callback function where we
are not using memory no-more than 8KiB per
cycle.

Precursor for this change refer #9425, original
issue pointed by Caleb Case <caleb@storj.io>
2020-04-23 12:26:13 -07:00
Boaz
ac5061df2c
fix: make azure gateway chunk size configurable (#9292) 2020-04-23 02:04:13 -07:00
Anis Elleuch
4cd6ca02c7
fix: Add missing return in admin requests auth (#9422) 2020-04-22 13:42:01 -07:00
Egon Elbre
a5efcbab51
fix: cacheReader.Close in all paths that don't return it. (#9418) 2020-04-22 12:13:57 -07:00
Egon Elbre
85be7b39ac
Call cleanup funcs when skip fails (#9417) 2020-04-22 10:06:56 -07:00
Nitish Tiwari
ebf3dda449
Update server startup example to showcase local erasure code (#9407) 2020-04-21 23:59:13 -07:00
poornas
582953260b
Increase response header timeout for gateway (#9400)
fixes: #9295
2020-04-21 19:21:27 -07:00
Praveen raj Mani
322385f1b6
fix: only show active/available ARNs in server startup banner (#9392) 2020-04-21 09:38:32 -07:00
Anis Elleuch
a69c98e394
fix: Correct typo when registering peer Delete User API (#9403) 2020-04-21 08:35:19 -07:00
Harshavardhana
282c9f790a
fix: validate partNumber in queryParam as part of preConditions (#9386) 2020-04-20 22:01:59 -07:00
Anis Elleuch
2eeb0e6a0b
heal: Fix heal buckets result reporting (#9397)
healBucket() was not properly collecting results after healing
buckets. This commit adds After drives information correctly.
2020-04-20 13:48:54 -07:00
Harshavardhana
3ff5bf2369
fix: convert storage class into azure tiers (#9381) 2020-04-19 13:42:56 -07:00
Harshavardhana
69ee28a082
remove OSS gateway due to lack of licensing (#9390)
OSS go sdk lacks licensing terms in their
repository, and there has been no activity

On the issue here https://github.com/aliyun/aliyun-oss-go-sdk/issues/245

This PR is to ensure we remove any dependency code which
lacks explicit license file in their repo.
2020-04-18 22:12:51 -07:00
Sidhartha Mani
3e78ea8acc
improve obd tests and optimize network (#9378)
- keep long running obd network tests alive
- fix error - wrong number of parents in process OBD info
- ensure that osinfo does not error out when inside containers
- remove limit on max number of connections per client transport

The generic client transport uses a default limit of 64 conns per transport.
This could end up limiting and throttling usage, and artificially slowing
down the performance of MinIO even on hardware capable of doing better.
2020-04-18 11:06:11 -07:00
Praveen raj Mani
c79358c67e
notification queue limit has no maxLimit (#9380)
New value defaults to 100K events by default,
but users can tune this value upto any value
they seem necessary.

* increase the limit to maxint64 while validating
2020-04-18 01:20:56 -07:00
Klaus Post
c4464e36c8
fix: limit HTTP transport tuables to affordable values (#9383)
Close connections pro-actively in transient calls
2020-04-17 11:20:56 -07:00
Harshavardhana
d92db198d1
Add target parsing code for config (#9375)
This code is helper for mcs project
2020-04-16 17:43:14 -07:00
Harshavardhana
8bae956df6
allow copyObject to rotate storageClass of objects (#9362)
Added additional mint tests as well to verify, this
functionality.

Fixes #9357
2020-04-16 17:42:44 -07:00
Harshavardhana
c82fa2c829
fix: load LDAP users appropriately (#9360)
This PR also fixes issues when

deletePolicy, deleteUser is idempotent so can lead to
issues when client can prematurely timeout, so a retry
call error response should be ignored when call returns
http.StatusNotFound

Fixes #9347
2020-04-16 16:22:34 -07:00
Harshavardhana
a51280fd20
allow config help in gateway mode (#9356)
allow `mc admin config set mygateway/ audit_webhook --env`
to fetch the documentation as needed, this is just to
ensure that our users can still access the relevant
ENV docs while running in gateway mode.
2020-04-16 14:49:12 -07:00
Klaus Post
bd437c1c17
set server base context on gateway http server (#9365) 2020-04-16 11:54:12 -07:00
Harshavardhana
69fb68ef0b
fix simplify code to start using context (#9350) 2020-04-16 10:56:18 -07:00
Harshavardhana
bde0f444db
fix support OBDAdminAction is valid action (#9354) 2020-04-15 12:16:40 -07:00
Klaus Post
f19cbfad5c
fix: use per test context (#9343)
Instead of GlobalContext use a local context for tests.
Most notably this allows stuff created to be shut down 
when tests using it is done. After PR #9345 9331 CI is 
often running out of memory/time.
2020-04-14 17:52:38 -07:00
Harshavardhana
5c11a46412 update minio-go/parquet-go to latest 2020-04-14 16:53:29 -07:00
Anis Elleuch
8a94aebdb8
config: Add api requests max & deadline configs (#9273)
Add two new configuration entries, api.requests-max and
api.requests-deadline which have the same role of
MINIO_API_REQUESTS_MAX and MINIO_API_REQUESTS_DEADLINE.
2020-04-14 12:46:37 -07:00
Sidhartha Mani
ec11e99667
implement configurable timeout for OBD tests (#9324) 2020-04-14 11:48:32 -07:00
Harshavardhana
37d066b563
fix: deprecate requirement of session token for service accounts (#9320)
This PR fixes couple of behaviors with service accounts

- not need to have session token for service accounts
- service accounts can be generated by any user for themselves
  implicitly, with a valid signature.
- policy input for AddNewServiceAccount API is not fully typed
  allowing for validation before it is sent to the server.
- also bring in additional context for admin API errors if any
  when replying back to client.
- deprecate GetServiceAccount API as we do not need to reply
  back session tokens
2020-04-14 11:28:56 -07:00
Praveen raj Mani
bfec5fe200
fix: fetchLambdaInfo should return consistent results (#9332)
- Introduced a function `FetchRegisteredTargets` which will return
  a complete set of registered targets irrespective to their states,
  if the `returnOnTargetError` flag is set to `False`
- Refactor NewTarget functions to return non-nil targets
- Refactor GetARNList() to return a complete list of configured targets
2020-04-14 11:19:25 -07:00
Bala FA
525287f4b6
remove queue only if index is within the range (#9341)
Fixes minio/mc#3155
2020-04-14 11:06:23 -07:00
Harshavardhana
9054ce73b2
fix: deprecate skyring/uuid and use maintained google/uuid (#9340) 2020-04-14 02:40:05 -07:00
Harshavardhana
d079adc167
fix: remove initGlobalContext writes in tests (#9331)
since we do not close GlobalContext, we do not
need to reinitialize it inside test code
2020-04-13 23:21:01 -07:00
Harshavardhana
a9d401ac10
fix: update docs to mention erasure guide (#9339) 2020-04-14 11:38:14 +05:30
kannappanr
1fa65c7f2f
fix: object lock behavior when default lock config is enabled (#9305) 2020-04-13 14:03:23 -07:00
Harshavardhana
4314ee1670
fix: remove unusued PerfInfoHandler code (#9328)
- Removes PerfInfo admin API as its not OBDInfo
- Keep the drive path without the metaBucket in OBD
  global latency map.
- Remove all the unused code related to PerfInfo API
- Do not redefined global mib,gib constants use
  humanize.MiByte and humanize.GiByte instead always
2020-04-12 19:37:09 -07:00
Harshavardhana
7d636a7c13
enable --compat flag by default (#9326)
if needed use --no-compat to disable md5sum while
verifying any performance numbers.

bring back --compat behavior as default to avoid
additional documentation and confusing behavior,
as we are working towards improving md5sum to
be faster on AVX instructions, enabling this
should be hardly a problem in future versions
of MinIO.

fixes #8012
fixes #7859
fixes #7642
2020-04-12 18:08:27 -07:00
Harshavardhana
bf9d51cf14
fix: add missing copyright headers in some files (#9321) 2020-04-12 13:55:22 -07:00
Harshavardhana
29e0727b58
fix: regression in CopyObject not preserving ETag in --compat (#9322)
issue found after `git bisect` to commit db41953618
2020-04-11 20:20:30 -07:00
Anis Elleuch
c434dff0a4
posix: Add missing error return in RenameFile() (#9319)
Although it should not happen in most cases.
2020-04-11 11:15:30 -07:00
Taras Parkhomenko
b2a8cb4aba
Add SHA-3 support (#9308) 2020-04-10 14:59:52 -07:00
Harshavardhana
b412a222ae
Add missing comment key from key list (#9313)
Continuing from previous PR #9304, comment
is a special key is not present in the
default KV list. Add it explicitly when
tokenizing fields as it may be possible that
some clients might try to set comments.
2020-04-10 11:44:28 -07:00
Sidhartha Mani
9f81d014f1
fix: drive names in output of parallel obd test (#9312) 2020-04-09 22:44:17 -07:00
Harshavardhana
3184205519
fix: config to support keys with special values (#9304)
This PR adds context-based `k=v` splits based
on the sub-system which was obtained, if the
keys are not provided an error will be thrown
during parsing, if keys are provided with wrong
values an error will be thrown. Keys can now
have values which are of a much more complex
form such as `k="v=v"` or `k=" v = v"`
and other variations.

additionally, deprecate unnecessary postgres/mysql
configuration styles, support only

- connection_string for Postgres
- dsn_string for MySQL

All other parameters are removed.
2020-04-09 21:45:17 -07:00
Andreas Auernhammer
db41953618
avoid unnecessary KMS requests during single-part PUT (#9220)
This commit fixes a performance issue caused
by too many calls to the external KMS - i.e.
for single-part PUT requests.

In general, the issue is caused by a sub-optimal
code structure. In particular, when the server
encrypts an object it requests a new data encryption
key from the KMS. With this key it does some key
derivation and encrypts the object content and
ETag.

However, to behave S3-compatible the MinIO server
has to return the plaintext ETag to the client
in case SSE-S3.
Therefore, the server code used to decrypt the
(previously encrypted) ETag again by requesting
the data encryption key (KMS decrypt API) from
the KMS.

This leads to 2 KMS API calls (1 generate key and
1 decrypt key) per PUT operation - while only
one KMS call is necessary.

This commit fixes this by fetching a data key only
once from the KMS and keeping the derived object
encryption key around (for the lifetime of the request).

This leads to a significant performance improvement
w.r.t. to PUT workloads:
```
Operation: PUT
Operations: 161 -> 239
Duration: 28s -> 29s
* Average: +47.56% (+25.8 MiB/s) throughput, +47.56% (+2.6) obj/s
* Fastest: +55.49% (+34.5 MiB/s) throughput, +55.49% (+3.5) obj/s
* 50% Median: +58.24% (+32.8 MiB/s) throughput, +58.24% (+3.3) obj/s
* Slowest: +1.83% (+0.6 MiB/s) throughput, +1.83% (+0.1) obj/s
```
2020-04-09 17:01:45 -07:00
Harshavardhana
f44cfb2863
use GlobalContext whenever possible (#9280)
This change is throughout the codebase to
ensure that all codepaths honor GlobalContext
2020-04-09 09:30:02 -07:00
Anis Elleuch
1b45be0d60
lifecycle: Disallow delete when the object is locked (#9272) 2020-04-09 09:28:57 -07:00
Aditya Manthramurthy
6bb693488c
Fix policy setting error in LDAP setups (#9303)
Fixes #8667

In addition to the above, if the user is mapped to a policy or 
belongs in a group, the user-info API returns this information, 
but otherwise, the API will now return a non-existent user error.
2020-04-09 01:04:08 -07:00
Harshavardhana
e20e08d700
fix: remove the sleep from listing operations (#9287)
make rest of the Walk() function more predictable,
it was observed that in nominal deployments even
without much workload the drives are generally
slow for respond for readdir operations, for the
sleepDuration factor of 10 this can cause
unexpected slowness in the Listing calls, while
it is good for all other I/O, it may simply slow
down Listing immensely which is not useful.

fixes #9261
2020-04-08 19:42:57 -07:00
Harshavardhana
ac07df2985
start watcher after all creds have been loaded (#9301)
start watcher after all creds have been loaded
to avoid any conflicting locks that might get
deadlocked.

Deprecate unused peer calls for LoadUsers()
2020-04-08 19:00:39 -07:00
Anis Elleuch
e51e465543
delete: Use physical Dir() for proper prefix cleanup in Windows (#9297)
In FS mode under Windows, removing an object will not automatically.
remove parent empty prefixes.

The reason is that path.Dir() was used, however filepath.Dir() is
more appropriate since filepath is physical (meaning it operates
on OS filesystem paths)

This is not caught because failure for Windows CI is not caught.
2020-04-08 11:32:58 -07:00
Pontus Leitzler
a973402821
add object api check in fs-v1 before returning ready (#9285)
fs-v1 in server mode only checks to see if the path exist, so that it
returns ready before it is indeed ready.

This change adds a check to ensure that the global object api is
available too before reporting ready.

Fixes #9283
2020-04-08 08:53:20 -07:00
César Nieto
3ea1be3c52
allow delete of a group with no policy set (#9288) 2020-04-08 06:03:57 -07:00
Harshavardhana
2642e12d14
fix: change policies API to return and take struct (#9181)
This allows for order guarantees in returned values
can be consumed safely by the caller to avoid any
additional parsing and validation.

Fixes #9171
2020-04-07 19:30:59 -07:00
Harshavardhana
e7276b7b9b
fix: make single locks for both IAM and object-store (#9279)
Additionally add context support for IAM sub-system
2020-04-07 14:26:39 -07:00
Harshavardhana
e375341c33
fix: allow any 127.0.0.x as bind IPs (#9281)
It is some times common and convenient to use
just local IPs for testing purposes, 127.0.0.x
are special IPs regardless of being available on
an interface they can be bound to on all operating
systems.

Allow this behavior to work for minio server

fixes #9274
2020-04-07 09:40:20 -07:00
Harshavardhana
2c20716f37
fix: Avoid force delete in compliance/worm mode (#9276)
also, bring in an additional policy to ensure that
force delete bucket is only allowed with the right
policy for the user, just DeleteBucketAction
policy action is not enough.
2020-04-06 17:51:05 -07:00
Harshavardhana
91f21ddc47
fix: ignore lost+found properly while reading disks (#9278)
Fixes #9277
2020-04-06 16:51:18 -07:00
Harshavardhana
43a3778b45
fix: support object-remaining-retention-days policy condition (#9259)
This PR also tries to simplify the approach taken in
object-locking implementation by preferential treatment
given towards full validation.

This in-turn has fixed couple of bugs related to
how policy should have been honored when ByPassGovernance
is provided.

Simplifies code a bit, but also duplicates code intentionally
for clarity due to complex nature of object locking
implementation.
2020-04-06 13:44:16 -07:00
Harshavardhana
4714958e99
fix: possible connection leaks in sets init, heal (#9263) 2020-04-03 18:06:31 -07:00
Harshavardhana
ab66b23194
fix: allow listBuckets with listBuckets permission (#9253) 2020-04-02 12:35:22 -07:00
Harshavardhana
73f9d8a636
set default storage class always (#9250)
gateway implementations might not respond
back with right storage class which is
an AWS S3 concept, add default storage
if its empty.
2020-04-02 00:23:09 -07:00
Krishna Srinivas
541a778d7b
fix: do not exit on bootstrap Verify() to allow for rolling upgrades (#9235) 2020-04-01 21:40:03 -07:00
Harshavardhana
d49f2ec19c
fix: use specified authToken for audit/logger HTTP targets (#9249)
We were not using the auth token specified
even when config supports it.
2020-04-01 20:53:07 -07:00
ebozduman
8dd63a462f
fix: ETag returned by OSS endpoint (#9243) 2020-04-01 19:51:12 -07:00
poornas
336460f67e
fix: gateway_s3_bytes_sent metric for all API methods (#9242)
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-04-01 12:52:31 -07:00
Bala FA
95e89f1712
proactive deep heal object when a bitrot is detected (#9192) 2020-04-01 12:14:00 -07:00
Harshavardhana
d8af244708
Add numeric/date policy conditions (#9233)
add new policy conditions

- NumericEquals
- NumericNotEquals
- NumericLessThan
- NumericLessThanEquals
- NumericGreaterThan
- NumericGreaterThanEquals
- DateEquals
- DateNotEquals
- DateLessThan
- DateLessThanEquals
- DateGreaterThan
- DateGreaterThanEquals
2020-04-01 00:04:25 -07:00
Sidhartha Mani
c8243706b4
Add Parallel NetOBD tests to saturate all nodes at once (#9241) 2020-03-31 17:08:28 -07:00
Harshavardhana
30707659b5
[feature] allow for an odd number of erasure packs (#9221)
Too many deployments come up with an odd number
of hosts or drives, to facilitate even distribution
among those setups allow for odd and prime numbers
based packs.
2020-03-31 09:32:16 -07:00
poornas
90c365a174
fix: allow overwriting objects under lock after retention period (#9232)
fixes #9230
2020-03-31 09:15:42 -07:00
Sidhartha Mani
7b732b566f
[Bugfix] Fix Net tests being omitted (#9234) 2020-03-31 01:15:21 -07:00
Harshavardhana
ba52a925f9
fix: delete dangling directories properly (#9222) 2020-03-30 09:48:24 -07:00
ebozduman
fdda5f98c6
Makes mandatory dsn_string parameter optional (#8931) 2020-03-28 22:20:02 -07:00
Ingmar Runge
fa4d627b57
B2 gateway S3 compat: return MD5 hash as ETag from PutObject (#9183)
- B2 does actually return an MD5 hash for newly uploaded objects
  so we can use it to provide better compatibility with S3 client
  libraries that assume the ETag is the MD5 hash such as boto.
- depends on change in blazer library.
- new behaviour is only enabled if MinIO's --compat mode is active.
- behaviour for multipart uploads is unchanged (works fine as is).
2020-03-28 13:59:55 -07:00
Bala FA
2c3e34f001
add force delete option of non-empty bucket (#9166)
passing HTTP header `x-minio-force-delete: true` would 
allow standard S3 API DeleteBucket to delete a non-empty
bucket forcefully.
2020-03-27 21:52:59 -07:00
Anis Elleuch
7f8f1ad4e3
fix: cleanup lifecycle unused code (#9219) 2020-03-27 18:57:50 -07:00
Harshavardhana
6f992134a2
fix: startup load time by reusing storageDisks (#9210) 2020-03-27 14:48:30 -07:00
Sidhartha Mani
0c80bf45d0
Implement oboard diagnostics admin API (#9024)
- Implement a graph algorithm to test network bandwidth from every 
  node to every other node
- Saturate any network bandwidth adaptively, accounting for slow 
  and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos
2020-03-26 21:07:39 -07:00
Anis Elleuch
b207520d98
Fix lifecycle GET: AWS SDK complaints on empty config (#9201) 2020-03-25 21:06:03 -07:00
Krishna Srinivas
ef6304c5c2
Improve connectDisks() performance (#9203) 2020-03-24 23:26:13 -07:00
Nitish Tiwari
6b984410d5
Add support for self-healing related metrics in Prometheus (#9079)
Fixes #8988

Co-authored-by: Anis Elleuch <vadmeste@users.noreply.github.com>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-24 22:40:45 -07:00
Harshavardhana
813e0fc1a8
fix: optimize isConnected to avoid url.String() conversions (#9202)
Stringifying in a loop can tax the system, avoid this
and convert the endpoints to strings early on and
remember them for the lifetime of the server.
2020-03-24 18:53:24 -07:00
Harshavardhana
6f6a2214fc
Add rate limiter for S3 API layer (#9196)
- total number of S3 API calls per server
- maximum wait duration for any S3 API call

This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.

This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
2020-03-24 12:43:40 -07:00
Anis Elleuch
791821d590
sa: Allow empty policy to indicate parent user's policy is inherited (#9185) 2020-03-23 14:17:18 -07:00
Harshavardhana
9a951da881
honor the credentials of user admin for encrypt/decrypt (#9194)
Fixes #9193
2020-03-23 14:06:00 -07:00
Harshavardhana
ff932ca2a0
fix: log only catastrophic errors in prepare storage (#9189) 2020-03-23 07:32:18 -07:00
poornas
818d3bcaf5
fix: deprecate TestDiskCache test from unit tests (#9187) 2020-03-22 23:46:36 -07:00
Krishna Srinivas
45b1c66195
fix: implement splunk specific listObjects when delimiter=guidSplunk (#9186) 2020-03-22 19:23:47 -07:00
Harshavardhana
da04cb91ce
optimize listObjects to list only from 3 random disks (#9184) 2020-03-22 16:33:49 -07:00
Harshavardhana
cfc9cfd84a
fix: various optimizations, idiomatic changes (#9179)
- acquire since leader lock for all background operations
  - healing, crawling and applying lifecycle policies.

- simplify lifecyle to avoid network calls, which was a
  bug in implementation - we should hold a leader and
  do everything from there, we have access to entire
  name space.

- make listing, walking not interfere by slowing itself
  down like the crawler.

- effectively use global context everywhere to ensure
  proper shutdown, in cache, lifecycle, healing

- don't read `format.json` for prometheus metrics in
  StorageInfo() call.
2020-03-22 12:16:36 -07:00
Harshavardhana
ea18e51f4d
Support multiple LDAP OU's, smAccountName support (#9139)
Fixes #8532
2020-03-21 22:47:26 -07:00
Harshavardhana
3d3beb6a9d
Add response header timeouts (#9170)
- Add conservative timeouts upto 3 minutes
  for internode communication
- Add aggressive timeouts of 30 seconds
  for gateway communication

Fixes #9105
Fixes #8732
Fixes #8881
Fixes #8376
Fixes #9028
2020-03-21 22:10:13 -07:00
poornas
27b8f18cce
Fix storage info message on startup (#9177) 2020-03-21 10:02:20 -07:00
Harshavardhana
b4bfdc92cc fix: admin console logger changes to log.Info 2020-03-20 15:14:14 -07:00
Harshavardhana
ae654831aa
Add madmin package context support (#9172)
This is to improve responsiveness for all
admin API operations and allowing callers
to cancel any on-going admin operations,
if they happen to be waiting too long.
2020-03-20 15:00:44 -07:00
Stephen N
1ffa983a9d
added support for SASL/SCRAM on Kafka bucket notifications. (#9168)
fixes #9167
2020-03-20 11:10:27 -07:00
Nitish Tiwari
ecf1566266
Add an option to allow plaintext connection to LDAP/AD Server (#9151) 2020-03-19 19:20:51 -07:00
Harshavardhana
b1a2169dcc
fix: data usage crawler env handling, usage-cache.bin location (#9163)
canonicalize the ENVs such that we can bring these ENVs 
as part of the config values, as a subsequent change.

- fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin`
- fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json`
  (avoid conflicts with a bucket named `usage.json` )
- fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin`
  (avoid conflicts with a bucket named `usage.bin`
2020-03-19 09:47:47 -07:00
Harshavardhana
d45a1808f2
fix: Walk() should require quorum number of disks only (#9164) 2020-03-18 20:56:07 -07:00
Anis Elleuch
db2155551a
heal: Pass scan mode to HealObjects to deep scan full quorum objects (#9159)
As an optimization of the healing, HealObjects() avoid sending an
object to the background healing subsystem when the object is
present in all disks.

However, HealObjects() should have checked the scan type, if this
deep, always pass the object to the healing subsystem.
2020-03-18 17:50:00 -07:00
Harshavardhana
09d35d3b4c
fix: sts to return appropriate errors (#9161) 2020-03-18 17:25:45 -07:00
Anis Elleuch
5b9342d35c
xl: Tree walking should not quit when one disk returns empty (#9160)
Currently, a tree walking, needed to a list objects in a specific
set quits listing as long as it finds no entries in a disk, which
is wrong.

This affected background healing, because the latter is using
tree walk directly. If one object does not exist in the first
disk for example, it will be seemed like the object does not
exist at all and no healing work is needed.

This commit fixes the behavior.
2020-03-18 16:58:05 -07:00
Klaus Post
8d98662633
re-implement data usage crawler to be more efficient (#9075)
Implementation overview: 

https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b
2020-03-18 16:19:29 -07:00
Anis Elleuch
7fdeb44372
info: Initialize boot time early so uptime will always be correct (#9154) 2020-03-17 16:37:28 -07:00
poornas
59dced8237
Print node status even in --quiet mode (#9149) 2020-03-17 15:25:00 -07:00
Anis Elleuch
496f4a7dc7
Add service account type in IAM (#9029) 2020-03-17 10:36:13 -07:00
kannappanr
8b880a246a
fix: deleteObjectTagging should 204 on success (#9150) 2020-03-16 23:21:24 -07:00
Klaus Post
eeb5942b6b
fix: remote profile names and extension (#9145)
Remote profiles are not formatted correctly:

```
profile-172.31.91.126_9000-cpu.pprof
profile-172.31.91.126_9000-goroutines-before.txt
profile-172.31.91.126_9000-goroutines.txt
profiling-172.31.80.49_9000-cpu.pprof.pprof
profiling-172.31.80.49_9000-goroutines-before.txt.pprof
profiling-172.31.80.49_9000-goroutines.txt.pprof
profiling-172.31.86.101_9000-cpu.pprof.pprof
profiling-172.31.86.101_9000-goroutines-before.txt.pprof
profiling-172.31.86.101_9000-goroutines.txt.pprof
profiling-172.31.91.191_9000-cpu.pprof.pprof
profiling-172.31.91.191_9000-goroutines-before.txt.pprof
profiling-172.31.91.191_9000-goroutines.txt.pprof
```

`profiling` -> `profile`, remove extra extension.
2020-03-16 11:39:53 -07:00
Harshavardhana
c9212819af
fix: lock maintenance should honor quorum (#9138)
The staleness of a lock should be determined by
the quorum number of entries returning stale,
this allows for situations when locks are held
when nodes are down - we don't accidentally
clear locks unintentionally when they are valid
and correct.

Also lock maintenance should be run by all servers,
not one server, stale locks need to be run outside
the requirement for holding distributed locks.

Thanks @klauspost for reproducing this issue
2020-03-15 11:55:52 -07:00
poornas
10fd53d6bb
Fix: admin config set API for notifications (#9085)
Filter out targets set via env when
validating incoming config change against
configured notification targets

Fixes #9066
2020-03-14 00:01:15 -07:00
Krishna Srinivas
2e9fed1a14
non-empty dirs should not be listed as objects (#9129) 2020-03-13 17:43:00 -07:00
Kody A Kantor
06e30b5aa1
Skip building directio on platforms that don't support Direct IO (#9059) 2020-03-12 18:57:41 -07:00
Harshavardhana
a54cdb9587
fix: Send x-amz-mp-parts-count for multiparted objects (#9116)
Some AWS SDKs latently rely on this value some times
to calculate the right number of parts during a parallel
GetObject request, this is feature used along with
content-range - we should support this as well.
2020-03-12 12:37:27 -07:00
Harshavardhana
cfd12914e1
fix: crash in serverInfo handler when ldap is configured (#9123) 2020-03-11 23:13:32 -07:00
Anis Elleuch
fdf65aa9b9
heal: Add info about the next background healing round (#9122)
- avoid setting last heal activity when starting self-healing

This can be confusing to users thinking that the self healing
cycle was already performed.

- add info about the next background healing round
2020-03-11 23:00:31 -07:00
Harshavardhana
69b2aacf5a
fix return proper error for OperationTimedout (#9117)
OperationTimedout error occurs when locking
timesout, trying to acquire a lock. This
error should be returned appropriately to
the client with http status "408" (request timedout)

This translation was broken, fix it.
2020-03-11 14:11:04 -07:00
Anis Elleuch
0af62d35a0
xl: Implement posix.DeletePrefixes to enhance delete perf (#9100)
Bulk delete API was using cleanupObjectsBulk() which calls posix
listing and delete API to remove objects internal files in the
backend (xl.json and parts) one by one.

Add DeletePrefixes in the storage API to remove the content
of a directory in a single call.

Also use a remove goroutine for each disk to accelerate removal.
2020-03-11 08:56:36 -07:00
Nitish Tiwari
7c32f3f554
Fix the URL for MinIO update when using custom download server (#9111)
Co-authored-by: Nitish Tiwari <nitish@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-11 20:09:20 +05:30
Harshavardhana
5ab9cc029d
fix: crash observed for anonymous deletes from UI (#9107) 2020-03-09 21:21:35 -07:00
Harshavardhana
6a00eb10bf
fix: allow set drive count of proper divisible values (#9101)
Currently the code assumed some orthogonal requirements
which led situations where when we have a setup where
we have let's say for example 168 drives, the final
set_drive_count chosen was 14. Indeed 168 drives are
divisible by 12 but this wasn't allowed due to an
unexpected requirement to have 12 to be a perfect modulo
of 14 which is not possible. This assumption was incorrect.

This PR fixes this old assumption properly, also adds
few tests and some negative tests as well. Improvements
are seen in error messages as well.
2020-03-08 13:30:25 -07:00
Harshavardhana
792ee48d2c
add additional logging during server formatting (#9102) 2020-03-08 12:12:07 -07:00
Harshavardhana
88ae0f1196
Improve delete performance by reducing the number of calls (#9092)
- Remove the requirement to honor storage class for deletes
- Improve `posix.DeleteFileBulk` code to Stat the volumeDir
  only once per call, rather than for all object paths.
2020-03-06 13:44:24 -08:00
Anis Elleuch
23a0415eb7
profiling: Fix crash when enabling goroutines profiling (#9097)
This commit replaces 'goroutines' with 'goroutine' when passing it
to pprof library when activating goroutine type profiling
2020-03-06 13:22:47 -08:00
Anis Elleuch
75a0661213
data-usage: Fix the calculation of the next crawling round (#9096)
This commit fixes a simple typo miscalculated the waiting time
until the next round of data crawling to compute the data usage.
2020-03-06 11:34:12 -08:00
kannappanr
07a7f329e7
xl: Fix counting offline disks in StorageInfo (#9082)
Recent modification in the code led to incorrect calculation
of offline disks.

This commit saves the endpoint list in a xlObjects then we know
the name of each disk.
2020-03-04 16:18:32 -08:00
kannappanr
c7ca791c58
fix: lock expiry on zoned setups (#9084)
lock ownership is limited to endpoints on first zone,
as we do not hold locks on other zones in an expanded
setup. current code unintentionally expired active locks
when it couldn't see ownership from the secondary zone
which leads to unexpected bugs as locking fails to work
as expected.
2020-03-04 16:06:17 -08:00
kannappanr
d9be8bc693
Add env. variable to disable data usage crawling (#9086) 2020-03-04 15:51:03 -08:00
poornas
9fc7537f2a
Enforce md5sum checks for object retention APIs (#9030)
this PR enforces md5sum verification for following
API's to be compatible with AWS S3 spec
 - PutObjectRetention
 - PutObjectLegalHold

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-04 07:04:12 -08:00
Klaus Post
f1b2462193
Add goroutine profiles (#9078)
Allow downloading goroutine dump to help detect leaks
or overuse of goroutines.

Extensions are now type dependent.

Change `profiling` -> `profile` prefix, since that is what they are 
not the abstract concept.
2020-03-04 06:58:12 -08:00
poornas
c93157019f
Allow gc to run in parallel on cache drives (#9051) 2020-03-03 06:42:26 +03:00
Harshavardhana
e3b44c3829
Remove partName, partETag requirement (#9044)
This is a precursor change before versioning,
removes/deprecates the requirement of remembering
partName and partETag which are not useful after
a multipart transaction has finished.

This PR reduces the overall size of the backend
JSON for large file uploads.
2020-03-03 03:29:30 +03:00
poornas
978bd4e2c4
check cacheControl not nil before access (#9055)
Fixes: #9053
2020-02-27 10:57:00 -08:00
poornas
5d25b10f72
Fix panic in StorageInfo call (#9050) 2020-02-26 15:29:50 -08:00
poornas
eac02c04f7
Fix sporadic failure in TestDiskCacheMaxUse (#9049) 2020-02-26 13:31:15 -08:00
Harshavardhana
1330e59307
accessKeyId missing should return appropriate error in AssumeRole (#9048)
For a non-existent user server would return STS not initialized
```
aws --profile harsha --endpoint-url http://localhost:9000 \
      sts assume-role \
      --role-arn arn:xxx:xxx:xxx:xxxx \
      --role-session-name anything
```

instead return an appropriate error as expected by STS API

Additionally also format the `trace` output for STS APIs
2020-02-26 12:26:47 -08:00
Harshavardhana
2dd14c0b89
print version with proper indentation (#9047)
currently version is printed as

> VERSION:
> DEVELOPMENT.2020-02-26T14-30-02Z

this is what we want

> VERSION:
>   DEVELOPMENT.2020-02-26T14-30-02Z
>
2020-02-26 23:09:08 +05:30
Harshavardhana
6f66f1a910
close channel upon error in Walk()'er (#9042) 2020-02-25 19:58:58 -08:00
Harshavardhana
23a8411732
Add a generic Walk()'er to list a bucket, optinally prefix (#9026)
This generic Walk() is used by likes of Lifecyle, or
KMS to rotate keys or any other functionality which
relies on this functionality.
2020-02-25 21:22:28 +05:30
Harshavardhana
ece0d4ac53
simplify recordAPIStats wrapper for ResponseWriters (#9034) 2020-02-24 09:45:32 -08:00
Harshavardhana
4c92bec619
allow rolling upgrades, remove same MinIO version requirement (#9033)
Upgrades between releases are failing due to strict
rule to avoid rolling upgrades, it is enough to
bump up APIs between versions to allow for quorum
failure and wait times. Authentication failures are
catastrophic in nature which leads to server not
be able to upgrade properly.

Fixes #9021
Fixes #8968
2020-02-24 10:32:30 +05:30
Harshavardhana
dcd63b4146
fix: avoid double ListBuckets() loading object lock (#9031) 2020-02-24 06:39:11 +05:30
poornas
224b4f13b8
Add cache eviction low and high watermarks (#8958)
To allow better control the cache eviction process.

Introduce MINIO_CACHE_WATERMARK_LOW and 
MINIO_CACHE_WATERMARK_HIGH env. variables to specify 
when to stop/start cache eviction process. 

Deprecate MINIO_CACHE_EXPIRY environment variable. Cache 
gc sweeps at 30 minute intervals whenever high watermark is
reached to clear least recently accessed entries in the cache
until sufficient space is cleared to reach the low watermark.

Garbage collection uses an adaptive file scoring approach based
on last access time, with greater weights assigned to larger
objects and those with more hits to find the candidates for eviction.

Thanks to @klauspost for this file scoring algorithm

Co-authored-by: Klaus Post <klauspost@minio.io>
2020-02-23 19:03:39 +05:30
Harshavardhana
51a9d1bdb7
Avoid unnecessary allocations for XML parsing (#9017) 2020-02-23 09:06:46 +05:30
Klaus Post
b2db1e96e2
Remove crawler concurrency (#9023)
Only have one crawler per disk. Removes locking, but keep
fastwalk itself able to run concurrently.
2020-02-21 20:50:16 +05:30
Harshavardhana
ab7d3cd508
fix: Speed up multi-object delete by taking bulk locks (#8974)
Change distributed locking to allow taking bulk locks
across objects, reduces usually 1000 calls to 1.

Also allows for situations where multiple clients sends
delete requests to objects with following names

```
{1,2,3,4,5}
```

```
{5,4,3,2,1}
```

will block and ensure that we do not fail the request
on each other.
2020-02-21 11:29:57 +05:30
Anis Elleuch
d4dcf1d722
metrics: Use StorageInfo() instead to have consistent info (#9006)
Metrics used to have its own code to calculate offline disks.
StorageInfo() was avoided because it is an expensive operation
by sending calls to all nodes.

To make metrics & server info share the same code, a new
argument `local` is added to StorageInfo() so it will only
query local disks when needed.

Metrics now calls StorageInfo() as server info handler does
but with the local flag set to false.

Co-authored-by: Praveen raj Mani <praveen@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2020-02-20 09:21:33 +05:30
poornas
02a59a04d1
Fix error messages returned by (Put)GetObjectLegalHold (#9013)
fiixing some minor discrepancies between aws s3 responses
vs minio server
2020-02-19 08:15:48 +05:30
Harshavardhana
16a6e68d7b
fix: indicate PutBucketEncryption as a valid policy action (#9009) 2020-02-18 10:32:53 -08:00
Praveen raj Mani
1b427ddb69
Support for Kafka version in the config (#9001)
Add a field for the Kafka version in the config. The user can explicitly 
set the version of the Kafka cluster.

Fixes #8768
2020-02-17 07:56:33 +05:30
Harshavardhana
712e82344c
acl: Support PUT calls with success for 'private' ACL's (#9000)
Add dummy calls which respond success when ACL's
are set to be private and fails, if user tries
to change them from their default 'private'

Some applications such as nuxeo may have an
unnecessary requirement for this operation,
we support this anyways such that don't have
to fully implement the functionality just that
we can respond with success for default ACLs
2020-02-16 11:37:52 +05:30