Commit Graph

41 Commits

Author SHA1 Message Date
poornas 0bb6247056 Move nslocking from s3 layer to object layer (#5382)
Fixes #5350
2018-01-13 10:04:52 +05:30
Harshavardhana c0721164be Automatically set goroutines based on shardSize (#5346)
Update reedsolomon library to enable feature to automatically
set number of go-routines based on the input shard size,
since shard size is sort of a constant in Minio for
objects > 10MiB (default blocksize)

klauspost reported around 15-20% improvement in performance
numbers on older systems such as AVX and SSE3

```
name                  old speed      new speed      delta
Encode10x2x10000-8    5.45GB/s ± 1%  6.22GB/s ± 1%  +14.20%    (p=0.000 n=9+9)
Encode100x20x10000-8  1.44GB/s ± 1%  1.64GB/s ± 1%  +13.77%  (p=0.000 n=10+10)
Encode17x3x1M-8       10.0GB/s ± 5%  12.0GB/s ± 1%  +19.88%  (p=0.000 n=10+10)
Encode10x4x16M-8      7.81GB/s ± 5%  8.56GB/s ± 5%   +9.58%   (p=0.000 n=10+9)
Encode5x2x1M-8        15.3GB/s ± 2%  19.6GB/s ± 2%  +28.57%   (p=0.000 n=9+10)
Encode10x2x1M-8       12.2GB/s ± 5%  15.0GB/s ± 5%  +22.45%  (p=0.000 n=10+10)
Encode10x4x1M-8       7.84GB/s ± 1%  9.03GB/s ± 1%  +15.19%    (p=0.000 n=9+9)
Encode50x20x1M-8      1.73GB/s ± 4%  2.09GB/s ± 4%  +20.59%   (p=0.000 n=10+9)
Encode17x3x16M-8      10.6GB/s ± 1%  11.7GB/s ± 4%  +10.12%   (p=0.000 n=8+10)
```
2018-01-03 13:47:22 -08:00
Nitish Tiwari 1a3dbbc9dd
Add x-amz-storage-class support (#5295)
This adds configurable data and parity options on a per object
basis. To use variable parity

- Users can set environment variables to cofigure variable
parity

- Then add header x-amz-storage-class to putobject requests
with relevant storage class values

Fixes #4997
2017-12-22 16:58:13 +05:30
Harshavardhana 8efa82126b
Convert errors tracer into a separate package (#5221) 2017-11-25 11:58:29 -08:00
Aditya Manthramurthy 4c9fae90ff Optimize healObject by eliminating extra data passes (#4949) 2017-09-28 15:57:19 -07:00
Frank Wessels 61e0b1454a Add support for timeouts for locks (#4377) 2017-08-31 14:43:59 -07:00
Andreas Auernhammer 85fcee1919 erasure: simplify XL backend operations (#4649) (#4758)
This change provides new implementations of the XL backend operations:
 - create file
 - read   file
 - heal   file
Further this change adds table based tests for all three operations.

This affects also the bitrot algorithm integration. Algorithms are now
integrated in an idiomatic way (like crypto.Hash).
Fixes #4696
Fixes #4649
Fixes #4359
2017-08-14 18:08:42 -07:00
Aditya Manthramurthy 32da1aa9d6 XL: Simplify heal-format operations
This is in preparation for updated admin heal API.

* Improve case analysis of healFormatXL() - fixes a case where disks
  could have unhandled errors.

* Simplify healFormatXLFreshDisks() and healFormatXLCorruptedDisks()
  to share more code and handle fewer cases for improved simplicity
  and reduced code repetition.

* Fix test cases.
2017-08-08 17:14:24 -07:00
Anis Elleuch af8071c86a xl: Fix rare freeze after many disk/network errors (#4438)
xl.storageDisks is sometimes passed to some low-level XL functions. Some disks in
xl.storageDisks are set to nil when they encounter some errors. This means all
elements in xl.storageDisks will be nil after some time which lead to an unusable XL.
2017-06-14 17:14:27 -07:00
Harshavardhana 075b8903d7 fs: Add safe locking semantics for `format.json` (#4523)
This patch also reverts previous changes which were
merged for migration to the newer disk format. We will
be bringing these changes in subsequent releases. But
we wish to add protection in this release such that
future release migrations are protected.

Revert "fs: Migration should handle bucketConfigs as regular objects. (#4482)"
This reverts commit 976870a391.

Revert "fs: Migrate object metadata to objects directory. (#4195)"
This reverts commit 76f4f20609.
2017-06-12 17:40:28 -07:00
Krishnan Parthasarathi ca64b86112 Return possible states a heal operation (#4045) 2017-04-14 10:28:35 -07:00
Krishnan Parthasarathi 2bd694dbc8 Add disksUnavailable healStatus const (#3990)
`disksUnavailable` healStatus constant indicates that a given object
needs healing but one or more of disks requiring heal are offline. This
can be used by admin heal API consumers to distinguish between a
successful heal and a no-op since the outdated disks were offline.
2017-03-31 17:55:15 -07:00
Krishnan Parthasarathi c27ece409b heal: Check if all parts are available and valid (#3967)
In the algorithm to check if an object requires healing, in addition to
checking if all disks have xl.json present we should check if all parts
of the object are present and have valid blake2b checksums.

Also fixed a minor compilation error in heal-objects-list.go.
2017-03-24 08:40:44 -07:00
Krishnan Parthasarathi c192e5c9b2 Implement heal-upload admin API (#3914)
This API is meant for administrative tools like mc-admin to heal an
ongoing multipart upload on a Minio server.  N B This set of admin
APIs apply only for Minio servers.

`github.com/minio/minio/pkg/madmin` provides a go SDK for this (and
other admin) operations.  Specifically,

  func HealUpload(bucket, object, uploadID string, dryRun bool) error

Sample admin API request:
POST
/?heal&bucket=mybucket&object=myobject&upload-id=myuploadID&dry-run
- Header(s): ["x-minio-operation"] = "upload"

Notes:
- bucket, object and upload-id are mandatory query parameters
- if dry-run is set, API returns success if all parameters passed are
  valid.
2017-03-17 09:25:49 -07:00
Harshavardhana e49efcb9d9 xl: quickHeal heal bucket only when needed. (#3854)
This improves the startup time significantly
for clusters which have lot of buckets.

Also fixes a bug where `.minio.sys` is created
on disks which do not have `format.json`
2017-03-06 02:00:15 -08:00
Krishnan Parthasarathi e3fd4c0dd6 XL: Make listOnlineDisks and outDatedDisks consistent w/ each other. (#3808) 2017-03-04 14:53:28 -08:00
Harshavardhana bcc5b6e1ef xl: Rename getOrderedDisks as shuffleDisks appropriately. (#3796)
This PR is for readability cleanup

- getOrderedDisks as shuffleDisks
- getOrderedPartsMetadata as shufflePartsMetadata

Distribution is now a second argument instead being the
primary input argument for brevity.

Also change the usage of type casted int64(0), instead
rely on direct type reference as `var variable int64` everywhere.
2017-02-24 09:20:40 -08:00
Harshavardhana 6a6c930f5b xl: Abort multipart upload should honor quorum properly. (#3670)
Current implementation didn't honor quorum properly and didn't
handle the errors generated properly. This patch addresses that
and also moves common code `cleanupMultipartUploads` into xl
specific private function.

Fixes #3665
2017-02-01 11:16:17 -08:00
Krishnan Parthasarathi 864b8795aa heal: Should delete stale object parts before healing (#3649) 2017-01-30 00:45:56 -08:00
Anis Elleuch 0715032598 heal: Add ListBucketsHeal object API (#3563)
ListBucketsHeal will list which buckets that need to be healed:
  * ListBucketsHeal() (buckets []BucketInfo, err error)
2017-01-19 09:34:18 -08:00
Krishnan Parthasarathi c194b9f5f1 Implement mgmt REST APIs for heal subcommands (#3533)
The heal APIs supported in this change are,
- listing of objects to be healed.
- healing a bucket.
- healing an object.
2017-01-17 10:02:58 -08:00
Harshavardhana 1c699d8d3f fs: Re-implement object layer to remember the fd (#3509)
This patch re-writes FS backend to support shared backend sharing locks for safe concurrent access across multiple servers.
2017-01-16 17:05:00 -08:00
Harshavardhana 2d6f8153fa format: Check properly for disks in valid formats. (#3427)
There was an error in how we validated disk formats,
if one of the disk was formatted and was formatted with
FS would cause confusion and object layer would never
initialize essentially go into an infinite loop.

Validate pre-emptively and also check for FS format
properly.
2016-12-11 15:18:55 -08:00
Harshavardhana 4daa0d2cee lock: Moving locking to handler layer. (#3381)
This is implemented so that the issues like in the
following flow don't affect the behavior of operation.

```
GetObjectInfo()
.... --> Time window for mutation (no lock held)
.... --> Time window for mutation (no lock held)
GetObject()
```

This happens when two simultaneous uploads are made
to the same object the object has returned wrong
info to the client.

Another classic example is "CopyObject" API itself
which reads from a source object and copies to
destination object.

Fixes #3370
Fixes #2912
2016-12-10 16:15:12 -08:00
Harshavardhana ff4ce0ee14 fs/xl: Combine input checks into re-usable functions. (#3383)
Repeated code around both object layers are moved
and combined into simple re-usable functions.
2016-12-01 23:15:17 -08:00
Bala FA 0f2e493c9a Use isErrIgnored() function wherever applicable. (#3343) 2016-11-23 20:05:04 -08:00
Bala FA 1d4ac4b084 Rename getUUID() into mustGetUUID() (#3320)
In case of UUID generation failure mustGetUUID() will panic than
infinitely trying in for loop.
2016-11-22 16:52:37 -08:00
Harshavardhana 5197649081 utils: reduceErrs returns and validates quorum errors. (#3300)
This is needed as explained by @krisis

Lets say we have following errors.

```
[]error{nil, errFileNotFound, errDiskAccessDenied, errDiskAccesDenied}
```

Since the last two errors are filtered, the maximum is nil,
depending on map order.

Let's say we get nil from reduceErr. Clearly at this point
we don't have quorum nodes agreeing about the data and since
GetObject only requires N/2 (Read quorum) and isDiskQuorum
would have returned true. This is problematic and can lead to
undersiable consequences.

Fixes #3298
2016-11-21 01:47:26 -08:00
Krishnan Parthasarathi eed9ab0464 XL: pickValidXLMeta should return error instead of panic'ing (#3277) 2016-11-20 20:56:44 -08:00
Harshavardhana 0b9f0d14a1 auth/rpc: Take remote disk offline after maximum allowed attempts. (#3288)
Disks when are offline for a long period of time, we should
ignore the disk after trying Login upto 5 times.

This is to reduce the network chattiness, this also reduces
the overall time spent on `net.Dial`.

Fixes #3286
2016-11-20 16:57:12 -08:00
Anis Elleuch ffbee70e04 Avoid removing 'tmp' directory inside '.minio.sys' (#3294) 2016-11-20 14:25:43 -08:00
Harshavardhana 1c47365445 xl/bootup: Upon bootup handle errors loading bucket and event configs. (#3287)
In a situation when we have lots of buckets the bootup time
might have slowed down a bit but during this situation the
servers quickly going up and down would be an in-transit state.

Certain calls which do not use quorum like `readXLMetaStat`
might return an error saying `errDiskNotFound` this is returned
in place of expected `errFileNotFound` which leads to an issue
where server doesn't start.

To avoid this situation we need to ignore them as safe values
to be ignored, for the most part these are network related errors.

Fixes #3275
2016-11-19 17:37:57 -08:00
Harshavardhana c91d3791f9 heal: Add healing support for bucket, bucket metadata files. (#3252)
This patch implements healing in general but it is only used
as part of quickHeal().

Fixes #3237
2016-11-16 16:42:23 -08:00
Aditya Manthramurthy dd0698d14c Improve namespace lock API: (#3203)
- abstract out instrumentation information.
- use separate lockInstance type that encapsulates the nsMutex, volume,
  path and opsID as the frontend or top-level lock object.
2016-11-09 10:58:41 -08:00
Harshavardhana 39331b6b4e xl: GetCheckSumInfo() shouldn't fail if hash not available. (#2984)
In a multipart upload scenario disks going down and coming backup
can lead to certain parts missing on the disk/server which was
going down. This is a valid case since these blocks can be
missing and should be healed through heal operation. But we are
not supposed to fail prematurely since we have enough data on
the other disks as well within read-quorum.

This fix relaxes previous assumption, fixes a major corruption
issue reproduced by @vadmeste.

Fixes #2976
2016-10-18 11:13:25 -07:00
Harshavardhana fee3f99a6e xl: heal bucket should validate if bucket exists first. (#2953)
Fixes #2944
2016-10-17 02:10:23 -07:00
Krishna Srinivas f5f007e183 Test: Add test case for xl.HealObject() (#2884)
fixes #2842
2016-10-08 17:08:17 -07:00
Harshavardhana 1e6d67b16d server: Remove deadcode. (#2699) 2016-09-14 13:43:08 -07:00
Krishna Srinivas 7cc77eba45 XL/Healing: errDiskNotFound is the only pardonable error in xlShouldHeal. (#2586)
This is so that we try to heal a file for all the "bad" cases except when the disk is down.
2016-09-13 21:18:30 -07:00
Anis Elleuch 200d327737 List only objects that need healing (#2546) 2016-09-13 21:18:30 -07:00
Harshavardhana bccf549463 server: Move all the top level files into cmd folder. (#2490)
This change brings a change which was done for the 'mc'
package to allow for clean repo and have a cleaner
github drop in experience.
2016-08-18 16:23:42 -07:00