Commit Graph

55 Commits

Author SHA1 Message Date
Harshavardhana
c2b7b82ef4 Verify bitrot in ReadFile correctly when verifier is set (#6563)
This is a major regression introduced in this commit

ce02ab613d is the first bad commit
commit ce02ab613d
Author: Krishna Srinivas <634494+krishnasrinivas@users.noreply.github.com>
Date:   Mon Aug 6 15:14:08 2018 -0700

    Simplify erasure code by separating bitrot from erasure code (#5959)

:040000 040000 794f58d82a d2201ebfc8 M	cmd

This effects all distributed server deployments since this commit

All the following releases are affected

- RELEASE.2018-09-25T21-34-43Z
- RELEASE.2018-09-12T18-49-56Z
- RELEASE.2018-09-11T01-39-21Z
- RELEASE.2018-09-01T00-38-25Z
- RELEASE.2018-08-25T01-56-38Z
- RELEASE.2018-08-21T00-37-20Z
- RELEASE.2018-08-18T03-49-57Z

Thanks to Anis for reproducing the issue
2018-10-04 10:16:38 -07:00
Pontus Leitzler
df60b3c733 Remove unnecessary contexts passed as data to FatalIf. No need to log an empty context. (#6487) 2018-09-21 16:04:11 -07:00
Anis Elleuch
7571582000 Print storage errors during distributed initialization (#6441)
This commit will print connection failures to other disks in other nodes
after 5 retries. It is useful for users to understand why the
distribued cluster fails to boot up.
2018-09-10 16:21:59 -07:00
Krishna Srinivas
81b7e5c7a8 Send length instead of empty buffer for ReadFile() (#6414) 2018-09-04 23:22:05 -07:00
Krishna Srinivas
ce02ab613d Simplify erasure code by separating bitrot from erasure code (#5959) 2018-08-06 15:14:08 -07:00
Harshavardhana
28d526bc68 Change CriticalIf to FatalIf for proper error message (#6040)
During startup until the object layer is initialized
logger is disabled to provide for a cleaner UI error
message. CriticalIf is disabled, use FatalIf instead.

Also never call os.Exit(1) on running servers where
you can return error to client in handlers.
2018-06-14 10:17:07 -07:00
Bala FA
6a53dd1701 Implement HTTP POST based RPC (#5840)
Added support for new RPC support using HTTP POST.  RPC's 
arguments and reply are Gob encoded and sent as HTTP 
request/response body.

This patch also removes Go RPC based implementation.
2018-06-06 14:21:56 +05:30
Harshavardhana
e6ec645035 Implement support for calculating disk usage per tenant (#5969)
Fixes #5961
2018-05-23 15:41:29 +05:30
Anis Elleuch
6d5f2a4391 Better support of empty directories (#5890)
Better support of HEAD and listing of zero sized objects with trailing
slash (a.k.a empty directory). For that, isLeafDir function is added
to indicate if the specified object is an empty directory or not. Each
backend (xl, fs) has the responsibility to store that information.
Currently, in both of XL & FS, an empty directory is represented by
an empty directory in the backend.

isLeafDir() checks if the given path is an empty directory or not,
since dir listing is costly if the latter contains too many objects,
readDirN() is added in this PR to list only N number of entries.
In isLeadDir(), we will only list one entry to check if a directory
is empty or not.
2018-05-09 01:38:21 -07:00
Harshavardhana
fb96779a8a Add large bucket support for erasure coded backend (#5160)
This PR implements an object layer which
combines input erasure sets of XL layers
into a unified namespace.

This object layer extends the existing
erasure coded implementation, it is assumed
in this design that providing > 16 disks is
a static configuration as well i.e if you started
the setup with 32 disks with 4 sets 8 disks per
pack then you would need to provide 4 sets always.

Some design details and restrictions:

- Objects are distributed using consistent ordering
  to a unique erasure coded layer.
- Each pack has its own dsync so locks are synchronized
  properly at pack (erasure layer).
- Each pack still has a maximum of 16 disks
  requirement, you can start with multiple
  such sets statically.
- Static sets set of disks and cannot be
  changed, there is no elastic expansion allowed.
- Static sets set of disks and cannot be
  changed, there is no elastic removal allowed.
- ListObjects() across sets can be noticeably
  slower since List happens on all servers,
  and is merged at this sets layer.

Fixes #5465
Fixes #5464
Fixes #5461
Fixes #5460
Fixes #5459
Fixes #5458
Fixes #5460
Fixes #5488
Fixes #5489
Fixes #5497
Fixes #5496
2018-02-15 17:45:57 -08:00
Harshavardhana
1164fc60f3 Bring semantic versioning to provide for rolling upgrades (#5495)
This PR brings semver capabilities in our RPC layer to
ensure that we can upgrade the servers in rolling fashion
while keeping I/O in progress. This is only a framework change
the functionality remains the same as such and we do not
have any special API changes for now. But in future when
we bring in API changes we will be able to upgrade servers
without a downtime.

Additional change in this PR is to not abort when serverVersions
mismatch in a distributed cluster, instead wait for the quorum
treat the situation as if the server is down. This allows
for administrator to properly upgrade all the servers in the cluster.

Fixes #5393
2018-02-06 15:07:17 -08:00
Krishna Srinivas
14e6c5ec08 Simplify the steps to make changes to config.json (#5186)
This change introduces following simplified steps to follow 
during config migration.

```
 // Steps to move from version N to version N+1
 // 1. Add new struct serverConfigVN+1 in config-versions.go
 // 2. Set configCurrentVersion to "N+1"
 // 3. Set serverConfigCurrent to serverConfigVN+1
 // 4. Add new migration function (ex. func migrateVNToVN+1()) in config-migrate.go
 // 5. Call migrateVNToVN+1() from migrateConfig() in config-migrate.go
 // 6. Make changes in config-current_test.go for any test change
```
2017-11-29 13:12:47 -08:00
Andreas Auernhammer
7e6b5bdbb7 remove ReadFileWithVerify from StorageAPI (#4947)
This change removes the ReadFileWithVerify function from the
StorageAPI. The ReadFile was basically a redirection to ReadFileWithVerify.
This change removes the redirection and moves the logic of
ReadFileWithVerify directly into ReadFile.
This removes a lot of unnecessary code in all StorageAPI implementations.

Fixes #4946

* review: fix doc and typos
2017-09-25 11:32:56 -07:00
Andreas Auernhammer
85fcee1919 erasure: simplify XL backend operations (#4649) (#4758)
This change provides new implementations of the XL backend operations:
 - create file
 - read   file
 - heal   file
Further this change adds table based tests for all three operations.

This affects also the bitrot algorithm integration. Algorithms are now
integrated in an idiomatic way (like crypto.Hash).
Fixes #4696
Fixes #4649
Fixes #4359
2017-08-14 18:08:42 -07:00
Frank Wessels
98b62cbec8 Implement an offline mode for a distributed node (#4646)
Implement an offline mode for remote storage to cache the
offline status of a node in order to prevent network calls
that are bound to fail. After a time interval an attempt
will be made to restore the connection and mark the node
as online if successful.

Fixes #4183
2017-08-11 11:49:35 -07:00
Harshavardhana
432bf7d99e Fail if formatting is wrong in our CI tests. (#4459)
We didn't fail before, we should helps in avoiding
formatting issues to creep into the codebase.
2017-06-02 14:05:51 -07:00
Frank Wessels
0f0758aece Load IO error count for posix atomically (#4448)
* Load error count atomically in order to check for maximum allowed number of IO errors.

* Remove unused (previously atomic) network IO error count
2017-05-31 09:22:53 -07:00
Aditya Manthramurthy
8975da4e84 Add new ReadFileWithVerify storage-layer API (#4349)
This is an enhancement to the XL/distributed-XL mode. FS mode is
unaffected.

The ReadFileWithVerify storage-layer call is similar to ReadFile with
the additional functionality of performing bit-rot checking. It
accepts additional parameters for a hashing algorithm to use and the
expected hex-encoded hash string.

This patch provides significant performance improvement because:

1. combines the step of reading the file (during
erasure-decoding/reconstruction) with bit-rot verification;

2. limits the number of file-reads; and

3. avoids transferring the file over the network for bit-rot
verification.

ReadFile API is implemented as ReadFileWithVerify with empty hashing
arguments.

Credits to AB and Harsha for the algorithmic improvement.

Fixes #4236.
2017-05-16 14:21:52 -07:00
Harshavardhana
f4dac979a2 server: Fix message when corrupted or unsupported format is found. (#4142)
Refer https://github.com/minio/minio/issues/4140

This is a fix to provide a little more elaborate message.
2017-04-18 10:35:17 -07:00
Bala FA
de204a0a52 Add extensive endpoints validation (#4019) 2017-04-11 15:44:27 -07:00
Harshavardhana
34d9a6b46a Make sure client initializes to proper lock RPC path. (#3763)
Fixes a regression introduced in previous commit.
2017-02-18 02:52:11 -08:00
Harshavardhana
50b4e54a75 fs: Do not return reservedBucket names in ListBuckets() (#3754)
Make sure to skip reserved bucket names in `ListBuckets()`
current code didn't skip this properly and also generalize
this behavior for both XL and FS.
2017-02-16 14:52:14 -08:00
Harshavardhana
fb39c7c26b sRPC/client: Properly trim storageRPCPath for actual disk path. (#3749)
Never print internal RPC endpoint paths.
2017-02-15 03:47:47 -08:00
Harshavardhana
dfc2ef3004 storage/rpc: Remove network error restriction. (#3591)
This restriction has lots of side affects, since
we do not have a mechanism to clear states like
this it is better not to keep them.

Network errors are common and can occur with
simple cable removal etc. Since we already have
a retry mechanism this error count and stateful
nature can bring problems on a long running
cluster.
2017-01-18 12:55:57 -08:00
Harshavardhana
08b6cfb082 ssl: Set a global boolean to enable SSL across Minio (#3558)
We have been using `isSSL()` everywhere we can set
a global value once and re-use it again.
2017-01-11 13:59:51 -08:00
Bala.FA
6d10f4c19a Adopt dsync interface changes and major cleanup on RPC server/client.
* Rename GenericArgs to AuthRPCArgs
* Rename GenericReply to AuthRPCReply
* Remove authConfig.loginMethod and add authConfig.ServiceName
* Rename loginServer to AuthRPCServer
* Rename RPCLoginArgs to LoginRPCArgs
* Rename RPCLoginReply to LoginRPCReply
* Version and RequestTime are added to LoginRPCArgs and verified by
  server side, not client side.
* Fix data race in lockMaintainence loop.
2017-01-02 20:57:42 +05:30
Harshavardhana
8562b22823 Fix delays and iterim fix for the partial fix in #3502 (#3511)
This patch uses a technique where in a retryable storage
before object layer initialization has a higher delay
and waits for longer period upto 4 times with time unit
of seconds.

And uses another set of configuration after the disks
have been formatted, i.e use a lower retry backoff rate
and retrying only once per 5 millisecond.

Network IO error count is reduced to a lower value i.e 256
before we reject the disk completely. This is done so that
combination of retry logic and total error count roughly
come to around 2.5secs which is when we basically take the
disk offline completely.

NOTE: This patch doesn't fix the issue of what if the disk
is completely dead and comes back again after the initialization.
Such a mutating state requires a change in our startup sequence
which will be done subsequently. This is an interim fix to alleviate
users from these issues.
2016-12-30 17:08:02 -08:00
Harshavardhana
dd68cdd802 Auto-reconnect for regular authRPC client. (#3506)
Implement a storage rpc specific rpc client,
which does not reconnect unnecessarily.

Instead reconnect is handled at a different
layer for storage alone.

Rest of the calls using AuthRPC automatically
reconnect, i.e upon an error equal to `rpc.ErrShutdown`
they dial again and call the requested method again.
2016-12-29 19:42:02 -08:00
Harshavardhana
41cf580bb1 Improve reconnection logic, allow jitters. (#3502)
Attempt a reconnect also if disk not found.

This is needed since any network operation error
is converted to disk not found but we also need
to make sure if disk is really not available. 

Additionally we also need to retry more than
once because the server might be in startup
sequence which would render other servers to
wrongly think that the server is offline.
2016-12-29 03:13:51 -08:00
Bala FA
e8ce3b64ed Generate and use access/secret keys properly (#3498) 2016-12-26 10:21:23 -08:00
Harshavardhana
b363709c11 caching: Optimize memory allocations. (#3405)
This change brings in changes at multiple places

 - Reuse buffers at almost all locations ranging
   from rpc, fs, xl, checksum etc.
 - Change caching behavior to disable itself
   under low memory conditions i.e < 8GB of RAM.
 - Only objects cached are of size 1/10th the size
   of the cache for example if 4GB is the cache size
   the maximum object size which will be cached
   is going to be 400MB. This change is an
   optimization to cache more objects rather
   than few larger objects.
 - If object cache is enabled default GC
   percent has been reduced to 20% in lieu
   with newly found behavior of GC. If the cache
   utilization reaches 75% of the maximum value
   GC percent is reduced to 10% to make GC
   more aggressive.
 - Do not use *bytes.Buffer* due to its growth
   requirements. For every allocation *bytes.Buffer*
   allocates an additional buffer for its internal
   purposes. This is undesirable for us, so
   implemented a new cappedWriter which is capped to a
   desired size, beyond this all writes rejected.

Possible fix for #3403.
2016-12-08 20:35:07 -08:00
Harshavardhana
6efee2072d objectLayer: Check for format.json in a wrapped disk. (#3311)
This is needed to validate if the `format.json` indeed exists
when a fresh node is brought online.

This wrapped implementation also connects to the remote node
by attempting a re-login. Subsequently after a successful
connect `format.json` is validated as well.

Fixes #3207
2016-11-23 15:48:10 -08:00
Harshavardhana
1c47365445 xl/bootup: Upon bootup handle errors loading bucket and event configs. (#3287)
In a situation when we have lots of buckets the bootup time
might have slowed down a bit but during this situation the
servers quickly going up and down would be an in-transit state.

Certain calls which do not use quorum like `readXLMetaStat`
might return an error saying `errDiskNotFound` this is returned
in place of expected `errFileNotFound` which leads to an issue
where server doesn't start.

To avoid this situation we need to ignore them as safe values
to be ignored, for the most part these are network related errors.

Fixes #3275
2016-11-19 17:37:57 -08:00
Anis Elleuch
a47ce7ab22 Add support of fallocate for FS and XL backends (#3032) 2016-10-29 12:44:44 -07:00
Harshavardhana
9e2d0ac50b Move to URL based syntax formatting. (#3092)
For command line arguments we are currently following

- <node-1>:/path ... <node-n>:/path

This patch changes this to

- http://<node-1>/path ... http://<node-n>/path
2016-10-27 03:30:52 -07:00
Krishna Srinivas
32c3a558e9 distributed-XL: Support to run one minio process per export even on the same machine. (#2999)
fixes #2983
2016-10-20 18:31:02 -07:00
Harshavardhana
f1bc9343a1 prep: Initialization should wait instead of exit the servers. (#2872)
- Servers do not exit for invalid credentials instead they print and wait.
- Servers do not exit for version mismatch instead they print and wait.
- Servers do not exit for time differences between nodes they print and wait.
2016-10-07 11:15:55 -07:00
Harshavardhana
64f37bbf5b rpc: Add RPC client tests. (#2858) 2016-10-06 02:30:54 -07:00
Harshavardhana
6494b77d41 server: Add more elaborate startup messages. (#2731)
These messages based on our prep stage during XL
and prints more informative message regarding
drive information.

This change also does a much needed refactoring.
2016-10-05 12:48:07 -07:00
Anis Elleuch
9fb1c89f81 Add TLS encryption capability to RPC clients (#2789) 2016-09-29 23:42:37 -07:00
Harshavardhana
ae64b7fac8 XL: Handle object layer initialization properly.
Initialization when disk was down the network disk
reported an incorrect error rather than errDiskNotFound.

This resulted in incorrect error handling during
prepInitStorage() stage.

Fixes #2577
2016-09-13 21:18:30 -07:00
Harshavardhana
339425fd52 server: Fetch StorageInfo() from underlying disks transparently. (#2549)
Fixes #2511
2016-09-13 21:18:30 -07:00
Harshavardhana
9605fde04d controller/auth: Implement JWT based authorization for controller. (#2544)
Fixes #2474
2016-09-13 21:18:30 -07:00
Harshavardhana
e1b0985b5b rpc: Refactor authentication and login changes. (#2543)
- Cache login requests.
- Converge validating distributed setup.
2016-09-13 21:18:30 -07:00
Bala FA
7922a54c9a rpc-client: remove unwanted nil check of rpcClient. (#2538) 2016-09-13 21:18:30 -07:00
Krishnan Parthasarathi
bda6bcd5be Layered rpc-client implementation (#2512) 2016-09-13 21:18:30 -07:00
Harshavardhana
7e3e24b394 rpc: client login should ignore server versions. 2016-09-13 21:18:30 -07:00
Harshavardhana
bb0466f4ce control: Fix controller CLI handling with distributed server object layer.
Object layer initialization is done lazily fix it.
2016-09-13 21:18:30 -07:00
awwalker
7c7eb1475d splitNetPath: Add support for windows paths including volumeNames e.g ip:C:\network\path 2016-09-13 21:18:30 -07:00
Krishnan Parthasarathi
804d91ef61 storage/rpc-client: Reconnect on network disconnect (#2436) 2016-09-13 21:18:30 -07:00