When servers are started simultaneously across multiple
nodes or simulating a local setup, it can happen such
that one of the servers in setup reaches a following
situation where it observes
- Some servers are formatted
- Some servers are unformatted
- Some servers are offline
Current state machine doesn't handle this correctly, to fix
this situation where we have unformatted, formatted and
disks offline we do not decisively know the course of
action. So we wait for the offline disks to change their state.
Once the offline disks change their state to either one of these
states we can decisively move forward.
- nil (formatted disk)
- errUnformattedDisk
- Or any other error such as errCorruptedDisk.
Fixes#4903
The default timeout of 30secs is not enough for high latency
environments, change these values to use 15 minutes instead.
With 30secs I/O timeouts seem to be quite common, this leads
to pretty much most SDKs and clients reconnect. This in-turn
causes significant performance problems. On a low latency
interconnect this can be quite challenging to transfer large
amounts of data. Setting this value to 15minutes covers
pretty much all known cases.
This PR was tested with `wondershaper <NIC> 20000 20000` by
limiting the network bandwidth to 20Mbit/sec. Default timeout
caused a significant amount of I/O timeouts, leading to
constant retires from the client. This seems to be more common
with tools like rclone, restic which have high concurrency set
by default. Once the value was fixed to 15minutes i/o timeouts
stopped and client could steadily upload data to the server
even while saturating the network.
Fixes#4670
Previously if any multipart part size > 100MiB is uploaded, azure
gateway returns error.
This patch fixes the issue by creating sub parts sizing each 100MiB of
given multipart part. On complete multipart, it fetches all uploaded
azure block ids for each parts and performs completion.
Fixes#4868
- Region handling can now use region endpoints directly.
- All uploads are streaming no more large buffer needed.
- Major API overhaul for CopyObject(dst, src)
- Fixes bugs present in existing code for copying
- metadata replace directive CopyObject
- PutObjectPart doesn't require md5Sum and sha256
All `net/rpc` requests go to `/minio`, so the existing
generic handler for reserved bucket check would essentially
erroneously send errors leading to distributed setups to
wait infinitely.
For `net/rpc` requests alone we should skip this check and
allow resource bucket names to be from `/minio` .
Current code was just using io.ReadAll() on an fd()
which might have moved underneath due to a concurrent
read operation. Subsequent read will result in EOF
We should always seek back and read again. pread()
is allowed on all platforms use io.SectionReader to
read from the beginning of the file.
Fixes#4842
Bcrypt is not neccessary and not used properly. This change
replace the whole bcrypt hash computation through a constant time
compare and removes bcrypt from the code base.
Fixes#4813
If a TopicConfiguration element or CloudFunction element is found in
configuration submitted to PutBucketNotification API, an BadRequest
error is returned.
S3 only allows http headers with a size of 8 KB and user-defined metadata
with a size of 2 KB. This change adds a new API error and returns this
error to clients which sends to large http requests.
Fixes#4634
We don't need to typecast identifiers from
their base to type to same type again. This
is not a bug and compiler is fine to skip
it but it is better to avoid if not needed.
This change provides new implementations of the XL backend operations:
- create file
- read file
- heal file
Further this change adds table based tests for all three operations.
This affects also the bitrot algorithm integration. Algorithms are now
integrated in an idiomatic way (like crypto.Hash).
Fixes#4696Fixes#4649Fixes#4359
Wait for remote hosts to resolve instead of failing on first host
resolution error, when running in Kubernetes or Docker environment.
Note that
- Waiting is based on exponential back-off mechanism
- If run as a binary, server fails if remote host is not resolvable
This is needed because in orchestration platforms like Kubernetes, remote
hosts are started sequentially and all the hosts are not up initially,
though they are expected to come up in a short time frame
It is difficult to identify a cap on the waiting time due to
non-deterministic nature of infrastructure platforms, so the server waits
infinitely for the hosts to come up, while logging the error messages to
the console.
Fixes: https://github.com/minio/minio/issues/4669
Since go1.8 os.RemoveAll and os.MkdirAll both support long
path names i.e UNC path on windows. The code we are carrying
was directly borrowed from `pkg/os` package and doesn't need
to be in our repo anymore. As a side affect this also
addresses our codecoverage issue.
Refer #4658
* Prevent unnecessary verification of parity blocks while reading erasure
coded file.
* Update klauspost/reedsolomon and just only reconstruct data blocks while
reading (prevent unnecessary parity block reconstruction)
* Remove Verification of (all) reconstructed Data and Parity blocks since
in our case we are protected by bit rot protection. And even if the
verification would fail (essentially impossible) there is no way to
definitively say whether the data is still correct or not, so this call
make no sense for our use case.
Implement an offline mode for remote storage to cache the
offline status of a node in order to prevent network calls
that are bound to fail. After a time interval an attempt
will be made to restore the connection and mark the node
as online if successful.
Fixes#4183
It is possible at times due to a typo when distributed mode was intended
a user might end up starting standalone erasure mode causing confusion.
Add code to check this based on some standard heuristic guess work and
report an error to the user.
Fixes#4686
Under the call flow
```
Readdir
+
|
|
| path-entry
|
|
v
StatDir
```
Existing code was written in a manner where say
a bucket/top-level directory was indeed deleted
between Readdir() and before StatDir() we would
ignore certain errors. This is not a plausible
situation and might not happen in almost all
practical cases. We do not have to look for
or interpret these errors returned by StatDir()
instead we can just collect the successful
values and return back to the client. We do not
need to pre-maturely decide on bucket access
we just let filesystem decide subsequently for
real I/O operations.
Refer #4658
This is in preparation for updated admin heal API.
* Improve case analysis of healFormatXL() - fixes a case where disks
could have unhandled errors.
* Simplify healFormatXLFreshDisks() and healFormatXLCorruptedDisks()
to share more code and handle fewer cases for improved simplicity
and reduced code repetition.
* Fix test cases.
This commit changes posix's deleteFile() to not upstream errors from
removing parent directories. This fixes a race condition.
The race condition occurs when multiple deleteFile()s are called on the
same parent directory, but different child files. Because deleteFile()
recursively removes parent directories if they are empty, but
deleteFile() errors if the selected deletePath does not exist, there was
an opportunity for a race condition. The two processes would remove the
child directories successfully, then depend on the parent directory
still existing. In some cases this is an invalid assumption, because
other processes can remove the parent directory beforehand. This commit
changes deleteFile() to not upstream an error if one occurs, because the
only required error should be from the immediate deletePath, not from a
parent path.
In the specific bug report, multiple CompleteMultipartUpload requests
would launch multiple deleteFile() requests. Because they chain up on
parent directories, ultimately at the end, there would be multiple
remove files for the ultimate parent directory,
.minio.sys/multipart/{bucket}. Because only one will succeed and one
will fail, an error would be upstreamed saying that the file does not
exist, and the CompleteMultipartUpload code interpreted this as
NoSuchKey, or that the object/part id doesn't exist. This was faulty
behavior and is now fixed.
The added test fails before this change and passes after this change.
Fixes: https://github.com/minio/minio/issues/4727
This commit adds a new test for isDirEmpty (for code coverage) and
changes around the error conditional. Previously, there was a `return
nil` statement that would only be triggered under a race condition and
would trip up our test coverage for no real reason. With this new error
conditional, there's no awkward 'else'-esque condition, which means test
coverage will not change between runs for no reason in this specific
test. It's also a cleaner read.
This commit makes fsDeleteFile() simply call deleteFile() after calling
the relevant path length checking functions. This DRYs the code base.
This commit removes the Stat() call from deleteFile(). This improves
performance and removes any possibility of a race condition.
This additionally adds tests and a benchmark for said function. The
results aren't very consistent, although I'd expect this commit to make
it faster.
This commit fixes a potential security issue, whereby a full-access
token to the server would be available in the GET URL of a download
request. This fixes that issue by introducing short-expiry tokens, which
are only valid for one minute, and are regenerated for every download
request.
This commit specifically introduces the short-lived tokens, adds tests
for the tokens, adds an RPC call for generating a token given a
full-access token, updates the browser to use the new tokens for
requests where the token is passed as a GET parameter, and adds some
tests with the new temporary tokens.
Refs: https://github.com/minio/minio/pull/4673