This is to utilize an optimized version of
sha256 checksum which @fwessels implemented.
blake2b lacks such optimizations on ARM platform,
this can provide us significant boost in performance.
blake2b on ARM64 as expected would be slower.
```
BenchmarkSize1K-4 30000 44015 ns/op 23.26 MB/s
BenchmarkSize8K-4 5000 335448 ns/op 24.42 MB/s
BenchmarkSize32K-4 1000 1333960 ns/op 24.56 MB/s
BenchmarkSize128K-4 300 5328286 ns/op 24.60 MB/s
```
sha256 on ARM64 is faster by orders of magnitude giving close to
AVX performance of blake2b.
```
BenchmarkHash8Bytes-4 1000000 1446 ns/op 5.53 MB/s
BenchmarkHash1K-4 500000 3229 ns/op 317.12 MB/s
BenchmarkHash8K-4 100000 14430 ns/op 567.69 MB/s
BenchmarkHash1M-4 1000 1640126 ns/op 639.33 MB/s
```
ObjectLayer GetObject() now returns the entire object
if starting offset is 0 and length is negative. This
also allows to simplify handler layer code where
we always had to use GetObjectInfo() before proceeding
to read bucket metadata files examples `policy.json`.
This also reduces one additional call overhead.
success_action_redirect in the sent Form means that the server needs to return 303 in addition to a well specific redirection url, this commit adds this feature
This is important in a distributed setup, where the server hosting the
first disk formats a fresh setup. Sorting ensures that all servers
arrive at the same 'first' server.
Note: This change doesn't protect against different disk arguments
with some disks being same across servers.
Previously, more than one goroutine calls RPCClient.dial(), each
goroutine gets a new rpc.Client but only one such client is stored
into RPCClient object. This leads to leaky connection at the server
side. This is fixed by taking lock at top of dial() and release on
return.
There was an error in how we validated disk formats,
if one of the disk was formatted and was formatted with
FS would cause confusion and object layer would never
initialize essentially go into an infinite loop.
Validate pre-emptively and also check for FS format
properly.
This is implemented so that the issues like in the
following flow don't affect the behavior of operation.
```
GetObjectInfo()
.... --> Time window for mutation (no lock held)
.... --> Time window for mutation (no lock held)
GetObject()
```
This happens when two simultaneous uploads are made
to the same object the object has returned wrong
info to the client.
Another classic example is "CopyObject" API itself
which reads from a source object and copies to
destination object.
Fixes#3370Fixes#2912
FS/Multipart: Fix race between PutObjectPart and Complete/Abort multipart. close(timeoutCh) on complete/abort so that a racing PutObjectPart does not leave a dangling go-routine.
Fixes#3351
This change brings in changes at multiple places
- Reuse buffers at almost all locations ranging
from rpc, fs, xl, checksum etc.
- Change caching behavior to disable itself
under low memory conditions i.e < 8GB of RAM.
- Only objects cached are of size 1/10th the size
of the cache for example if 4GB is the cache size
the maximum object size which will be cached
is going to be 400MB. This change is an
optimization to cache more objects rather
than few larger objects.
- If object cache is enabled default GC
percent has been reduced to 20% in lieu
with newly found behavior of GC. If the cache
utilization reaches 75% of the maximum value
GC percent is reduced to 10% to make GC
more aggressive.
- Do not use *bytes.Buffer* due to its growth
requirements. For every allocation *bytes.Buffer*
allocates an additional buffer for its internal
purposes. This is undesirable for us, so
implemented a new cappedWriter which is capped to a
desired size, beyond this all writes rejected.
Possible fix for #3403.
- This is to ensure that the any new config references made to the
serverConfig is also backed by a mutex lock.
- Otherwise any new config assigment will also replace the member mutex
which is currently used for safe access.
EOF err message in Peek Protocol is shown when a client closes the
connection in the middle of peek protocol, this commit hides it since it
doesn't make sense to show it
Current code always appends to a file only if 1byte or
more was sent on the wire was affecting both PutObject
and PutObjectPart uploads.
This patch fixes such a situation and resolves#3385
backgroundAppend type's abort method should wait for appendParts to finish
writing ongoing appending of parts in the background before cleaning up
the part files.
Previously minio server expects content-length-range values as integer
in JSON. However Amazon S3 handles content-length-range values as
integer and strings.
This patch adds support for string values.
Since we moved out reconnection logic from net-rpc-client.go
we should do it from the top-layer properly and bring back
the code to reconnect properly in-case the connection is lost.
setGlobalsFromContext() is added to set global variables after parsing
command line arguments. Thus, global flags will be honored wherever
they are placed in minio command.
Make sure all S3 signature requests are not re-directed
to `/minio`. This should be only done for JWT and some
Anonymous requests.
This also fixes a bug found from https://github.com/bji/libs3
```
$ s3 -u list
ERROR: XmlParseFailure
```
Now after this fix shows proper output
```
$ s3 -u list
Bucket Created
-------------------------------------------------------- --------------------
andoria 2016-11-27T08:19:06Z
```
It would make sense to enable logger just after config initialisation.
That way, errorIf() and fatalIf() will be usable and can catch error
like invalid access and key errors.
This patch brings in changes from miniobrowser repo.
- Bucket policy UI and functionality fixes by @krishnasrinivas
- Bucket policy implementation by @balamurugana
- UI changes and new functionality changing password etc. @rushenn
- UI and new functionality for sharing URLs, deleting files
@rushenn and @krishnasrinivas.
- Other misc fixes by @vadmeste @brendanashworth
This is needed to validate if the `format.json` indeed exists
when a fresh node is brought online.
This wrapped implementation also connects to the remote node
by attempting a re-login. Subsequently after a successful
connect `format.json` is validated as well.
Fixes#3207
logurs is not helping us to set different log formats (json/text) to
different loggers. Now, we create different logurs instances and call
them in errorIf and fatalIf
This change adds more richer error response
for JSON-RPC by interpreting object layer
errors to corresponding meaningful errors
for the web browser.
```go
&json2.Error{
Message: "Bucket Name Invalid, Only lowercase letters, full stops, and numbers are allowed.",
}
```
Additionally this patch also allows PresignedGetObject()
to take expiry parameter to have variable expiry.
XL multipart fails to remove tmp files when an error occurs during upload, this case covers the scenario where an upload is canceled manually by the client in the middle of job.
content-length-range policy in postPolicy API was
not working properly handle it. The reflection
strategy used has changed in recent version of Go.
Any free form interface{} of any integer is treated
as `float64` this caused a bug where content-length-range
parsing failed to provide any value.
Fixes#3295
This is needed as explained by @krisis
Lets say we have following errors.
```
[]error{nil, errFileNotFound, errDiskAccessDenied, errDiskAccesDenied}
```
Since the last two errors are filtered, the maximum is nil,
depending on map order.
Let's say we get nil from reduceErr. Clearly at this point
we don't have quorum nodes agreeing about the data and since
GetObject only requires N/2 (Read quorum) and isDiskQuorum
would have returned true. This is problematic and can lead to
undersiable consequences.
Fixes#3298
Disks when are offline for a long period of time, we should
ignore the disk after trying Login upto 5 times.
This is to reduce the network chattiness, this also reduces
the overall time spent on `net.Dial`.
Fixes#3286
For binary releases and operating systems it would be
All operating systems.
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Minio is 25 days 12 hours 30 minutes old ┃
┃ Update: https://dl.minio.io/server/minio/release/linux-amd64/minio ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
```
On docker.
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Minio is 25 days 12 hours 32 minutes old ┃
┃ Update: docker pull minio/minio ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
```
In a situation when we have lots of buckets the bootup time
might have slowed down a bit but during this situation the
servers quickly going up and down would be an in-transit state.
Certain calls which do not use quorum like `readXLMetaStat`
might return an error saying `errDiskNotFound` this is returned
in place of expected `errFileNotFound` which leads to an issue
where server doesn't start.
To avoid this situation we need to ignore them as safe values
to be ignored, for the most part these are network related errors.
Fixes#3275
Also fix test to not use a bucket name with a leading slash - this
causes the bucket name to become empty and go to an unintended API
call (listbuckets).
This patch fixes a possible bug, reproduced rarely only seen
once.
```
panic: runtime error: index out of range
goroutine 136 [running]:
panic(0xac1a40, 0xc4200120b0)
/usr/local/go/src/runtime/panic.go:500 +0x1a1
github.com/minio/minio/vendor/github.com/minio/dsync.lock.func1(0xc4203d2240, 0x4, 0xc420474080, 0x4, 0x4, 0xc4202abb60, 0x0, 0xa86d01, 0xefcfc0, 0xc420417a80)
/go/src/github.com/minio/minio/vendor/github.com/minio/dsync/drwmutex.go:170 +0x69b
created by github.com/minio/minio/vendor/github.com/minio/dsync.lock
/go/src/github.com/minio/minio/vendor/github.com/minio/dsync/drwmutex.go:191 +0xf4
```
Ref #3229
After review with @abperiasamy we decided to remove all the unnecessary options
- MINIO_BROWSER (Implemented as a security feature but now deemed obsolete
since even if blocking access to MINIO_BROWSER, s3 API port is open)
- MINIO_CACHE_EXPIRY (Defaults to 72h)
- MINIO_MAXCONN (No one used this option and we don't test this)
- MINIO_ENABLE_FSMETA (Enable FSMETA all the time)
Remove --ignore-disks option - this option was implemented when XL layer
would initialize the backend disks and heal them automatically to disallow
XL accidentally using the root partition itself this option was introduced.
This behavior has been changed XL no longer automatically initializes
`format.json` a HEAL is controlled activity, so ignore-disks is not
useful anymore. This change also addresses the problems of our documentation
going forward and keeps things simple. This patch brings in reduction of
options and defaulting them to a valid known inputs. This patch also
serves as a guideline of limiting many ways to do the same thing.
rpcClient should attempt a reconnect if the call fails
with 'rpc.ErrShutdown' this is needed since at times when
the servers are taken down and brought back up.
The hijacked connection from net.Dial is usually closed.
So upon first attempt rpcClient might falsely indicate that
disk to be down, to avoid this state make another dial attempt
to really fail.
Fixes#3206Fixes#3205
- abstract out instrumentation information.
- use separate lockInstance type that encapsulates the nsMutex, volume,
path and opsID as the frontend or top-level lock object.
This is done by not making the methods of the BucketMetaState interface
as methods (via type nesting) on the type implementing
RPCs (s3PeerAPIHandlers).
- Adds an interface to update in-memory bucket metadata state called
BucketMetaState - this interface has functions to:
- update bucket notification configuration,
- bucket listener configuration,
- bucket policy configuration, and
- send bucket event
- This interface is implemented by `localBMS` a type for manipulating
local node in-memory bucket metadata, and by `remoteBMS` a type for
manipulating remote node in-memory bucket metadata.
- The remote node interface, makes an RPC call, but the local node
interface does not - it updates in-memory bucket state directly.
- Rename mkPeersFromEndpoints to makeS3Peers and refactored it.
- Use arrayslice instead of map in s3Peers struct
- `s3Peers.SendUpdate` now receives an arrayslice of peer indexes to
send the request to, with a special nil value slice indicating that
all peers should be sent the update.
- `s3Peers.SendUpdate` now returns an arrayslice of errors, representing
errors from peers when sending an update. The array positions
correspond to peer array s3Peers.peers
Improve globalS3Peers:
- Make isDistXL a global `globalIsDistXL` and remove from s3Peers
- Make globalS3Peers an array of (address, bucket-meta-state) pairs.
- Fix code and tests.
Default golang net.Listen only listens on the first IP when
host resolves to multiple IPs.
This change addresses a problem for example your ``/etc/hosts``
has entries as following
```
127.0.1.1 minio1
192.168.1.10 minio1
```
Trying to start minio as
```
minio server --address "minio1:9001" ~/Photos
```
Causes the minio server to be bound only to "127.0.1.1" which
is an incorrect behavior since we are generally interested in
`192.168.1.10` as well.
This patch addresses this issue if the hostname is resolvable
and gives back list of addresses associated with that hostname
we just bind on all of them as it is the expected behavior.
- The benchmark initialization function was not taking into account the
instance type (FS/XL), was using XL ObjectLayer even for FS
benchmarks.
- This was leading to incorrect benchmark results for FS related
benchmarks.
- The fix takes into account the instance type (FS/XL) and correctly
returns FS backend for FS benchmarks.
Do not attempt to fetch volume/drive information for
each i/o situation. In our case we do this in all calls
`posix.go` this in-turn created a terrible situation for
windows. This issue does not affect the i/o path on Unix
platforms since statvfs calls are in the range of micro
seconds on these platforms.
This verification is only needed during startup and we
let things fail at a later stage on windows.
- Reads and writes of uploads.json in XL now uses quorum for
newMultipart, completeMultipart and abortMultipart operations.
- Each disk's `uploads.json` file is read and updated independently for
adding or removing an upload id from the file. Quorum is used to
decide if the high-level operation actually succeeded.
- Refactor FS code to simplify the flow, and fix a bug while reading
uploads.json.
For command line arguments we are currently following
- <node-1>:/path ... <node-n>:/path
This patch changes this to
- http://<node-1>/path ... http://<node-n>/path
In FS or single-node XL mode, there is no need to save listener
configuration to persistent storage. As there is only one server, if it
is restarted, any connected listenBucketAPI clients were disconnected
and will have to reconnect - so there is nothing to actually store.
This incidentally solves #3052 by avoiding the problem.
- When modifying notification configuration
- When modifying listener configuration
- When modifying policy configuration
With this change we also stop early checking if the bucket exists, since
that uses a Read-lock and causes a deadlock due to the outer Write-lock.
opsID, a variable on the stack, changes over the course of
Completemultipartupload function in xl-v1-multipart.go. This was
being used in a function closure which was passed to defer
statement. The variables used in the closure depend on their values at
the time of evaluation which is indeterminate behaviour. It is
incorrect to depend on values of variables on stack at the end of
function, when deferred functions are executed.
Added clear subcommand for control lock with following options:
```
3. Clear lock named 'bucket/object' (exact match).
$ minio control lock clear http://localhost:9000/bucket/object
4. Clear all locks with names that start with 'bucket/prefix' (wildcard match).
$ minio control lock --recursive clear http://localhost:9000/bucket/prefix
5. Clear all locks older than 10minutes.
$ minio control lock --older-than=10m clear http://localhost:9000/
6. Clear all locks with names that start with 'bucket/a' and that are older than 1hour.
$ minio control lock --recursive --older-than=1h clear http://localhost:9000/bucket/a
```
This makes sure that when SSL is enabled (for FS/single node mode),
the server address is picked up from the --address option (that needs
to include the hostname for SSL verification, and has to be input
appropriately by user), instead of just using ":<port>".
In a distributed setup that the server should not perform any operation
on the storage layer after it is exported via RPC. e.g, cleaning up of
temporary directories under .minio.sys/tmp may interfere with ongoing
PUT objects being served by the distributed setup.
In a multipart upload scenario disks going down and coming backup
can lead to certain parts missing on the disk/server which was
going down. This is a valid case since these blocks can be
missing and should be healed through heal operation. But we are
not supposed to fail prematurely since we have enough data on
the other disks as well within read-quorum.
This fix relaxes previous assumption, fixes a major corruption
issue reproduced by @vadmeste.
Fixes#2976
Don't close socket while re-initializing notify-listeners, as the rpc
client object is shared between notify-listeners and peer clients.
Also, improves SendRPC() readability by using GetPeerClient().
Fixes a serialisation bug - encoding/gob does not directly support
serializing `map[string]interface{}`, so we serialise to JSON and send a
byte array in the RPC call, and deserialize and update on the receiver.
* Implements a Peer RPC router that sends info to all Minio servers in the cluster.
* Bucket notifications are propagated to all nodes via this RPC router.
* Bucket listener configuration is persisted to separate object layer
file (`listener.json`) and peer RPCs are used to communicate changes
throughout the cluster.
* When events are generated, RPC calls to send them to other servers
where bucket listeners may be connected is implemented.
* Some bucket notification tests are now disabled as they cannot work in
the new design.
* Minor fix in `funcFromPC` to use `path.Join`
* Add test coverage for removeEntry and removeEntryIfExists
* Initial test framework for Lock/Unlock functionality
* Add clarification comments
* Add test coverage code for RLock() and RUnlock()
* Add test coverage for Expired() function
* Have all lock-rpc-server test functions start with the same prefix
* Properly initialize JWT security token
* Refactor streaming signatureV4 w/ state machine
- Used state machine to make transitions between reading chunk header,
chunk data and trailer explicit.
* debug: add print/panic statements to gather more info on CI failure
* Persist lastChunk status between Read() on ChunkReader
... remove panic() which was added as interim aid for debugging.
* Add unit-tests to cover v4 streaming signature
- Cleaning up of ListMultipartUpload API test for improving readability,
code maintainance and extensibility.
- Moving ListMultipartUploads to Go 1.7 sub tests.
- Using the new Anonymous request helper function for
ListMultipartUploads.
- Add helper function for API handler anonymous request tests.
- Add PutObject Part Anonymous request case using the new helper
function to validate its functionality.
- Servers do not exit for invalid credentials instead they print and wait.
- Servers do not exit for version mismatch instead they print and wait.
- Servers do not exit for time differences between nodes they print and wait.
- Clean up PutObjectPart and ListObjectPart API handler tests.
- Add more comments, make the tests more readable.
- Add verification for HTTP response status code.
- Initialize the test using object Layer.
- Move to Go 1.7 sub tests.
XSD - xml schema definition for SOAP operations
on S3 provides positional restrictions on XML
output.
Fix the response by re-arranging the positions in
accordance with S3 behavior.
Fixes#2849
These messages based on our prep stage during XL
and prints more informative message regarding
drive information.
This change also does a much needed refactoring.
* The user is required to specify a table name and database connection
information in the configuration file.
* INSERTs and DELETEs are done via prepared statements for speed.
* Assumes a table structure, and requires PostgreSQL 9.5 or above due to
the use of UPSERT.
* Creates the table if it does not exist with the given table name using
a query like:
CREATE TABLE myminio (
key varchar PRIMARY KEY,
value JSONB
);
* Vendors some required libraries.
* Add missing uploadID test
... make variables in test code unexported.
* Add ServerNotInitialized test for ListObjectPartsHandler
* Add tests for ListObjectParts with signatureV2 and Anonymous requests
* Add failure test cases for ListObjectParts
* Return negative values of Total and Free in StorageInfo() when we fail to get disk info
* Return consistent messages in web handlers when the server is not initialized
* api/complete-multipart: tests and simplification.
- Removing the logic of sending white space characters.
- Fix for incorrect HTTP response status for certain cases.
- Tests for New Multipart Upload and Complete Multipart Upload.
* tests: test for Delelete Object API handler
* Test code for controller-handler operations:
* Heal operations
* List operation
* Switch to "testing" lib, moving away from gocheck
* Minor refactors
* Remove extra call to initGracefulShutdown
* Remove dead code in mainControl:
Dead code found by the TestControlMain() test function that always
passes.
* Add tests for control-*-main.go
ElasticSearch and Redis are both treated like a database.
Each indexs are based on the object names uniquely indentifying
the event. Upon each delete event of the named object deletes
the index on elasticsearch and redis respectively.
- Using gjson for constructing xlMetaV1{} in realXLMeta.
- Test for parsing constructing xlMetaV1{} using gjson.
- Changes made since benchmarks showed 30-40% improvement in speed.
- Follow up comments in issue https://github.com/minio/minio/issues/2208
for more details.
- gjson parsing of parts from xl.json for listParts.
- gjson parsing of statInfo from xl.json for getObjectInfo.
- Vendorizing gjson dependency.
* Add unit-tests for formatting disks during initialization
- Fixed corresponding code at places where it was deviating from the
tabular spec.
* Added more test cases and simplified algo
... based on feedback from ``go test -coverprofile``.
- Fix distributed branch to be able to run FS version.
- Fix distributed branch to be able to run XL local disks.
- Ignore initialization failures of notification and bucket
policies, the codepath should load whatever is possible.
From the S3 layer after PutObject we were calling GetObjectInfo for bucket notification. This can
be avoided if PutObjectInfo returns ObjectInfo.
fixes#2567
Initialization when disk was down the network disk
reported an incorrect error rather than errDiskNotFound.
This resulted in incorrect error handling during
prepInitStorage() stage.
Fixes#2577
- Instrumentation for locks.
- Detailed test coverage.
- Adding RPC control handler to fetch lock instrumentation.
- RPC control handlers suite tests with a test RPC server.
This PR contains various fixes for the distributed release:
- Use DRWMutex in namespace-lock only for a single Lock()/RLock() call in conformance to server-side rw-locking as implemented in minio/dsync
- Implement missing cases in lock-rpc-server to catch Unlock() for active read locks and RUnlock() for an active write lock
- Refactor RPCClient to release local mutex while making actual RPC.Call()
Current code did not marshal/unmarshal buffers properly from
the server, The reason being buffers have to be allocated and
sent properly back to client to be consumable.
This change initializes rpc servers associated with disks that are
local. It makes object layer initialization on demand, namely on the
first request to the object layer.
Also adds lock RPC service vendorized minio/dsync
Fixes a deadlock reproduced while running s3verify during
RemoveObject(). Previously held lock by GetObject() inside
the go-routine was never relenquished.
- Fixes couple of error strings reported are mismatching.
- Fixes a error HTTP status which was wrong fixed.
- Remove usage of an deprecated PostResponse, au contraire
to their documentation there is no response body in
PostPolicy.
CBL client does not close connection when the backup process is stopped, this causes
read() on the stream on the server side to block and hence the lock held on the part
is not released. When the backup process is restarted, we again try to lock on the
part and this will block. Using a unique tmp name and not locking it fixes the problem.
Adding deadlines is a no go since Golang doesn't back off
the timers if there is an active i/o in progress.
It is meant to be for applications to handle this themselves
and manually progress the deadlines.
Fixes#2561
If the location was invalid, it would write an error response but then
continue to attempt to make the bucket. Whether or not it would succeed,
it would attempt to call response.WriteHeaders twice in a row, which
would cause a message to be logged to the server console (bad).
Here is the relevant Go code:
c80e0d374b/src/net/http/server.go (L878-L881)
Current master has a regression 'mc policy <policy-type> alias/bucket/prefix'
does not work anymore, due to the way new minio-go changes do json marshalling.
This led to a regression on server side when a ``prefix`` is provided
policy is rejected as malformed from th server which is not the case with
AWS S3.
This patch uses the new ``minio-go/pkg/set`` package to address the
unmarshalling problems.
Fixes#2503