- Implement a new xl.json 2.0.0 format to support,
this moves the entire marshaling logic to POSIX
layer, top layer always consumes a common FileInfo
construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes#2111
At a customer setup with lots of concurrent calls
it can be observed that in newRetryTimer there
were lots of tiny alloations which are not
relinquished upon retries, in this codepath
we were only interested in re-using the timer
and use it wisely for each locker.
```
(pprof) top
Showing nodes accounting for 8.68TB, 97.02% of 8.95TB total
Dropped 1198 nodes (cum <= 0.04TB)
Showing top 10 nodes out of 79
flat flat% sum% cum cum%
5.95TB 66.50% 66.50% 5.95TB 66.50% time.NewTimer
1.16TB 13.02% 79.51% 1.16TB 13.02% github.com/ncw/directio.AlignedBlock
0.67TB 7.53% 87.04% 0.70TB 7.78% github.com/minio/minio/cmd.xlObjects.putObject
0.21TB 2.36% 89.40% 0.21TB 2.36% github.com/minio/minio/cmd.(*posix).Walk
0.19TB 2.08% 91.49% 0.27TB 2.99% os.statNolog
0.14TB 1.59% 93.08% 0.14TB 1.60% os.(*File).readdirnames
0.10TB 1.09% 94.17% 0.11TB 1.25% github.com/minio/minio/cmd.readDirN
0.10TB 1.07% 95.23% 0.10TB 1.07% syscall.ByteSliceFromString
0.09TB 1.03% 96.27% 0.09TB 1.03% strings.(*Builder).grow
0.07TB 0.75% 97.02% 0.07TB 0.75% path.(*lazybuf).append
```
some clients such as veeam expect the x-amz-meta to
be sent in lower cased form, while this does indeed
defeats the HTTP protocol contract it is harder to
change these applications, while these applications
get fixed appropriately in future.
x-amz-meta is usually sent in lowercased form
by AWS S3 and some applications like veeam
incorrectly end up relying on the case sensitivity
of the HTTP headers.
Bonus fixes
- Fix the iso8601 time format to keep it same as
AWS S3 response
- Increase maxObjectList to 50,000 and use
maxDeleteList as 10,000 whenever multi-object
deletes are needed.
There is a disparency of behavior under Linux & Windows about
the returned error when trying to rename a non existant path.
err := os.Rename("/path/does/not/exist", "/tmp/copy")
Linux:
isSysErrNotDir(err) = false
os.IsNotExist(err) = true
Windows:
isSysErrNotDir(err) = true
os.IsNotExist(err) = true
ENOTDIR in Linux is returned when the destination path
of the rename call contains a file in one of the middle
segments of the path (e.g. /tmp/file/dst, where /tmp/file
is an actual file not a directory)
However, as shown above, Windows has more scenarios when
it returns ENOTDIR. For example, when the source path contains
an inexistant directory in its path.
In that case, we want errFileNotFound returned and not
errFileAccessDenied, so this commit will add a further check to close
the disparency between Windows & Linux.
Add dummy calls which respond success when ACL's
are set to be private and fails, if user tries
to change them from their default 'private'
Some applications such as nuxeo may have an
unnecessary requirement for this operation,
we support this anyways such that don't have
to fully implement the functionality just that
we can respond with success for default ACLs
This allows for canonicalization of the strings
throughout our code and provides a common space
for all these constants to reside.
This list is rather non-exhaustive but captures
all the headers used in AWS S3 API operations
The side affect of this change memory
increase, but this is a trade-off between
performance and actual memory usage.
For all practical scenarios this should be
an adequate change.
This PR is the first set of changes to move the config
to the backend, the changes use the existing `config.json`
allows it to be migrated such that we can save it in on
backend disks.
In future releases, we will slowly migrate out of the
current architecture.
Fixes#6182
- remove old bucket policy handling
- add new policy handling
- add new policy handling unit tests
This patch brings support to bucket policy to have more control not
limiting to anonymous. Bucket owner controls to allow/deny any rest
API.
For example server side encryption can be controlled by allowing
PUT/GET objects with encryptions including bucket owner.
This PR implements an object layer which
combines input erasure sets of XL layers
into a unified namespace.
This object layer extends the existing
erasure coded implementation, it is assumed
in this design that providing > 16 disks is
a static configuration as well i.e if you started
the setup with 32 disks with 4 sets 8 disks per
pack then you would need to provide 4 sets always.
Some design details and restrictions:
- Objects are distributed using consistent ordering
to a unique erasure coded layer.
- Each pack has its own dsync so locks are synchronized
properly at pack (erasure layer).
- Each pack still has a maximum of 16 disks
requirement, you can start with multiple
such sets statically.
- Static sets set of disks and cannot be
changed, there is no elastic expansion allowed.
- Static sets set of disks and cannot be
changed, there is no elastic removal allowed.
- ListObjects() across sets can be noticeably
slower since List happens on all servers,
and is merged at this sets layer.
Fixes#5465Fixes#5464Fixes#5461Fixes#5460Fixes#5459Fixes#5458Fixes#5460Fixes#5488Fixes#5489Fixes#5497Fixes#5496
Previously ListenBucketNotificationHandler could deadlock with
PutObjectHandler's eventNotify call when a client closes its
connection. This change removes the cyclic dependency between the
channel and map of ARN to channels by using a separate done channel to
signal that the client has quit.
This change brings public data-types such that
we can ask projects to implement gateway projects
externally than maintaining in our repo.
All publicly exported structs are maintained in object-api-datatypes.go
completePart --> CompletePart
uploadMetadata --> MultipartInfo
All other exported errors are at object-api-errors.go
Every so often we get requirements for creating
directories/prefixes and we end up rejecting
such requirements. This PR implements this and
allows empty directories without any new file
addition to backend.
Existing lower APIs themselves are leveraged to provide
this behavior. Only FS backend supports this for
the time being as desired.
This PR addresses a long standing dependency on
`gopkg.in/check.v1` project used for our tests.
All tests are re-written to use the go default
testing framework instead.
There was no reason for us to use an external
package where Go tools are sufficient for this.
This is done to avoid repeated declaration of not-implemented
functions for each gateway. It also avoids a possible bug in go
https://github.com/golang/go/issues/18468 which is triggered on
our multiple PRs already.
Current code allowed it wrongly to generate secret key upto 100
we should only use 100 as a value to validate but for generating
it should be 40.
Fixes#4470
This PR also does backend format change to 1.0.1
from 1.0.0. Backward compatible changes are still
kept to read the 'md5Sum' key. But all new objects
will be stored with the same details under 'etag'.
Fixes#4312
- Due to usage of amazon SDK, spark expects md5sum of empty string to be
returned when it does PUT on a directory.
- The fix returns md5sum of a empty string for the above mentioned case.
- This fixes the issue of Apache Spark not being able to write into Minio.