Creating notification events for replica creation
is not particularly useful to send as the notification
event generated at source already includes replication
completion events.
For applications using replica cluster as failover, avoiding
duplicate notifications for replica event will allow seamless
failover.
during rolling upgrade, provide a more descriptive error
message and discourage rolling upgrade in such situations,
allowing users to take action.
additionally also rename `slashpath -> pathutil` to avoid
a slighly mis-pronounced usage of `path` package.
- using miniogo.ObjectInfo.UserMetadata is not correct
- using UserTags from Map->String() can change order
- ContentType comparison needs to be removed.
- Compare both lowercase and uppercase key names.
- do not silently error out constructing PutObjectOptions
if tag parsing fails
- avoid notification for empty object info, failed operations
should rely on valid objInfo for notification in all
situations
- optimize copyObject implementation, also introduce a new
replication event
- clone ObjectInfo() before scheduling for replication
- add additional headers for comparison
- remove strings.EqualFold comparison avoid unexpected bugs
- fix pool based proxying with multiple pools
- compare only specific metadata
Co-authored-by: Poorna Krishnamoorthy <poornas@users.noreply.github.com>
globalSubscribers.NumSubscribers() is heavily used in S3 requests and it
uses mutex, use atomic.Load instead since it is faster
Co-authored-by: Anis Elleuch <anis@min.io>
additionally also configure http2 healthcheck
values to quickly detect unstable connections
and let them timeout.
also use single transport for proxying requests
Design: https://gist.github.com/klauspost/025c09b48ed4a1293c917cecfabdf21c
Gist of improvements:
* Cross-server caching and listing will use the same data across servers and requests.
* Lists can be arbitrarily resumed at a constant speed.
* Metadata for all files scanned is stored for streaming retrieval.
* The existing bloom filters controlled by the crawler is used for validating caches.
* Concurrent requests for the same data (or parts of it) will not spawn additional walkers.
* Listing a subdirectory of an existing recursive cache will use the cache.
* All listing operations are fully streamable so the number of objects in a bucket no
longer dictates the amount of memory.
* Listings can be handled by any server within the cluster.
* Caches are cleaned up when out of date or superseded by a more recent one.
ListObjectsV1 requests are actually redirected to a specific node,
depending on the bucket name. The purpose of this behavior was
to optimize listing.
However, the current code sends a Bad Gateway error if the
target node is offline, which is a bad behavior because it means
that the list request will fail, although this is unnecessary since
we can still use the current node to list as well (the default behavior
without using proxying optimization)
Currently, you can see mint fails when there is one offline node, after
this PR, mint will always succeed.
When manual healing is triggered, one node in a cluster will
become the authority to heal. mc regularly sends new requests
to fetch the status of the ongoing healing process, but a load
balancer could land the healing request to a node that is not
doing the healing request.
This PR will redirect a request to the node based on the node
index found described as part of the client token. A similar
technique is also used to proxy ListObjectsV2 requests
by encoding this information in continuation-token
- Add changes to ensure remote disks are not
incorrectly taken online if their order has
changed or are incorrect disks.
- Bring changes to peer to detect disconnection
with separate Health handler, to avoid a
rather expensive call GetLocakDiskIDs()
- Follow up on the same changes for Lockers
as well
- Implement a new xl.json 2.0.0 format to support,
this moves the entire marshaling logic to POSIX
layer, top layer always consumes a common FileInfo
construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
for object placements.
Fixes#2111
some clients such as veeam expect the x-amz-meta to
be sent in lower cased form, while this does indeed
defeats the HTTP protocol contract it is harder to
change these applications, while these applications
get fixed appropriately in future.
x-amz-meta is usually sent in lowercased form
by AWS S3 and some applications like veeam
incorrectly end up relying on the case sensitivity
of the HTTP headers.
Bonus fixes
- Fix the iso8601 time format to keep it same as
AWS S3 response
- Increase maxObjectList to 50,000 and use
maxDeleteList as 10,000 whenever multi-object
deletes are needed.
ResponseWriter & RecordAPIStats has similar role, merge them.
This commit will also fix wrong auditing for STS and Web and others
since they are using ResponseWriter instead of the RecordAPIStats.
Add two new configuration entries, api.requests-max and
api.requests-deadline which have the same role of
MINIO_API_REQUESTS_MAX and MINIO_API_REQUESTS_DEADLINE.
- total number of S3 API calls per server
- maximum wait duration for any S3 API call
This implementation is primarily meant for situations
where HDDs are not capable enough to handle the incoming
workload and there is no way to throttle the client.
This feature allows MinIO server to throttle itself
such that we do not overwhelm the HDDs.
JWT parsing is simplified by using a custom claim
data structure such as MapClaims{}, also writes
a custom Unmarshaller for faster unmarshalling.
- Avoid as much reflections as possible
- Provide the right types for functions as much
as possible
- Avoid strings.Join, strings.Split to reduce
allocations, rely on indexes directly.
On every restart of the server, usage was being
calculated which is not useful instead wait for
sufficient time to start the crawling routine.
This PR also avoids lots of double allocations
through strings, optimizes usage of string builders
and also avoids crawling through symbolic links.
Fixes#8844
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIS
This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
This PR adds code to appropriately handle versioning issues
that come up quite constantly across our API changes. Currently
we were also routing our requests wrong which sort of made it
harder to write a consistent error handling code to appropriately
reject or honor requests.
This PR potentially fixes issues
- old mc is used against new minio release which is incompatible
returns an appropriate for client action.
- any older servers talking to each other, report appropriate error
- incompatible peer servers should report error and reject the calls
with appropriate error
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support MinIO browser to support proper claims, using STS tokens
- env support for all config parameters, legacy envs are also
supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
- ClearHistory
- RestoreHistory
- ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
fixes encountered during testing.
The measures are consolidated to the following metrics
- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.
And this PR also modifies the current StorageInfo queries
- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.
NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR
Fixes#7873
This PR fixes relying on r.Context().Done()
by setting
```
Connection: "close"
```
HTTP Header, this has detrimental issues for
client side connection pooling. Since this
header explicitly tells clients to turn-off
connection pooling. This causing pro-active
connections to be closed leaving many conn's
in TIME_WAIT state. This can be observed with
`mc admin trace -a` when running distributed
setup.
This PR also fixes tracing filtering issue
when bucket names have `minio` as prefixes,
trace was erroneously ignoring them.
This allows for canonicalization of the strings
throughout our code and provides a common space
for all these constants to reside.
This list is rather non-exhaustive but captures
all the headers used in AWS S3 API operations
Different gateway implementations due to different backend
API errors, might return different unsupported errors at
our handler layer. Current code posed a problem for us because
this information was lost and we would convert it to InternalError
in this situation all S3 clients end up retrying the request.
To avoid this unexpected situation implement a way to support
this cleanly such that the underlying information is not lost
which is returned by gateway.
Especially in gateway IAM admin APIs are not enabled
if etcd is not enabled, we should enable admin API though
but only enable IAM and Config APIs with etcd configured.
This PR adds support
- Request query params
- Request headers
- Response headers
AuditLogEntry is exported and versioned as well
starting with this PR.
POST mime/multipart upload style can have filename value optional
which leads to implementation issues in Go releases in their
standard mime/multipart library.
When `filename` doesn't exist Go doesn't update `form.File` which
we rely on to extract the incoming file data, strangely when `filename`
is not specified this data is buffered in memory and is now part of
`form.Value` instead of `form.File` which creates an inconsistent
behavior.
This PR tries to fix this in our code for the time being, but ideal PR
would be to fix the upstream mime/multipart library to handle the
above situation consistently.
This PR adds disk based edge caching support for minio server.
Cache settings can be configured in config.json to take list of disk drives,
cache expiry in days and file patterns to exclude from cache or via environment
variables MINIO_CACHE_DRIVES, MINIO_CACHE_EXCLUDE and MINIO_CACHE_EXPIRY
Design assumes that Atime support is enabled and the list of cache drives is
fixed.
- Objects are cached on both GET and PUT/POST operations.
- Expiry is used as hint to evict older entries from cache, or if 80% of cache
capacity is filled.
- When object storage backend is down, GET, LIST and HEAD operations fetch
object seamlessly from cache.
Current Limitations
- Bucket policies are not cached, so anonymous operations are not supported in
offline mode.
- Objects are distributed using deterministic hashing among list of cache
drives specified.If one or more drives go offline, or cache drive
configuration is altered - performance could degrade to linear lookup.
Fixes#4026
Save http trace to a file instead of displaying it onto the console.
the environment variable MINIO_HTTP_TRACE will be a filepath instead
of a boolean.
This to handle the scenario where both json and http tracing are
turned on. In that case, both http trace and json output are displayed
on the screen making the json not parsable. Loging this trace onto
a file helps us avoid that scenario.
Fixes#5263
This adds configurable data and parity options on a per object
basis. To use variable parity
- Users can set environment variables to cofigure variable
parity
- Then add header x-amz-storage-class to putobject requests
with relevant storage class values
Fixes#4997
This change introduces following simplified steps to follow
during config migration.
```
// Steps to move from version N to version N+1
// 1. Add new struct serverConfigVN+1 in config-versions.go
// 2. Set configCurrentVersion to "N+1"
// 3. Set serverConfigCurrent to serverConfigVN+1
// 4. Add new migration function (ex. func migrateVNToVN+1()) in config-migrate.go
// 5. Call migrateVNToVN+1() from migrateConfig() in config-migrate.go
// 6. Make changes in config-current_test.go for any test change
```
When MINIO_TRACE_DIR is provided, create a new log file and store all
HTTP requests + responses data, body are excluded to reduce memory
consumption. MINIO_HTTP_TRACE=1 enables logging. Use non mem
consuming http req/resp recorders, the maximum is about 32k per request.
This logs to STDOUT, body logging is disabled for PutObject PutObjectPart
GetObject.
S3 only allows http headers with a size of 8 KB and user-defined metadata
with a size of 2 KB. This change adds a new API error and returns this
error to clients which sends to large http requests.
Fixes#4634
Fixed header-to-metadat extraction. The extractMetadataFromHeader function should return an error if the http.Header contains a non-canonicalized key. The reason is that the keys can be manually set (through a map access) which can lead to ugly bugs.
Also fixed header-to-metadata extraction. Return a InternalError if a non-canonicalized key is found in a http.Header. Also log the error.
This PR also does backend format change to 1.0.1
from 1.0.0. Backward compatible changes are still
kept to read the 'md5Sum' key. But all new objects
will be stored with the same details under 'etag'.
Fixes#4312
Separate out validating v/s parsing logic in
isValidLocationConstraint() into parseLocationConstraint()
and isValidLocation()
Additionally also set `X-Amz-Bucket-Region` as part of the
common headers for the clients to fallback on in-case of any
region related errors.
We can't use Content-Encoding to verify if `aws-chunked` is set
or not. Just use 'streaming' signature header instead.
While this is considered mandatory, on the contrary aws-sdk-java
doesn't set this value
http://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html
```
Set the value to aws-chunked.
```
We will relax it and behave appropriately. Also this PR supports
saving custom encoding after trimming off the `aws-chunked`
parameter.
Fixes#3983
This change is cleanup of the postPolicyHandler code
primarily to address the flow and also converting
certain critical parts into self contained functions.
Avoid passing size = -1 to PutObject API by requiring content-length
header in POST request (as AWS S3 does) and in Upload web handler.
Post handler is modified to completely store multipart file to know
its size before sending it to PutObject().
This is a consolidation effort, avoiding usage
of naked strings in codebase. Whenever possible
use constants which can be repurposed elsewhere.
This also fixes `goconst ./...` reported issues.
This is written so that to simplify our handler code
and provide a way to only update metadata instead of
the data when source and destination in CopyObject
request are same.
Fixes#3316
These messages based on our prep stage during XL
and prints more informative message regarding
drive information.
This change also does a much needed refactoring.