Using this script, post decrypt we should be able to bring up the
MinIO instance with same configuration.
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
A new middleware function is added for admin handlers, including options
for modifying certain behaviors. This admin middleware:
- sets the handler context via reflection in the request and sends AuditLog
- checks for object API availability (skipping it if a flag is passed)
- enables gzip compression (skipping it if a flag is passed)
- enables header tracing (adding body tracing if a flag is passed)
While the new function is a middleware, due to the flags used for
conditional behavior modification, which is used in each route registration
call.
To try to ensure that no regressions are introduced, the following
changes were done mechanically mostly with `sed` and regexp:
- Remove defer logger.AuditLog in admin handlers
- Replace newContext() calls with r.Context()
- Update admin routes registration calls
Bonus: remove unused NetSpeedtestHandler
Since the new adminMiddleware function checks for object layer presence
by default, we need to pass the `noObjLayerFlag` explicitly to admin
handlers that should work even when it is not available. The following
admin handlers do not require it:
- ServerInfoHandler
- StartProfilingHandler
- DownloadProfilingHandler
- ProfileHandler
- SiteReplicationDevNull
- SiteReplicationNetPerf
- TraceHandler
For these handlers adminMiddleware does not check for the object layer
presence (disabled by passing the `noObjLayerFlag`), and for all other
handlers, the pre-check ensures that the handler is not called when the
object layer is not available - the client would get a
ErrServerNotInitialized and can retry later.
This `noObjLayerFlag` is added based on existing behavior for these
handlers only.
Now it would list details of all KMS instances with additional
attributes `endpoint` and `version`. In the case of k8s-based
deployment the list would consist of a single entry.
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
The `clusterInfo` struct in admin-handlers is same as
madmin.ClusterRegistrationInfo, except for small differences in field
names.
Removing this and using madmin.ClusterRegistrationInfo in its place will
help in following ways:
- The JSON payload generated by mc in case of cluster registration will
be consistent (same keys) with cluster.info generated by minio as part
of the profile and inspect zip
- health-analyzer can parse the cluster.info using the same struct and
won't have to define it's own
when object speedtest is running keep writing
previous speedtest result back to client until
we have a new result - this avoids sending back
blank entries in between the speedtest when it
is running in 'autotune' mode.
smaller setups may have less drives per server choosing
the concurrency based on number of local drives, and let
the MinIO server change the overall concurrency as
necessary.
this has been observed in multiple environments
where the setups are small `speedtest` naturally
fails with default '10s' and the concurrency
of '32' is big for such clusters.
choose a smaller value i.e equal to number of
drives in such clusters and let 'autotune'
increase the concurrency instead.
This PR changes the handling of bucket deletes for site
replicated setups to hold on to deleted bucket state until
it syncs to all the clusters participating in site replication.
Currently, if one server in a distributed setup fails to upgrade
due to any reasons, it is not possible to upgrade again unless
nodes are restarted.
To fix this, split the upgrade process into two steps :
- download the new binary on all servers
- If successful, overwrite the old binary with the new one
Add cluster info to inspect and profiling archive.
In addition to the existing data generation for both inspect and profiling,
cluster.info file is added. This latter contains some info of the cluster.
The generation of cluster.info is is done as the last step and it can fail
if it exceed 10 seconds.
This commit adds a `context.Context` to the
the KMS `{Stat, CreateKey, GenerateKey}` API
calls.
The context will be used to terminate external calls
as soon as the client requests gets canceled.
A follow-up PR will add a `context.Context` to
the remaining `DecryptKey` API call.
Signed-off-by: Andreas Auernhammer <hi@aead.dev>
The current code uses approximation using a ratio. The approximation
can skew if we have multiple pools with different disk capacities.
Replace the algorithm with a simpler one which counts data
disks and ignore parity disks.
- currently subnet health check was freezing and calling
locks at multiple locations, avoid them.
- throw errors if first attempt itself fails with no results
Currently minio_s3_requests_errors_total covers 4xx and
5xx S3 responses which can be confusing when s3 applications
sent a lot of HEAD requests with obvious 404 responses or
when the replication is enabled.
Add
- minio_s3_requests_4xx_errors_total
- minio_s3_requests_5xx_errors_total
to help users monitor 4xx and 5xx HTTP status codes separately.
The S3 service can be frozen indefinitely if a client or mc asks for object
perf API but quits early or has some networking issues. The reason is
that partialWrite() can block indefinitely.
This commit makes partialWrite() listens to context cancellation as well. It
also renames deadlinedCtx to healthCtx since it covers handler context
cancellation and not only not only the speedtest deadline.
Main motivation is move towards a common backend format
for all different types of modes in MinIO, allowing for
a simpler code and predictable behavior across all features.
This PR also brings features such as versioning, replication,
transitioning to single drive setups.
it seems in some places we have been wrongly using the
timer.Reset() function, nicely exposed by an example
shared by @donatello https://go.dev/play/p/qoF71_D1oXD
this PR fixes all the usage comprehensively
Execute the object, drive and net speedtests as part of the healthinfo
(if requested by the client), and include their result in the response.
The options for the speedtests have been picked from the default values
used by `mc support perf` command.
The deployment id was being written to the health report towards the end
of the handler. Because of this, if there was a timeout in any of the
data fetching, the deployment id was not getting written at all. Upload
of such reports fails on SUBNET as deployment id is the unique
identifier for a cluster in subnet.
Fixed by writing the deployment id at the beginning of the processing.
avoids creating new transport for each `isServerResolvable`
request, instead re-use the available global transport and do
not try to forcibly close connections to avoid TIME_WAIT
build upon large clusters.
Never use httpClient.CloseIdleConnections() since that can have
a drastic effect on existing connections on the transport pool.
Remove it everywhere.
This is a side-affect of the optimization done in PR #13544 which
causes a certain type of delete operations on given object versions
can cause lastVersion indication to be skipped, which leads to
an `xl.meta` where Versions[] slice is empty while the entire
file is intact by itself.
This PR tries to ensure that such files are visible and deletable
by regular means of listing as null 'delete-marker' and also
avoid the situation where this potential issue might arise.
small setups do not return appropriate errors when speedtest
cannot run on small tiny setups, allow the tests to fail
appropriately more pro-actively.
many users bring toy setups, this PR simply returns an error
in such situations.
Some users running MinIO claim that their system became slow. One
way to investigate is to look at this Prometheus history of the number of
the requests reaching the server. The existing current S3 requests metric
is not enough because it can increase of the system really becomes slow,
due to disk issues for example.