Commit Graph

5777 Commits

Author SHA1 Message Date
Krishnan Parthasarathi a93214ea63
ilm: ObjectSizeLessThan and ObjectSizeGreaterThan (#18500) 2023-11-22 13:42:39 -08:00
Klaus Post e6b0fc465b
tweak healing to include version-id in healing result (#18225) 2023-11-22 12:30:31 -08:00
Anis Eleuch 70fbcfee4a
Implement batch snowball (#18485) 2023-11-22 10:51:46 -08:00
Sveinn d67e4d5b17
fix: check for bucket existence before FTP upload (#18496) 2023-11-21 21:36:32 -08:00
Harshavardhana fe3e49c4eb
use Access(F_OK) do not need to check for permissions (#18492) 2023-11-21 15:08:41 -08:00
Shubhendu 58306a9d34
Replicate Expiry ILM configs while site replication (#18130)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-21 09:48:06 -08:00
Harshavardhana a4cfb5e1ed
return errors if dataDir is missing during HeadObject() (#18477)
Bonus: allow replication to attempt Deletes/Puts when
the remote returns quorum errors of some kind, this is
to ensure that MinIO can rewrite the namespace with the
latest version that exists on the source.
2023-11-20 21:33:47 -08:00
Klaus Post 51aa59a737
perf: websocket grid connectivity for all internode communication (#18461)
This PR adds a WebSocket grid feature that allows servers to communicate via 
a single two-way connection.

There are two request types:

* Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small
  roundtrips with small payloads.

* Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`,
  which allows for different combinations of full two-way streams with an initial payload.

Only a single stream is created between two machines - and there is, as such, no
server/client relation since both sides can initiate and handle requests. Which server
initiates the request is decided deterministically on the server names.

Requests are made through a mux client and server, which handles message
passing, congestion, cancelation, timeouts, etc.

If a connection is lost, all requests are canceled, and the calling server will try
to reconnect. Registered handlers can operate directly on byte 
slices or use a higher-level generics abstraction.

There is no versioning of handlers/clients, and incompatible changes should
be handled by adding new handlers.

The request path can be changed to a new one for any protocol changes.

First, all servers create a "Manager." The manager must know its address 
as well as all remote addresses. This will manage all connections.
To get a connection to any remote, ask the manager to provide it given
the remote address using.

```
func (m *Manager) Connection(host string) *Connection
```

All serverside handlers must also be registered on the manager. This will
make sure that all incoming requests are served. The number of in-flight 
requests and responses must also be given for streaming requests.

The "Connection" returned manages the mux-clients. Requests issued
to the connection will be sent to the remote.

* `func (c *Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)`
   performs a single request and returns the result. Any deadline provided on the request is
   forwarded to the server, and canceling the context will make the function return at once.

* `func (c *Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st *Stream, err error)`
   will initiate a remote call and send the initial payload.

```Go
// A Stream is a two-way stream.
// All responses *must* be read by the caller.
// If the call is canceled through the context,
//The appropriate error will be returned.
type Stream struct {
	// Responses from the remote server.
	// Channel will be closed after an error or when the remote closes.
	// All responses *must* be read by the caller until either an error is returned or the channel is closed.
	// Canceling the context will cause the context cancellation error to be returned.
	Responses <-chan Response

	// Requests sent to the server.
	// If the handler is defined with 0 incoming capacity this will be nil.
	// Channel *must* be closed to signal the end of the stream.
	// If the request context is canceled, the stream will no longer process requests.
	Requests chan<- []byte
}

type Response struct {
	Msg []byte
	Err error
}
```

There are generic versions of the server/client handlers that allow the use of type
safe implementations for data types that support msgpack marshal/unmarshal.
2023-11-20 17:09:35 -08:00
Anis Eleuch 02331a612c
batch-repl: Replicate missing metadata and standard headers (#18484)
- Replicate Expires when the source is local or remote
- Replicate metadata when the source is remote
2023-11-18 19:12:44 -08:00
Anis Eleuch 8317557f70
decom: Fix listing quorum to be equal to deletion quorum (#18476)
With an odd number of drives per erasure set setup, the write/quorum is
the half + 1; however the decommissioning listing will still list those
objects and does not consider those as stale.

Fix it by using (N+1)/2 formula.

Co-authored-by: Anis Elleuch <anis@min.io>
2023-11-17 21:09:09 -08:00
Anis Eleuch 1bb7a2a295
Immediate transition ILM to avoid quick deferring to the scanner (#18475)
Immediate transition use case and is mostly used to fill warm
backend with a lot of data when a new deployment is created

Currently, if the transition queue is complete, the transition will be
deferred to the scanner; change this behavior by blocking the PUT request
until the transition queue has a new place for a transition task.
2023-11-17 16:16:46 -08:00
Harshavardhana 0a286153bb
remove checking for BucketInfo() peer call for every PUT() (#18464)
we already validate if the bucket doesn't exist in RenameData()
which can handle this cleanly, instead of making a network call
and returning errors.
2023-11-17 05:29:50 -08:00
Anis Eleuch 22d59e757d
Remove stale data in HEAD/GET object (#18460)
Currently if the object does not exist in quorum disks of an erasure
set, the dangling code is never called because the returned error will
be errFileNotFound or errFileVersionNotFound;

With this commit, when errFileNotFound or errFileVersionNotFound is
returning when trying to calculate the quorum of a given object, the
code checks if a disk returned nil, which means a stale object exists in
that disk, that will trigger deleteIfDangling() function
2023-11-16 08:39:53 -08:00
Andreas Auernhammer 0daa2dbf59
health: split liveness and readiness handler (#18457)
This commit splits the liveness and readiness
handler into two separate handlers. In K8S, a
liveness probe is used to determine whether the
pod is in "live" state and functioning at all.
In contrast, the readiness probe is used to
determine whether the pod is ready to serve
requests.

A failing liveness probe causes pod restarts while
a failing readiness probe causes k8s to stop routing
traffic to the pod. Hence, a liveness probe should
be as robust as possible while a readiness probe
should be used to load balancing.

Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2023-11-16 01:51:27 -08:00
Praveen raj Mani 38f35463b7
Load bucket configs during the metadata refresh (#18449)
This patch takes care of loading the bucket configs of failed buckets
during the periodic refresh. This makes sure the event notifiers and
remote bucket targets are properly initialized.
2023-11-15 12:43:25 -08:00
Harshavardhana 5573986e8e
fix: relax free inode check for single drive deployments (#18437)
users might use MinIO on NFS, GPFS that provide dynamic
inodes and may not even have a concept of free inodes.

to allow users to use MinIO on top of GPFS relax the
free inode check.
2023-11-14 09:31:16 -08:00
Sveinn f3367a1b20
Adding error handling for network errors in the SFTP layer (#18442) 2023-11-14 09:31:00 -08:00
Sveinn 8fbec30998
Adding a missing return to fix SFTP Rmdir message (#18438) 2023-11-14 09:26:46 -08:00
Harshavardhana a7466eeb0e
fix: ignore dperf on unformatted/unavailable/unmounted drives (#18435) 2023-11-13 22:32:08 -08:00
Harshavardhana 8b1e819bf3
fix: make sure to purge all the completed in resume() (#18429)
currently previously completed jobs would re-run
even if they are completed, causing incorrect behavior.
2023-11-13 08:15:00 -08:00
Anis Eleuch fe63664164
prom: Add drive failure tolerance per erasure set (#18424) 2023-11-13 00:59:48 -08:00
Sveinn 9afdb05bf4
fix: file consistency issue on SFTP upload (#18422)
* creating a byte buffer for SFTP file segments
* Adding an error condition for when there are 
  remaining segments in the queue
* Simplification of the queue using a map
2023-11-11 00:14:41 -08:00
Krishnan Parthasarathi 9569a85cee
Avoid allocs for MRF on-disk header (#18425) 2023-11-10 19:54:46 -08:00
Harshavardhana 54721b7c7b
fix: batch replication from source allow out of band deletes (#18423)
it is possible that ILM or Deletes got triggered on batch
of objects that we are attempting to batch replicate, ignore
this scenario as valid behavior.
2023-11-10 16:12:35 -08:00
Harshavardhana 91d8bddbd1
use sendfile/splice implementation to perform DMA (#18411)
sendfile implementation to perform DMA on all platforms

Go stdlib already supports sendfile/splice implementations
for

- Linux
- Windows
- *BSD
- Solaris

Along with this change however O_DIRECT for reads() must be
removed as well since we need to use sendfile() implementation

The main reason to add O_DIRECT for reads was to reduce the
chances of page-cache causing OOMs for MinIO, however it would
seem that avoiding buffer copies from user-space to kernel space
this issue is not a problem anymore.

There is no Go based memory allocation required, and neither
the page-cache is referenced back to MinIO. This page-
cache reference is fully owned by kernel at this point, this
essentially should solve the problem of page-cache build up.

With this now we also support SG - when NIC supports Scatter/Gather
https://en.wikipedia.org/wiki/Gather/scatter_(vector_addressing)
2023-11-10 10:10:14 -08:00
Harshavardhana 80adc87a14
converge WARM tier object name to hash of deployment+bucket (#18410)
this is to ensure that we can converge and save IOPs
when hot-tier accesses MinIO.
2023-11-10 02:15:13 -08:00
Taran Pelkey 117ad1b65b
Loosen requirements to detach policies for LDAP (#18419) 2023-11-09 14:44:43 -08:00
Klaus Post 2229509362
fix: leaking offline disks in MarkOffline() thread (#18414)
`monitorAndConnectEndpoints` will continue to attempt to reconnect offline disks.

Since disks were never closed, a `MarkOffline` would continue to try to check these disks forever.

Close previous disks.
2023-11-09 09:33:32 -08:00
Krishnan Parthasarathi 0a25083fdb
Tiered objects require ns locks unlike inlined (#18409) 2023-11-08 20:00:02 -08:00
Sveinn 15137d0327
refactor SFTP to use the new minio/pkg implementation (#18406) 2023-11-08 09:47:05 -08:00
Poorna 8c9974bc0f
site replication: avoid propagating bucket b/w settings (#18399)
replication mode and bucket bandwidth are one-way and should not be
propagated to peer cluster.

Regression from #18062
2023-11-08 00:40:25 -08:00
jiuker 079b6c2b50
fix: add err when all bucket resync failed (#18401) 2023-11-08 00:40:08 -08:00
Harshavardhana 754f7a8a39
replace io.Discard usage to fix some NUMA copy() latencies (#18394)
replace io.Discard usage to fix NUMA copy() latencies

On NUMA systems copying from 8K buffer allocated via
io.Discard leads to large latency build-up for every

```
copy(new8kbuf, largebuf)
```

can in-cur upto 1ms worth of latencies on NUMA systems
due to memory sharding across NUMA nodes.
2023-11-06 14:26:08 -08:00
Harshavardhana 64bafe1dfe
skip speedtest bucket from site-replication (#18393) 2023-11-06 11:52:33 -08:00
jiuker c3e456e7e6
fix: no resyncid when site-replication cancel (#18392) 2023-11-06 01:53:31 -08:00
vicmunoz da95a2d13f
fix: object versions metric help (#18388) 2023-11-03 11:43:52 -07:00
Shireesh Anjal cc5e05fdeb
Do not anonymize hostnames by default (#18387)
Anonymize them only if the parameter `anonymize` is set to `strict
2023-11-03 10:09:33 -07:00
jiuker 8a56af439c
fix: siteReplicationSys.startResync return no buckets return if error (#18374) 2023-11-02 16:00:03 -07:00
Shireesh Anjal f6e581ce54
Capture network device info in health report (#18381) 2023-11-02 09:49:49 -07:00
Klaus Post 7472818d94
Fix hanging scanner saves (#18368)
Fix various regressions from #18029

* If context is canceled the token is never returned. This will lead to scanner being unable to save and deadlocking.
* Fix backup not being able to get any data (hr empty)
* Reduce backup timeout.
2023-11-01 09:09:28 -07:00
Taran Pelkey 33322e6638
Change behavior of service account empty policies (#18346)
* Fix embedded/implied policy behavior

* assume implied policy if pased to empty

* fix for all

* Fix failing tests

---------

Co-authored-by: Prakash Senthil Vel <23444145+prakashsvmx@users.noreply.github.com>
2023-10-31 12:30:36 -07:00
Daniel López Guimaraes a1792ca0d1
fix: relax enforcing filename on PostPolicy (#18336)
The filename is not required to be on the form data.
2023-10-30 21:06:32 -07:00
Harshavardhana ac8c43fe9c
fix: allow missing hot-tier accounting (#18345) 2023-10-30 14:42:11 -07:00
Allan Roger Reid 4d40ee00e9
Add check for reverse proxy setups (#18310)
Add check for reverse proxy setups, to skip check for paths being served by different port on same address.
2023-10-30 10:49:04 -07:00
Adrian Najera 06f59ad631
fix: expiration time for share link when using OpenID (#18297) 2023-10-30 10:21:34 -07:00
Harshavardhana 877e0cac03
fix: tiering statistics handling a bug in clone() implementation (#18342)
Tiering statistics have been broken for some time now, a regression
was introduced in 6f2406b0b6

Bonus fixes an issue where the objects are not assumed to be
of the 'STANDARD' storage-class for the objects that have
not yet tiered, this should be conditional based on the object's
metadata not a default assumption.

This PR also does some cleanup in terms of implementation,

fixes #18070
2023-10-30 09:59:51 -07:00
Klaus Post 508710f4d1
Re-add duplicate upload id sanity check. (#18339)
https://github.com/minio/minio/pull/18307 partially removed the duplicate upload id check.

While I can't really see how ListDir can return duplicate entries, let's re-add it, since it is a cheap sanity check.
2023-10-29 08:33:30 -07:00
Matthew Toohey c2fedb4c3f
fix: log targetID instead of Name when event error occurs (#18335) 2023-10-28 08:32:57 -07:00
Poorna 03dc65e12d
Reload replication targets lazily if missing (#18333)
There can be rare situations where errors seen in bucket metadata
load on startup or subsequent metadata updates can result in missing
replication remotes.

Attempt a refresh of remote targets backed by a good replication config
lazily in 5 minute intervals if there ever occurs a situation where
remote targets go AWOL.
2023-10-27 21:08:53 -07:00
Praveen raj Mani 54aed421b8
fix: update the user cache while adding service accounts with expiry (#18320) 2023-10-26 08:11:29 -07:00
jiuker d5e8dac1cf
fix: canceling the heal caused goroutine to leak. (#18322) 2023-10-26 07:53:06 -07:00
Poorna 96ec8fcba1
Preserve replica timestamps in multipart (#18318)
Also a backward compatibility fix to use x-amz-replica-status
if present as replication status.
2023-10-25 21:24:10 -07:00
Harshavardhana 0663eb69ed
fix: do not preserve mtime during CopyObject() metadata updates (#18316)
mtime must be preserved only if destination mtime is set.

fixes #18314
2023-10-25 14:30:56 -07:00
Harshavardhana c60f54e5be
make ListMultipart/ListParts more reliable skip healing disks (#18312)
this PR also fixes old flaky tests, by properly marking disk offline-based tests.
2023-10-24 23:33:25 -07:00
Harshavardhana 483389f2e2 set diskMaxConcurrent to 32 if nrRequests is lower 2023-10-24 17:21:12 -07:00
Harshavardhana 069d118329
fix: listObjectParts to prefer local and single disks (#18309) 2023-10-24 13:51:57 -07:00
Harshavardhana a7b1834772
fix: flaky and stupid tests in root lockdown (#18308) 2023-10-24 13:22:44 -07:00
Klaus Post 6415dec37a
Improve multipart listing speed (#18307) 2023-10-24 12:06:06 -07:00
Harshavardhana 2dc917e87f
maxConcurrent must be set only once per node (#18303) 2023-10-23 21:42:36 -07:00
Aditya Manthramurthy 0a284a1a10
fix: SR: Add more info when IAM config differs (#18302)
Provide details on what IAM info mismatched when the validation fails
2023-10-23 21:16:40 -07:00
Harshavardhana 5c8339e1e8
fix: veeam SOS API to higher layers (#18287)
- support populating usage info from scanner info
- support populating quota for the bucket via quota
  settings for the bucket
2023-10-23 13:55:45 -07:00
Harshavardhana fd37418da2
fix: allow server not initialized error to be retried (#18300)
Since relaxing quorum the error across pools
for ListBuckets(), GetBucketInfo() we hit a
situation where loading IAM could potentially
return an error for second pool that server
is not initialized.

We need to handle this, let the pool come online
and retry transparently - this PR fixes that.
2023-10-23 12:30:20 -07:00
Harshavardhana bbfea29c2b
use object modTime for the event sequencer ID (#18285)
always set modTime after lock is acquired in
completemultipart stage to make sure that the
modTime is not racy.
2023-10-20 19:28:05 -07:00
Harshavardhana aa703dc903
relax write quorum requirement for ListBuckets()/HeadBucket() (#18288)
Also fix error handling for HeadBucket() to be pool specific
2023-10-20 17:50:21 -07:00
Harshavardhana 780882efcf
do not check for query params to be signed headers (#18283)
x-amz-signed-headers is meant for HTTP headers only
not for query params, using that to verify things
further can lead to failure.

The generated presigned URL with custom metadata
is already kosher (tamper proof).

fixes #18281
2023-10-19 21:32:49 -07:00
Klaus Post ba6218b354
fix: resource metrics "concurrent map iteration and map write" (#18273)
`resourceMetricsMap` has no protection against concurrent reads and writes.

Add a mutex and don't use maps from the last iteration.

Bug introduced in #18057

Fixes #18271
2023-10-18 13:28:50 -07:00
Harshavardhana 8e32de3ba9
cache DiskInfo() metrics call separately (#18270) 2023-10-18 11:17:32 -07:00
Klaus Post e37508fb8f
fix: linter errors in Windows specific code (#18276) 2023-10-18 11:08:15 -07:00
Klaus Post b46a717425
Remove unused config migration (#18277)
None of the migration is called. Remove dead code.
2023-10-18 11:05:24 -07:00
Klaus Post 7926df0b80
Fix globalDeploymentID race (#18275)
globalDeploymentID was being read while it was being set.

Fixes race:

```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
  github.com/minio/minio/cmd.connectLoadInitFormats()
      github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
  github.com/minio/minio/cmd.waitForFormatErasure()
      github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...

Previous read at 0x0000079605a0 by goroutine 105:
  github.com/minio/minio/cmd.newContext()
      github.com/minio/minio/cmd/utils.go:817 +0x31e
  github.com/minio/minio/cmd.adminMiddleware.func1()
      github.com/minio/minio/cmd/admin-router.go:110 +0x96
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
  github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
      github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
...
```
2023-10-18 08:06:57 -07:00
Harshavardhana f91b257f50
choose different max_concurrent requests per drive based on HDD/NVMe (#18254)
currently the default for all drives is 512, which is a lot
for HDDs the recent testing has revealed moving this to 32
for HDDs seems like a fair value.
2023-10-16 17:18:13 -07:00
Harshavardhana edfb310a59
fix: always load ENVs from files first as soon as server starts (#18247)
This is a regression from #18231, however reading from ENV files
must happen well before any parsing logic is invoked.
2023-10-15 21:13:43 -07:00
Poorna 78f1f69d57
fix site replication resync status (#18245)
To persist status changes on disk upon completion.

Adds new tests to handle this functionality.
2023-10-13 22:17:22 -07:00
Harshavardhana e1e33077e8
fix: tests and resync replication status (#18244) 2023-10-13 17:03:34 -07:00
Aditya Manthramurthy b3e7de010d
Remove usage of errors.Join for go1.19 compat (#18243) 2023-10-13 15:14:16 -07:00
Shireesh Anjal bf1c6edb76
Revert "Capture network device info in health report" (#18241)
Introducing a new version of healthinfo struct for adding this info is
not correct. It needs to be implemented differently without adding a new
version.

This reverts commit 8737025d940f80360ed4b3686b332db5156f6659.
2023-10-13 07:46:36 -07:00
jiuker 2ac7fee017
fix: missing fileName will upload failed when PostPolicyBucketHandler (#18240) 2023-10-13 07:31:23 -07:00
Klaus Post 128256e3ab
Add event counters (#18232)
Export metric for global events sent and skipped for the lifetime of the server.
2023-10-12 15:39:22 -07:00
Shireesh Anjal a66a7f3e97
Capture network device info in health report (#18213) 2023-10-12 15:33:31 -07:00
jiuker 20b79f8945
fix: env depend on the flag (#18231) 2023-10-12 15:32:38 -07:00
Klaus Post 9a877734b2
Fix various poolmeta races (#18230)
There is a fundamental race condition in `newErasureServerPools`, where setObjectLayer is 
called before the poolMeta has been loaded/populated.

We add a placeholder value to this field but disable all saving of the value, so we don't risk 
overwriting the value on disk. Once the value has been loaded or created, it is replaced with 
the proper value, which will also be saved.

Also fixes various accesses of `poolMeta` that were done without locks.

We make the `poolMeta.IsSuspended` return false, even if we shouldn't risk out-of-bounds 
reads anymore.
2023-10-12 15:30:42 -07:00
Harshavardhana 409c391850
implement helpers to get relevant info instead of FileInfo() (#18228) 2023-10-12 15:29:59 -07:00
jiuker 000928d34e
fix: should call func globalOSMetrics.time(s)() when updateOSMetrics (#18209) 2023-10-12 00:08:13 -07:00
Harshavardhana 6829ae5b13
completely remove drive caching layer from gateway days (#18217)
This has already been deprecated for close to a year now.
2023-10-11 21:18:17 -07:00
jiuker f09756443d
fix: a dynamic config will make a panic for addOrUpdateIDP (#18208) 2023-10-11 09:06:40 -07:00
jiuker 5512016885
fix: siteResyncMetrics init will make a deadlock when len(siteReplication) >= 3 (#18206) 2023-10-10 23:27:27 -07:00
Harshavardhana 21ecb941fe
fix: avoid counting out of band deletes during disk heal (#18205) 2023-10-10 14:39:48 -07:00
Harshavardhana 77e94087cf
fix: calling statfs() call moves the disk head (#18203)
if erasure upgrade is needed rely on the in-memory
values, instead of performing a "DiskInfo()" call.

https://brendangregg.com/blog/2016-09-03/sudden-disk-busy.html

for HDDs these are problematic, lets avoid this because
there is no value in "being" absolutely strict here
in terms of parity. We are okay to increase parity
as we see based on the in-memory online/offline ratio.
2023-10-10 13:47:35 -07:00
Klaus Post 9ab1f25a47
fix : PutObjectExtract data races (#18199)
Several callers to putObjectTar may be fighting to set sc. Move the write out of the loop.

Use static resp, and request elements.

Fixes tests with -race:

```
WARNING: DATA RACE
Read at 0x00c01cd680e0 by goroutine 691354:
  github.com/minio/minio/cmd.objectAPIHandlers.PutObjectExtractHandler.func1()
      e:/gopath/src/github.com/minio/minio/cmd/object-handlers.go:2130 +0x149
  github.com/minio/minio/cmd.untar.func1()
      e:/gopath/src/github.com/minio/minio/cmd/untar.go:250 +0x2b6
  github.com/minio/minio/cmd.untar.func8()
      e:/gopath/src/github.com/minio/minio/cmd/untar.go:261 +0xa4

Previous write at 0x00c01cd680e0 by goroutine 691352:
  github.com/minio/minio/cmd.objectAPIHandlers.PutObjectExtractHandler.func1()
      e:/gopath/src/github.com/minio/minio/cmd/object-handlers.go:2131 +0x15d
  github.com/minio/minio/cmd.untar.func1()
      e:/gopath/src/github.com/minio/minio/cmd/untar.go:250 +0x2b6
  github.com/minio/minio/cmd.untar.func8()
      e:/gopath/src/github.com/minio/minio/cmd/untar.go:261 +0xa4
```
2023-10-10 08:36:44 -07:00
jiuker aaab7aefbe
fix: avoid nil panic upon error in GetObjectNInfo via InnerGetObjectNInfoFn (#18198) 2023-10-10 08:35:33 -07:00
Klaus Post 5b8599e52d
Do not log invalid tag errors (#18200)
Eliminate logging on invalid tags:

```
API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325)
Time: 07:40:33 UTC 10/10/2023
DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71
RequestID: 178CAF0DE0A67480
RemoteHost: 127.0.0.1
Host: 127.0.0.1:9001
UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64)
Error: Tags cannot be more than 10 (*tags.errTag)
       5: internal\logger\logger.go:259:logger.LogIf()
       4: cmd\api-errors.go:2350:cmd.toAPIErrorCode()
       3: cmd\api-errors.go:2375:cmd.toAPIError()
       2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler()
       1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP()

API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325)
Time: 07:40:33 UTC 10/10/2023
DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71
RequestID: 178CAF0DE0BEA514
RemoteHost: 127.0.0.1
Host: 127.0.0.1:9001
UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64)
Error: Cannot provide multiple Tags with the same key (*tags.errTag)
       5: internal\logger\logger.go:259:logger.LogIf()
       4: cmd\api-errors.go:2350:cmd.toAPIErrorCode()
       3: cmd\api-errors.go:2375:cmd.toAPIError()
       2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler()
       1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP()

API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325)
Time: 07:40:33 UTC 10/10/2023
DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71
RequestID: 178CAF0DE0E78970
RemoteHost: 127.0.0.1
Host: 127.0.0.1:9001
UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64)
Error: The TagKey you have provided is invalid (*tags.errTag)
       5: internal\logger\logger.go:259:logger.LogIf()
       4: cmd\api-errors.go:2350:cmd.toAPIErrorCode()
       3: cmd\api-errors.go:2375:cmd.toAPIError()
       2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler()
       1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP()

API: PutObjectTagging(bucket=aws-sdk-go-test-aupmzek4341ee2, object=sgehiqp24fwt4hafffmtwzkrqnq325)
Time: 07:40:33 UTC 10/10/2023
DeploymentID: f122cbfa-42b1-428f-9002-39c644cace71
RequestID: 178CAF0DE1002AE8
RemoteHost: 127.0.0.1
Host: 127.0.0.1:9001
UserAgent: aws-sdk-go/1.44.257 (go1.21.0; linux; amd64)
Error: The TagValue you have provided is invalid (*tags.errTag)
       5: internal\logger\logger.go:259:logger.LogIf()
       4: cmd\api-errors.go:2350:cmd.toAPIErrorCode()
       3: cmd\api-errors.go:2375:cmd.toAPIError()
       2: cmd\object-handlers.go:2912:cmd.objectAPIHandlers.PutObjectTaggingHandler()
       1: net\http\server.go:2136:http.HandlerFunc.ServeHTTP()
```
2023-10-10 08:35:03 -07:00
Harshavardhana 74e0c9ab9b
reduce unnecessary logging, simplify certain error handling (#18196)
remove a bunch of unnecessary logs
2023-10-10 00:33:42 -07:00
Harshavardhana dcce83b288
avoid rebalance state for getObjectTags if any (#18197)
fixes #18190
2023-10-09 23:56:26 -07:00
Matthew Toohey f731e7ea36
Fix current_send_in_progress metric always being zero (#18160) 2023-10-09 17:28:17 -07:00
Maxim Tkachenko ec30bb89a4
simplify channel send() in WalkDir() (#18186) 2023-10-09 17:27:55 -07:00
Klaus Post 7cd08594f6
Use better host names for metric errors (#18188)
Typically hosts would end up like this:

```
   "hosts": [
        ":9000",
        ":9000",
        ":9000",
...
```

Also add host name to errors.
2023-10-09 17:27:11 -07:00
Aditya Manthramurthy 2b4531f069
fix: O_DIRECT is on only for multi-disk setups (#18194)
Disable it for single disk/unsupported platforms
2023-10-09 17:08:40 -07:00
Harshavardhana 11544a62aa
fix: upon write failure on disk journal close the file properly (#18183)
close the file properly before dereferencing *os.File,
this can silently leak fd's in rare cases.

This PR fixes this properly.
2023-10-08 12:17:08 -07:00
Taran Pelkey 18550387d5
fix: DeleteServiceAccount API behavior (#18163) 2023-10-08 12:13:18 -07:00
Klaus Post 0de2b9a1b2
Fix panic on double unfreezeServices (#18177)
Calling unfreezeServices twice results in panic:

```
panic: "POST /minio/peer/v32/signalservice?signal=4&sub-sys=": close of nil channel
goroutine 14703 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x65
github.com/minio/minio/cmd.setCriticalErrorHandler.func1.1()
	github.com/minio/minio/cmd/generic-handlers.go:549 +0x8e
panic({0x27c3020, 0x4c9b370})
	runtime/panic.go:884 +0x212
github.com/minio/minio/cmd.unfreezeServices()
	github.com/minio/minio/cmd/service.go:112 +0xc7
github.com/minio/minio/cmd.(*peerRESTServer).SignalServiceHandler(0x0?, {0x4cb6af0, 0xc010b96420}, 0xc01affab00)
	github.com/minio/minio/cmd/peer-rest-server.go:837 +0x13a
net/http.HandlerFunc.ServeHTTP(...)
```

If the function was called a second time `val` would not be nil, but the returned channel `ch` would be, causing the panic.

Check the channel isn't nil and also use Swap for an atomic swap instead of 2 separate operations (though we are in a mutex).
2023-10-06 07:51:50 -06:00
Poorna 9dc29d7687
Avoid ILM expiry on deleted versions that are yet to replicate (#18175)
Fixes #18167
2023-10-06 06:55:15 -06:00
Poorna 72871dbb9a
delete replication: avoid overwriting replication decision (#18174)
from ObjectInfo unless version purge status is present. Otherwise
there is potential to make incorrect replication decision if Stat
returned an error
2023-10-05 21:09:45 -06:00
Aditya Manthramurthy 4bda4e4e2b
fix: check for disk-level O_DIRECT support (#18173)
Disk level O_DIRECT support checking at xl storage initialization was
conditional on a config setting being enabled. (This never took effect
because config initialization happens after ObjectLayer is ready.) This
is not necessary as the config setting is dynamic - O_DIRECT should be
enabled via runtime config. So we need to do the disk level support
check regardless of the config setting.
2023-10-05 20:54:49 -06:00
Harshavardhana 1971c54a50
update buffer channels for both trace and listen events (#18171)
- Trace needs higher buffered channels than 4000 to ensure
  when we run `mc admin trace -a` it captures all information
  sufficiently.

- Listen event notification needs the event channel to be
  `apiRequestsMaxPerNode` * number of nodes
2023-10-05 18:16:04 -06:00
Anis Eleuch b336e9a79f
fix: loading usage cache to not fail early when reading the backup fails (#18158)
Currently, the retry is not fully used when there is no backup copy of
the data usage; use 5 retry attempts when we don't have any valid data, 
new or backup, unless we have seen an un-recognized error.
2023-10-02 19:22:35 -07:00
Harshavardhana a2ab21e91c
add max-keys=2 optimization for spark workloads (#18154)
comment in the code provides more detailed explanation
on what this PR entails and its assumptions.

this PR reduces the amount of listing() by an order
of magnitude, however there are other such calls that
still needs further optimization that shall be done
in subsequent PRs.
2023-10-02 07:52:59 -06:00
Sveinn 603437e70f
Fix startup formatting (#18156)
Percentages in root user names are used for formatting.

Before:
```
S3-API: http://192.168.50.21:9000  http://172.31.96.1:9000  http://127.0.0.1:9000
RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb"
RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb"

Console: http://192.168.50.21:51915 http://172.31.96.1:51915 http://127.0.0.1:51915
RootUser: "U4B6Zi!b75DXSPm%!!(MISSING)a(MISSING)vZb"
RootPass: "Q4#Q6y8G%!P(MISSING)x#npP4dudUobU#NBcGB7RMKV4ajYb"

Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
FORMAT: %117s MESSAGE: $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
   $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%!a(MISSING)vZb" "Q4#Q6y8G%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
```

After:

```
Status:         1 Online, 0 Offline.
S3-API: http://192.168.50.21:9000  http://172.31.96.1:9000  http://127.0.0.1:9000
RootUser: "U4B6Zi!b75DXSPm%avZb"
RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"

Console: http://192.168.50.21:52421 http://172.31.96.1:52421 http://127.0.0.1:52421
RootUser: "U4B6Zi!b75DXSPm%avZb"
RootPass: "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"

Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
   $ mc alias set myminio http://192.168.50.21:9000 "U4B6Zi!b75DXSPm%avZb" "Q4#Q6y8G%%Px#npP4dudUobU#NBcGB7RMKV4ajYb"
```

No need for special Windows case. `mc` works just fine.
2023-10-02 07:39:47 -06:00
Shireesh Anjal 6d20ec3bea
Add support for resource metrics (#18057)
Add a new endpoint for "resource" metrics `/v2/metrics/resource`

This should return system metrics related to drives, network, CPU and
memory. Except for drives, other metrics should have corresponding "avg"
and "max" values also.

Reuse the real-time feature to capture the required data,
introducing CPU and memory metrics in it.

Collect the data every minute and keep updating the average and max values
accordingly, returning the latest values when the API is called.
2023-09-30 13:40:20 -07:00
Anis Eleuch 22d2dbc4e6
decom: Fix infinite retry when the decom is canceled (#18143)
Also, use rand.Float64() since it is thread-safe; otherwise go race
will complain.
2023-09-30 00:02:29 -07:00
Harshavardhana d6446cb096
do not return an error in AbortMultipartUpload() (#18135)
returning an error is a bit undefined in AWS S3
as it may return an error or not depending on the
time from AbortMultipartUpload().
2023-09-29 10:28:19 -07:00
Harshavardhana c34bdc33fb
make sure to set Versioned field to ensure rename2 is not called (#18141)
without this the rename2() can rename the previous dataDir
causing issues for different versions of the object, only
latest version is preserved due to this bug.

Added healing code to ensure recovery of such content.
2023-09-29 09:08:24 -07:00
Anis Eleuch aec023f537
Avoid showing buckets without quorum in each pool (#18125) 2023-09-29 00:58:54 -07:00
Poorna e101eeeda9
fix: tier addition validation (#18136) 2023-09-28 22:33:24 -07:00
Harshavardhana 3c470a6b8b
fix: the inspect script to use scheme per deployment (#18118) 2023-09-27 08:22:50 -07:00
Poorna 6bc7d711b3
delete of a missing versionId return 204 (#18117) 2023-09-26 14:02:56 -07:00
Harshavardhana cdeab19673
fix: always check error upon w.Close() in Write() (#18111)
not checking w.Close() can prematurely make us
think that the w.Write() actually succeeded, apparently
Write() may or may not return an error but sometimes
only during a Close() call to the fd we may see the
error from Write() propagate.

Fdatasync(w) on the FD would return an error requiring
Close() error handling is less of a concern, however it may
happen such that fdatasync() did not return an error, where
as Close() would.
2023-09-26 11:04:00 -07:00
Anis Eleuch 22ee678136
tier: Avoid doing versioned operations since not required anymore (#18108)
Currently, setting a new tiering target returns an error when a bucket
is versioned and the tiering credentials does not have authorization to
specify a version-id when reading or removing a specific version;

Since tiering does not require versioning anymore; avoid doing versioned
operations when performing checklist ops while adding a new tiering
configuration.
2023-09-26 00:14:56 -07:00
Poorna 50a8f13e85
site replication: allow setting bandwidth default for bucket (#18062)
This can still be overridden at the bucket level
2023-09-25 15:50:52 -07:00
jiuker 6dec60b6e6
fix: check post policy like AWS S3 (#18074) 2023-09-25 12:35:25 -07:00
Harshavardhana ac3a19138a
fix: set scanning details locally to avoid cached values (#18092)
atomic variable results such as scanning must not use
cached values, instead rely on real-time information.
2023-09-25 08:26:29 -07:00
Klaus Post 21e8e071d7
Improve ListObject Compatibility (#18099)
Do not error out when a provided marker is before or after the prefix, but instead just ignore it if before and return an empty list when after.

Fixes #18093
2023-09-25 08:13:08 -07:00
Klaus Post 57f84a8b4c
Add abandoned folder scanning to metrics (#18076)
Include object and versions heal scan times when checking non-empty abandoned folders.

Furthermore don't add delay between healing versions, instead do one per object wait.
2023-09-24 22:15:31 -07:00
Aditya Manthramurthy 22041bbcc4
fix: Update policy mapping properly in notification (#18088)
This is fixing a regression from an earlier change where STS account
loading was made lazy.
2023-09-22 20:47:50 -07:00
Harshavardhana 91ebac0a00
fix: move abandoned parts check after healing not in ILM path (#18087) 2023-09-22 12:07:52 -07:00
Harshavardhana 3a90fb108c only look for metadata if batch replication asks for metadata filters (#18082)
This PR changes the StatObject() to be must have for non-minio source
to being a conditional API call.

- Calls StatObject() when needed
- Calls GetObjectTagging() when needed

These calls if we do without these conditionals can cause a lot of
delays, so we avoid them if not needed in more common scenario.
2023-09-22 11:31:57 -07:00
Shubhendu 74cfb207c1
Added check for mandatory MINIO_KMS_KES_KEY_NAME env var (#18077)
If MinIO started with KMS enabled, MINIO_KMS_KES_KEY_NAME should
be set for server to start.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-09-21 10:37:37 -07:00
Harshavardhana 9788d85ea3
remove logging for invalid metadata values (#18068) 2023-09-20 15:49:55 -07:00
Anis Eleuch 69c0e18685
perf net: Add the endpoint name related to the perf net error (#18063)
In a perf test, one node will run speed test with all nodes. If there is
an error with a peer node, the peer node name is not included in the
error hence confusing the user.

This commit will add the peer endpoint string to the netperf error.
2023-09-19 22:41:06 -07:00
Aditya Manthramurthy 3cac927348
Load STS policy mappings periodically (#18061)
To ensure that policy mappings are current for service accounts
belonging to (non-derived) STS accounts (like an LDAP user's service
account) we periodically reload such mappings.

This is primarily to handle a case where a policy mapping update
notification is missed by a minio node. Such a node would continue to
have the stale mapping in memory because STS creds/mappings were never
periodically scanned from storage.
2023-09-19 17:57:42 -07:00
Harshavardhana 9081346c40 fix: more regressions listing policy mappings (#18060)
also relax ListServiceAccounts() returning error if
no service accounts exist.
2023-09-19 15:23:18 -07:00
Harshavardhana fcfadb0e51
fix: regression in loading LDAP users policy mappings (#18055)
LDAP users are stored as STS users, we need to load
their policy mappings appropriately.

Fixes a regression caused by #17994
2023-09-19 10:31:56 -07:00
Harshavardhana 2add57cfed
apply healing per object at 1024 cycles (#18050)
- we already have MRF for most recent failures
- we trigger healing during HEAD/GET operation

These are enough, also change the default max wait
from 5sec to 1sec for default scanner speed.
2023-09-19 09:24:22 -07:00
Poorna b73699fad8
replication: pass user tags while queueing (#18052)
Continues from #18032 - otherwise replication will fail on tag based rules.
2023-09-19 03:18:28 -07:00
Harshavardhana b8ebe54e53 Revert "skip tiered objects to GLACIER in batch replication (#18044)"
This reverts commit fd421ddd6f.

MinIO already provides `filter` based on metadata that would work
in this scenario already.
2023-09-19 00:05:40 -07:00
Harshavardhana c3d70e0795
cache usage, prefix-usage, and buckets for AccountInfo up to 10 secs (#18051)
AccountInfo is quite frequently called by the Console UI 
login attempts, when many users are logging in it is important
that we provide them with better responsiveness.

- ListBuckets information is cached every second
- Bucket usage info is cached for up to 10 seconds
- Prefix usage (optional) info is cached for up to 10 secs

Failure to update after cache expiration, would still
allow login which would end up providing information
previously cached.

This allows for seamless responsiveness for the Console UI
logins, and overall responsiveness on a heavily loaded
system.
2023-09-18 22:13:03 -07:00
Harshavardhana fd421ddd6f
skip tiered objects to GLACIER in batch replication (#18044)
tiered objects to GLACIER are not readable until
they are restored, we skip these as unreadable
2023-09-18 10:25:31 -07:00
jiuker 9947c01c8e
feat: SSE-KMS use uuid instead of read all data to md5. (#17958) 2023-09-18 10:00:54 -07:00
Eng Zer Jun a00db4267c
data-usage-cache: remove redundant nil check (#17970)
From the Go specification:

  "3. If the map is nil, the number of iterations is 0." [1]

Therefore, an additional nil check for before the loop is unnecessary.

[1]: https://go.dev/ref/spec#For_range

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2023-09-16 19:09:29 -07:00
Harshavardhana 36385010f5
use optimized pathJoin instead of path.Join (#18042)
this avoids allocations in scanner routine, they are tiny but 
they allocate a lot over many cycles of the scanner.
2023-09-16 19:08:59 -07:00
Harshavardhana fa6d082bfd
reduce all major allocations in replication path (#18032)
- remove targetClient for passing around via replicationObjectInfo{}
- remove cloing to object info unnecessarily
- remove objectInfo from replicationObjectInfo{} (only require necessary fields)
2023-09-16 02:28:06 -07:00
Poorna b733e6e83c
site replication turn off retry login for admin API calls (#18039)
additionally also mark site offline if n/w is down
2023-09-15 18:01:47 -07:00
Anis Eleuch 37aa5934a1
scanner: Fix loading data usage cache structure (#18037)
Return an empty data usage cache structure when the data usage cache
file does not exist, otherwise, the scanner won't work.
2023-09-15 13:11:08 -07:00
Harshavardhana 1647fc7edc
fix: optimize listMultipartUploads to serve via local disks (#18034)
and remove unused getLoadBalancedDisks()
2023-09-15 08:34:03 -07:00
Harshavardhana 7b92687397
remove generating presignedURLs with range header for lambda (#18033) 2023-09-14 21:58:17 -07:00
Alex dc48cd841a
Added MINIO_PROMETHEUS_AUTH_TOKEN env support (#18028)
Signed-off-by: Benjamin Perez <benjamin@bexsoft.net>
2023-09-14 17:28:21 -07:00
Anis Eleuch b0e1776d6d
Do not use a chain for S3 tiering to return better error messages (#18030)
When using a chain provider all providers do not return a valid
access and secret key, an anonymous request is sent, which makes it hard
for users to figure out what is going on

In the case of S3 tiering, when AWS IAM temporary account generation returns
an error, an anonymous login will be used because of the chain provider.
Avoid this and use the AWS IAM provider directly to get a good error
message.
2023-09-14 15:28:20 -07:00
Aditya Manthramurthy 7a7068ee47
Move IAM periodic ops to a single go routine (#18026)
This helps reduce disk operations as these periodic routines would not
run concurrently any more.

Also add expired STS purging periodic operation: Since we do not scan
the on-disk STS credentials (and instead only load them on-demand) a
separate routine is needed to purge expired credentials from storage.
Currently this runs about a quarter as often as IAM refresh.

Also fix a bug where with etcd, STS accounts could get loaded into the
iamUsersMap instead of the iamSTSAccountsMap.
2023-09-14 15:25:17 -07:00
Aditya Manthramurthy cbc0ef459b
Fix policy package import name (#18031)
We do not need to rename the import of minio/pkg/v2/policy as iampolicy
any more.
2023-09-14 14:50:16 -07:00
Harshavardhana a2aabfabd9
add backups for usage-caches to rely on upon error (#18029)
This allows scanner to avoid lengthy scans, skip
things appropriately and also not lose metrics in
any manner.

reduce longer deadlines for usage-cache loads/saves
to match the disk timeout which is 2minutes now per
IOP.
2023-09-14 11:53:52 -07:00
Harshavardhana 32890342ce
introduce MINIO_BROWSER_REDIRECT env to enable/disable auto-redirect (#18025) 2023-09-13 18:43:57 -07:00
Aditya Manthramurthy ed2c2a285f
Load STS accounts into IAM cache lazily (#17994)
In situations with large number of STS credentials on disk, IAM load
time is high. To mitigate this, STS accounts will now be loaded into
memory only on demand - i.e. when the credential is used.

In each IAM cache (re)load we skip loading STS credentials and STS
policy mappings into memory. Since STS accounts only expire and cannot
be deleted, there is no risk of invalid credentials being reused,
because credential validity is checked when it is used.
2023-09-13 12:43:46 -07:00
Poorna 18e23bafd9
replication resync: report only the on-disk status (#18017)
Avoid reporting in-memory status since results can vary if different
nodes are queried, resync always runs at a single node.
2023-09-13 10:58:38 -07:00
Harshavardhana 8b8be2695f
optimize mkdir calls to avoid base-dir `Mkdir` attempts (#18021)
Currently we have IOPs of these patterns

```
[OS] os.Mkdir play.min.io:9000 /disk1 2.718µs
[OS] os.Mkdir play.min.io:9000 /disk1/data 2.406µs
[OS] os.Mkdir play.min.io:9000 /disk1/data/.minio.sys 4.068µs
[OS] os.Mkdir play.min.io:9000 /disk1/data/.minio.sys/tmp 2.843µs
[OS] os.Mkdir play.min.io:9000 /disk1/data/.minio.sys/tmp/d89c8ceb-f8d1-4cc6-b483-280f87c4719f 20.152µs
```

It can be seen that we can save quite Nx levels such as
if your drive is mounted at `/disk1/minio` you can simply
skip sending an `Mkdir /disk1/` and `Mkdir /disk1/minio`.

Since they are expected to exist already, this PR adds a way
for us to ignore all paths upto the mount or a directory which
ever has been provided to MinIO setup.
2023-09-13 08:14:36 -07:00
Poorna 96fbf18201
replication: queue existing objects to same workers as incoming (#18020)
Previously existing objects were queued to single worker and MRF re-queues
are also handled by same worker - this does not fully use the available
bandwidth in case there is no incoming workload.
2023-09-12 21:59:15 -07:00
Harshavardhana c8a57a8fa2
fix: send content-md5 for AWS S3 proactively (#18018)
fixes #17977
2023-09-12 19:11:13 -07:00
Harshavardhana b1c2dacab3
fix: allow dynamic ports for API only in non-distributed setups (#18019)
fixes #17998
2023-09-12 19:10:49 -07:00
Harshavardhana 08b3a466e8
fix: allow concurrent SFTP connections (#18013)
current implementation did not fully implement
the concurrent SFTP connection implementation,
this PR properly handles this.

fixes #17914
2023-09-12 12:41:52 -07:00
Harshavardhana 1df5e31706
optimize MRF replication queue to avoid memory leaks (#18007) 2023-09-11 20:59:11 -07:00
Harshavardhana 9f7044aed0
fix: ignore transient errors in read path (#18006)
Errors such as

```
returned an error (context deadline exceeded) (*fmt.wrapError)
```

```
(msgp: too few bytes left to read object) (*fmt.wrapError)
```
2023-09-11 15:29:59 -07:00
Anis Eleuch 41de53996b
heal: calculate the number of workers based on NRRequests (#17945) 2023-09-11 14:48:54 -07:00
Harshavardhana 9878031cfd
fix: change DISK_ to DRIVE_ for some drive related envs (#18005) 2023-09-11 12:19:22 -07:00
Poorna 703ed46d79
fix: replication of tags while removing (#17989)
A tag removal was not being replicated prior to this change
2023-09-06 19:05:02 -07:00
Harshavardhana f7ca6c63c2
fix: bucket quota clear and honor existing quota config (#17988) 2023-09-06 19:03:58 -07:00
Harshavardhana ad69b9907f
fix: report bucket metrics for only existing buckets (#17987) 2023-09-06 12:50:46 -07:00
Shubhendu bfddbb8b40
Embed file in ZIP with custom permissions (#17954)
This change enables embedding files in ZIP with custom permissions.
Also uses default creds for starting MinIO based on inspect data.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-09-06 09:24:01 -07:00
Poorna 13a2dc8485
replication resync: avoid blocking on results channel. (#17981)
continues fix in #17775
2023-09-05 20:22:39 -07:00
Harshavardhana 1e51424e8a
use syscall.Rename() directly instead of os.Rename() (#17982) 2023-09-05 20:22:23 -07:00
Harshavardhana 5b114b43f7
refactor bandwidth throttling for replication target (#17980)
This refactor is to allow using the bandwidth throttling
for other purposes.
2023-09-05 20:21:59 -07:00
Poorna 812f5a02d7
metrics: fix panic in replication stats reporting (#17979) 2023-09-05 10:26:18 -07:00
Aditya Manthramurthy 1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Krishnan Parthasarathi 71c32e9b48
Return successorModTime in quorum when available (#17925) 2023-09-04 08:24:17 -07:00
Harshavardhana 380a59520b add missing testdata for benchmarking 2023-09-02 14:40:38 -07:00
Harshavardhana 3995355150
avoid repeated large allocations for large parts (#17968)
objects with 10,000 parts and many of them can
cause a large memory spike which can potentially
lead to OOM due to lack of GC.

with previous PR reducing the memory usage significantly
in #17963, this PR reduces this further by 80% under
repeated calls.

Scanner sub-system has no use for the slice of Parts(),
it is better left empty.

```
benchmark                            old ns/op     new ns/op     delta
BenchmarkToFileInfo/ToFileInfo-8     295658        188143        -36.36%

benchmark                            old allocs     new allocs     delta
BenchmarkToFileInfo/ToFileInfo-8     61             60             -1.64%

benchmark                            old bytes     new bytes     delta
BenchmarkToFileInfo/ToFileInfo-8     1097210       227255        -79.29%
```
2023-09-02 07:49:24 -07:00
Harshavardhana 8208bcb896
remove all unnecessary logging, logOnce when absolutely needed (#17965) 2023-09-01 16:19:18 -07:00
Poorna d665e855de
replication: remove check for empty version id (#17964) 2023-09-01 13:46:10 -07:00
Harshavardhana 18b3655c99
with xlv2 format we never had to fill in checksumInfo() (#17963)
- this PR avoids sending a large ChecksumInfo slice
  when its not needed

- also for a file with XLV2 format there is no reason
  to allocate Checksum slice while reading
2023-09-01 13:45:58 -07:00
Anis Eleuch 6a8d8f34a5
kafka: Do not require key when sending a message (#17962)
Keys are helpful to ensure the strict ordering of messages, however currently the
code uses a random request id for every log, hence using the request-id
as a Kafka key is not serve any purpose;

This commit removes the usage of the key, to also fix the audit issue from
internal subsystem that does not have a request ID.
2023-09-01 08:37:22 -07:00
Harshavardhana b1c1f02132
use buffers for pathJoin, to re-use buffers. (#17960)
```
benchmark                        old ns/op     new ns/op     delta
BenchmarkPathJoin/PathJoin-8     79.6          55.3          -30.53%

benchmark                        old allocs     new allocs     delta
BenchmarkPathJoin/PathJoin-8     2              1              -50.00%

benchmark                        old bytes     new bytes     delta
BenchmarkPathJoin/PathJoin-8     48            24            -50.00%
```
2023-08-31 17:58:48 -07:00
yangw b13fcaf666
fix: read atomic variable in clientDevNull round trip time (#17955) 2023-08-31 08:31:01 -07:00
Harshavardhana 9458485e43
avoid double logging from healing (#17950) 2023-08-30 18:46:04 -07:00
Poorna b48bbe08b2
Add additional info for replication metrics API (#17293)
to track the replication transfer rate across different nodes,
number of active workers in use and in-queue stats to get
an idea of the current workload.

This PR also adds replication metrics to the site replication
status API. For site replication, prometheus metrics are
no longer at the bucket level - but at the cluster level.

Add prometheus metric to track credential errors since uptime
2023-08-30 01:00:59 -07:00
Krishnan Parthasarathi 6a67c277eb
Reuse types for key-value, notification and retry (#17936) 2023-08-29 11:27:23 -07:00
Harshavardhana 7cafdc0512
fix: skip access checks further for known buckets (#17934) 2023-08-28 15:16:41 -07:00
Harshavardhana 8a57b6bced
use renameat2 Linux extension syscall (#17757)
this is a faster and safer alternative
on newer kernel versions.
2023-08-27 09:57:11 -07:00
Krishnan Parthasarathi 53abd25116
Don't log when object to be tiered is not found (#17924) 2023-08-25 23:34:16 -07:00
Harshavardhana 1ea7826c0e
do not have to consider replicationTimestamp for healing and quorum (#17922)
replicationTimestamp might differ if there were retries
in replication and the retried attempt overwrote in
quorum but enough shards with newer timestamp causing
the existing timestamps on xl.meta to be invalid, we
do not rely on this value for anything external.

this is purely a hint for debugging purposes, but there
is no real value in it considering the object itself
is in-tact we do not have to spend time healing this
situation.

we may consider healing this situation in future but
that needs to be decoupled to make sure that we do not
over calculate how much we have to heal.
2023-08-25 15:31:15 -07:00
Anis Eleuch 0cde37be50
Reduce the number of calls to import bucket metadata (#17899)
For each bucket, save the bucket metadata 
once, call the site replication hook once
2023-08-25 07:59:16 -07:00
jiuker 6aeca54ece
fix: replace context by timeout-context from parent-context when `selfSpeedTest` (#17906) 2023-08-25 07:58:38 -07:00
Harshavardhana 124e28578c
remove strict persistence requirements for List() .metacache objects (#17917)
.metacache objects are transient in nature, and are better left to
use page-cache effectively to avoid using more IOPs on the disks.

this allows for incoming calls to be not taxed heavily due to
multiple large batch listings.
2023-08-25 07:58:11 -07:00
Harshavardhana 62c9e500de
remove mTime requirement from pre-condition checks (#17916)
given a versionId the mtime is always the same, it
can never be different than its original value.

versionIds also do not conflict, since they are uuid's
and unique practically forever.
2023-08-24 14:33:58 -07:00
jiuker 02cc18ff29 refactor the perf client for TTFB and TotalResponseTime (#17901) 2023-08-24 10:21:08 -07:00
Harshavardhana ba4566e86d
add missing IAM node metrics to cluster and node endpoint (#17908) 2023-08-24 09:26:37 -07:00
Krishnan Parthasarathi 87cb0081ec
Retain current and upto NewerNoncurrentVersions versions (#17909)
applyNewerNoncurrentVersionLimit method should pass along versions
unaffected by NewerNoncurrentVersions rule for further ILM evaluation.
2023-08-24 09:26:29 -07:00
Poorna 4a6af93c83
mark replication target offline if network timeouts seen (#17907)
regular target liveness check every 5 secs will toggle state back
as target returns online.
2023-08-24 09:24:26 -07:00
Harshavardhana af564b8ba0
allow bootstrap to capture time-spent for each initializers (#17900) 2023-08-23 03:07:06 -07:00
Klaus Post 7c8746732b
Return cancelled storage calls as 499 (#17895)
Make upstream cancels more visible - right now they are just reported as "forbidden".
2023-08-22 11:10:41 -07:00
Klaus Post f506117edb
Reduce memory profiling rate (#17894)
Change profiling from every 4KB to every 128K, reducing the lock contention by a factor of 32.
2023-08-22 07:21:49 -07:00
Harshavardhana 1c5af7c31a
serialize queueMRFHeal(), add timeouts and avoid normal build-ups (#17886)
we expect a certain level of IOPs and latency so this is okay.

fixes other miscellaneous bugs

- such as hanging on mrfCh <- when the context is canceled
- queuing MRF heal when the context is canceled
- remove unused saveStateCh channel
2023-08-21 16:44:50 -07:00
Harshavardhana 3a0125fa1f
remove unexpected logging from peer calls (#17888)
also make sure RequestID is set for system logs
2023-08-21 14:25:24 -07:00
Daniel Valdivia 328cb0a076
Pass environment variable to control session length to console (#17885)
Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>
2023-08-21 11:55:43 -07:00