Commit Graph

5795 Commits

Author SHA1 Message Date
Anis Eleuch 22f8e39b58
tier: Allow edit of the new Azure and AWS auth params (#18690)
Allow editing for the service principal credentials from Azure
and the web identity token for AWS;

Also, more validation of input parameters.
2023-12-21 16:58:10 -08:00
Harshavardhana eba23bbac4
rename object_size -> block_size for cache subsystem (#18694) 2023-12-21 16:57:13 -08:00
Harshavardhana 4550535cbb
send proper IPv6 names avoid bracketing notation (#18699)
Following policies if present

```
       "Condition": {
         "IpAddress": {
            "aws:SourceIp": [
              "54.240.143.0/24",
               "2001:DB8:1234:5678::/64"
             ]
          }
        }
```

And client is making a request to MinIO via IPv6 can
potentially crash the server.

Workarounds are turn-off IPv6 and use only IPv4
2023-12-21 16:56:55 -08:00
Anis Eleuch 8432fd5ac2
prom: Add online and healing drives metrics per erasure set (#18700) 2023-12-21 16:56:43 -08:00
Harshavardhana 7c948adf88
allow pre-allocating buffers to reduce frequent GCs during growth (#18686)
This PR also increases per node bpool memory from 1024 entries
to 2048 entries; along with that, it also moves the byte pool
centrally instead of being per pool.
2023-12-21 08:59:38 -08:00
Krishnan Parthasarathi 56b7045c20
Export tier metrics (#18678)
minio_node_tier_ttlb_seconds - Distribution of time to last byte for streaming objects from warm tier
minio_node_tier_requests_success - Number of requests to download object from warm tier that were successful
minio_node_tier_requests_failure - Number of requests to download object from warm tier that failed
2023-12-20 20:13:40 -08:00
Poorna d55b6b9909
Fix quota config replication for SR (#18684)
Fixing regression introduced by PR #17988
2023-12-19 13:22:47 -08:00
Shireesh Anjal 7680e5f81d
Read new key license_v2 from SUBNET response (#18669)
SUBNET now has a v2 of license that is returned in the new key
`license_v2`. mc will start reading and storing the same. (The old key
`license` is deprecated but is still available in SUBNET response to
ensure that the current released version of minio doesn't break)
2023-12-18 08:21:44 -08:00
Taran Pelkey ad8a34858f
Add APIs to create and list access keys for LDAP (#18402) 2023-12-15 13:00:43 -08:00
Krishnan Parthasarathi 162eced7d2
Fix incorrect metric desc for bucketRequestsDuration (#18657) 2023-12-14 19:02:11 -08:00
Krishnan Parthasarathi bec1f7c26a
metrics: Refactor handling of histogram vectors (#18632) 2023-12-14 14:02:52 -08:00
Anis Eleuch 8771617199
tier: Add support of AWS S3 tiering with web identity token file (#18648) 2023-12-14 14:01:49 -08:00
Klaus Post 6c89a81af4
Fix CreateFile shared buffer corruption. (#18652)
`(*xlStorageDiskIDCheck).CreateFile` wraps the incoming reader in `xioutil.NewDeadlineReader`.

The wrapped reader is handed to `(*xlStorage).CreateFile`. This performs a Read call via `writeAllDirect`, 
which reads into an `ODirectPool` buffer.

`(*DeadlineReader).Read` spawns an async read into the buffer. If a timeout is hit while reading, 
the read operation returns to `writeAllDirect`. The operation returns an error and the buffer is reused.

However, if the async `Read` call unblocks, it will write to the now recycled buffer.

Fix: Remove the `DeadlineReader` - it is inherently unsafe. Instead, rely on the network timeouts. 
This is not a disk timeout, anyway.

Regression in https://github.com/minio/minio/pull/17745
2023-12-14 10:51:57 -08:00
Praveen raj Mani 10ca0a6936
Label the notification target metrics by their target IDs (#18633)
This patch adds the targetID to the existing notification target metrics
and deprecates the current target metrics which points to the overall
event notification subsystem
2023-12-14 09:09:26 -08:00
Harshavardhana b3314e97a6
re-use the same local drive used by remote-peer (#18645)
historically, we have always kept storage-rest-server
and a local storage API separate without much trouble,
since they both can independently operate due to no
special state() between them.

however, over some time, we have added state()
such as

- drive monitoring threads now there will be "2" of
  them per drive instead of just 1.

- concurrent tokens available per drive are now twice
  instead of just single shared, allowing unexpectedly
  high amount of I/O to go through.

- applying serialization by using walkMutexes can now
  be adequately honored for both remote callers and local
  callers.
2023-12-13 19:27:55 -08:00
Poorna 3781a0f9ad
replication: Pass metadata timestamps in CopyObject call (#18647)
Regression from #18285. CopyObject options were inheriting source MTime
for metadata timestamps if unspecified, removing this prevented metadata
updates from being applied on target.
2023-12-13 15:28:55 -08:00
Poorna e79b289325
fix datadir missing check on HeadObject (#18646)
versions pending purge in replication were seeing a errFileCorrupt
that prevents permanent deletion after replication.

Regression from PR#18477
2023-12-13 14:54:01 -08:00
Harshavardhana 3f72c7fcc7
healthcheck requests with user-agent mozilla do not need redirects (#18642)
apparently, windows powershell curl has this abhorrent behavior
2023-12-12 16:16:26 -08:00
Harshavardhana d521c84d55
reduce logging during permission denied errors (#18641)
log them if any only once
2023-12-12 16:11:17 -08:00
Anis Eleuch 4a21dce2b5
tier: Add support of SP credentials with Azure (#18630)
Co-authored-by: Anis Elleuch <anis@min.io>
2023-12-11 21:51:53 -08:00
Harshavardhana 65f34cd823
fix: remove ODirectReader entirely since we do not need it anymore (#18619) 2023-12-09 10:17:51 -08:00
Harshavardhana 196e7e072b
allow bitrot files to be healed in MRF (#18618)
bitrot scanMode was ignored in MRF,
allow it to heal relevant content if
needed when seen as an error.
2023-12-08 12:26:01 -08:00
Anis Eleuch 6f97663174
yml-config: Add support of rootUser and rootPassword (#18615)
Users can define the root user and password in the yaml configuration
file; Root credentials defined in the environment variable still take
precedence
2023-12-08 12:04:54 -08:00
Anis Eleuch aed7a1818a
info: Populate pool/set/disk indexes for offline disks (#18613)
This can be calculated from the disk layout and some external
applications would like to know the location of the offline
disks.
2023-12-08 08:13:04 -08:00
Poorna 6b06da76cb
add configuration to limit replication workers (#18601) 2023-12-07 16:22:00 -08:00
jiuker 6ca6788bb7
feat: add events_errors_total metric (#18610) 2023-12-07 16:21:17 -08:00
Anis Eleuch 2e23e61a45
Add support of conf file to pass arguments and options (#18592) 2023-12-07 01:33:56 -08:00
Harshavardhana 53ce92b9ca
fix: use the right channel to feed the data in (#18605)
this PR fixes a regression in batch replication
where we weren't sending any data from the Walk()
results due to incorrect channels being used.
2023-12-06 18:17:03 -08:00
Shireesh Anjal 7350a29fec
Capture percentage of cpu load and memory used (#18596)
By default the cpu load is the cumulative of all cores. Capture the
percentage load (load * 100 / cpu-count)

Also capture the percentage memory used (used * 100 / total)
2023-12-06 13:19:59 -08:00
jiuker 5cc2c62c66
fix: GetFreePort() will get the same port (#18604) 2023-12-06 10:36:42 -08:00
Harshavardhana 4bc5ed6c76
support LDAP service accounts via SFTP, FTP logins (#18599) 2023-12-06 04:31:35 -08:00
Harshavardhana 73dde66dbe
stick to go1.19 go.mod (#18600) 2023-12-06 01:09:22 -08:00
Harshavardhana e30c0e7ca3 Revert "Heal buckets at node level (#18504)"
This reverts commit 708296ae1b.
2023-12-05 22:34:46 -08:00
Shubhendu 708296ae1b
Heal buckets at node level (#18504) 2023-12-05 02:17:35 -08:00
Harshavardhana fbb5e75e01
avoid run-away goroutine build-up in notification send, use channels (#18533)
use memory for async events when necessary and dequeue them as
needed, for all synchronous events customers must enable

```
MINIO_API_SYNC_EVENTS=on
```

Async events can be lost but is upto to the admin to
decide what they want, we will not create run-away number
of goroutines per event instead we will queue them properly.

Currently the max async workers is set to runtime.GOMAXPROCS(0)
which is more than sufficient in general, but it can be made
configurable in future but may not be needed.
2023-12-05 02:16:33 -08:00
Harshavardhana f327b21557
handle crashes with ILM expiry changes (#18590) 2023-12-05 01:14:36 -08:00
Harshavardhana 45b7253f39
parallelize renameData() cleanup upon error (#18591) 2023-12-04 14:54:34 -08:00
Harshavardhana 05bb655efc
avoid caching metrics for timeout errors per drive (#18584)
Bonus: combine the loop for drive/REST registration.
2023-12-04 11:54:13 -08:00
Harshavardhana 8fdfcfb562
upon RenameData() quorum error delete any partial success (#18586)
there is potential for danglingWrites when quorum failed, where
only some drives took a successful write, generally this is left
to the healing routine to pick it up. However it is better that
we delete it right away to avoid potential for quorum issues on
version signature when there are many versions of an object.
2023-12-04 11:33:39 -08:00
Harshavardhana e7c144eeac
avoid double MRF heal when there is versions disparity (#18585) 2023-12-04 11:13:50 -08:00
Harshavardhana e98172d72d
avoid hot-tier SLA to be tied to warm-tier SLA (#18581)
it is okay if the warm-tier cannot keep up, we should continue
to take I/O at hot-tier, only fail hot-tier or block it when
we are disk full.

Bonus: add metrics counter for these missed tasks, we will
know for sure if one of the node is lagging behind or is
losing too many tasks during transitioning.
2023-12-02 13:02:12 -08:00
Krishnan Parthasarathi a50f26b7f5
Implement batch-expiration for objects (#17946)
Based on an initial PR from -
https://github.com/minio/minio/pull/17792

But fully completes it with newer finalized YAML spec.
2023-12-02 02:51:33 -08:00
Klaus Post 69294cf98a
Disable DMA optimization on windows (#18575)
It appears that Windows can lock up when errors occur. Use regular copy here.
2023-12-01 16:13:19 -08:00
Krishnan Parthasarathi c397fb6c7a
Minor fixes to bucket replication (#18578) 2023-12-01 16:13:08 -08:00
Klaus Post 961b0b524e
Do not require restart when a disk is unreachable during node boot (#18576)
A disk that is not able to initialize when an instance is started
will never have a handler registered, which means a user will
need to restart the node after fixing the disk;

This will also prevent showing the wrong 'upgrade is needed.'
error message in that case.

When the disk is still failing, print an error every 30 minutes;
Disk reconnection will be retried every 30 seconds.

Co-authored-by: Anis Elleuch <anis@min.io>
2023-12-01 12:01:14 -08:00
Harshavardhana 109a9e3f35
skip ILM expired objects from healing (#18569) 2023-12-01 07:56:24 -08:00
Klaus Post 5f971fea6e
Fix Mux Connect Error (#18567)
`OpMuxConnectError` was not handled correctly.

Remove local checks for single request handlers so they can 
run before being registered locally.

Bonus: Only log IAM bootstrap on startup.
2023-12-01 00:18:04 -08:00
Klaus Post 94fbcd8ebe
Add TLS cert checksum (#18557)
It allows validation of whether all certs match across clusters.
2023-11-30 12:13:50 -08:00
Harshavardhana 879d5dd236
site replication must heal policy mappings with correct userType (#18563) 2023-11-30 10:34:18 -08:00
Harshavardhana 0ee722f8c3
cleanup handling of STS isAllowed and simplifies the PolicyDBGet() (#18554) 2023-11-29 16:07:35 -08:00
Anis Eleuch b7d11141e1
rename Force to Immediate for clarity (#18540) 2023-11-28 22:35:16 -08:00
Klaus Post bea0b050cd
Improve env var config error reporting (#18549)
Improve env var config error

Env vars that were set on current server but not on remotes were not reported in errors.

Add these.
2023-11-28 10:39:02 -08:00
Shubhendu ce62980d4e
Fixed transition rules getting overwritten while healing (#18542)
While healing the latest changes of expiry rules across sites
if target had pre existing transition rules, they were getting
overwritten as cloned latest expiry rules from remote site were
getting written as is. Fixed the same and added test cases as
well.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-28 10:38:35 -08:00
Klaus Post dc88865908
fix: shadowed error in getObjectFileInfo() (#18548)
This will result in `done <- err == nil` always returning true
for this path, which seems unintentional.
2023-11-28 09:47:41 -08:00
Krishnan Parthasarathi 9fbd931058
Skip versions expired by DeleteAllVersionsAction (#18537)
Object versions expired by DeleteAllVersionsAction must not be included
toward data-usage accounting.
2023-11-28 08:39:21 -08:00
jiuker b0264bdb90
preserve null version delete marker on suspended bucket version (#18547) 2023-11-28 08:31:33 -08:00
bestgopher 95d6f43cc8
fix(cmd/notification.go): no error when retry successful (#18530) 2023-11-27 22:41:03 -08:00
Anis Eleuch 9cb94eb4a9
cleaning up will delete instead of rename to trash with full disk err (#18534)
moveToTrash() function moves a folder to .trash, for example, when 
doing some object deletions: a data dir that has many parts will be 
renamed to the trash folder; However, ENOSPC is a valid error from 
rename(), and it can cripple a user trying to free some space in an 
entire disk situation.

Therefore, this commit will try to do a recursive delete in that case.
2023-11-27 17:36:02 -08:00
Harshavardhana bd0819330d
avoid Walk() API listing objects without quorum (#18535)
This allows batch replication to basically do not
attempt to copy objects that do not have read quorum.

This PR also allows walk() to provide custom
values for quorum under batch replication, and
key rotation.
2023-11-27 17:20:04 -08:00
Harshavardhana 8d9e83fd99
support passing signatureAge conditional (#18529)
this PR allows following policy

```
{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Sid": "Deny a presigned URL request if the signature is more than 10 min old",
         "Effect": "Deny",
         "Action": "s3:*",
         "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1/*",
         "Condition": {
            "NumericGreaterThan": {
               "s3:signatureAge": 600000
            }
         }
      }
   ]
}
```

This is to basically disable all pre-signed URLs that are older than 10 minutes.
2023-11-27 11:30:19 -08:00
jiuker be02333529
feat: drive sub-sys to max timeout reload (#18501) 2023-11-27 09:15:06 -08:00
Harshavardhana 506f121576
remove frivolous logging in transition object (#18526)
AWS S3 closes keep-alive connections frequently
leading to frivolous logs filling up the MinIO
logs when the transition tier is an AWS S3 bucket.

Ignore such transient errors, let MinIO retry
it when it can.
2023-11-26 22:18:09 -08:00
Klaus Post ca488cce87
Add detailed parameter tracing + custom prefix (#18518)
* Allow per handler custom prefix.
* Add automatic parameter extraction
2023-11-26 01:32:59 -08:00
Shireesh Anjal 11dc723324
Pass SUBNET URL to console (#18503)
When minio runs with MINIO_CI_CD=on, it is expected to communicate
with the locally running SUBNET. This is happening in the case of MinIO
via call home functionality. However, the subnet-related functionality inside the
console continues to talk to the SUBNET production URL. Because of this,
the console cannot be tested with a locally running SUBNET.

Set the env variable CONSOLE_SUBNET_URL correctly in such cases. 
(The console already has code to use the value of this variable
as the subnet URL)
2023-11-24 09:59:35 -08:00
Shubhendu dd6ea18901
fix: No shallow copy needed when looking at r.Form (#18499)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-24 09:46:55 -08:00
Harshavardhana 9032f49f25
DiskInfo() must return errDiskNotFound not internal errors (#18514) 2023-11-24 09:07:14 -08:00
Anis Eleuch fbc6f3f6e8
snowball-repl: Add support of immediate tiering (#18508)
Also, fix a possible crash when some fields are not added to the batch
snowball yaml
2023-11-22 16:33:11 -08:00
Harshavardhana fba883839d
feat: bring new HDD related performance enhancements (#18239)
Optionally allows customers to enable 

- Enable an external cache to catch GET/HEAD responses 
- Enable skipping disks that are slow to respond in GET/HEAD 
  when we have already achieved a quorum
2023-11-22 13:46:17 -08:00
Krishnan Parthasarathi a93214ea63
ilm: ObjectSizeLessThan and ObjectSizeGreaterThan (#18500) 2023-11-22 13:42:39 -08:00
Klaus Post e6b0fc465b
tweak healing to include version-id in healing result (#18225) 2023-11-22 12:30:31 -08:00
Anis Eleuch 70fbcfee4a
Implement batch snowball (#18485) 2023-11-22 10:51:46 -08:00
Sveinn d67e4d5b17
fix: check for bucket existence before FTP upload (#18496) 2023-11-21 21:36:32 -08:00
Harshavardhana fe3e49c4eb
use Access(F_OK) do not need to check for permissions (#18492) 2023-11-21 15:08:41 -08:00
Shubhendu 58306a9d34
Replicate Expiry ILM configs while site replication (#18130)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-11-21 09:48:06 -08:00
Harshavardhana a4cfb5e1ed
return errors if dataDir is missing during HeadObject() (#18477)
Bonus: allow replication to attempt Deletes/Puts when
the remote returns quorum errors of some kind, this is
to ensure that MinIO can rewrite the namespace with the
latest version that exists on the source.
2023-11-20 21:33:47 -08:00
Klaus Post 51aa59a737
perf: websocket grid connectivity for all internode communication (#18461)
This PR adds a WebSocket grid feature that allows servers to communicate via 
a single two-way connection.

There are two request types:

* Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small
  roundtrips with small payloads.

* Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`,
  which allows for different combinations of full two-way streams with an initial payload.

Only a single stream is created between two machines - and there is, as such, no
server/client relation since both sides can initiate and handle requests. Which server
initiates the request is decided deterministically on the server names.

Requests are made through a mux client and server, which handles message
passing, congestion, cancelation, timeouts, etc.

If a connection is lost, all requests are canceled, and the calling server will try
to reconnect. Registered handlers can operate directly on byte 
slices or use a higher-level generics abstraction.

There is no versioning of handlers/clients, and incompatible changes should
be handled by adding new handlers.

The request path can be changed to a new one for any protocol changes.

First, all servers create a "Manager." The manager must know its address 
as well as all remote addresses. This will manage all connections.
To get a connection to any remote, ask the manager to provide it given
the remote address using.

```
func (m *Manager) Connection(host string) *Connection
```

All serverside handlers must also be registered on the manager. This will
make sure that all incoming requests are served. The number of in-flight 
requests and responses must also be given for streaming requests.

The "Connection" returned manages the mux-clients. Requests issued
to the connection will be sent to the remote.

* `func (c *Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)`
   performs a single request and returns the result. Any deadline provided on the request is
   forwarded to the server, and canceling the context will make the function return at once.

* `func (c *Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st *Stream, err error)`
   will initiate a remote call and send the initial payload.

```Go
// A Stream is a two-way stream.
// All responses *must* be read by the caller.
// If the call is canceled through the context,
//The appropriate error will be returned.
type Stream struct {
	// Responses from the remote server.
	// Channel will be closed after an error or when the remote closes.
	// All responses *must* be read by the caller until either an error is returned or the channel is closed.
	// Canceling the context will cause the context cancellation error to be returned.
	Responses <-chan Response

	// Requests sent to the server.
	// If the handler is defined with 0 incoming capacity this will be nil.
	// Channel *must* be closed to signal the end of the stream.
	// If the request context is canceled, the stream will no longer process requests.
	Requests chan<- []byte
}

type Response struct {
	Msg []byte
	Err error
}
```

There are generic versions of the server/client handlers that allow the use of type
safe implementations for data types that support msgpack marshal/unmarshal.
2023-11-20 17:09:35 -08:00
Anis Eleuch 02331a612c
batch-repl: Replicate missing metadata and standard headers (#18484)
- Replicate Expires when the source is local or remote
- Replicate metadata when the source is remote
2023-11-18 19:12:44 -08:00
Anis Eleuch 8317557f70
decom: Fix listing quorum to be equal to deletion quorum (#18476)
With an odd number of drives per erasure set setup, the write/quorum is
the half + 1; however the decommissioning listing will still list those
objects and does not consider those as stale.

Fix it by using (N+1)/2 formula.

Co-authored-by: Anis Elleuch <anis@min.io>
2023-11-17 21:09:09 -08:00
Anis Eleuch 1bb7a2a295
Immediate transition ILM to avoid quick deferring to the scanner (#18475)
Immediate transition use case and is mostly used to fill warm
backend with a lot of data when a new deployment is created

Currently, if the transition queue is complete, the transition will be
deferred to the scanner; change this behavior by blocking the PUT request
until the transition queue has a new place for a transition task.
2023-11-17 16:16:46 -08:00
Harshavardhana 0a286153bb
remove checking for BucketInfo() peer call for every PUT() (#18464)
we already validate if the bucket doesn't exist in RenameData()
which can handle this cleanly, instead of making a network call
and returning errors.
2023-11-17 05:29:50 -08:00
Anis Eleuch 22d59e757d
Remove stale data in HEAD/GET object (#18460)
Currently if the object does not exist in quorum disks of an erasure
set, the dangling code is never called because the returned error will
be errFileNotFound or errFileVersionNotFound;

With this commit, when errFileNotFound or errFileVersionNotFound is
returning when trying to calculate the quorum of a given object, the
code checks if a disk returned nil, which means a stale object exists in
that disk, that will trigger deleteIfDangling() function
2023-11-16 08:39:53 -08:00
Andreas Auernhammer 0daa2dbf59
health: split liveness and readiness handler (#18457)
This commit splits the liveness and readiness
handler into two separate handlers. In K8S, a
liveness probe is used to determine whether the
pod is in "live" state and functioning at all.
In contrast, the readiness probe is used to
determine whether the pod is ready to serve
requests.

A failing liveness probe causes pod restarts while
a failing readiness probe causes k8s to stop routing
traffic to the pod. Hence, a liveness probe should
be as robust as possible while a readiness probe
should be used to load balancing.

Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

Signed-off-by: Andreas Auernhammer <github@aead.dev>
2023-11-16 01:51:27 -08:00
Praveen raj Mani 38f35463b7
Load bucket configs during the metadata refresh (#18449)
This patch takes care of loading the bucket configs of failed buckets
during the periodic refresh. This makes sure the event notifiers and
remote bucket targets are properly initialized.
2023-11-15 12:43:25 -08:00
Harshavardhana 5573986e8e
fix: relax free inode check for single drive deployments (#18437)
users might use MinIO on NFS, GPFS that provide dynamic
inodes and may not even have a concept of free inodes.

to allow users to use MinIO on top of GPFS relax the
free inode check.
2023-11-14 09:31:16 -08:00
Sveinn f3367a1b20
Adding error handling for network errors in the SFTP layer (#18442) 2023-11-14 09:31:00 -08:00
Sveinn 8fbec30998
Adding a missing return to fix SFTP Rmdir message (#18438) 2023-11-14 09:26:46 -08:00
Harshavardhana a7466eeb0e
fix: ignore dperf on unformatted/unavailable/unmounted drives (#18435) 2023-11-13 22:32:08 -08:00
Harshavardhana 8b1e819bf3
fix: make sure to purge all the completed in resume() (#18429)
currently previously completed jobs would re-run
even if they are completed, causing incorrect behavior.
2023-11-13 08:15:00 -08:00
Anis Eleuch fe63664164
prom: Add drive failure tolerance per erasure set (#18424) 2023-11-13 00:59:48 -08:00
Sveinn 9afdb05bf4
fix: file consistency issue on SFTP upload (#18422)
* creating a byte buffer for SFTP file segments
* Adding an error condition for when there are 
  remaining segments in the queue
* Simplification of the queue using a map
2023-11-11 00:14:41 -08:00
Krishnan Parthasarathi 9569a85cee
Avoid allocs for MRF on-disk header (#18425) 2023-11-10 19:54:46 -08:00
Harshavardhana 54721b7c7b
fix: batch replication from source allow out of band deletes (#18423)
it is possible that ILM or Deletes got triggered on batch
of objects that we are attempting to batch replicate, ignore
this scenario as valid behavior.
2023-11-10 16:12:35 -08:00
Harshavardhana 91d8bddbd1
use sendfile/splice implementation to perform DMA (#18411)
sendfile implementation to perform DMA on all platforms

Go stdlib already supports sendfile/splice implementations
for

- Linux
- Windows
- *BSD
- Solaris

Along with this change however O_DIRECT for reads() must be
removed as well since we need to use sendfile() implementation

The main reason to add O_DIRECT for reads was to reduce the
chances of page-cache causing OOMs for MinIO, however it would
seem that avoiding buffer copies from user-space to kernel space
this issue is not a problem anymore.

There is no Go based memory allocation required, and neither
the page-cache is referenced back to MinIO. This page-
cache reference is fully owned by kernel at this point, this
essentially should solve the problem of page-cache build up.

With this now we also support SG - when NIC supports Scatter/Gather
https://en.wikipedia.org/wiki/Gather/scatter_(vector_addressing)
2023-11-10 10:10:14 -08:00
Harshavardhana 80adc87a14
converge WARM tier object name to hash of deployment+bucket (#18410)
this is to ensure that we can converge and save IOPs
when hot-tier accesses MinIO.
2023-11-10 02:15:13 -08:00
Taran Pelkey 117ad1b65b
Loosen requirements to detach policies for LDAP (#18419) 2023-11-09 14:44:43 -08:00
Klaus Post 2229509362
fix: leaking offline disks in MarkOffline() thread (#18414)
`monitorAndConnectEndpoints` will continue to attempt to reconnect offline disks.

Since disks were never closed, a `MarkOffline` would continue to try to check these disks forever.

Close previous disks.
2023-11-09 09:33:32 -08:00
Krishnan Parthasarathi 0a25083fdb
Tiered objects require ns locks unlike inlined (#18409) 2023-11-08 20:00:02 -08:00
Sveinn 15137d0327
refactor SFTP to use the new minio/pkg implementation (#18406) 2023-11-08 09:47:05 -08:00
Poorna 8c9974bc0f
site replication: avoid propagating bucket b/w settings (#18399)
replication mode and bucket bandwidth are one-way and should not be
propagated to peer cluster.

Regression from #18062
2023-11-08 00:40:25 -08:00
jiuker 079b6c2b50
fix: add err when all bucket resync failed (#18401) 2023-11-08 00:40:08 -08:00
Harshavardhana 754f7a8a39
replace io.Discard usage to fix some NUMA copy() latencies (#18394)
replace io.Discard usage to fix NUMA copy() latencies

On NUMA systems copying from 8K buffer allocated via
io.Discard leads to large latency build-up for every

```
copy(new8kbuf, largebuf)
```

can in-cur upto 1ms worth of latencies on NUMA systems
due to memory sharding across NUMA nodes.
2023-11-06 14:26:08 -08:00
Harshavardhana 64bafe1dfe
skip speedtest bucket from site-replication (#18393) 2023-11-06 11:52:33 -08:00
jiuker c3e456e7e6
fix: no resyncid when site-replication cancel (#18392) 2023-11-06 01:53:31 -08:00
vicmunoz da95a2d13f
fix: object versions metric help (#18388) 2023-11-03 11:43:52 -07:00
Shireesh Anjal cc5e05fdeb
Do not anonymize hostnames by default (#18387)
Anonymize them only if the parameter `anonymize` is set to `strict
2023-11-03 10:09:33 -07:00
jiuker 8a56af439c
fix: siteReplicationSys.startResync return no buckets return if error (#18374) 2023-11-02 16:00:03 -07:00
Shireesh Anjal f6e581ce54
Capture network device info in health report (#18381) 2023-11-02 09:49:49 -07:00
Klaus Post 7472818d94
Fix hanging scanner saves (#18368)
Fix various regressions from #18029

* If context is canceled the token is never returned. This will lead to scanner being unable to save and deadlocking.
* Fix backup not being able to get any data (hr empty)
* Reduce backup timeout.
2023-11-01 09:09:28 -07:00
Taran Pelkey 33322e6638
Change behavior of service account empty policies (#18346)
* Fix embedded/implied policy behavior

* assume implied policy if pased to empty

* fix for all

* Fix failing tests

---------

Co-authored-by: Prakash Senthil Vel <23444145+prakashsvmx@users.noreply.github.com>
2023-10-31 12:30:36 -07:00
Daniel López Guimaraes a1792ca0d1
fix: relax enforcing filename on PostPolicy (#18336)
The filename is not required to be on the form data.
2023-10-30 21:06:32 -07:00
Harshavardhana ac8c43fe9c
fix: allow missing hot-tier accounting (#18345) 2023-10-30 14:42:11 -07:00
Allan Roger Reid 4d40ee00e9
Add check for reverse proxy setups (#18310)
Add check for reverse proxy setups, to skip check for paths being served by different port on same address.
2023-10-30 10:49:04 -07:00
Adrian Najera 06f59ad631
fix: expiration time for share link when using OpenID (#18297) 2023-10-30 10:21:34 -07:00
Harshavardhana 877e0cac03
fix: tiering statistics handling a bug in clone() implementation (#18342)
Tiering statistics have been broken for some time now, a regression
was introduced in 6f2406b0b6

Bonus fixes an issue where the objects are not assumed to be
of the 'STANDARD' storage-class for the objects that have
not yet tiered, this should be conditional based on the object's
metadata not a default assumption.

This PR also does some cleanup in terms of implementation,

fixes #18070
2023-10-30 09:59:51 -07:00
Klaus Post 508710f4d1
Re-add duplicate upload id sanity check. (#18339)
https://github.com/minio/minio/pull/18307 partially removed the duplicate upload id check.

While I can't really see how ListDir can return duplicate entries, let's re-add it, since it is a cheap sanity check.
2023-10-29 08:33:30 -07:00
Matthew Toohey c2fedb4c3f
fix: log targetID instead of Name when event error occurs (#18335) 2023-10-28 08:32:57 -07:00
Poorna 03dc65e12d
Reload replication targets lazily if missing (#18333)
There can be rare situations where errors seen in bucket metadata
load on startup or subsequent metadata updates can result in missing
replication remotes.

Attempt a refresh of remote targets backed by a good replication config
lazily in 5 minute intervals if there ever occurs a situation where
remote targets go AWOL.
2023-10-27 21:08:53 -07:00
Praveen raj Mani 54aed421b8
fix: update the user cache while adding service accounts with expiry (#18320) 2023-10-26 08:11:29 -07:00
jiuker d5e8dac1cf
fix: canceling the heal caused goroutine to leak. (#18322) 2023-10-26 07:53:06 -07:00
Poorna 96ec8fcba1
Preserve replica timestamps in multipart (#18318)
Also a backward compatibility fix to use x-amz-replica-status
if present as replication status.
2023-10-25 21:24:10 -07:00
Harshavardhana 0663eb69ed
fix: do not preserve mtime during CopyObject() metadata updates (#18316)
mtime must be preserved only if destination mtime is set.

fixes #18314
2023-10-25 14:30:56 -07:00
Harshavardhana c60f54e5be
make ListMultipart/ListParts more reliable skip healing disks (#18312)
this PR also fixes old flaky tests, by properly marking disk offline-based tests.
2023-10-24 23:33:25 -07:00
Harshavardhana 483389f2e2 set diskMaxConcurrent to 32 if nrRequests is lower 2023-10-24 17:21:12 -07:00
Harshavardhana 069d118329
fix: listObjectParts to prefer local and single disks (#18309) 2023-10-24 13:51:57 -07:00
Harshavardhana a7b1834772
fix: flaky and stupid tests in root lockdown (#18308) 2023-10-24 13:22:44 -07:00
Klaus Post 6415dec37a
Improve multipart listing speed (#18307) 2023-10-24 12:06:06 -07:00
Harshavardhana 2dc917e87f
maxConcurrent must be set only once per node (#18303) 2023-10-23 21:42:36 -07:00
Aditya Manthramurthy 0a284a1a10
fix: SR: Add more info when IAM config differs (#18302)
Provide details on what IAM info mismatched when the validation fails
2023-10-23 21:16:40 -07:00
Harshavardhana 5c8339e1e8
fix: veeam SOS API to higher layers (#18287)
- support populating usage info from scanner info
- support populating quota for the bucket via quota
  settings for the bucket
2023-10-23 13:55:45 -07:00
Harshavardhana fd37418da2
fix: allow server not initialized error to be retried (#18300)
Since relaxing quorum the error across pools
for ListBuckets(), GetBucketInfo() we hit a
situation where loading IAM could potentially
return an error for second pool that server
is not initialized.

We need to handle this, let the pool come online
and retry transparently - this PR fixes that.
2023-10-23 12:30:20 -07:00
Harshavardhana bbfea29c2b
use object modTime for the event sequencer ID (#18285)
always set modTime after lock is acquired in
completemultipart stage to make sure that the
modTime is not racy.
2023-10-20 19:28:05 -07:00
Harshavardhana aa703dc903
relax write quorum requirement for ListBuckets()/HeadBucket() (#18288)
Also fix error handling for HeadBucket() to be pool specific
2023-10-20 17:50:21 -07:00
Harshavardhana 780882efcf
do not check for query params to be signed headers (#18283)
x-amz-signed-headers is meant for HTTP headers only
not for query params, using that to verify things
further can lead to failure.

The generated presigned URL with custom metadata
is already kosher (tamper proof).

fixes #18281
2023-10-19 21:32:49 -07:00
Klaus Post ba6218b354
fix: resource metrics "concurrent map iteration and map write" (#18273)
`resourceMetricsMap` has no protection against concurrent reads and writes.

Add a mutex and don't use maps from the last iteration.

Bug introduced in #18057

Fixes #18271
2023-10-18 13:28:50 -07:00
Harshavardhana 8e32de3ba9
cache DiskInfo() metrics call separately (#18270) 2023-10-18 11:17:32 -07:00
Klaus Post e37508fb8f
fix: linter errors in Windows specific code (#18276) 2023-10-18 11:08:15 -07:00
Klaus Post b46a717425
Remove unused config migration (#18277)
None of the migration is called. Remove dead code.
2023-10-18 11:05:24 -07:00
Klaus Post 7926df0b80
Fix globalDeploymentID race (#18275)
globalDeploymentID was being read while it was being set.

Fixes race:

```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
  github.com/minio/minio/cmd.connectLoadInitFormats()
      github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
  github.com/minio/minio/cmd.waitForFormatErasure()
      github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...

Previous read at 0x0000079605a0 by goroutine 105:
  github.com/minio/minio/cmd.newContext()
      github.com/minio/minio/cmd/utils.go:817 +0x31e
  github.com/minio/minio/cmd.adminMiddleware.func1()
      github.com/minio/minio/cmd/admin-router.go:110 +0x96
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
  github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
      github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
...
```
2023-10-18 08:06:57 -07:00
Harshavardhana f91b257f50
choose different max_concurrent requests per drive based on HDD/NVMe (#18254)
currently the default for all drives is 512, which is a lot
for HDDs the recent testing has revealed moving this to 32
for HDDs seems like a fair value.
2023-10-16 17:18:13 -07:00
Harshavardhana edfb310a59
fix: always load ENVs from files first as soon as server starts (#18247)
This is a regression from #18231, however reading from ENV files
must happen well before any parsing logic is invoked.
2023-10-15 21:13:43 -07:00
Poorna 78f1f69d57
fix site replication resync status (#18245)
To persist status changes on disk upon completion.

Adds new tests to handle this functionality.
2023-10-13 22:17:22 -07:00
Harshavardhana e1e33077e8
fix: tests and resync replication status (#18244) 2023-10-13 17:03:34 -07:00
Aditya Manthramurthy b3e7de010d
Remove usage of errors.Join for go1.19 compat (#18243) 2023-10-13 15:14:16 -07:00
Shireesh Anjal bf1c6edb76
Revert "Capture network device info in health report" (#18241)
Introducing a new version of healthinfo struct for adding this info is
not correct. It needs to be implemented differently without adding a new
version.

This reverts commit 8737025d940f80360ed4b3686b332db5156f6659.
2023-10-13 07:46:36 -07:00
jiuker 2ac7fee017
fix: missing fileName will upload failed when PostPolicyBucketHandler (#18240) 2023-10-13 07:31:23 -07:00
Klaus Post 128256e3ab
Add event counters (#18232)
Export metric for global events sent and skipped for the lifetime of the server.
2023-10-12 15:39:22 -07:00
Shireesh Anjal a66a7f3e97
Capture network device info in health report (#18213) 2023-10-12 15:33:31 -07:00
jiuker 20b79f8945
fix: env depend on the flag (#18231) 2023-10-12 15:32:38 -07:00
Klaus Post 9a877734b2
Fix various poolmeta races (#18230)
There is a fundamental race condition in `newErasureServerPools`, where setObjectLayer is 
called before the poolMeta has been loaded/populated.

We add a placeholder value to this field but disable all saving of the value, so we don't risk 
overwriting the value on disk. Once the value has been loaded or created, it is replaced with 
the proper value, which will also be saved.

Also fixes various accesses of `poolMeta` that were done without locks.

We make the `poolMeta.IsSuspended` return false, even if we shouldn't risk out-of-bounds 
reads anymore.
2023-10-12 15:30:42 -07:00
Harshavardhana 409c391850
implement helpers to get relevant info instead of FileInfo() (#18228) 2023-10-12 15:29:59 -07:00