minio

mirror of https://github.com/minio/minio.git synced 2024-12-24 22:25:54 -05:00

Author	SHA1	Message	Date
Anis Eleuch	7705605b5a	scanner: Add a config to disable short sleep between objects scan (#18734 ) Add a hidden configuration under the scanner sub section to configure if the scanner should sleep between two objects scan. The configuration has only effect when there is no drive activity related to s3 requests or healing. By default, the code will keep the current behavior which is doing sleep between objects. To forcefully enable the full scan speed in idle mode, you can do this: `mc admin config set myminio scanner idle_speed=full`	2024-01-04 15:07:17 -08:00
Anis Eleuch	414bcb0c73	prom: Add read quorum per erasure set metric (#18736 )	2024-01-04 15:05:13 -08:00
Harshavardhana	f4710948c4	fix: an odd crash when deleting `null` DEL markers (#18727 ) fixes #18724 A regression was introduced in #18547, which attempted to fix adding a missing `null` marker; however, we should not skip returning based on versionID. Instead, it must be based on whether we are being asked to create a DEL marker or not. The PR also has a side effect for replicating `null` marker permanent delete, as it may end up adding a `null` marker while removing one. This PR should address both scenarios.	2024-01-02 15:08:18 -08:00
Anis Eleuch	3f4488c589	scanner: Allow full throttle if there is no parallel disk ops (#18109 )	2024-01-02 13:51:24 -08:00
Pedro Juarez	8f13c8c3bf	Support to store browser config settings (#18631 ) * csp_policy * hsts_seconds * hsts_include_subdomains * hsts_preload * referrer_policy	2024-01-01 08:36:33 -08:00
Zhou Ting	31d16f6cc2	allow sha256 payload to be configurable for object perf test (#18712 ) Signed-off-by: Zhou Ting <ting.z.zhou@intel.com>	2023-12-29 23:56:50 -08:00
Harshavardhana	a50ea92c64	feat: introduce list_quorum="auto" to prefer quorum drives (#18084 ) NOTE: This feature is not retro-active; it will not cater to previous transactions on existing setups. To enable this feature, please set `_MINIO_DRIVE_QUORUM=on` environment variable as part of systemd service or k8s configmap. Once this has been enabled, you need to also set `list_quorum`. ``` ~ mc admin config set alias/ api list_quorum=auto ``` A new debugging tool is available to check for any missing counters.	2023-12-29 15:52:41 -08:00
Harshavardhana	5b2ced0119	re-use globalLocalDrives properly (#18721 )	2023-12-29 09:30:10 -08:00
Anis Eleuch	8a0ba093dd	audit: Fix merrs and derrs object dangling message (#18714 ) merrs and derrs are empty when a dangling object is deleted. Fix the bug and add invalid-meta data for data blocks	2023-12-27 22:27:04 -08:00
Daniel Valdivia	5fc7da345d	Upgrade Console to v0.44.0 (#18717 ) Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>	2023-12-27 11:19:13 -08:00
Anis Eleuch	8bd4f6568b	server-info: Avoid initializing audit/log http/kafka targets (#18703 ) This can cause unnecessary ServerInfo() call delay.	2023-12-22 10:25:08 -08:00
Harshavardhana	da55499db0	fix: reject clients that do not send proper payload (#18701 )	2023-12-22 01:26:17 -08:00
Anis Eleuch	22f8e39b58	tier: Allow edit of the new Azure and AWS auth params (#18690 ) Allow editing for the service principal credentials from Azure and the web identity token for AWS; Also, more validation of input parameters.	2023-12-21 16:58:10 -08:00
Harshavardhana	eba23bbac4	rename object_size -> block_size for cache subsystem (#18694 )	2023-12-21 16:57:13 -08:00
Harshavardhana	4550535cbb	send proper IPv6 names avoid bracketing notation (#18699 ) If the following policy is present ``` "Condition": { "IpAddress": { "aws:SourceIp": [ "54.240.143.0/24", "2001:DB8:1234:5678::/64" ] } } ``` and a client makes a request to MinIO via IPv6, it can potentially crash the server. The workaround is to turn off IPv6 and use only IPv4	2023-12-21 16:56:55 -08:00
Anis Eleuch	8432fd5ac2	prom: Add online and healing drives metrics per erasure set (#18700 )	2023-12-21 16:56:43 -08:00
Harshavardhana	7c948adf88	allow pre-allocating buffers to reduce frequent GCs during growth (#18686 ) This PR also increases per node bpool memory from 1024 entries to 2048 entries; along with that, it also moves the byte pool centrally instead of being per pool.	2023-12-21 08:59:38 -08:00
Krishnan Parthasarathi	56b7045c20	Export tier metrics (#18678 ) minio_node_tier_ttlb_seconds - Distribution of time to last byte for streaming objects from warm tier minio_node_tier_requests_success - Number of requests to download object from warm tier that were successful minio_node_tier_requests_failure - Number of requests to download object from warm tier that failed	2023-12-20 20:13:40 -08:00
Poorna	d55b6b9909	Fix quota config replication for SR (#18684 ) Fixing regression introduced by PR #17988	2023-12-19 13:22:47 -08:00
Shireesh Anjal	7680e5f81d	Read new key license_v2 from SUBNET response (#18669 ) SUBNET now has a v2 of license that is returned in the new key `license_v2`. mc will start reading and storing the same. (The old key `license` is deprecated but is still available in SUBNET response to ensure that the current released version of minio doesn't break)	2023-12-18 08:21:44 -08:00
Taran Pelkey	ad8a34858f	Add APIs to create and list access keys for LDAP (#18402 )	2023-12-15 13:00:43 -08:00
Krishnan Parthasarathi	162eced7d2	Fix incorrect metric desc for bucketRequestsDuration (#18657 )	2023-12-14 19:02:11 -08:00
Krishnan Parthasarathi	bec1f7c26a	metrics: Refactor handling of histogram vectors (#18632 )	2023-12-14 14:02:52 -08:00
Anis Eleuch	8771617199	tier: Add support of AWS S3 tiering with web identity token file (#18648 )	2023-12-14 14:01:49 -08:00
Klaus Post	6c89a81af4	Fix CreateFile shared buffer corruption. (#18652 ) `(xlStorageDiskIDCheck).CreateFile` wraps the incoming reader in `xioutil.NewDeadlineReader`. The wrapped reader is handed to `(xlStorage).CreateFile`. This performs a Read call via `writeAllDirect`, which reads into an `ODirectPool` buffer. `(*DeadlineReader).Read` spawns an async read into the buffer. If a timeout is hit while reading, the read operation returns to `writeAllDirect`. The operation returns an error and the buffer is reused. However, if the async `Read` call unblocks, it will write to the now recycled buffer. Fix: Remove the `DeadlineReader` - it is inherently unsafe. Instead, rely on the network timeouts. This is not a disk timeout, anyway. Regression in https://github.com/minio/minio/pull/17745	2023-12-14 10:51:57 -08:00
Praveen raj Mani	10ca0a6936	Label the notification target metrics by their target IDs (#18633 ) This patch adds the targetID to the existing notification target metrics and deprecates the current target metrics which points to the overall event notification subsystem	2023-12-14 09:09:26 -08:00
Harshavardhana	b3314e97a6	re-use the same local drive used by remote-peer (#18645 ) historically, we have always kept storage-rest-server and a local storage API separate without much trouble, since they both can independently operate due to no special state() between them. however, over some time, we have added state() such as - drive monitoring threads now there will be "2" of them per drive instead of just 1. - concurrent tokens available per drive are now twice instead of just single shared, allowing unexpectedly high amount of I/O to go through. - applying serialization by using walkMutexes can now be adequately honored for both remote callers and local callers.	2023-12-13 19:27:55 -08:00
Poorna	3781a0f9ad	replication: Pass metadata timestamps in CopyObject call (#18647 ) Regression from #18285. CopyObject options were inheriting source MTime for metadata timestamps if unspecified, removing this prevented metadata updates from being applied on target.	2023-12-13 15:28:55 -08:00
Poorna	e79b289325	fix datadir missing check on HeadObject (#18646 ) versions pending purge in replication were seeing a errFileCorrupt that prevents permanent deletion after replication. Regression from PR#18477	2023-12-13 14:54:01 -08:00
Harshavardhana	3f72c7fcc7	healthcheck requests with user-agent mozilla do not need redirects (#18642 ) apparently, windows powershell curl has this abhorrent behavior	2023-12-12 16:16:26 -08:00
Harshavardhana	d521c84d55	reduce logging during permission denied errors (#18641 ) log them if any only once	2023-12-12 16:11:17 -08:00
Anis Eleuch	4a21dce2b5	tier: Add support of SP credentials with Azure (#18630 ) Co-authored-by: Anis Elleuch <anis@min.io>	2023-12-11 21:51:53 -08:00
Harshavardhana	65f34cd823	fix: remove ODirectReader entirely since we do not need it anymore (#18619 )	2023-12-09 10:17:51 -08:00
Harshavardhana	196e7e072b	allow bitrot files to be healed in MRF (#18618 ) bitrot scanMode was ignored in MRF, allow it to heal relevant content if needed when seen as an error.	2023-12-08 12:26:01 -08:00
Anis Eleuch	6f97663174	yml-config: Add support of rootUser and rootPassword (#18615 ) Users can define the root user and password in the yaml configuration file; Root credentials defined in the environment variable still take precedence	2023-12-08 12:04:54 -08:00
Anis Eleuch	aed7a1818a	info: Populate pool/set/disk indexes for offline disks (#18613 ) This can be calculated from the disk layout and some external applications would like to know the location of the offline disks.	2023-12-08 08:13:04 -08:00
Poorna	6b06da76cb	add configuration to limit replication workers (#18601 )	2023-12-07 16:22:00 -08:00
jiuker	6ca6788bb7	feat: add events_errors_total metric (#18610 )	2023-12-07 16:21:17 -08:00
Anis Eleuch	2e23e61a45	Add support of conf file to pass arguments and options (#18592 )	2023-12-07 01:33:56 -08:00
Harshavardhana	53ce92b9ca	fix: use the right channel to feed the data in (#18605 ) this PR fixes a regression in batch replication where we weren't sending any data from the Walk() results due to incorrect channels being used.	2023-12-06 18:17:03 -08:00
Shireesh Anjal	7350a29fec	Capture percentage of cpu load and memory used (#18596 ) By default the cpu load is the cumulative of all cores. Capture the percentage load (load * 100 / cpu-count) Also capture the percentage memory used (used * 100 / total)	2023-12-06 13:19:59 -08:00
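The two formulas quoted in commit `7350a29fec` (load * 100 / cpu-count and used * 100 / total) can be sketched in Go as follows. The function names and the zero-value guards are assumptions made for illustration; the log does not show how the health report actually computes or stores these values.

```go
package main

import "fmt"

// loadPercent turns a cumulative (all-core) load figure into a percentage,
// following the formula in the commit message: load * 100 / cpu-count.
// Hypothetical helper; returning 0 for a non-positive core count is an assumption.
func loadPercent(load float64, cpuCount int) float64 {
	if cpuCount <= 0 {
		return 0
	}
	return load * 100 / float64(cpuCount)
}

// memUsedPercent follows the second formula: used * 100 / total.
// Hypothetical helper; returning 0 for a zero total is an assumption.
func memUsedPercent(used, total uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(used) * 100 / float64(total)
}

func main() {
	// A cumulative load of 4 on an 8-core machine, and 2 GiB used out of 8 GiB.
	fmt.Printf("cpu load: %.1f%%\n", loadPercent(4, 8))
	fmt.Printf("memory used: %.1f%%\n", memUsedPercent(2<<30, 8<<30))
}
```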
jiuker	5cc2c62c66	fix: GetFreePort() will get the same port (#18604 )	2023-12-06 10:36:42 -08:00
Harshavardhana	4bc5ed6c76	support LDAP service accounts via SFTP, FTP logins (#18599 )	2023-12-06 04:31:35 -08:00
Harshavardhana	73dde66dbe	stick to go1.19 go.mod (#18600 )	2023-12-06 01:09:22 -08:00
Harshavardhana	e30c0e7ca3	Revert "Heal buckets at node level (#18504 )" This reverts commit `708296ae1b`.	2023-12-05 22:34:46 -08:00
Shubhendu	708296ae1b	Heal buckets at node level (#18504 )	2023-12-05 02:17:35 -08:00
Harshavardhana	fbb5e75e01	avoid run-away goroutine build-up in notification send, use channels (#18533 ) use memory for async events when necessary and dequeue them as needed, for all synchronous events customers must enable ``` MINIO_API_SYNC_EVENTS=on ``` Async events can be lost, but it is up to the admin to decide what they want; we will not create a run-away number of goroutines per event, instead we will queue them properly. Currently the max async workers is set to runtime.GOMAXPROCS(0), which is more than sufficient in general; it can be made configurable in future but may not be needed.	2023-12-05 02:16:33 -08:00
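The approach described in commit `fbb5e75e01` (a queue drained by a fixed number of workers instead of one goroutine per event) can be sketched roughly as below. This is a minimal illustration, not MinIO's code: `event`, `asyncQueue`, `enqueue` and `deliverAll` are hypothetical names, and dropping an event when the queue is full is an assumption based on the remark that async events can be lost.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// event is a placeholder for a notification payload (hypothetical type).
type event struct{ name string }

// asyncQueue holds pending events in a buffered channel that is drained by a
// fixed pool of workers, so the goroutine count stays bounded.
type asyncQueue struct {
	events chan event
	wg     sync.WaitGroup
}

// newAsyncQueue starts runtime.GOMAXPROCS(0) workers, the worker count named
// in the commit message. send is the delivery callback.
func newAsyncQueue(size int, send func(event)) *asyncQueue {
	q := &asyncQueue{events: make(chan event, size)}
	workers := runtime.GOMAXPROCS(0)
	q.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer q.wg.Done()
			for ev := range q.events {
				send(ev)
			}
		}()
	}
	return q
}

// enqueue never blocks: it reports false when the queue is full and the event
// is dropped (assumed behaviour for async events).
func (q *asyncQueue) enqueue(ev event) bool {
	select {
	case q.events <- ev:
		return true
	default:
		return false
	}
}

// closeAndWait stops accepting events and waits for queued ones to be sent.
func (q *asyncQueue) closeAndWait() {
	close(q.events)
	q.wg.Wait()
}

// deliverAll queues count events into a queue sized to hold all of them and
// returns how many were delivered.
func deliverAll(count int) int {
	var delivered int64
	q := newAsyncQueue(count, func(event) { atomic.AddInt64(&delivered, 1) })
	for i := 0; i < count; i++ {
		q.enqueue(event{name: fmt.Sprintf("event-%d", i)})
	}
	q.closeAndWait()
	return int(atomic.LoadInt64(&delivered))
}

func main() {
	fmt.Println("delivered:", deliverAll(100))
}
```

With a non-blocking `enqueue`, a slow notification target costs dropped events rather than an unbounded pile of goroutines, which is the trade-off the commit message describes.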
Harshavardhana	f327b21557	handle crashes with ILM expiry changes (#18590 )	2023-12-05 01:14:36 -08:00
Harshavardhana	45b7253f39	parallelize renameData() cleanup upon error (#18591 )	2023-12-04 14:54:34 -08:00
Harshavardhana	05bb655efc	avoid caching metrics for timeout errors per drive (#18584 ) Bonus: combine the loop for drive/REST registration.	2023-12-04 11:54:13 -08:00
Harshavardhana	8fdfcfb562	upon RenameData() quorum error delete any partial success (#18586 ) there is potential for danglingWrites when quorum failed, where only some drives took a successful write, generally this is left to the healing routine to pick it up. However it is better that we delete it right away to avoid potential for quorum issues on version signature when there are many versions of an object.	2023-12-04 11:33:39 -08:00
Harshavardhana	e7c144eeac	avoid double MRF heal when there is versions disparity (#18585 )	2023-12-04 11:13:50 -08:00
Harshavardhana	e98172d72d	avoid hot-tier SLA to be tied to warm-tier SLA (#18581 ) it is okay if the warm-tier cannot keep up, we should continue to take I/O at hot-tier, only fail hot-tier or block it when we are disk full. Bonus: add metrics counter for these missed tasks, we will know for sure if one of the nodes is lagging behind or is losing too many tasks during transitioning.	2023-12-02 13:02:12 -08:00
Krishnan Parthasarathi	a50f26b7f5	Implement batch-expiration for objects (#17946 ) Based on an initial PR from - https://github.com/minio/minio/pull/17792 But fully completes it with newer finalized YAML spec.	2023-12-02 02:51:33 -08:00
Klaus Post	69294cf98a	Disable DMA optimization on windows (#18575 ) It appears that Windows can lock up when errors occur. Use regular copy here.	2023-12-01 16:13:19 -08:00
Krishnan Parthasarathi	c397fb6c7a	Minor fixes to bucket replication (#18578 )	2023-12-01 16:13:08 -08:00
Klaus Post	961b0b524e	Do not require restart when a disk is unreachable during node boot (#18576 ) A disk that is not able to initialize when an instance is started will never have a handler registered, which means a user will need to restart the node after fixing the disk; This will also prevent showing the wrong 'upgrade is needed.' error message in that case. When the disk is still failing, print an error every 30 minutes; Disk reconnection will be retried every 30 seconds. Co-authored-by: Anis Elleuch <anis@min.io>	2023-12-01 12:01:14 -08:00
Harshavardhana	109a9e3f35	skip ILM expired objects from healing (#18569 )	2023-12-01 07:56:24 -08:00
Klaus Post	5f971fea6e	Fix Mux Connect Error (#18567 ) `OpMuxConnectError` was not handled correctly. Remove local checks for single request handlers so they can run before being registered locally. Bonus: Only log IAM bootstrap on startup.	2023-12-01 00:18:04 -08:00
Klaus Post	94fbcd8ebe	Add TLS cert checksum (#18557 ) It allows validation of whether all certs match across clusters.	2023-11-30 12:13:50 -08:00
Harshavardhana	879d5dd236	site replication must heal policy mappings with correct userType (#18563 )	2023-11-30 10:34:18 -08:00
Harshavardhana	0ee722f8c3	cleanup handling of STS isAllowed and simplifies the PolicyDBGet() (#18554 )	2023-11-29 16:07:35 -08:00
Anis Eleuch	b7d11141e1	rename Force to Immediate for clarity (#18540 )	2023-11-28 22:35:16 -08:00
Klaus Post	bea0b050cd	Improve env var config error reporting (#18549 ) Env vars that were set on current server but not on remotes were not reported in errors. Add these.	2023-11-28 10:39:02 -08:00
Shubhendu	ce62980d4e	Fixed transition rules getting overwritten while healing (#18542 ) While healing the latest changes of expiry rules across sites if target had pre existing transition rules, they were getting overwritten as cloned latest expiry rules from remote site were getting written as is. Fixed the same and added test cases as well. Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-11-28 10:38:35 -08:00
Klaus Post	dc88865908	fix: shadowed error in getObjectFileInfo() (#18548 ) This will result in `done <- err == nil` always returning true for this path, which seems unintentional.	2023-11-28 09:47:41 -08:00
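The class of bug fixed in commit `dc88865908` is ordinary Go variable shadowing. The sketch below reproduces the pattern in isolation; `load`, `shadowed` and `fixed` are hypothetical stand-ins, not the code from `getObjectFileInfo()`.

```go
package main

import (
	"errors"
	"fmt"
)

// load is a hypothetical operation that can fail.
func load(fail bool) (string, error) {
	if fail {
		return "", errors.New("load failed")
	}
	return "ok", nil
}

// shadowed shows the bug: `:=` in the if statement declares a new err that
// only exists inside the if, so the outer err is still nil afterwards.
func shadowed(fail bool) bool {
	var err error
	done := make(chan bool, 1)
	if v, err := load(fail); err == nil {
		_ = v
	}
	done <- err == nil // outer err was never assigned, so this sends true
	return <-done
}

// fixed assigns to the outer err with `=`, so the result reflects the failure.
func fixed(fail bool) bool {
	var (
		v   string
		err error
	)
	done := make(chan bool, 1)
	if v, err = load(fail); err == nil {
		_ = v
	}
	done <- err == nil
	return <-done
}

func main() {
	fmt.Println("shadowed reports success on failure:", shadowed(true))
	fmt.Println("fixed reports success on failure:", fixed(true))
}
```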
Krishnan Parthasarathi	9fbd931058	Skip versions expired by DeleteAllVersionsAction (#18537 ) Object versions expired by DeleteAllVersionsAction must not be included toward data-usage accounting.	2023-11-28 08:39:21 -08:00
jiuker	b0264bdb90	preserve null version delete marker on suspended bucket version (#18547 )	2023-11-28 08:31:33 -08:00
bestgopher	95d6f43cc8	fix(cmd/notification.go): no error when retry successful (#18530 )	2023-11-27 22:41:03 -08:00
Anis Eleuch	9cb94eb4a9	cleaning up will delete instead of rename to trash with full disk err (#18534 ) moveToTrash() function moves a folder to .trash, for example, when doing some object deletions: a data dir that has many parts will be renamed to the trash folder; However, ENOSPC is a valid error from rename(), and it can cripple a user trying to free some space in a full-disk situation. Therefore, this commit will try to do a recursive delete in that case.	2023-11-27 17:36:02 -08:00
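The fallback described in commit `9cb94eb4a9` can be sketched as: try the rename, and if it fails with ENOSPC, delete the directory recursively instead. The code below is an illustration under assumptions, not MinIO's `moveToTrash()`: the function name, the injectable `renameFunc` (used here only to simulate a full disk) and the `demo` helper are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// renameFunc lets the rename step be replaced; os.Rename satisfies it.
type renameFunc func(oldpath, newpath string) error

// moveToTrashOrDelete renames path into trashPath. If the rename fails because
// the disk is full (ENOSPC), it deletes path recursively so that space can
// still be freed. Any other error is returned unchanged.
func moveToTrashOrDelete(path, trashPath string, rename renameFunc) error {
	err := rename(path, trashPath)
	if errors.Is(err, syscall.ENOSPC) {
		return os.RemoveAll(path)
	}
	return err
}

// demo creates a throwaway data dir, simulates ENOSPC on rename, and reports
// whether the data dir was removed by the fallback.
func demo() bool {
	dir, err := os.MkdirTemp("", "trash-demo")
	if err != nil {
		return false
	}
	defer os.RemoveAll(dir) // clean up the scratch directory
	data := filepath.Join(dir, "datadir")
	if err := os.MkdirAll(filepath.Join(data, "part.1"), 0o755); err != nil {
		return false
	}
	// Simulated full disk: same error shape os.Rename returns.
	fullDisk := func(oldpath, newpath string) error {
		return &os.LinkError{Op: "rename", Old: oldpath, New: newpath, Err: syscall.ENOSPC}
	}
	if err := moveToTrashOrDelete(data, filepath.Join(dir, ".trash"), fullDisk); err != nil {
		return false
	}
	_, statErr := os.Stat(data)
	return errors.Is(statErr, os.ErrNotExist)
}

func main() {
	fmt.Println("data dir deleted after simulated ENOSPC:", demo())
}
```

In real use the third argument would simply be `os.Rename`.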
Harshavardhana	bd0819330d	avoid Walk() API listing objects without quorum (#18535 ) This lets batch replication skip objects that do not have read quorum. This PR also allows walk() to provide custom values for quorum under batch replication, and key rotation.	2023-11-27 17:20:04 -08:00
Harshavardhana	8d9e83fd99	support passing signatureAge conditional (#18529 ) this PR allows following policy ``` { "Version": "2012-10-17", "Statement": [ { "Sid": "Deny a presigned URL request if the signature is more than 10 min old", "Effect": "Deny", "Action": "s3:*", "Resource": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1/*", "Condition": { "NumericGreaterThan": { "s3:signatureAge": 600000 } } } ] } ``` This is to basically disable all pre-signed URLs that are older than 10 minutes.	2023-11-27 11:30:19 -08:00
jiuker	be02333529	feat: drive sub-sys to max timeout reload (#18501 )	2023-11-27 09:15:06 -08:00
Harshavardhana	506f121576	remove frivolous logging in transition object (#18526 ) AWS S3 closes keep-alive connections frequently leading to frivolous logs filling up the MinIO logs when the transition tier is an AWS S3 bucket. Ignore such transient errors, let MinIO retry it when it can.	2023-11-26 22:18:09 -08:00
Klaus Post	ca488cce87	Add detailed parameter tracing + custom prefix (#18518 ) * Allow per handler custom prefix. * Add automatic parameter extraction	2023-11-26 01:32:59 -08:00
Shireesh Anjal	11dc723324	Pass SUBNET URL to console (#18503 ) When minio runs with MINIO_CI_CD=on, it is expected to communicate with the locally running SUBNET. This is happening in the case of MinIO via call home functionality. However, the subnet-related functionality inside the console continues to talk to the SUBNET production URL. Because of this, the console cannot be tested with a locally running SUBNET. Set the env variable CONSOLE_SUBNET_URL correctly in such cases. (The console already has code to use the value of this variable as the subnet URL)	2023-11-24 09:59:35 -08:00
Shubhendu	dd6ea18901	fix: No shallow copy needed when looking at r.Form (#18499 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-11-24 09:46:55 -08:00
Harshavardhana	9032f49f25	DiskInfo() must return errDiskNotFound not internal errors (#18514 )	2023-11-24 09:07:14 -08:00
Anis Eleuch	fbc6f3f6e8	snowball-repl: Add support of immediate tiering (#18508 ) Also, fix a possible crash when some fields are not added to the batch snowball yaml	2023-11-22 16:33:11 -08:00
Harshavardhana	fba883839d	feat: bring new HDD related performance enhancements (#18239 ) Optionally allows customers to enable - Enable an external cache to catch GET/HEAD responses - Enable skipping disks that are slow to respond in GET/HEAD when we have already achieved a quorum	2023-11-22 13:46:17 -08:00
Krishnan Parthasarathi	a93214ea63	ilm: ObjectSizeLessThan and ObjectSizeGreaterThan (#18500 )	2023-11-22 13:42:39 -08:00
Klaus Post	e6b0fc465b	tweak healing to include version-id in healing result (#18225 )	2023-11-22 12:30:31 -08:00
Anis Eleuch	70fbcfee4a	Implement batch snowball (#18485 )	2023-11-22 10:51:46 -08:00
Sveinn	d67e4d5b17	fix: check for bucket existence before FTP upload (#18496 )	2023-11-21 21:36:32 -08:00
Harshavardhana	fe3e49c4eb	use Access(F_OK) do not need to check for permissions (#18492 )	2023-11-21 15:08:41 -08:00
Shubhendu	58306a9d34	Replicate Expiry ILM configs while site replication (#18130 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-11-21 09:48:06 -08:00
Harshavardhana	a4cfb5e1ed	return errors if dataDir is missing during HeadObject() (#18477 ) Bonus: allow replication to attempt Deletes/Puts when the remote returns quorum errors of some kind, this is to ensure that MinIO can rewrite the namespace with the latest version that exists on the source.	2023-11-20 21:33:47 -08:00
Klaus Post	51aa59a737	perf: websocket grid connectivity for all internode communication (#18461 ) This PR adds a WebSocket grid feature that allows servers to communicate via a single two-way connection. There are two request types: * Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small roundtrips with small payloads. * Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`, which allows for different combinations of full two-way streams with an initial payload. Only a single stream is created between two machines - and there is, as such, no server/client relation since both sides can initiate and handle requests. Which server initiates the request is decided deterministically on the server names. Requests are made through a mux client and server, which handles message passing, congestion, cancelation, timeouts, etc. If a connection is lost, all requests are canceled, and the calling server will try to reconnect. Registered handlers can operate directly on byte slices or use a higher-level generics abstraction. There is no versioning of handlers/clients, and incompatible changes should be handled by adding new handlers. The request path can be changed to a new one for any protocol changes. First, all servers create a "Manager." The manager must know its address as well as all remote addresses. This will manage all connections. To get a connection to any remote, ask the manager to provide it given the remote address using. ``` func (m Manager) Connection(host string) Connection ``` All serverside handlers must also be registered on the manager. This will make sure that all incoming requests are served. The number of in-flight requests and responses must also be given for streaming requests. The "Connection" returned manages the mux-clients. Requests issued to the connection will be sent to the remote. 
* `func (c Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)` performs a single request and returns the result. Any deadline provided on the request is forwarded to the server, and canceling the context will make the function return at once. `func (c Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st Stream, err error)` will initiate a remote call and send the initial payload. ```Go // A Stream is a two-way stream. // All responses must be read by the caller. // If the call is canceled through the context, //The appropriate error will be returned. type Stream struct { // Responses from the remote server. // Channel will be closed after an error or when the remote closes. // All responses must be read by the caller until either an error is returned or the channel is closed. // Canceling the context will cause the context cancellation error to be returned. Responses <-chan Response // Requests sent to the server. // If the handler is defined with 0 incoming capacity this will be nil. // Channel must be closed to signal the end of the stream. // If the request context is canceled, the stream will no longer process requests. Requests chan<- []byte } type Response struct { Msg []byte Err error } ``` There are generic versions of the server/client handlers that allow the use of type safe implementations for data types that support msgpack marshal/unmarshal.	2023-11-20 17:09:35 -08:00
Anis Eleuch	02331a612c	batch-repl: Replicate missing metadata and standard headers (#18484 ) - Replicate Expires when the source is local or remote - Replicate metadata when the source is remote	2023-11-18 19:12:44 -08:00
Anis Eleuch	8317557f70	decom: Fix listing quorum to be equal to deletion quorum (#18476 ) With an odd number of drives per erasure set setup, the write/quorum is the half + 1; however the decommissioning listing will still list those objects and does not consider those as stale. Fix it by using (N+1)/2 formula. Co-authored-by: Anis Elleuch <anis@min.io>	2023-11-17 21:09:09 -08:00
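The (N+1)/2 formula from commit `8317557f70` differs from plain N/2 only for odd N, which is the case the commit calls out. A small sketch with integer division, where `halfDrives` and `listingQuorum` are hypothetical names and N is taken to be the number of drives in the erasure set:

```go
package main

import "fmt"

// halfDrives is plain integer halving, shown for comparison.
func halfDrives(n int) int { return n / 2 }

// listingQuorum applies the (N+1)/2 formula from the commit message.
// With integer division this rounds up for odd N.
func listingQuorum(n int) int { return (n + 1) / 2 }

func main() {
	for _, n := range []int{4, 5, 15, 16} {
		fmt.Printf("N=%d  N/2=%d  (N+1)/2=%d\n", n, halfDrives(n), listingQuorum(n))
	}
}
```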
Anis Eleuch	1bb7a2a295	Immediate transition ILM to avoid quick deferring to the scanner (#18475 ) The immediate transition use case is mostly used to fill a warm backend with a lot of data when a new deployment is created. Currently, if the transition queue is full, the transition will be deferred to the scanner; change this behavior by blocking the PUT request until the transition queue has room for a new transition task.	2023-11-17 16:16:46 -08:00
Harshavardhana	0a286153bb	remove checking for BucketInfo() peer call for every PUT() (#18464 ) we already validate if the bucket doesn't exist in RenameData() which can handle this cleanly, instead of making a network call and returning errors.	2023-11-17 05:29:50 -08:00
Anis Eleuch	22d59e757d	Remove stale data in HEAD/GET object (#18460 ) Currently if the object does not exist in quorum disks of an erasure set, the dangling code is never called because the returned error will be errFileNotFound or errFileVersionNotFound; With this commit, when errFileNotFound or errFileVersionNotFound is returned while trying to calculate the quorum of a given object, the code checks if a disk returned nil, which means a stale object exists on that disk; that will trigger the deleteIfDangling() function	2023-11-16 08:39:53 -08:00
Andreas Auernhammer	0daa2dbf59	health: split liveness and readiness handler (#18457 ) This commit splits the liveness and readiness handler into two separate handlers. In K8S, a liveness probe is used to determine whether the pod is in "live" state and functioning at all. In contrast, the readiness probe is used to determine whether the pod is ready to serve requests. A failing liveness probe causes pod restarts while a failing readiness probe causes k8s to stop routing traffic to the pod. Hence, a liveness probe should be as robust as possible while a readiness probe should be used for load balancing. Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ Signed-off-by: Andreas Auernhammer <github@aead.dev>	2023-11-16 01:51:27 -08:00
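The split described in commit `0daa2dbf59` can be illustrated with two separate `net/http` handlers: a liveness handler that answers as long as the process can serve HTTP, and a readiness handler that depends on server state. This is a generic sketch, not MinIO's handlers; the `/healthz/live` and `/healthz/ready` paths, the single `ready` flag and the helper names are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newHealthMux registers two independent probes (hypothetical paths).
func newHealthMux(ready *atomic.Bool) *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: kept as simple as possible, it only proves the process responds.
	mux.HandleFunc("/healthz/live", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: fails while the server should not receive traffic.
	mux.HandleFunc("/healthz/ready", func(w http.ResponseWriter, r *http.Request) {
		if !ready.Load() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// probe sends an in-memory GET request and returns the status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// probeBoth returns the liveness and readiness status codes for a given state.
func probeBoth(isReady bool) (live, readiness int) {
	var ready atomic.Bool
	ready.Store(isReady)
	mux := newHealthMux(&ready)
	return probe(mux, "/healthz/live"), probe(mux, "/healthz/ready")
}

func main() {
	live, readiness := probeBoth(false)
	fmt.Println("not ready:", live, readiness)
	live, readiness = probeBoth(true)
	fmt.Println("ready:", live, readiness)
}
```

Keeping the liveness handler free of state checks means a temporarily unready server is taken out of load balancing without being restarted.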
Praveen raj Mani	38f35463b7	Load bucket configs during the metadata refresh (#18449 ) This patch takes care of loading the bucket configs of failed buckets during the periodic refresh. This makes sure the event notifiers and remote bucket targets are properly initialized.	2023-11-15 12:43:25 -08:00
Harshavardhana	5573986e8e	fix: relax free inode check for single drive deployments (#18437 ) users might use MinIO on NFS, GPFS that provide dynamic inodes and may not even have a concept of free inodes. to allow users to use MinIO on top of GPFS relax the free inode check.	2023-11-14 09:31:16 -08:00
Sveinn	f3367a1b20	Adding error handling for network errors in the SFTP layer (#18442 )	2023-11-14 09:31:00 -08:00
Sveinn	8fbec30998	Adding a missing return to fix SFTP Rmdir message (#18438 )	2023-11-14 09:26:46 -08:00
Harshavardhana	a7466eeb0e	fix: ignore dperf on unformatted/unavailable/unmounted drives (#18435 )	2023-11-13 22:32:08 -08:00
Harshavardhana	8b1e819bf3	fix: make sure to purge all the completed in resume() (#18429 ) currently, previously completed jobs would be re-run, causing incorrect behavior.	2023-11-13 08:15:00 -08:00
Anis Eleuch	fe63664164	prom: Add drive failure tolerance per erasure set (#18424 )	2023-11-13 00:59:48 -08:00
Sveinn	9afdb05bf4	fix: file consistency issue on SFTP upload (#18422 ) * creating a byte buffer for SFTP file segments * Adding an error condition for when there are remaining segments in the queue * Simplification of the queue using a map	2023-11-11 00:14:41 -08:00
Krishnan Parthasarathi	9569a85cee	Avoid allocs for MRF on-disk header (#18425 )	2023-11-10 19:54:46 -08:00
Harshavardhana	54721b7c7b	fix: batch replication from source allow out of band deletes (#18423 ) it is possible that ILM or Deletes got triggered on batch of objects that we are attempting to batch replicate, ignore this scenario as valid behavior.	2023-11-10 16:12:35 -08:00
Harshavardhana	91d8bddbd1	use sendfile/splice implementation to perform DMA (#18411 ) sendfile implementation to perform DMA on all platforms. Go stdlib already supports sendfile/splice implementations for - Linux - Windows - *BSD - Solaris. Along with this change, however, O_DIRECT for reads() must be removed as well, since we need to use the sendfile() implementation. The main reason to add O_DIRECT for reads was to reduce the chances of page-cache causing OOMs for MinIO; however, it would seem that by avoiding buffer copies from user-space to kernel space this issue is not a problem anymore. There is no Go based memory allocation required, and neither is the page-cache referenced back to MinIO. This page-cache reference is fully owned by the kernel at this point, which essentially should solve the problem of page-cache build-up. With this we now also support SG when the NIC supports Scatter/Gather https://en.wikipedia.org/wiki/Gather/scatter_(vector_addressing)	2023-11-10 10:10:14 -08:00
Harshavardhana	80adc87a14	converge WARM tier object name to hash of deployment+bucket (#18410 ) this is to ensure that we can converge and save IOPs when hot-tier accesses MinIO.	2023-11-10 02:15:13 -08:00
Taran Pelkey	117ad1b65b	Loosen requirements to detach policies for LDAP (#18419 )	2023-11-09 14:44:43 -08:00
Klaus Post	2229509362	fix: leaking offline disks in MarkOffline() thread (#18414 ) `monitorAndConnectEndpoints` will continue to attempt to reconnect offline disks. Since disks were never closed, a `MarkOffline` would continue to try to check these disks forever. Close previous disks.	2023-11-09 09:33:32 -08:00
Krishnan Parthasarathi	0a25083fdb	Tiered objects require ns locks unlike inlined (#18409 )	2023-11-08 20:00:02 -08:00
Sveinn	15137d0327	refactor SFTP to use the new minio/pkg implementation (#18406 )	2023-11-08 09:47:05 -08:00
Poorna	8c9974bc0f	site replication: avoid propagating bucket b/w settings (#18399 ) replication mode and bucket bandwidth are one-way and should not be propagated to peer cluster. Regression from #18062	2023-11-08 00:40:25 -08:00
jiuker	079b6c2b50	fix: add err when all bucket resync failed (#18401 )	2023-11-08 00:40:08 -08:00
Harshavardhana	754f7a8a39	replace io.Discard usage to fix some NUMA copy() latencies (#18394 ) On NUMA systems copying from 8K buffer allocated via io.Discard leads to large latency build-up for every ``` copy(new8kbuf, largebuf) ``` which can incur up to 1ms worth of latencies on NUMA systems due to memory sharding across NUMA nodes.	2023-11-06 14:26:08 -08:00
Harshavardhana	64bafe1dfe	skip speedtest bucket from site-replication (#18393 )	2023-11-06 11:52:33 -08:00
jiuker	c3e456e7e6	fix: no resyncid when site-replication cancel (#18392 )	2023-11-06 01:53:31 -08:00
vicmunoz	da95a2d13f	fix: object versions metric help (#18388 )	2023-11-03 11:43:52 -07:00
Shireesh Anjal	cc5e05fdeb	Do not anonymize hostnames by default (#18387 ) Anonymize them only if the parameter `anonymize` is set to `strict`	2023-11-03 10:09:33 -07:00
jiuker	8a56af439c	fix: siteReplicationSys.startResync return no buckets return if error (#18374 )	2023-11-02 16:00:03 -07:00
Shireesh Anjal	f6e581ce54	Capture network device info in health report (#18381 )	2023-11-02 09:49:49 -07:00
Klaus Post	7472818d94	Fix hanging scanner saves (#18368 ) Fix various regressions from #18029 * If context is canceled the token is never returned. This will lead to scanner being unable to save and deadlocking. * Fix backup not being able to get any data (hr empty) * Reduce backup timeout.	2023-11-01 09:09:28 -07:00
Taran Pelkey	33322e6638	Change behavior of service account empty policies (#18346 ) * Fix embedded/implied policy behavior * assume implied policy if passed to empty * fix for all * Fix failing tests --------- Co-authored-by: Prakash Senthil Vel <23444145+prakashsvmx@users.noreply.github.com>	2023-10-31 12:30:36 -07:00
Daniel López Guimaraes	a1792ca0d1	fix: relax enforcing filename on PostPolicy (#18336 ) The filename is not required to be on the form data.	2023-10-30 21:06:32 -07:00
Harshavardhana	ac8c43fe9c	fix: allow missing hot-tier accounting (#18345 )	2023-10-30 14:42:11 -07:00
Allan Roger Reid	4d40ee00e9	Add check for reverse proxy setups (#18310 ) Add check for reverse proxy setups, to skip check for paths being served by different port on same address.	2023-10-30 10:49:04 -07:00
Adrian Najera	06f59ad631	fix: expiration time for share link when using OpenID (#18297 )	2023-10-30 10:21:34 -07:00
Harshavardhana	877e0cac03	fix: tiering statistics handling a bug in clone() implementation (#18342 ) Tiering statistics have been broken for some time now; a regression was introduced in `6f2406b0b6`. Bonus: fixes an issue where the objects are not assumed to be of the 'STANDARD' storage-class for the objects that have not yet tiered; this should be conditional based on the object's metadata, not a default assumption. This PR also does some cleanup in terms of implementation. Fixes #18070	2023-10-30 09:59:51 -07:00
Klaus Post	508710f4d1	Re-add duplicate upload id sanity check. (#18339 ) https://github.com/minio/minio/pull/18307 partially removed the duplicate upload id check. While I can't really see how ListDir can return duplicate entries, let's re-add it, since it is a cheap sanity check.	2023-10-29 08:33:30 -07:00
Matthew Toohey	c2fedb4c3f	fix: log targetID instead of Name when event error occurs (#18335 )	2023-10-28 08:32:57 -07:00
Poorna	03dc65e12d	Reload replication targets lazily if missing (#18333 ) There can be rare situations where errors seen in bucket metadata load on startup, or in subsequent metadata updates, result in missing replication remotes. Attempt a lazy refresh of remote targets backed by a good replication config at 5-minute intervals if remote targets ever go AWOL.	2023-10-27 21:08:53 -07:00
Praveen raj Mani	54aed421b8	fix: update the user cache while adding service accounts with expiry (#18320 )	2023-10-26 08:11:29 -07:00
jiuker	d5e8dac1cf	fix: canceling the heal caused goroutine to leak. (#18322 )	2023-10-26 07:53:06 -07:00
Poorna	96ec8fcba1	Preserve replica timestamps in multipart (#18318 ) Also a backward compatibility fix to use x-amz-replica-status if present as replication status.	2023-10-25 21:24:10 -07:00
Harshavardhana	0663eb69ed	fix: do not preserve mtime during CopyObject() metadata updates (#18316 ) mtime must be preserved only if destination mtime is set. fixes #18314	2023-10-25 14:30:56 -07:00
Harshavardhana	c60f54e5be	make ListMultipart/ListParts more reliable, skip healing disks (#18312 ) This PR also fixes old flaky tests by properly marking disk offline-based tests.	2023-10-24 23:33:25 -07:00
Harshavardhana	483389f2e2	set diskMaxConcurrent to 32 if nrRequests is lower	2023-10-24 17:21:12 -07:00
Harshavardhana	069d118329	fix: listObjectParts to prefer local and single disks (#18309 )	2023-10-24 13:51:57 -07:00
Harshavardhana	a7b1834772	fix: flaky and stupid tests in root lockdown (#18308 )	2023-10-24 13:22:44 -07:00
Klaus Post	6415dec37a	Improve multipart listing speed (#18307 )	2023-10-24 12:06:06 -07:00
Harshavardhana	2dc917e87f	maxConcurrent must be set only once per node (#18303 )	2023-10-23 21:42:36 -07:00
Aditya Manthramurthy	0a284a1a10	fix: SR: Add more info when IAM config differs (#18302 ) Provide details on what IAM info mismatched when the validation fails	2023-10-23 21:16:40 -07:00
Harshavardhana	5c8339e1e8	fix: veeam SOS API to higher layers (#18287 ) - support populating usage info from scanner info - support populating quota for the bucket via quota settings for the bucket	2023-10-23 13:55:45 -07:00
Harshavardhana	fd37418da2	fix: allow server not initialized error to be retried (#18300 ) Since relaxing the quorum error handling across pools for ListBuckets() and GetBucketInfo(), we can hit a situation where loading IAM returns a "server is not initialized" error for the second pool. We need to handle this: let the pool come online and retry transparently. This PR fixes that.	2023-10-23 12:30:20 -07:00
Harshavardhana	bbfea29c2b	use object modTime for the event sequencer ID (#18285 ) always set modTime after lock is acquired in completemultipart stage to make sure that the modTime is not racy.	2023-10-20 19:28:05 -07:00
Harshavardhana	aa703dc903	relax write quorum requirement for ListBuckets()/HeadBucket() (#18288 ) Also fix error handling for HeadBucket() to be pool specific	2023-10-20 17:50:21 -07:00
Harshavardhana	780882efcf	do not check for query params to be signed headers (#18283 ) x-amz-signed-headers is meant for HTTP headers only, not for query params; using it to verify query params can lead to failures. The generated presigned URL with custom metadata is already kosher (tamper proof). fixes #18281	2023-10-19 21:32:49 -07:00
Klaus Post	ba6218b354	fix: resource metrics "concurrent map iteration and map write" (#18273 ) `resourceMetricsMap` has no protection against concurrent reads and writes. Add a mutex and don't use maps from the last iteration. Bug introduced in #18057 Fixes #18271	2023-10-18 13:28:50 -07:00
Harshavardhana	8e32de3ba9	cache DiskInfo() metrics call separately (#18270 )	2023-10-18 11:17:32 -07:00
Klaus Post	e37508fb8f	fix: linter errors in Windows specific code (#18276 )	2023-10-18 11:08:15 -07:00
Klaus Post	b46a717425	Remove unused config migration (#18277 ) None of the migration is called. Remove dead code.	2023-10-18 11:05:24 -07:00
Klaus Post	7926df0b80	Fix globalDeploymentID race (#18275 ) globalDeploymentID was being read while it was being set. Fixes race: ``` WARNING: DATA RACE Write at 0x0000079605a0 by main goroutine: github.com/minio/minio/cmd.connectLoadInitFormats() github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0 github.com/minio/minio/cmd.waitForFormatErasure() github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d ... Previous read at 0x0000079605a0 by goroutine 105: github.com/minio/minio/cmd.newContext() github.com/minio/minio/cmd/utils.go:817 +0x31e github.com/minio/minio/cmd.adminMiddleware.func1() github.com/minio/minio/cmd/admin-router.go:110 +0x96 net/http.HandlerFunc.ServeHTTP() net/http/server.go:2136 +0x47 github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1() github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a net/http.HandlerFunc.ServeHTTP() net/http/server.go:2136 +0x47 ... ```	2023-10-18 08:06:57 -07:00
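
The race reported in the last entry above (a global read by request handlers while startup code is still writing it) can be illustrated with a minimal Go sketch. This is not the change made in #18275; the names `deploymentID`, `setDeploymentID` and `getDeploymentID` are hypothetical, and `atomic.Pointer` (Go 1.19+) is just one possible way to make such a global safe to read and write concurrently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// deploymentID is a hypothetical stand-in for a process-wide identifier that
// is written during startup and read by request handlers. Storing it behind
// an atomic pointer avoids an unsynchronized read/write pair on a plain string.
var deploymentID atomic.Pointer[string]

// setDeploymentID publishes a new identifier atomically.
func setDeploymentID(id string) {
	deploymentID.Store(&id)
}

// getDeploymentID returns the current identifier, or "" if none is set yet.
func getDeploymentID() string {
	if p := deploymentID.Load(); p != nil {
		return *p
	}
	return ""
}

func main() {
	var wg sync.WaitGroup

	// Readers may run before, during, or after the write below; each one
	// observes either "" or the full identifier, never a torn value.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getDeploymentID()
		}()
	}

	setDeploymentID("example-deployment-id")
	wg.Wait()

	fmt.Println(getDeploymentID())
}
```

Running a program like this with `go run -race` is how the Go race detector would flag an unsynchronized version of the same pattern; with the atomic pointer in place it should stay silent.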

5807 Commits