minio

mirror of https://github.com/minio/minio.git synced 2025-05-03 07:50:32 -04:00

Author	SHA1	Message	Date
Harshavardhana	960d604013	disconnected returns, an unexpected error to List() returning 500s (#18959 ) provide the error string appropriately so that the matching of error types works. Also add a string based fallback for the said error.	2024-02-03 01:04:33 -08:00
Harshavardhana	ff80cfd83d	move Make,Delete,Head,Heal bucket calls to websockets (#18951 )	2024-02-02 14:54:54 -08:00
Harshavardhana	99fde2ba85	deprecate disk tokens, instead rely on deadlines and active monitoring (#18947 ) disk tokens usage is not necessary anymore with the implementation of deadlines for storage calls and active monitoring of the drive for I/O timeouts. Functionality kicking off a bad drive is still supported, it's just that we do not have to serialize I/O in the manner tokens would do.	2024-02-02 10:10:54 -08:00
Frank Wessels	31743789dc	Fix some leftover issues from PR 18936 (#18946 )	2024-02-01 19:42:56 -08:00
Anis Eleuch	6fd63e920a	log: Use error log type instead of Application/MinIO type (#18930 ) Also bump github.com/shirou/gopsutil version to address cross compilation issues. Co-authored-by: Anis Eleuch <anis@min.io> Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2024-02-01 16:13:57 -08:00
Aditya Manthramurthy	59cc3e93d6	fix: `null` inline policy handling for access keys (#18945 ) Interpret `null` inline policy for access keys as inheriting parent policy. Since MinIO Console currently sends this value, we need to honor it for now. A larger fix in Console and in the server are required. Fixes #18939.	2024-02-01 14:45:03 -08:00
Anis Eleuch	61a4bb38cd	batch: Fix a typo while validating smallerThan field (#18942 )	2024-02-01 13:53:26 -08:00
Klaus Post	b192bc348c	Improve object reuse for grid messages (#18940 ) Allow internal types to support a `Recycler` interface, which will allow for sharing of common types across handlers. This means that all `grid.MSS` (and similar) objects are shared across in a common pool instead of a per-handler pool. Add internal request reuse of internal types. Add for safe (pointerless) types explicitly. Only log params for internal types. Doing Sprint(obj) is just a bit too messy.	2024-02-01 12:41:20 -08:00
Harshavardhana	6440d0fbf3	move a collection of peer APIs to websockets (#18936 )	2024-02-01 10:47:20 -08:00
Anis Eleuch	24ecc44bac	Keep ServiceV1 admin stop/restart API and mark as deprecated (#18932 )	2024-01-31 12:20:33 -08:00
Aditya Manthramurthy	0ae4915a93	fix: permission checks for editing access keys (#18928 ) With this change, only a user with `UpdateServiceAccountAdminAction` permission is able to edit access keys. We would like to let a user edit their own access keys, however the feature needs to be re-designed for better security and integration with external systems like AD/LDAP and OpenID. This change prevents privilege escalation via service accounts.	2024-01-31 10:56:45 -08:00
Harshavardhana	caac9d216e	remove all the frivolous logs, that may or may not be actionable (#18922 ) for actionable inspections we have `mc support inspect`; we do not need double logging. Healing will report relevant errors if any, in terms of quorum lost etc.	2024-01-30 18:11:45 -08:00
Harshavardhana	057192913c	add total usable capacity, free and used to DataUsageInfo() (#18921 )	2024-01-30 17:49:37 -08:00
Harshavardhana	f25cbdf43c	use all the available nr_requests for NVMe (#18920 )	2024-01-30 14:10:06 -08:00
Klaus Post	6da4a9c7bb	Improve tracing & notification scalability (#18903 ) * Perform JSON encoding on remote machines and only forward byte slices. * Migrate tracing & notification to WebSockets.	2024-01-30 12:49:02 -08:00
Harshavardhana	80ca120088	remove checkBucketExist check entirely to avoid fan-out calls (#18917 ) Put, List, and Multipart operations each rely heavily on making a GetBucketInfo() call to verify whether the bucket exists on a regular basis. This has a large performance cost when there are tons of servers involved. We did optimize this part by vectorizing the bucket calls, however it's not enough; beyond 100 nodes this becomes fairly visible in terms of performance.	2024-01-30 12:43:25 -08:00
Anis Eleuch	a669946357	Add cgroup v2 support for memory limit (#18905 )	2024-01-30 11:13:27 -08:00
Poorna	7ffc162ea8	exclude veeam virtual objects from replication (#18918 ) Fixes: #18916	2024-01-30 10:43:58 -08:00
Poorna	bcfd7fbbcf	reuse transports for callhome and remote tgt validation (#18912 )	2024-01-29 23:05:39 -08:00
Harshavardhana	486e2e48ea	enable xattr capture by default (#18911 ) - healing must not set the write xattr because that is the job of active healing to update. what we need to preserve is permanent deletes. - remove older env for drive monitoring and enable it accordingly, as a global value.	2024-01-29 23:03:58 -08:00
Harshavardhana	2ddf2ca934	allow configuring maximum idle connections per host (#18908 )	2024-01-29 16:50:37 -08:00
Poorna	29b1a29044	fix metrics panic in node metrics endpoint (#18894 )	2024-01-29 12:32:44 -08:00
jiuker	b4ab8e095a	fix: preserve bucket metric of data usage for replication info (#18895 )	2024-01-29 08:54:20 -08:00
Harshavardhana	cff8235068	remove getReplicationNodeMetrics() from peer metrics groups	2024-01-28 18:45:20 -08:00
Harshavardhana	944f3c1477	remove local disk metrics from cluster metrics (#18886 ) local disk metrics were polluting cluster metrics Please remove them instead of adding relevant ones. - batch job metrics were incorrectly kept at bucket metrics endpoint, move it to cluster metrics. - add tier metrics to cluster peer metrics from the node. - fix missing set level cluster health metrics	2024-01-28 12:53:59 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace for 'nil' channel closes(), if not safely closes it.	2024-01-28 10:04:17 -08:00
Harshavardhana	6347fb6636	add missing proper error return in WalkDir() (#18884 ) without this the caller might end up returning incorrect errors and not ignoring the drive properly.	2024-01-27 16:13:41 -08:00
Harshavardhana	32e668eb94	update() stale rebalance stats() object during pool expansion (#18882 ) it is entirely possible that a rebalance process that was running when it was asked to "stop" failed to write its last statistics to the disk. After this a pool expansion can cause disruption and all S3 API calls would fail at IsPoolRebalancing() function. This PR makes sure that we update rebalance.bin under such conditions to avoid any runtime crashes.	2024-01-27 10:14:03 -08:00
Harshavardhana	c88308cf0e	avoid 'panic' on mc admin update for single drive setup (#18876 )	2024-01-26 12:07:03 -08:00
Harshavardhana	88837fb753	add new update v2 that updates per node, allows idempotent behavior (#18859 ) new API ensures that - binary is correct and can be downloaded, checksummed, and verified - committed to actual path - restart returns back the relevant waiting drives	2024-01-26 08:40:13 -08:00
Harshavardhana	d0283ff354	remove unnecessary logs in HealBucket() (#18875 )	2024-01-26 08:39:57 -08:00
Harshavardhana	f449a7ae2c	allow bucket import to be idempotent (#18873 ) do not need to be defensive in our approach, we should simply override anything and everything in import process, do not care about what currently exists on the disk - backup is the source of truth.	2024-01-25 17:20:54 -08:00
Klaus Post	a113b2c394	Fix inspect format.json exclusion (#18871 ) Right now the format.json is excluded if anything within `.minio.sys` is requested. I assume the check was meant to exclude only if it was actually requesting it.	2024-01-25 15:59:00 -08:00
Harshavardhana	74851834c0	further bootstrap/startup optimization for reading 'format.json' (#18868 ) - Move RenameFile to websockets - Move ReadAll, which is primarily used for reading 'format.json', to websockets - Optimize DiskInfo calls, and provide a way to make a NoOp DiskInfo call.	2024-01-25 12:45:46 -08:00
Harshavardhana	e377bb949a	migrate bootstrap logic directly to websockets (#18855 ) improve performance for startup sequences by 2x for 300+ nodes.	2024-01-24 13:36:44 -08:00
Poorna	b6e9d235fe	fix replication error logs to include target endpoint (#18863 )	2024-01-24 13:05:43 -08:00
Klaus Post	4a6c97463f	Fix all racy use of NewDeadlineWorker (#18861 ) Almost all uses of NewDeadlineWorker, which relied on secondary values, were used in a racy fashion, which could lead to inconsistent errors/data being returned. It also propagates the deadline downstream. Rewrite all these to use a generic WithDeadline caller that can return an error alongside a value. Remove the stateful aspect of DeadlineWorker - it was racy if used - but it wasn't AFAICT. Fixes races like: ``` WARNING: DATA RACE Read at 0x00c130b29d10 by goroutine 470237: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).ReadVersion() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:702 +0x611 github.com/minio/minio/cmd.readFileInfo() github.com/minio/minio/cmd/erasure-metadata-utils.go:160 +0x122 github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.1() github.com/minio/minio/cmd/erasure-object.go:809 +0x27a github.com/minio/minio/cmd.erasureObjects.getObjectFileInfo.func1.2() github.com/minio/minio/cmd/erasure-object.go:828 +0x61 Previous write at 0x00c130b29d10 by goroutine 470298: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).ReadVersion.func1() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:698 +0x244 github.com/minio/minio/internal/ioutil.(DeadlineWorker).Run.func1() github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33 WARNING: DATA RACE Write at 0x00c0ba6e6c00 by goroutine 94507: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).StatVol.func1() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:419 +0x104 github.com/minio/minio/internal/ioutil.(DeadlineWorker).Run.func1() github.com/minio/minio/internal/ioutil/ioutil.go:141 +0x33 Previous read at 0x00c0ba6e6c00 by goroutine 94463: github.com/minio/minio/cmd.(xlStorageDiskIDCheck).StatVol() github.com/minio/minio/cmd/xl-storage-disk-id-check.go:422 +0x47e github.com/minio/minio/cmd.getBucketInfoLocal.func1() github.com/minio/minio/cmd/peer-s3-server.go:275 +0x122 github.com/minio/pkg/v2/sync/errgroup.(*Group).Go.func1() ``` Probably back from #17701	2024-01-24 10:08:31 -08:00
Frank Wessels	6c912ac960	Fix startup message when using single path (#18856 )	2024-01-24 10:02:56 -08:00
Harshavardhana	708cebe7f0	add necessary protection err, fileInfo slice reads and writes (#18854 ) protection was in place. However, it covered only some areas, so we re-arranged the code to ensure we could hold locks properly. Along with this, remove the DataShardFix code altogether, in deployments with many drive replacements, this can affect and lead to quorum loss.	2024-01-24 01:08:23 -08:00
Harshavardhana	f78d677ab6	pre-allocate EC memory by default at startup (#18846 )	2024-01-23 20:41:11 -08:00
Poorna	e39e2306d6	site replication: remove extraneous log for missing group (#18785 )	2024-01-23 18:28:11 -08:00
Harshavardhana	52229a21cb	avoid reload of 'format.json' over the network under normal conditions (#18842 )	2024-01-23 14:11:46 -08:00
Harshavardhana	961f7dea82	compress binary while sending it to all the nodes (#18837 ) Also limit the amount of concurrency when sending binary updates to peers, avoid high network over TX that can cause disconnection events for the node sending updates.	2024-01-22 12:16:36 -08:00
Shubhendu	65c4d550cb	Distribution bucket metrics with site replication (#18841 ) If site replication is enabled, we should still show the size and version distribution histogram metrics at bucket level. Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2024-01-22 08:45:36 -08:00
Harshavardhana	f9b4a8d6e8	improve server update behavior by re-using memory properly (#18831 )	2024-01-19 18:27:58 -08:00
Harshavardhana	e11d851aee	add new drive I/O waiting/tokens metric (#18836 ) Bonus: add virtual memory used as well part of the system resource metrics.	2024-01-19 14:51:36 -08:00
Harshavardhana	ac81f0248c	introduce new ServiceV2 API to handle guided restarts (#18826 ) New API now verifies any hung disks before restart/stop, provides a 'per node' break down of the restart/stop results. Provides also how many blocked syscalls are present on the drives and what users must do about them. Adds options to do pre-flight checks to provide information to the user regarding any hung disks. Provides 'force' option to forcibly attempt a restart() even with waiting syscalls on the drives.	2024-01-19 14:22:36 -08:00
Aditya Manthramurthy	cc960adbee	fix: remove policy mapping file when empty (#18828 ) On a policy detach operation, if there are no policies remaining attached to the user/group, remove the policy mapping file, instead of leaving a file containing an empty list of policies.	2024-01-19 10:31:40 -08:00
Shubhendu	19387cafab	Use +Inf label additionally for Histogram metrics (#18807 )	2024-01-18 14:51:28 -08:00
Harshavardhana	7c0673279b	capture I/O in waiting and total tokens in diskMetrics (#18819 ) This is needed for the subsequent changes in ServerUpdate(), ServerRestart() etc.	2024-01-18 11:17:43 -08:00


5782 Commits