minio

mirror of https://github.com/minio/minio.git synced 2025-11-26 20:38:20 -05:00

Author	SHA1	Message	Date
Klaus Post	0a63dc199c	Add trace sizes to more trace types (#19864 ) Add trace sizes to * ILM traces * Replication traces * Healing traces * Decommission traces * Rebalance traces * (s)ftp traces * http traces.	2024-06-03 08:45:54 -07:00
Klaus Post	e72429c79c	Add sizes to traces (#19851 ) added to storage and grid traces. Can provide more context for traces that aren't HTTP. Others may apply.	2024-05-31 22:17:37 -07:00
Klaus Post	c5b3f5553f	Add per connection RPC metrics (#19852 ) Provides individual and aggregate stats for each RPC connection. Example: ``` "rpc": { "collectedAt": "2024-05-31T14:33:29.1373103+02:00", "connected": 30, "disconnected": 0, "outgoingStreams": 69, "incomingStreams": 0, "outgoingBytes": 174822796, "incomingBytes": 175821566, "outgoingMessages": 768595, "incomingMessages": 768589, "outQueue": 0, "lastPongTime": "2024-05-31T12:33:28Z", "byDestination": { "http://127.0.0.1:9001": { "collectedAt": "2024-05-31T14:33:29.1373103+02:00", "connected": 5, "disconnected": 0, "outgoingStreams": 2, "incomingStreams": 0, "outgoingBytes": 38432543, "incomingBytes": 66604052, "outgoingMessages": 229496, "incomingMessages": 229575, "outQueue": 0, "lastPongTime": "2024-05-31T12:33:27Z" }, "http://127.0.0.1:9002": { "collectedAt": "2024-05-31T14:33:29.1373103+02:00", "connected": 5, "disconnected": 0, "outgoingStreams": 6, "incomingStreams": 0, "outgoingBytes": 38215680, "incomingBytes": 66121283, "outgoingMessages": 228525, "incomingMessages": 228510, "outQueue": 0, "lastPongTime": "2024-05-31T12:33:27Z" }, ... ```	2024-05-31 22:16:24 -07:00
Klaus Post	d3ae0aaad3	Add max buffering to SFTP (#19848 ) Prevent OOM by adversarial use of SFTP upload by setting a 100MB max upload buffer.	2024-05-31 14:28:07 -07:00
Anis Eleuch	1277ad69a6	heal: Remove .healing.bin when all ES drives are healing (#19846 ) In the very rare case when all drives in an erasure set need to be healed, remove .healing.bin from all drives; otherwise it will be stuck in a loop. Also, fix a unit test that sometimes fails due to a wrong test.	2024-05-31 07:48:50 -07:00
Harshavardhana	8f93e81afb	change service account embedded policy size limit (#19840 ) Bonus: trim-off all the unnecessary spaces to allow for real 2048 characters in policies for STS handlers and re-use the code in all STS handlers.	2024-05-30 11:10:41 -07:00
Harshavardhana	4af31e654b	avoid pre-populating buffers for deployments < 32GiB memory (#19839 )	2024-05-30 04:58:12 -07:00
Harshavardhana	aad50579ba	fix: wire up ILM sub-system properly for help (#19836 )	2024-05-30 01:14:58 -07:00
Harshavardhana	38d059b0ae	fix: single node multi-drive must register local drives properly (#19832 ) Since #19688, there has been a regression in drive lookups for single node multi-drive setups; drive replacement would not work correctly without this PR.	2024-05-29 13:12:44 -07:00
Klaus Post	bd4eeb4522	Fix flipped EcM, EcN in metadata header (#19831 ) Since this is a tuple encoded field we can just flip the struct members.	2024-05-29 12:14:09 -07:00
jiuker	03e3493288	fix: correct parse the tagging error for PostPolicyBucketHandler (#19825 )	2024-05-29 11:50:46 -07:00
Harshavardhana	64baedf5a4	fix: hide prefixes for Hadoop properly (#19821 )	2024-05-28 15:53:15 -07:00
Anis Eleuch	f79a4ef4d0	policy: More defensive code validating svc:DurationSeconds (#19820 ) This does not fix any current issue, but merging https://github.com/minio/madmin-go/pull/282 can lose the validation of the service account expiration time. Add more defensive code for now. In the future, we should avoid doing validation in another library.	2024-05-28 10:19:04 -07:00
Taran Pelkey	2d53854b19	Restrict access keys for users and groups to not allow '=' or ',' (#19749 ) * initial commit * Add UTF check --------- Co-authored-by: Harshavardhana <harsha@minio.io>	2024-05-28 10:14:16 -07:00
jiuker	c904ef966e	feat: support tags for PostPolicy upload (#19816 )	2024-05-27 21:44:00 -07:00
Harshavardhana	e0fe7cc391	fix: information disclosure bug in preconditions GET (#19810 ) The precondition check was being honored before validating whether anonymous access is allowed on the metadata of an object, leading to disclosure of the following metadata headers. ``` Last-Modified Etag x-amz-version-id Expires: Cache-Control: ``` Although the information presented is minimal and opaque, it still discloses whether an object with a specific name exists, even to a caller without sufficient permissions.	2024-05-27 12:17:46 -07:00
Harshavardhana	9d20dec56a	Revert "remove dataErrs from er.deleteIfDangling code" This reverts commit `7d75b1e758`. The reverted change fails multipart tests; we need this code to handle existing challenges, so wait for the comprehensive fix.	2024-05-26 11:13:29 -07:00
Harshavardhana	597a785253	fix: authenticate LDAP via actual DN instead of normalized DN (#19805 ) Normalized DN is only for internal representation, not for external communication; any communication to LDAP must be based on the actual user DN. LDAP servers do not understand normalized DN. fixes #19757	2024-05-25 06:43:06 -07:00
Harshavardhana	7d75b1e758	remove dataErrs from er.deleteIfDangling code avoid this until a comprehensive change is merged such as https://github.com/minio/minio/pull/19797	2024-05-24 18:20:04 -07:00
Aditya Manthramurthy	5f78691fcf	ldap: Add user DN attributes list config param (#19758 ) This change uses the updated ldap library in minio/pkg (bumped up to v3). A new config parameter is added for LDAP configuration to specify extra user attributes to load from the LDAP server and to store them as additional claims for the user. A test is added in sts_handlers.go that shows how to access the LDAP attributes as a claim. This is in preparation for adding SSH pubkey authentication to MinIO's SFTP integration.	2024-05-24 16:05:23 -07:00
Shireesh Anjal	a591e06ae5	Add cluster scanner metrics in metrics-v3 (#19517 ) endpoint: /minio/metrics/v3/cluster/scanner metrics: - bucket_scans_finished (counter) - bucket_scans_started (counter) - directories_scanned (counter) - last_activity_nano_seconds (gauge) - objects_scanned (counter) - versions_scanned (counter)	2024-05-24 12:29:25 -07:00
Harshavardhana	443c93c634	compute time spent in ILM properly (#19806 )	2024-05-24 12:28:51 -07:00
Shireesh Anjal	5659cddc84	Add cluster config metrics in metrics-v3 (#19507 ) endpoint: /minio/metrics/v3/cluster/config metrics: - write_quorum - rrs_parity - standard_parity	2024-05-24 05:50:46 -07:00
Shireesh Anjal	673a521711	Change endpoint of v3 notification metrics (#19804 ) from /cluster/notification to /notification	2024-05-24 04:10:24 -07:00
Shireesh Anjal	7981509cc8	Add cluster and bucket replication metrics in metrics-v3 (#19546 ) endpoint: /minio/metrics/v3/cluster/replication metrics: - average_active_workers - average_queued_bytes - average_queued_count - average_transfer_rate - current_active_workers - current_transfer_rate - last_minute_queued_bytes - last_minute_queued_count - max_active_workers - max_queued_bytes - max_queued_count - max_transfer_rate - recent_backlog_count endpoint: /minio/metrics/v3/api/bucket/replication metrics: - last_hour_failed_bytes - last_hour_failed_count - last_minute_failed_bytes - last_minute_failed_count - latency_ms - proxied_delete_tagging_requests_total - proxied_get_requests_failures - proxied_get_requests_total - proxied_get_tagging_requests_failures - proxied_get_tagging_requests_total - proxied_head_requests_failures - proxied_head_requests_total - proxied_put_tagging_requests_failures - proxied_put_tagging_requests_total - sent_bytes - sent_count - total_failed_bytes - total_failed_count - proxied_delete_tagging_requests_failures	2024-05-23 00:41:18 -07:00
Harshavardhana	d38e020b29	remove errant logs for disconnected remote (#19793 ) Signed-off-by: Harshavardhana <harsha@minio.io>	2024-05-22 18:12:23 -07:00
Poorna	7d29030292	fix list results returned for spark max-keys=2 listing (#19791 ) This PR continues fix #19725 for some unhandled cases	2024-05-22 16:16:34 -07:00
Shubhendu	7c7650b7c3	Add sufficient deadlines and countermeasures to handle hung node scenario (#19688 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2024-05-22 16:07:14 -07:00
Harshavardhana	ca80eced24	usage of deadline conn at Accept() breaks websocket (#19789 ) Fortunately this is not wired up for use; however, if anyone enables deadlines for conn, MinIO startups fail sporadically.	2024-05-22 10:49:27 -07:00
jiuker	9906b3ade9	fix: reject ilm rule when bucket LockEnabled (#19785 )	2024-05-21 23:50:03 -07:00
Anis Eleuch	bf1769d3e0	xl: Avoid marking a drive offline after one part read failure (#19779 ) This commit fixes a rare case in which a multipart object could in theory be read but the GetObject API returned an error. It turned out that six-year-old code was marking a drive offline whenever bitrot streaming failed to read a part from a disk with any error. This can affect reading a subsequent part: enough shards exist, but the part cannot be reconstructed because one drive was marked offline earlier. This commit removes the code that marks the drive offline. It also closes the bitrotstreaming reader before setting it to nil.	2024-05-21 07:36:21 -07:00
Harshavardhana	63e1ad9f29	fix: the user-agent for Veeam	2024-05-20 11:54:52 -07:00
Harshavardhana	1fd90c93ff	re-use StorageAPI while loading drive formats (#19770 ) Bonus: safe settings for deployment ID to avoid races	2024-05-19 01:06:49 -07:00
Krishnan Parthasarathi	1228d6bf1a	Return NumVersions in quorum when available (#19766 ) Similar to https://github.com/minio/minio/pull/17925	2024-05-17 13:57:37 -07:00
Shireesh Anjal	fc4561c64c	Start callhome immediately after enabling (#19764 ) Currently, on enabling callhome (or restarting the server), the callhome job gets scheduled. This means that one has to wait for 24hrs (the default frequency duration) to see it in action and to figure out if it is working as expected. It will be a better user experience to perform the first callhome execution immediately after enabling it (or on server start if already enabled). Also, generate audit event on callhome execution, setting the error field in case the execution has failed.	2024-05-17 09:53:34 -07:00
Klaus Post	3b7747b42b	Tweak multipart uploads (#19756 ) * Store ModTime in the upload ID; return it when listing instead of the current time. * Use this ModTime to expire and skip reading the file info. * Consistent upload sorting in listing (since it now has the ModTime). * Exclude healing disks to avoid returning an empty list.	2024-05-17 09:40:09 -07:00
Harshavardhana	e432e79324	avoid calling 'admin info' for disk, cpu, net metrics collection (#19762 ) resource metrics collection was incorrectly making fan-out liveness peer calls where it's not needed.	2024-05-17 08:15:13 -07:00
Harshavardhana	08d74819b6	handle racy updates to globalSite config (#19750 ) ``` ================== WARNING: DATA RACE Read at 0x0000082be990 by goroutine 205: github.com/minio/minio/cmd.setCommonHeaders() Previous write at 0x0000082be990 by main goroutine: github.com/minio/minio/cmd.lookupConfigs() ```	2024-05-16 16:13:47 -07:00
Poorna	aa3fde1784	Add ListObjectsV2 unit test (#19753 ) for PR: #19725	2024-05-15 20:40:51 -07:00
Harshavardhana	0b3eb7f218	add more deadlines and pass around context under most situations (#19752 )	2024-05-15 15:19:00 -07:00
Klaus Post	b792b36495	Add Veeam storage class override (#19748 ) Recent Veeam is very picky about storage class names. Add `_MINIO_VEEAM_FORCE_SC` env var. It will override the storage class returned by the storage backend if it is non-standard and we detect a Veeam client by checking the User Agent. Applies to HeadObject/GetObject/ListObject*	2024-05-15 11:04:16 -07:00
Harshavardhana	d3db7d31a3	fix: add deadlines for all synchronous REST callers (#19741 ) add deadlines that can be dynamically changed via the drive max timeout values. Bonus: optimize "file not found" case and hung drives/network - circuit break the check and return right away instead of waiting.	2024-05-15 09:52:29 -07:00
Shireesh Anjal	c05ca63158	Fix crash on /minio/metrics/v3?list (#19745 ) An unchecked map access was causing panic.	2024-05-15 09:06:35 -07:00
Shireesh Anjal	0e59e50b39	Capture ttfb api metrics only for GetObject (#19733 ) as that is the only API where the TTFB metric is beneficial, and capturing this for all APIs exponentially increases the response size in large clusters.	2024-05-14 23:25:13 -07:00
Klaus Post	d4b391de1b	Add PutObject Ring Buffer (#19605 ) Replace the `io.Pipe` from streamingBitrotWriter -> CreateFile with a fixed size ring buffer. This will add an output buffer for encoded shards to be written to disk - potentially via RPC. This will remove blocking when `(*streamingBitrotWriter).Write` is called, and it writes hashes and data. With current settings, the write looks like this:
```
Outbound
┌───────────────────┐             ┌────────────────┐               ┌───────────────┐                      ┌────────────────┐
│                   │   Parr.     │                │  (http body)  │               │                      │                │
│ Bitrot Hash       │     Write   │      Pipe      │      Read     │  HTTP buffer  │    Write (syscall)   │   TCP Buffer   │
│ Erasure Shard     │ ──────────► │  (unbuffered)  │ ────────────► │   (64K Max)   │ ───────────────────► │     (4MB)      │
│                   │             │                │               │  (io.Copy)    │                      │                │
└───────────────────┘             └────────────────┘               └───────────────┘                      └────────────────┘
```
We write a Hash (32 bytes). Since the pipe is unbuffered, it will block until the 32 bytes have been delivered to the TCP buffer, and the next Read hits the Pipe. Then we write the shard data. This will typically be bigger than 64KB, so it will block until two blocks have been read from the pipe. When we insert a ring buffer:
```
Outbound
┌───────────────────┐             ┌────────────────┐               ┌───────────────┐                      ┌────────────────┐
│                   │             │                │  (http body)  │               │                      │                │
│ Bitrot Hash       │     Write   │  Ring Buffer   │      Read     │  HTTP buffer  │    Write (syscall)   │   TCP Buffer   │
│ Erasure Shard     │ ──────────► │     (2MB)      │ ────────────► │   (64K Max)   │ ───────────────────► │     (4MB)      │
│                   │             │                │               │  (io.Copy)    │                      │                │
└───────────────────┘             └────────────────┘               └───────────────┘                      └────────────────┘
```
The hash+shard will fit within the ring buffer, so writes will not block - but will complete after a memcopy. Reads can fill the 64KB buffer if there is data for it. If the network is congested, the ring buffer will become filled, and all syscalls will be on full buffers. Only when the ring buffer is filled will erasure coding start blocking.
Since there is always "space" to write output data, we remove the parallel writing since we are always writing to memory now, and the goroutine synchronization overhead is probably not worth taking. If the output were blocked in the existing code, we would still wait for it to unblock in parallel write, so it would make no difference there - except now the ring buffer smooths out the load. There are some micro-optimizations we could look at later. The biggest is that, in most cases, we could encode directly to the ring buffer - if we are not at a boundary. Also, "force filling" the Read requests (i.e., blocking until a full read can be completed) could be investigated and maybe allow concurrent memory on read and write.	2024-05-14 17:11:04 -07:00
Olli Janatuinen	534e7161df	SFTP: Correctly inform client about unsupported commands (#19735 )	2024-05-14 03:29:30 -07:00
Harshavardhana	9b219cd646	fix: return quorum based error, temporary failures must be ignored (#19732 )	2024-05-14 03:29:17 -07:00
Shireesh Anjal	3bab4822f3	Add logger webhook metrics in metrics-v3 (#19515 ) endpoint: /minio/metrics/v3/cluster/webhook metrics: - failed_messages (counter) - online (gauge) - queue_length (gauge) - total_messages (counter)	2024-05-14 00:27:33 -07:00
coderwander	3c5f2d8916	fix some typo in struct name comments (#19513 ) Signed-off-by: coderwander <770732124@qq.com>	2024-05-14 00:26:50 -07:00
Shireesh Anjal	5808190398	Add more metrics to v3/cluster/erasure-set (#19714 ) Metrics being added: - read_tolerance: No of drive failures that can be tolerated without disrupting read operations - write_tolerance: No of drive failures that can be tolerated without disrupting write operations - read_health: Health of the erasure set in a pool for read operations (1=healthy, 0=unhealthy) - write_health: Health of the erasure set in a pool for write operations (1=healthy, 0=unhealthy)	2024-05-14 00:25:56 -07:00


6275 Commits