minio

mirror of https://github.com/minio/minio.git synced 2025-11-25 12:06:10 -05:00

Author	SHA1	Message	Date
Harshavardhana	ba54b39c02	fix: crash when audit webhook queue_dir is not writable (#19854 ) This is regression introduced in #19275 refactor	2024-06-01 20:03:39 -07:00
Anis Eleuch	2a75225569	kafka: _MINIO_KAFKA_DEBUG to enable sarama debug messages (#19849 )	2024-06-01 08:02:59 -07:00
Aditya Manthramurthy	5f78691fcf	ldap: Add user DN attributes list config param (#19758 ) This change uses the updated ldap library in minio/pkg (bumped up to v3). A new config parameter is added for LDAP configuration to specify extra user attributes to load from the LDAP server and to store them as additional claims for the user. A test is added in sts_handlers.go that shows how to access the LDAP attributes as a claim. This is in preparation for adding SSH pubkey authentication to MinIO's SFTP integration.	2024-05-24 16:05:23 -07:00
Harshavardhana	8c1bba681b	add logrotate support for MinIO logs (#19641 )	2024-05-01 10:57:52 -07:00
Anis Eleuch	95bf4a57b6	logging: Add subsystem to log API (#19002 ) Create new code paths for multiple subsystems in the code. This will make maintaining this easier later. Also introduce bugLogIf() for errors that should not happen in the first place.	2024-04-04 05:04:40 -07:00
Sveinn	1fc4203c19	Webhook targets refactor and bug fixes (#19275 ) - old version was unable to retain messages during config reload - old version could not go from memory to disk during reload - new version can batch disk queue entries to reduce I/O load - error logging has been improved, previous version would miss certain errors. - logic for spawning/despawning additional workers has been adjusted to trigger when half capacity is reached, instead of when the log queue becomes full. - old version would JSON marshal 2x and unmarshal 1x for every log item. Now we only marshal 1x and then we GetRaw from the store and send it without having to re-marshal.	2024-03-25 09:44:20 -07:00
Harshavardhana	233cc3905a	add batchSize support for webhook endpoints (#19214 ) configure batch size to send audit/logger events in batches instead of sending one event per connection. this is mainly to optimize the number of requests we make to webhook endpoint.	2024-03-07 12:17:46 -08:00
Harshavardhana	e91a4a414c	merge startHTTPLogger() many callers into a simpler pattern (#19211 ) simplify audit webhook worker model; fixes a couple of bugs like - ping(ctx) was creating a logger without updating number of workers leading to incorrect nWorkers scaling, causing an additional worker that is not tracked properly. - h.logCh <- entry could potentially hang when the queue is full on heavily loaded systems.	2024-03-06 08:09:46 -08:00
Harshavardhana	74ccee6619	avoid too much auditing during decom/rebalance make it more robust (#19174 ) there can be a sudden spike in tiny allocations, due to too much auditing being done, also don't hang on the ``` h.logCh <- entry ``` after initializing workers if you do not have a way to dequeue for some reason.	2024-03-06 03:43:16 -08:00
Harshavardhana	53aa8f5650	use typos instead of codespell (#19088 )	2024-02-21 22:26:06 -08:00
Harshavardhana	cd419a35fe	simplify broker healthcheck by following kafka guidelines (#19082 ) fixes #19081	2024-02-20 00:16:35 -08:00
Anis Eleuch	68dde2359f	log: Add logger.Event to send to console and other logger targets (#19060 ) Add a new function logger.Event() to send the log to Console and http/kafka log webhooks. This will include some internal events such as disk healing and rebalance/decommissioning	2024-02-15 15:13:30 -08:00
Anis Eleuch	6fd63e920a	log: Use error log type instead of Application/MinIO type (#18930 ) * log: Use error log type instead of Application/MinIO type Also bump github.com/shirou/gopsutil version to address cross compilation issues. * Apply suggestions from code review Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com> --------- Co-authored-by: Anis Eleuch <anis@min.io> Co-authored-by: Harshavardhana <harsha@minio.io> Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>	2024-02-01 16:13:57 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace for 'nil' channel closes(), if not safely closes it.	2024-01-28 10:04:17 -08:00
Praveen raj Mani	c905d3fe21	fix: Re-use TCP connections for Kafka dials (#18860 ) Fixes #18857	2024-01-24 13:10:52 -08:00
Harshavardhana	dd2542e96c	add codespell action (#18818 ) Original work here, #18474, refixed and updated.	2024-01-17 23:03:17 -08:00
Klaus Post	51aa59a737	perf: websocket grid connectivity for all internode communication (#18461 ) This PR adds a WebSocket grid feature that allows servers to communicate via a single two-way connection. There are two request types: * Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small roundtrips with small payloads. * Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`, which allows for different combinations of full two-way streams with an initial payload. Only a single stream is created between two machines - and there is, as such, no server/client relation since both sides can initiate and handle requests. Which server initiates the request is decided deterministically on the server names. Requests are made through a mux client and server, which handles message passing, congestion, cancelation, timeouts, etc. If a connection is lost, all requests are canceled, and the calling server will try to reconnect. Registered handlers can operate directly on byte slices or use a higher-level generics abstraction. There is no versioning of handlers/clients, and incompatible changes should be handled by adding new handlers. The request path can be changed to a new one for any protocol changes. First, all servers create a "Manager." The manager must know its address as well as all remote addresses. This will manage all connections. To get a connection to any remote, ask the manager to provide it given the remote address using. ``` func (m Manager) Connection(host string) Connection ``` All serverside handlers must also be registered on the manager. This will make sure that all incoming requests are served. The number of in-flight requests and responses must also be given for streaming requests. The "Connection" returned manages the mux-clients. Requests issued to the connection will be sent to the remote. 
* `func (c Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)` performs a single request and returns the result. Any deadline provided on the request is forwarded to the server, and canceling the context will make the function return at once. `func (c Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st Stream, err error)` will initiate a remote call and send the initial payload. ```Go // A Stream is a two-way stream. // All responses must be read by the caller. // If the call is canceled through the context, //The appropriate error will be returned. type Stream struct { // Responses from the remote server. // Channel will be closed after an error or when the remote closes. // All responses must be read by the caller until either an error is returned or the channel is closed. // Canceling the context will cause the context cancellation error to be returned. Responses <-chan Response // Requests sent to the server. // If the handler is defined with 0 incoming capacity this will be nil. // Channel must be closed to signal the end of the stream. // If the request context is canceled, the stream will no longer process requests. Requests chan<- []byte } type Response struct { Msg []byte Err error } ``` There are generic versions of the server/client handlers that allow the use of type safe implementations for data types that support msgpack marshal/unmarshal.	2023-11-20 17:09:35 -08:00
Anis Eleuch	12f570a307	audit: Try to send audit even if the status is offline (#18458 ) Currently, once the audit becomes offline, there is no code that tries to reconnect to the audit; at the same time, Send() quickly returns with an error without really trying to send a message to the audit endpoint, so the audit endpoint will never be online again. Fixing this behavior; the current downside is that we miss printing some logs when the audit becomes offline; however, this information is available in prometheus. Later, we can refactor internal/logger so the http endpoint can send errors to console target.	2023-11-17 10:40:28 -08:00
Anis Eleuch	6ef8e87492	Support case insensitive kafka SASL mechanism config values (#18398 )	2023-11-08 20:04:01 -08:00
Shubhendu	5b9656374c	Error if target went offline (#18221 ) If target went offline while MinIO was down, error once while trying to send message. If target goes offline during MinIO server running, it already comes through ping() call and errors out if target offline. Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-10-12 06:13:57 -07:00
Praveen raj Mani	c27d0583d4	Send kafka notification messages in batches when queue_dir is enabled (#18164 ) Fixes #18124	2023-10-07 08:07:38 -07:00
Shubhendu	10d5dd3a67	fix: a regression with audit log sending (#18112 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-09-26 12:23:02 -07:00
Anis Eleuch	4eeb48f8e0	Return cached online/offline status for audit/http loggers (#18083 ) To avoid having delays in prometheus scrape and in 'mc admin info' command.	2023-09-21 16:58:24 -07:00
Harshavardhana	1472875670	fix: failed messages counting in audit_http metrics (#18075 ) all retries must not be counted as failed messages, a failed message is a single counter not for all retries, this PR fixes this. Also we do not need to retry 10-times, instead we should retry at max 3 times with some jitter to deliver the messages.	2023-09-21 11:24:56 -07:00
Aditya Manthramurthy	1c99fb106c	Update to minio/pkg/v2 (#17967 )	2023-09-04 12:57:37 -07:00
Anis Eleuch	6a8d8f34a5	kafka: Do not require key when sending a message (#17962 ) Keys are helpful to ensure the strict ordering of messages; however, currently the code uses a random request ID for every log, hence using the request ID as a Kafka key does not serve any purpose. This commit removes the usage of the key, which also fixes the audit issue from internal subsystems that do not have a request ID.	2023-09-01 08:37:22 -07:00
Harshavardhana	adb8be069e	tune-kafka targets to ensure timeout triggers on hung brokers (#17898 ) hung brokers can cause slowness to the entire system when many callers are hung, leading to large goroutine build-up.	2023-08-22 20:26:35 -07:00
Harshavardhana	11dfc817f3	do not log client canceled events (#17838 )	2023-08-17 14:53:43 -07:00
Praveen raj Mani	0285df5a02	fix: prioritize audit_webhook and logger_webhook ENVs over the config KVS (#17783 )	2023-08-03 02:47:07 -07:00
Anis Eleuch	9c0e8cd15b	logger: Avoid slow calls in http logger Send() function (#17747 ) Send() is synchronous and can affect the latency of S3 requests when the logger buffer is full. Avoid checking if the HTTP target is online or not and increase the workers anyway since the buffer is already full. Also, avoid logs flooding when the audit target is down.	2023-07-29 12:49:18 -07:00
Harshavardhana	dbd4c2425e	fix: kafka broker pings must not be greater than 1sec (#17376 )	2023-06-07 11:47:00 -07:00
Krishnan Parthasarathi	55a3310446	logger-http: Don't retry after a successful send (#17266 )	2023-05-22 14:53:18 -07:00
jiuker	41fa8fa2d2	fix: increment counter when an entry is skipped (#17237 )	2023-05-19 08:36:52 -07:00
Praveen raj Mani	85912985b6	Check for only network errors in audit webhook for reachability (#17228 )	2023-05-17 11:10:33 -07:00
Praveen raj Mani	57acacd5a7	Support persistent queue store for loggers (#17121 )	2023-05-08 21:20:31 -07:00
jiuker	b28d391a22	fix: add correct worker count before startHTTPLogger() (#17091 )	2023-04-27 10:51:16 -07:00
Harshavardhana	8a9b9832fd	add Dial timeout for Kafka broker pings (#17044 )	2023-04-17 15:45:01 -07:00
Klaus Post	11d04279c8	Add lazy init of audit logger (#16842 )	2023-03-21 10:50:40 -07:00
ferhat elmas	714283fae2	cleanup ignored static analysis (#16767 )	2023-03-06 08:56:10 -08:00
Daniel Valdivia	fb17f97cf3	move audit and logger message structure to minio/pkg (#16655 ) Signed-off-by: Daniel Valdivia <18384552+dvaldivia@users.noreply.github.com>	2023-02-21 21:21:17 -08:00
Shubhendu	6b65ba1551	Added attribute proxy for `mc admin config set ALIAS logger_webhook` (#16657 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>	2023-02-21 21:19:46 -08:00
Anis Elleuch	939c0100a6	log: Do not interpret verbs in object names in console output (#16233 )	2022-12-13 08:27:40 -08:00
Shireesh Anjal	98a67a3776	Improvements in logger and audit webhooks (#16102 )	2022-11-28 08:03:26 -08:00
Klaus Post	5b242f1d11	Add Audit target metrics (#16044 )	2022-11-10 10:20:21 -08:00
Harshavardhana	5e763b71dc	use logger.LogOnce to reduce printing disconnection logs (#15408 ) fixes #15334 - re-use net/url parsed value for http.Request{} - remove gosimple, structcheck and unused due to https://github.com/golangci/golangci-lint/issues/2649 - unwrapErrs up to leafErr to ensure that we store exactly the correct errors	2022-07-27 09:44:59 -07:00
Harshavardhana	32b2f6117e	fix: do not pass around sync.Map (#15250 ) it is not safe to pass around sync.Map through pointers, as it may be concurrently updated by different callers. this PR simplifies by avoiding sync.Map altogether, we do not need sync.Map to keep object->erasureMap association. This PR fixes a crash when concurrently using this value when audit logs are configured. ``` fatal error: concurrent map iteration and map write goroutine 247651580 [running]: runtime.throw({0x277a6c1?, 0xc002381400?}) runtime/panic.go:992 +0x71 fp=0xc004d29b20 sp=0xc004d29af0 pc=0x438671 runtime.mapiternext(0xc0d6e87f18?) runtime/map.go:871 +0x4eb fp=0xc004d29b90 sp=0xc004d29b20 pc=0x41002b ```	2022-07-07 17:04:25 -07:00
Klaus Post	ac055b09e9	Add detailed scanner metrics (#15161 )	2022-07-05 14:45:49 -07:00
Harshavardhana	9d07cde385	use crypto/sha256 only for FIPS 140-2 compliance (#14983 ) It would seem like the PR #11623 had chewed more than it wanted to, non-fips build shouldn't really be forced to use slower crypto/sha256 even for presumed "non-performance" codepaths. In MinIO there are really no "non-performance" codepaths. This assumption seems to have had an adverse effect in certain areas of CPU usage. This PR ensures that we stick to sha256-simd on all non-FIPS builds, our most common build to ensure we get the best out of the CPU at any given point in time.	2022-05-27 06:00:19 -07:00
Anis Elleuch	e952e2a691	audit/kafka: Fix quitting early after first logging (#14932 ) A recent commit created some regressions: - Kafka/Audit goroutines quit when the first log is sent - Missing doneCh initialization in Kafka audit	2022-05-17 07:43:25 -07:00
Harshavardhana	040ac5cad8	fix: when logger queue is full exit quickly upon doneCh (#14928 ) Additionally only reload requested sub-system not everything	2022-05-16 16:10:51 -07:00

63 Commits