minio

mirror of https://github.com/minio/minio.git synced 2024-12-26 15:15:55 -05:00

Author	SHA1	Message	Date
Klaus Post	f1302c40fe	Fix uninitialized replication stats (#20260 ) Services are unfrozen before `initBackgroundReplication` is finished. This means that the globalReplicationStats write is racy. Switch to an atomic pointer. Provide the `ReplicationPool` with the stats, so it doesn't have to be grabbed from the atomic pointer on every use. All other loads and checks are nil, and calls return empty values when stats still haven't been initialized.	2024-08-15 05:04:40 -07:00
Harshavardhana	db78431b1d	avoid crash when initializing bucket quota cache (#20258 )	2024-08-14 17:34:56 -07:00
Harshavardhana	a17f14f73a	separate lock from common grid to avoid epoll contention (#20180 ) epoll contention on TCP causes latency build-up when we have high volume ingress. This PR is an attempt to relieve this pressure. upstream issue https://github.com/golang/go/issues/65064 It seems to be a deeper problem; we haven't yet tried the fix provided in that issue, but this change helps without changing the compiler. Of course, this is a workaround for now, hoping for a more comprehensive fix from the Go runtime.	2024-07-29 11:10:04 -07:00
Harshavardhana	7fcb428622	do not print unexpected logs (#20083 )	2024-07-12 13:51:54 -07:00
Shireesh Anjal	22c53b1c70	Remove license update job (#20037 )	2024-07-03 11:49:48 -07:00
Harshavardhana	32d04091a2	resume any batch jobs in a goroutine (#20035 ) Bonus: move batch job initialization to the last item after all other initialization, allowing for faster startup time for different subsystems.	2024-07-03 00:16:05 -07:00
Harshavardhana	5e7b243bde	extend cluster health to return errors for IAM, and Bucket metadata (#19995 ) Bonus: make API freeze to be opt-in instead of default	2024-06-26 00:44:34 -07:00
Anis Eleuch	4d7d008741	bootstrap: Speed up bucket metadata loading (#19969 ) Currently, bucket metadata is being loaded serially inside the ListBuckets Object API. Fix that by loading the bucket metadata in parallel, with concurrency set to the number of erasure sets * 10, which is a good approximation.	2024-06-21 15:22:24 -07:00
Harshavardhana	2825294b7b	allow server startup to come online with READ success (#19957 )	2024-06-19 22:21:31 -07:00
Sveinn	bce93b5cfa	Removing timeout on shutdown (#19956 )	2024-06-19 11:42:47 -07:00
Harshavardhana	7a4b250c8b	avoid waiting for quorum health while debugging (#19955 )	2024-06-19 10:12:20 -07:00
Harshavardhana	69e41f87ef	compute localIPs only once per server startup() (#19951 ) Repeatedly calling this function is not necessary; on systems with lots of interfaces, including virtual ones, it can noticeably delay startup.	2024-06-19 07:34:00 -07:00
Harshavardhana	ee48f9f206	perform healthchecks before initializing everything fully (#19953 ) adds more informative logs that provide details on which erasure set is losing quorum etc.	2024-06-19 07:33:40 -07:00
Harshavardhana	d5e48cfd65	fix: remove DriveOPTimeout for REST callers as they don't work properly (#19873 ) Go's net/http makes it notoriously difficult to have streaming deadlines per READ/WRITE on the net.Conn; if we add them, they interfere with Go's internal requirements for an HTTP connection. Remove this support for now. fixes #19853	2024-06-04 08:12:57 -07:00
Harshavardhana	4af31e654b	avoid pre-populating buffers for deployments < 32GiB memory (#19839 )	2024-05-30 04:58:12 -07:00
Aditya Manthramurthy	5f78691fcf	ldap: Add user DN attributes list config param (#19758 ) This change uses the updated ldap library in minio/pkg (bumped up to v3). A new config parameter is added for LDAP configuration to specify extra user attributes to load from the LDAP server and to store them as additional claims for the user. A test is added in sts_handlers.go that shows how to access the LDAP attributes as a claim. This is in preparation for adding SSH pubkey authentication to MinIO's SFTP integration.	2024-05-24 16:05:23 -07:00
Shubhendu	7c7650b7c3	Add sufficient deadlines and countermeasures to handle hung node scenario (#19688 ) Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io> Signed-off-by: Harshavardhana <harsha@minio.io>	2024-05-22 16:07:14 -07:00
Harshavardhana	ca80eced24	usage of deadline conn at Accept() breaks websocket (#19789 ) fortunately not wired up to use, however if anyone enables deadlines for conn then sporadically MinIO startups fail.	2024-05-22 10:49:27 -07:00
Harshavardhana	08d74819b6	handle racy updates to globalSite config (#19750 ) ``` ================== WARNING: DATA RACE Read at 0x0000082be990 by goroutine 205: github.com/minio/minio/cmd.setCommonHeaders() Previous write at 0x0000082be990 by main goroutine: github.com/minio/minio/cmd.lookupConfigs() ```	2024-05-16 16:13:47 -07:00
Harshavardhana	72ff69d9bb	add log-prefix name for specifying custom log-name (#19712 )	2024-05-09 14:29:37 -07:00
Harshavardhana	1526e7ece3	extend server config.yaml to support per pool set drive count (#19663 ) This is to support deployments migrating from a multi-pooled wider stripe to lower stripe. MINIO_STORAGE_CLASS_STANDARD is still expected to be same for all pools. So you can satisfy adding custom drive count based pools by adjusting the storage class value. ``` version: v2 address: ':9000' rootUser: 'minioadmin' rootPassword: 'minioadmin' console-address: ':9001' pools: # Specify the nodes and drives with pools - args: - 'node{11...14}.example.net/data{1...4}' - args: - 'node{15...18}.example.net/data{1...4}' - args: - 'node{19...22}.example.net/data{1...4}' - args: - 'node{23...34}.example.net/data{1...10}' set-drive-count: 6 ```	2024-05-03 08:54:03 -07:00
Harshavardhana	402a3ac719	support compression after rotation of logs (#19647 )	2024-05-01 15:38:07 -07:00
Harshavardhana	8c1bba681b	add logrotate support for MinIO logs (#19641 )	2024-05-01 10:57:52 -07:00
Harshavardhana	f3a52cc195	simplify listener implementation setup customizations in right place (#19589 )	2024-04-23 21:08:47 -07:00
Harshavardhana	6bfff7532e	re-use transport and set stronger backwards compatible Ciphers (#19565 ) This PR fixes a few things - FIPS support was missing for remote transports, so MinIO could end up using non-FIPS Ciphers in FIPS mode - Avoids too many transports; they all do the same thing, so re-use them to make connection pooling work properly. - globalTCPOptions must be set before setting transport to make sure the client conn deadlines are honored properly. - GCS warm tier must re-use our transport - Re-enable trailing headers support.	2024-04-21 04:43:18 -07:00
jiuker	272367ccd2	feat: add memlimit flags for setMaxResources (#19400 )	2024-04-04 05:06:57 -07:00
Anis Eleuch	95bf4a57b6	logging: Add subsystem to log API (#19002 ) Create new code paths for multiple subsystems in the code. This will make maintaining this easier later. Also introduce bugLogIf() for errors that should not happen in the first place.	2024-04-04 05:04:40 -07:00
Harshavardhana	dc45a5010d	bring back minor DNS cache for k8s setups (#19341 ) k8s as it stands is flaky in DNS lookups, bring this change back such that we can cache DNS at least for 30secs TTL.	2024-03-26 08:00:38 -07:00
Harshavardhana	f168ef9989	implement a flag to specify custom crossdomain.xml (#19262 ) fixes #16909	2024-03-17 23:42:40 -07:00
Poorna	8e2238ea09	some more cleanup for startup message (#19229 )	2024-03-08 22:42:32 -08:00
Poorna	31e8f7c525	Small reformatting of startup message (#19228 ) Also changing User-Agent format	2024-03-08 19:07:08 -08:00
Krishnan Parthasarathi	a7577da768	Improve expiration of tiered objects (#18926 ) - Use a shared worker pool for all ILM expiry tasks - Free version cleanup executes in a separate goroutine - Add a free version only if removing the remote object fails - Add ILM expiry metrics to the node namespace - Move tier journal tasks to expiryState - Remove unused on-disk journal for tiered objects pending deletion - Distribute expiry tasks across workers such that the expiry of versions of the same object is serialized - Ability to resize worker pool without server restart - Make scaling down of expiryState workers' concurrency safe; Thanks @klauspost - Add error logs when expiryState and transition state are not initialized (yet) * metrics: Add missed tier journal entry tasks * Initialize the ILM worker pool after the object layer	2024-03-01 21:11:03 -08:00
Harshavardhana	2c2f5d871c	debug: introduce support for configuring client connect WRITE deadline (#19170 ) just like client-conn-read-deadline, added a new flag that does client-conn-write-deadline as well. Both are not configured by default, since we do not yet know what is the right value. Allow this to be configurable if needed.	2024-03-01 08:00:42 -08:00
Harshavardhana	51874a5776	fix: allow DNS disconnection events to happen in k8s (#19145 ) in k8s things really do come online very asynchronously, we need to use an implementation that allows this randomness. To facilitate this move WriteAll() as part of the websocket layer instead. Bonus: avoid instances of dnscache usage on k8s	2024-02-28 09:54:52 -08:00
Harshavardhana	9a012a53ef	initialize the disk healer early on (#19143 ) This PR fixes a bug that perhaps has been long introduced, with no visible workarounds. In any deployment, if an entire erasure set is deleted, there is no way the cluster recovers.	2024-02-27 23:02:14 -08:00
Harshavardhana	92788e4cf4	fix: re-arrange console-sys to log properly in k8s/docker (#19129 ) fixes #19125	2024-02-26 01:33:48 -08:00
Anis Eleuch	68dde2359f	log: Add logger.Event to send to console and other logger targets (#19060 ) Add a new function logger.Event() to send the log to Console and http/kafka log webhooks. This will include some internal events such as disk healing and rebalance/decommissioning	2024-02-15 15:13:30 -08:00
Harshavardhana	997ba3a574	introduce reader deadlines for net.Conn (#19023 ) Bonus: set "retry-after" header for AWS SDKs if possible to honor them.	2024-02-09 13:25:16 -08:00
Poorna	bcfd7fbbcf	reuse transports for callhome and remote tgt validation (#18912 )	2024-01-29 23:05:39 -08:00
Harshavardhana	2ddf2ca934	allow configuring maximum idle connections per host (#18908 )	2024-01-29 16:50:37 -08:00
Harshavardhana	1d3bd02089	avoid close 'nil' panics if any (#18890 ) brings a generic implementation that prints a stack trace when close() is called on a 'nil' channel, and otherwise closes the channel safely.	2024-01-28 10:04:17 -08:00
Harshavardhana	e377bb949a	migrate bootstrap logic directly to websockets (#18855 ) improve performance for startup sequences by 2x for 300+ nodes.	2024-01-24 13:36:44 -08:00
Harshavardhana	f78d677ab6	pre-allocate EC memory by default at startup (#18846 )	2024-01-23 20:41:11 -08:00
Harshavardhana	dd2542e96c	add codespell action (#18818 ) Original work here, #18474, refixed and updated.	2024-01-17 23:03:17 -08:00
Harshavardhana	7c948adf88	allow pre-allocating buffers to reduce frequent GCs during growth (#18686 ) This PR also increases per node bpool memory from 1024 entries to 2048 entries; along with that, it also moves the byte pool centrally instead of being per pool.	2023-12-21 08:59:38 -08:00
Anis Eleuch	6f97663174	yml-config: Add support of rootUser and rootPassword (#18615 ) Users can define the root user and password in the yaml configuration file; Root credentials defined in the environment variable still take precedence	2023-12-08 12:04:54 -08:00
Anis Eleuch	2e23e61a45	Add support of conf file to pass arguments and options (#18592 )	2023-12-07 01:33:56 -08:00
Harshavardhana	fbb5e75e01	avoid run-away goroutine build-up in notification send, use channels (#18533 ) use memory for async events when necessary and dequeue them as needed, for all synchronous events customers must enable ``` MINIO_API_SYNC_EVENTS=on ``` Async events can be lost, but it is up to the admin to decide what they want; we will not create a run-away number of goroutines per event, instead we will queue them properly. Currently the max async workers is set to runtime.GOMAXPROCS(0), which is more than sufficient in general; it can be made configurable in future, but that may not be needed.	2023-12-05 02:16:33 -08:00
Harshavardhana	fba883839d	feat: bring new HDD related performance enhancements (#18239 ) Optionally allows customers to enable - Enable an external cache to catch GET/HEAD responses - Enable skipping disks that are slow to respond in GET/HEAD when we have already achieved a quorum	2023-11-22 13:46:17 -08:00
Klaus Post	51aa59a737	perf: websocket grid connectivity for all internode communication (#18461 ) This PR adds a WebSocket grid feature that allows servers to communicate via a single two-way connection. There are two request types: * Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small roundtrips with small payloads. * Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`, which allows for different combinations of full two-way streams with an initial payload. Only a single stream is created between two machines - and there is, as such, no server/client relation since both sides can initiate and handle requests. Which server initiates the request is decided deterministically on the server names. Requests are made through a mux client and server, which handles message passing, congestion, cancelation, timeouts, etc. If a connection is lost, all requests are canceled, and the calling server will try to reconnect. Registered handlers can operate directly on byte slices or use a higher-level generics abstraction. There is no versioning of handlers/clients, and incompatible changes should be handled by adding new handlers. The request path can be changed to a new one for any protocol changes. First, all servers create a "Manager." The manager must know its address as well as all remote addresses. This will manage all connections. To get a connection to any remote, ask the manager to provide it given the remote address using. ``` func (m Manager) Connection(host string) Connection ``` All serverside handlers must also be registered on the manager. This will make sure that all incoming requests are served. The number of in-flight requests and responses must also be given for streaming requests. The "Connection" returned manages the mux-clients. Requests issued to the connection will be sent to the remote. 
* `func (c Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)` performs a single request and returns the result. Any deadline provided on the request is forwarded to the server, and canceling the context will make the function return at once. * `func (c Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st Stream, err error)` will initiate a remote call and send the initial payload. ```Go // A Stream is a two-way stream. // All responses must be read by the caller. // If the call is canceled through the context, // the appropriate error will be returned. type Stream struct { // Responses from the remote server. // Channel will be closed after an error or when the remote closes. // All responses must be read by the caller until either an error is returned or the channel is closed. // Canceling the context will cause the context cancellation error to be returned. Responses <-chan Response // Requests sent to the server. // If the handler is defined with 0 incoming capacity this will be nil. // Channel must be closed to signal the end of the stream. // If the request context is canceled, the stream will no longer process requests. Requests chan<- []byte } type Response struct { Msg []byte Err error } ``` There are generic versions of the server/client handlers that allow the use of type safe implementations for data types that support msgpack marshal/unmarshal.	2023-11-20 17:09:35 -08:00

518 Commits