Commit Graph

471 Commits

Author SHA1 Message Date
Harshavardhana
fbb5e75e01
avoid run-away goroutine build-up in notification send, use channels (#18533)
Queue async events in memory when necessary and dequeue them as
needed. To have all events delivered synchronously, customers must enable:

```
MINIO_API_SYNC_EVENTS=on
```

Async events can be lost, but it is up to the admin to
decide what they want. We will not create a run-away number
of goroutines per event; instead we will queue them properly.

Currently the maximum number of async workers is set to
runtime.GOMAXPROCS(0), which is more than sufficient in general. It can
be made configurable in the future, though that may not be needed.
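
The queueing approach described above can be sketched as a buffered channel drained by a fixed pool of workers. This is a minimal illustration, not the code from the PR: `Event`, `Dispatcher`, the queue size, and the drop-when-full policy are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// Event is a hypothetical stand-in for a bucket notification event.
type Event struct{ Name string }

// Dispatcher queues events on a buffered channel and drains them with a
// fixed number of workers, so the goroutine count stays bounded.
type Dispatcher struct {
	queue   chan Event
	wg      sync.WaitGroup
	dropped atomic.Int64
}

// NewDispatcher starts `workers` goroutines that call send for each event.
func NewDispatcher(queueSize, workers int, send func(Event)) *Dispatcher {
	d := &Dispatcher{queue: make(chan Event, queueSize)}
	for i := 0; i < workers; i++ {
		d.wg.Add(1)
		go func() {
			defer d.wg.Done()
			for ev := range d.queue {
				send(ev)
			}
		}()
	}
	return d
}

// Enqueue never blocks: when the queue is full the event is dropped
// and counted (an assumed policy for this sketch).
func (d *Dispatcher) Enqueue(ev Event) bool {
	select {
	case d.queue <- ev:
		return true
	default:
		d.dropped.Add(1)
		return false
	}
}

// Close stops accepting events and waits for queued ones to be sent.
func (d *Dispatcher) Close() {
	close(d.queue)
	d.wg.Wait()
}

func main() {
	var sent atomic.Int64
	d := NewDispatcher(1024, runtime.GOMAXPROCS(0), func(Event) { sent.Add(1) })
	for i := 0; i < 100; i++ {
		d.Enqueue(Event{Name: "s3:ObjectCreated:Put"})
	}
	d.Close()
	fmt.Println("sent:", sent.Load(), "dropped:", d.dropped.Load())
}
```

The number of goroutines is fixed at construction time regardless of the event rate; only the queue depth varies.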
2023-12-05 02:16:33 -08:00
Harshavardhana
fba883839d
feat: bring new HDD related performance enhancements (#18239)
Optionally allows customers to:

- Enable an external cache to catch GET/HEAD responses
- Skip disks that are slow to respond in GET/HEAD
  when a quorum has already been achieved
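
One way to picture the second option is to read from all drives in parallel and return as soon as a quorum of reads has succeeded, canceling the stragglers. The sketch below is an assumption-laden illustration, not MinIO's implementation: `readFn`, `readQuorum`, `drive`, and `demo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// readFn is a hypothetical per-drive read; it must honor ctx cancellation.
type readFn func(ctx context.Context) ([]byte, error)

// readQuorum starts all reads in parallel and returns as soon as `quorum`
// of them succeed; reads that have not finished by then are canceled.
func readQuorum(ctx context.Context, drives []readFn, quorum int) ([][]byte, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // stops drives that have not answered yet

	type result struct {
		data []byte
		err  error
	}
	results := make(chan result, len(drives)) // buffered: senders never block
	for _, read := range drives {
		go func(read readFn) {
			data, err := read(ctx)
			results <- result{data, err}
		}(read)
	}

	var got [][]byte
	for range drives {
		r := <-results
		if r.err == nil {
			got = append(got, r.data)
		}
		if len(got) >= quorum {
			return got, nil
		}
	}
	return nil, errors.New("read quorum not reached")
}

// drive returns a readFn that answers after delay, or fails on cancellation.
func drive(delay time.Duration) readFn {
	return func(ctx context.Context) ([]byte, error) {
		select {
		case <-time.After(delay):
			return []byte("shard"), nil
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
}

// demo reads from three fast drives and one very slow one with quorum 3.
func demo() (int, error) {
	drives := []readFn{
		drive(time.Millisecond),
		drive(2 * time.Millisecond),
		drive(3 * time.Millisecond),
		drive(10 * time.Second), // slow drive, skipped once quorum is met
	}
	shards, err := readQuorum(context.Background(), drives, 3)
	return len(shards), err
}

func main() {
	n, err := demo()
	fmt.Println("shards:", n, "err:", err)
}
```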
2023-11-22 13:46:17 -08:00
Klaus Post
51aa59a737
perf: websocket grid connectivity for all internode communication (#18461)
This PR adds a WebSocket grid feature that allows servers to communicate via 
a single two-way connection.

There are two request types:

* Single requests, which are `[]byte => ([]byte, error)`. This is for efficient small
  roundtrips with small payloads.

* Streaming requests which are `[]byte, chan []byte => chan []byte (and error)`,
  which allows for different combinations of full two-way streams with an initial payload.

Only a single stream is created between two machines - and there is, as such, no
server/client relation since both sides can initiate and handle requests. Which server
initiates the request is decided deterministically on the server names.
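
As a rough illustration of such a deterministic decision, both sides can evaluate the same pure function of the two server names. The rule below (the lexicographically smaller name dials) is an assumption for the sketch; the PR text does not state which rule is actually used.

```go
package main

import "fmt"

// shouldDial reports whether local should open the connection to remote.
// Assumed rule: the lexicographically smaller name dials. Any rule works
// as long as both sides compute it identically and exactly one side
// gets true for two distinct names.
func shouldDial(local, remote string) bool {
	return local < remote
}

func main() {
	a, b := "node1.example.net:9000", "node2.example.net:9000"
	fmt.Println(shouldDial(a, b), shouldDial(b, a))
}
```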

Requests are made through a mux client and server, which handles message
passing, congestion, cancelation, timeouts, etc.

If a connection is lost, all requests are canceled, and the calling server will try
to reconnect. Registered handlers can operate directly on byte 
slices or use a higher-level generics abstraction.

There is no versioning of handlers/clients, and incompatible changes should
be handled by adding new handlers.

The request path can be changed to a new one for any protocol changes.

First, all servers create a "Manager." The manager must know its address 
as well as all remote addresses. This will manage all connections.
To get a connection to any remote, ask the manager to provide it given
the remote address using:

```
func (m *Manager) Connection(host string) *Connection
```

All serverside handlers must also be registered on the manager. This will
make sure that all incoming requests are served. The number of in-flight 
requests and responses must also be given for streaming requests.

The "Connection" returned manages the mux-clients. Requests issued
to the connection will be sent to the remote.

* `func (c *Connection) Request(ctx context.Context, h HandlerID, req []byte) ([]byte, error)`
   performs a single request and returns the result. Any deadline provided on the request is
   forwarded to the server, and canceling the context will make the function return at once.

* `func (c *Connection) NewStream(ctx context.Context, h HandlerID, payload []byte) (st *Stream, err error)`
   will initiate a remote call and send the initial payload.

```Go
// A Stream is a two-way stream.
// All responses *must* be read by the caller.
// If the call is canceled through the context,
// the appropriate error will be returned.
type Stream struct {
	// Responses from the remote server.
	// Channel will be closed after an error or when the remote closes.
	// All responses *must* be read by the caller until either an error is returned or the channel is closed.
	// Canceling the context will cause the context cancellation error to be returned.
	Responses <-chan Response

	// Requests sent to the server.
	// If the handler is defined with 0 incoming capacity this will be nil.
	// Channel *must* be closed to signal the end of the stream.
	// If the request context is canceled, the stream will no longer process requests.
	Requests chan<- []byte
}

type Response struct {
	Msg []byte
	Err error
}
```
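
The contract on `Stream` (close `Requests` when done, read `Responses` until it is closed or an error arrives) can be exercised with a small self-contained sketch. The types are copied from the text above; `newEchoStream` and `sendAll` are hypothetical helpers that stand in for a real remote handler and caller.

```go
package main

import (
	"bytes"
	"fmt"
)

// Response and Stream mirror the types quoted above.
type Response struct {
	Msg []byte
	Err error
}

type Stream struct {
	Responses <-chan Response
	Requests  chan<- []byte
}

// newEchoStream is a hypothetical in-process stand-in for a remote handler:
// it upper-cases every request and closes Responses once Requests is closed.
func newEchoStream() *Stream {
	reqs := make(chan []byte, 1)
	resps := make(chan Response, 1)
	go func() {
		defer close(resps)
		for req := range reqs {
			resps <- Response{Msg: bytes.ToUpper(req)}
		}
	}()
	return &Stream{Responses: resps, Requests: reqs}
}

// sendAll writes every message, closes Requests to signal the end of the
// stream, and reads Responses until the channel is closed or an error arrives.
// In this sketch the sender goroutine is not canceled on error; a real caller
// would rely on the request context for that.
func sendAll(st *Stream, msgs ...string) ([]string, error) {
	go func() {
		defer close(st.Requests)
		for _, m := range msgs {
			st.Requests <- []byte(m)
		}
	}()
	var out []string
	for resp := range st.Responses {
		if resp.Err != nil {
			return out, resp.Err
		}
		out = append(out, string(resp.Msg))
	}
	return out, nil
}

func main() {
	out, err := sendAll(newEchoStream(), "hello", "grid")
	fmt.Println(out, err)
}
```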

There are generic versions of the server/client handlers that allow the use of type
safe implementations for data types that support msgpack marshal/unmarshal.
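
The idea of a typed layer over `[]byte => ([]byte, error)` handlers can be sketched with Go generics. This is not the grid package's API: `rawHandler`, `typedRequest`, `num`, and `double` are hypothetical, and `encoding/json` is swapped in for msgpack only to keep the example within the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawHandler is the untyped shape described above: []byte => ([]byte, error).
type rawHandler func(req []byte) ([]byte, error)

// typedRequest encodes req, calls the raw handler, and decodes the reply.
// encoding/json is a stand-in here; the commit describes msgpack.
func typedRequest[Req, Resp any](call rawHandler, req Req) (Resp, error) {
	var resp Resp
	in, err := json.Marshal(req)
	if err != nil {
		return resp, err
	}
	out, err := call(in)
	if err != nil {
		return resp, err
	}
	err = json.Unmarshal(out, &resp)
	return resp, err
}

// num is a hypothetical payload type.
type num struct{ N int }

// double is a hypothetical raw handler that doubles N.
func double(req []byte) ([]byte, error) {
	var n num
	if err := json.Unmarshal(req, &n); err != nil {
		return nil, err
	}
	n.N *= 2
	return json.Marshal(n)
}

func main() {
	resp, err := typedRequest[num, num](double, num{N: 21})
	fmt.Println(resp.N, err)
}
```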
2023-11-20 17:09:35 -08:00
Harshavardhana
a7b1834772
fix: flaky and stupid tests in root lockdown (#18308) 2023-10-24 13:22:44 -07:00
Harshavardhana
fd37418da2
fix: allow server not initialized error to be retried (#18300)
Since relaxing the error quorum across pools
for ListBuckets() and GetBucketInfo(), we can hit a
situation where loading IAM returns a "server
is not initialized" error for the second pool.

We need to handle this by letting the pool come online
and retrying transparently; this PR fixes that.
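
A transparent retry of this kind can be sketched as a loop that retries only on one sentinel error. The names `errServerNotInitialized` and `retryUntilInitialized`, and the fixed attempt count and delay, are assumptions for illustration and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errServerNotInitialized is a hypothetical sentinel for this sketch.
var errServerNotInitialized = errors.New("server not initialized")

// retryUntilInitialized calls load until it stops returning the
// "not initialized" error or the attempts run out.
func retryUntilInitialized(attempts int, delay time.Duration, load func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = load(); !errors.Is(err, errServerNotInitialized) {
			return err // success (nil) or a non-retryable error
		}
		time.Sleep(delay) // give the pool time to come online
	}
	return err
}

func main() {
	calls := 0
	err := retryUntilInitialized(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errServerNotInitialized
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```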
2023-10-23 12:30:20 -07:00
Klaus Post
7926df0b80
Fix globalDeploymentID race (#18275)
globalDeploymentID was being read while it was being set.

Fixes race:

```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
  github.com/minio/minio/cmd.connectLoadInitFormats()
      github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
  github.com/minio/minio/cmd.waitForFormatErasure()
      github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...

Previous read at 0x0000079605a0 by goroutine 105:
  github.com/minio/minio/cmd.newContext()
      github.com/minio/minio/cmd/utils.go:817 +0x31e
  github.com/minio/minio/cmd.adminMiddleware.func1()
      github.com/minio/minio/cmd/admin-router.go:110 +0x96
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
  github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
      github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
...
```
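
A common way to remove this kind of race on a global string is to publish it through an atomic pointer. The sketch below shows that general technique under assumed names (`deploymentID`, `setDeploymentID`, `getDeploymentID`); it is not necessarily how the PR fixed it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// deploymentID is a hypothetical replacement for a plain global string.
var deploymentID atomic.Pointer[string]

// setDeploymentID publishes a new value atomically.
func setDeploymentID(id string) { deploymentID.Store(&id) }

// getDeploymentID returns the current value, or "" if none was set yet.
func getDeploymentID() string {
	if p := deploymentID.Load(); p != nil {
		return *p
	}
	return ""
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // concurrent reader, for example a request handler
		defer wg.Done()
		_ = getDeploymentID()
	}()
	setDeploymentID("hypothetical-deployment-id")
	wg.Wait()
	fmt.Println(getDeploymentID())
}
```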
2023-10-18 08:06:57 -07:00
Harshavardhana
edfb310a59
fix: always load ENVs from files first as soon as server starts (#18247)
This is a regression from #18231; reading from ENV files
must happen well before any parsing logic is invoked.
2023-10-15 21:13:43 -07:00
jiuker
20b79f8945
fix: env depend on the flag (#18231) 2023-10-12 15:32:38 -07:00
Harshavardhana
6829ae5b13
completely remove drive caching layer from gateway days (#18217)
This has already been deprecated for close to a year now.
2023-10-11 21:18:17 -07:00
Shireesh Anjal
6d20ec3bea
Add support for resource metrics (#18057)
Add a new endpoint for "resource" metrics `/v2/metrics/resource`

This should return system metrics related to drives, network, CPU and
memory. Except for drives, other metrics should have corresponding "avg"
and "max" values also.

Reuse the real-time feature to capture the required data,
introducing CPU and memory metrics in it.

Collect the data every minute and keep updating the average and max values
accordingly, returning the latest values when the API is called.
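
The average and max bookkeeping described above can be sketched as a small accumulator that is updated on every collection tick. `resourceStat` and its methods are hypothetical names for this example, not the types used in the PR.

```go
package main

import "fmt"

// resourceStat keeps a running average and maximum for one metric.
type resourceStat struct {
	count int
	sum   float64
	max   float64
}

// Observe folds one sample (for example, a per-minute CPU reading) into the stat.
func (s *resourceStat) Observe(v float64) {
	if s.count == 0 || v > s.max {
		s.max = v
	}
	s.count++
	s.sum += v
}

// Avg returns the mean of all samples seen so far, or 0 if there are none.
func (s *resourceStat) Avg() float64 {
	if s.count == 0 {
		return 0
	}
	return s.sum / float64(s.count)
}

// Max returns the largest sample seen so far.
func (s *resourceStat) Max() float64 { return s.max }

func main() {
	var cpu resourceStat
	for _, sample := range []float64{10, 30, 20} {
		cpu.Observe(sample)
	}
	fmt.Println("avg:", cpu.Avg(), "max:", cpu.Max())
}
```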
2023-09-30 13:40:20 -07:00
Harshavardhana
b1c2dacab3
fix: allow dynamic ports for API only in non-distributed setups (#18019)
fixes #17998
2023-09-12 19:10:49 -07:00
Aditya Manthramurthy
1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Harshavardhana
af564b8ba0
allow bootstrap to capture time-spent for each initializers (#17900) 2023-08-23 03:07:06 -07:00
Harshavardhana
239ccc9c40
fix: crash in globalTierJournal when TierConfig is not initialized (#17791) 2023-08-03 14:16:15 -07:00
Harshavardhana
c32c71c836
allow DNS cache TTL to be configurable (#17709)
this is added for now as a hidden variable
2023-07-24 15:13:35 -07:00
Harshavardhana
4f257bf1e6
pick internode interface properly via globalLocalNodeName (#17680)
current code will not pick the right interface name
if --address or --interface is not provided.
2023-07-18 19:18:11 -07:00
Harshavardhana
005a4a275a
add more bootstrap messages to provide latency (#17650)
- simplify refreshing bucket metadata, wait() to
  depend on how fast the bucket metadata can load.

- simplify resync to start resync in single pass.
2023-07-14 04:00:29 -07:00
jiuker
183428db03
feat: Implement 'mc support top net' (#17598) 2023-07-13 11:41:19 -07:00
Harshavardhana
7f782983ca
fix: for FTP server driver allow implicit trust of TLS (#17541)
fixes #17535
2023-06-30 08:04:13 -07:00
Harshavardhana
d3e5e607a7
allow site-replication checks to work on non-distributed setups (#17524)
fixes #17523
2023-06-27 09:23:50 -07:00
Anis Eleuch
d8dad5c9ea
s3: Make/Delete buckets to use error quorum per pool (#17467) 2023-06-23 11:48:23 -07:00
Harshavardhana
65c31fab12
fix: do not crash rebalance code instead set the object layer (#17465)
fixes #17421
2023-06-20 09:28:23 -07:00
Aditya Manthramurthy
5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Anis Eleuch
bb24346e04
listen: Only error out if not able to bind any interface (#17353) 2023-06-12 09:09:28 -07:00
Klaus Post
6e38d0f3ab
Add more bootstrap info in debug mode (#17362) 2023-06-08 08:39:47 -07:00
Harshavardhana
d1448adbda
use slices package and remove some helpers (#17342) 2023-06-06 10:12:52 -07:00
Praveen raj Mani
ecfb18b26a
Freeze the s3 APIs until the notification sub-system initializes completely (#17182) 2023-05-19 08:44:48 -07:00
Harshavardhana
b62791617c
fix: notify systemd as soon as we wait on the OS signal (#17199) 2023-05-12 16:42:17 -07:00
Praveen raj Mani
57acacd5a7
Support persistent queue store for loggers (#17121) 2023-05-08 21:20:31 -07:00
Poorna
c5c1426262
Validate if replication config being added is self referential (#17142) 2023-05-06 13:35:43 -07:00
Harshavardhana
5569acd95c
disallow EC:0 if not set during server startup (#17141) 2023-05-04 14:44:30 -07:00
Harshavardhana
9571b0825e
add configurable VRF interface and user-timeout (#17108) 2023-05-03 14:12:25 -07:00
WGH
ab34f0065c
Support systemd notify protocol (#17062) 2023-05-01 23:15:08 -07:00
Harshavardhana
dbd53af369
fix: initialize reverse proxy forwarder with right public certs (#17080) 2023-04-25 15:50:32 -07:00
Harshavardhana
477230c82e
avoid attempting to migrate old configs (#17004) 2023-04-21 13:56:08 -07:00
Harshavardhana
dd9ed85e22
implement support for FTP/SFTP server (#16952) 2023-04-15 07:34:02 -07:00
Anis Eleuch
91b6fe1af3
trace: Bootstrap to show the correct source line number (#16989) 2023-04-06 17:51:53 -07:00
Krishnan Parthasarathi
31fba6f434
Save bootstrap trace events in a circular buffer (#16823) 2023-03-17 16:01:03 -07:00
Harshavardhana
0c1f8b4e0f
add user-agent for all minio.Client usage (#16619) 2023-02-14 13:19:30 -08:00
Harshavardhana
71f02adfca
Revert "Print golang http errors in MinIO log format (#16465)"
This reverts commit 1fd7946dce.
2023-02-09 09:27:27 +05:30
Krishnan Parthasarathi
990fc415f7
Ensure safety of transitionState at startup (#16563) 2023-02-07 23:11:42 -08:00
Harshavardhana
747d475e76
initialize subsystems that are not dependent on buckets first (#16559) 2023-02-07 12:46:47 -08:00
Anis Elleuch
095b518802
Show a better error msg when internal data encryption key is incorrect (#16549) 2023-02-07 05:22:54 -08:00
Anis Elleuch
1fd7946dce
Print golang http errors in MinIO log format (#16465) 2023-01-26 22:46:16 +05:30
Harshavardhana
54b561898f
fix: anonymize the x-amz-id-2 value from hostname (#16478) 2023-01-25 10:25:36 -08:00
Shireesh Anjal
5a9f7516d6
Add monthly license update job (#16391) 2023-01-17 05:08:15 +05:30
Anis Elleuch
2146ed4033
xl: Quit early when EC config is incorrect (#16390)
Co-authored-by: Anis Elleuch <anis@min.io>
2023-01-09 23:07:45 -08:00
Harshavardhana
e0086c1be7
reduce startup delays on kubernetes (#16356) 2023-01-05 02:32:43 -08:00
Harshavardhana
1cd8e1d8b6
remove the startup jitter before locks() (#16340) 2023-01-02 01:40:09 -08:00
Anis Elleuch
acc9c033ed
debug: Add X-Amz-Request-ID to lock/unlock calls (#16309) 2022-12-23 19:49:07 -08:00