minio/internal
Klaus Post 3415c4dd1e
Fix reconnected deadlock with full queue (#19964)
When a reconnection happens, `handleMessages` must be able to complete and exit. 

A full outgoing queue can prevent this, because `handleMessages` blocks while trying to send.

Deadlock chain (May 10th release)

```
1 @ 0x44110e 0x453125 0x109f88c 0x109f7d5 0x10a472c 0x10a3f72 0x10a34ed 0x4795e1
#	0x109f88b	github.com/minio/minio/internal/grid.(*Connection).send+0x3eb			github.com/minio/minio/internal/grid/connection.go:548
#	0x109f7d4	github.com/minio/minio/internal/grid.(*Connection).queueMsg+0x334		github.com/minio/minio/internal/grid/connection.go:586
#	0x10a472b	github.com/minio/minio/internal/grid.(*Connection).handleAckMux+0xab		github.com/minio/minio/internal/grid/connection.go:1284
#	0x10a3f71	github.com/minio/minio/internal/grid.(*Connection).handleMsg+0x231		github.com/minio/minio/internal/grid/connection.go:1211
#	0x10a34ec	github.com/minio/minio/internal/grid.(*Connection).handleMessages.func1+0x6cc	github.com/minio/minio/internal/grid/connection.go:1019

---> blocks ---> via (Connection).handleMsgWg

1 @ 0x44110e 0x454165 0x454134 0x475325 0x486b08 0x10a161a 0x10a1465 0x2470e67 0x7395a9 0x20e61af 0x20e5f1f 0x7395a9 0x22f781c 0x7395a9 0x22f89a5 0x7395a9 0x22f6e82 0x7395a9 0x22f49a2 0x7395a9 0x2206e45 0x7395a9 0x22f4d9c 0x7395a9 0x210ba06 0x7395a9 0x23089c2 0x7395a9 0x22f86e9 0x7395a9 0xd42582 0x2106c04
#	0x475324	sync.runtime_Semacquire+0x24								runtime/sema.go:62
#	0x486b07	sync.(*WaitGroup).Wait+0x47								sync/waitgroup.go:116
#	0x10a1619	github.com/minio/minio/internal/grid.(*Connection).reconnected+0xb9			github.com/minio/minio/internal/grid/connection.go:857
#	0x10a1464	github.com/minio/minio/internal/grid.(*Connection).handleIncoming+0x384			github.com/minio/minio/internal/grid/connection.go:825
```

Add a queue cleaner in `reconnected` that pops old messages so `handleMessages` can
send messages without blocking and exit, allowing the connection to be re-established.

Messages are likely dropped by the remote, but we may have some that can succeed, 
so we only drop when running out of space.
2024-06-20 16:11:40 -07:00
| Name | Last commit | Date |
|---|---|---|
| amztime | add codespell action (#18818) | 2024-01-17 23:03:17 -08:00 |
| arn | Add more tests for ARN and its format (#19408) | 2024-04-04 01:31:34 -07:00 |
| auth | Restrict access keys for users and groups to not allow '=' or ',' (#19749) | 2024-05-28 10:14:16 -07:00 |
| bpool | Reduce parallelReader allocs (#19558) | 2024-04-19 09:44:59 -07:00 |
| bucket | ldap: Add user DN attributes list config param (#19758) | 2024-05-24 16:05:23 -07:00 |
| cachevalue | Add cluster config metrics in metrics-v3 (#19507) | 2024-05-24 05:50:46 -07:00 |
| color | add logrotate support for MinIO logs (#19641) | 2024-05-01 10:57:52 -07:00 |
| config | change service account embedded policy size limit (#19840) | 2024-05-30 11:10:41 -07:00 |
| crypto | Fix replication checksum transfer (#19906) | 2024-06-10 10:40:33 -07:00 |
| deadlineconn | Add sufficient deadlines and countermeasures to handle hung node scenario (#19688) | 2024-05-22 16:07:14 -07:00 |
| disk | Read drive IO stats from sysfs instead of procfs (#19131) | 2024-02-26 11:34:50 -08:00 |
| dsync | Do not block on distributed unlocks (#19952) | 2024-06-19 07:35:19 -07:00 |
| etag | fix: some flyby typos in the code (#19212) | 2024-03-10 14:09:36 -07:00 |
| event | kafka: _MINIO_KAFKA_DEBUG to enable sarama debug messages (#19849) | 2024-06-01 08:02:59 -07:00 |
| fips | disable builds for go1.18 (#16332) | 2022-12-30 11:37:07 -08:00 |
| grid | Fix reconnected deadlock with full queue (#19964) | 2024-06-20 16:11:40 -07:00 |
| handlers | send proper IPv6 names avoid bracketing notation (#18699) | 2023-12-21 16:56:55 -08:00 |
| hash | Accept multipart checksums with part count (#19680) | 2024-05-08 09:18:34 -07:00 |
| http | Removing timeout on shutdown (#19956) | 2024-06-19 11:42:47 -07:00 |
| init | force all internal MinIO operations to be under UTC (#16009) | 2022-11-04 16:44:38 -07:00 |
| ioutil | Add sufficient deadlines and countermeasures to handle hung node scenario (#19688) | 2024-05-22 16:07:14 -07:00 |
| jwt | allow JWT parsing on large session policy based tokens (#17167) | 2023-05-09 00:53:08 -07:00 |
| kms | kms: use GetClientCertificate callback for KES API keys (#19921) | 2024-06-12 07:31:26 -07:00 |
| lock | fix: linter errors in Windows specific code (#18276) | 2023-10-18 11:08:15 -07:00 |
| logger | race: Fix detected test race in the internal audit code (#19865) | 2024-06-03 08:44:50 -07:00 |
| lsync | cleanup Go linter settings (#16736) | 2023-03-04 20:57:35 -08:00 |
| mcontext | Add X-Amz-Request-Id to internode calls (#16146) | 2022-12-06 09:27:26 -08:00 |
| mountinfo | add codespell action (#18818) | 2024-01-17 23:03:17 -08:00 |
| net | fix: return error when requested interface has no stats available (#17666) | 2023-07-17 01:14:01 -07:00 |
| once | Support persistent queue store for loggers (#17121) | 2023-05-08 21:20:31 -07:00 |
| pubsub | Fix tracing send on closed channel (#18982) | 2024-02-06 08:57:30 -08:00 |
| rest | ldap: Add user DN attributes list config param (#19758) | 2024-05-24 16:05:23 -07:00 |
| ringbuffer | Add PutObject Ring Buffer (#19605) | 2024-05-14 17:11:04 -07:00 |
| s3select | ldap: Add user DN attributes list config param (#19758) | 2024-05-24 16:05:23 -07:00 |
| store | Webhook targets refactor and bug fixes (#19275) | 2024-03-25 09:44:20 -07:00 |