minio

Commit Graph

Author	SHA1	Message	Date
Krishnan Parthasarathi	9569a85cee	Avoid allocs for MRF on-disk header (#18425 )	2023-11-10 19:54:46 -08:00
Poorna	03dc65e12d	Reload replication targets lazily if missing (#18333 ) There can be rare situations where errors seen in bucket metadata load on startup or subsequent metadata updates can result in missing replication remotes. Attempt a refresh of remote targets backed by a good replication config lazily in 5 minute intervals if there ever occurs a situation where remote targets go AWOL.	2023-10-27 21:08:53 -07:00
Harshavardhana	e1e33077e8	fix: tests and resync replication status (#18244 )	2023-10-13 17:03:34 -07:00
Harshavardhana	74e0c9ab9b	reduce unnecessary logging, simplify certain error handling (#18196 ) remove a bunch of unnecessary logs	2023-10-10 00:33:42 -07:00
Poorna	72871dbb9a	delete replication: avoid overwriting replication decision (#18174 ) from ObjectInfo unless version purge status is present. Otherwise there is potential to make incorrect replication decision if Stat returned an error	2023-10-05 21:09:45 -06:00
Poorna	b73699fad8	replication: pass user tags while queueing (#18052 ) Continues from #18032 - otherwise replication will fail on tag based rules.	2023-09-19 03:18:28 -07:00
jiuker	9947c01c8e	feat: SSE-KMS use uuid instead of read all data to md5. (#17958 )	2023-09-18 10:00:54 -07:00
Harshavardhana	fa6d082bfd	reduce all major allocations in replication path (#18032 ) - remove targetClient for passing around via replicationObjectInfo{} - remove cloing to object info unnecessarily - remove objectInfo from replicationObjectInfo{} (only require necessary fields)	2023-09-16 02:28:06 -07:00
Harshavardhana	a2aabfabd9	add backups for usage-caches to rely on upon error (#18029 ) This allows scanner to avoid lengthy scans, skip things appropriately and also not lose metrics in any manner. reduce longer deadlines for usage-cache loads/saves to match the disk timeout which is 2minutes now per IOP.	2023-09-14 11:53:52 -07:00
Poorna	96fbf18201	replication: queue existing objects to same workers as incoming (#18020 ) Previously existing objects were queued to single worker and MRF re-queues are also handled by same worker - this does not fully use the available bandwidth in case there is no incoming workload.	2023-09-12 21:59:15 -07:00
Harshavardhana	1df5e31706	optimize MRF replication queue to avoid memory leaks (#18007 )	2023-09-11 20:59:11 -07:00
Poorna	703ed46d79	fix: replication of tags while removing (#17989 ) A tag removal was not being replicated prior to this change	2023-09-06 19:05:02 -07:00
Poorna	13a2dc8485	replication resync: avoid blocking on results channel. (#17981 ) continues fix in #17775	2023-09-05 20:22:39 -07:00
Harshavardhana	5b114b43f7	refactor bandwidth throttling for replication target (#17980 ) This refactor is to allow using the bandwidth throttling for other purposes.	2023-09-05 20:21:59 -07:00
Poorna	d665e855de	replication: remove check for empty version id (#17964 )	2023-09-01 13:46:10 -07:00
Poorna	b48bbe08b2	Add additional info for replication metrics API (#17293 ) to track the replication transfer rate across different nodes, number of active workers in use and in-queue stats to get an idea of the current workload. This PR also adds replication metrics to the site replication status API. For site replication, prometheus metrics are no longer at the bucket level - but at the cluster level. Add prometheus metric to track credential errors since uptime	2023-08-30 01:00:59 -07:00
Poorna	4a6af93c83	mark replication target offline if network timeouts seen (#17907 ) regular target liveness check every 5 secs will toggle state back as target returns online.	2023-08-24 09:24:26 -07:00
Harshavardhana	1c5af7c31a	serialize queueMRFHeal(), add timeouts and avoid normal build-ups (#17886 ) we expect a certain level of IOPs and latency so this is okay. fixes other miscellaneous bugs - such as hanging on mrfCh <- when the context is canceled - queuing MRF heal when the context is canceled - remove unused saveStateCh channel	2023-08-21 16:44:50 -07:00
Poorna	dfaf735073	replication: fix queuing of large uploads (#17831 ) Fixes regression from #17687	2023-08-10 15:48:42 -07:00
Harshavardhana	b732a673dc	reduce logging in bucket replication in retry scenarios (#17820 )	2023-08-08 13:27:40 -07:00
Poorna	26c23b30f4	replication: set context timeout for NewMultipartUpload calls (#17807 )	2023-08-05 12:27:07 -07:00
Poorna	311380f8cb	replication resync: fix queueing (#17775 ) Assign resync of all versions of object to the same worker to avoid locking contention. Fixes parallel resync implementation in #16707	2023-08-01 11:51:15 -07:00
Poorna	1a42693d68	replication: limit larger uploads to a subset of workers (#17687 ) Limit large uploads (> 128MiB) to a max of 10 workers, intent is to avoid larger uploads from using all replication bandwidth, giving room for smaller uploads to sync faster.	2023-07-25 20:02:02 -07:00
Harshavardhana	005a4a275a	add more bootstrap messages to provide latency (#17650 ) - simplify refreshing bucket metadata, wait() to depend on how fast the bucket metadata can load. - simplify resync to start resync in single pass.	2023-07-14 04:00:29 -07:00
Poorna	5e2f8d7a42	replication: Simplify mrf requeueing and add backlog handler (#17171 ) Simplify MRF queueing and add backlog handler - Limit re-tries to 3 to avoid repeated re-queueing. Fall offs to be re-tried when the scanner revisits this object or upon access. - Change MRF to have each node process only its MRF entries. - Collect MRF backlog by the node to allow for current backlog visibility	2023-07-12 23:51:33 -07:00
Kaan Kabalak	f64d62b01d	Fix style of logOnceIf calls w/unique identifiers (#17631 )	2023-07-11 13:17:45 -07:00
Poorna	e8c98c3246	Avoid extra GetObjectInfo call in DeleteObject API (#17599 ) Optimize DeleteObject API to avoid extra GetObjectInfo call on the replicating side. For receiving side, it is just a regular DeleteObject call. Bonus: Fix a corner case where version purged is absent on target (either due to replication not yet complete or target version already deleted in a one-way replication or when replication was disabled). In such cases, mark version purge complete.	2023-07-10 07:57:56 -07:00
Klaus Post	ff5988f4e0	Reduce allocations (#17584 ) * Reduce allocations * Add stringsHasPrefixFold which can compare string prefixes, while ignoring case and not allocating. * Reuse all msgp.Readers * Reuse metadata buffers when not reading data. * Make type safe. Make buffer 4K instead of 8. * Unslice	2023-07-06 16:02:08 -07:00
Kaan Kabalak	21fbe88e1f	Print certain log messages once per error (#17484 )	2023-06-24 20:29:13 -07:00
jiuker	b6b68be052	fix: replication check for duplicate endpoints detection with wrong route (#17474 )	2023-06-20 09:27:54 -07:00
Aditya Manthramurthy	5a1612fe32	Bump up madmin-go and pkg deps (#17469 )	2023-06-19 17:53:08 -07:00
Harshavardhana	1443b5927a	allow quorum fileInfo to pick same parityBlocks (#17454 ) Bonus: allow replication to proceed for 503 errors such as with error code SlowDownRead	2023-06-18 18:20:15 -07:00
Poorna	c4d0c49a5f	ensure metadata updates go to same pool where version exists (#17451 ) This PR also returns the replication status in proxy calls and defers replication attempt if HEAD on object version returned a error different from NoSuchKey	2023-06-17 07:30:53 -07:00
Poorna Krishnamoorthy	f986b0c493	replication: perform bucket resync in parallel (#16707 ) Default number of parallel resync operations for a bucket to 10 to speed up resync.	2023-06-11 16:09:55 -07:00
Harshavardhana	b210ea79bc	do not save MTime in newMultipartUpload() to avoid side-affects (#17340 )	2023-06-02 14:38:09 -07:00
Poorna	e95825a42e	replication: use latest object info for metrics update (#17333 )	2023-06-01 18:52:55 -07:00
Poorna	2131046427	replication: fix audit log reporting (#17222 )	2023-05-16 15:35:08 -07:00
Klaus Post	aaf1abc993	simplify HardLimitReader by using LimitReader for internal usage (#17218 )	2023-05-16 13:14:37 -07:00
jiuker	413549bcf5	fix: loadStatsFromDisk() should return nil for configNotFound (#17217 )	2023-05-16 12:23:38 -07:00
Poorna	e07c2ab868	Use hash.NewLimitReader for internal multipart calls (#17191 )	2023-05-12 11:19:08 -07:00
Poorna	c5c1426262	Validate if replication config being added is self referential (#17142 )	2023-05-06 13:35:43 -07:00
Harshavardhana	6825bd7e75	fix: inlined objects don't need to honor long locks (#17039 )	2023-04-17 12:16:37 -07:00
Harshavardhana	c06e0bfef9	set correct `Host:` value for replication event notification (#16984 )	2023-04-06 10:20:53 -07:00
Anis Eleuch	d90d0c8931	Use one http response recorder per external http call (#16938 )	2023-03-31 09:37:29 -07:00
Allan Roger Reid	483b226cc1	fix: avoid logging when object/version not found in replication (#16919 )	2023-03-29 15:02:45 -07:00
Harshavardhana	8e02660a0d	update all our deps (#16899 )	2023-03-28 03:45:24 -07:00
Poorna	fb6ab1cca2	fix: allow replication of 'null' delete markers (#16773 )	2023-03-08 07:03:29 -08:00
Poorna	ee54643004	Avoid unnecessary replication heal attempts (#16769 )	2023-03-07 07:43:38 -08:00
Poorna	c33a237067	fix: under site replication disallow remote target modification (#16628 )	2023-02-15 20:22:13 -08:00
jiuker	a15b6f21b8	remove incorrect use of WaitGroup (#16596 )	2023-02-12 20:59:45 -08:00

1 2 3 4 5

225 Commits