Commit Graph

80 Commits

Author SHA1 Message Date
Poorna
eb7d3da994 Fix site replication status reporting of quota (#16626) 2023-02-16 08:08:35 +05:30
Harshavardhana
ee6d96eb46 Periodically remove stale buckets from in-memory (#16597) 2023-02-13 08:09:52 -08:00
Poorna
52aeebebea Fix site replication meta info call to be non-blocking (#16526)
Co-authored-by: Harshavardhana <harsha@minio.io>
2023-02-08 21:16:53 -08:00
Harshavardhana
d8daabae9b make site replication healing safer (#16560) 2023-02-07 21:44:42 -08:00
Harshavardhana
d19cbc81b5 fix: do not return IAM/Bucket metadata replication errors to client (#16486) 2023-01-26 11:11:54 -08:00
Harshavardhana
b4ef5ff294 remove unnecessary code checking for supported features (#16423) 2023-01-17 19:37:47 +05:30
Anis Elleuch
1ece3d1dfe Add comment field to service accounts (#16380) 2023-01-10 21:57:52 +04:00
Harshavardhana
b882310e2b avoid locks for internal and invalid buckets in MakeBucket() (#16302) 2022-12-23 07:46:00 -08:00
Poorna
6423e4c767 Remove site replication config if it succeeded locally (#16279) 2022-12-22 01:31:20 -08:00
Poorna
d37e514733 Cleanup remote targets automatically on replication config removal. (#16221) 2022-12-14 03:24:06 -08:00
Aditya Manthramurthy
a30cfdd88f Bump up madmin-go to v2 (#16162) 2022-12-06 13:46:50 -08:00
Klaus Post
a713aee3d5 Run staticcheck on CI (#16170) 2022-12-05 11:18:50 -08:00
Anis Elleuch
04948b4d55 fix: checking for stale STS account under site replication (#16109) 2022-11-22 07:26:33 -08:00
Poorna
d6bc141bd1 feat: Add support for site level resync (#15753) 2022-11-14 07:16:40 -08:00
Anis Elleuch
3b1a9b9fdf Use the same lock for the scanner and site replication healing (#15985) 2022-11-08 08:55:55 -08:00
Poorna
4f5d38a4b1 site replication edit: validate endpoint belongs to deployment (#16000) 2022-11-03 16:23:45 -07:00
Harshavardhana
0d49b365ff converge SNSD deployments into single code (#15988) 2022-11-01 16:41:01 -07:00
Klaus Post
0f0e154315 fix: inconsistent replication delete marker timestamps (#15956) 2022-10-27 09:46:52 -07:00
Poorna
ce8456a1a9 proxy multipart to peers via multipart uploadID (#15926) 2022-10-25 10:52:29 -07:00
Harshavardhana
a8332efa94 fix: Delete() of bucket metadata should not parse the config (#15904) 2022-10-19 17:55:09 -07:00
Aditya Manthramurthy
64cf887b28 use LDAP config from minio/pkg to share with console (#15810) 2022-10-07 22:12:36 -07:00
Harshavardhana
538aeef27a fix: heal service accounts for LDAP users in site replication (#15785) 2022-10-04 10:41:47 -07:00
Poorna
be0d2537b7 site replication: fix typo in meta collection (#15792) 2022-10-04 10:19:17 -07:00
Poorna
aec2aa3497 site replication: clear config if remove --all specified (#15716) 2022-09-20 14:32:23 -07:00
Poorna
929b9e164e site replication: Avoid returning root svcacct info in sr metadata (#15608)
Service accounts of root users should not be replicated.
2022-08-29 11:19:51 -07:00
Poorna
b1b6264bea fix: validate deployment id when adding peer clusters (#15591)
Fixes: #15573
2022-08-25 11:30:52 -07:00
Anis Elleuch
b737c83a66 Ensure that only one node performs site replication healing (#15584)
When a node finds a change in the other replication cluster and applies
to itself will already notify other peers. No need for all nodes in a
given cluster to do site replication healing, only one node is
sufficient.
2022-08-24 13:46:09 -07:00
Anis Elleuch
b8cdf060c8 Properly replicate policy mapping for virtual users (#15558)
Currently, replicating policy mapping for STS users does not work. Fix
it is by passing user type to PolicyDBSet.
2022-08-23 11:11:45 -07:00
Harshavardhana
895357607a avoid using errors.As for 'errors.New' use errors.Is (#15549)
Bonus: ignore coredns CVE, for now, there is no fix yet

https://github.com/coredns/coredns/issues/5574
2022-08-18 11:10:49 -07:00
Poorna
21fe14201f replication: centralize healthcheck for remote targets (#15516)
This PR moves health check from minio-go client to being
managed on the server.

Additionally integrating health check into site replication
2022-08-16 17:46:22 -07:00
Poorna
172e63dbb6 fix: site replication group updates to set status correctly (#15507)
Fixes: #15486
2022-08-09 15:17:43 -07:00
Poorna
2c137c0d04 fix: handle invalid endpoint errors in site replication(#15499)
fixes #15497
2022-08-08 11:12:05 -07:00
ebozduman
b57e7321e7 Replaces 'disk'=>'drive' visible to end user (#15464) 2022-08-04 16:10:08 -07:00
Poorna
426c902b87 site replication: fix healing of bucket deletes. (#15377)
This PR changes the handling of bucket deletes for site 
replicated setups to hold on to deleted bucket state until 
it syncs to all the clusters participating in site replication.
2022-07-25 17:51:32 -07:00
Poorna
7e32a17742 fix: site replication healing of missing buckets (#15298)
fixes a regression from #15186

- Adding tests to cover healing of buckets.
- Also dereference quota in SiteReplicationStatus only when non-nil
2022-07-14 14:27:47 -07:00
Poorna
3d969bd2b4 fix: ignore missing targets/replication config during site removal (#15269) 2022-07-11 14:11:46 -07:00
Poorna
0ea5c9d8e8 site healing: Skip stale iam asset updates from peer. (#15203)
Allow healing to apply IAM change only when peer
gave the most recent update.
2022-07-01 13:19:13 -07:00
Poorna
7cc9286e0f site healing: Skip stale bucket metadata updates from peer (#15186)
Allow healing to apply bucket metadata change only when peer
gave the most recent update.
2022-06-28 18:09:20 -07:00
Harshavardhana
1a40c7c27c use signature-v2 for 'object perf' tests to avoid CPU using sha256 (#15151)
It is observed in a local 8 drive system the CPU seems to be
bottlenecked at

```
(pprof) top
Showing nodes accounting for 1385.31s, 88.47% of 1565.88s total
Dropped 1304 nodes (cum <= 7.83s)
Showing top 10 nodes out of 159
      flat  flat%   sum%        cum   cum%
      724s 46.24% 46.24%       724s 46.24%  crypto/sha256.block
   219.04s 13.99% 60.22%    226.63s 14.47%  syscall.Syscall
   158.04s 10.09% 70.32%    158.04s 10.09%  runtime.memmove
   127.58s  8.15% 78.46%    127.58s  8.15%  crypto/md5.block
    58.67s  3.75% 82.21%     58.67s  3.75%  github.com/minio/highwayhash.updateAVX2
    40.07s  2.56% 84.77%     40.07s  2.56%  runtime.epollwait
    33.76s  2.16% 86.93%     33.76s  2.16%  github.com/klauspost/reedsolomon._galMulAVX512Parallel84
     8.88s  0.57% 87.49%     11.56s  0.74%  runtime.step
     7.84s   0.5% 87.99%      7.84s   0.5%  runtime.memclrNoHeapPointers
     7.43s  0.47% 88.47%     22.18s  1.42%  runtime.pcvalue
```

Bonus changes:

- re-use transport for bucket replication clients, also site replication clients.
- use 32KiB buffer for all read and writes at transport layer seems to help
  TLS read connections.
- Do not have 'MaxConnsPerHost' this is problematic to be used with net/http
  connection pooling 'MaxIdleConnsPerHost' is enough.
2022-06-22 16:28:25 -07:00
Harshavardhana
2bb6a3f4d0 cleanup site replication error handling (#15113)
site replication errors were printed at
various random locations, repeatedly - this
PR attempts to remove double logging and
capture all of them at a common place.

This PR also enhances the code to show
partial success and errors as well.
2022-06-20 10:48:11 -07:00
Poorna
8b9a19eef1 fix: typo in site replication version healing (#15103) 2022-06-17 16:43:24 -07:00
Poorna
29edb4ccfe fix: site replication bucket heal to not panic if replication config is missing (#15025) 2022-06-02 12:34:03 -07:00
Harshavardhana
52221db7ef fix: for unexpected errors in reading versioning config panic (#14994)
We need to make sure if we cannot read bucket metadata
for some reason, and bucket metadata is not missing and
returning corrupted information we should panic such
handlers to disallow I/O to protect the overall state
on the system.

In-case of such corruption we have a mechanism now
to force recreate the metadata on the bucket, using
`x-minio-force-create` header with `PUT /bucket` API
call.

Additionally fix the versioning config updated state
to be set properly for the site replication healing
to trigger correctly.
2022-05-31 02:57:57 -07:00
Anis Elleuch
ccbf65c8e8 site-repl: Fix deadlock after an IAM loading error (#14990)
Fix forgotten IAM cache lock releases when reading some data from
disk/etcd

Co-authored-by: Anis Elleuch <anis@min.io>
2022-05-27 10:26:38 -07:00
Poorna
5c81d0d89a site replication: heal missing/invalid replication config (#14979)
Validate remote target ARNs and heal any stale rules in
the replication config
2022-05-26 17:57:23 -07:00
Anis Elleuch
5041bfcb5c replication healing: Fix typo when healing bucket quota info (#14966)
A typo is found in the replication healing code where an empty quota
configuration is sent to peer sites instead of the correct one.
.io>
2022-05-24 06:26:13 -07:00
Aditya Manthramurthy
9aadd725d2 Avoid calling .Reset() on active timer (#14941)
.Reset() documentation states:

    For a Timer created with NewTimer, Reset should be invoked only on stopped
    or expired timers with drained channels.

This change is just to comply with this requirement as there might be some
runtime dependent situation that might lead to unexpected behavior.
2022-05-18 15:37:58 -07:00
Harshavardhana
6cfb1cb6fd fix: timer usage across codebase (#14935)
it seems in some places we have been wrongly using the
timer.Reset() function, nicely exposed by an example
shared by @donatello https://go.dev/play/p/qoF71_D1oXD

this PR fixes all the usage comprehensively
2022-05-17 22:42:59 -07:00
Harshavardhana
def75ffcfe allow versioning config changes under site replication (#14876)
PR #14828 introduced prefix-level exclusion of versioning
and replication - however our site replication implementation
since it defaults versioning on all buckets did not allow
changing versioning configuration once the bucket was created.

This PR changes this and ensures that such changes are honored
and also propagated/healed across sites appropriately.
2022-05-07 18:39:40 -07:00
Poorna
523670ba0d fix: site removal API error handling (#14870)
when the site is being removed is missing replication config. This can
happen when a new deployment is brought in place of a site that
is lost/destroyed and needs to delink old deployment from site
replication.
2022-05-06 12:40:34 -07:00