Commit Graph

95 Commits

Author SHA1 Message Date
Harshavardhana b7c5e45fff
heal: isObjectDangling should return false when it cannot decide (#14053)
In a multi-pool setup when disks are coming up, or in a single pool
setup let's say with 100's of erasure sets with a slow network.

It's possible when healing is attempted on `.minio.sys/config`
folder, it can lead to healing unexpectedly deleting some policy
files as dangling due to a mistake in understanding when `isObjectDangling`
is considered to be 'true'.

This issue happened in commit 30135eed86
when we assumed the validMeta with empty ErasureInfo is considered
to be fully dangling. This implementation issue gets exposed when
the server is starting up.

This is most easily seen with multiple-pool setups because of the
disconnected fashion pools that come up. The decision to purge the
object as dangling is taken incorrectly prior to the correct state
being achieved on each pool, when the corresponding drive let's say
returns 'errDiskNotFound', a 'delete' is triggered. At this point,
the 'drive' comes online because this is part of the startup sequence
as drives can come online lazily.

This kind of situation exists because we allow (totalDisks/2) number
of drives to be online when the server is being restarted.

Implementation made an incorrect assumption here leading to policies
getting deleted.

Added tests to capture the implementation requirements.
2022-01-07 19:11:54 -08:00
Aditya Manthramurthy 0a224654c2
fix: progagation of service accounts for site replication (#14054)
- Only non-root-owned service accounts are replicated for now.
- Add integration tests for OIDC with site replication
2022-01-07 17:41:43 -08:00
Harshavardhana 0e3037631f
skip inconsistent shards if possible (#13945)
data shards were wrong due to a healing bug
reported in #13803 mainly with unaligned object
sizes.

This PR is an attempt to automatically avoid
these shards, with available information about
the `xl.meta` and actually disk mtime.
2021-12-21 10:08:26 -08:00
Harshavardhana e82a5c5c54
fix: site replication issues and add tests (#13861)
- deleting policies was deleting all LDAP
  user mapping, this was a regression introduced
  in #13567

- deleting of policies is properly sent across
  all sites.

- remove unexpected errors instead embed the real
  errors as part of the 500 error response.
2021-12-08 11:50:15 -08:00
Harshavardhana 92fdcafb66
add verification tests for ETag on replicated content (#13857) 2021-12-07 10:08:26 -08:00
Harshavardhana 4f3290309e Revert "disable CI/CD for draft PRs (#13784)"
This reverts commit 5a22f2cf0b.
2021-11-30 09:22:17 -08:00
Krishnan Parthasarathi 5a22f2cf0b
disable CI/CD for draft PRs (#13784) 2021-11-29 23:35:07 -08:00
Harshavardhana 91e0823ff0
allow service freeze/unfreeze on a setup (#13707)
an active running speedTest will reject all
new S3 requests to the server, until speedTest
is complete.

this is to ensure that speedTest results are
accurate and trusted.

Co-authored-by: Klaus Post <klauspost@gmail.com>
2021-11-23 12:02:16 -08:00
Harshavardhana c791de0e1e
re-implement pickValidInfo dataDir, move to quorum calculation (#13681)
dataDir loosely based on maxima is incorrect and does not
work in all situations such as disks in the following order

- xl.json migration to xl.meta there may be partial xl.json's
  leftover if some disks are not yet connected when the disk
  is yet to come up, since xl.json mtime and xl.meta is
  same the dataDir maxima doesn't work properly leading to
  quorum issues.

- its also possible that XLV1 might be true among the disks
  available, make sure to keep FileInfo based on common quorum
  and skip unexpected disks with the older data format.

Also, this PR tests upgrade from older to a newer release if the 
data is readable and matches the checksum.

NOTE: this is just initial work we can build on top of this to do further tests.
2021-11-21 10:41:30 -08:00
Aditya Manthramurthy 1e2fac054c
Add caching to CI jobs (#13712)
- Seems to be improving times for shorter jobs at least.

- Remove Go 1.16.x tests for IAM and replication
2021-11-19 16:18:23 -08:00
Aditya Manthramurthy 087c1b98dc
Add tests for OpenID STS creds and add to CI (#13638) 2021-11-11 11:23:30 -08:00
Harshavardhana 5acc8c0134
add multi-site replication tests (#13631) 2021-11-10 18:18:09 -08:00
Aditya Manthramurthy 1946922de3
Add CI for etcd IAM backend (#13614)
Runs when ETCD_SERVER env var is set
2021-11-09 09:25:13 -08:00
Harshavardhana 12e6907512
apply spelling checks for US locale (#13599) 2021-11-07 01:22:59 -08:00
Aditya Manthramurthy 01b9ff54d9
Add LDAP STS tests and workflow for CI (#13576)
Runs LDAP tests with openldap container on GH Actions
2021-11-04 08:16:30 -07:00
Aditya Manthramurthy 79a58e275c
fix: race in delete user functionality (#13547)
- The race happens with a goroutine that refreshes IAM cache data from storage.
- It could lead to deleted users re-appearing as valid live credentials.
- This change also causes CI to run tests without a race flag (in addition to
running it with).
2021-11-01 15:03:07 -07:00
Aditya Manthramurthy 900e584514
CI: Cancel in-progress jobs when a PR is updated (#13552)
- This should lead to faster results as jobs will be queued for shorter periods
when PRs are updated.

- Current behavior is that previously running CI jobs for an updated PR run to
completion needlessly, and cause new CI jobs to be queued.

Ref: https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions#concurrency
2021-11-01 13:42:48 -07:00
Anis Elleuch f1cab828ee
fix: New disks healing should pick unformatted disks as well (#13054)
A recent regression caused new disks not being re-formatted. In the old
code, a disk needed be 'online' to be chosen to be formatted but the
disk has to be already formatted for XL storage IsOnline() function to
return true.

It is enough to check if XL storage is nil or not if we want to avoid
formatting root disks.

Co-authored-by: Anis Elleuch <anis@min.io>
2021-08-24 07:40:56 -07:00
Harshavardhana a75412440f remove flaky healing test 2021-08-23 08:01:36 -07:00
Harshavardhana 202d0b64eb
fix: enable go1.17 github ci/cd (#12997) 2021-08-18 18:35:22 -07:00
Harshavardhana 27e07e80bc
fix: accountInfo() error FS mode (#12734) 2021-07-17 01:17:35 -07:00
Harshavardhana e1870c7b7c
build things separately in separate jobs (#12533) 2021-06-18 12:08:33 -07:00
Harshavardhana cdeccb5510
feat: Deprecate embedded browser and import console (#12460)
This feature also changes the default port where
the browser is running, now the port has moved
to 9001 and it can be configured with

```
--console-address ":9001"
```
2021-06-17 20:27:04 -07:00
Harshavardhana 764721e2c6
add root_disk threshold detection (#12259)
as there is no automatic way to detect if there
is a root disk mounted on / or /var for the container
environments due to how the root disk information
is masked inside overlay root inside container.

this PR brings an environment variable to set
root disk size threshold manually to detect the
root disks in such situations.
2021-05-08 15:40:29 -07:00
Harshavardhana f7a87b30bf Revert "deprecate embedded browser (#12163)"
This reverts commit 736d8cbac4.

Bring contrib files for older contributions
2021-04-30 08:50:39 -07:00
Harshavardhana b3c8a1864f
fix: optimize ListBuckets for anonymous users (#12182)
anonymous users are never allowed to listBuckets(),
we do not need to further validate the policy, we can
simply reject if credentials are empty.
2021-04-28 21:37:02 -07:00
Harshavardhana 736d8cbac4
deprecate embedded browser (#12163)
https://github.com/minio/console takes over the functionality for the
future object browser development

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-27 10:52:12 -07:00
Harshavardhana 434e5c0cfe
allow preserving legacyXLv1 with inline data format (#11951)
current master breaks this important requirement
we need to preserve legacyXLv1 format, this is simply
ignored and overwritten causing a myriad of issues
by leaving stale files on the namespace etc.

for now lets still use the two-phase approach of
writing to `tmp` and then renaming the content to
the actual namespace.
2021-04-01 22:12:03 -07:00
Harshavardhana ec547c0fa8
enable race detector CI for macos-latest (#11715) 2021-03-05 14:16:23 -08:00
Harshavardhana d73d756a80
fix: incorrect errors thrown by lint (#11699)
fixes #11698
2021-03-04 14:27:38 -08:00
Harshavardhana 0b9c17443e
update gopsutil to use the v3 API (#11638) 2021-03-01 00:15:46 -08:00
Harshavardhana 0f5ca83418
move CI to go1.15 and go1.16 (#11570) 2021-02-18 09:42:32 -08:00
Harshavardhana 828602d672 fix: nancy github URL has changed 2021-01-10 13:37:05 -08:00
Anis Elleuch f164085227
xl: Always set root disk to true in test environment (#11094)
Tests environments (go test or manual testing) should always consider
the passed disks are root disks and should not rely on disk.IsRootDisk()
function. The reason is that this latter can return a false negative
when called in a busy system. However, returning a false negative will
only occur in a testing environment and not in a production, so we can
accept this trade-off for now.
2020-12-12 16:10:07 -08:00
Harshavardhana 253194e491
do not hold write locks - if objects don't exist (#10644) 2020-10-08 17:47:21 -07:00
Harshavardhana 23e8390997
fix: Allow Walk to honor load balanced drives (#10610) 2020-10-01 20:24:34 -07:00
Harshavardhana e730da1438
fix: referesh JWKS public keys upon failure (#10368)
fixes #10359
2020-08-28 08:15:12 -07:00
Harshavardhana c8b84a0e9e
Add nancy vulnerability scanner (#10289) 2020-08-19 14:25:21 -07:00
Harshavardhana 1e2ebc9945
feat: time to bring back http2.0 support (#10230)
Bonus move our CI/CD to go1.14
2020-08-10 09:02:29 -07:00
Harshavardhana 4ac31ea82b
fix: find current location of object multi-zones (#9840)
PutObject on multiple-zone with versioning would not
overwrite the correct location of the object if the
object has delete marker, leading to duplicate objects
on two zones.

This PR fixes by adding affinity towards delete marker
when GetObjectInfo() returns error, use the zone index
which has the delete marker.
2020-06-17 08:33:14 -07:00
Justin Hutchings b91040f7fb
Add CodeQL security scanning (#9763)
Manually specify go and javascript for analysis
2020-06-03 11:14:01 -07:00
Harshavardhana e45c90060f
remove references for deprecated dockerfiles and deployment styles (#9675) 2020-05-22 08:40:59 -07:00
Harshavardhana daf4418cbb
add 50m for windows tests (#9614) 2020-05-15 21:07:27 -07:00
Harshavardhana 247795dd36
add github workflow for windows (#9611)
bye, bye travis
2020-05-15 15:54:39 -07:00
Harshavardhana d099039f5d
Add new github workflow (#9480) 2020-04-29 09:17:32 -07:00