Commit Graph

290 Commits

Author SHA1 Message Date
Krishnan Parthasarathi ad8e611098
feat: implement prefix-level versioning exclusion (#14828)
Spark/Hadoop workloads which use Hadoop MR 
Committer v1/v2 algorithm upload objects to a 
temporary prefix in a bucket. These objects are 
'renamed' to a different prefix on Job commit. 
Object storage admins are forced to configure 
separate ILM policies to expire these objects 
and their versions to reclaim space.

Our solution:

This can be avoided by simply marking objects 
under these prefixes to be excluded from versioning, 
as shown below. Consequently, these objects are 
excluded from replication, and don't require ILM 
policies to prune unnecessary versions.

-  MinIO Extension to Bucket Version Configuration
```xml
<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> 
        <Status>Enabled</Status>
        <ExcludeFolders>true</ExcludeFolders>
        <ExcludedPrefixes>
          <Prefix>app1-jobs/*/_temporary/</Prefix>
        </ExcludedPrefixes>
        <ExcludedPrefixes>
          <Prefix>app2-jobs/*/__magic/</Prefix>
        </ExcludedPrefixes>

        <!-- .. up to 10 prefixes in all -->     
</VersioningConfiguration>
```
Note: `ExcludeFolders` excludes all folders in a bucket 
from versioning. This is required to prevent the parent 
folders from accumulating delete markers, especially
those which are shared across spark workloads 
spanning projects/teams.

- To enable version exclusion on a list of prefixes

```
mc version enable --excluded-prefixes "app1-jobs/*/_temporary/,app2-jobs/*/_magic," --exclude-prefix-marker myminio/test
```
2022-05-06 19:05:28 -07:00
Harshavardhana 5a9a898ba2
allow forcibly creating metadata on buckets (#14820)
introduce x-minio-force-create environment variable
to force create a bucket and its metadata as required,
it is useful in some situations when bucket metadata
needs recovery.
2022-04-27 04:44:07 -07:00
Poorna 3a64580663
Add support for site replication healing (#14572)
heal bucket metadata and IAM entries for
sites participating in site replication from
the site with the most updated entry.

Co-authored-by: Harshavardhana <harsha@minio.io>
Co-authored-by: Aditya Manthramurthy <aditya@minio.io>
2022-04-24 02:36:31 -07:00
Poorna 46ba15ab03
Return MethodNotAllowed if force del on replicated bucket (#14505) 2022-03-08 14:28:51 -08:00
Poorna ed3418c046
Refactor replication resync to be an active process (#14266)
When resync is triggered, walk the bucket namespace and
resync objects that are unreplicated. This PR also adds
an API to report resync progress.
2022-02-10 10:16:52 -08:00
Harshavardhana 74faed166a
Add quota usage as part of prometheus metrics (#14222)
Bonus: pass caller context when needed to all bucket metadata handling calls.
2022-01-31 17:27:43 -08:00
Harshavardhana d6dd17a483
make sure to pass groups for all credentials while verifying policies (#14193)
fixes #14180
2022-01-26 21:53:36 -08:00
Klaus Post 0e31cff762
fix: DeleteMultipleObjects to finish even if cancelled + concurrent sets (#14038)
* Process sets concurrently.
* Disconnect context from request.
* Insert context cancellation checks.
* errFileNotFound and errFileVersionNotFound are ok, unless creating delete markers.
2022-01-06 10:47:49 -08:00
Harshavardhana a60ac7ca17
fix: audit log to support object names in multipleObjectNames() handler (#14017) 2022-01-03 01:28:52 -08:00
Harshavardhana f527c708f2
run gofumpt cleanup across code-base (#14015) 2022-01-02 09:15:06 -08:00
Poorna K b42cfcea60
Disallow versioning/replication change in cluster replication setup (#13910) 2021-12-15 10:37:08 -08:00
Harshavardhana 88ad742da0
fix: error handling cases in site-replication (#13901)
- Allow proper SRError to be propagated to
  handlers and converted appropriately.

- Make sure to enable object locking on buckets
  when requested in MakeBucketHook.

- When DNSConfig is enabled attempt to delete it
  first before deleting buckets locally.
2021-12-14 14:09:57 -08:00
Harshavardhana b120bcb60a
validate if cached value is empty before use (#13830)
fixes a crash reproduced while running hadoop tests

```
goroutine 201564 [running]:
github.com/minio/minio/cmd.metaCacheEntries.resolve({0xc0206ab7a0, 0x4, 0xc0015b1908}, 0xc0212a7040)
	github.com/minio/minio/cmd/metacache-entries.go:352 +0x58a
```

Bonus: HeadBucket() should always provide content-type
2021-12-06 02:59:51 -08:00
Aditya Manthramurthy 4ce6d35e30
Add new `site` config sub-system intended to replace `region` (#13672)
- New sub-system has "region" and "name" fields.

- `region` subsystem is marked as deprecated, however still works, unless the
new region parameter under `site` is set - in this case, the region subsystem is
ignored. `region` subsystem is hidden from top-level help (i.e. from `mc admin
config set myminio`), but appears when specifically requested (i.e. with `mc
admin config set myminio region`).

- MINIO_REGION, MINIO_REGION_NAME are supported as legacy environment variables for server region.

- Adds MINIO_SITE_REGION as the current environment variable to configure the
server region and MINIO_SITE_NAME for the site name.
2021-11-25 13:06:25 -08:00
Anis Elleuch 55d4cdd464
multi-delete: Avoid empty Delete tag in the response (#13725)
When removing an object fails, such as when it is WORM protected, a
wrong <Delete> will still be in the response. This commit fixes it.
2021-11-24 10:01:07 -08:00
Harshavardhana 914bfb2d9c
fix: allow compaction on replicated buckets (#13711)
currently getReplicationConfig() failure incorrectly
returns error on unexpected buckets upon upgrade, we
should always calculate usage as much as possible.
2021-11-19 14:46:14 -08:00
Harshavardhana 661b263e77
add gocritic/ruleguard checks back again, cleanup code. (#13665)
- remove some duplicated code
- reported a bug, separately fixed in #13664
- using strings.ReplaceAll() when needed
- using filepath.ToSlash() use when needed
- remove all non-Go style comments from the codebase

Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>
2021-11-16 09:28:29 -08:00
Harshavardhana 4545ecad58
ignore swapped drives instead of throwing errors (#13655)
- add checks such that swapped disks are detected
  and ignored - never used for normal operations.

- implement `unrecognizedDisk` to be ignored with
  all operations returning `errDiskNotFound`.

- also add checks such that we do not load unexpected
  disks while connecting automatically.

- additionally humanize the values when printing the errors.

Bonus: fixes handling of non-quorum situations in
getLatestFileInfo(), that does not work when 2 drives
are down, currently this function would return errors
incorrectly.
2021-11-15 09:46:55 -08:00
Harshavardhana 44e4bdc6f4
restrict multi object delete > 1000 objects (#13454)
AWS S3 returns error if > 1000 objects are sent
per MultiObject delete request, we should comply
no reason to not comply.
2021-10-18 08:38:33 -07:00
Aditya Manthramurthy 3a7c79e2c7
Add new site replication feature (#13311)
This change allows a set of MinIO sites (clusters) to be configured 
for mutual replication of all buckets (including bucket policies, tags, 
object-lock configuration and bucket encryption), IAM policies, 
LDAP service accounts and LDAP STS accounts.
2021-10-06 16:36:31 -07:00
Klaus Post 421160631a
MakeBucket: Delete leftover buckets on error (#13368)
In (erasureServerPools).MakeBucketWithLocation deletes the created 
buckets if any set returns an error.

Add `NoRecreate` option, which will not recreate the bucket 
in `DeleteBucket`, if the operation fails.

Additionally use context.Background() for operations we always want to be performed.
2021-10-06 10:24:40 -07:00
Poorna Krishnamoorthy 806b10b934
fix: improve error messages returned during replication setup (#13261) 2021-09-21 13:03:20 -07:00
Harshavardhana 50a68a1791
allow S3 gateway to support object locked buckets (#13257)
- Supports object locked buckets that require
  PutObject() to set content-md5 always.
- Use SSE-S3 when S3 gateway is being used instead
  of SSE-KMS for auto-encryption.
2021-09-21 09:02:15 -07:00
Harshavardhana 4d84f0f6f0
fix: support existing folders in single drive mode (#13254)
This PR however also proceeds to simplify the loading
of various subsystems such as

- globalNotificationSys
- globalTargetSys

converge them directly into single bucket metadata sys
loader, once that is loaded automatically every other
target should be loaded and configured properly.

fixes #13252
2021-09-20 17:41:01 -07:00
Poorna Krishnamoorthy c4373ef290
Add support for multi site replication (#12880) 2021-09-18 13:31:35 -07:00
Harshavardhana 6d42569ade
remove ListBucketsMetadata instead add them to AccountInfo() (#13241) 2021-09-17 15:02:21 -07:00
Harshavardhana 45bcf73185
feat: Add ListBucketsWithMetadata extension API (#13219) 2021-09-16 09:52:41 -07:00
Poorna Krishnamoorthy 78dc08bdc2
remove s3:ReplicateDelete permission check from DeleteObject APIs (#13220) 2021-09-15 23:02:16 -07:00
Harshavardhana ef4d023c85
fix: various performance improvements to tiering (#12965)
- deletes should always Sweep() for tiering at the
  end and does not need an extra getObjectInfo() call
- puts, copy and multipart writes should conditionally
  do getObjectInfo() when tiering targets are configured
- introduce 'TransitionedObject' struct for ease of usage
  and understanding.
- multiple-pools optimization deletes don't need to hold
  read locks verifying objects across namespace and pools.
2021-08-17 07:50:00 -07:00
Harshavardhana f9ae71fd17
fix: deleteMultiObjects performance regression (#12951)
fixes performance regression found in deleteObjects(),
putObject(), copyObject and completeMultipart calls.
2021-08-12 18:57:37 -07:00
Harshavardhana 8f2a3efa85
disallow sub-credentials based on root credentials to gain priviledges (#12947)
This happens because of a change added where any sub-credential
with parentUser == rootCredential i.e (MINIO_ROOT_USER) will
always be an owner, you cannot generate credentials with lower
session policy to restrict their access.

This doesn't affect user service accounts created with regular
users, LDAP or OpenID
2021-08-12 18:07:08 -07:00
Harshavardhana a2cd3c9a1d
use ParseForm() to allow query param lookups once (#12900)
```
cpu: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
BenchmarkURLQueryForm
BenchmarkURLQueryForm-4         247099363                4.809 ns/op           0 B/op          0 allocs/op
BenchmarkURLQuery
BenchmarkURLQuery-4              2517624               462.1 ns/op           432 B/op          4 allocs/op
PASS
ok      github.com/minio/minio/cmd      3.848s
```
2021-08-07 22:43:01 -07:00
Harshavardhana cdeccb5510
feat: Deprecate embedded browser and import console (#12460)
This feature also changes the default port where
the browser is running, now the port has moved
to 9001 and it can be configured with

```
--console-address ":9001"
```
2021-06-17 20:27:04 -07:00
Poorna Krishnamoorthy dbea8d2ee0
Add support for existing object replication. (#12109)
Also adding an API to allow resyncing replication when
existing object replication is enabled and the remote target
is entirely lost. With the `mc replicate reset` command, the
objects that are eligible for replication as per the replication
config will be resynced to target if existing object replication
is enabled on the rule.
2021-06-01 19:59:11 -07:00
Harshavardhana 1f262daf6f
rename all remaining packages to internal/ (#12418)
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`
2021-06-01 14:59:40 -07:00
Harshavardhana fdc2020b10
move to iam, bucket policy from minio/pkg (#12400) 2021-05-29 21:16:42 -07:00
Harshavardhana 5b18c57a54
fix: for deleteBucket delete on dnsStore first (#12298)
attempt a delete on remote DNS store first before
attempting locally, because removing at DNS store
is cheaper than deleting locally, in case of
errors locally we can cheaply recreate the
bucket on dnsStore instead of.
2021-05-14 12:40:54 -07:00
Andreas Auernhammer a1f70b106f
sse: add support for SSE-KMS bucket configurations (#12295)
This commit adds support for SSE-KMS bucket configurations.
Before, the MinIO server did not support SSE-KMS, and therefore,
it was not possible to specify an SSE-KMS bucket config.

Now, this is possible. For example:
```
mc encrypt set sse-kms some-key <alias>/my-bucket
```

Further, this commit fixes an issue caused by not supporting
SSE-KMS bucket configuration and switching to SSE-KMS as default
SSE method.

Before, the server just checked whether an SSE bucket config was
present (not which type of SSE config) and applied the default
SSE method (which was switched from SSE-S3 to SSE-KMS).

This caused objects to get encrypted with SSE-KMS even though a
SSE-S3 bucket config was present.

This issue is fixed as a side-effect of this commit.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-14 00:59:05 -07:00
Andreas Auernhammer d8eb7d3e15
kms: replace KES client implementation with minio/kes (#12207)
This commit replaces the custom KES client implementation
with the KES SDK from https://github.com/minio/kes

The SDK supports multi-server client load-balancing and
requests retry out of the box. Therefore, this change reduces
the overall complexity within the MinIO server and there
is no need to maintain two separate client implementations.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-10 18:15:11 -07:00
Andreas Auernhammer af0c65be93
add SSE-KMS support and use SSE-KMS for auto encryption (#12237)
This commit adds basic SSE-KMS support.
Now, a client can specify the SSE-KMS headers
(algorithm, optional key-id, optional context)
such that the object gets encrypted using the
SSE-KMS method. Further, auto-encryption now
defaults to SSE-KMS.

This commit does not try to do any refactoring
and instead tries to implement SSE-KMS as a minimal
change to the code base. However, refactoring the entire
crypto-related code is planned - but needs a separate
effort.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
2021-05-06 15:24:01 -07:00
Harshavardhana 0eeb0a4e04 Revert "add SSE-KMS support and use SSE-KMS for auto encryption (#11767)"
This reverts commit 26f1fcab7d.
2021-05-05 15:20:46 -07:00
Andreas Auernhammer 26f1fcab7d
add SSE-KMS support and use SSE-KMS for auto encryption (#11767)
This commit adds basic SSE-KMS support.
Now, a client can specify the SSE-KMS headers
(algorithm, optional key-id, optional context)
such that the object gets encrypted using the
SSE-KMS method. Further, auto-encryption now
defaults to SSE-KMS.

This commit does not try to do any refactoring
and instead tries to implement SSE-KMS as a minimal
change to the code base. However, refactoring the entire
crypto-related code is planned - but needs a separate
effort.

Signed-off-by: Andreas Auernhammer <aead@mail.de>
Co-authored-by: Klaus Post <klauspost@gmail.com>
2021-05-05 11:24:14 -07:00
Harshavardhana f7a87b30bf Revert "deprecate embedded browser (#12163)"
This reverts commit 736d8cbac4.

Bring contrib files for older contributions
2021-04-30 08:50:39 -07:00
Harshavardhana b3c8a1864f
fix: optimize ListBuckets for anonymous users (#12182)
anonymous users are never allowed to listBuckets(),
we do not need to further validate the policy, we can
simply reject if credentials are empty.
2021-04-28 21:37:02 -07:00
Harshavardhana 736d8cbac4
deprecate embedded browser (#12163)
https://github.com/minio/console takes over the functionality for the
future object browser development

Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-27 10:52:12 -07:00
Krishnan Parthasarathi c829e3a13b Support for remote tier management (#12090)
With this change, MinIO's ILM supports transitioning objects to a remote tier.
This change includes support for Azure Blob Storage, AWS S3 compatible object
storage incl. MinIO and Google Cloud Storage as remote tier storage backends.

Some new additions include:

 - Admin APIs remote tier configuration management

 - Simple journal to track remote objects to be 'collected'
   This is used by object API handlers which 'mutate' object versions by
   overwriting/replacing content (Put/CopyObject) or removing the version
   itself (e.g DeleteObjectVersion).

 - Rework of previous ILM transition to fit the new model
   In the new model, a storage class (a.k.a remote tier) is defined by the
   'remote' object storage type (one of s3, azure, GCS), bucket name and a
   prefix.

* Fixed bugs, review comments, and more unit-tests

- Leverage inline small object feature
- Migrate legacy objects to the latest object format before transitioning
- Fix restore to particular version if specified
- Extend SharedDataDirCount to handle transitioned and restored objects
- Restore-object should accept version-id for version-suspended bucket (#12091)
- Check if remote tier creds have sufficient permissions
- Bonus minor fixes to existing error messages

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
Co-authored-by: Krishna Srinivas <krishna@minio.io>
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Harshavardhana 069432566f update license change for MinIO
Signed-off-by: Harshavardhana <harsha@minio.io>
2021-04-23 11:58:53 -07:00
Harshavardhana a334554f99
fix: add helper for expected path.Clean behavior (#12068)
current usage of path.Clean returns "." for empty strings
instead we need `""` string as-is, make relevant changes
as needed.
2021-04-15 16:32:13 -07:00
Poorna Krishnamoorthy 40409437cd
Add initial usage in GetBucketReplicationMetrics API (#11985) 2021-04-06 11:32:52 -07:00
Harshavardhana 09ee303244
add cluster support for realtime bucket stats (#11963)
implementation in #11949 only catered from single
node, but we need cluster metrics by capturing
from all peers. introduce bucket stats API that
will be used for capturing in-line bucket usage
as well eventually
2021-04-04 15:34:33 -07:00