Commit Graph

109 Commits

Author SHA1 Message Date
Klaus Post 6abe4128d7
Fix ILM expire workers exiting (#20578)
Fix expire workers exiting

Under 2 conditions ILM expire workers would exit, eventually causing all workers to terminate.
2024-10-23 08:35:37 -07:00
Klaus Post ed5ed7e490
Trace ILM errors (#20576)
Some paths would attempt transitions but in case of failures 
no traces would be emitted.

Add traces (with errors) when transition operations fail.
2024-10-22 14:10:34 -07:00
shandongzhejiang a8ff12bc72
chore: fix some comments (#20294)
Signed-off-by: shandongzhejiang <shandongzhejiang@icloud.com>
2024-08-21 13:14:24 -07:00
Harshavardhana e7a56f35b9
flatten out audit tags, do not send as free-form (#20256)
move away from map[string]interface{} to map[string]string
to simplify the audit, and also provide concise information.

avoids large allocations under load(), reduces the amount
of audit information generated, as the current implementation
was a bit free-form. instead all datastructures must be
flattened.
2024-08-13 15:22:04 -07:00
Klaus Post a2cab02554
Fix SSE-C checksums (#19896)
Compression will be disabled by default if SSE-C is specified. So we can still honor SSE-C.
2024-06-10 08:31:51 -07:00
Krishnan Parthasarathi 069c4015cd
Don't tier directory objects (#19891)
Directory objects are used by applications that simulate the folder
structure of an on-disk filesystem. These are zero-byte objects with names
ending with '/'. They are only used to check whether a 'folder' exists in
the namespace.
2024-06-07 08:43:17 -07:00
Klaus Post 0a63dc199c
Add trace sizes to more trace types (#19864)
Add trace sizes to

* ILM traces
* Replication traces
* Healing traces
* Decommission traces
* Rebalance traces
* (s)ftp traces
* http traces.
2024-06-03 08:45:54 -07:00
Harshavardhana aad50579ba
fix: wire up ILM sub-system properly for help (#19836) 2024-05-30 01:14:58 -07:00
Aditya Manthramurthy 5f78691fcf
ldap: Add user DN attributes list config param (#19758)
This change uses the updated ldap library in minio/pkg (bumped
up to v3). A new config parameter is added for LDAP configuration to
specify extra user attributes to load from the LDAP server and to store
them as additional claims for the user.

A test is added in sts_handlers.go that shows how to access the LDAP
attributes as a claim.

This is in preparation for adding SSH pubkey authentication to MinIO's SFTP
integration.
2024-05-24 16:05:23 -07:00
Harshavardhana 443c93c634
compute time spent in ILM properly (#19806) 2024-05-24 12:28:51 -07:00
Harshavardhana f65dd3e5a2
reload from drive tier-config when in-memory cache is not found (#19527)
avoid probing tier target while reloading() tier config
2024-04-16 22:09:58 -07:00
Anis Eleuch 95bf4a57b6
logging: Add subsystem to log API (#19002)
Create new code paths for multiple subsystems in the code. This will
make maintaing this easier later.

Also introduce bugLogIf() for errors that should not happen in the first
place.
2024-04-04 05:04:40 -07:00
Krishnan Parthasarathi 383489d5d9
Handle zero versions qualified for expiration (#19301)
When objects have more versions than their ILM policy expects to retain
via NewerNoncurrentVersions, but they don't qualify for expiry due to
NoncurrentDays are configured in that rule. 

In this case, applyNewerNoncurrentVersionsLimit method was enqueuing empty 
tasks, which lead to a panic (panic: runtime error: index out of range [0] with
length 0) in newerNoncurrentTask.OpHash method, which assumes the task
to contain at least one version to expire.
2024-03-19 20:10:58 -07:00
Krishnan Parthasarathi b69bcdcdc4
Fix ilm config at startup (#19189)
Remove api.expiration_workers config setting which was inadvertently left behind. Per review comment 

https://github.com/minio/minio/pull/18926, expiration_workers can be configured via ilm.expiration_workers.
2024-03-04 18:50:24 -08:00
Krishnan Parthasarathi a7577da768
Improve expiration of tiered objects (#18926)
- Use a shared worker pool for all ILM expiry tasks
- Free version cleanup executes in a separate goroutine
- Add a free version only if removing the remote object fails
- Add ILM expiry metrics to the node namespace
- Move tier journal tasks to expiryState
- Remove unused on-disk journal for tiered objects pending deletion
- Distribute expiry tasks across workers such that the expiry of versions of
  the same object serialized
- Ability to resize worker pool without server restart
- Make scaling down of expiryState workers' concurrency safe; Thanks
  @klauspost
- Add error logs when expiryState and transition state are not
  initialized (yet)
* metrics: Add missed tier journal entry tasks
* Initialize the ILM worker pool after the object layer
2024-03-01 21:11:03 -08:00
Harshavardhana 9a012a53ef
initialize the disk healer early on (#19143)
This PR fixes a bug that perhaps has been long introduced,
with no visible workarounds. In any deployment, if an entire
erasure set is deleted, there is no way the cluster recovers.
2024-02-27 23:02:14 -08:00
Krishnan Parthasarathi ee158e1610
ilm: Update action count only on success (#19093)
It also fixes a long-standing bug in expiring transitioned objects.
The expiration action was deleting the current version in the case'
of tiered objects instead of adding a delete marker.
2024-02-22 15:00:32 -08:00
Harshavardhana 1d3bd02089
avoid close 'nil' panics if any (#18890)
brings a generic implementation that
prints a stack trace for 'nil' channel
closes(), if not safely closes it.
2024-01-28 10:04:17 -08:00
Krishnan Parthasarathi 56b7045c20
Export tier metrics (#18678)
minio_node_tier_ttlb_seconds - Distribution of time to last byte for streaming objects from warm tier
minio_node_tier_requests_success - Number of requests to download object from warm tier that were successful
minio_node_tier_requests_failure - Number of requests to download object from warm tier that failed
2023-12-20 20:13:40 -08:00
Harshavardhana e98172d72d
avoid hot-tier SLA to be tied to warm-tier SLA (#18581)
it is okay if the warm-tier cannot keep up, we should continue
to take I/O at hot-tier, only fail hot-tier or block it when
we are disk full.

Bonus: add metrics counter for these missed tasks, we will
know for sure if one of the node is lagging behind or is
losing too many tasks during transitioning.
2023-12-02 13:02:12 -08:00
Harshavardhana 506f121576
remove frivolous logging in transition object (#18526)
AWS S3 closes keep-alive connections frequently
leading to frivolous logs filling up the MinIO
logs when the transition tier is an AWS S3 bucket.

Ignore such transient errors, let MinIO retry
it when it can.
2023-11-26 22:18:09 -08:00
Krishnan Parthasarathi a93214ea63
ilm: ObjectSizeLessThan and ObjectSizeGreaterThan (#18500) 2023-11-22 13:42:39 -08:00
Anis Eleuch 1bb7a2a295
Immediate transition ILM to avoid quick deferring to the scanner (#18475)
Immediate transition use case and is mostly used to fill warm
backend with a lot of data when a new deployment is created

Currently, if the transition queue is complete, the transition will be
deferred to the scanner; change this behavior by blocking the PUT request
until the transition queue has a new place for a transition task.
2023-11-17 16:16:46 -08:00
Harshavardhana 80adc87a14
converge WARM tier object name to hash of deployment+bucket (#18410)
this is to ensure that we can converge and save IOPs
when hot-tier accesses MinIO.
2023-11-10 02:15:13 -08:00
Klaus Post 7926df0b80
Fix globalDeploymentID race (#18275)
globalDeploymentID was being read while it was being set.

Fixes race:

```
WARNING: DATA RACE
Write at 0x0000079605a0 by main goroutine:
  github.com/minio/minio/cmd.connectLoadInitFormats()
      github.com/minio/minio/cmd/prepare-storage.go:269 +0x14f0
  github.com/minio/minio/cmd.waitForFormatErasure()
      github.com/minio/minio/cmd/prepare-storage.go:294 +0x21d
...

Previous read at 0x0000079605a0 by goroutine 105:
  github.com/minio/minio/cmd.newContext()
      github.com/minio/minio/cmd/utils.go:817 +0x31e
  github.com/minio/minio/cmd.adminMiddleware.func1()
      github.com/minio/minio/cmd/admin-router.go:110 +0x96
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
  github.com/minio/minio/cmd.setBucketForwardingMiddleware.func1()
      github.com/minio/minio/cmd/generic-handlers.go:460 +0xb1a
  net/http.HandlerFunc.ServeHTTP()
      net/http/server.go:2136 +0x47
...
```
2023-10-18 08:06:57 -07:00
Aditya Manthramurthy 1c99fb106c
Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Krishnan Parthasarathi 53abd25116
Don't log when object to be tiered is not found (#17924) 2023-08-25 23:34:16 -07:00
Anis Eleuch 1664fd8bb1
Avoid logging errors twice during transitioned objects expiration (#17782) 2023-08-02 09:06:03 -07:00
Klaus Post ff5988f4e0
Reduce allocations (#17584)
* Reduce allocations

* Add stringsHasPrefixFold which can compare string prefixes, while ignoring case and not allocating.
* Reuse all msgp.Readers
* Reuse metadata buffers when not reading data.

* Make type safe. Make buffer 4K instead of 8.

* Unslice
2023-07-06 16:02:08 -07:00
Aditya Manthramurthy 5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Krishnan Parthasarathi 62df731006
Add updatedAt for GetBucketLifecycleConfig (#17271) 2023-05-24 22:52:39 -07:00
Krishnan Parthasarathi 3e128c116e
Add lifecycle event source to audit log tags (#17248) 2023-05-22 15:28:56 -07:00
jiuker fd2959fa3a
fix: workers.New err must be returned (#17208) 2023-05-16 08:08:00 -07:00
Krishnan Parthasarathi 0ec722bc54
Add tags to NewerNoncurrentVersions audit event (#17110) 2023-05-02 12:56:33 -07:00
Krishnan Parthasarathi e7cac8acef
Add tags to auditLogLifecycle (#17081) 2023-04-26 17:49:00 -07:00
Praveen raj Mani 72802a5972
Use 'minio/pkg/sync/errgroup' and 'minio/pkg/workers' (#17069) 2023-04-25 22:57:40 -07:00
Poorna cd6dec49c0
Add trace support for ilm activity (#16993) 2023-04-11 19:22:32 -07:00
Harshavardhana c06e0bfef9
set correct `Host:` value for replication event notification (#16984) 2023-04-06 10:20:53 -07:00
Harshavardhana 7a6c4e438e
allow more workers for ILM expiration (#16924) 2023-03-30 10:47:15 -07:00
Harshavardhana b66d7dc708
add missing x-amz-id-2 to event notification date (#16646) 2023-02-20 15:41:47 +05:30
Krishnan Parthasarathi d136ac0596
Don't close transition task channel on server exit (#16627) 2023-02-15 22:09:25 -08:00
Krishnan Parthasarathi 9de26531e4
tiering: UpdateWorkers may be called before Init (#16573) 2023-02-08 19:13:34 -08:00
Krishnan Parthasarathi 990fc415f7
Ensure safety of transitionState at startup (#16563) 2023-02-07 23:11:42 -08:00
Krishnan Parthasarathi cea2ca8c8e
Add restore-status header for multipart objects (#16508) 2023-01-31 07:53:45 +05:30
Klaus Post a713aee3d5
Run staticcheck on CI (#16170) 2022-12-05 11:18:50 -08:00
Krishnan Parthasarathi 6eef9b4a23
lifecycle: simplify Eval and HasActiveRules (#16036) 2022-11-10 07:17:45 -08:00
Harshavardhana 23b329b9df
remove gateway completely (#15929) 2022-10-24 17:44:15 -07:00
Anis Elleuch ac85c2af76
lifecycle: refactor rules filtering and tagging support (#15914) 2022-10-21 10:46:53 -07:00
Harshavardhana 228c6686f8
allow non-standards fallback for all http.TimeFormats (#15662)
fixes #15645
2022-09-07 07:24:54 -07:00
Krishnan Parthasarathi 3a1d3a7952
audit-log: Add time to get/restore object from remote-tier (#15602) 2022-08-29 21:33:59 -07:00