Commit Graph

5497 Commits

Author SHA1 Message Date
Praveen raj Mani 0285df5a02
fix: prioritize audit_webhook and logger_webhook ENVs over the config KVS (#17783) 2023-08-03 02:47:07 -07:00
Harshavardhana 45fb375c41
allow healing to prefer local disks over remote (#17788) 2023-08-03 02:18:18 -07:00
Harshavardhana 4a4950fe41
fix: honor requested allow origin settings properly (#17789)
fixes #17778
2023-08-02 20:41:21 -07:00
Anis Eleuch 1664fd8bb1
Avoid logging errors twice during transitioned objects expiration (#17782) 2023-08-02 09:06:03 -07:00
Harshavardhana 21cdd2bf5d
avoid overwriting metrics on success, save it in defer (#17780) 2023-08-01 22:19:56 -07:00
Harshavardhana 0153f96a20
add deadlines for readMetadata() in listing (#17776)
Bonus: also skip spending time looking for xl.json

- Listing()
- Delete()
2023-08-01 21:52:31 -07:00
Harshavardhana a7a7533190
add new errors for Disks with timeouts (#17770) 2023-08-01 12:47:50 -07:00
Poorna 311380f8cb
replication resync: fix queueing (#17775)
Assign resync of all versions of object to the same worker to avoid locking
contention. Fixes parallel resync implementation in #16707
2023-08-01 11:51:15 -07:00
Harshavardhana b0f0e53bba
fix: make sure to correctly initialize health checks (#17765)
health checks were missing for drives replaced since

- HealFormat() would replace the drives without a health check
- disconnected drives when they reconnect via connectEndpoint()
  the loop also loses health checks for local disks and merges
  these into a single code.
- other than this separate cleanUp, health check variables to avoid
  overloading them with similar requirements.
- also ensure that we compete via context selector for disk monitoring
  such that the canceled disks don't linger around longer waiting for
  the ticker to trigger.
- allow disabling active monitoring.
2023-08-01 10:54:26 -07:00
Klaus Post 004f1e2f66
Fix trailing header signature mismatch (#17774)
Seems like clients may omit a newline at the end of the trailer chunk. Each header should end with a newline. Add that if missing.

Fixes #17662
2023-08-01 08:45:57 -07:00
Harshavardhana 2fa561f22e
do not crash on invalid metric values (#17764)
```
minio[1032735]: panic: label value "\xc0.\xc0." is not valid UTF-8
minio[1032735]: goroutine 1781101 [running]:
minio[1032735]: github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
```

log such errors for investigation
2023-08-01 00:55:39 -07:00
Harshavardhana 81be718674
fix: optimize DiskInfo() call avoid metrics when not needed (#17763) 2023-07-31 15:20:48 -07:00
Sho Ce 49a1e2f98e
update-notifier.go: misleading version age message (#17750) 2023-07-31 08:36:19 -07:00
Klaus Post 684c46369c
Send events for extracted objects (#17760)
Fixes #17759
2023-07-31 08:33:51 -07:00
Harshavardhana 73edd5b8fd
introduce 'mc admin config set alias/ api odirect=on' (#17753)
change disable_odirect=off -> odirect=on to make it
easier to understand, instead of making it double
negative.
2023-07-31 00:12:53 -07:00
Harshavardhana 5e5bdf5432
capture total errors data availability and any timeout errors (#17748) 2023-07-29 23:26:26 -07:00
Harshavardhana f13cfcb83e
allow disabling O_DIRECT for write ops (#17751)
on really slow systems, O_DIRECT simply kills the drives
allow for a way to disable them.
2023-07-29 15:17:56 -07:00
Harshavardhana 731e03fe5a
add ReadFileStream deadline for disk call (#17745)
timeout the reader side if hung via disk max timeout
2023-07-28 15:37:53 -07:00
Anis Eleuch 7057d00a28
s3: Return invalid bucket name the first thing in all S3 calls (#17742) 2023-07-28 10:49:20 -07:00
Harshavardhana 114fab4c70
export cluster health as prometheus metrics (#17741) 2023-07-28 01:16:53 -07:00
ruspaul013 a92cb66468
Get the signed headers in the order they were signed (#17690)
use pSignValues to get signed headers in order
2023-07-27 11:45:30 -07:00
ruspaul013 535f97ba61
check if metadata headers/url values are equal with signed headers (#17737) 2023-07-27 11:44:56 -07:00
drivebyer 14ebd82dbd
fix: missing disk metrics when query metric api from peer (#17738) 2023-07-27 11:44:13 -07:00
Harshavardhana 47dcfcbdd4
introduce deadlines on READ operations (#17724) 2023-07-27 07:33:05 -07:00
Krishnan Parthasarathi bf3901342c
Include SuccessorModTime for FileInfo quorum (#17732) 2023-07-26 17:04:16 -07:00
Harshavardhana b28bcad11b
avoid Access() calls on known bucket paths (#17719) 2023-07-26 11:31:40 -07:00
Harshavardhana a7c71e4c6b
protect disk monitoring to avoid busy loop configuration (#17723) 2023-07-25 20:02:22 -07:00
Poorna 1a42693d68
replication: limit larger uploads to a subset of workers (#17687)
Limit large uploads (> 128MiB)  to a max of 10 workers, intent is to avoid
larger uploads from using all replication bandwidth, giving room for smaller
uploads to sync faster.
2023-07-25 20:02:02 -07:00
Harshavardhana e7b60c4d65
Add slow drive timeouts to match with active disk monitoring (#17701)
allow active disk-monitoring to be configurable, and use
these add deadlines in various call layers for various
syscalls.
2023-07-25 16:58:31 -07:00
Poorna f95129894d
Use decrypted object size while computing object size summary (#17717)
Corrects an issue with encrypted versioned objects being reported under
`unversioned` bin in the object version histogram
2023-07-24 17:13:25 -07:00
Harshavardhana c32c71c836
allow DNS cache TTL to be configurable (#17709)
this is added for now as a hidden variable
2023-07-24 15:13:35 -07:00
Harshavardhana 14e1ace552
remove serializing WalkDir() across all buckets/prefixes on SSDs (#17707)
slower drives get knocked off because they are too slow via 
active monitoring, we do not need to block calls arbitrarily.

Serializing adds latencies for already slow calls, remove
it for SSDs/NVMEs

Also, add a selection with context when writing to `out <-`
channel, to avoid any potential blocks.
2023-07-24 09:30:19 -07:00
drivebyer a7fb3a3853
fix: Create metrics slice when necessary in getCacheMetrics() (#17711) 2023-07-24 08:40:21 -07:00
Klaus Post 2da4bd5f1a
Revert "don't error when asked for 0-based range on empty objects (#17708) (#17713)
Revert "don't error when asked for 0-based range on empty objects (#17708)"

This reverts commit 7e76d66184.

There is no valid way to specify offsets in a 0-byte file. Blame it on the [RFC](https://datatracker.ietf.org/doc/html/rfc7233#section-4.4)

> The 416 (Range Not Satisfiable) status code indicates that none of the ranges in the 
> request's Range header field (Section 3.1) overlap the current extent of the selected resource...

A request for "bytes=0-" is a request for the first byte of a resource. If the resource is 0-length, 
the range [0,0] does not overlap the resource content and the server responds with an error.
2023-07-24 07:56:28 -07:00
flisk 7e76d66184
don't error when asked for 0-based range on empty objects (#17708)
In a reverse proxying setup, a proxy in front of MinIO may attempt to
request objects in slices for enhanced cache efficiency. Since such a
a proxy cannot have prior knowledge of how large a requested resource is,
it usually sends a header of the form:

        Range: 0-$slice_size

... and, depending on the size of the resource, expects either:

- an empty response, if $resource_size == 0
- a full response, if $resource_size <= $slice_size
- a partial response, if $resource_size > $slice_size

Prior to this change, MinIO would respond 416 Range Not Satisfiable if a
client tried to request a range on an empty resource. This behavior is
technically consistent with RFC9110[1] – However, it renders sliced
reverse proxying, such as implemented in Nginx, broken in the case of
empty files. Nginx itself seems to break this convention to enable
"useful" responses in these cases, and MinIO should probably do that
too.

[1]: https://www.rfc-editor.org/rfc/rfc9110#byte.ranges
2023-07-23 00:10:03 -07:00
Harshavardhana 7764f4a8e3
return tags as part of Head/Get calls (#17635)
AWS S3 only returns the number of tag
counts, along with that we must return
the tags as well to avoid another metadata
call to the server.
2023-07-22 07:19:43 -07:00
Kaan Kabalak 6624f970c0
Fix spelling of 'already' across repository (#17703) 2023-07-21 08:45:08 -07:00
Harshavardhana 331bdc2245
fix: remove CompleteMultipartUpload() 200 OK response for blocking calls (#17699)
sending whitespace character with CompleteMultipartUpload()
with 200 OK was an AWS S3 compatible implementation detail,
and it was expected that the client SDK must look for both
successful XML as well as error XML for 200 OK.

But this is not useful anymore on MinIO, since we do not
have any large delayed coalescing of parts anymore.
2023-07-20 22:14:38 -07:00
Harshavardhana e12ab486a2
avoid using os.Getenv for internal code, use env.Get() instead (#17688) 2023-07-20 07:52:49 -07:00
Krishnan Parthasarathi 9eeee92d36
Add deletemarker_total metric (#17689) 2023-07-20 07:52:32 -07:00
Anis Eleuch 756d6aa729
fix: report correct pool/set/disk indexes for offline disks (#17695) 2023-07-20 07:48:21 -07:00
Harshavardhana bddd53d6d2
fix: retry listing in decommissioning if it fails perpetually (#17682) 2023-07-19 13:09:37 -07:00
jiuker a99cd825ab
fix: byHost realTime metrics API (#17681) 2023-07-18 23:50:30 -07:00
Harshavardhana 6426b74770
move bucket centric metrics to /minio/v2/metrics/bucket handlers (#17663)
users/customers do not have a reasonable number of buckets anymore,
this is why we must avoid overpopulating cluster endpoints, instead
move the bucket monitoring to a separate endpoint.

some of it's a breaking change here for a couple of metrics, but
it is imperative that we do it to improve the responsiveness of
our Prometheus cluster endpoint.

Bonus: Added new cluster metrics for usage, objects and histograms
2023-07-18 22:25:12 -07:00
Harshavardhana 4f257bf1e6
pick internode interface properly via globalLocalNodeName (#17680)
current code will not pick the right interface name
if --address or --interface is not provided.
2023-07-18 19:18:11 -07:00
Krishnan Parthasarathi 0120ff93bc
admin-info: add DeleteMarkers count (#17659) 2023-07-18 10:49:40 -07:00
Anis Eleuch 49638fa533
s3: Delete Bucket should not recreate bucket if it does not exist (#17676)
Also return Bucket Not Found error in the same use case.
2023-07-18 09:32:19 -07:00
Shubhendu 7a3a7b19e5
Added a start script to inspect command output (#17591)
Using this script, post decrypt we should be able to bring up the
MinIO instance with same configuration.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-07-17 14:15:28 -07:00
Harshavardhana 24e86d0c59
avoid passing around poolIdx, setIdx instead pass the relevant disks (#17660) 2023-07-17 09:52:05 -07:00
jiuker d118031ed6
fix: when Origin: null is set return back '*' for allow origins (#17651) 2023-07-15 12:15:06 -07:00
Anis Eleuch 341a89c00d
return a descriptive error when loading any IAM item fails (#17654)
Sometimes IAM fails to load certain items, which could be a user, 
a service account or a policy but with not enough information for 
us to debug.

This commit will create a more descriptive error to make it easier to
debug in such situations.
2023-07-14 20:17:14 -07:00
Anis Eleuch df29d25e6b
return different status code for internode communication (#17655)
mc admin trace -a will be able to quickly show
401 Unauthorized header to pinpoint trivial issues
between nodes, such as wrong root 
credentials and skewed time.
2023-07-14 18:34:55 -07:00
Harshavardhana 3e196fa7b3
fix: ILM newer noncurrent version limit must return correct versions (#17652)
objects/versions that are not expired via NewerNoncurrentVersions
must be properly returned to be applied under further ILM actions.

this would cause legitimately expired objects to be missed
from expiration.
2023-07-14 16:42:35 -07:00
drivebyer 04c792476f
fix: provide a possible slice cap for heal failed metrics items (#17647)
Signed-off-by: Wu <yang.wu@daocloud.io>
2023-07-14 11:02:45 -07:00
Harshavardhana 005a4a275a
add more bootstrap messages to provide latency (#17650)
- simplify refreshing bucket metadata, wait() to
  depend on how fast the bucket metadata can load.

- simplify resync to start resync in single pass.
2023-07-14 04:00:29 -07:00
Harshavardhana bdddf597f6
shuffle buckets randomly before being scanned (#17644)
this randomness is needed to avoid scanning
the same buckets across different erasure sets,
in the same order.

allow random buckets to be scanned instead
allowing a wider spread of ILM, replication
checks.

Additionally do not loop over twice to fill
the channel, fill the channel regardless of
having bucket new or old.
2023-07-14 02:25:40 -07:00
Aditya Manthramurthy bb6921bf9c
Send AuditLog via new middleware fn for admin APIs (#17632)
A new middleware function is added for admin handlers, including options
for modifying certain behaviors. This admin middleware:

- sets the handler context via reflection in the request and sends AuditLog
- checks for object API availability (skipping it if a flag is passed)
- enables gzip compression (skipping it if a flag is passed)
- enables header tracing (adding body tracing if a flag is passed)

While the new function is a middleware, due to the flags used for
conditional behavior modification, which is used in each route registration
call.

To try to ensure that no regressions are introduced, the following
changes were done mechanically mostly with `sed` and regexp:

- Remove defer logger.AuditLog in admin handlers
- Replace newContext() calls with r.Context()
- Update admin routes registration calls

Bonus: remove unused NetSpeedtestHandler

Since the new adminMiddleware function checks for object layer presence
by default, we need to pass the `noObjLayerFlag` explicitly to admin
handlers that should work even when it is not available. The following
admin handlers do not require it:

- ServerInfoHandler
- StartProfilingHandler
- DownloadProfilingHandler
- ProfileHandler
- SiteReplicationDevNull
- SiteReplicationNetPerf
- TraceHandler

For these handlers adminMiddleware does not check for the object layer
presence (disabled by passing the `noObjLayerFlag`), and for all other
handlers, the pre-check ensures that the handler is not called when the
object layer is not available - the client would get a
ErrServerNotInitialized and can retry later.

This `noObjLayerFlag` is added based on existing behavior for these
handlers only.
2023-07-13 14:52:21 -07:00
Klaus Post 4f89e5bba9
Add active disk health checks (#17539)
Add check every 2 minutes to see if a write+read operation can complete.

If disk is unresponsive for 2 minutes or returns errFaultyDisk, take it offline.
2023-07-13 11:41:55 -07:00
jiuker 183428db03
fear: Implement 'mc support top net' (#17598) 2023-07-13 11:41:19 -07:00
Shireesh Anjal fc6d873758
Use os.ReadFile instead of ioutil.ReadFile (#17649)
ioutil.ReadFile is deprecated and also doesn't work with certain kinds
of symlinks.
2023-07-13 09:07:10 -07:00
Poorna 5e2f8d7a42
replication: Simplify mrf requeueing and add backlog handler (#17171)
Simplify MRF queueing and add backlog handler

- Limit re-tries to 3 to avoid repeated re-queueing. Fall offs
to be re-tried when the scanner revisits this object or upon access.

- Change MRF to have each node process only its MRF entries.

- Collect MRF backlog by the node to allow for current backlog visibility
2023-07-12 23:51:33 -07:00
Shubhendu 9b9871cfbb
Added `endpoint` and `versions` attributes to KMS details (#17350)
Now it would list details of all KMS instances with additional
attributes `endpoint` and `version`. In the case of k8s-based
deployment the list would consist of a single entry.

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-07-12 23:50:38 -07:00
guangwu f80b6926d3
chore: fix minor issues reported via staticcheck (#17639) 2023-07-12 20:33:11 -07:00
Shubhendu 6dc55fe5ed
Corrected the API name for audit logging purpose (#17642)
This would better to record the correct API name so that
any verification around audit logs to figure out if required
APIs are called required no of times, would be correct.
Here in this case of policy attached, API `AttachDetachPolicyBuiltin`
would be called with `requestPath` as `/minio/admin/v3/idp/builtin/policy/attach`
and in case of detach policy the value would be `/minio/admin/v3/idp/builtin/policy/detach`

Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
2023-07-12 15:38:49 -07:00
Harshavardhana 2d1cda2061
fix: do not os.Exit(1) while writing goroutines during shutdown (#17640)
Also shutdown poll add jitter, to verify if the shutdown
sequence can finish before 500ms, this reduces the overall
time taken during "restart" of the service.

Provides speedup for `mc admin service restart` during
active I/O, also ensures that systemd doesn't treat the
returned 'error' as a failure, certain configurations in
systemd can cause it to 'auto-restart' the process by-itself
which can interfere with `mc admin service restart`.

It can be observed how now restarting the service is
much snappier.
2023-07-12 07:18:30 -07:00
Harshavardhana a566bcf613
treat 0-byte objects to honor same quorum as delete marker (#17633)
on unversioned buckets its possible that 0-byte objects
might lose quorum on flaky systems, allow them to be same
as DELETE markers. Since practically speak they have no
content.
2023-07-11 21:53:49 -07:00
Klaus Post 9885a0a6af
Fix hasSpaceFor in SNSD setup (#17630)
If drive is offline or filled we divide by 0.

Fixes #17629

Bonus: Reject when any valid disk exceeds minimum inode threshold.
2023-07-11 14:29:34 -07:00
Kaan Kabalak f64d62b01d
Fix style of logOnceIf calls w/unique identifiers (#17631) 2023-07-11 13:17:45 -07:00
Harshavardhana 82075e8e3a
use strconv variants to improve on performance per 'op' (#17626)
```
BenchmarkItoa
BenchmarkItoa-8         	673628088	         1.946 ns/op	       0 B/op	       0 allocs/op
BenchmarkFormatInt
BenchmarkFormatInt-8    	592919769	         2.012 ns/op	       0 B/op	       0 allocs/op
BenchmarkSprint
BenchmarkSprint-8       	26149144	        49.06 ns/op	       2 B/op	       1 allocs/op
BenchmarkSprintBool
BenchmarkSprintBool-8   	26440180	        45.92 ns/op	       4 B/op	       1 allocs/op
BenchmarkFormatBool
BenchmarkFormatBool-8   	1000000000	         0.2558 ns/op	       0 B/op	       0 allocs/op
```
2023-07-11 07:46:58 -07:00
Harshavardhana 5b7c83341b
move per bucket metrics to peer location (#17627) 2023-07-11 07:46:24 -07:00
Poorna fb49aead9b
replication: add validation API (#17520)
To check if replication is set up properly on a bucket.
2023-07-10 20:09:20 -07:00
Aditya Manthramurthy 85f5700e4e
fix: missing audit logger call for some admin APIs (#17623) 2023-07-10 16:59:44 -07:00
Aditya Manthramurthy 43b3c093ef
Fix: set request id in trace context properly (#17622) 2023-07-10 15:40:44 -07:00
Kaan Kabalak bd6842d917
Further print log messages once per error (#17618) 2023-07-10 07:59:57 -07:00
Poorna e8c98c3246
Avoid extra GetObjectInfo call in DeleteObject API (#17599)
Optimize DeleteObject API to avoid extra 
GetObjectInfo call on the replicating side.

For receiving side, it is just a regular
DeleteObject call.

Bonus: Fix a corner case where version purged is 
absent on target (either due to replication not yet
complete or target version already deleted in a
one-way replication or when replication was disabled). 

In such cases, mark version purge complete.
2023-07-10 07:57:56 -07:00
Harshavardhana dfd7cca0d2
fix: allow cancel of decom only when its in progress (#17607) 2023-07-10 07:55:38 -07:00
Harshavardhana f6186965c3
honor DeleteAllVersions in list(), head() calls (#17604) 2023-07-08 15:42:10 -07:00
Harshavardhana 28a01f0320
update missing license header in files (#17603) 2023-07-08 10:42:05 -07:00
Anis Eleuch 6d0bc5ab1e
prometheus: Fix internode stats (#17594)
Internode calculation was done inside S3 handlers, fix it by moving it
to internode handlers.

Remove admin stats since it is not used.
2023-07-08 07:35:11 -07:00
Aditya Manthramurthy 7af78af1f0
fix: set request ID in tracing context key (#17602)
Since `addCustomerHeaders` middleware was after the `httpTracer`
middleware, the request ID was not set in the http tracing context. By
reordering these middleware functions, the request ID header becomes
available. We also avoid setting the tracing context key again in
`newContext`.

Bonus: All middleware functions are renamed with a "Middleware" suffix
to avoid confusion with http Handler functions.
2023-07-08 07:31:42 -07:00
Harshavardhana abb1f22057 Revert "change ttfb_distribution metrics to histogramMetric (#17115)"
This reverts commit 9112ca4e29.
2023-07-07 13:57:37 -07:00
Harshavardhana f41edb23e2
add variadic delays in peer notification retries (#17592)
just adds more `jitter` in our retries to avoid
burst flooding for peer calls.
2023-07-07 07:47:38 -07:00
Klaus Post e20aab25ec
Check for progress before we reach the limit (#17552) 2023-07-07 00:13:57 -07:00
Klaus Post ff5988f4e0
Reduce allocations (#17584)
* Reduce allocations

* Add stringsHasPrefixFold which can compare string prefixes, while ignoring case and not allocating.
* Reuse all msgp.Readers
* Reuse metadata buffers when not reading data.

* Make type safe. Make buffer 4K instead of 8.

* Unslice
2023-07-06 16:02:08 -07:00
jiuker c47ff44f5e
fix: disable site network test if site replication is disabled (#17579) 2023-07-06 09:19:14 -07:00
Harshavardhana 8af0773baf
remove deprecated Content-Security-Policy (#17580)
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/block-all-mixed-content
2023-07-06 09:18:38 -07:00
jiuker 2dbb1cff4a
feat: support perf site replication (#17477) 2023-07-05 22:28:26 -07:00
Klaus Post 6efcf9c982
Do lockless last minute latency metrics (#17576)
Collect metrics in one second and accumulate lockless before sending upstream.
2023-07-05 10:40:45 -07:00
Harshavardhana 0bc34952eb
fix: under FanOut API avoid repeated md5sum calculation (#17572)
md5sum calculation has a high CPU overhead, avoid calculating
it repeatedly for similar fanOut calls.

To fix following CPU profiler result
```
(pprof) top10
Showing nodes accounting for 678.68s, 84.67% of 801.54s total
Dropped 1072 nodes (cum <= 4.01s)
Showing top 10 nodes out of 156
      flat  flat%   sum%        cum   cum%
   332.54s 41.49% 41.49%    332.54s 41.49%  runtime/internal/syscall.Syscall6
   228.39s 28.49% 69.98%    228.39s 28.49%  crypto/md5.block
    48.07s  6.00% 75.98%     48.07s  6.00%  runtime.memmove
    28.91s  3.61% 79.59%     28.91s  3.61%  github.com/minio/highwayhash.updateAVX2
     8.25s  1.03% 80.61%      8.25s  1.03%  runtime.futex
     8.25s  1.03% 81.64%     10.81s  1.35%  runtime.step
     6.99s  0.87% 82.52%     22.35s  2.79%  runtime.pcvalue
     6.67s  0.83% 83.35%     38.90s  4.85%  runtime.mallocgc
     5.77s  0.72% 84.07%     32.61s  4.07%  runtime.gentraceback
     4.84s   0.6% 84.67%     10.49s  1.31%  runtime.lock2
```
2023-07-05 03:16:05 -07:00
Harshavardhana e37c4efc6e
fix: upon DNS refresh() failure use previous values (#17561)
DNS refresh() in-case of MinIO can safely re-use
the previous values on bare-metal setups, since
bare-metal arrangements do not change DNS in any 
manner commonly.

This PR simplifies that, we only ever need DNS caching
on bare-metal setups.

- On containerized setups do not enable DNS
  caching at all, as it may have adverse effects on
  the overall effectiveness of k8s DNS systems.

  k8s DNS systems are dynamic and expect applications
  to avoid managing DNS caching themselves, instead
  provide a cleaner container native caching
  implementations that must be used.

- update IsDocker() detection, including podman runtime

- move to minio/dnscache fork for a simpler package
2023-07-03 12:30:51 -07:00
Anis Eleuch 15fd5ce2fa
fix: A typo in per pool make/delete bucket errs calculation (#17553) 2023-07-03 09:47:40 -07:00
Harshavardhana 7f782983ca
fix: for FTP server driver allow implicit trust of TLS (#17541)
fixes #17535
2023-06-30 08:04:13 -07:00
Aditya Manthramurthy 9d628346eb
fix: service account list for root user (#17547)
Fixes https://github.com/minio/minio/issues/17545
2023-06-30 08:02:12 -07:00
Aditya Manthramurthy bde533a9c7
fix: OpenID config initialization (#17544)
This is due to a regression in the handling of the enable key in OpenID
configuration.
2023-06-29 23:38:26 -07:00
Harshavardhana aae6846413
feat: allow expiration of all versions via ILM Expiration action (#17521)
Following extension allows users to specify immediate purge of
all versions as soon as the latest version of this object has
expired.

```
<LifecycleConfiguration>
    <Rule>
        <ID>ClassADocRule</ID>
        <Filter>
           <Prefix>classA/</Prefix>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
             <Days>3650</Days>
	     <ExpiredObjectAllVersions>true</ExpiredObjectAllVersions>
        </Expiration>
    </Rule>
    ...
```
2023-06-28 22:12:28 -07:00
Harshavardhana 5317a0b755
fix: support LDAP settings properly in ftp/sftp (#17536)
Bonus this PR enhances and supports creating
buckets via ftp `mkdir`

fixes #17526
2023-06-28 13:15:21 -07:00
Harshavardhana 73de721a63
fix: handle copyObjectPart encryption properly (#17530)
- look for requested encryption while compressing
  not just via HTTP Headers, but also via multipart
  metadata

- look for SSE-S3 etag decryption not just via HTTP
  Headers, but also via multipart metadata

fixes #17519
2023-06-28 09:43:50 -07:00
Harshavardhana d2f5c3621f
fix: add additional decommission traces for ILM expired content (#17522)
current decommission traces were missing for

- Skipped ILM expired versions
- Skipped single DELETE marked version
- A success or failure in decommissioning DELETE marker
- allow additional info to be shared in DecomStatus() API
2023-06-27 11:59:40 -07:00
Harshavardhana 1818764840
fix: bug in passing Versioned field set for getHealReplicationInfo() (#17498)
Bonus: rejects prefix deletes on object-locked buckets earlier
2023-06-27 09:45:50 -07:00
Harshavardhana d3e5e607a7
allow site-replication checks to work on non-distributed setups (#17524)
fixes #17523
2023-06-27 09:23:50 -07:00
Shireesh Anjal c1943ea3af
Capture realtime metrics in health report (#17516) 2023-06-27 01:39:18 -07:00
guangwu 87b6fb37d6
chore: pkg imported more than once (#17444) 2023-06-26 09:21:29 -07:00
Kaan Kabalak 21fbe88e1f
Print certain log messages once per error (#17484) 2023-06-24 20:29:13 -07:00
Harshavardhana 1f8b9b4bd5
fix: do not listAndHeal() inline with PutObject() (#17499)
there is a possibility that slow drives can actually add latency
to the overall call, leading to a large spike in latency.

this can happen if there are other parallel listObjects()
calls to the same drive, in-turn causing each other to sort
of serialize.

this potentially improves performance and makes PutObject()
also non-blocking.
2023-06-24 19:31:04 -07:00
Klaus Post 216069d0da
Remove 'null' version ID from directory object response (#17495)
Fixes #17494

Regression from #17132
2023-06-23 13:26:00 -07:00
Harshavardhana eefa047974
fix: keep decommission in a go-routine (#17496)
This was removed by mistake in #17491
2023-06-23 12:29:32 -07:00
Anis Eleuch d8dad5c9ea
s3: Make/Delete buckets to use error quorum per pool (#17467) 2023-06-23 11:48:23 -07:00
Klaus Post bf8a68879c
fix: Time ILM Actions for scanner info (#17493)
ILM Actions were not timed fix it.
2023-06-23 07:48:36 -07:00
Aditya Manthramurthy f3248a4b37
Redact all secrets from config viewing APIs (#17380)
This change adds a `Secret` property to `HelpKV` to identify secrets
like passwords and auth tokens that should not be revealed by the server
in its configuration fetching APIs. Configuration reporting APIs now do
not return secrets.
2023-06-23 07:45:27 -07:00
Harshavardhana d315d012a4
decom: during multiple pool decom preserve current pool status (#17491)
removal of completed pools must retain pool status of other
pools in draining, to resume any remaining draining operations.
2023-06-23 07:44:18 -07:00
Harshavardhana bd9bf3693f
lambda: negative duration for presigned URL default to 1H (#17489)
fixes a bug where users created with Expiration as
timeSentinel is not rejected while generating the
presigned URL for lambda processing.
2023-06-23 00:17:24 -07:00
Aditya Manthramurthy 82ce78a17c
Fix locking in policy attach API (#17426)
For policy attach/detach API to work correctly the server should hold a
lock before reading existing policy mapping and until after writing the
updated policy mapping. This is fixed in this change.

A site replication bug, where LDAP policy attach/detach were not
correctly propagated is also fixed in this change.

Bonus: Additionally, the server responds with the actual (or net)
changes performed in the attach/detach API call. For e.g. if a user
already has policy A applied, and a call to attach policies A and B is
performed, the server will respond that B was attached successfully.
2023-06-21 22:44:50 -07:00
Harshavardhana 9af6c6ceef
under rebalance look for expired versions v/s remaining versions (#17482)
A continuation of PR #17479 for rebalance behavior must
also match the decommission behavior.

Fixes bug where rebalance would ignore rebalancing object
versions after one of the version returned "ObjectNotFound"
2023-06-21 13:23:20 -07:00
Praveen raj Mani b94ab07c2f
Honor global root CAs for kafka audit tls (#17481)
honor global root CAs for kafka audit tls
2023-06-21 10:50:40 -07:00
Harshavardhana 7605d07bb2
add support for bucket level request count per API (#17468)
New metrics added to calculate API request count
per bucket, per API.  Captures errors, including
4xx, 5xx HTTP status codes separately.
2023-06-21 09:41:59 -07:00
Harshavardhana ccc5801112
always look for expired versions v/s remaining versions (#17479)
while decommissioning it can so happen that the non-current
versions are all expired but there is a DEL marker as the
latest version.

For such objects, we should not decommission them instead
calculate the remaining versions and if the remaining versions
is one and that version is a DEL marker consider such
an object not to be scheduled for decommissioning.
2023-06-21 08:49:28 -07:00
Praveen raj Mani 7c72b25ef0
Add an option to make bucket notifications synchronous (#17406)
With the current asynchronous behaviour in sending notification events
to the targets, we can't provide guaranteed delivery as the systems
might go for restarts.

For such event-driven use-cases, we can provide an option to enable
synchronous events where the APIs wait until the event is successfully
sent or persisted.

This commit adds 'MINIO_API_SYNC_EVENTS' env which when set to 'on'
will enable sending/persisting events to targets synchronously.
2023-06-20 17:38:59 -07:00
Harshavardhana 02c2ec3027
skip onlineDisks with parity mismatch (#17478) 2023-06-20 13:18:24 -07:00
Harshavardhana 65c31fab12
fix: do not crash rebalance code instead set the object layer (#17465)
fixes #17421
2023-06-20 09:28:23 -07:00
jiuker b6b68be052
fix: replication check for duplicate endpoints detection with wrong route (#17474) 2023-06-20 09:27:54 -07:00
Harshavardhana 15911c85f6
safely ignore out of band deletions while decommissioning (#17473) 2023-06-20 08:31:42 -07:00
Aditya Manthramurthy 5a1612fe32
Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Harshavardhana 1443b5927a
allow quorum fileInfo to pick same parityBlocks (#17454)
Bonus: allow replication to proceed for 503 errors such as
with error code SlowDownRead
2023-06-18 18:20:15 -07:00
Anis Eleuch 35ef35b5c1
fix a integer divide by zero crash during rebalance (#17455)
A state is updated with a delete marker, which does not have parity or
data blocks defined, which can cause the integer divide by zero panics.

This commit fixes to avoid panics.
2023-06-18 11:14:53 -07:00
Harshavardhana 6806537eb3
event args list for fanOut notification must be sized same (#17450)
without this fan-out API can crash if client cancels
the on-going request.
2023-06-18 07:09:20 -07:00
Harshavardhana 64de61d15d
fallback on etags if they match when mtime is not same (#17424)
on "unversioned" buckets there are situations
when successive concurrent I/O can lead to
an inconsistent state() with mtime while the
etag might be the same for the object on disk.

in such a scenario it is possible for us to
allow reading of the object since etag matches
and if etag matches we are guaranteed that we
have enough copies the object will be readable
and same.

This PR allows fallback in such scenarios.
2023-06-17 19:18:20 -07:00
Poorna c4d0c49a5f
ensure metadata updates go to same pool where version exists (#17451)
This PR also returns the replication status in 
proxy calls and defers replication attempt if 
HEAD on object version returned a error different
from NoSuchKey
2023-06-17 07:30:53 -07:00
Harshavardhana 47a48b6832
do not save any metadata from the headers in tar extract (#17436)
only preserve the same storage-class as incoming
request other than that rest of them must be
deduced.
2023-06-15 17:44:07 -07:00
Anis Eleuch a2aed12dcd
decom: Fix a typo in routing decommissioning requests (#17435)
A specific node should do the decommissioning task, however routing the
start decommissioning to that node was not working properly.

Co-authored-by: Anis Elleuch <anis@min.io>
2023-06-15 14:54:29 -07:00
Anis Eleuch d8e6e76e89
site-repl: Better error msg when setting sync in a local cluster (#17407) 2023-06-15 12:44:22 -07:00
Harshavardhana ad4e511026
do not save plain-text ETag when encryption is requested (#17427)
fixes an issue under bucket replication could cause
ETags for replicated SSE-S3 single part PUT objects,
to fail as we would attempt a decryption while listing,
or stat() operation.
2023-06-15 12:43:26 -07:00
Klaus Post 4a562d6732
fix: fanout error response - error must be string for marshaling (#17433)
Uses https://github.com/minio/minio-go/pull/1839
2023-06-15 09:21:53 -07:00
Poorna a9082e4f79
site replication: cancel ongoing op properly (#17428) 2023-06-15 08:05:08 -07:00
jiuker 0474791cf8
fix: set time format right (#17402) 2023-06-14 07:49:13 -07:00
Harshavardhana f32efd5429
more compliance related fixes (#17408)
- lifecycle must return InvalidArgument for rule errors
- do not return `null` versionId in HTTP header
- reject mixed SSE uploads with correct error message
2023-06-13 13:52:33 -07:00
jiuker 22c247a988
fix: preserve multiple values for query params (#17392) 2023-06-13 11:38:46 -07:00
Shubhendu 35d71682f6
fix: do not allow removal of inbuilt policies unless they are already persisted (#17264)
Dont allow removal of inbuilt policies such as `readwrite, readonly, writeonly and diagnostics`
2023-06-13 11:06:17 -07:00
drivebyer 3d6b88a60e
fix: syscall to record time on non-linux (#17383) 2023-06-13 11:04:50 -07:00
Harshavardhana 26a0803388
various compliance related fixes (#17401)
- getObjectTagging to be allowed for anonymous policies
- return correct errors for invalid retention period
- return sorted list of tags for an object
- putObjectTagging must return 200 OK not 204 OK
- return 409 ErrObjectLockConfigurationNotAllowed for existing buckets
2023-06-12 13:22:07 -07:00
Anis Eleuch ae95384dd8
Revert "heal: Update object parity with the latest configured SC (#17187)" (#17404) 2023-06-12 11:54:51 -07:00
Anis Eleuch 0f0dcf0c5e
tar: Avoid storing snowball extraction header in extract objects (#17389) 2023-06-12 09:42:06 -07:00
Klaus Post 6f2406b0b6
fix: protect ReplicationStats against concurrent map iteration and write crash (#17403) 2023-06-12 09:17:11 -07:00
Anis Eleuch bb24346e04
listen: Only error out if not able to bind any interface (#17353) 2023-06-12 09:09:28 -07:00
Harshavardhana be45ffd8a4
return 204 status code for DeleteBucketTagging (#17400) 2023-06-11 20:49:02 -07:00
Poorna Krishnamoorthy f986b0c493 replication: perform bucket resync in parallel (#16707)
Default number of parallel resync operations for a bucket to 10
to speed up resync.
2023-06-11 16:09:55 -07:00
Harshavardhana c9e87f0548
service accounts are allowed to have no expiration (#17397) 2023-06-11 10:34:59 -07:00
Harshavardhana 43468f4d47
return InvalidRequest when no parts are provided (#17395) 2023-06-10 21:59:51 -07:00
Harshavardhana b829e80ecb
do not disable root for invalid API config values (#17386) 2023-06-08 15:50:06 -07:00
Klaus Post 6e38d0f3ab
Add more bootstrap info in debug mode (#17362) 2023-06-08 08:39:47 -07:00
Anis Eleuch 38342b1df5
decom: Parallelize decommissining (#17364) 2023-06-07 14:27:51 -07:00