minio

mirror of https://github.com/minio/minio.git synced 2025-11-28 13:09:09 -05:00

Author	SHA1	Message	Date
Jorge Israel Peña	0a2e6d58a5	hdfs gateway handle listing single files (#10362 )	2020-08-26 16:03:53 -07:00
飞雪无情	ea1803417f	Use constants for gateway names to avoid bugs caused by spelling. (#10355 )	2020-08-26 08:52:46 -07:00
Harshavardhana	d19b434ffc	fix: bring back delayed leaf detection in listing (#10346 )	2020-08-25 12:26:48 -07:00
Jorge Israel Peña	4752323e1c	Use hdfs.Readdir() to optimize HDFS directory listings (#10121 ) Currently, listing directories on HDFS incurs a per-entry remote Stat() call penalty, the cost of which can really blow up on directories with many entries (+1,000) especially when considered in addition to peripheral calls (such as validation) and the fact that minio is an intermediary to the client (whereas other clients listed below can query HDFS directly). Because listing directories this way is expensive, the Golang HDFS library provides the [`Client.Open()`] function which creates a [`FileReader`] that is able to batch multiple calls together through the [`Readdir()`] function. This is substantially more efficient for very large directories. In one case we were witnessing about +20 seconds to list a directory with 1,500 entries, admittedly large, but the Java hdfs ls utility as well as the HDFS library sample ls utility were much faster. Hadoop HDFS DFS (4.02s): λ ~/code/minio → use-readdir » time hdfs dfs -ls /directory/with/1500/entries/ … hdfs dfs -ls 5.81s user 0.49s system 156% cpu 4.020 total Golang HDFS library (0.47s): λ ~/code/hdfs → master » time ./hdfs ls -lh /directory/with/1500/entries/ … ./hdfs ls -lh 0.13s user 0.14s system 56% cpu 0.478 total mc and minio without optimization (16.96s): λ ~/code/minio → master » time mc ls myhdfs/directory/with/1500/entries/ … ./mc ls 0.22s user 0.29s system 3% cpu 16.968 total mc and minio with optimization (0.40s): λ ~/code/minio → use-readdir » time mc ls myhdfs/directory/with/1500/entries/ … ./mc ls 0.13s user 0.28s system 102% cpu 0.403 total [`Client.Open()`]: https://godoc.org/github.com/colinmarc/hdfs#Client.Open [`FileReader`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader [`Readdir()`]: https://godoc.org/github.com/colinmarc/hdfs#FileReader.Readdir	2020-07-24 11:31:51 -07:00
Harshavardhana	ec06089eda	fix: re-implement cluster healthcheck (#10101 )	2020-07-20 18:31:22 -07:00
Harshavardhana	14ff7f5fcf	add hdfs sub-path support (#10046 ) users who don't have access to HDFS rootPath '/' can optionally specify `minio gateway hdfs hdfs://namenode:8200/path` for a path they have access to, allowing all writes to be performed at `/path`. NOTE: once configured in this manner you need to make sure the command line is correctly specified, otherwise your data might not be visible. Closes #10011	2020-07-14 15:49:10 -07:00
Anis Elleuch	778e9c864f	Move dependency from minio-go v6 to v7 (#10042 )	2020-07-14 09:38:05 -07:00
Harshavardhana	e7d7d5232c	fix: admin info output and improve overall performance (#10015 ) - admin info node offline check is now quicker - admin info now doesn't duplicate the code across doing the same checks for disks - rely on StorageInfo to return appropriate errors instead of calling locally. - diskID checks now return proper errors when disk not found v/s format.json missing. - add more disk states for more clarity on the underlying disk errors.	2020-07-13 09:51:07 -07:00
Harshavardhana	4915433bd2	Support bucket versioning (#9377 ) - Implement a new xl.json 2.0.0 format to support, this moves the entire marshaling logic to POSIX layer, top layer always consumes a common FileInfo construct which simplifies the metadata reads. - Implement list object versions - Migrate to siphash from crchash for new deployments for object placements. Fixes #2111	2020-06-12 20:04:01 -07:00
Harshavardhana	38ee40d59c	move to upstream code colinmarc/hdfs (#9738 ) - supports SASL based authentication now - upgrades to new changes in gokrb library - implement force delete feature Fixes #8206	2020-05-29 18:38:50 -07:00
Harshavardhana	b2db8123ec	Preserve errors returned by diskInfo to detect disk errors (#9727 ) This PR basically reverts #9720 and re-implements it differently	2020-05-28 13:03:04 -07:00
Harshavardhana	b330c2c57e	Introduce simpler GetMultipartInfo call for performance (#9722 ) Advantages: avoids hundreds of stat calls which are needed for each upload operation in FS/NAS gateway mode when uploading a large multipart object, and dramatically increases performance for multipart uploads by avoiding recursive calls. For other gateways, it simplifies the approach since the azure, gcs, hdfs gateways don't capture any specific metadata during upload which needs handler validation for encryption/compression. Erasure coding was already optimized; additionally, this just avoids small allocations of a large data structure. Fixes #7206	2020-05-28 12:36:20 -07:00
Harshavardhana	a1de9cec58	cleanup object-lock/bucket tagging for gateways (#9548 ) This PR is to ensure that we call the relevant object layer APIs for necessary S3 API level functionalities allowing gateway implementations to return proper errors as NotImplemented{} This allows for all our tests in mint to behave appropriately and can be handled appropriately as well.	2020-05-08 13:44:44 -07:00
Harshavardhana	282c9f790a	fix: validate partNumber in queryParam as part of preConditions (#9386 )	2020-04-20 22:01:59 -07:00
Harshavardhana	f44cfb2863	use GlobalContext whenever possible (#9280 ) This change is throughout the codebase to ensure that all codepaths honor GlobalContext	2020-04-09 09:30:02 -07:00
Bala FA	2c3e34f001	add force delete option of non-empty bucket (#9166 ) passing HTTP header `x-minio-force-delete: true` would allow standard S3 API DeleteBucket to delete a non-empty bucket forcefully.	2020-03-27 21:52:59 -07:00
Krishna Srinivas	2e9fed1a14	non-empty dirs should not be listed as objects (#9129 )	2020-03-13 17:43:00 -07:00
poornas	224b4f13b8	Add cache eviction low and high watermarks (#8958 ) To allow better control of the cache eviction process. Introduce MINIO_CACHE_WATERMARK_LOW and MINIO_CACHE_WATERMARK_HIGH env. variables to specify when to stop/start cache eviction process. Deprecate MINIO_CACHE_EXPIRY environment variable. Cache gc sweeps at 30 minute intervals whenever high watermark is reached to clear least recently accessed entries in the cache until sufficient space is cleared to reach the low watermark. Garbage collection uses an adaptive file scoring approach based on last access time, with greater weights assigned to larger objects and those with more hits to find the candidates for eviction. Thanks to @klauspost for this file scoring algorithm Co-authored-by: Klaus Post <klauspost@minio.io>	2020-02-23 19:03:39 +05:30
Anis Elleuch	d4dcf1d722	metrics: Use StorageInfo() instead to have consistent info (#9006 ) Metrics used to have its own code to calculate offline disks. StorageInfo() was avoided because it is an expensive operation by sending calls to all nodes. To make metrics & server info share the same code, a new argument `local` is added to StorageInfo() so it will only query local disks when needed. Metrics now calls StorageInfo() as server info handler does but with the local flag set to false. Co-authored-by: Praveen raj Mani <praveen@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2020-02-20 09:21:33 +05:30
Harshavardhana	09ee145e9c	gw/hdfs: indicate hdfs gateway is production ready (#8848 )	2020-01-18 07:25:03 -08:00
Harshavardhana	fca4ee84c9	gw/hdfs: listing should list directories properly (#8827 ) Fixes #8822	2020-01-16 17:11:25 -08:00
Praveen raj Mani	5d09233115	Fix Readiness check (#8681 ) - Remove goroutine-check in Readiness check - Bring in quorum check for readiness Fixes #8385 Co-authored-by: Harshavardhana <harsha@minio.io>	2019-12-28 22:24:43 +05:30
Harshavardhana	c8d82588c2	Fix crash in console logger and also handle bucket DNS updates (#8654 ) Also fix listenBucketNotification bugs seen by minio-js listen bucket notification API.	2019-12-16 20:30:57 -08:00
Harshavardhana	5ac4b517c9	Order all keys in config (#8541 ) New changes - return default values when sub-sys is not configured. - state is hidden parameter now - remove worm mode to be saved in config	2019-11-20 15:10:24 -08:00
Harshavardhana	07a556a10b	Avoid ListBuckets() call instead rely on simple HTTP GET (#8475 ) This is to avoid making calls to backend and requiring gateways to allow permissions for ListBuckets() operation just for Liveness checks, we can avoid this and make our liveness checks to be more performant.	2019-11-01 16:58:10 -07:00
Harshavardhana	ee4a6a823d	Migrate config to KV data format (#8392 ) - adding oauth support to MinIO browser (#8400) by @kanagaraj - supports multi-line get/set/del for all config fields - add support for comments, allow toggle - add extensive validation of config before saving - support MinIO browser to support proper claims, using STS tokens - env support for all config parameters, legacy envs are also supported with all documentation now pointing to latest ENVs - preserve accessKey/secretKey from FS mode setups - add history support implements three APIs - ClearHistory - RestoreHistory - ListHistory - add help command support for each config parameters - all the bug fixes after migration to KV, and other bug fixes encountered during testing.	2019-10-22 22:59:13 -07:00
Praveen raj Mani	8836d57e3c	The prometheus metrics refactoring (#8003 ) The measures are consolidated to the following metrics - `disk_storage_used` : Disk space used by the disk. - `disk_storage_available`: Available disk space left on the disk. - `disk_storage_total`: Total disk space on the disk. - `disks_offline`: Total number of offline disks in current MinIO instance. - `disks_total`: Total number of disks in current MinIO instance. - `s3_requests_total`: Total number of s3 requests in current MinIO instance. - `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance. - `s3_requests_current`: Total number of active s3 requests in current MinIO instance. - `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance. - `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance. - `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance. - `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance. - `minio_version_info`: Current MinIO version with commit-id. - `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests. And this PR also modifies the current StorageInfo queries - Decouples StorageInfo from ServerInfo . - StorageInfo is enhanced to give endpoint information. NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR Fixes #7873	2019-10-22 21:01:14 -07:00
Harshavardhana	d48fd6fde9	Remove unused params and functions (#8399 )	2019-10-15 18:35:41 -07:00
Harshavardhana	d2a8be6fc2	gateway/hdfs: Fix isObjectDir to behave correctly (#8368 )	2019-10-09 04:20:43 +05:30
Harshavardhana	589e32a4ed	Refactor config and split them in packages (#8351 ) This change is related to larger config migration PR change, this is a first stage change to move our configs to `cmd/config/` - divided into its subsystems	2019-10-04 23:05:33 +05:30
Harshavardhana	cb01516a26	In HDFS gateway fix non-empty folder behavior (#8254 ) To be compatible with our FS and Erasure coded mode deployments, make sure that we do not send 200 OK for folders which have files inside. Fixes #8143	2019-09-18 01:59:59 +05:30
Harshavardhana	a7be313230	Start using new errors package (#8207 )	2019-09-11 22:51:43 +05:30
Harshavardhana	e6d8e272ce	Use const slashSeparator instead of "/" everywhere (#8028 )	2019-08-06 12:08:58 -07:00
Harshavardhana	6f2b4675fa	Add krb5 support for HDFS gateway (#7933 )	2019-07-24 18:05:48 -07:00
Harshavardhana	2c0b3cadfc	Update go mod with sem versions of our libraries (#7687 )	2019-05-29 16:35:12 -07:00
ebozduman	67d508b214	Adjusts help content dynamically according to OS (#7646 )	2019-05-15 14:02:44 +05:30
Anis Elleuch	9c90a28546	Implement bulk delete (#7607 ) Bulk delete at storage level in Multiple Delete Objects API In order to accelerate bulk delete in Multiple Delete objects API, a new bulk delete is introduced in storage layer, which will accept a list of objects to delete rather than only one. Consequently, a new API also needs to be added to the Object API.	2019-05-13 12:25:49 -07:00
Harshavardhana	72929ec05b	Turn off md5sum optionally if content-md5 is not set (#7609 ) This PR also brings --compat option to run MinIO in strict S3 compatibility mode, MinIO by default will now try to run high performance mode.	2019-05-08 18:35:40 -07:00
Harshavardhana	64998fc4ab	Remove delayIsLeaf requirement simplify ListObjects further (#7593 )	2019-05-02 10:36:57 +05:30
Harshavardhana	c5f26d5cdd	Fix hdfsReader fd leak upon GetObject() (#7596 ) Also migrate to minio/hdfs/v3@v3.0.0	2019-05-01 14:43:21 -07:00
Harshavardhana	620e462413	Implement S3-HDFS gateway (#7440 ) - [x] Support bucket and regular object operations - [x] Supports Select API on HDFS - [x] Implement multipart API support - [x] Completion of ListObjects support	2019-04-17 09:52:08 -07:00

41 Commits