Harshavardhana
dd2542e96c
add codespell action ( #18818 )
...
Original work here, #18474 , refixed and updated.
2024-01-17 23:03:17 -08:00
jiuker
a89e0bab7d
fix: s3 sql parse error for columns as with quotes ( #18765 )
2024-01-09 09:19:11 -08:00
Aditya Manthramurthy
496027b589
Fix precedence bug in S3Select SQL IN clauses ( #18708 )
...
Fixes a precedence issue in SQL Select where `a in b and c = 3` was parsed as
`a in (b and c = 3)`.
Fixes #18682
2023-12-22 23:19:11 -08:00
Harshavardhana
754f7a8a39
replace io.Discard usage to fix some NUMA copy() latencies ( #18394 )
...
replace io.Discard usage to fix NUMA copy() latencies
On NUMA systems, copying from an 8K buffer allocated via
io.Discard leads to a large latency build-up. Every
```
copy(new8kbuf, largebuf)
```
can incur up to 1ms of latency due to memory
sharding across NUMA nodes.
2023-11-06 14:26:08 -08:00
Aditya Manthramurthy
1c99fb106c
Update to minio/pkg/v2 ( #17967 )
2023-09-04 12:57:37 -07:00
Harshavardhana
e12ab486a2
avoid using os.Getenv for internal code, use env.Get() instead ( #17688 )
2023-07-20 07:52:49 -07:00
Klaus Post
6fe028b7c5
Revert s3 select simdjson reuse ( #17310 )
2023-05-30 10:02:22 -07:00
Klaus Post
b06d7bf834
fix: leaking connections in JSON SQL with limited return ( #17239 )
2023-05-18 11:26:46 -07:00
ferhat elmas
714283fae2
cleanup ignored static analysis ( #16767 )
2023-03-06 08:56:10 -08:00
ferhat elmas
3423028713
cleanup Go linter settings ( #16736 )
2023-03-04 20:57:35 -08:00
jiuker
e470268c7c
fix: a possible closer leak in SelectObjectHandler ( #16598 )
2023-02-17 01:44:40 -08:00
Klaus Post
ff12080ff5
Remove deprecated io/ioutil ( #15707 )
2022-09-19 11:05:16 -07:00
Klaus Post
c22f3ca7a8
fix: S3 Select CSV -> JSON with variable field count ( #15677 )
...
When there are fewer fields than expected, output fewer fields.
2022-09-12 17:00:59 -07:00
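The behaviour described in the commit above ("when there are fewer fields than expected, output fewer fields") can be sketched as follows. This is an illustrative Go example only; `rowToJSON` is a hypothetical helper and is not the function used in the S3 Select code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rowToJSON pairs column names with field values and encodes the pairs as a
// JSON object. If the row has fewer fields than there are column names, the
// missing columns are simply omitted from the output.
func rowToJSON(columns, fields []string) (string, error) {
	n := len(fields)
	if len(columns) < n {
		n = len(columns) // extra unnamed fields are ignored in this sketch
	}
	obj := make(map[string]string, n)
	for i := 0; i < n; i++ {
		obj[columns[i]] = fields[i]
	}
	b, err := json.Marshal(obj) // map keys are emitted in sorted order
	return string(b), err
}

func main() {
	out, _ := rowToJSON([]string{"a", "b", "c"}, []string{"1", "2"})
	fmt.Println(out) // {"a":"1","b":"2"}
}
```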
Abirdcfly
d4e0f13bb3
chore: remove duplicate word in comments ( #15607 )
...
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-08-30 08:26:43 -07:00
Harshavardhana
433b6fa8fe
upgrade golangci-lint to the latest ( #15600 )
2022-08-26 12:52:29 -07:00
Harshavardhana
d087e28dce
start using t.Setenv instead of os.Setenv ( #14787 )
2022-04-23 15:33:45 -07:00
Aditya Manthramurthy
e8e48e4c4a
S3 select switch to new parquet library and reduce locking ( #14731 )
...
- This change switches to a new parquet library
- SelectObjectContent now takes a single lock at the beginning and holds it
during the operation. Previously the operation took a lock every time the
parquet library performed a Seek on the underlying object stream.
- Add basic support for LogicalType annotations for timestamps.
2022-04-14 06:54:47 -07:00
Aditya Manthramurthy
79ba458051
fix: free up reader resources in S3Select properly ( #14600 )
2022-03-23 20:58:53 -07:00
Klaus Post
c07af89e48
select: Add ScanRange to CSV&JSON ( #14546 )
...
Implements https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html#AmazonS3-SelectObjectContent-request-ScanRange
Fixes #14539
2022-03-14 09:48:36 -07:00
Klaus Post
88fd1cba71
select: add MISSING operator support ( #14406 )
...
Probably not full support, but for regular checks it should work.
Fixes #14358
2022-02-25 12:31:19 -08:00
Klaus Post
2cea944cdb
select: Allow lower case 'is' ( #14405 )
...
Ref: #14358
2022-02-24 09:10:48 -08:00
Harshavardhana
f527c708f2
run gofumpt cleanup across code-base ( #14015 )
2022-01-02 09:15:06 -08:00
Klaus Post
91f72f25ab
select: Return early from bool AND, OR ( #13914 )
...
Return as soon as an AND fails and whenever an OR succeeds. Faster and more flexible.
For example, this lets `select * from S3object where _2 != '' AND _2 > 1` operate on empty fields.
Followup to #13900
2021-12-15 16:47:21 -08:00
Klaus Post
a8d4042853
select: Add IS (NOT) operators ( #13906 )
...
Add `IS` and `IS NOT` as comparison operators.
This may be a bit wider than the S3 spec, but we can rather
easily remove the forwarding.
2021-12-14 09:54:50 -08:00
Klaus Post
d6fe0f61a9
do not panic when input cannot be parsed ( #13791 )
...
Fix cases where `s3Select.Open` fails and doesn't set the recordReader.
Fixes #13786
2021-11-30 08:42:42 -08:00
Harshavardhana
661b263e77
add gocritic/ruleguard checks back again, cleanup code. ( #13665 )
...
- remove some duplicated code
- reported a bug, separately fixed in #13664
- using strings.ReplaceAll() when needed
- using filepath.ToSlash() when needed
- remove all non-Go style comments from the codebase
Co-authored-by: Aditya Manthramurthy <donatello@users.noreply.github.com>
2021-11-16 09:28:29 -08:00
Harshavardhana
ea820b30bf
fix: use EqualFold() instead of lower and compare ( #13624 )
2021-11-10 08:12:50 -08:00
Harshavardhana
34680c5ccf
fix: SQL select to honor limits properly for array queries ( #13568 )
...
added tests to cover the scenarios as well.
2021-11-02 19:14:46 -07:00
Klaus Post
c2eb60df4a
bz2: limit max concurrent CPU ( #13458 )
...
Ensure that bz2 decompression will never take more than 50% CPU.
2021-10-18 08:44:36 -07:00
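One plausible way to enforce a cap like the one in the commit above is to derive the decompression worker count from the available processors. The Go sketch below assumes the decompressor accepts a worker-count setting, which the commit message does not confirm; `halfOfCPUs` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"runtime"
)

// halfOfCPUs returns half of the given processor count, rounded down,
// but never less than one so that work can still make progress.
func halfOfCPUs(procs int) int {
	n := procs / 2
	if n < 1 {
		n = 1
	}
	return n
}

func main() {
	// GOMAXPROCS(0) reports the current setting without changing it.
	fmt.Println(halfOfCPUs(runtime.GOMAXPROCS(0)))
}
```

Note that on a single-processor machine this sketch still uses one worker, so the 50% bound only holds with two or more processors.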
Klaus Post
5e53f767c4
Use concurrent bz2 decompression ( #13360 )
...
Testing with `mc sql --compression BZIP2 --csv-input "rd=\n,fh=USE,fd=;" --query="select COUNT(*) from S3Object" local2/testbucket/nyc-taxi-data-10M.csv.bz2`
Before 96.98s, after 10.79s. Uses about 70% CPU while running.
2021-10-14 11:11:07 -07:00
Klaus Post
5a64003f6f
select: Return null for non-existing column indexes ( #13196 )
...
Fixes #13186
2021-09-13 09:13:25 -07:00
Klaus Post
b2c92cdaaa
select: Add more compression formats ( #13142 )
...
Support Zstandard, LZ4, S2, and snappy as additional
compression formats for S3 Select.
2021-09-06 09:09:53 -07:00
Harshavardhana
202d0b64eb
fix: enable go1.17 github ci/cd ( #12997 )
2021-08-18 18:35:22 -07:00
Harshavardhana
1f262daf6f
rename all remaining packages to internal/ ( #12418 )
...
This is to ensure that there are no projects
that try to import `minio/minio/pkg` into
their own repo. Any such common packages should
go to `https://github.com/minio/pkg`
2021-06-01 14:59:40 -07:00