mirror of https://github.com/minio/minio.git synced 2025-03-21 21:14:14 -04:00

15 Commits

Author SHA1 Message Date
Ryan Tam
bd56f80250 Fix ignored alias for aggregate result in S3 Select ()
The SQL parser currently ignores the alias for an aggregate result,
e.g. `SELECT COUNT(*) AS thing FROM s3object` returns a record like
`{"_1": 42}` instead of `{"thing": 42}`.
Column aliases for aggregate results are supported in AWS's S3 Select, so
this commit fixes that by respecting `expr.As` in the expression.

Also improve tests for S3 Select

On top of testing a simple `SELECT` query, we want to test a few more
"advanced" queries (e.g. aggregation).

Convert existing tests into table driven tests[1], and add the new test
cases with "advanced" queries into them.

[1] - https://github.com/golang/go/wiki/TableDrivenTests
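
The alias behaviour and the table-driven style described above can be sketched as follows. This is only an illustration: `columnName` is a hypothetical helper, not the function used in MinIO's s3select package, and the `_N` naming is taken from the example record in the message.

```go
package main

import "fmt"

// columnName returns the output column name for the i-th (0-based) select
// expression: the alias when one was given, otherwise the positional
// name "_N". Hypothetical helper for illustration only.
func columnName(alias string, i int) string {
	if alias != "" {
		return alias
	}
	return fmt.Sprintf("_%d", i+1)
}

func main() {
	// Table-driven cases: each row is one input/expected-output pair.
	cases := []struct {
		alias string
		index int
		want  string
	}{
		{"thing", 0, "thing"}, // SELECT COUNT(*) AS thing
		{"", 0, "_1"},         // SELECT COUNT(*)
		{"", 2, "_3"},         // third unaliased expression
	}
	for _, c := range cases {
		got := columnName(c.alias, c.index)
		fmt.Printf("alias=%q index=%d got=%s ok=%v\n", c.alias, c.index, got, got == c.want)
	}
}
```

In a real test file the same table would live in a `TestXxx(t *testing.T)` function and report mismatches with `t.Errorf`.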
2019-07-03 16:34:54 -07:00
Joe Stevens
a19cf063b5 Fixes for multiplatform dev and testing from forks ()
Add support for correct dependency URLs on all platforms

Only build mountinfo.go on Linux

Make the test file path relative to support work from forks
2019-06-04 00:59:40 -07:00
kannappanr
5ecac91a55 Replace Minio refs in docs with MinIO and links () 2019-04-09 11:39:42 -07:00
Aditya Manthramurthy
91c839ad28 Use a buffer to collect SQL Select result rows ()
Batching records into a single SQL Select message in the response
leads to significant speed up as the message header overhead is made
negligible.

This change leads to a speed up of 3-5x for queries that select many
small records.
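
A minimal sketch of the batching idea, assuming a byte threshold decides when a batch is sent. The `batcher` type, its `maxBytes` field, and the `messages` slice (which stands in for framed messages written to the response) are all hypothetical and not taken from the MinIO code.

```go
package main

import (
	"bytes"
	"fmt"
)

// batcher collects records in a buffer and emits one message per batch,
// so the per-message header cost is paid once per batch, not per record.
type batcher struct {
	buf      bytes.Buffer
	maxBytes int      // flush once the buffer reaches this size
	messages [][]byte // stand-in for messages sent to the client
}

// add appends a record and flushes when the threshold is reached.
func (b *batcher) add(record []byte) {
	b.buf.Write(record)
	if b.buf.Len() >= b.maxBytes {
		b.flush()
	}
}

// flush emits the buffered records as a single message, if any.
func (b *batcher) flush() {
	if b.buf.Len() == 0 {
		return
	}
	msg := make([]byte, b.buf.Len())
	copy(msg, b.buf.Bytes())
	b.messages = append(b.messages, msg)
	b.buf.Reset()
}

func main() {
	b := &batcher{maxBytes: 16}
	for i := 0; i < 10; i++ {
		b.add([]byte(fmt.Sprintf("r%d\n", i)))
	}
	b.flush() // send whatever is left at end of input
	fmt.Printf("10 records sent in %d messages\n", len(b.messages))
}
```

The final `flush` matters: without it, records below the threshold at end of input would never be sent.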
2019-01-28 20:00:18 -08:00
Bala FA
e23a42305c Rebase minio/parquet-go and fix null handling. () 2019-01-16 21:52:04 +05:30
Bala FA
b0deea27df Refactor s3select to support parquet. ()
Also handle pretty formatted JSON documents.
2019-01-08 16:53:04 -08:00
Harshavardhana
7e1661f4fa Performance improvements to SELECT API on certain query operations ()
This improves the performance of certain queries dramatically,
such as 'count(*)' etc.

Without this PR
```
~ time mc select --query "select count(*) from S3Object" myminio/sjm-airlines/star2000.csv.gz
2173762

real	0m42.464s
user	0m0.071s
sys	0m0.010s
```

With this PR
```
~ time mc select --query "select count(*) from S3Object" myminio/sjm-airlines/star2000.csv.gz
2173762

real	0m17.603s
user	0m0.093s
sys	0m0.008s
```

Roughly a 2.4x speedup (42.5s down to 17.6s). This PR avoids a lot of
type conversions and instead relies on raw sequences of data and
interprets them lazily.

```
benchcmp old new
benchmark                        old ns/op       new ns/op       delta
BenchmarkSQLAggregate_100K-4     551213          259782          -52.87%
BenchmarkSQLAggregate_1M-4       6981901985      2432413729      -65.16%
BenchmarkSQLAggregate_2M-4       13511978488     4536903552      -66.42%
BenchmarkSQLAggregate_10M-4      68427084908     23266283336     -66.00%

benchmark                        old allocs     new allocs     delta
BenchmarkSQLAggregate_100K-4     2366           485            -79.50%
BenchmarkSQLAggregate_1M-4       47455492       21462860       -54.77%
BenchmarkSQLAggregate_2M-4       95163637       43110771       -54.70%
BenchmarkSQLAggregate_10M-4      476959550      216906510      -54.52%

benchmark                        old bytes       new bytes      delta
BenchmarkSQLAggregate_100K-4     1233079         1086024        -11.93%
BenchmarkSQLAggregate_1M-4       2607984120      557038536      -78.64%
BenchmarkSQLAggregate_2M-4       5254103616      1128149168     -78.53%
BenchmarkSQLAggregate_10M-4      26443524872     5722715992     -78.36%
```
2018-11-14 15:55:10 -08:00
Ashish Kumar Sinha
c0b4bf0a3e SQL select query for CSV/JSON ()
`select *` and selection by column name have been implemented for CSV.
`select *` has been implemented for JSON.
2018-10-22 12:12:22 -07:00
Praveen raj Mani
cef044178c Treat columns with spaces in between [s3Select] ()
Replace the double/single quotes with backticks so that
xwb1989/sqlparser recognises such queries.
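
A naive sketch of the quote replacement, for illustration only. `quoteIdentifiers` is a hypothetical name, and this version rewrites every quote character, so it would also alter quoted string literals; the actual change may be more selective.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdentifiers replaces double and single quotes with backticks so
// that a MySQL-style parser treats the quoted text as an identifier.
// Naive: it does not distinguish identifiers from string literals.
func quoteIdentifiers(query string) string {
	return strings.NewReplacer(`"`, "`", `'`, "`").Replace(query)
}

func main() {
	fmt.Println(quoteIdentifiers(`select "first name" from s3object`))
}
```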

Fixes 
2018-10-17 11:01:26 -07:00
Aditya Manthramurthy
16a100b597 Fix out-of-bound array access crash in select processing ()
Fix test case.
2018-10-09 09:45:56 -07:00
Ashish Kumar Sinha
670f9788e3 Count(*) to give integer value ()
The Max and Min functions returned float values even when the inputs
were integers. They now return integers in that scenario.
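
One way to get this behaviour, sketched under the assumption that aggregates are accumulated as `float64` and only formatted at the end. `formatAggregate` is a hypothetical helper, not the function changed in this commit.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// formatAggregate renders a float64 aggregate as an integer when it has
// no fractional part and is small enough to be represented exactly,
// and as a float otherwise.
func formatAggregate(v float64) string {
	if v == math.Trunc(v) && math.Abs(v) < 1<<53 {
		return strconv.FormatInt(int64(v), 10)
	}
	return strconv.FormatFloat(v, 'f', -1, 64)
}

func main() {
	fmt.Println(formatAggregate(42))  // whole number
	fmt.Println(formatAggregate(2.5)) // fractional value
}
```

The `1<<53` bound keeps the conversion to `int64` within the range where every whole `float64` value is exact.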

Fixes 
2018-10-04 17:33:53 -07:00
Harshavardhana
a0683d3c1f Send progress only when requested by client in SelectObject () 2018-09-17 11:52:46 +05:30
Praveen raj Mani
30d4a2cf53 s3select should honour custom record delimiter ()
Allow custom record delimiters such as `\r\n`, `a`, or `\r` in input CSV
and replace them with `\n`.
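
A minimal sketch of the delimiter normalisation, assuming the whole input is available in memory. `normalizeDelimiter` is a hypothetical name; a streaming reader would also have to handle a delimiter split across two reads.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeDelimiter replaces every occurrence of the custom record
// delimiter with "\n" so downstream CSV parsing sees standard records.
func normalizeDelimiter(data []byte, delim string) []byte {
	if delim == "" || delim == "\n" {
		return data // nothing to do
	}
	return bytes.ReplaceAll(data, []byte(delim), []byte("\n"))
}

func main() {
	out := normalizeDelimiter([]byte("a,1\r\nb,2\r\n"), "\r\n")
	fmt.Printf("%q\n", out)
}
```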

Fixes 
2018-09-10 21:50:28 +05:30
Raphael Randschau
8601f29d95 select: fix int overflow of math.MaxInt64 on ARM () 2018-08-22 16:16:04 +05:30
Arjun Mishra
7c14cdb60e S3 Select API Support for CSV ()
Add support for trivial where clause cases
2018-08-15 03:30:19 -07:00