85e939636f
This PR also adds some comments and simplifies the code. Primary handling is done to ensure that we make sure to honor cached buffer. Added unit tests as well Fixes #7141 |
||
---|---|---|
.. | ||
LICENSE | ||
README.md | ||
decoder.go | ||
errors.go | ||
jstream.png | ||
scanner.go | ||
scratch.go |
README.md
jstream
is a streaming JSON parser and value extraction library for Go.
Unlike most JSON parsers, jstream
is document position- and depth-aware -- this enables the extraction of values at a specified depth, eliminating the overhead of allocating encompassing arrays or objects; e.g:
Using the below example document:
we can choose to extract and act only the objects within the top-level array:
f, _ := os.Open("input.json")
decoder := jstream.NewDecoder(f, 1) // extract JSON values at a depth level of 1
for mv := range decoder.Stream() {
fmt.Printf("%v\n ", mv.Value)
}
output:
map[desc:RGB colors:[red green blue]]
map[desc:CMYK colors:[cyan magenta yellow black]]
likewise, increasing depth level to 3
yields:
red
green
blue
cyan
magenta
yellow
black
optionally, kev:value pairs can be emitted as an individual struct:
decoder := jstream.NewDecoder(f, 2).EmitKV() // enable KV streaming at a depth level of 2
jstream.KV{desc RGB}
jstream.KV{colors [red green blue]}
jstream.KV{desc CMYK}
jstream.KV{colors [cyan magenta yellow black]}
Installing
go get github.com/bcicen/jstream
Commandline
jstream
comes with a cli tool for quick viewing of parsed values from JSON input:
cat input.json | jstream -v -d 1
depth start end type | value
1 004 069 object | {"colors":["red","green","blue"],"desc":"RGB"}
1 073 153 object | {"colors":["cyan","magenta","yellow","black"],"desc":"CMYK"}
Options
Opt | Description |
---|---|
-d <n> | emit values at depth n. if n < 0, all values will be emitted |
-v | output depth and offset details for each value |
-h | display help dialog |
Benchmarks
Obligatory benchmarks performed on files with arrays of objects, where the decoded objects are to be extracted.
Two file sizes are used -- regular (1.6mb, 1000 objects) and large (128mb, 100000 objects)
input size | lib | MB/s | Allocated |
---|---|---|---|
regular | standard | 97 | 3.6MB |
regular | jstream | 175 | 2.1MB |
large | standard | 92 | 305MB |
large | jstream | 404 | 69MB |
In a real world scenario, including initialization and reader overhead from varying blob sizes, performance can be expected as below: