mirror of
https://github.com/scottlamb/moonfire-nvr.git
synced 2025-12-08 00:32:26 -05:00
Benchmark & speed up SampleIndexIterator
I'm seeing what is possible performance-wise in the current C++ before
trying out Go and Rust implementations.
* use the google benchmark framework and some real data.
* use release builds - I hadn't done this in a while, and there were a
few compile errors that manifested only in release mode. Update the
readme to suggest using a release build.
* optimize the varint decoder and SampleIndexIterator to branch less.
* enable link-time optimization for release builds.
* add some support for feedback-directed optimization. Ideally "make"
would automatically produce the "generate" build outputs with a
different object/library/executable suffix, run the generate
benchmark, and then produce the "use" builds. This is not that fancy;
you have to run an arcane command:
alias cmake='cmake -DCMAKE_BUILD_TYPE=Release'
cmake -DPROFILE_GENERATE=true -DPROFILE_USE=false .. && \
make recording-bench && \
src/recording-bench && \
cmake -DPROFILE_GENERATE=false -DPROFILE_USE=true .. && \
make recording-bench && \
perf stat -e cycles,instructions,branches,branch-misses \
src/recording-bench --benchmark_repetitions=5
That said, the results are dramatic - at least 50% improvement. (The
results weren't stable before as small tweaks to the code caused a
huge shift in performance, presumably something something branch
alignment something something.)
This commit is contained in:
@@ -142,9 +142,9 @@ For instructions, you can skip to "[Camera configuration and hard disk mounting]
|
||||
|
||||
Once prerequisites are installed, Moonfire NVR can be built as follows:
|
||||
|
||||
$ mkdir build
|
||||
$ cd build
|
||||
$ cmake ..
|
||||
$ mkdir release
|
||||
$ cd release
|
||||
$ cmake -DCMAKE_BUILD_TYPE=Release ..
|
||||
$ make
|
||||
$ sudo make install
|
||||
|
||||
|
||||
Reference in New Issue
Block a user