I found this while bringing db.rs's test coverage up to the old
moonfire-db-test.cc. I mistakenly thought that in SQLite, an ungrouped
aggregate on a relation with no rows would return a row with a null result of
the aggregate. Instead, it returns no rows. In hindsight, this makes more
sense; it matches what grouped aggregates (have to) do.
I should have submitted/pushed more incrementally but just played with it on
my computer as I was learning the language. The new Rust version more or less
matches the functionality of the current C++ version, although there are many
caveats listed below.
Upgrade notes: when moving from the C++ version, I recommend dropping and
recreating the "recording_cover" index in SQLite3 to pick up the addition of
the "video_sync_samples" column:
$ sudo systemctl stop moonfire-nvr
$ sudo -u moonfire-nvr sqlite3 /var/lib/moonfire-nvr/db/db
sqlite> drop index recording_cover;
sqlite> create index ...rest of command as in schema.sql...;
sqlite> ^D
Some known visible differences from the C++ version:
* .mp4 generation queries SQLite3 differently. Before it would just get all
video indexes in a single query. Now it leads with a query that should be
satisfiable by the covering index (assuming the index has been recreated as
noted above), then queries individual recordings' indexes as needed to fill
an LRU cache. I believe this is roughly similar speed for the initial hit
(which generates the moov part of the file) and significantly faster when
seeking. I would have done it a while ago with the C++ version but didn't
want to track down an LRU cache library. It was easier to find with Rust.
* On startup, the Rust version cleans up old reserved files. This is as in the
design; the C++ version was just missing this code.
* The .html recording list output is a little different. It's in ascending
order, with the most current segment shorter than an hour rather than the
oldest. This is less ergonomic, but it was easy. I could fix it or just wait
to obsolete it with some fancier JavaScript UI.
* Command-line argument parsing and logging have changed formats due to
different underlying libraries.
* The JSON output isn't quite right (matching the spec / C++ implementation)
yet.
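To illustrate the per-recording index cache mentioned in the first item above, here is a minimal LRU sketch using only the standard library. It is not the cache library the Rust version actually uses; the names `LruCache` and `get_or_load` and the `i64` recording id are hypothetical, and the O(n) reordering is only acceptable for small capacities.

```rust
use std::collections::{HashMap, VecDeque};

/// A tiny LRU cache from a (hypothetical) recording id to its video index bytes.
struct LruCache {
    capacity: usize,
    entries: HashMap<i64, Vec<u8>>,
    /// Ids ordered by recency: front = least recently used.
    order: VecDeque<i64>,
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        LruCache { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Returns the cached index for `id`, calling `load` (standing in for a
    /// per-recording database query) only on a miss.
    fn get_or_load<F: FnOnce(i64) -> Vec<u8>>(&mut self, id: i64, load: F) -> &[u8] {
        if self.entries.contains_key(&id) {
            // Hit: remove the old position; it is re-added at the back below.
            self.order.retain(|&k| k != id);
        } else {
            // Miss: evict the least recently used entry if the cache is full.
            if self.entries.len() >= self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.entries.insert(id, load(id));
        }
        self.order.push_back(id);
        &self.entries[&id]
    }
}
```

With a cache like this, seeking within an already-served .mp4 only pays for a database query when the needed recording's index has been evicted.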
Additional caveats:
* I haven't done any proof-reading of prep.sh + install instructions.
* There's a lot of code quality work to do: adding (back) comments and test
coverage, developing a good Rust style.
* The ffmpeg foreign function interface is particularly sketchy. I'd
eventually like to switch to something based on autogenerated bindings.
I'd also like to use pure Rust code where practical, but once I do on-NVR
motion detection I'll need existing C/C++ libraries for speed (H.264
decoding + OpenCL-based analysis).
The old release on googlecode.com now 404s, so out-of-the-box builds were
broken. The releases on GitHub have a slightly different file structure, so it's
more than just a change of URL. I upgraded from 1.7.0 to 1.8.0 in the process.
* typo: the subtitle should use its own mdhd, not alias the video one
* use 64-bit ints for the edit lists; the 32-bit values overflow at 13.25 hours
* use etags that reflect the edit list
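The 13.25-hour figure can be checked with a little arithmetic. The sketch below assumes a 90 kHz timescale (suggested by the `_90k` column names, but an assumption here) and an unsigned 32-bit duration field; the function names are made up for illustration.

```rust
/// Timescale in units per second. Assumed to be 90 kHz for illustration.
const TIME_UNITS_PER_SEC: u64 = 90_000;

/// Longest duration, in whole seconds, that fits in an unsigned 32-bit field.
fn max_u32_duration_secs() -> u64 {
    u64::from(u32::MAX) / TIME_UNITS_PER_SEC
}

/// Returns true if a duration in timescale units needs a 64-bit field.
fn needs_64_bit(duration_units: u64) -> bool {
    duration_units > u64::from(u32::MAX)
}
```

`max_u32_duration_secs()` comes to 47,721 seconds, which is a little over 13 hours and 15 minutes, so a 13-hour segment fits in 32 bits and a 14-hour one does not.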
I'm seeing what is possible performance-wise in the current C++ before
trying out Go and Rust implementations.
* use the google benchmark framework and some real data.
* use release builds - I hadn't done this in a while, and there were a
few compile errors that manifested only in release mode. Update the
readme to suggest using a release build.
* optimize the varint decoder and SampleIndexIterator to branch less.
* enable link-time optimization for release builds.
* add some support for feedback-directed optimization. Ideally "make"
would automatically produce the "generate" build outputs with a
different object/library/executable suffix, run the generate
benchmark, and then produce the "use" builds. This is not that fancy;
you have to run an arcane command:
alias cmake='cmake -DCMAKE_BUILD_TYPE=Release'
cmake -DPROFILE_GENERATE=true -DPROFILE_USE=false .. && \
make recording-bench && \
src/recording-bench && \
cmake -DPROFILE_GENERATE=false -DPROFILE_USE=true .. && \
make recording-bench && \
perf stat -e cycles,instructions,branches,branch-misses \
src/recording-bench --benchmark_repetitions=5
That said, the results are dramatic - at least 50% improvement. (The
results weren't stable before as small tweaks to the code caused a
huge shift in performance, presumably something something branch
alignment something something.)
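For context on the varint decoder item above, here is a minimal sketch of a base-128 varint decoder in Rust. It assumes a LEB128-style encoding (7 payload bits per byte, least significant group first, high bit as a continuation flag), which may not match the sample index format exactly. It is the straightforward loop, not the branch-reduced version that item describes.

```rust
/// Decodes one unsigned base-128 varint from the front of `data`.
/// Returns the value and the number of bytes consumed, or None if the
/// input is truncated or the value would not fit in 32 bits.
fn decode_varint32(data: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    // A 32-bit value needs at most 5 groups of 7 bits.
    for (i, &byte) in data.iter().enumerate().take(5) {
        let payload = u32::from(byte & 0x7f);
        // The fifth byte may only contribute the top 4 bits.
        if i == 4 && payload > 0x0f {
            return None;
        }
        value |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    // Ran out of input, or 5 bytes all had the continuation bit set.
    None
}
```

Each loop iteration has a data-dependent exit, which is the kind of branch the optimization work tries to reduce.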
Now it's possible to quickly determine what calendar days have data and then
query recordings for just the day(s) of interest with their returned
{start,end}_time_usec.
The helper isn't used yet. The goal is to export this on /camera/<uuid>/ as
described in a TODO in design/api.md.
The next step is to keep MoonfireDatabase::CameraData::days up-to-date:
* Init: call on every recording (replacing the current aggregated query with
a recording-by-recording query)
* InsertRecording, DeleteRecordings: call for added/removed recordings
then return it from GetCamera and pass it along to the client in
WebInterface::HandleJsonCameraDetail.
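A rough sketch of what such a helper might look like, written in Rust for illustration rather than as the actual C++ code. It assumes times in 90 kHz units and keys days by UTC day number, whereas the real helper would presumably use local calendar days; `Days` and `adjust_days` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Number of 90 kHz units in one (UTC) day.
const UNITS_PER_DAY: i64 = 86_400 * 90_000;

/// Maps a day number (days since the epoch, UTC) to the total recorded
/// duration on that day, in 90 kHz units.
type Days = BTreeMap<i64, i64>;

/// Adds (sign = 1) or removes (sign = -1) the time range [start, end),
/// splitting it at day boundaries so each day gets only its own share.
fn adjust_days(days: &mut Days, start: i64, end: i64, sign: i64) {
    let mut t = start;
    while t < end {
        let day = t.div_euclid(UNITS_PER_DAY);
        let day_end = (day + 1) * UNITS_PER_DAY;
        let chunk_end = end.min(day_end);
        let total = days.entry(day).or_insert(0);
        *total += sign * (chunk_end - t);
        let now_empty = *total == 0;
        if now_empty {
            // Drop days that no longer have any data.
            days.remove(&day);
        }
        t = chunk_end;
    }
}
```

Calling it with sign 1 on insert and sign -1 on delete keeps the map consistent without rescanning the recording table.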
* If the end of a segment is between samples, the last included sample will
have a shortened duration.
* If the beginning of a segment is not on a key frame (aka sync sample), the
prefix will be included but trimmed using an edit list. (It seems like a
ctts box might be able to accomplish the same thing, fwiw.)
* gcc (Raspbian 4.9.2-10) 4.9.2 complains about -1 in const char[]s.
gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010 was fine with this.
Use '\xff' instead.
* libjsoncpp-dev 0.6.0~rc2-3.1 doesn't have Json::writeValue.
Use an older interface instead.
* libre2-dev 20140304+dfsg-2 has a bug in which custom RE2 parsers don't
compile because the relevant constructor is only declared, not defined as
trivial. (This is fixed on my Ubuntu's libre2-dev 20150701+dfsg-2.)
Avoid using this.
I tested these in VLC and QuickTime. Both players appear to ignore the
track dimensions, track transformation matrix, box dimensions, and box
justification, so I just left them at default values.
Automated testing is minimal. There's a new test that the resulting .mp4
parses, but I didn't actually ensure correctness of the subtitles in any way.
* Change README.md accordingly
* Add cameras.sql to .gitignore to not commit personal camera data
* Change CMakeLists.txt to explicitly refer to hand-built libevent dirs
There's a lot of work left to do on this:
* important latency optimization: the recording threads block
while fsync()ing sample files, which can take 250+ ms. This
should be moved to a separate thread to happen asynchronously.
* write cycle optimizations: several SQLite commits per camera per minute.
* test coverage: this drops testing of the file rotation, and
there are several error paths worth testing.
* ffmpeg oddities to investigate:
* the out-of-order first frame's pts
* measurable delay before returning packets
* it sometimes returns an initial packet it calls a "key" frame that actually
has an SEI recovery point NAL but not an IDR-coded slice NAL, even though
in the input these always seem to come together. This makes playback
starting from this recording not work at all on Chrome. The symptom is
that it loads a player-looking thing with the proper dimensions but
playback never actually starts.
I imagine these are all related but haven't taken the time to dig through
ffmpeg code and understand them. The right thing anyway may be to ditch
ffmpeg for RTSP streaming (perhaps in favor of the live555 library), as
it seems to have other omissions like making it hard/impossible to take
advantage of Sender Reports. In the meantime, I attempted to mitigate
problems by decreasing ffmpeg's probesize.
* handling overlapping recordings: right now if there's too much time drift or
a time jump, you can end up with recordings that the UI won't play without
manual database changes. It's not obvious what the right thing to do is.
* easy camera setup: currently you have to manually insert rows in the SQLite
database and restart.
but I think it's best to get something in to iterate from.
This deletes a lot of code, including:
* the ffmpeg video sink code (instead now using a bit of extra code in Stream
on top of the SampleFileWriter, SampleIndexEncoder, and MoonfireDatabase
code that's been around for a while)
* FileManager (in favor of new code using the database)
* the old UI
* RealFile and friends
* the dependency on protocol buffers, which was used for the config file
(though I'll likely have other reasons for using protocol buffers later)
* even some utilities like IsWord that were just for validating the config
I discovered that the mp4 files I was writing were viewable in VLC and in
Chrome-on-desktop (ffmpeg-based) but not in Chrome-on-Android
(libstagefright-based). It turns out that I was writing Annex B sample data
rather than the correct AVCParameterSample format. ffmpeg gives both the
"extradata" and the actual frames in Annex B format when reading from RTSP.
This is still my simple, unoptimized implementation of the Annex B parser. My
Raspberry Pi 2 is still able to record my six streams using about 30% of 1
core, so it will do for the moment at least.
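For illustration, the frame conversion can be sketched as follows, in Rust rather than the C++ of this commit. It assumes the .mp4 sample data uses 4-byte big-endian NAL unit lengths and that the input is a well-formed Annex B stream (NAL units separated by 00 00 01 or 00 00 00 01 start codes). The function name is hypothetical, and converting the "extradata" is not covered.

```rust
/// Converts an H.264 Annex B byte stream into NAL units that are each
/// prefixed with a 4-byte big-endian length.
fn annex_b_to_length_prefixed(data: &[u8]) -> Vec<u8> {
    // Byte offsets just past each 00 00 01 start code.
    let mut starts = Vec::new();
    let mut i = 0;
    while i < data.len() {
        if data[i..].starts_with(&[0, 0, 1]) {
            starts.push(i + 3);
            i += 3;
        } else {
            i += 1;
        }
    }

    let mut out = Vec::with_capacity(data.len() + 4);
    for (n, &start) in starts.iter().enumerate() {
        // The NAL unit runs until the next start code (or the end of input).
        let mut end = match starts.get(n + 1) {
            Some(&next) => next - 3,
            None => data.len(),
        };
        // Drop trailing zero bytes, which includes the leading zero of a
        // following 4-byte start code.
        while end > start && data[end - 1] == 0 {
            end -= 1;
        }
        if end == start {
            continue; // nothing between two start codes
        }
        let nal = &data[start..end];
        out.extend_from_slice(&(nal.len() as u32).to_be_bytes());
        out.extend_from_slice(nal);
    }
    out
}
```

This scans byte by byte, so like the implementation described above it is simple rather than fast.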
In particular, this returns all the extra configuration data that will be
necessary to actually instantiate streams from the database rather than the
soon-to-be-removed configuration file.
Before, I had a gross hardcoded path in moonfire-db.cc + a hacky
Recording::sample_file_path (which is StrCat(sample_file_dir, "/", uuid),
essentially). Now, things expect to take a File* to the sample file directory
and use openat(2). Several things had to change in the process:
* RealFileSlice now takes a File* dir.
* File has an Open that returns an fd (for RealFileSlice's benefit).
* BuildMp4 now is in WebInterface rather than MoonfireDatabase. The latter
only manages the SQLite database, so it shouldn't know anything about the
sample file directory.
This reverts commit ad4beac464.
That commit wasn't as advertised; I had several other changes mixed in my
working copy. I'd also copied a working copy from one path to another, and
it turns out the cmake build subdir was still referring to the original, so
I hadn't realized this commit didn't even build. :(
I didn't properly update the new duration calculation when switching from
ascending to descending order.
Also, on the Pi, 1-hour recordings are noticeably faster to load.
* Schema revisions. The most dramatic is the addition of a covering index on
(camera_id, start_time_90k) that avoids the need to make sparse accesses
into the recording table (where the desired data is intermixed with both
the large blobs and rows from other cameras). A query over a year's data
previously took many seconds (6+ even in a form without the video_index)
and now is roughly 10X faster. Queries for a couple weeks now should be
unnoticeably fast.
Other changes to shrink the rows, such as duration_90k instead of
end_time_90k (more compact varint encoding) and video_sample_entry_id
(typically 1 byte) instead of video_sample_entry_sha1 (20 bytes).
And more CHECK constraints for good measure.
* Caching of expensive computations and logic to keep them up to date.
The top-level web view previously went through the entire recording table,
which was even slower. Now it is served from a small map in RAM.
* Expanded the scope of operations to cover (hopefully) everything needed for
recording into the SQLite database.
* Added tests of MoonfireDatabase. These are basic tests that don't
exercise a lot of error cases, but at least they exist.
The main MoonfireDatabase functionality still missing is support for quickly
seeing what calendar days have data over the full timespan of a camera. This
is more data to compute and cache.
On my laptop, with a month's data, a test query would take 0.1 to 0.2 seconds
before. Now it takes 0.001 to 0.004 seconds.
I improved this by creating and taking advantage of an index on start time.
It's a little more complicated than that because the desired timespan is
specified in terms of a recording's start and end time, not start time alone.
I defined a maximum duration of a recording (5 minutes) and specified this
with an extra condition in the query so that the end time can be used to
narrow the valid range of start times.
"explain query plan select ..." output confirms it's using the index with
both > and < comparisons:
0|0|0|SEARCH TABLE recording USING INDEX recording_start_time_90k (start_time_90k>? AND start_time_90k<?)
0|1|1|SEARCH TABLE video_sample_entry USING INDEX sqlite_autoindex_video_sample_entry_1 (sha1=?)
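The bound computation behind those two comparisons can be sketched like this, in Rust for illustration. It assumes times in 90 kHz units and a desired range of [desired_start, desired_end); the names are hypothetical and the real query may differ in detail.

```rust
/// Maximum recording duration: 5 minutes in 90 kHz units.
const MAX_RECORDING_DURATION_90K: i64 = 5 * 60 * 90_000;

/// Returns exclusive (lower, upper) bounds on start_time_90k such that every
/// recording overlapping [desired_start, desired_end) has a start time
/// strictly between them.
fn start_time_bounds(desired_start: i64, desired_end: i64) -> (i64, i64) {
    // A recording that ends after desired_start and lasts at most the
    // maximum duration must start after desired_start - max duration.
    (desired_start - MAX_RECORDING_DURATION_90K, desired_end)
}

/// The full overlap test, still applied to the rows the index scan yields.
fn overlaps(rec_start: i64, rec_end: i64, desired_start: i64, desired_end: i64) -> bool {
    rec_start < desired_end && rec_end > desired_start
}
```

The index narrows the scan to at most five extra minutes of recordings before the desired range; the overlap test then discards the few that end too early.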
I also refactored ListMp4Recordings out of BuildMp4File to make the measurement
easier.
This is almost certain to have performance problems with large databases,
but it's a useful starting point.
No tests yet. It shouldn't be too hard to add some for moonfire-db.h, but
I'm impatient to fake up enough data to check on the performance and see
what needs to change there first.