It should reduce compile time / memory usage to put quite a bit of the code
into a separate crate. I also intend to limit visibility of some things to
only within the db crate, but that's for a future change. This is the smallest
move that will compile.
The filenames now represent composite ids (stream id + recording id) rather
than a separate uuid system with its own reservation for a few benefits:
* This provides more information when there are inconsistencies.
* This avoids the need for managing the reservations during recording. I
expect this to simplify delaying flushing of newly written sample files.
Now the directory has to be scanned at startup for files that never got
written to the database, but that's acceptably fast even with millions of
files.
* Less information to keep in memory and in the recording_playback table.
I'd considered using one directory per stream, which might help if the
filesystem has trouble coping with huge directories. But that would mean each
dir has to be fsync()ed separately (more latency and/or more multithreading).
So I'll stick with this until I see concrete evidence of a problem that
per-stream directories would solve.
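
A minimal sketch of how a composite id could map to and from a filename. The
bit layout (stream id in the upper 32 bits, recording id in the lower 32) and
the 16-hex-digit filename are assumptions made for illustration; the type and
method names are hypothetical.

```rust
/// Hypothetical composite id. Assumed layout for this sketch: stream id in
/// the upper 32 bits, recording id in the lower 32 bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CompositeId(pub i64);

impl CompositeId {
    pub fn new(stream_id: i32, recording_id: i32) -> Self {
        // Go through u32 so the recording id is zero-extended, not sign-extended.
        CompositeId(((stream_id as i64) << 32) | (recording_id as u32 as i64))
    }

    pub fn stream(self) -> i32 {
        (self.0 >> 32) as i32
    }

    pub fn recording(self) -> i32 {
        self.0 as i32
    }

    /// Assumed filename format: exactly 16 lowercase hex digits.
    pub fn filename(self) -> String {
        format!("{:016x}", self.0)
    }

    /// Parses a name found while scanning the directory at startup.
    /// Returns None for anything that doesn't look like a sample file.
    pub fn parse(name: &str) -> Option<Self> {
        if name.len() != 16 || !name.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        u64::from_str_radix(name, 16)
            .ok()
            .map(|v| CompositeId(v as i64))
    }
}
```

With a scheme like this, a stray file's name alone says which stream and
recording it belongs to, which is the "more information when there are
inconsistencies" benefit.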
Test coverage of the error conditions is poor. I plan to do some restructuring
of the db/dir code, hopefully making steps toward testability along the way.
This is still pretty basic support. There's no config UI support for
renaming/moving the sample file directories after they are created, and no
error checking that the files are still in the expected place. I can imagine
sysadmins getting into trouble trying to change things. I hope to address at
least some of that in a follow-up change to introduce a versioning/locking
scheme that ensures databases and sample file dirs match in some way.
A bonus change that kinda got pulled along for the ride: a dialog pops up in
the config UI while a stream is being tested. The experience was pretty bad
before; there was no indication the button worked at all until it was done,
sometimes many seconds later.
This allows each camera to have a main and a sub stream. Previously there was
a field in the schema for the sub stream's url, but it didn't do anything. Now
you can configure individual retention for main and sub streams. They show up
grouped in the UI.
No support for upgrading from schema version 1 yet.
The recording::Segment was constructing a segment with no frames in it, which
was causing a panic when appending a zero-length stts to the Slices. Fix this
in a couple ways:
* Slices::append should return Err rather than panic. No reason to crash the
whole program when we have trouble serving a single .mp4 request.
* recording::Segment shouldn't produce zero-frame segments
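
A minimal sketch of the first point, with a hypothetical, heavily simplified
`Slices` that tracks only end offsets. The field names and the `String` error
type are assumptions for illustration, not the real code.

```rust
/// Illustrative stand-in for a real slice; only its byte length matters here.
#[derive(Debug)]
pub struct Slice {
    pub len: u64,
}

/// Simplified stand-in: the end offset of each slice within the virtual file.
#[derive(Debug, Default)]
pub struct Slices {
    ends: Vec<u64>,
}

impl Slices {
    /// Total length of all slices appended so far.
    pub fn len(&self) -> u64 {
        self.ends.last().copied().unwrap_or(0)
    }

    /// Appends a slice, returning Err (rather than panicking) on bad input
    /// so that only the current request fails.
    pub fn append(&mut self, slice: Slice) -> Result<(), String> {
        if slice.len == 0 {
            return Err(format!(
                "refusing to append zero-length slice at offset {}",
                self.len()
            ));
        }
        let end = self
            .len()
            .checked_add(slice.len)
            .ok_or_else(|| "slice offsets overflow u64".to_string())?;
        self.ends.push(end);
        Ok(())
    }
}
```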
I had an assert that fired in this case, dating back to when I hadn't plumbed
Result returns through much of .mp4 construction. Now I have, so there's no
excuse for having an assert here. Change to an error return, and tweak it to
not fire in the zero-duration case.
Also fix a problem in the test harness; I hadn't finished converting it for
multi-recording tests, and it was returning the wrong recording.
Because of that, I seem to have stumbled across a related problem in which
asking for zero duration of a non-zero duration recording will return a
recording::Segment with no frames, which will cause panics because its
corresponding .mp4 slices are zero-length. I just adjusted the panic message
here; I'll follow up with changes to address that.
This was causing Firefox to fail to play multipart .mp4s which trimmed away a
prefix. In the developer console, it said NS_ERROR_DOM_MEDIA_METADATA_ERR
without giving any RESULT_DETAIL, making it a pain to diagnose. Given that the
stss is supposed to be needed for seeking, I'm surprised it didn't have any
immediately obvious impact on Chrome or VLC. Maybe they just took longer to
seek than otherwise necessary.
The bug was that when keeping track of the "next frame num" while constructing
the .mp4, I appended the number in the underlying recording, not the number
post-trimming. That meant following segments used the wrong numbers. In some
cases, it caused it to exceed the total number of samples in the generated
.mp4, which seems to be what Firefox was complaining about. Running the result
through "ffmpeg -i bad.mp4 -c copy -f mp4 good.mp4" just trimmed away the most
obviously invalid ones, leaving others that didn't point to the frames they
meant to. That was enough to make Firefox start playing the file. /shruggie
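
A sketch of the corrected bookkeeping, using a hypothetical `SegmentFrames`
summary rather than the real types. The point is that the running sample
number advances by the frames kept after trimming, not by the frame count of
the underlying recording.

```rust
/// Hypothetical per-segment summary, for illustration only.
pub struct SegmentFrames {
    /// Number of frames in the underlying recording (before trimming).
    pub frames_in_recording: u32,
    /// For each frame kept after trimming, in order: is it a key frame?
    pub kept_is_key: Vec<bool>,
}

/// Returns the stss entries (1-based sample numbers of the key frames) for an
/// .mp4 built from the given segments.
pub fn stss_entries(segments: &[SegmentFrames]) -> Vec<u32> {
    let mut entries = Vec::new();
    let mut next_frame_num = 1u32; // stss sample numbers start at 1
    for s in segments {
        for (i, &is_key) in s.kept_is_key.iter().enumerate() {
            if is_key {
                entries.push(next_frame_num + i as u32);
            }
        }
        // Advance by what was actually emitted. Using
        // s.frames_in_recording here would reproduce the bug.
        next_frame_num += s.kept_is_key.len() as u32;
    }
    entries
}
```

With two segments that keep 3 of 6 and 4 of 4 frames, every entry stays within
the 7 samples of the output; advancing by `frames_in_recording` instead would
number the second segment's key frames starting from 7.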
The existing tests were all with a single segment, so I added a new one to
catch this. I also added a Debug implementation to recording::Segment and
mp4::Segment.
This was totally broken in commit 1cf27c18. It would serve bytes from the
beginning of the sample file in question, not from the start of the given
range.
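
A minimal sketch of the intended behavior, written against generic
`Read + Seek` rather than the real serving code (the function name is
hypothetical). The bug was equivalent to leaving out the seek.

```rust
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Reads exactly the bytes in `r` from a sample file.
pub fn read_range<F: Read + Seek>(f: &mut F, r: Range<u64>) -> io::Result<Vec<u8>> {
    if r.end < r.start {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "range end is before range start",
        ));
    }
    // Start at the beginning of the requested range, not of the file.
    f.seek(SeekFrom::Start(r.start))?;
    let mut buf = vec![0u8; (r.end - r.start) as usize];
    f.read_exact(&mut buf)?;
    Ok(buf)
}
```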
it should be exactly 1, but was slightly more because the fraction was
incorrectly 1 rather than 0. I'm not sure if any actual players care about
this, but it was something I noticed when looking into strange edit list
behavior.
This is intended to support HTML5 Media Source Extensions, which I expect to
be the most practical way to make a good web UI with a proper scrub bar and
such.
This feature has had very limited testing on Chrome and Firefox, and that was
not entirely successful. More work is needed before it's usable, but this
seems like a helpful progress checkpoint.
This significantly improves safety of the ffmpeg interface. The complex
ABIs aren't accessed directly from Rust. Instead, I have a small C
wrapper which uses the ffmpeg C API and the C headers at compile-time to
determine the proper ABI in the same way any C program using ffmpeg
would, so that the ABI doesn't have to be duplicated in Rust code.
I've tested with ffmpeg 2.x and ffmpeg 3.x; it seems to work properly
with both where before ffmpeg 3.x caused segfaults.
It still depends on ABI compatibility between the compiled and running
versions. C programs need this, too, and normal shared library
versioning practices provide this guarantee. But log both versions on
startup for diagnosing problems with this.
Fixes #7
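
A sketch of the kind of startup log line this enables. It assumes the C
wrapper exposes both the compile-time and run-time libavutil version integers
(that plumbing is not shown, and the names here are hypothetical). The
decoding follows ffmpeg's AV_VERSION_INT packing, `major<<16 | minor<<8 |
micro`; treating "same major version" as compatible is an assumption based on
the usual shared library convention.

```rust
use std::fmt;

/// An ffmpeg library version unpacked from its AV_VERSION_INT form.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
    pub micro: u32,
}

impl Version {
    pub fn from_int(v: u32) -> Self {
        Version {
            major: v >> 16,
            minor: (v >> 8) & 0xff,
            micro: v & 0xff,
        }
    }
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}.{}.{}", self.major, self.minor, self.micro)
    }
}

/// Builds a log line comparing the compiled and running versions.
pub fn version_report(compiled: u32, running: u32) -> String {
    let (c, r) = (Version::from_int(compiled), Version::from_int(running));
    // Assumption: a matching major version implies ABI compatibility.
    let note = if c.major == r.major {
        "ok"
    } else {
        "MAJOR VERSION MISMATCH"
    };
    format!("libavutil: compiled={} running={} ({})", c, r, note)
}
```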
The benchmarks don't get compiled with the standard "cargo test";
they require "cargo +nightly bench --features=nightly", so I didn't notice
they were broken in the previous commit. Now fixed.
serve_generated_bytes is >3X faster. One caveat is that the reactor thread may
stall when reading from the memory-mapped slice. Moonfire NVR is basically a
single-user program, so that may not be so bad, but we'll see.
It had an Arc which in hindsight isn't necessary; the actual video index
generation is fast anyway. This saves a couple pointers per cache entry and
the overhead of chasing them. LruCache itself also has some extra pointers on
it but that's something to address another day.
This reduces the working set by another 960 bytes for a typical one-hour
recording, improving cache efficiency a bit more.
8 bytes from SampleIndexIterator:
* reduce the three "bytes" fields to two. Doing so as "bytes_key" vs
"bytes_nonkey" slowed it down a bit, perhaps because the "bytes" is
needed right away and requires a branch. But "bytes" vs "bytes_other"
seems fine. Looks like it can do this with cmovs in parallel with other
stuff.
* stuff "is_key" into the "i" field.
8 bytes from recording::Segment itself:
* make "frames" and "key_frame" u16s
* stuff "trailing_zero" into "video_sample_entry_id"
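
A sketch of the "stuff a flag into an existing field" trick. Which bit is used
and the type name are assumptions for illustration; the real fields may be
packed differently.

```rust
/// A u32 position with a boolean packed into its top bit, so the pair takes
/// 4 bytes rather than 8 (a separate bool would be padded out).
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct PosAndKey(u32);

impl PosAndKey {
    const KEY_BIT: u32 = 1 << 31;

    /// Returns None if `pos` doesn't fit in the remaining 31 bits.
    pub fn new(pos: u32, is_key: bool) -> Option<Self> {
        if (pos & Self::KEY_BIT) != 0 {
            return None;
        }
        let flag = if is_key { Self::KEY_BIT } else { 0 };
        Some(PosAndKey(pos | flag))
    }

    pub fn pos(self) -> u32 {
        self.0 & !Self::KEY_BIT
    }

    pub fn is_key(self) -> bool {
        (self.0 & Self::KEY_BIT) != 0
    }
}
```

The cost is a mask on every access and a halved range for the position, which
is the trade-off being made for the smaller struct.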
There were Nagle's algorithm delays in both the "fresh_client" and
"reuse_client" versions of the .mp4 serving benchmark. Now performance is much
more consistent.
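
For reference, a minimal sketch of disabling Nagle's algorithm on a plain
`std::net::TcpStream`. The benchmark's actual HTTP client and server set this
through their own APIs, which are not shown; the helper name is hypothetical.

```rust
use std::io;
use std::net::{TcpStream, ToSocketAddrs};

/// Connects and sets TCP_NODELAY so small writes go out immediately rather
/// than waiting to be coalesced.
pub fn connect_without_nagle<A: ToSocketAddrs>(addr: A) -> io::Result<TcpStream> {
    let stream = TcpStream::connect(addr)?;
    stream.set_nodelay(true)?;
    Ok(stream)
}
```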
* don't store sizes of mp4-format sample indexes; recalculate them.
* keep SampleIndexIterator position as a u32 rather than a usize.
This is 960 bytes for a 60-minute mp4; another small cache usage improvement.
For a one-hour recording, this is about 2 KiB, so a decent chunk of a
Raspberry Pi 2's L1 cache. Generating the Slices and searching/scanning
it should be a bit faster and pollute the cache less.
This is a pretty small optimization now when transferring a decent chunk
of the moov or mdat, but it's easy enough to do. It will be much more
noticeable if I end up interleaving the captions between each key frame.
This page was noticeably slower than necessary because the recording_cover
index wasn't actually covering the query. Both the schema for new databases
and the upgrade query were broken (and not even in the same way).
No new schema version to correct this, at least for now. I'll probably have
another reason to change the schema soon anyway and can throw this in.
These are currently the only things that require nightly Rust. I haven't run
them since adding the feature gates. The feature gates were slightly broken,
and the actual benchmarks had bitrotted a bit. Fix these things. Also put them
into a separate submodule from the regular tests, so that not as many
feature gates (#[cfg(feature="nightly")]) are required.
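
A sketch of the layout, with a hypothetical function under benchmark. Gating
the whole `bench` submodule once means the individual items inside it need no
attributes of their own.

```rust
/// Stand-in for some code worth benchmarking.
pub fn sum_bytes(data: &[u8]) -> u64 {
    data.iter().map(|&b| u64::from(b)).sum()
}

// The crate root additionally needs something like:
//   #![cfg_attr(all(test, feature = "nightly"), feature(test))]
// so that the unstable `test` crate is only requested on nightly builds.

/// Compiled only for test builds with the "nightly" cargo feature enabled.
#[cfg(all(test, feature = "nightly"))]
mod bench {
    extern crate test;
    use self::test::Bencher;
    use super::sum_bytes;

    #[bench]
    fn bench_sum_bytes(b: &mut Bencher) {
        let data = vec![1u8; 4096];
        b.iter(|| sum_bytes(test::black_box(&data)));
    }
}
```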
This is as described in design/time.md. Other aspects of that design
(including using the monotonic clock and adjusting the durations to compensate
for camera clock frequency error) are not implemented yet. No new tests yet.
Just trying to get some flight miles on these ideas as soon as I can.
The advantages of the new schema are:
* overlapping recordings can be unambiguously described and viewed.
This is a significant problem right now; the clock on my cameras appears to
run faster than the (NTP-synchronized) clock on my NVR. Thus, if an
RTSP session drops and is quickly reconnected, there's likely to be
overlap.
* less I/O is required to view mp4s when there are multiple cameras.
This is a pretty dramatic difference in the number of database read
syscalls with pragma page_size = 1024 (605 -> 39 in one test),
although I'm not sure how much of that maps to actual I/O wait time.
That's probably as dramatic as it is due to overflow page chaining.
But even with larger page sizes, there's an improvement. It helps to
stop interleaving the video_index fields from different cameras.
There are changes to the JSON API to take advantage of this, described
in design/api.md.
There's an upgrade procedure, described in guide/schema.md.
This crate is a slightly-more-polished and MIT-licensed version of
resource.rs. So far it has one advantage: running the tests doesn't
require RUST_TEST_THREADS=1.
The benchmarks now require "cargo bench --features=nightly". The
extra #[cfg(feature="nightly")] switches in the code needed for it are a bit
annoying; I may move the benches to a separate directory to avoid this.
But for now, this works.
This is a significant milestone; now the Rust branch matches the C++ branch's
features.
In the process, I switched from using serde_derive (which requires nightly
Rust) to serde_codegen (which does not). It was easier than I thought it'd
be. I'm getting close to no longer requiring nightly Rust.
It would be nice to build on stable Rust. In particular, I'm hitting
compiler bugs in Rust nightly, such as this one:
https://github.com/rust-lang/rust/issues/38177
I imagine beta/stable compilers would be less problematic.
These two features were easy to get rid of:
* alloc was used to get a Box<[u8]> to uninitialized memory.
Looks like that's possible with Vec.
* box_syntax wasn't actually used at all. (Maybe a leftover from something.)
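
A sketch of getting a `Box<[u8]>` through `Vec` on stable Rust. Note that this
swaps in safe variants: the first zero-fills, and the second lets a reader
fill spare capacity directly. Neither hands out uninitialized memory, so they
are not necessarily what the code does. Function names are hypothetical.

```rust
use std::io::{self, Read};

/// Allocates a zero-filled buffer of the given length.
pub fn alloc_buffer(len: usize) -> Box<[u8]> {
    vec![0u8; len].into_boxed_slice()
}

/// Reads exactly `len` bytes into a fresh buffer without zero-filling first:
/// read_to_end appends into the Vec's reserved capacity.
pub fn read_boxed<R: Read>(r: R, len: usize) -> io::Result<Box<[u8]>> {
    let mut v = Vec::with_capacity(len);
    r.take(len as u64).read_to_end(&mut v)?;
    if v.len() != len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "input ended before the buffer was filled",
        ));
    }
    Ok(v.into_boxed_slice())
}
```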
The remaining features are:
* plugin, for clippy.
https://github.com/rust-lang/rust/issues/29597
I could easily gate it with a "nightly" cargo feature.
* proc_macro, for serde_derive.
https://github.com/rust-lang/rust/issues/35900
serde does support stable rust, although it's annoying.
https://serde.rs/codegen-stable.html
I might just wait a bit; this feature looks like it's getting close to
stabilization.
This test is copied from the C++ implementation. It ensures the timestamps are
calculated accurately from the pts rather than using ffmpeg's estimated
duration. The Rust implementation was doing the easy-but-inaccurate thing, so
fix that to make the test pass.
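
A minimal sketch of the accurate approach: each frame's duration is the
difference between its pts and the next frame's. It assumes the pts values are
already in presentation order and share one time base; the function name and
error type are hypothetical.

```rust
/// Returns the duration of every frame but the last, computed from successive
/// pts values rather than from a per-frame duration estimate.
pub fn durations_from_pts(pts: &[i64]) -> Result<Vec<i64>, String> {
    let mut out = Vec::with_capacity(pts.len().saturating_sub(1));
    for w in pts.windows(2) {
        let d = w[1]
            .checked_sub(w[0])
            .ok_or_else(|| "pts difference overflows i64".to_string())?;
        if d <= 0 {
            return Err(format!("non-increasing pts: {} then {}", w[0], w[1]));
        }
        out.push(d);
    }
    Ok(out)
}
```

Because the durations are differences, they always sum to the last pts minus
the first, so rounding in an estimated duration can't accumulate into drift.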
Additionally, I did this with a code structure that should ensure the Rust
code never drops a Writer without indicating to the syncer that its uuid is
abandoned. Such a bug essentially leaks the partially-written file, although a
restart would cause it to be properly unlinked and marked as such. There are
no tests (yet) that exercise this scenario, though.
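
One way to get that guarantee is a `Drop` implementation, sketched below. The
command enum, the channel to the syncer, and the `u64` standing in for the
uuid are all assumptions for illustration, not the real types.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical messages to the syncer.
#[derive(Debug, PartialEq, Eq)]
pub enum SyncerCommand {
    Finished(u64),
    Abandoned(u64),
}

pub struct Writer {
    id: u64, // stand-in for the sample file's uuid
    syncer: Sender<SyncerCommand>,
    finished: bool,
}

impl Writer {
    pub fn new(id: u64, syncer: Sender<SyncerCommand>) -> Self {
        Writer { id, syncer, finished: false }
    }

    /// Consumes the writer and reports success to the syncer.
    pub fn close(mut self) {
        self.finished = true;
        let _ = self.syncer.send(SyncerCommand::Finished(self.id));
        // `self` is dropped here; Drop sees finished == true and stays quiet.
    }
}

impl Drop for Writer {
    fn drop(&mut self) {
        // Any path that didn't go through close() (early return, `?`, panic
        // unwinding) reports the id as abandoned.
        if !self.finished {
            let _ = self.syncer.send(SyncerCommand::Abandoned(self.id));
        }
    }
}
```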
* new, more thorough tests based on a "BoxCursor" which navigates the
resulting .mp4. This tests everything the C++ code was testing on
Mp4SamplePieces. And it goes beyond: it tests the actual resulting .mp4
file, not some internal logic.
* fix recording::Segment::foreach to properly handle a truncated ending.
Before this was causing a panic.
* get rid of the separate recording::Segment::init method. This was some of
the first Rust I ever wrote, and I must have thought I couldn't loan it my
locked database. I can, and that's cleaner. Now Segments are never
half-initialized. Less to test, less to go wrong.
* fix recording::Segment::new to treat a trailing zero duration on a segment
with a non-zero start in the same way as it does with a zero start. I'm
still not sure what I'm doing makes sense, but at least it's not
surprisingly inconsistent.
* add separate, smaller tests of recording::Segment
* address a couple TODOs in the .mp4 code and add missing comments
* change a couple panics on database corruption into cleaner error returns
* increment the etag version given the .mp4 output has changed
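
To give a flavor of what navigating the resulting .mp4 involves, here is a
much smaller sketch than the real BoxCursor: it lists the sibling boxes in a
byte slice using the standard ISO BMFF header (32-bit big-endian length, then
a four-byte type). The names are hypothetical, and 64-bit and to-end-of-file
lengths are deliberately rejected.

```rust
/// One box header within a buffer.
#[derive(Debug, PartialEq, Eq)]
pub struct BoxHeader {
    pub fourcc: [u8; 4],
    /// Offset of the box (including its header) within the buffer.
    pub start: usize,
    /// Length of the box, including its 8-byte header.
    pub len: usize,
}

/// Lists the boxes found back-to-back in `data`. To descend into a container
/// box, call this again on data[start + 8..start + len].
pub fn sibling_boxes(data: &[u8]) -> Result<Vec<BoxHeader>, String> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        let hdr = data
            .get(pos..pos + 8)
            .ok_or_else(|| format!("truncated box header at {}", pos))?;
        let len = u32::from_be_bytes([hdr[0], hdr[1], hdr[2], hdr[3]]) as usize;
        // Lengths 0 (to end of file) and 1 (64-bit size follows) are valid in
        // the format but unsupported in this sketch.
        if len < 8 || len > data.len() - pos {
            return Err(format!("bad box length {} at {}", len, pos));
        }
        out.push(BoxHeader {
            fourcc: [hdr[4], hdr[5], hdr[6], hdr[7]],
            start: pos,
            len,
        });
        pos += len;
    }
    Ok(out)
}
```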
I should have submitted/pushed more incrementally but just played with it on
my computer as I was learning the language. The new Rust version more or less
matches the functionality of the current C++ version, although there are many
caveats listed below.
Upgrade notes: when moving from the C++ version, I recommend dropping and
recreating the "recording_cover" index in SQLite3 to pick up the addition of
the "video_sync_samples" column:
$ sudo systemctl stop moonfire-nvr
$ sudo -u moonfire-nvr sqlite3 /var/lib/moonfire-nvr/db/db
sqlite> drop index recording_cover;
sqlite> create index ...rest of command as in schema.sql...;
sqlite> ^D
Some known visible differences from the C++ version:
* .mp4 generation queries SQLite3 differently. Before it would just get all
video indexes in a single query. Now it leads with a query that should be
satisfiable by the covering index (assuming the index has been recreated as
noted above), then queries individual recordings' indexes as needed to fill
an LRU cache. I believe this is roughly similar speed for the initial hit
(which generates the moov part of the file) and significantly faster when
seeking. I would have done it a while ago with the C++ version but didn't
want to track down an LRU cache library. It was easier to find with Rust.
* On startup, the Rust version cleans up old reserved files. This is as in the
design; the C++ version was just missing this code.
* The .html recording list output is a little different. It's in ascending
order, with the most current segment shorter than an hour rather than the
oldest. This is less ergonomic, but it was easy. I could fix it or just wait
to obsolete it with some fancier JavaScript UI.
* Command-line argument parsing and logging have changed formats due to
different underlying libraries.
* The JSON output isn't quite right (matching the spec / C++ implementation)
yet.
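
To illustrate the load-on-miss pattern from the first point above, here is a
deliberately tiny LRU cache. It is not the library the code uses: lookups are
O(n), the key is a bare `i64` recording id, and the error type is `String`,
all assumptions made to keep the sketch short.

```rust
use std::collections::VecDeque;

/// Tiny LRU cache keyed by recording id; front = most recently used.
pub struct TinyLru<V> {
    cap: usize,
    entries: VecDeque<(i64, V)>,
}

impl<V> TinyLru<V> {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be non-zero");
        TinyLru { cap, entries: VecDeque::with_capacity(cap) }
    }

    /// Returns the cached value for `id`, calling `load` (standing in for the
    /// per-recording database query) only on a miss.
    pub fn get_or_load<F>(&mut self, id: i64, load: F) -> Result<&V, String>
    where
        F: FnOnce(i64) -> Result<V, String>,
    {
        let entry = match self.entries.iter().position(|(k, _)| *k == id) {
            Some(i) => self.entries.remove(i).expect("position is in range"),
            None => {
                // Load before evicting so a failed load leaves the cache as is.
                let v = load(id)?;
                if self.entries.len() == self.cap {
                    self.entries.pop_back(); // drop the least recently used
                }
                (id, v)
            }
        };
        self.entries.push_front(entry);
        Ok(&self.entries[0].1)
    }
}
```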
Additional caveats:
* I haven't done any proof-reading of prep.sh + install instructions.
* There's a lot of code quality work to do: adding (back) comments and test
coverage, developing a good Rust style.
* The ffmpeg foreign function interface is particularly sketchy. I'd
eventually like to switch to something based on autogenerated bindings.
I'd also like to use pure Rust code where practical, but once I do on-NVR
motion detection I'll need to use existing C/C++ libraries for speed (H.264
decoding + OpenCL-based analysis).