Commit Graph

72 Commits

Author SHA1 Message Date
Scott Lamb
7fe9d34655 cargo fix --all
* it added "dyn" to trait objects
* it changed "..." in patterns to "..="

cargo --version says: "cargo 1.37.0-nightly (545f35425 2019-05-23)"
2019-06-14 08:47:11 -07:00
Scott Lamb
428f5a3ba4 update a few deps
cursive & rusqlite are more significant; I'll do those separately
2019-05-31 15:08:49 -07:00
Scott Lamb
579150c9d5 redact URLs within stream.rs; fixes #13 2019-02-13 22:34:19 -08:00
Scott Lamb
c271cfa2b5 make Writer enforce maximum recording duration
My installation recently somehow ended up with a recording with a
duration of 503793844 90,000ths of a second, way over the maximum of 5
minutes. (Looks like the machine was pretty unresponsive at the time
and/or having network problems.)

When this happens, the system really spirals. Every flush afterward (12
per minute with my installation) fails with a CHECK constraint failure
on the recording table. It never gives up on that recording. /var/log
fills pretty quickly as this failure is extremely verbose (a stack
trace, and a line for each byte of video_index). Eventually the sample
file dirs fill up too as it continues writing video samples while GC is
stuck. The video samples are useless anyway; given that they're not
referenced in the database, they'll be deleted on next startup.

This ensures the offending recording is never added to the database, so
we don't get the same persistent problem. Instead, writing to the
recording will fail. The stream will drop and be retried. If the
underlying condition that caused a too-long recording (many
non-key-frames, or the camera returning a crazy duration, or the
monotonic clock jumping forward extremely, or something) has gone away,
the system should recover.
2019-01-29 08:26:36 -08:00
Scott Lamb
95a8c2e78d support .mp4 files > 13.25 hours
Use version 1 of the mvhd, tkhd, and mdhd boxes to support 64-bit
durations. 2^32 units / 90,000 units/sec / 60 sec/min / 60 min/hr ~=
13.25 hrs.

Compatibility: looks like Chrome, Firefox, VLC, and ffmepg all support
version 1 with no problem.
2019-01-07 00:59:32 -08:00
Scott Lamb
de643f9f8d include segments in debug output 2018-12-29 13:15:01 -06:00
Scott Lamb
eb8a51aecb add a url for getting debug info about a .mp4 file
and add a unit test of path decoding along the way
2018-12-29 13:09:16 -06:00
Scott Lamb
b5387af3d4 lose "extern crate" everywhere (Rust 2018 edition) 2018-12-28 21:59:39 -06:00
Scott Lamb
f5703b9968 introduce typed errors and use in mp4 code
Fixes #46. If there are no video_sample_entries, it returns
InvalidArgument, which gets mapped to a HTTP 400. Various other failures
turn into non-500s as well.

There are many places that can & should be using typed errors, but it's
a start.
2018-12-28 17:30:33 -06:00
Scott Lamb
699ec87968 upgrade to 2018 Rust edition
This is mostly just "cargo fix --edition" + Cargo.toml changes.
There's one fix for upgrading to NLL in db/writer.rs:
Writer::previously_opened wouldn't build with NLL because of a
double-borrow the previous borrow checker somehow didn't catch.
Restructure to avoid it.

I'll put elective NLL changes in a following commit.
2018-12-28 14:59:06 -06:00
Scott Lamb
89fa35a2f7 be slightly more graceful on bad /view.mp4 (#46)
Before, this would panic from the reactor thread. After, it returns a
internal server error. Still not ideal, but better.

To return "bad request" as it should, mp4::FileBuilder::build() should
return a new error type that distinguishes "invalid argument" from
"internal" and the like. I'm thinking of using a ErrorKind enum
throughout the program that's similar to grpc::StatusCode.
2018-12-28 09:01:47 -06:00
Scott Lamb
071be03c6f update most deps, notably including reqwest
Fixes #60

The reqwest dependency is significant because the old version required
an old version of openssl, complicating compilation on newer platforms.
reqwest also pulled in old/duplicate versions of hyper, tokio, etc.
Nice to drop a lot of that cruft.

I left rusqlite and uuid alone because they had breaking changes I
didn't want to mess with at the moment.

Bumped the minimum Rust version to 1.30.0, as required by the
new encoding_rs crate (and perhaps other things).
2018-11-20 09:32:55 -08:00
Scott Lamb
955a0a8c15 upgrade to hyper 0.12.x
Just one (intentional) functional change---now the streamers start
shutting down while the webserver shuts down gracefully.
2018-08-29 22:26:19 -07:00
Scott Lamb
8dc5d64333 make with_recording_playback less monomorphized
This is a minor code size reduction - instead of being monomorphized
into four variants (according to "cargo llvm-lines"), it's now
monomorphized into two. The stripped release binary on macOS is about
8kB smaller (0.15%). Not a huge improvement but better than nothing.

Benchmarks seem unchanged (though they have a lot of variance).
2018-08-24 15:34:42 -07:00
Scott Lamb
b0071515e0 update deps
I want to use hyper::server::Request::bytes_mut(), so an update is
needed. Update everything at once. Most notably, the http-serve update
starts using the http crate types for some things. (More to come.)
2018-04-06 15:54:52 -07:00
Scott Lamb
97d831e054 move strutil to base crate
I plan to use strutil::hex in db/auth.rs.
2018-03-30 08:54:20 -07:00
Scott Lamb
91636d3193 refine flush_if_sec behavior
The new behavior eliminates a couple unpleasant edge cases in which it
would never flush:

* if all recording stops, whatever was unflushed would stay that way
* if every recording attempt produces a 0-duration recording (such as if the
  camera sends only one frame and thus no PTS delta can be calculated),
  the list of recordings to flush would continue to grow
2018-03-23 15:16:43 -07:00
Scott Lamb
addeb9d2f6 add a TimerGuard around db locks & ops
I moved the clocks member from LockedDatabase to Database to make this happen,
so the new DatabaseGuard (replacing a direct MutexGuard<LockedDatabase>) can
access it before acquiring the lock.

I also made the type of clock a type parameter of Database (and so several
other things throughout the system). This allowed me to drop the Arc<>, but
more importantly it means that the Clocks trait doesn't need to stay
object-safe. I plan to take advantage of that shortly.
2018-03-23 13:31:23 -07:00
Scott Lamb
d6fa470713 tests and fixes for Writer and Syncer
* separate these out into a new file, writer.rs, as dir.rs was getting
  unwieldy.
* extract traits for the parts of SampleFileDir and std::fs::File they needed;
  set up mock implementations.
* move clock.rs to a new base crate to be accessible from the db crate.
* add tests that exercise all the retry paths.
* bugfix: account for the new recording's bytes when calculating how much to
  delete.
* bugfix: when retrying an unlink failure in collect_garbage, we shouldn't
  warn about all the recordings no longer existing. Do this by retrying each
  step rather than the whole procedure again.
* avoid double-panic scenarios, which I hit while tweaking the mocks. These
  are quite annoying to debug as Rust doesn't print information about either
  panic. I ended up using lldb to get a backtrace. Better to be cautious about
  what we're doing when already panicking.
* give more context on raw::insert_recording errors, which I hit as well while
  tweaking the new tests.
2018-03-07 04:42:46 -08:00
Scott Lamb
b78ffc3808 view in-progress recordings!
The time from recorded to viewable was previously 60-120 sec for the first
recording of a RTSP session, 0-60 sec otherwise. Now it's one frame.
2018-03-02 15:40:32 -08:00
Scott Lamb
45f7b30619 allow listing and viewing uncommitted recordings
There may be considerable lag between being fully written and being committed
when using the flush_if_sec feature. Additionally, this is a step toward
listing and viewing recordings before they're fully written. That's a
considerable delay: 60 to 120 seconds for the first recording of a run,
0 to 60 seconds for subsequent recordings.

These recordings aren't yet included in the information returned by
/api/?days=true. They probably should be, but small steps.
2018-03-02 11:38:11 -08:00
Scott Lamb
b17761e871 move list_recordings_by_* logic into raw.rs
I want to start having the db.rs version augment this with the uncommitted
recordings, and it's nice to have the separation of the raw db vs augmented
versions. Also, this fits with the general theme of shrinking db.rs a bit.

I had to put the raw video_sample_entry_id into the rows rather than
the video_sample_entry Arc. In hindsight, this is better anyway: the common
callers don't need to do the btree lookup and arc clone on every row. I think
I'd originally done it that way only because I was quite new to rust and
didn't understand that db could be used from within the row callback given
that both borrows are immutable.
2018-03-01 20:59:05 -08:00
Scott Lamb
fb4d88d3e2 make db::dir::Writer equally stubborn
Every recording it starts must be sent to the syncer with at least one sample
written. It will try forever (unless the channel is down, then panic). This
avoids the situation in which it prevents something in the uncommitted
VecDeque from ever being synced and thus any further recordings from being
flushed.
2018-02-28 12:32:52 -08:00
Scott Lamb
843e1b49c8 take FnMut closures by reference
I mistakenly thought these had to be monomorphized. (The FnOnce still
does, until rust-lang/rfcs#1909 is implemented.) Turns out this way works
fine. It should result in less compile time / code size, though I didn't check
this.
2018-02-23 09:19:42 -08:00
Scott Lamb
b037c9bdd7 knob to reduce db commits (SSD write cycles)
This improves the practicality of having many streams (including the doubling
of streams by having main + sub streams for each camera). With these tuned
properly, extra streams don't cause any extra write cycles in normal or error
cases. Consider the worst case in which each RTSP session immediately sends a
single frame and then fails. Moonfire retries every second, so this would
formerly cause one commit per second per stream. (flush_if_sec=0 preserves
this behavior.) Now the commits can be arbitrarily infrequent by setting
higher values of flush_if_sec.

WARNING: this isn't production-ready! I hacked up dir.rs to make tests pass
and "moonfire-nvr run" work in the best-case scenario, but it doesn't handle
errors gracefully. I've been debating what to do when writing a recording
fails. I considered "abandoning" the recording then either reusing or skipping
its id. (in the latter case, marking the file as garbage if it can't be
unlinked immediately). I think now there's no point in abandoning a recording.
If I can't write to that file, there's no reason to believe another will work
better. It's better to retry that recording forever, and perhaps put the whole
directory into an error state that stops recording until those writes go
through. I'm planning to redesign dir.rs to make this happen.
2018-02-22 16:35:34 -08:00
Scott Lamb
31adbc1e9f initial split of database to a separate crate
It should reduce compile time / memory usage to put quite a bit of the code
into a separate crate. I also intend to limit visibility of some things to
only within the db crate, but that's for a future change. This is the smallest
move that will compile.
2018-02-20 23:15:39 -08:00
Scott Lamb
d84e754b2a replace homegrown Error with failure crate
This reduces boilerplate, making it a bit easier for me to split the db stuff
out into its own crate.
2018-02-20 22:46:14 -08:00
Scott Lamb
253f3de399 reorganize the sample file directory
The filenames now represent composite ids (stream id + recording id) rather
than a separate uuid system with its own reservation for a few benefits:

  * This provides more information when there are inconsistencies.

  * This avoids the need for managing the reservations during recording. I
    expect this to simplify delaying flushing of newly written sample files.
    Now the directory has to be scanned at startup for files that never got
    written to the database, but that's acceptably fast even with millions of
    files.

  * Less information to keep in memory and in the recording_playback table.

I'd considered using one directory per stream, which might help if the
filesystem has trouble coping with huge directories. But that would mean each
dir has to be fsync()ed separately (more latency and/or more multithreading).
So I'll stick with this until I see concrete evidence of a problem that would
solve.

Test coverage of the error conditions is poor. I plan to do some restructuring
of the db/dir code, hopefully making steps toward testability along the way.
2018-02-20 10:11:10 -08:00
Scott Lamb
89b6bccaa3 support multiple sample file directories
This is still pretty basic support. There's no config UI support for
renaming/moving the sample file directories after they are created, and no
error checking that the files are still in the expected place. I can imagine
sysadmins getting into trouble trying to change things. I hope to address at
least some of that in a follow-up change to introduce a versioning/locking
scheme that ensures databases and sample file dirs match in some way.

A bonus change that kinda got pulled along for the ride: a dialog pops up in
the config UI while a stream is being tested. The experience was pretty bad
before; there was no indication the button worked at all until it was done,
sometimes many seconds later.
2018-02-11 23:04:02 -08:00
Scott Lamb
dc402bdc01 schema version 2: support sub streams
This allows each camera to have a main and a sub stream. Previously there was
a field in the schema for the sub stream's url, but it didn't do anything. Now
you can configure individual retention for main and sub streams. They show up
grouped in the UI.

No support for upgrading from schema version 1 yet.
2018-02-03 22:15:54 -08:00
Scott Lamb
6902be1981 upgrade deps 2018-01-30 22:05:39 -08:00
Scott Lamb
8caa2e5d0e crate rename: http-(entity|file) -> http-serve 2018-01-23 11:08:21 -08:00
Scott Lamb
5c8970fe8a update dependencies 2017-11-16 23:01:09 -08:00
Scott Lamb
9041eeb907 fix panic when requesting zero segment duration
The recording::Segment was constructing a segment with no frames in it, which
was causing a panic when appending a zero-length stts to the Slices. Fix this
in a couple ways:

* Slices::append should return Err rather than panic. No reason to crash the
  whole program when we have trouble serving a single .mp4 request.
* recording::Segment shouldn't produce zero-frame segments
2017-10-17 08:55:21 -07:00
Scott Lamb
1d08698d0c debug, fix panic with zero-duration recording
I had an assert that fired in this case, dating back to when I hadn't plumbed
Result returns through much of .mp4 construction. Now I have, so there's no
excuse in having an assert here. Change to an error return, and tweak it to
not fire in the zero-duration case.

Also fix a problem in the test harness; I hadn't finished converting it for
multi-recording tests, and it was returning the wrong recording.

Because of that, I seem to have stumbled across a related problem in which
asking for zero duration of a non-zero duration recording will return a
recording::Segment with no frames, which will cause panics because its
corresponding .mp4 slices are zero-length. I just adjusted the panic message
here; I'll follow up with changes to address that.
2017-10-17 06:14:47 -07:00
Scott Lamb
711f7b3409 fix with-editlist hash
I missed this because I was running with ffmpeg 3 and had grown to expect this
test to fail. Quick fix on that coming shortly.
2017-10-09 21:00:45 -07:00
Scott Lamb
af282c309e fix corrupt stss on segments after trimmed segment
This was causing Firefox to fail to play multipart .mp4s which trimmed away a
prefix. In the developer console, it said NS_ERROR_DOM_MEDIA_METADATA_ERR
without giving any RESULT_DETAIL, making it a pain to diagnose. Given that the
stss is supposed to be needed for seeking, I'm surprised it didn't have any
immediately obvious impact on Chrome or VLC.  Maybe they just took longer to
seek than otherwise necessary.

The bug was that when keeping track of the "next frame num" while constructing
the .mp4, I appended the number in the underlying recording, not the number
post-trimming. That meant following segments used the wrong numbers. In some
cases, it caused it to exceed the total number of samples in the generated
.mp4, which seems to be what Firefox was complaining about. Running the result
through "ffmpeg -i bad.mp4 -c copy -f mp4 good.mp4" just trimmed away the most
obviously invalid ones, leaving others that didn't point to the frames they
meant to. That was enough to make Firefox start playing the file. /shruggie

The existing tests were all with a single segment, so I added a new one to
catch this. I also added a Debug implementation to recording::Segment and
mp4::Segment.
2017-10-09 06:32:43 -07:00
Scott Lamb
919e9a6deb remove extraneous debug logging 2017-10-04 22:55:29 -07:00
Scott Lamb
cb18ba44d8 fix /view.mp4 with rel_start
This was totally broken in commit 1cf27c18. It would serve bytes from the
beginning of the sample file in question, not from the start of the given
range.
2017-10-04 22:51:16 -07:00
Scott Lamb
5ea2c2fed1 fix media rate in edit list
it should be exactly 1, but was slightly more because the fraction was
incorrectly 1 rather than 0. I'm not sure if any actual players care about
this, but it was something I noticed when looking into strange edit list
behavior.
2017-10-04 00:03:33 -07:00
Scott Lamb
7673a00bd9 serve 'video/mp4; codecs="avc1.xxxxxx"' mime type
This can be used when constructing a HTML5 SourceBuffer.
2017-10-03 23:25:58 -07:00
Scott Lamb
04e9f3f160 support segmented mp4s
This is intended to support HTML5 Media Source Extensions, which I expect to
be the most practical way to make a good web UI with a proper scrub bar and
such.

This feature has had very limited testing on Chrome and Firefox, and that was
not entirely successful. More work is needed before it's usable, but this
seems like a helpful progress checkpoint.
2017-10-01 15:29:22 -07:00
Scott Lamb
11420df065 update deps (particularly hyper) + fix warnings 2017-09-21 21:51:58 -07:00
Scott Lamb
857a66f29c use my own ffmpeg crate
This significantly improves safety of the ffmpeg interface. The complex
ABIs aren't accessed directly from Rust. Instead, I have a small C
wrapper which uses the ffmpeg C API and the C headers at compile-time to
determine the proper ABI in the same way any C program using ffmpeg
would, so that the ABI doesn't have to be duplicated in Rust code.
I've tested with ffmpeg 2.x and ffmpeg 3.x; it seems to work properly
with both where before ffmpeg 3.x caused segfaults.

It still depends on ABI compatibility between the compiled and running
versions. C programs need this, too, and normal shared library
versioning practices provide this guarantee. But log both versions on
startup for diagnosing problems with this.

Fixes #7
2017-09-20 21:06:06 -07:00
Scott Lamb
ac43e7fe17 fix bench --features=nightly, broken by upgrade
The benchmarks don't get compiled with the standard "cargo test";
they require "cargo +nightly bench --features=nightly", so I didn't notice
they were broken in the previous commit. Now fixed.
2017-06-11 19:40:36 -07:00
Scott Lamb
bebd6ee79a update dependencies
* The mylog update fixes a couple bad bugs.
* Otherwise, just keep up with the Rust ecosystem.
2017-06-11 12:57:55 -07:00
Scott Lamb
30cda85a2e shrink mp4::Segment by another 24 bytes on 32-bit
This is 1,440 bytes for a 60-segment .mp4, so another modest cache
improvement.
2017-03-27 20:55:58 -07:00
Scott Lamb
c3cffb510b make an assert more informative
I got this error but didn't understand how it happened.
2017-03-05 00:58:06 -08:00
Scott Lamb
1cf27c189f upgrade to async hyper
serve_generated_bytes is >3X faster. One caveat is that the reactor thread may
stall when reading from the memory-mapped slice. Moonfire NVR is basically a
single-user program, so that may not be so bad, but we'll see.
2017-03-02 19:29:28 -08:00
Scott Lamb
618709734a trim the recording playback cache a bit
It had an Arc which in hindsight isn't necessary; the actual video index
generation is fast anyway. This saves a couple pointers per cache entry and
the overhead of chasing them. LruCache itself also has some extra pointers on
it but that's something to address another day.
2017-02-28 23:28:25 -08:00