moonfire-nvr

mirror of https://github.com/scottlamb/moonfire-nvr.git synced 2025-11-28 21:18:11 -05:00

Author	SHA1	Message	Date
Scott Lamb	c271cfa2b5	make Writer enforce maximum recording duration My installation recently somehow ended up with a recording with a duration of 503793844 90,000ths of a second, way over the maximum of 5 minutes. (Looks like the machine was pretty unresponsive at the time and/or having network problems.) When this happens, the system really spirals. Every flush afterward (12 per minute with my installation) fails with a CHECK constraint failure on the recording table. It never gives up on that recording. /var/log fills pretty quickly as this failure is extremely verbose (a stack trace, and a line for each byte of video_index). Eventually the sample file dirs fill up too as it continues writing video samples while GC is stuck. The video samples are useless anyway; given that they're not referenced in the database, they'll be deleted on next startup. This ensures the offending recording is never added to the database, so we don't get the same persistent problem. Instead, writing to the recording will fail. The stream will drop and be retried. If the underlying condition that caused a too-long recording (many non-key-frames, or the camera returning a crazy duration, or the monotonic clock jumping forward extremely, or something) has gone away, the system should recover.	2019-01-29 08:26:36 -08:00
Scott Lamb	3ba3bf2b18	backend support for live stream (#59) This is so far completely untested, for use by a new UI prototype. It creates a new URL endpoint which sends one video/mp4 media segment per key frame, with the dependent frames included. This means there will be about one key frame interval of latency (typically about a second). This seems hard to avoid, as mentioned in issue #59.	2019-01-21 15:58:52 -08:00
Scott Lamb	b9e6a6461f	fix streamer tests broken by `4cc796f` That commit rewrote the db/writer.rs tests. But the flush op they used before was also used by src/streamer.rs tests. Reintroduce it.	2019-01-06 07:07:04 -08:00
Scott Lamb	4cc796f697	properly test fix for #64 I went with the third idea in `1ce52e3`: have the tests run each iteration of the syncer explicitly. These are messy tests that know tons of internal details, but I think they're less confusing and racy than if I had the syncer running in a separate thread.	2019-01-04 16:11:58 -08:00
Scott Lamb	1ce52e334c	fix #64 (extraneous flushes) Now each syncer has a binary heap of the times it plans to do a flush. When one of those times arrives, it rechecks if there's something to do. Seems more straightforward than rechecking each stream's first uncommitted recording, especially with the logic to retry failed flushes every minute. Also improved the info! log for each flush to see the actual recordings being flushed for better debuggability. No new tests right now. :-( They're tricky to write. One problem is that it's hard to get the timing right: a different flush has to happen after Syncer::save's database operations and before Syncer::run calls SimulatedClocks::recv_timeout with an empty channel[*], advancing the time. I've thought of a few ways of doing this: * adding a new SyncerCommand to run something, but it's messy (have to add it from the mock of one of the actions done by the save), and Box<dyn FnOnce() + 'static> not working (see rust-lang/rust#28796) makes it especially annoying. * replacing SimulatedClocks with something more like MockClocks. Lots of boilerplate. Maybe I need to find a good general-purpose Rust mock library. (mockers sounds good but I want something that works on stable Rust.) * bypassing the Syncer::run loop, instead manually running iterations from the test. Maybe the last way is the best for now. I'm likely to try it soon. [*] actually, it's calling Receiver::recv_timeout directly; Clocks::recv_timeout is dead code now? oops.	2019-01-04 13:47:44 -08:00
Scott Lamb	55fa458288	fix confusing variable name + comment	2019-01-04 06:16:25 -08:00
Scott Lamb	eb8a51aecb	add a url for getting debug info about a .mp4 file and add a unit test of path decoding along the way	2018-12-29 13:09:16 -06:00
Scott Lamb	b5387af3d4	lose "extern crate" everywhere (Rust 2018 edition)	2018-12-28 21:59:39 -06:00
Scott Lamb	0b0f4ec9ed	NLL-inspired simplifications to db.rs * remove intermediate bool from adjust_day. * rewrite LockedDatabase::list_aggregate_recordings. I started by collapsing the flush into the first part of the if, in a similar way to adjust_day. But then I refactored more and ended up with a structure that probably would have been allowed with the old lexical borrow checker. I think it's more readable, and it does 1 btree operation per row where before it did 2 or 3.	2018-12-28 15:10:12 -06:00
Scott Lamb	699ec87968	upgrade to 2018 Rust edition This is mostly just "cargo fix --edition" + Cargo.toml changes. There's one fix for upgrading to NLL in db/writer.rs: Writer::previously_opened wouldn't build with NLL because of a double-borrow the previous borrow checker somehow didn't catch. Restructure to avoid it. I'll put elective NLL changes in a following commit.	2018-12-28 14:59:06 -06:00
Scott Lamb	ff58f24785	update deps	2018-12-28 10:13:03 -06:00
Scott Lamb	3c1163dfe2	use SameSite=Strict (for #26) I initially chose SameSite=Lax because I thought if a user followed a link to the landing page, the landing page's ajax requests wouldn't send the cookie. But I just did an experiment, and that's not true. Only the initial page load (of a .html file) lacks the cookie. All of its resources and ajax requests send the cookie. I'm not sure about document.cookie accesses, but my cookie is HttpOnly anyway, so it's irrelevant. So no reason to be lax.	2018-12-01 22:04:54 -08:00
Scott Lamb	4f87c16c31	Merge branch 'master' into auth	2018-12-01 15:27:54 -08:00
Scott Lamb	35e6891221	update all Rust deps	2018-12-01 15:20:19 -08:00
Scott Lamb	d35a4592e3	Merge branch 'master' into auth	2018-12-01 00:06:43 -08:00
Scott Lamb	131c5e0640	Fix "no garbage row for <id>" flush failure loops Add some comments along the way. Fixes #63.	2018-12-01 00:03:43 -08:00
Scott Lamb	4daf618c29	fix a couple compile errors in `422cd2a` I just ran a "cargo test" on this after a round of tweaks, not "cargo test --all", so I missed compile errors in the db crate, and a Javascript lint config error. travis-ci caught these.	2018-11-27 12:23:44 -08:00
Scott Lamb	422cd2a75e	preliminary web support for auth (#26) Some caveats: * it doesn't record the peer IP yet, which makes it harder to verify sessions are valid. This is a little annoying to do in hyper now (see hyperium/hyper#1410). The direct peer might not be what we want right now anyway because there's no TLS support yet (see #27). In the meantime, the sane way to expose Moonfire NVR to the Internet is via a proxy server, and recording the proxy's IP is not useful. Maybe better to interpret a RFC 7239 Forwarded header (and/or the older X-Forwarded-{For,Proto} headers). * it doesn't ever use Secure (https-only) cookies, for a similar reason. It's not safe to use even with a tls proxy until this is fixed. * there's no "moonfire-nvr config" support for inspecting/invalidating sessions yet. * in debug builds, logging in is crazy slow. See libpasta/libpasta#9. Some notes: * I removed the Javascript "no-use-before-defined" lint, as some of the functions form a cycle. * Fixed #20 along the way. I needed to add support for properly returning non-OK HTTP statuses to signal unauthorized and such. * I removed the Access-Control-Allow-Origin header support, which was at odds with the "SameSite=lax" in the cookie header. The "yarn start" method for running a local proxy server accomplishes the same thing as the Access-Control-Allow-Origin support in a more secure manner.	2018-11-27 11:08:33 -08:00
Scott Lamb	f9d4b5bb8a	fix accidental dependency on rust 1.30.0 travis-ci's 1.27.0 build failed with: error[E0658]: access to extern crates through prelude is experimental (see issue #44660) --> db/auth.rs:159:6 | 159 | impl rusqlite::types::FromSql for FromSqlIpAddr { | ^^^^^^^^	2018-11-02 07:02:08 -07:00
Scott Lamb	75f233da79	initial db layer work for authentication (#26)	2018-11-01 23:25:06 -07:00
Scott Lamb	8c52c36b51	upgrade a few deps	2018-08-24 22:06:14 -07:00
Scott Lamb	8dc5d64333	make with_recording_playback less monomorphized This is a minor code size reduction - instead of being monomorphized into four variants (according to "cargo llvm-lines"), it's now monomorphized into two. The stripped release binary on macOS is about 8kB smaller (0.15%). Not a huge improvement but better than nothing. Benchmarks seem unchanged (though they have a lot of variance).	2018-08-24 15:34:42 -07:00
Scott Lamb	d7a94956eb	deflake writer tests There was a race condition here because it wasn't waiting for the db flush to complete. This made write_path_retries sometimes not reflect the consequence of the flush, causing an assertion failure. I assume it was also responsible for gc_path_retries timeouts under travis-ci.	2018-08-07 21:58:40 -05:00
Scott Lamb	9982c0b080	small adjustments to auth schema Nothing uses the user and user_session tables yet; I'm trying to anticipate what auth will need before freezing schema version 3.	2018-04-27 06:24:02 -07:00
Scott Lamb	0701121586	a couple refinements to the new user_session table	2018-03-25 07:23:40 -07:00
Scott Lamb	e817b22189	remove an obsolete TODO StreamStateChanger::new already correctly ensures that non-empty streams can't switch sample file dirs.	2018-03-24 20:54:56 -07:00
Scott Lamb	65e68d3255	update design docs for new-schema branch changes	2018-03-24 20:51:30 -07:00
Scott Lamb	91636d3193	refine flush_if_sec behavior The new behavior eliminates a couple unpleasant edge cases in which it would never flush: * if all recording stops, whatever was unflushed would stay that way * if every recording attempt produces a 0-duration recording (such as if the camera sends only one frame and thus no PTS delta can be calculated), the list of recordings to flush would continue to grow	2018-03-23 15:16:43 -07:00
Scott Lamb	addeb9d2f6	add a TimerGuard around db locks & ops I moved the clocks member from LockedDatabase to Database to make this happen, so the new DatabaseGuard (replacing a direct MutexGuard<LockedDatabase>) can access it before acquiring the lock. I also made the type of clock a type parameter of Database (and so several other things throughout the system). This allowed me to drop the Arc<>, but more importantly it means that the Clocks trait doesn't need to stay object-safe. I plan to take advantage of that shortly.	2018-03-23 13:31:23 -07:00
Scott Lamb	c0da1ef880	make v1->v3 upgrade work with --features=bundled --features=bundled enables -DSQLITE_DEFAULT_FOREIGN_KEYS=1, and so some operations have to be done in the proper order. * enable foreign key enforcement all the time, so I test this more reliably. * reorder some parts of the v1->v3 order. foreign key enforcement is immediate (rather than deferred) by default. and ensure old_recording_playback isn't left with a dangling reference to old_recording at the v2 stage. Instead, wait until v3 to delete tables it depends on.	2018-03-22 09:05:40 -07:00
Scott Lamb	c46c50af8f	fix another upgrade error in `dfee66c`	2018-03-22 00:08:49 -07:00
Scott Lamb	2ff7ecb6f4	fix upgrade procedure broken in `dfee66c`	2018-03-22 00:00:39 -07:00
Scott Lamb	1c9f2a4d83	initial schema for user authentication (#26) This is only the database schema, which I'm adding now in the hopes of freezing schema version 3. There's no way yet to create users, much less actually authenticate.	2018-03-21 23:57:45 -07:00
Scott Lamb	dfee66c84b	support additional recording_integrity timestamps These are not actually populated by the code yet. I'm trying to get the v3 schema frozen as soon as possible; actually using the fields can come later. Add some explanation of their value in time.md, along with some general musing on leap seconds, and a correction on the frequency error of my cameras.	2018-03-21 22:32:41 -07:00
Scott Lamb	4c8daa6d24	save timestamps along with opens	2018-03-10 16:15:36 -08:00
Scott Lamb	f81d699c8c	new recording_integrity table A couple rarely-used fields move to here, and I expect I'll add more. Redo the check command to just put everything in RAM for simplicity.	2018-03-09 13:37:30 -08:00
Scott Lamb	03809eee8e	clean up leftover throwaway logging	2018-03-09 10:46:34 -08:00
Scott Lamb	d6fa470713	tests and fixes for Writer and Syncer * separate these out into a new file, writer.rs, as dir.rs was getting unwieldy. * extract traits for the parts of SampleFileDir and std::fs::File they needed; set up mock implementations. * move clock.rs to a new base crate to be accessible from the db crate. * add tests that exercise all the retry paths. * bugfix: account for the new recording's bytes when calculating how much to delete. * bugfix: when retrying an unlink failure in collect_garbage, we shouldn't warn about all the recordings no longer existing. Do this by retrying each step rather than the whole procedure again. * avoid double-panic scenarios, which I hit while tweaking the mocks. These are quite annoying to debug as Rust doesn't print information about either panic. I ended up using lldb to get a backtrace. Better to be cautious about what we're doing when already panicking. * give more context on raw::insert_recording errors, which I hit as well while tweaking the new tests.	2018-03-07 04:42:46 -08:00
Scott Lamb	b78ffc3808	view in-progress recordings! The time from recorded to viewable was previously 60-120 sec for the first recording of a RTSP session, 0-60 sec otherwise. Now it's one frame.	2018-03-02 15:40:32 -08:00
Scott Lamb	45f7b30619	allow listing and viewing uncommitted recordings There may be considerable lag between being fully written and being committed when using the flush_if_sec feature. Additionally, this is a step toward listing and viewing recordings before they're fully written. That's a considerable delay: 60 to 120 seconds for the first recording of a run, 0 to 60 seconds for subsequent recordings. These recordings aren't yet included in the information returned by /api/?days=true. They probably should be, but small steps.	2018-03-02 11:38:11 -08:00
Scott Lamb	b17761e871	move list_recordings_by_* logic into raw.rs I want to start having the db.rs version augment this with the uncommitted recordings, and it's nice to have the separation of the raw db vs augmented versions. Also, this fits with the general theme of shrinking db.rs a bit. I had to put the raw video_sample_entry_id into the rows rather than the video_sample_entry Arc. In hindsight, this is better anyway: the common callers don't need to do the btree lookup and arc clone on every row. I think I'd originally done it that way only because I was quite new to rust and didn't understand that db could be used from within the row callback given that both borrows are immutable.	2018-03-01 20:59:05 -08:00
Scott Lamb	b2a8b3c216	update "moonfire-nvr check" for new schema	2018-03-01 17:07:42 -08:00
Scott Lamb	b677964d1a	properly account for bytes to add with next flush This was considering them as 0, so it would under-delete until the next flush, then delete all at once. That effectively doubled the number of bytes not yet deleted as they're first transferred to garbage, flushed again, then unlinked.	2018-03-01 13:50:59 -08:00
Scott Lamb	0f2e71ec4a	more safety around adding/deleting dirs	2018-03-01 12:24:32 -08:00
Scott Lamb	f01f523c2c	refine 1->3 upgrade process In hindsight, the "post_tx" step in the upgrade process introduced in `e7f5733` doesn't make sense. If the procedure fails at this stage, nothing says it still needs to be completed. If the sample file dirs have to be updated after the database, then there should be another database version to mark that it's fully completed, and indeed that's the purpose version 3 serves. So get rid of the Upgrader trait and just go back to a simple run function per version. In the case of the sample file dir metadata, it actually can happen before the database transaction; the stuff written to the database later just needs to be consistent with what it finds if there's an existing metadata file from a half-completed update. For safety, ensure there are no unexpected directory contents before upgrading 1->2, and ensure the metadata matches before upgrading 2->3.	2018-03-01 09:47:56 -08:00
Scott Lamb	bcf42fe02c	move db upgrade logic into db crate This allows shrinking db's API surface.	2018-02-28 21:21:47 -08:00
Scott Lamb	fbe1231af0	move open_id from recording_playback to recording I want to be able to use it in etags without having to do a full scan of the recording_playback in advance, which would greatly increase time to first byte. I probably will even use it in urls to ensure the segments they point to are stable. I haven't actually done this yet - it will wait until I implement serving unflushed recordings - but I want to get the schema set up properly.	2018-02-28 20:52:43 -08:00
Scott Lamb	fb4d88d3e2	make db::dir::Writer equally stubborn Every recording it starts must be sent to the syncer with at least one sample written. It will try forever (unless the channel is down, then panic). This avoids the situation in which it prevents something in the uncommitted VecDeque from ever being synced and thus any further recordings from being flushed.	2018-02-28 12:32:52 -08:00
Scott Lamb	b1d71c4e8d	improve Syncer's robustness The new approach is to, rather than panicking, retry forever. The assumption is that if a given operation is failing, a following operation is unlikely to succeed, so it's simpler to just keep trying the earlier one than come up with ways to undo it and proceed with later operations. I still need to apply this approach to the Writer class. It currently unwraps (crashes) or just gives up on a recording without ever sending it to the Syncer. Given that recordings are all synced in order, that means further ones can never be synced.	2018-02-28 11:07:55 -08:00
Scott Lamb	b790075ca2	fix flush_if_sec updates not hitting db	2018-02-23 14:49:10 -08:00
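
The duration check described in `c271cfa2b5` (reject a too-long recording at write time so it never reaches the database) might look roughly like the sketch below. This is only an illustration under assumptions, not the code from that commit: `RecordingSketch`, `add_sample`, and the constant names are hypothetical. Only the 90 kHz time unit and the 5-minute maximum come from the commit message.

```rust
/// Durations are counted in 90 kHz units ("90,000ths of a second" in the commit message).
const TIME_UNITS_PER_SEC: i64 = 90_000;

/// The 5-minute maximum named in the commit message, expressed in 90 kHz units.
const MAX_RECORDING_DURATION: i64 = 5 * 60 * TIME_UNITS_PER_SEC;

/// Hypothetical stand-in for a writer's in-progress recording state.
#[derive(Debug, Default)]
pub struct RecordingSketch {
    /// Total duration accumulated so far, in 90 kHz units.
    pub duration_90k: i64,
}

impl RecordingSketch {
    /// Adds one sample's duration, failing instead of exceeding the maximum.
    /// On error the accumulated duration is left unchanged, so the caller can
    /// drop the stream and retry rather than persist an invalid recording.
    pub fn add_sample(&mut self, sample_duration_90k: i64) -> Result<(), String> {
        if sample_duration_90k < 0 {
            return Err(format!("negative sample duration {}", sample_duration_90k));
        }
        let new_duration = self
            .duration_90k
            .checked_add(sample_duration_90k)
            .ok_or_else(|| "recording duration overflowed i64".to_string())?;
        if new_duration > MAX_RECORDING_DURATION {
            return Err(format!(
                "duration {} exceeds maximum {}",
                new_duration, MAX_RECORDING_DURATION
            ));
        }
        self.duration_90k = new_duration;
        Ok(())
    }
}
```

With these assumed names, the duration from the commit message (503793844 units) is rejected, since the maximum works out to 5 * 60 * 90,000 = 27,000,000 units.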

57 Commits