Reading from the mmap()ed region in the tokio threads could cause
them to stall:
* That could affect UI serving when there were concurrent
UI requests (i.e., not just requests that needed the reads in
question anyway).
* If there's a faulty disk, it could cause the UI to totally hang.
Better to not mix disks between threads.
* Soon, I want to handle RTSP from the tokio threads (#37). Similarly,
we don't want RTSP streaming to block on operations from unrelated
disks.
I went with just one thread per disk which I think is sufficient.
But it'd be possible to do a fixed-size pool instead which might improve
latency when some pages are already cached.
I also dropped the memmap dependency. I had to compute the page
alignment anyway to get mremap to work, and Moonfire NVR already is
Unix-specific, so there wasn't much value from the memmap or memmap2
crates.
Fixes#88
* my dad's GW4089IP cameras use 720x480
* some Reolink cameras use 640x352
* I'm playing with rotated cameras (16x9 -> 9x16)
I'd prefer to calculate pasp from a configured camera aspect ratio
than to hardcode the assumption these are 16x9, but that requires
a schema change. This is an improvement for now.
The immediate motivation is to address these CI failures with nightly:
https://github.com/scottlamb/moonfire-nvr/runs/2593322801?check_suite_focus=true
```
Compiling lock_api v0.4.2
error[E0557]: feature has been removed
--> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/lock_api-0.4.2/src/lib.rs:91:42
|
91 | #![cfg_attr(feature = "nightly", feature(const_fn))]
| ^^^^^^^^ feature has been removed
|
= note: split into finer-grained feature gates
```
Strangely, they don't occur locally with "rustc 1.54.0-nightly
(fe72845f7 2021-05-16)" but do on CI with the exact same version?!?
I don't get it, but lock-api 0.4.4 is advertised as being updated for
latest nightly, so I expect this will address the problem anyway.
I saw this error once:
Apr 27 21:01:33 nuc moonfire-nvr[188570]: s-reolink-sub
moonfire_nvr::streamer] reolink-sub: sleeping for Duration { secs: 1,
nanos: 0 } after error: CHECK constraint failed: video_sample_entry
and would like to understand it better.
* send folks to the issue tracker with the same template as bug
reports. Gathering this extra info up front should help things
move more quickly.
* in the template, ask for camera info.
* API change: in update signals, allow setting a start time relative
to now. This is an accuracy improvement in the case where the client
has been retrying an initial request for a while. Kind of an obscure
corner case but easy enough to address. And use a more convenient
enum representation.
* in update signals, choose `now` before acquiring the database lock.
If lock acquisition takes a long time, this more accurately reflects
the time the caller intended.
* in general, make Time and Duration (de)serializable and use them
in json types. This makes the types more self-describing, with
better debug printing on both the server side and on the client
library (in moonfire-playground). To make this work, base has to
import serde which initially seemed like poor layering to me, but
serde seems to be imported in some pretty foundational Rust crates
for this reason. I'll go with it.
* Use the standard UUID syntax for /etc/fstab
* Added instruction to create sample directory
* Update install.md
* Change sample ownership instead of perms
In particular, this was happening out of the box on Raspberry Pi OS Lite
20210304, as reported by ironoxidizer@gmail.com here:
https://groups.google.com/g/moonfire-nvr-users/c/2j9LvfFl2u8/m/tJcNS2WfCQAJ
* adjust main.rs to make the problem more obvious
* mention it in the troubleshooting guide
* sidestep it in the nvr docker wrapper script
also just use --networking=host rather than --publish (avoiding a proxy
process). I'm using Docker to simplify the build and deployment process,
not as a security boundary, so just do the simpler thing.
To improve reliability of live streams (#59) on Safari.
Safari was dropping the cookie from websocket update requests.
(But it worked sometimes. I don't get why.) I saw folks on the Internet
thinking this related to HttpOnly:
* https://developer.apple.com/forums/thread/104488
* https://stackoverflow.com/q/47742807/23584
but I still see this behavior without HttpOnly. SameSite=Strict vs
SameSite=Lax appears to make a difference. Try that instead.
SameSite=Strict is pointless for us anyway as noted in a new comment.
Turning off HttpOnly would be more unfortunate security-wise.
* don't error out on websocket error message. The close message has
more useful info. (Unfortunately though, for security reasons it
doesn't give too much to the script on initial connection failure.)
* restore normal (non-error/waiting) state when switching cameras.
* prefix all log messages with the camera name
* When the MediaSource is in error state (or busy, I think),
endOfStream throws an exception. This prevented the error from being
properly displayed in the UI. We don't really need to call
endOfStream, I guess.
* including an Event in a format string just said [Object object],
at least on Safari. Use the type instead, which I think is the
only useful info in the event anyway.
As required for live view (#59) to work on Safari.
Safari has some "interesting" expectations:
* There must be a non-empty list of compatible brands. The major brand
is not automatically included. (Looks like ISO/IEC 14496-12 doesn't
spell out which is correct.)
* The tfdt box must be before the trun boxes. Moonfire NVR was not
compliant with ISO/IEC 14496-12:2015 section 8.8.12.1 before.
Chrome and Firefox didn't care, but Safari does.
* The mdat must be written with the small format. Safari is not
implementing the spec properly.
I figured these out by painstakingly comparing Moonfire NVR's output
with gpac's, making it match almost byte-for-byte until it worked, then
backing out changes one at a time to check which were relevant. Ugh!
Most importantly, in build-ui.bash, fix an extra "&&". This meant
that if the build command fails, it would proceed and cause confusion.
This happened for me: I ran it without the emulation installed. Both
the build-server and deploy stages had problems, but because of the "&&"
the deploy target didn't actually return failure. After I fixed the
emulation problem, there was a bad cached layer.
Also save the build output from the dev stage.
It's a start. It can display several streams at once, which is nice.
There are lots of opportunities for improvement:
* it doesn't keep the videos approximately in sync.
* it accumulates extra buffering, drifting behind live. This is
particularly noticeable when it's paused and played again; it can
be several seconds before it jumps to after the break.
* it always uses the sub stream rather main. I'd prefer it support
"auto" (use main if the viewport is larger than the sub stream and
there's sufficient bandwidth), "main", or "sub".
* it has a kludgy heuristic where it throws away everything buffered 5
seconds before the current timestamp. It should throw away
everything before the current GOP instead, but I need to alter the
API so it can easily know when that is.
* it can't tell you when a camera connection is down. This needs an
API change also.
* it'd be nice to quickly double-click on a stream to view only it,
then double-click again to go back to the multi-pane view.
* it doesn't allow you to zoom in on part of the video. This would be
nice particularly when viewing 4k video streams on small screens.
* it has only four preconfigured layouts that subdivide a 16x9
viewport. You have to choose every camera every time. It'd be nice
to both allow more flexibility and have more memory.
React prototype: #111
live stream: #59
Chrome appears to time out at 60 seconds of inactivity otherwise.
I think it's better to keep the stream open, even if the camera is
broken.
The implementation looks awkward, but that might be the state of Rust
async right now.
I spotted this by inspection: adding a media time and wall time didn't
look right. I also confirmed the brokenness on my primary NVR:
```
sqlite> .mode column
sqlite> select
...> r1.composite_id,
...> r1.prev_media_duration_90k,
...> r1.wall_duration_90k,
...> r1.media_duration_delta_90k,
...> r2.composite_id,
...> r2.prev_media_duration_90k
...> from
...> recording r1 join recording r2 on (r1.composite_id = r2.composite_id - 1)
...> where
...> r1.prev_media_duration_90k + r1.wall_duration_90k + r1.media_duration_delta_90k !=
...> r2.prev_media_duration_90k
...> limit 5;
4296791095 2232623913716 5398956 154 4296791096 2232629312672
4296791096 2232629312672 5400016 38 4296791097 2232634712688
4296791097 2232634712688 5400729 105 4296791098 2232640113417
4296791098 2232640113417 5399024 80 4296791099 2232645512441
4296791099 2232645512441 5400770 124 4296791100 2232650913211
```
In the first row, the second recording's prev_media_duration_90k is the
first's prev_media_duration_90k plus its wall time, not its media time.