This gives much better information to the UI layer, getting rid of a
whole troubleshooting guide entry. See #119, #132, #218, #219.
I also restructured the code in anticipation of a new WebSocket event
stream (#40).
Fixes #206. 307a388 switched to creating a single-threaded runtime for
each stream, then destroying it prior to waiting for TEARDOWN on shutdown.
This meant that the shutdown process could panic with this error:
```
panic at '/home/slamb/git/retina/src/client/mod.rs:219:22': teardown Sender shouldn't be dropped: RecvError(())
```
Let's switch back to expecting a multithreaded runtime context.
Create one for the config subcommand, too.
Don't go all the way back to the old code with its channels, though.
That had the downside that the underlying retina::Session might outlive
the caller, so there could still be an active session when we start
the next one. I haven't seen this cause problems in practice but it
still doesn't seem right.
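To illustrate the failure mode behind the panic above, here is a minimal sketch that uses `std::sync::mpsc` as a stand-in for whatever channel carries the teardown notification. The function name and the `send_before_exit` flag are hypothetical; the only point is that a receiver gets an error, not a value, when the sending side is destroyed before it sends.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for "wait for TEARDOWN": the worker owns the sender,
/// much as a per-stream runtime would own the task that reports teardown.
fn wait_for_teardown(send_before_exit: bool) -> Result<(), mpsc::RecvError> {
    let (tx, rx) = mpsc::channel::<()>();
    let worker = thread::spawn(move || {
        if send_before_exit {
            // Normal path: report that teardown finished.
            let _ = tx.send(());
        }
        // `tx` is dropped here either way. If nothing was sent, the
        // receiver observes a disconnect instead of a message.
    });
    let result = rx.recv();
    worker.join().expect("worker thread panicked");
    result
}
```

Calling `wait_for_teardown(false)` returns `Err(RecvError)`, which is the situation an `expect("... shouldn't be dropped")` would turn into a panic.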
* switch the config interface over to use Retina and make the test
button honor rtsp_transport = udp.
* adjust the threading model of the Retina streaming code.
Before, it spawned a background future that read from the runtime and
wrote to a channel. Other calls read from this channel.
After, it does work directly from within the block_on calls (no
channels).
The immediate motivation was that the config interface didn't have
another runtime handy. And passing in a current thread runtime
deadlocked. I later learned this is a difference between
Runtime::block_on and Handle::block_on. The former will drive IO and
timers; the latter will not.
But this is also more efficient because it avoids so many thread
hand-offs: both the context switches and the extra spinning that
tokio appears to do, as mentioned here:
https://github.com/scottlamb/retina/issues/5#issuecomment-871971550
This may not be the final word on the threading model. Eventually
I may not have per-stream writing threads at all. But I think it will
be easier to look at this after getting rid of the separate
`moonfire-nvr config` subcommand in favor of a web interface.
* in tests, read `.mp4` files via the `mp4` crate rather than ffmpeg.
The annoying part is that this doesn't parse edit lists; oh well.
* simplify the `Opener` interface. Formerly, it'd take either an RTSP
URL or a path to a `.mp4` file, and they'd share some code because
they both sometimes used ffmpeg. Now, they're totally different
libraries (`retina` vs `mp4`). Pull the latter out to a `testutil`
module with a different interface that exposes more of the `mp4`
stuff. Now `Opener` is just for RTSP.
* simplify the h264 module. It had a lot of logic to deal with Annex B.
Retina doesn't use this encoding.
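For context on what the removed Annex B logic had to deal with, here is a rough sketch (not the code that was deleted) of converting an Annex B byte stream, where NAL units are separated by `00 00 01` start codes, into the 4-byte length-prefixed form. The function names are made up for illustration.

```rust
/// Splits an Annex B byte stream into NAL units, without their start codes.
fn split_annex_b(data: &[u8]) -> Vec<&[u8]> {
    let mut nals = Vec::new();
    let mut start: Option<usize> = None; // start of the NAL being scanned
    let mut i = 0;
    while i < data.len() {
        if data[i..].starts_with(&[0, 0, 1]) {
            if let Some(s) = start {
                push_trimmed(&mut nals, &data[s..i]);
            }
            start = Some(i + 3);
            i += 3;
        } else {
            i += 1;
        }
    }
    if let Some(s) = start {
        push_trimmed(&mut nals, &data[s..]);
    }
    nals
}

/// Drops trailing zero bytes (a NAL unit never ends in 0x00, so these belong
/// to padding or to the leading zero of a 4-byte start code), then stores
/// the NAL unit if anything is left.
fn push_trimmed<'a>(nals: &mut Vec<&'a [u8]>, mut nal: &'a [u8]) {
    while let [rest @ .., 0] = nal {
        nal = rest;
    }
    if !nal.is_empty() {
        nals.push(nal);
    }
}

/// Re-encodes the stream with a 4-byte big-endian length before each NAL.
fn annex_b_to_length_prefixed(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len());
    for nal in split_annex_b(data) {
        out.extend_from_slice(&(nal.len() as u32).to_be_bytes());
        out.extend_from_slice(nal);
    }
    out
}
```

A source that never produces Annex B needs none of this scanning, which is why the module could shrink.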
Fixes #36
Fixes #126
After a frustrating search for a suitable channel to use for shutdown
(tokio::sync::watch::Receiver and
futures::future::Shared<tokio::sync::oneshot::Receiver> didn't look
quite right) in which I rethought my life decisions, I finally just made
my own (server/base/shutdown.rs). We can easily poll it or wait for it
in async or sync contexts. Most importantly, it's convenient; not that
it really matters here, but it's also efficient.
We now do a slightly better job of propagating a "graceful" shutdown
signal, and this channel will give us tools to improve it over time.
* Shut down even when writer or syncer operations are stuck. Fixes #117
* Not done yet: streamers should instantly shut down without waiting for
a connection attempt or frame or something. I'll probably
implement that when removing --rtsp-library=ffmpeg. The code should be
cleaner then.
* Not done yet: fix a couple places that sleep for up to a second when
they could shut down immediately. I just need to do the plumbing for
mock clocks to work.
I also implemented an immediate shutdown mode, activated by a second
signal. I think this will mitigate the streamer wait situation.
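As a rough idea of what a purpose-built shutdown channel can look like, here is a minimal synchronous sketch. It is not the contents of server/base/shutdown.rs: the type and method names are assumptions, and the async side (registering wakers so a future can await shutdown) is left out entirely.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::Duration;

struct Inner {
    shut_down: Mutex<bool>,
    condvar: Condvar,
}

/// Dropping the sender is what signals shutdown (an assumption of this sketch).
pub struct Sender(Arc<Inner>);

/// Cheap to clone; every clone observes the same shutdown flag.
#[derive(Clone)]
pub struct Receiver(Arc<Inner>);

pub fn channel() -> (Sender, Receiver) {
    let inner = Arc::new(Inner {
        shut_down: Mutex::new(false),
        condvar: Condvar::new(),
    });
    (Sender(inner.clone()), Receiver(inner))
}

impl Drop for Sender {
    fn drop(&mut self) {
        *self.0.shut_down.lock().unwrap() = true;
        self.0.condvar.notify_all();
    }
}

impl Receiver {
    /// Non-blocking poll, suitable for checking between units of work.
    pub fn is_shut_down(&self) -> bool {
        *self.0.shut_down.lock().unwrap()
    }

    /// Sleeps for up to `timeout` but wakes early on shutdown.
    /// Returns true if shutdown has been requested.
    pub fn wait_for(&self, timeout: Duration) -> bool {
        let guard = self.0.shut_down.lock().unwrap();
        let (guard, _timed_out) = self
            .0
            .condvar
            .wait_timeout_while(guard, timeout, |done| !*done)
            .unwrap();
        *guard
    }
}
```

Replacing a plain sleep with something like `wait_for` is one way the "sleep for up to a second" cases could return immediately once shutdown is requested.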
* prefix docker/nvr commands with sudo (fixes #142).
I was just going to link to the docker documentation on setting
up non-root access, but that's kind of a personal preference.
I included a `<details>` about it instead and made all the commands
work with sudo.
* take better advantage of github markdown's code block syntax
highlighting. Use "console" for shell session stuff, put the
"nvr" wrapper script in its own block with "bash".
* add some comments to nvr wrapper script where people need to
make changes and/or will be confused.
* add a `<details>` that talks about shutting down and restarting
the session around `nvr config` (see #151). Still not user-friendly
but at least it's better documented now.
Reading from the mmap()ed region in the tokio threads could cause
them to stall:
* That could affect UI serving when there were concurrent
UI requests (i.e., not just requests that needed the reads in
question anyway).
* If there's a faulty disk, it could cause the UI to totally hang.
Better to not mix disks between threads.
* Soon, I want to handle RTSP from the tokio threads (#37). Similarly,
we don't want RTSP streaming to block on operations from unrelated
disks.
I went with just one thread per disk which I think is sufficient.
But it'd be possible to do a fixed-size pool instead which might improve
latency when some pages are already cached.
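The one-thread-per-disk arrangement can be sketched roughly as follows, using only the standard library. `DiskWorker` and its methods are hypothetical names, and the job here is an arbitrary closure where the real code would perform the mmap-backed read.

```rust
use std::sync::mpsc;
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Owns one dedicated thread; all blocking work for a disk is funneled to it.
struct DiskWorker {
    tx: Option<mpsc::Sender<Job>>,
    join: Option<thread::JoinHandle<()>>,
}

impl DiskWorker {
    fn new(disk_name: &str) -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        let join = thread::Builder::new()
            .name(format!("disk-{}", disk_name))
            .spawn(move || {
                // Runs jobs in order until every sender has been dropped.
                for job in rx {
                    job();
                }
            })
            .expect("unable to spawn disk thread");
        DiskWorker { tx: Some(tx), join: Some(join) }
    }

    /// Queues `f` on the disk thread and returns a receiver for its result,
    /// so the caller's own thread never performs the blocking read itself.
    fn run<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (reply_tx, reply_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            let _ = reply_tx.send(f());
        });
        self.tx
            .as_ref()
            .expect("sender is present until drop")
            .send(job)
            .expect("disk thread has exited");
        reply_rx
    }
}

impl Drop for DiskWorker {
    fn drop(&mut self) {
        drop(self.tx.take()); // closes the queue so the thread's loop ends
        if let Some(join) = self.join.take() {
            let _ = join.join();
        }
    }
}
```

With this shape, a stalled or faulty disk only backs up its own queue. An async caller would want a reply channel it can await rather than the blocking `mpsc::Receiver` used here.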
I also dropped the memmap dependency. I had to compute the page
alignment anyway to get mremap to work, and Moonfire NVR already is
Unix-specific, so there wasn't much value from the memmap or memmap2
crates.
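The page alignment arithmetic mentioned above amounts to something like the following. This is only a sketch: the struct and function names are invented, and the page size is passed in as a parameter rather than queried from the OS.

```rust
/// Where to map, and where the requested bytes sit within the mapping.
#[derive(Debug, PartialEq, Eq)]
struct AlignedRange {
    /// File offset rounded down to a page boundary.
    map_offset: u64,
    /// Bytes to map, counted from `map_offset`, so the request is covered.
    map_len: u64,
    /// How far into the mapping the requested data begins.
    prefix: u64,
}

fn align_for_mmap(offset: u64, len: u64, page_size: u64) -> AlignedRange {
    assert!(page_size.is_power_of_two());
    let map_offset = offset & !(page_size - 1);
    let prefix = offset - map_offset;
    AlignedRange { map_offset, map_len: prefix + len, prefix }
}
```

For example, with 4096-byte pages a request for 100 bytes at offset 5000 maps from offset 4096, and the data starts 904 bytes into the mapping.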
Fixes #88
In particular, this was happening out of the box on Raspberry Pi OS Lite
20210304, as reported by ironoxidizer@gmail.com here:
https://groups.google.com/g/moonfire-nvr-users/c/2j9LvfFl2u8/m/tJcNS2WfCQAJ
* adjust main.rs to make the problem more obvious
* mention it in the troubleshooting guide
* sidestep it in the nvr docker wrapper script
also just use --network=host rather than --publish (avoiding a proxy
process). I'm using Docker to simplify the build and deployment process,
not as a security boundary, so just do the simpler thing.
As noted in mylog's 2b1085c:
Looks like both the GNU tools' --color argument and cargo's
CARGO_TERM_COLOR expect always/never rather than on/off. Match that.
Might as well understand off/no/false and on/yes/true also.
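A minimal sketch of the accepted spellings, assuming a three-state setting. The `ColorMode` name is hypothetical, and the `auto` value is an assumption borrowed from the GNU tools and cargo rather than something stated above.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColorMode {
    Auto,
    Always,
    Never,
}

impl FromStr for ColorMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            // Assumed: "auto" as in GNU --color and CARGO_TERM_COLOR.
            "auto" => Ok(ColorMode::Auto),
            // Preferred spelling first, then the tolerated aliases.
            "always" | "on" | "yes" | "true" => Ok(ColorMode::Always),
            "never" | "off" | "no" | "false" => Ok(ColorMode::Never),
            other => Err(format!("unrecognized color mode {:?}", other)),
        }
    }
}
```

Usage is then `"always".parse::<ColorMode>()`, with the aliases mapping to the same two variants.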
* add more description to the troubleshooting guide
* adjust the log format to match more recent glog
* include a config for the lnav tool, which will help colorize,
browse, and search the logs.
Next up: install an ffmpeg log callback for consistency.
Inspired by the poor error message here:
https://github.com/scottlamb/moonfire-nvr/issues/107#issuecomment-777587727
* print the friendlier Display version of the error rather than Debug.
E.g., "EROFS: Read-only filesystem" rather than "Sys(EROFS)". Do this
everywhere: on command exit, on syncer retries, and on stream
retries.
* print the most immediate problem and additional lines for each
cause.
* print the backtrace or an advertisement for RUST_BACKTRACE=1 if it's
unavailable.
* also mention RUST_BACKTRACE=1 in the troubleshooting guide.
* add context in various places, including pathnames. There are surely
many more places where it'd be helpful, but this is a start.
* allow subcommands to return failure without an Error.
In particular, "moonfire-nvr check" does its own error printing
because it wants to print all the errors it finds. Printing "see
earlier errors" with a meaningless stack trace seems like it'd just
confuse. But I also want to get rid of the misleading "Success" at
the end and 0 return to the OS.
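The "most immediate problem, then one line per cause" output can be sketched with the standard library's `Error::source` chain. The `Context` type and `format_chain` function below are illustrative stand-ins, not the error types the project uses.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper that adds a message (e.g. a pathname) to a cause.
#[derive(Debug)]
struct Context {
    msg: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.msg)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Formats the outermost error with Display, then one line per cause.
fn format_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut cur = err.source();
    while let Some(cause) = cur {
        out.push_str("\ncaused by: ");
        out.push_str(&cause.to_string());
        cur = cause.source();
    }
    out
}
```

Using `to_string()` (Display) instead of `{:?}` (Debug) at each level is what yields the readable message in place of a variant name.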
The 091217b workaround of telling ffmpeg to only request the video
stream works perfectly fine for now. I'll revisit when adding audio
support (#34).
Fixes #36
Some caveats:
* it doesn't record the peer IP yet, which makes it harder to verify
sessions are valid. This is a little annoying to do in hyper now
(see hyperium/hyper#1410). The direct peer might not be what we want
right now anyway because there's no TLS support yet (see #27). In
the meantime, the sane way to expose Moonfire NVR to the Internet is
via a proxy server, and recording the proxy's IP is not useful.
Maybe better to interpret an RFC 7239 Forwarded header (and/or
the older X-Forwarded-{For,Proto} headers).
* it doesn't ever use Secure (https-only) cookies, for a similar reason.
It's not safe to use even with a TLS proxy until this is fixed.
* there's no "moonfire-nvr config" support for inspecting/invalidating
sessions yet.
* in debug builds, logging in is crazy slow. See libpasta/libpasta#9.
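To make the Secure cookie caveat concrete, here is a sketch of how a session cookie header value could be assembled with an optional `Secure` attribute. The function name, parameters, and the exact attribute set are assumptions for illustration, not what the server currently emits.

```rust
/// Builds a Set-Cookie header value. `secure` would be true only when the
/// request is known to have arrived over https (directly or via a proxy).
fn session_cookie(name: &str, value: &str, max_age_secs: u64, secure: bool) -> String {
    let mut cookie = format!(
        "{}={}; HttpOnly; SameSite=lax; Max-Age={}; Path=/",
        name, value, max_age_secs
    );
    if secure {
        // Tells the browser to send the cookie over https only.
        cookie.push_str("; Secure");
    }
    cookie
}
```

Deciding the value of `secure` is the hard part, since it depends on knowing the original protocol, which is what a forwarded-protocol header from the proxy would supply.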
Some notes:
* I removed the JavaScript "no-use-before-define" lint, as some of
the functions form a cycle.
* Fixed #20 along the way. I needed to add support for properly
returning non-OK HTTP statuses to signal unauthorized and such.
* I removed the Access-Control-Allow-Origin header support, which was
at odds with the "SameSite=lax" in the cookie header. The "yarn
start" method for running a local proxy server accomplishes the same
thing as the Access-Control-Allow-Origin support in a more secure
manner.