27 Commits

Author SHA1 Message Date
Scott Lamb
438de38202
rework WebSocket error return protocol
This gives much better information to the UI layer, getting rid of a
whole troubleshooting guide entry. See #119 #132 #218 #219

I also restructured the code in anticipation of a new WebSocket event
stream (#40).
2023-02-15 17:26:40 -08:00
Scott Lamb
7b0a489541 rework stream threading model
Fixes #206. 307a388 switched to creating a single-threaded runtime for
each stream, then destroying prior to waiting for TEARDOWN on shutdown.
This meant that the shutdown process could panic with this error:

```
panic at '/home/slamb/git/retina/src/client/mod.rs:219:22': teardown Sender shouldn't be dropped: RecvError(())
```

Let's switch back to expecting a multithreaded runtime context.
Create one for the config subcommand, too.

Don't go all the way back to the old code with its channels, though.
That had the downside that the underlying retina::Session might outlive
the caller, so there could still be an active session when we start
the next one. I haven't seen this cause problems in practice but it
still doesn't seem right.
2022-04-13 11:39:38 -07:00
Scott Lamb
307a3884a0 drop ffmpeg support
* switch the config interface over to use Retina and make the test
  button honor rtsp_transport = udp.

* adjust the threading model of the Retina streaming code.

  Before, it spawned a background future that read from the runtime and
  wrote to a channel. Other calls read from this channel.

  After, it does work directly from within the block_on calls (no
  channels).

  The immediate motivation was that the config interface didn't have
  another runtime handy. And passing in a current thread runtime
  deadlocked. I later learned this is a difference between
  Runtime::block_on and Handle::block_on. The former will drive IO and
  timers; the latter will not.

  But this is also more efficient to avoid so many thread hand-offs.
  Both the context switches and the extra spinning that
  tokio appears to do as mentioned here:
  https://github.com/scottlamb/retina/issues/5#issuecomment-871971550

  This may not be the final word on the threading model. Eventually
  I may not have per-stream writing threads at all. But I think it will
  be easier to look at this after getting rid of the separate
  `moonfire-nvr config` subcommand in favor of a web interface.

* in tests, read `.mp4` files via the `mp4` crate rather than ffmpeg.
  The annoying part is that this doesn't parse edit lists; oh well.

* simplify the `Opener` interface. Formerly, it'd take either a RTSP
  URL or a path to a `.mp4` file, and they'd share some code because
  they both sometimes used ffmpeg. Now, they're totally different
  libraries (`retina` vs `mp4`). Pull the latter out to a `testutil`
  module with a different interface that exposes more of the `mp4`
  stuff. Now `Opener` is just for RTSP.

* simplify the h264 module. It had a lot of logic to deal with Annex B.
  Retina doesn't use this encoding.

Fixes #36
Fixes #126
2022-03-18 13:22:47 -07:00
Scott Lamb
b41a6c43da shutdown better
After a frustrating search for a suitable channel to use for shutdown
(tokio::sync::Receiver and
futures::future::Shared<tokio::sync::oneshot::Receiver> didn't look
quite right) in which I rethought my life decisions, I finally just made
my own (server/base/shutdown.rs). We can easily poll it or wait for it
in async or sync contexts. Most importantly, it's convenient; not that
it really matters here, but it's also efficient.

We now do a slightly better job of propagating a "graceful" shutdown
signal, and this channel will give us tools to improve it over time.

* Shut down even when writer or syncer operations are stuck. Fixes #117
* Not done yet: streamers should instantly shut down without waiting for
  a connection attempt or frame or something. I'll probably
  implement that when removing --rtsp-library=ffmpeg. The code should be
  cleaner then.
* Not done yet: fix a couple places that sleep for up to a second when
  they could shut down immediately. I just need to do the plumbing for
  mock clocks to work.

I also implemented an immediate shutdown mode, activated by a second
signal. I think this will mitigate the streamer wait situation.
2021-09-23 16:33:29 -07:00
Scott Lamb
3de605be6c improve some log msgs' readability 2021-08-31 08:59:33 -07:00
Scott Lamb
30cea5cfcb several documentation improvements
*   prefix docker/nvr commands with sudo (fixes #142).
    I was just going to link to the docker documentation on setting
    up non-root access, but that's kind of a personal preference.
    I included a `<details>` about it instead and made all the commands
    work with sudo.

*   take better advantage of github markdown's code block syntax
    highlighting. Use "console" for shell session stuff, put the
    "nvr" wrapper script in its own block with "bash".

*   add some comments to nvr wrapper script where people need to
    make changes and/or will be confused.

*   add a `<details>` that talks about shutting down and restarting
    the session around `nvr config` (see #151). Still not user-friendly
    but at least it's better documented now.
2021-08-23 12:44:48 -07:00
Scott Lamb
b8b5038f71 better error msg on live view when misconfigured
Improves (but doesn't fix) #119 and #120.
2021-08-13 12:02:42 -07:00
Scott Lamb
2c3a22c223 link to our own guide on disabling UAS 2021-07-13 12:49:17 -07:00
Scott Lamb
1bd05348bf troubleshooting section about incorrect timestamps 2021-07-13 11:18:17 -07:00
Scott Lamb
5be69baaa6 switch default RTSP library to retina 2021-06-28 16:38:21 -07:00
Scott Lamb
23d77693de read sample files from dedicated threads
Reading from the mmap()ed region in the tokio threads could cause
them to stall:

*   That could affect UI serving when there were concurrent
    UI requests (i.e., not just requests that needed the reads in
    question anyway).
*   If there's a faulty disk, it could cause the UI to totally hang.
    Better to not mix disks between threads.
*   Soon, I want to handle RTSP from the tokio threads (#37). Similarly,
    we don't want RTSP streaming to block on operations from unrelated
    disks.

I went with just one thread per disk which I think is sufficient.
But it'd be possible to do a fixed-size pool instead which might improve
latency when some pages are already cached.

I also dropped the memmap dependency. I had to compute the page
alignment anyway to get mremap to work, and Moonfire NVR already is
Unix-specific, so there wasn't much value from the memmap or memmap2
crates.

Fixes #88
2021-06-04 19:50:13 -07:00
Scott Lamb
98d106553a add a quick troubleshooting note about #119 2021-04-09 14:28:20 -07:00
Scott Lamb
7c0a634bed avoid clock problems on some Docker setups
In particular, this was happening out of the box on Raspberry Pi OS Lite
20210304, as reported by ironoxidizer@gmail.com here:
https://groups.google.com/g/moonfire-nvr-users/c/2j9LvfFl2u8/m/tJcNS2WfCQAJ

*   adjust main.rs to make the problem more obvious
*   mention it in the troubleshooting guide
*   sidestep it in the nvr docker wrapper script

also just use --networking=host rather than --publish (avoiding a proxy
process). I'm using Docker to simplify the build and deployment process,
not as a security boundary, so just do the simpler thing.
2021-04-08 22:21:03 -07:00
Scott Lamb
4d4d78ba64 mass markdown reformatting
Add tables of contents (using the VS Code Markdown All-In-One extension)
and reformat lists to consistently use 4-space indents. No content
changes.
2021-04-01 12:32:31 -07:00
Scott Lamb
74b13a0fbf add troubleshooting info about out of disk space
As discussed in #84. Also reorganize the troubleshooting guide a bit so
it will be easier to navigate as it grows.
2021-04-01 11:43:21 -07:00
Scott Lamb
a434b42672 live stream troubleshooting info 2021-03-31 16:36:21 -07:00
Scott Lamb
1458d668d7 tweak color env variable options
As noted in mylog's 2b1085c:
Looks like both the GNU tools' --color argument and cargo's
CARGO_TERM_COLOR expect always/never rather than on/off. Match that.
Might as well understand off/no/false and on/yes/true also.
2021-03-12 09:24:20 -08:00
Scott Lamb
9099d07dfa improve panic messages and docs (#112) 2021-03-10 08:12:49 -08:00
Scott Lamb
ea8bdef7d9 support color coding logs (#112)
ffmpeg was already doing this; now do it for native logs.
2021-03-09 22:09:30 -08:00
Scott Lamb
242368aa47 more help reading logs 2021-03-09 13:15:25 -08:00
Scott Lamb
2808d5e139 help folks read logs (#112)
*   add more description to the troubleshooting guide
*   adjust the log format to match more recent glog
*   include a config for the lnav tool, which will help colorize,
    browse, and search the logs.

Next up: install an ffmpeg log callback for consistency.
2021-03-09 00:10:32 -08:00
Scott Lamb
5acca1a253 tips on hardware failures and fs corruption 2021-02-11 16:04:40 -08:00
Scott Lamb
9a5957d5ef overhaul error messages
Inspired by the poor error message here:
https://github.com/scottlamb/moonfire-nvr/issues/107#issuecomment-777587727

*   print the friendlier Display version of the error rather than Debug.
    Eg, "EROFS: Read-only filesystem" rather than "Sys(EROFS)". Do this
    everywhere: on command exit, on syncer retries, and on stream
    retries.
*   print the most immediate problem and additional lines for each
    cause.
*   print the backtrace or an advertisement for RUST_BACKTRACE=1 if it's
    unavailable.
*   also mention RUST_BACKTRACE=1 in the troubleshooting guide.
*   add context in various places, including pathnames. There are surely
    many places more it'd be helpful, but this is a start.
*   allow subcommands to return failure without an Error.
    In particular, "moonfire-nvr check" does its own error printing
    because it wants to print all the errors it finds. Printing "see
    earlier errors" with a meaningless stack trace seems like it'd just
    confuse. But I also want to get rid of the misleading "Success" at
    the end and 0 return to the OS.
2021-02-11 10:57:09 -08:00
Scott Lamb
31801e20c3 update docs to recommend new Docker setup
and remove the old scripts, to reduce the supported ways of doing
things.
2021-01-21 16:00:38 -08:00
Scott Lamb
d7a0cb9a7c Remove mention of #36 from troubleshooting guide
The 091217b workaround of telling ffmpeg to only request the video
stream works perfectly fine for now. I'll revisit when adding audio
support (#34).

Fixes #36
2019-02-13 22:25:33 -08:00
Scott Lamb
422cd2a75e preliminary web support for auth (#26)
Some caveats:

  * it doesn't record the peer IP yet, which makes it harder to verify
    sessions are valid. This is a little annoying to do in hyper now
    (see hyperium/hyper#1410). The direct peer might not be what we want
    right now anyway because there's no TLS support yet (see #27).  In
    the meantime, the sane way to expose Moonfire NVR to the Internet is
    via a proxy server, and recording the proxy's IP is not useful.
    Maybe better to interpret a RFC 7239 Forwarded header (and/or
    the older X-Forwarded-{For,Proto} headers).

  * it doesn't ever use Secure (https-only) cookies, for a similar reason.
    It's not safe to use even with a tls proxy until this is fixed.

  * there's no "moonfire-nvr config" support for inspecting/invalidating
    sessions yet.

  * in debug builds, logging in is crazy slow. See libpasta/libpasta#9.

Some notes:

  * I removed the Javascript "no-use-before-defined" lint, as some of
    the functions form a cycle.

  * Fixed #20 along the way. I needed to add support for properly
    returning non-OK HTTP statuses to signal unauthorized and such.

  * I removed the Access-Control-Allow-Origin header support, which was
    at odds with the "SameSite=lax" in the cookie header. The "yarn
    start" method for running a local proxy server accomplishes the same
    thing as the Access-Control-Allow-Origin support in a more secure
    manner.
2018-11-27 11:08:33 -08:00
Scott Lamb
cbd8f7d3d2 Reorganize and expand documentation 2017-10-01 22:02:39 -07:00