* in markdown files, use code fences rather than indented blocks.
This is harder to screw up (one of them was off by a space so didn't
render properly) and allows me to add info strings.
* uniformly use "useradd" to create the user and group in all three
places (install-manual.md, script-functions.sh, Dockerfile) rather
than addgroup + adduser. Create a full home dir, which I suspect was
the problem in #67. Don't allow customizing group name; it's always
the same as the user.
* install the sqlite3 package so that the "moonfire-nvr sql" command
works properly.
* remove "setup_db" function, which was out of place. Since the
creation of the "moonfire-nvr init" command, this has to happen
after installation of the binary. install.md gives instructions on
this part anyway so remove it from the script.
* give a proper command to create the db dir. It was creating it
within the current directory, not within /var/lib/moonfire-nvr.
Don't bother creating sample directory; "moonfire-nvr config"
will do this.
* when setting owners on a newly created directory, use a single
"install -d" command rather than "mkdir" + "chown".
* address confusion about whether sample file dirs need to be
precreated. (Only when Moonfire NVR doesn't have write permissions
on the parent.)
* always just install the packaged version of ffmpeg rather than
building our own. This has been usable since Debian/Raspbian 9
Stretch; Debian/Raspbian 10 Buster is out now so there's no excuse
for still running Debian/Raspbian 8 Jessie.
* don't chown the UI directory; it can be owned by root as with
the binary.
* in scripts/install.sh, don't enable/start the service yet. It hasn't
been configured.
travis-ci pointed out that the dependency bump broke 1.31:
Compiling docopt v1.1.0
error[E0658]: imports can only refer to extern crate names passed with `--extern` on stable channel (see issue #53130)
--> /home/travis/.cargo/registry/src/github.com-1ecc6299db9ec823/docopt-1.1.0/src/parse.rs:48:5
|
48 | use regex;
| ^^^^^
|
Looks like uniform_paths was stabilized in 1.32, and I verified locally that
that version builds.
Looks like a bug got introduced with the great UI rewrite: when you added
a (start or end) time constraint, then removed one, the change wasn't
reflected. Within CalendarTSRange, it used null to mean "keep the
existing value", and || to check for null. This meant empty strings
turned into the existing value, instead of no constraint as they should.
This was unnecessarily clever; stop doing that.
Also keep the console logging in the deployed config; it's harmless and
eases debugging.
The 091217b workaround of telling ffmpeg to only request the video
stream works perfectly fine for now. I'll revisit when adding audio
support (#34).
Fixes #36
My installation recently somehow ended up with a recording with a
duration of 503793844 90,000ths of a second, way over the maximum of 5
minutes. (Looks like the machine was pretty unresponsive at the time
and/or having network problems.)
When this happens, the system really spirals. Every flush afterward (12
per minute with my installation) fails with a CHECK constraint failure
on the recording table. It never gives up on that recording. /var/log
fills pretty quickly as this failure is extremely verbose (a stack
trace, and a line for each byte of video_index). Eventually the sample
file dirs fill up too as it continues writing video samples while GC is
stuck. The video samples are useless anyway; given that they're not
referenced in the database, they'll be deleted on next startup.
This ensures the offending recording is never added to the database, so
we don't get the same persistent problem. Instead, writing to the
recording will fail. The stream will drop and be retried. If the
underlying condition that caused a too-long recording (many
non-key-frames, or the camera returning a crazy duration, or the
monotonic clock jumping forward extremely, or something) has gone away,
the system should recover.
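
A minimal sketch of the kind of guard described above, assuming the check
happens as each sample's duration is added to the recording's running
total. The names (add_sample_duration, DurationTooLong) are hypothetical
and not taken from the actual writer code; only the 90 kHz time base and
the 5-minute cap come from the text.

```rust
/// Durations are in 90 kHz units, as in the text above.
const TIME_UNITS_PER_SEC: i64 = 90_000;

/// The 5-minute cap mentioned above, expressed in 90 kHz units.
const MAX_RECORDING_DURATION: i64 = 5 * 60 * TIME_UNITS_PER_SEC;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct DurationTooLong {
    duration_90k: i64,
}

/// Adds `sample_duration_90k` to the running total, or fails if the
/// recording would exceed the cap. On failure the caller would drop the
/// stream and retry rather than hand the recording to the database.
fn add_sample_duration(
    total_90k: i64,
    sample_duration_90k: i64,
) -> Result<i64, DurationTooLong> {
    let new_total = total_90k.saturating_add(sample_duration_90k);
    if new_total > MAX_RECORDING_DURATION {
        return Err(DurationTooLong { duration_90k: new_total });
    }
    Ok(new_total)
}
```

With these assumptions, the 503793844-unit duration from the incident is
rejected up front, while a duration of exactly 5 minutes is still accepted.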
This is so far completely untested, for use by a new UI prototype.
It creates a new URL endpoint which sends one video/mp4 media segment
per key frame, with the dependent frames included. This means there will
be about one key frame interval of latency (typically about a second).
This seems hard to avoid, as mentioned in issue #59.
Use version 1 of the mvhd, tkhd, and mdhd boxes to support 64-bit
durations. 2^32 units / 90,000 units/sec / 60 sec/min / 60 min/hr ~=
13.25 hrs.
Compatibility: looks like Chrome, Firefox, VLC, and ffmpeg all support
version 1 with no problem.
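
The arithmetic above can be checked with a short sketch. The function
names are hypothetical; the only facts relied on are the 90 kHz time base
and that version 0 of these boxes stores a 32-bit duration while version 1
stores a 64-bit one.

```rust
const TIME_UNITS_PER_SEC: u64 = 90_000;

/// True if `duration_90k` fits in the 32-bit duration field of a
/// version 0 mvhd/tkhd/mdhd box.
fn fits_in_version_0(duration_90k: u64) -> bool {
    duration_90k <= u64::from(u32::MAX)
}

/// Number of hours at which a 32-bit duration field overflows:
/// 2^32 units / 90,000 units/sec / 60 sec/min / 60 min/hr.
fn version_0_limit_hours() -> f64 {
    (1u64 << 32) as f64 / TIME_UNITS_PER_SEC as f64 / 60.0 / 60.0
}
```

version_0_limit_hours() comes out at a little over 13.25, matching the
figure above; anything longer needs the 64-bit fields.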
I went with the third idea in 1ce52e3: have the tests run each iteration
of the syncer explicitly. These are messy tests that know tons of
internal details, but I think they're less confusing and racy than if I
had the syncer running in a separate thread.
Now each syncer has a binary heap of the times it plans to do a flush.
When one of those times arrives, it rechecks if there's something to do.
Seems more straightforward than rechecking each stream's first
uncommitted recording, especially with the logic to retry failed flushes
every minute.
Also improved the info! log for each flush to see the actual recordings
being flushed for better debuggability.
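
A rough sketch of the heap-of-planned-flushes idea, not the real syncer
code. FlushSchedule and its methods are hypothetical names, and times are
plain seconds on some monotonic clock for simplicity. std's BinaryHeap is
a max-heap, so entries are wrapped in Reverse to get the earliest time
first.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for the syncer's list of planned flush times.
struct FlushSchedule {
    planned: BinaryHeap<Reverse<u64>>,
}

impl FlushSchedule {
    fn new() -> Self {
        FlushSchedule { planned: BinaryHeap::new() }
    }

    /// Records that a flush should be considered at time `when`.
    fn plan(&mut self, when: u64) {
        self.planned.push(Reverse(when));
    }

    /// Earliest planned time, if any; the caller would wait until then.
    fn next(&self) -> Option<u64> {
        self.planned.peek().map(|r| r.0)
    }

    /// Removes every planned time that has arrived by `now` and returns
    /// how many were removed. A nonzero result means "recheck whether
    /// there's something to flush".
    fn pop_due(&mut self, now: u64) -> usize {
        let mut due = 0;
        while let Some(&Reverse(t)) = self.planned.peek() {
            if t > now {
                break;
            }
            self.planned.pop();
            due += 1;
        }
        due
    }
}
```

Retrying a failed flush a minute later would then just be another
plan(now + 60) call under this model.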
No new tests right now. :-( They're tricky to write. One problem is that
it's hard to get the timing right: a different flush has to happen
after Syncer::save's database operations and before Syncer::run calls
SimulatedClocks::recv_timeout with an empty channel[*], advancing the
time. I've thought of a few ways of doing this:
* adding a new SyncerCommand to run something, but it's messy (have
to add it from the mock of one of the actions done by the save),
and Box<dyn FnOnce() + 'static> not working (see
rust-lang/rust#28796) makes it especially annoying.
* replacing SimulatedClocks with something more like MockClocks.
Lots of boilerplate. Maybe I need to find a good general-purpose
Rust mock library. (mockers sounds good but I want something that
works on stable Rust.)
* bypassing the Syncer::run loop, instead manually running iterations
from the test.
Maybe the last way is the best for now. I'm likely to try it soon.
[*] actually, it's calling Receiver::recv_timeout directly;
Clocks::recv_timeout is dead code now? oops.
This no longer requires installing ffmpeg manually, so there should be
significantly less data to cache (faster runs). The build step itself
should also be faster when the cache is unavailable/stale.
Also sneak in a change from "pkg-config" to "pkgconf" package in the
scripts and travis CI. They didn't match the manual instructions; make
them all consistent. They both seem to work fine, but I gather pkgconf
is the newer thing. Its roadmap is here and notes that distros are
moving toward it.
https://github.com/pkgconf/pkgconf/wiki/Roadmap
Fixes #46. If there are no video_sample_entries, it returns
InvalidArgument, which gets mapped to an HTTP 400. Various other failures
turn into non-500s as well.
There are many places that can & should be using typed errors, but it's
a start.
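
For illustration, a typed error kind mapped to an HTTP status could look
roughly like the sketch below. Only the InvalidArgument-to-400 mapping is
stated above; the other variants, their status codes, and the http_status
name are assumptions made for this example.

```rust
/// Hypothetical subset of error kinds, loosely modeled on gRPC status
/// codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorKind {
    InvalidArgument,
    NotFound,
    Unauthenticated,
    Internal,
}

/// Maps an error kind to an HTTP status code. Anything the caller can't
/// fix is reported as a 500.
fn http_status(kind: ErrorKind) -> u16 {
    match kind {
        ErrorKind::InvalidArgument => 400,
        ErrorKind::Unauthenticated => 401,
        ErrorKind::NotFound => 404,
        ErrorKind::Internal => 500,
    }
}
```

The match is exhaustive, so adding a variant later forces a decision about
its status code at compile time.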
* remove intermediate bool from adjust_day.
* rewrite LockedDatabase::list_aggregate_recordings.
I started by collapsing the flush into the first part of the if, in a
similar way to adjust_day. But then I refactored more and ended up
with a structure that probably would have been allowed with the old
lexical borrow checker. I think it's more readable, and it does 1
btree operation per row where before it did 2 or 3.
This is mostly just "cargo fix --edition" + Cargo.toml changes.
There's one fix for upgrading to NLL in db/writer.rs:
Writer::previously_opened wouldn't build with NLL because of a
double-borrow the previous borrow checker somehow didn't catch.
Restructure to avoid it.
I'll put elective NLL changes in a following commit.
Before, this would panic from the reactor thread. After, it returns an
internal server error. Still not ideal, but better.
To return "bad request" as it should, mp4::FileBuilder::build() should
return a new error type that distinguishes "invalid argument" from
"internal" and the like. I'm thinking of using an ErrorKind enum
throughout the program that's similar to grpc::StatusCode.
Apparently with docopt, --require-auth=false doesn't work, so booleans
with a default value of true can't be turned off. Toggle the default to
false to deal with this, for now. I'd prefer the default be true, but
I also would prefer to not use a negative --no-require-auth or
--allow-unauthenticated flag. I think I'll switch from docopt to clap
in the near future; it seems to be what the cool kids use.
The guide is not as quick to follow and amateur-friendly as I'd like. A
few things that might improve matters:
* complete #27 (built-in https+letsencrypt), so that when not sharing
the port, users don't need to use nginx or certbot.
* more ubiquitous IPv6 (out of my control but should happen over
time) to reduce need to share the port
* embed a dynamic DNS client
* support UPnP Internet Gateway Device Control Protocol (if common
routers have this enabled? probably not for security reasons.)
It's progress, though. Enough that I think I'll merge the auth branch
into master shortly.
I initially chose SameSite=Lax because I thought if a user followed a
link to the landing page, the landing page's ajax requests wouldn't send
the cookie. But I just did an experiment, and that's not true. Only the
initial page load (of a .html file) lacks the cookie. All of its
resources and ajax requests send the cookie. I'm not sure about
document.cookie accesses, but my cookie is HttpOnly anyway, so it's
irrelevant. So no reason to be lax.
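
As a sketch of the resulting header value, assuming the change here is a
switch to SameSite=Strict: session_cookie and its parameters are
hypothetical, and real code would also need to validate or encode the
name and value.

```rust
/// Builds a Set-Cookie header value for a session cookie. HttpOnly keeps
/// it away from document.cookie; SameSite=Strict keeps it off
/// cross-site requests.
fn session_cookie(name: &str, value: &str) -> String {
    format!("{}={}; HttpOnly; SameSite=Strict", name, value)
}
```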
newTimeFormat didn't cope well with newTimeZone not having been called.
Restore the prior behavior of acting as if newTimeZone(null) had been
called, which was apparently good enough.
I just ran a "cargo test" on this after a round of tweaks, not
"cargo test --all", so I missed compile errors in the db crate,
and a Javascript lint config error. travis-ci caught these.
Some caveats:
* it doesn't record the peer IP yet, which makes it harder to verify
sessions are valid. This is a little annoying to do in hyper now
(see hyperium/hyper#1410). The direct peer might not be what we want
right now anyway because there's no TLS support yet (see #27). In
the meantime, the sane way to expose Moonfire NVR to the Internet is
via a proxy server, and recording the proxy's IP is not useful.
Maybe better to interpret an RFC 7239 Forwarded header (and/or
the older X-Forwarded-{For,Proto} headers).
* it doesn't ever use Secure (https-only) cookies, for a similar reason.
It's not safe to use even with a tls proxy until this is fixed.
* there's no "moonfire-nvr config" support for inspecting/invalidating
sessions yet.
* in debug builds, logging in is crazy slow. See libpasta/libpasta#9.
Some notes:
* I removed the Javascript "no-use-before-defined" lint, as some of
the functions form a cycle.
* Fixed #20 along the way. I needed to add support for properly
returning non-OK HTTP statuses to signal unauthorized and such.
* I removed the Access-Control-Allow-Origin header support, which was
at odds with the "SameSite=lax" in the cookie header. The "yarn
start" method for running a local proxy server accomplishes the same
thing as the Access-Control-Allow-Origin support in a more secure
manner.
Fixes #62
* added travis config for latest node as well as 8.
* ran "yarn upgrade -P webpack-dev-server", which caused the upath
dependency to be upgraded. I arrived at this by inspecting yarn.lock
for the things depending on upath, along with some trial and error.
("yarn upgrade -P chokidar" was less successful.)