Now there's room to add arbitrary configuration to signals and types.
Several things are no longer fixed columns/tables but instead within
the configuration types.
This fixes#178. Before, everything got translated to 5xx status;
now it produces the correct type in several cases.
Ideally I'd get rid of the untyped errors in all of web.rs; this is
a small step.
As written in the changelog: Live streams formerly worked around a
Firefox pixel aspect ratio bug by forcing all videos to 16:9, which
dramatically distorted 9:16 camera views. Playback didn't, so anamorphic
videos looked correct on Chrome but slightly stretched on Firefox. Now
both live streams and playback are fully correct on all browsers.
* API change: in update signals, allow setting a start time relative
to now. This is an accuracy improvement in the case where the client
has been retrying an initial request for a while. Kind of an obscure
corner case but easy enough to address. And use a more convenient
enum representation.
* in update signals, choose `now` before acquiring the database lock.
If lock acquisition takes a long time, this more accurately reflects
the time the caller intended.
* in general, make Time and Duration (de)serializable and use them
in json types. This makes the types more self-describing, with
better debug printing on both the server side and on the client
library (in moonfire-playground). To make this work, base has to
import serde which initially seemed like poor layering to me, but
serde seems to be imported in some pretty foundational Rust crates
for this reason. I'll go with it.
To improve reliability of live streams (#59) on Safari.
Safari was dropping the cookie from websocket update requests.
(But it worked sometimes. I don't get why.) I saw folks on the Internet
thinking this related to HttpOnly:
* https://developer.apple.com/forums/thread/104488
* https://stackoverflow.com/q/47742807/23584
but I still see this behavior without HttpOnly. SameSite=Strict vs
SameSite=Lax appears to make a difference. Try that instead.
SameSite=Strict is pointless for us anyway as noted in a new comment.
Turning off HttpOnly would be more unfortunate security-wise.
Chrome appears to time out at 60 seconds of inactivity otherwise.
I think it's better to keep the stream open, even if the camera is
broken.
The implementation looks awkward, but that might be the state of Rust
async right now.
I also enforced some invariants in the signals code, fixing a couple
bugs. The signals code is more complex than I'd like, but hopefully
is working now.
This should reduce live stream latency by two seconds when my cameras
are at their default setting (I frame interval = 2 * frame rate)!
I was under the impression that every HTML5 Media Source Extensions
media segment had to start with a Random Access Point. This used to
be true, but apparently changed quite a while ago:
https://bugs.chromium.org/p/chromium/issues/detail?id=229412
Support generating segments that don't start with a key frame, and
plumb this through the mp4 media segment generation logic. Add some
extra error checking in mp4 slice handling, as my first attempts had a
mismatch between expected and actual lengths that silently returned
corrupted .m4s files.
Also pull everything from the most recent key frame on along with the
first live segment to reduce startup latency. Live view is quite a bit
more pleasant now.
This splits the schema and playback path. The recording path still
adjusts the frame durations and always says the wall and media durations
are the same. I expect to change that in a following commit. I wouldn't
be surprised if that shakes out some bugs in this portion.
This is a quick fix to a problem that gives a confusing/poor initial
experience, as in this thread:
https://groups.google.com/g/moonfire-nvr-users/c/WB-TIW3bBZI/m/Gqh-L6I9BgAJ
I don't think it's a permanent solution. In particular, when we
implement an event stream (#40), I don't want to have a separate event
for every frame, so having the days map change that often won't work.
The client side will likely manipulate the days map then to include a
special entry for a growing recording, representing "from this time to
now".
This is useful for a combo scrub bar-based UI (#32) + live view UI (#59)
in a non-obvious way. When constructing a HTML Media Source Extensions
API SourceBuffer, the caller can specify a "mode" of either "segments"
or "sequence":
In "sequence" mode, playback assumes segments are added sequentially.
This is good enough for a live view-only UI (#59) but not for a scrub
bar UI in which you may want to seek backward to a segment you've never
seen before. You will then need to insert a segment out-of-sequence.
Imagine what happens when the user goes forward again until the end of
the segment inserted immediately before it. The user should see the
chronologically next segment or a pause for loading if it's unavailable.
The best approximation of this is to track the mapping of timestamps to
segments and insert a VTTCue with an enter/exit handler that seeks to
the right position. But seeking isn't instantaneous; the user will
likely briefly see first the segment they seeked to before. That's
janky. Additionally, the "canplaythrough" event will behave strangely.
In "segments" mode, playback respects the timestamps we set:
* The obvious choice is to use wall clock timestamps. This is fine if
they're known to be fixed and correct. They're not. The
currently-recording segment may be "unanchored", meaning its start
timestamp is not yet fixed. Older timestamps may overlap if the system
clock was stepped between runs. The latter isn't /too/ bad from a user
perspective, though it's confusing as a developer. We probably will
only end up showing the more recent recording for a given
timestamp anyway. But the former is quite annoying. It means we have
to throw away part of the SourceBuffer that we may want to seek back
(causing UI pauses when that happens) or keep our own spare copy of it
(memory bloat). I'd like to avoid the whole mess.
* Another approach is to use timestamps that are guaranteed to be in
the correct order but that may have gaps. In particular, a timestamp
of (recording_id * max_recording_duration) + time_within_recording.
But again seeking isn't instantaneous. In my experiments, there's a
visible pause between segments that drives me nuts.
* Finally, the approach that led me to this schema change. Use
timestamps that place each segment after the one before, possibly with
an intentional gap between runs (to force a wait where we have an
actual gap). This should make the browser's natural playback behavior
work properly: it never goes to an incorrect place, and it only waits
when/if we want it to. We have to maintain a mapping between its
timestamps and segment ids but that's doable.
This commit is only the schema change; the new data aren't exposed in
the API yet, much less used by a UI.
Note that stream.next_recording_id became stream.cum_recordings. I made
a slight definition change in the process: recording ids for new streams
start at 0 rather than 1. Various tests changed accordingly.
The upgrade process makes a best effort to backfill these new fields,
but of course it doesn't know the total duration or number of runs of
previously deleted rows. That's good enough.
Benefits:
* Blake3 is faster. This is most noticeable for the hashing of the
sample file data.
* we no longer need OpenSSL, which helps with shrinking the binary size
(#70). sha1 basically forced OpenSSL usage; ring deliberately doesn't
support this old algorithm, and the pure-Rust sha1 crate is painfully
slow. OpenSSL might still be a better choice than ring/rustls for TLS
but it's nice to have the option.
For the video sample entries, I decided we don't need to hash at all. I
think the id number is sufficiently stable, and it's okay---perhaps even
desirable---if an existing init segment changes for fixes like e5b83c2.
I want to start returning the pixel aspect ratio of each video sample
entry. It's silly to duplicate it for each returned recording, so
let's instead return a videoSampleEntryId and then put all the
information about each VSE once.
This change doesn't actually handle pixel aspect ratio server-side yet.
Most likely I'll require a new schema version for that, to store it as a
new column in the database. Codec-specific logic in the database layer
is awkward and I'd like to avoid it. I did a similar schema change to
add the rfc6381_codec.
I also adjusted ui-src/lib/models/Recording.js in a few ways:
* fixed a couple mismatches between its field name and the key defined
in the API. Consistency aids understanding.
* dropped all the getters in favor of just setting the fields (with
type annotations) as described here:
https://google.github.io/styleguide/jsguide.html#features-classes-fields
* where the wire format used undefined (to save space), translate it to
a more natural null or false.
* As discussed in #48, say "The Moonfire NVR Authors" at the top of
every file rather than whoever created that file. Have one AUTHORS
file listing everyone.
* Consistently call it a "security camera network video recorder" rather
than "security camera digital video recorder".
The multipart stream / hanging GET approach worked in a prototype for a
single stream, but Chrome has a per-host limit of six connections. If I
try streaming all my cameras at once, I hit that limit. I can't open all
the streams, much less additional connections to load init segments and
such. Websockets apparently has a much higher limit of 256.
Add a new schema version 5; now 4 means the directory meta may or may
not be upgraded.
Fixes#65: now it's possible to open the directory even if it lies on a
completely full disk.
My dad's "GW-GW4089IP" cameras use separate ports for the main and sub
streams:
rtsp://192.168.1.110:5050/H264?channel=0&subtype=0&unicast=true&proto=Onvif
rtsp://192.168.1.110:5049/H264?channel=0&subtype=1&unicast=true&proto=Onvif
Previously I could get one of the streams to work by including :5050 or
:5049 in the host field of the camera. But not both. Now make the
camera's host field reflect the ONVIF port (which is also non-standard
on these cameras, :85). It's not directly used yet but probably will be
sooner or later. Make each stream know its full URL.
(I also considered the names "capabilities" and "scopes", but I think
"permissions" is the most widely understood.)
This is increasingly necessary as the web API becomes more capable.
Among other things, it allows:
* non-administrator users who can view but not access camera passwords
or change any state
* workers that update signal state based on cameras' built-in motion
detection or a security system's events but don't need to view videos
* control over what can be done without authenticating
Currently session permissions are just copied from user permissions, but
you can also imagine admin sessions vs not, as a checkbox when signing
in. This would match the standard Unix workflow of using a
non-administrative session most of the time.
Relevant to my current signals work (#28) and to the addition of an
administrative API (#35, including #66).
This is a definite work in progress. In particular,
* there's no src/web.rs support yet so it can't be used,
* the code is surprisingly complex, and there's almost no tests so far.
I want to at least get complete branch coverage.
* I may still go back to time_sec rather than time_90k to save RAM and
flash.
I simplified the approach a bit from the earlier goal in design/api.md.
In particular, there's no longer the separate concept of "observation"
vs "prediction". Now the predictions are just observations that extend a
bit beyond now. They may be flushed prematurely and I'll try living with
that to avoid making things even more complex.
This is mostly untested and useless by itself, but it's a starting
point. In particular:
* there's no way to set up signals or add/remove/update events yet
except by manual changes to the database.
* if you associate a signal with a camera then remove the camera,
hitting /api/ will error out.
This is so far completely untested, for use by a new UI prototype.
It creates a new URL endpoint which sends one video/mp4 media segment
per key frame, with the dependent frames included. This means there will
be about one key frame interval of latency (typically about a second).
This seems hard to avoid, as mentioned in issue #59.