start splitting wall and media duration for #34

This splits the schema and playback path. The recording path still
adjusts the frame durations and always says the wall and media durations
are the same. I expect to change that in a following commit. I wouldn't
be surprised if that shakes out some bugs in this portion.
This commit is contained in:
Scott Lamb
2020-08-04 21:44:01 -07:00
parent 476bd86b12
commit cb97ccdfeb
12 changed files with 437 additions and 241 deletions


@@ -13,14 +13,10 @@ In the future, this is likely to be expanded:
(at least for bootstrapping web authentication)
* mobile interface
## Terminology
*signal:* a timeseries with an enum value. Signals might represent a camera's
motion detection or day/night status. They could also represent an external
input such as a burglar alarm system's zone status.
## Detailed design
*Note:* italicized terms in this document are defined in the [glossary](glossary.md).
All requests for JSON data should be sent with the header
`Accept: application/json` (exactly).
@@ -112,7 +108,7 @@ The `application/json` response will have a dict as follows:
* `config`: (only included if request parameter `cameraConfigs` is
true) a dictionary describing the configuration of the stream:
* `rtsp_url`
* `signals`: a list of all signals known to the server. Each is a dictionary
* `signals`: a list of all *signals* known to the server. Each is a dictionary
with the following properties:
* `id`: an integer identifier.
* `shortName`: a unique, human-readable description of the signal
@@ -254,13 +250,12 @@ Example response:
### `GET /api/cameras/<uuid>/<stream>/recordings`
Returns information about recordings.
Valid request parameters:
Returns information about *recordings*. Valid request parameters:
* `startTime90k` and `endTime90k` limit the data returned to only
recordings which overlap with the given half-open interval. Either or both
may be absent; they default to the beginning and end of time, respectively.
recordings with wall times overlapping with the given half-open interval.
Either or both may be absent; they default to the beginning and end of time,
respectively.
* `split90k` causes long runs of recordings to be split at the next
convenient boundary after the given duration.
* TODO(slamb): `continue` to support paging. (If data is too large, the
  server should return a `continue` key which is expected to be returned on
  following requests.)
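As a sketch of the parameters above, a caller might convert UNIX-epoch
seconds to the API's 90 kHz units when building a `/recordings` request. The
helper name and base URL are hypothetical; the parameter names are as
documented:

```python
from urllib.parse import urlencode

def recordings_url(base_url, camera_uuid, stream,
                   start_sec=None, end_sec=None, split_sec=None):
    """Build a /recordings URL; *_sec arguments are in seconds and are
    converted to the API's 90 kHz units (90,000 ticks per second)."""
    params = {}
    if start_sec is not None:
        params["startTime90k"] = int(start_sec * 90000)
    if end_sec is not None:
        params["endTime90k"] = int(end_sec * 90000)
    if split_sec is not None:
        params["split90k"] = int(split_sec * 90000)
    return (f"{base_url}/api/cameras/{camera_uuid}/{stream}/recordings?"
            f"{urlencode(params)}")
```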
@@ -291,12 +286,12 @@ arbitrary order. Each recording object has the following properties:
an increasing "open id". This field is the open id as of when these
recordings were written. This can be used to disambiguate ids referring to
uncommitted recordings.
* `startTime90k`: the start time of the given recording. Note this may be
less than the requested `startTime90k` if this recording was ongoing
at the requested time.
* `endTime90k`: the end time of the given recording. Note this may be
greater than the requested `endTime90k` if this recording was ongoing at
the requested time.
* `startTime90k`: the start time of the given recording, in the wall time
scale. Note this may be less than the requested `startTime90k` if this
recording was ongoing at the requested time.
* `endTime90k`: the end time of the given recording, in the wall time scale.
Note this may be greater than the requested `endTime90k` if this recording
was ongoing at the requested time.
* `videoSampleEntryId`: a reference to an entry in the `videoSampleEntries`
  map.
* `videoSamples`: the number of samples (aka frames) of video in this
  recording.
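Since both timestamps above are on the wall time scale and in 90 kHz units, a
recording's wall duration can be derived directly. A minimal sketch (field
names as documented; the helper itself is hypothetical):

```python
# Wall duration of a recording object returned by /recordings.
def wall_duration_sec(recording: dict) -> float:
    """Difference of the two wall-scale timestamps, converted to seconds."""
    return (recording["endTime90k"] - recording["startTime90k"]) / 90000
```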
@@ -362,18 +357,19 @@ Expected query parameters:
* `s` (one or more): a string of the form
`START_ID[-END_ID][@OPEN_ID][.[REL_START_TIME]-[REL_END_TIME]]`. This
specifies recording segments to include. The produced `.mp4` file will be a
concatenation of the segments indicated by all `s` parameters. The ids to
retrieve are as returned by the `/recordings` URL. The open id is optional
and will be enforced if present; it's recommended for disambiguation when
the requested range includes uncommitted recordings. The optional start and
end times are in 90k units and relative to the start of the first specified
id. These can be used to clip the returned segments. Note they can be used
to skip over some ids entirely; this is allowed so that the caller doesn't
need to know the start time of each interior id. If there is no key frame
at the desired relative start time, frames back to the last key frame will
be included in the returned data, and an edit list will instruct the
viewer to skip to the desired start time.
specifies *segments* to include. The produced `.mp4` file will be a
concatenation of the segments indicated by all `s` parameters. The ids to
retrieve are as returned by the `/recordings` URL. The *open id* (see
[glossary](glossary.md)) is optional and will be enforced if present; it's
recommended for disambiguation when the requested range includes uncommitted
recordings. The optional start and end times are in 90k units of wall time
and relative to the start of the first specified id. These can be used to
clip the returned segments. Note they can be used to skip over some ids
entirely; this is allowed so that the caller doesn't need to know the start
time of each interior id. If there is no key frame at the desired relative
start time, frames back to the last key frame will be included in the
returned data, and an edit list will instruct the viewer to skip to the
desired start time.
* `ts` (optional): should be set to `true` to request a subtitle track be
added with human-readable recording timestamps.
@@ -397,6 +393,11 @@ Example request URI to retrieve recording id 1, skipping its first 26
90,000ths of a second:
```
/api/cameras/fd20f7a2-9d69-4cb3-94ed-d51a20c3edfe/main/view.mp4?s=1.26
```
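The `s` grammar above can be composed mechanically. A hedged sketch (the
function name is hypothetical; the grammar is as documented):

```python
# Compose the documented `s` parameter:
# START_ID[-END_ID][@OPEN_ID][.[REL_START_TIME]-[REL_END_TIME]]
def s_param(start_id, end_id=None, open_id=None, rel_start=None, rel_end=None):
    s = str(start_id)
    if end_id is not None:
        s += f"-{end_id}"
    if open_id is not None:
        s += f"@{open_id}"
    if rel_start is not None or rel_end is not None:
        # Relative times are in 90 kHz units of wall time; either side may be
        # omitted, but the dot and dash appear together.
        s += f".{'' if rel_start is None else rel_start}-" \
             f"{'' if rel_end is None else rel_end}"
    return s
```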
Note carefully the distinction between *wall duration* and *media duration*.
It's normal for `/view.mp4` to return a media presentation with a length
slightly different from the *wall duration* of the backing recording or
portion that was requested.
TODO: error behavior on missing segment. It should be a 404, likely with an
`application/json` body describing what portion if any (still) exists.
@@ -415,20 +416,20 @@ trim undesired leading portions.
This response will include the following additional headers:
* `X-Prev-Duration`: the total duration (in 90 kHz units) of all recordings
before the first requested recording in the `s` parameter. Browser-based
callers may use this to place this at the correct position in the source
buffer via `SourceBuffer.timestampOffset`.
* `X-Prev-Media-Duration`: the total *media duration* (in 90 kHz units) of all
*recordings* before the first requested recording in the `s` parameter.
Browser-based callers may use this to place this at the correct position in
the source buffer via `SourceBuffer.timestampOffset`.
* `X-Runs`: the cumulative number of "runs" of recordings. If this recording
starts a new run, it is included in the count. Browser-based callers may
use this to force gaps in the source buffer timeline by adjusting the
timestamp offset if desired.
* `X-Leading-Duration`: if present, the total duration (in 90 kHz units) of
additional leading video included before the caller's first requested
timestamp. This happens when the caller's requested timestamp does not
fall exactly on a key frame. Media segments can't include edit lists, so
unlike with the `/api/.../view.mp4` endpoint the caller is responsible for
trimming this portion. Browser-based callers may use
* `X-Leading-Media-Duration`: if present, the total duration (in 90 kHz
units) of additional leading video included before the caller's first
requested timestamp. This happens when the caller's requested timestamp
does not fall exactly on a key frame. Media segments can't include edit
lists, so unlike with the `/api/.../view.mp4` endpoint the caller is
responsible for trimming this portion. Browser-based callers may use
`SourceBuffer.appendWindowStart`.
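Browser-based callers typically feed these headers into Media Source
Extensions properties, which are in seconds rather than 90 kHz units. A
sketch of the conversion (header names as documented above; the function and
return shape are hypothetical):

```python
# Convert the 90 kHz header values to seconds-based MSE properties.
def source_buffer_offsets(headers: dict) -> dict:
    offsets = {"timestampOffset": int(headers["X-Prev-Media-Duration"]) / 90000}
    leading = headers.get("X-Leading-Media-Duration")
    if leading is not None:
        # The leading video occupies [timestampOffset, timestampOffset +
        # leading); setting appendWindowStart past it trims those frames.
        offsets["appendWindowStart"] = (offsets["timestampOffset"]
                                        + int(leading) / 90000)
    return offsets
```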
Expected query parameters:
@@ -448,8 +449,12 @@ this fundamental reason Moonfire NVR makes no effort to make multiple-segment
* There's currently no way to generate an initialization segment for more
than one video sample entry, so a `.m4s` that uses more than one video
sample entry can't be used.
* The `X-Prev-Duration` and `X-Leading-Duration` headers only describe the
first segment.
* The `X-Prev-Media-Duration` and `X-Leading-Media-Duration` headers only
  describe the first segment.
Timestamp tracks (see the `ts` parameter to `.mp4` URIs) aren't supported
today. Most likely browser clients will implement timestamp subtitles via
WebVTT API calls anyway.
### `GET /api/cameras/<uuid>/<stream>/view.m4s.txt`

design/glossary.md Normal file

@@ -0,0 +1,66 @@
# Moonfire NVR Glossary
*media duration:* the total duration of the actual samples in a recording. These
durations are based on the camera's clock. Camera clocks can be quite
inaccurate, so this may not match the *wall duration*. See [time.md](time.md)
for details.
*open id:* a sequence number representing a time the database was opened in
write mode. One reason for using open ids is to disambiguate unflushed
recordings. Recordings' ids are assigned immediately, without any kind of
database transaction or reservation. Thus if a recording is never flushed
successfully, a following *open* may assign the same id to a new recording.
The open id disambiguates this and should be used whenever referring to a
recording that may be unflushed.
*recording:* the video from a (typically 1-minute) portion of an RTSP session.
RTSP sessions are divided into recordings as a detail of the
storage schema. See [schema.md](schema.md) for details. This concept is exposed
to the frontend code through the API; see [api.md](api.md). It's not exposed in
the user interface; videos are reconstructed from segments automatically.
*run:* all the recordings from a single RTSP session. These are all from the
same *stream* and could be reassembled into a single video with no gaps. If the
camera is lost and re-established, one run ends and another starts.
*sample:* data associated with a single timestamp within a recording, e.g. a video
frame or a set of audio samples.
*sample file:* a file on disk that holds all the samples from a single recording.
*sample file directory:* a directory in the local filesystem that holds all
sample files for one or more streams. Typically there is one directory per disk.
*segment:* part or all of a recording. An API request might ask for a video of
recordings 1–4 starting 80 seconds in. If each recording is exactly 60 seconds,
this would correspond to three segments: recording 2 from 20 seconds in to
the end, all of recording 3, and all of recording 4. See [api.md](api.md).
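The arithmetic in this example can be sketched as follows. This assumes
fixed-length recordings; the function and the `(id, start_sec, end_sec)`
tuple representation are hypothetical:

```python
# Split a request spanning several recordings into segments, skipping
# `skip_sec` seconds from the start of the first recording.
def segments(first_id, last_id, skip_sec, rec_len_sec=60):
    segs = []
    for rid in range(first_id, last_id + 1):
        if skip_sec >= rec_len_sec:
            skip_sec -= rec_len_sec  # this id is skipped entirely
            continue
        segs.append((rid, skip_sec, rec_len_sec))
        skip_sec = 0
    return segs
```

For the example above, `segments(1, 4, 80)` skips all of recording 1 and
yields three segments starting 20 seconds into recording 2.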
*session:* a set of authenticated Moonfire NVR requests defined by the use of a
given credential (`s` cookie). Each user may have many credentials and thus
many sessions. Note that in Moonfire NVR the term "session" by itself has
nothing to do with RTSP sessions; those more closely match a *run*.
*signal:* a timeseries with an enum value. Signals might represent a camera's
motion detection or day/night status. They could also represent an external
input such as a burglar alarm system's zone status. See [api.md](api.md).
Note signals are still under development and not yet exposed in Moonfire NVR's
UI. See [#28](https://github.com/scottlamb/moonfire-nvr/issues/28) for more
information.
*stream:* the "main" or "sub" stream from a given camera. Moonfire NVR expects
cameras to support configuring and simultaneously viewing two streams encoded from
the same underlying video and audio source. The difference between the two is
that the "main" stream's video is typically higher quality in terms of frame
rate, resolution, and bitrate. Likewise it may have higher quality audio.
A stream corresponds to an ONVIF "media profile". Each stream has a distinct
RTSP URL that yields a different RTSP "presentation".
*track:* one of the video, audio, or subtitles associated with a single
*stream*. This is consistent with the definition in ISO/IEC 14496-12 section
3.1.19. Note that RTSP RFC 2326 uses the word "stream" in the same way
Moonfire NVR uses the word "track".
*wall duration:* the total duration of a recording for the purpose of matching
with the NVR's wall clock time. This may not match the same recording's media
duration. See [time.md](time.md) for details.


@@ -1,6 +1,10 @@
# Moonfire NVR Time Handling
Status: **current**
Status: **in flux**. The approach below works well for video, but audio frames'
durations can't be adjusted as easily. As part of implementing audio support,
the implementation is changing to instead decouple "wall time" and "media time",
as described in
[this comment](https://github.com/scottlamb/moonfire-nvr/issues/34#issuecomment-651548468).
> A man with a watch knows what time it is. A man with two watches is never
> sure.