This is almost certain to have performance problems with large databases,
but it's a useful starting point.
No tests yet. It shouldn't be too hard to add some for moonfire-db.h, but
I'm impatient to fake up enough data to check on the performance and see
what needs to change there first.
This wraps libevent's evhttp_parse_query_str and friends. It's easier to use
than the raw libevent stuff because it handles initialization (formerly not
done properly in profiler.cc) and cleans up with RAII.
I wrote the old interface before playing much with SQLite. Now that I've
played around with it a bit, I found many ways to make the interface more
pleasant and fool-proof:
* it opens the database in a mode that honors foreign keys and
returns extended result codes.
* it forces locking to avoid SQLITE_BUSY and
sqlite3_{changes,last_insert_rowid} race conditions.
* it supports named bind parameters.
* it defers some errors until Step() to reduce caller verbosity.
* it automatically handles calling reset, which was quite easy to forget.
* it remembers the Step() return value, which makes the row loop ever so
slightly more pleasant.
* it tracks transaction status.
This isn't as much of a speed-up as you might imagine; most of the large HTTP
content was mmap()ed files, which are relatively efficient. The big improvement
here is that it's now possible to serve large files (4 GiB and up) on 32-bit
machines. This actually works: I was just able to browse a 25-hour, 37 GiB
.mp4 file on my Raspberry Pi 2 Model B. It takes about 400 ms to start serving
each request, which isn't exactly zippy but might be forgivable for such a
large file. I still intend for the common request from the web interface to be
for much smaller fragmented .mp4 files.
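
A minimal sketch of the offset arithmetic that serving such a file in pieces
depends on. It assumes the file is mapped one bounded window at a time; the
names MapWindow and WindowFor, and the max_chunk parameter, are hypothetical
and not taken from the actual code. File offsets stay 64-bit while only the
window length has to fit in a 32-bit address space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one mmap() call covering part of a large file.
struct MapWindow {
  int64_t map_offset;  // page-aligned file offset to pass to mmap()
  size_t map_len;      // number of bytes to map
  size_t skip;         // bytes at the start of the mapping that precede `pos`
};

// Returns the window to map for the next chunk of the byte range [pos, end).
// Positions are 64-bit, so ranges beyond 4 GiB work even where size_t is
// 32 bits; at most max_chunk payload bytes are mapped at once.
inline MapWindow WindowFor(int64_t pos, int64_t end, size_t max_chunk,
                           int64_t page_size) {
  assert(0 <= pos && pos < end && max_chunk > 0 && page_size > 0);
  const int64_t map_offset = pos - pos % page_size;  // round down to a page
  const size_t skip = static_cast<size_t>(pos - map_offset);
  const size_t payload = static_cast<size_t>(
      std::min<int64_t>(end - pos, static_cast<int64_t>(max_chunk)));
  return MapWindow{map_offset, skip + payload, skip};
}
```

Passing map_offset to mmap() on a 32-bit machine additionally needs a 64-bit
off_t (for example by building with _FILE_OFFSET_BITS=64).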
Speed could be improved later through caching. Right now my test code is
creating a fresh VirtualFile from a database query on each request, even
though it hasn't changed. The tricky part will be doing cache invalidation
cleanly if it does change---new recordings are added to the requested time
range, recordings are deleted, or existing recordings' timestamps are changed.
The downside to the approach here is that it requires libevent 2.1 for
evhttp_send_reply_chunk_with_cb. Unfortunately, Ubuntu 15.10 and Debian Jessie
still bundle libevent 2.0. There are a few possible improvements here:
1. fall back to assuming chunks are added immediately, so that people with
libevent 2.0 get the old bad behavior and people with libevent 2.1 get the
better behavior. This is kind of lame, though; it's easy to go through
the whole address space pretty fast, particularly when the browsers send
out requests so quickly that there may be some unintentional concurrency.
2. alter the FileSlice interface to return a pointer/destructor rather than
add something to the evbuffer. HttpServe would then add each chunk via
evbuffer_add_reference, and it'd supply a cleanupfn that (in addition to
calling the FileSlice-supplied destructor) notes that this chunk has been
fully sent. For all the currently-used FileSlices, this shouldn't be too
hard, and there are a few other reasons it might be beneficial:
* RealFileSlice could call madvise() to control the OS buffering
* RealFileSlice could track when file descriptors are still open, in which
case FileManager's unlink() calls don't actually free up space
* It feels dirty to expose libevent stuff through the otherwise-nice
FileSlice interface.
3. support building libevent 2.1 statically in-tree if the OS-supplied
libevent is unsuitable.
I'm tempted to go with #2, but probably not right now. More urgent to commit
support for writing the new format and the wrapper bits for viewing it.
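
A rough sketch of what option 2 might look like, with libevent left out so it
stands alone. Chunk, SliceSketch, and TrackCompletion are hypothetical names,
not the current FileSlice interface. The idea is that the slice hands back a
pointer, a length, and a destructor, and the HTTP layer wraps that destructor
so it also learns when the chunk has been fully sent.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical chunk handed from a slice to the HTTP layer.
struct Chunk {
  const void *data = nullptr;
  size_t len = 0;
  std::function<void()> release;  // slice-supplied destructor, e.g. munmap()
};

// Hypothetical replacement for an interface that appends to an evbuffer.
class SliceSketch {
 public:
  virtual ~SliceSketch() {}
  virtual Chunk GetChunk() = 0;
};

// Wraps the slice's destructor so that releasing the chunk also notifies the
// server. With libevent, the combined release would be invoked from the
// cleanupfn passed to evbuffer_add_reference.
inline Chunk TrackCompletion(Chunk c, std::function<void()> on_sent) {
  assert(on_sent);
  std::function<void()> inner = std::move(c.release);
  c.release = [inner, on_sent] {
    if (inner) inner();  // free the slice's resources first
    on_sent();           // then note that this chunk is done
  };
  return c;
}
```

This keeps the slice free of libevent types; only the code calling
TrackCompletion needs to know how the bytes are actually sent.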
This avoids iteration through the video index for the "interior" recordings of
a virtual file. This takes generating the size of a ~8-hour / 15 fps file from
about 60 ms to about 10 ms. I expect better savings on a Raspberry Pi 2, for
longer records, and for higher frame rates. The total time here can be
significant; on one ~day-long recording on the Pi, it was several seconds.
I'm optimistic this will help with that.
It'd also be possible to optimize DecodeVar32 (perhaps by unrolling the loop)
but better to remove a call than to optimize one.
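
For context, a sketch of the kind of loop in question. It assumes DecodeVar32
reads a little-endian base-128 varint (the encoding protobuf uses); the real
definition isn't shown here, so DecodeVar32Sketch and its signature are
illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes one base-128 varint starting at data[*pos], advancing *pos.
// Returns false if the input is truncated or runs past five bytes; *pos is
// unspecified on failure.
inline bool DecodeVar32Sketch(const uint8_t *data, size_t len, size_t *pos,
                              uint32_t *out) {
  assert(data != nullptr && pos != nullptr && out != nullptr);
  uint32_t result = 0;
  for (int shift = 0; shift <= 28; shift += 7) {
    if (*pos >= len) return false;  // ran out of input mid-value
    const uint8_t b = data[(*pos)++];
    result |= static_cast<uint32_t>(b & 0x7f) << shift;  // low 7 bits
    if ((b & 0x80) == 0) {  // high bit clear: last byte of this value
      *out = result;
      return true;
    }
  }
  return false;  // continuation bit still set after five bytes
}
```

Every sample in the index costs at least one pass through this loop, which is
why skipping the index entirely beats tuning the decoder.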
To add the fast path, we need a new field "video_sync_samples" in the
recording table to calculate the length of the stss table. Storage cost should
be minimal; I think typically two bytes in SQLite's record format (serial type
1, value < 128), described here: <https://www.sqlite.org/fileformat2.html>.
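
A sketch of the arithmetic the new field enables, assuming every listed
recording is included whole (as the "interior" ones are). RecordingSummary
and StssBoxSize are hypothetical names; only video_sync_samples comes from
the schema. The stss layout is the ISO base media one: an 8-byte box header,
4 bytes of version/flags, a 4-byte entry count, then 4 bytes per sync sample.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-recording summary as read from the recording table.
struct RecordingSummary {
  int32_t video_sync_samples;  // number of key frames in the recording
};

// Returns the stss box size in bytes without touching any video index.
inline int64_t StssBoxSize(const std::vector<RecordingSummary> &recordings) {
  int64_t entries = 0;
  for (const RecordingSummary &r : recordings) {
    assert(r.video_sync_samples >= 0);
    entries += r.video_sync_samples;
  }
  return 16 + 4 * entries;  // fixed header fields plus one u32 per entry
}
```

Recordings that are only partly covered by the requested time range would
still need their index scanned to count the sync samples in range.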
* Fix the mdat box size, which was not properly including the length of the
header itself. (The "mp4file" tool nicely diagnosed this corruption.)
* Fix the stsc box. The first number of each entry is meant to be a chunk
index, not a sample index. This was causing strange behavior in basically
any video player for multi-recording videos.
* Populate etag and last-modified so that Range: requests can work properly.
The etag must be changed every time the generated file format changes.
There's a serial number constant for this purpose and a test meant to help
catch such problems.
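
The first two fixes can be sketched as follows. The helper names are
hypothetical. The stsc part assumes each recording is written as exactly one
chunk and that there is a single sample description, neither of which is
stated above. The header part covers only the 32-bit size form, not the
64-bit "largesize" a big mdat needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Appends v as a big-endian 32-bit integer.
inline void AppendU32(std::string *out, uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8)
    out->push_back(static_cast<char>((v >> shift) & 0xff));
}

// Appends an 8-byte box header. The size field counts the header itself,
// which is what the mdat box was getting wrong.
inline void AppendBoxHeader(std::string *out, const char (&type)[5],
                            uint64_t payload_len) {
  assert(payload_len <= UINT32_MAX - 8);  // larger boxes need largesize
  AppendU32(out, static_cast<uint32_t>(payload_len + 8));
  out->append(type, 4);
}

struct StscEntry {
  uint32_t first_chunk;  // 1-based chunk index, not a sample index
  uint32_t samples_per_chunk;
  uint32_t sample_description_index;
};

// Builds one stsc entry per recording, assuming one chunk per recording.
inline std::vector<StscEntry> StscEntries(
    const std::vector<uint32_t> &samples_per_recording) {
  std::vector<StscEntry> entries;
  for (size_t i = 0; i < samples_per_recording.size(); ++i) {
    entries.push_back(
        StscEntry{static_cast<uint32_t>(i + 1), samples_per_recording[i], 1});
  }
  return entries;
}
```

With three recordings of 10, 20, and 30 samples, the entries start at chunks
1, 2, and 3; using sample indices instead would have given 1, 11, and 31.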
This is still pretty rough. For example, there's no test coverage of virtual
files based on multiple recordings. The etag and last-modified code are stubs.
And various other conditions aren't tested at all. But it does appear to work
in a test that does a round-trip from a .mp4 file, so it should be a decent
starting point.
This code isn't pretty exactly---particularly the hardcoded lengths---but it
does work. I'll have a different mechanism for calculating the length and
nesting structure for the more dynamic parts of the moov atom. This way is
convenient when generating a single string of mostly static data.
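
One possible shape for such a mechanism, purely as an illustration and not
necessarily what will be used: a scoped helper that writes a placeholder
length when a box is opened and patches in the real length when the scope
ends, so nesting follows the C++ block structure. ScopedBox and BuildExample
are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Writes a box header on construction and fills in its length on destruction.
class ScopedBox {
 public:
  ScopedBox(std::string *out, const char (&type)[5])
      : out_(out), start_(out->size()) {
    out_->append(4, '\0');  // length placeholder, patched in the destructor
    out_->append(type, 4);
  }
  ~ScopedBox() {
    const uint64_t len = out_->size() - start_;  // includes the 8-byte header
    assert(len <= UINT32_MAX);
    for (int i = 0; i < 4; ++i)
      (*out_)[start_ + i] = static_cast<char>((len >> (24 - 8 * i)) & 0xff);
  }
  ScopedBox(const ScopedBox &) = delete;
  ScopedBox &operator=(const ScopedBox &) = delete;

 private:
  std::string *const out_;
  const size_t start_;
};

// Builds a moov box containing a trak box with 8 bytes of stand-in payload.
inline std::string BuildExample() {
  std::string out;
  {
    ScopedBox moov(&out, "moov");
    {
      ScopedBox trak(&out, "trak");
      out.append("payload!");
    }  // trak length (16) patched here
  }    // moov length (24) patched here
  return out;
}
```

The hardcoded-length style stays simpler for fixed content; something like
this only pays off where the payload size isn't known up front.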