For a one-hour recording, this is about 2 KiB, so a decent chunk of a
Raspberry Pi 2's L1 cache. Generating the Slices and searching/scanning
it should be a bit faster and pollute the cache less.
This is a pretty small optimization now when transferring a decent chunk
of the moov or mdat, but it's easy enough to do. It will be much more
noticeable if I end up interleaving the captions between each key frame.