mirror of
https://github.com/scottlamb/moonfire-nvr.git
synced 2025-11-20 01:50:24 -05:00
switch from log to tracing
I think this is a big improvement in readability. I removed the `lnav` config, which is a little sad, but I don't think it supports this structured logging format well. Still seems worthwhile on balance.
This commit is contained in:
@@ -34,9 +34,13 @@ While Moonfire NVR is running, logs will be written to stderr.
|
||||
Try `nvr logs` or `docker logs moonfire-nvr`.
|
||||
* When running through systemd, stderr will be redirected to the journal.
|
||||
Try `sudo journalctl --unit moonfire-nvr` to view the logs. You also
|
||||
likely want to set `MOONFIRE_FORMAT=google-systemd` to format logs as
|
||||
likely want to set `MOONFIRE_FORMAT=systemd` to format logs as
|
||||
expected by systemd.
|
||||
|
||||
*Note:* Moonfire's log format has recently changed significantly. You may
|
||||
encounter the older format in the issue tracker or (despite best efforts)
|
||||
documentation that hasn't been updated.
|
||||
|
||||
Logging options are controlled by environment variables:
|
||||
|
||||
* `MOONFIRE_LOG` controls the log level. Its format is similar to the
|
||||
@@ -45,80 +49,50 @@ Logging options are controlled by environment variables:
|
||||
`MOONFIRE_LOG=info` is the default.
|
||||
`MOONFIRE_LOG=info,moonfire_nvr=debug` gives more detailed logging of the
|
||||
`moonfire_nvr` crate itself.
|
||||
* `MOONFIRE_FORMAT` selects the output format. The two options currently
|
||||
accepted are `google` (the default, like the Google
|
||||
[glog](https://github.com/google/glog) package) and `google-systemd` (a
|
||||
variation for better systemd compatibility).
|
||||
* `MOONFIRE_COLOR` controls color coding when using the `google` format.
|
||||
It accepts `always`, `never`, or `auto`. `auto` means to color code if
|
||||
stderr is a terminal.
|
||||
* `MOONFIRE_FORMAT` selects an output format. It defaults to an output meant
|
||||
for human consumption. It can be overridden to either of the following:
|
||||
* `systemd` uses [sd-daemon logging prefixes](https://man7.org/linux/man-pages/man3/sd-daemon.3.html))
|
||||
* `json` outputs one JSON-formatted log message per line, for machine
|
||||
consumption.
|
||||
* Errors include a backtrace if `RUST_BACKTRACE=1` is set.
|
||||
|
||||
If you use Docker, set these via Docker's `--env` argument.
|
||||
|
||||
With the default `MOONFIRE_FORMAT=google`, log lines are in the following
|
||||
format:
|
||||
With `MOONFIRE_FORMAT` left unset, log events look as follows:
|
||||
|
||||
```text
|
||||
I20210308 21:31:24.255 main moonfire_nvr] Success.
|
||||
LYYYYmmdd HH:MM:SS.FFF TTTT PPPPPPPPPPPP] ...
|
||||
L = level:
|
||||
E = error; when color mode is on, the message will be bright red.
|
||||
W = warn; " " " " " " " " " " yellow.
|
||||
I = info
|
||||
D = debug
|
||||
T = trace
|
||||
YYYY = year
|
||||
mm = month
|
||||
dd = day
|
||||
HH = hour (using a 24-hour clock)
|
||||
MM = minute
|
||||
SS = second
|
||||
FFF = fractional portion of the second
|
||||
TTTT = thread name (if set) or tid (otherwise)
|
||||
PPPP = log target (usually a module path)
|
||||
... = message body
|
||||
2023-02-15T22:45:06.999329 INFO s-courtyard-sub streamer{stream="courtyard-sub"}: moonfire_nvr::streamer: opening input url=rtsp://192.168.5.112/cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif
|
||||
```
|
||||
|
||||
This example contains the following elements:
|
||||
|
||||
* the timestamp (`2023-02-15T22:45:06.9999329`) in the system's local zone.
|
||||
* the log level (`INFO`) is one of `TRACE`, `DEBUG`, `INFO`, `WARN`, or
|
||||
`ERROR`.
|
||||
* the thread name (`s-courtyard-sub`), see explanation below.
|
||||
* the "spans" (`streamer{stream="courtyard-sub"}`), which contain
|
||||
context information for a group of messages. In this case there is a single
|
||||
span `streamer` with a single field `stream`. There can be multiple
|
||||
spans; they are listed starting from the root. Each may have fields.
|
||||
* the target (`moonfire_nvr::streamer`), which generally corresponds to a Rust
|
||||
module name.
|
||||
* the log message (`opening input`), a human-readable string
|
||||
* event fields (`url=...`)
|
||||
|
||||
Moonfire NVR names a few important thread types as follows:
|
||||
|
||||
* `main`: during `moonfire-nvr run`, the main thread does initial setup then
|
||||
just waits for the other threads. In other subcommands, it does everything.
|
||||
* `s-CAMERA-TYPE` (one per stream, where `TYPE` is `main`, `sub`, or `ext`):
|
||||
these threads write video to disk.
|
||||
* `sync-PATH` (one per sample file directory): These threads call `fsync` to
|
||||
* `sync-DIR_ID` (one per sample file directory): These threads call `fsync` to
|
||||
* commit sample files to disk, delete old sample files, and flush the
|
||||
database.
|
||||
* `r-PATH` (one per sample file directory): These threads read sample files
|
||||
* `r-DIR_ID` (one per sample file directory): These threads read sample files
|
||||
from disk for serving `.mp4` files.
|
||||
* `tokio-runtime-worker` (one per core, unless overridden with
|
||||
`--worker-threads`): these threads handle HTTP requests and read video
|
||||
data from cameras via RTSP.
|
||||
* `logger`: this thread writes the log buffer to `stderr`. Logging is
|
||||
asynchronous; other threads don't wait for log messages to be written
|
||||
unless the log buffer is full.
|
||||
|
||||
You can use the following command to teach [`lnav`](http://lnav.org/) Moonfire
|
||||
NVR's log format:
|
||||
|
||||
```console
|
||||
$ lnav -i misc/moonfire_log.json
|
||||
```
|
||||
|
||||
`lnav` versions prior to 0.9.0 print a (harmless) warning message on startup:
|
||||
|
||||
```console
|
||||
$ lnav -i git/moonfire-nvr/misc/moonfire_log.json
|
||||
warning:git/moonfire-nvr/misc/moonfire_log.json:line 2
|
||||
warning: unexpected path --
|
||||
warning: /$schema
|
||||
warning: accepted paths --
|
||||
warning: /(?<format_name>\w+)/ -- The definition of a log file format.
|
||||
info: installed: /home/slamb/.lnav/formats/installed/moonfire_log.json
|
||||
```
|
||||
|
||||
You can avoid this by removing the `$schema` line from `moonfire_log.json`
|
||||
and rerunning the `lnav -i` command.
|
||||
|
||||
Below are some interesting log lines you may encounter.
|
||||
|
||||
@@ -128,16 +102,16 @@ During normal operation, Moonfire NVR will periodically flush changes to its
|
||||
SQLite3 database. Every flush is logged, as in the following info message:
|
||||
|
||||
```
|
||||
I20210308 23:14:18.388 sync-/media/14tb/sample moonfire_db::db] Flush 3810 (why: 120 sec after start of 1 minute 14 seconds courtyard-main recording 3/1842086):
|
||||
2021-03-08T23:14:18.388000 sync-2 syncer{path=/media/14tb/sample}:flush{flush_count=2 reason="120 sec after start of 1 minute 14 seconds courtyard-main recording 3/1842086"}: moonfire_db::db: flush complete:
|
||||
/media/6tb/sample: added 98M 864K 842B in 8 recordings (4/1839795, 7/1503516, 6/1853939, 1/1838087, 2/1852096, 12/1516945, 8/1514942, 10/1506111), deleted 111M 435K 587B in 5 (4/1801170, 4/1801171, 6/1799708, 1/1801528, 2/1815572), GCed 9 recordings (6/1799707, 7/1376577, 4/1801168, 1/1801527, 4/1801167, 4/1801169, 10/1243252, 2/1815571, 12/1418785).
|
||||
/media/14tb/sample: added 8M 364K 643B in 3 recordings (3/1842086, 9/1505359, 11/1516695), deleted 0B in 0 (), GCed 0 recordings ().
|
||||
```
|
||||
|
||||
This log message is packed with debugging information:
|
||||
|
||||
* the date and time: `20210308 23:14:18.388`.
|
||||
* the name of the thread that prompted the flush: `sync-/media/14tb/sample`.
|
||||
* a sequence number: `3810`. This is handy for checking how often Moonfire NVR
|
||||
* the date and time: `2021-03-08T23:14:18.388`.
|
||||
* the name of the thread that prompted the flush: `sync-2`.
|
||||
* a flush count: `3810`. This is handy for checking how often Moonfire NVR
|
||||
is flushing.
|
||||
* a reason for the flush: `120 sec after start of 1 minute 14 seconds courtyard-main recording 3/1842086`.
|
||||
This was a regular periodic flush at the `flush_if_sec` for the stream,
|
||||
@@ -178,9 +152,7 @@ file a bug if you see one. It's helpful to set the `RUST_BACKTRACE`
|
||||
environment variable to include more information.
|
||||
|
||||
```
|
||||
E20210304 11:09:29.230 main s-peck_west-main] panic at 'src/moonfire-nvr/server/db/writer.rs:750:54': should always be an unindexed sample
|
||||
|
||||
(set environment variable RUST_BACKTRACE=1 to see backtraces)"
|
||||
2021-03-04T11:09:29.230291 ERROR s-peck_west-main streamer{stream="peck_west-main"}: panic: should always be an unindexed sample location=src/moonfire-nvr/server/db/writer.rs:750:54 backtrace=...
|
||||
```
|
||||
|
||||
In this case, a stream thread (one starting with `s-`) panicked. That stream
|
||||
@@ -195,11 +167,11 @@ It's normal to see these warnings on startup and occasionally while running.
|
||||
Frequent occurrences may indicate a performance problem.
|
||||
|
||||
```
|
||||
W20201129 12:01:21.128 s-driveway-main moonfire_base::clock] opening rtsp://admin:redacted@192.168.5.108/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif took PT2.070715796S!
|
||||
W20201129 12:32:15.870 s-west_side-sub moonfire_base::clock] getting next packet took PT10.158121387S!
|
||||
W20201228 12:09:29.050 s-back_east-sub moonfire_base::clock] database lock acquisition took PT8.122452
|
||||
W20201228 21:22:32.012 main moonfire_base::clock] database operation took PT39.526386958S!
|
||||
W20201228 21:27:11.402 s-driveway-sub moonfire_base::clock] writing 37 bytes took PT20.701894190S!
|
||||
2020-11-29T12:01:21.128725 WARN s-driveway-main streamer{stream="driveway-main"}: moonfire_base::clock: opening rtsp://admin:redacted@192.168.5.108/cam/realmonitor?channel=1&subtype=0&unicast=true&proto=Onvif took PT2.070715796S!
|
||||
2020-11-29T12:32:15.870658 WARN s-west_side-sub streamer{stream="west_side-sub"}: moonfire_base::clock: getting next packet took PT10.158121387S!
|
||||
2020-12-28T12:09:29.050464 WARN s-back_east-sub streamer{stream="s-back_east-sub"}: moonfire_base::clock: database lock acquisition took PT8.122452
|
||||
2020-12-28T21:22:32.012811 WARN main moonfire_base::clock: database operation took PT39.526386958S!
|
||||
2020-12-28T21:27:11.402259 WARN s-driveway-sub streamer{stream="s-driveway-sub"}: moonfire_base::clock: writing 37 bytes took PT20.701894190S!
|
||||
```
|
||||
|
||||
### Camera stream errors
|
||||
@@ -211,7 +183,7 @@ quickly enough. In the latter case, you'll likely see a
|
||||
`getting next packet took PT...S!` message as described above.
|
||||
|
||||
```
|
||||
W20210309 00:28:55.527 s-courtyard-sub moonfire_nvr::streamer] courtyard-sub: sleeping for PT1S after error: Stream ended
|
||||
2021-03-09T00:28:55.527078 WARN s-courtyard-sub streamer{stream="courtyard-sub"}: moonfire_nvr::streamer: sleeping for PT1S after error: Stream ended
|
||||
(set environment variable RUST_BACKTRACE=1 to see backtraces)
|
||||
```
|
||||
|
||||
@@ -251,7 +223,7 @@ If Moonfire NVR runs out of disk space on a sample file directory, recording
|
||||
will be stuck and you'll see log messages like the following:
|
||||
|
||||
```
|
||||
W20210401 11:21:07.365 s-driveway-main moonfire_base::clock] sleeping for PT1S after error: No space left on device (os error 28)
|
||||
2021-04-01T11:21:07.365 WARN s-driveway-main streamer{stream="s-driveway-main"}: moonfire_base::clock: sleeping for PT1S after error: No space left on device (os error 28)
|
||||
```
|
||||
|
||||
If something else used more disk space on the filesystem than planned, just
|
||||
|
||||
Reference in New Issue
Block a user