This commit adds some Prometheus metrics to /metrics in headscale.
It will add the standard go metrics, some automatic gin metrics and some
initial headscale specific ones.
Some of them has been added to aid debugging #97 (loop bug)
In the future, we can use the metrics to get rid of the sleep in the
integration tests by checking that our expected number of nodes has been
registered:
```
headscale_machine_registrations_total
```
This commit rewrites a bunch of the code to always use *Machine instead
of a mix of both, and a mix of tailcfg.Node and Machine.
Now we use *Machine, and if tailcfg.Node is needed, it is converted just
before needed.
We are currently holding Machine objects in memory for a long time,
while waiting for stream/longpoll, this might make us end up with stale
objects, that we just call save on, potentially overwriting stuff in
the database.
A typical scenario would be someone changing something from the CLI,
e.g. enabling routes, which in turn is overwritten again by the stale
object in the longpolling function.
The code has been left with TODO's and a discussion is available in #93.
This commit tries to address the possible raceondition that can happen
if a client closes its connection after we have fetched it from the
syncmap before sending the message.
To try to avoid introducing new dead lock conditions, all messages sent
to updateChannel has been moved into a function, which handles the
locking (instead of calling it all over the place)
The same lock is used around the delete/close function.
This function migrates more poll functions (including keepalive) to
poll.go to keep it somehow in the same file.
In addition it makes changes to improve the stability and ensure nodes
get the appropriate updates from the headscale control and are not left
in an inconsistent state.
Two new additions is:
omitpeers=true will now trigger an update if the clients are not already up
to date
keepalive has been extended with a timer that will check every 120s if
all nodes are up to date.