We are currently holding Machine objects in memory for a long time,
while waiting for stream/longpoll, this might make us end up with stale
objects, that we just call save on, potentially overwriting stuff in
the database.
A typical scenario would be someone changing something from the CLI,
e.g. enabling routes, which in turn is overwritten again by the stale
object in the longpolling function.
The code has been left with TODO's and a discussion is available in #93.
This commit splits the lint and test steps into two different jobs in
github actions.
Consider this a suggestion, the idea is that when we look at PRs we will
see explicitly which one of the two types of checks fails without having
to open Github actions.
This commit rewrites the `routes list` command to use ptables to present
a slightly nicer list, including a new field if the route is enabled or
not (which is quite useful).
In addition, it reworks the enable command to support enabling multiple
routes (not only one route as per removed TODO). This allows users to
actually take advantage of exit-nodes and subnet relays.
This commit tries to address the possible raceondition that can happen
if a client closes its connection after we have fetched it from the
syncmap before sending the message.
To try to avoid introducing new dead lock conditions, all messages sent
to updateChannel has been moved into a function, which handles the
locking (instead of calling it all over the place)
The same lock is used around the delete/close function.
This function migrates more poll functions (including keepalive) to
poll.go to keep it somehow in the same file.
In addition it makes changes to improve the stability and ensure nodes
get the appropriate updates from the headscale control and are not left
in an inconsistent state.
Two new additions is:
omitpeers=true will now trigger an update if the clients are not already up
to date
keepalive has been extended with a timer that will check every 120s if
all nodes are up to date.
This commit makes two reasonably major changes:
Set a default timeout for the go HTTP server (which gin uses), which
allows us to actually have broken long poll sessions fail so we can have
the client re-establish them.
The current 10s number is chosen randomly and we need more testing to
ensure that the feature work as intended.
The second is adding a last updated field to keep track of the last time
we had an update that needs to be propagated to all of our
clients/nodes. This will be used to keep track of our machines and if
they are up to date or need us to push an update.
This commit adds a new field to machine, lastSuccessfulUpdate which
tracks when we last was able to send a proper mapupdate to the node. The
purpose of this is to be able to compare to a "global" last updated time
and determine if we need to send an update map request to a node.
In addition it allows us to create a scheduled check to see if all known
nodes are up to date.
Also, add a helper function to send a message to the update channel of a
machine.
This commit adds integration tests to headscale. They are currently
quite simple, but it lays the groundwork for more comprehensive testing
and ensuring we dont break things with the official tailscale client.
The test works by leveraging Docker (via dockertest) to spin up a
Headscale container, and a number of tailscale containers (10).
Each tailscale container is joined to the headscale and then "passed on"
to the tests.
Currently three tests have been implemented:
- Have all tailscale containers join headscale (in the setup process)
- Get IP from each container (I plan to extend this with cross-ping)
- List nodes with headscales CLI and verify all has been registered
This test depends on Docker, and currently, I have not looked into
hooking it into Github Actions.