This PR addresses some consistency issues that were introduced or discovered with the nodestore.
nodestore:
Now returns the node that is being put or updated once the write is finished. This closes a race condition where reading the node back did not necessarily return the node with the given change, and it ensures we also get all the other updates from that batch write.
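A minimal sketch of the idea, assuming a simplified in-memory store. `Store`, `Update` and `ApplyBatch` are hypothetical names for illustration and are not the actual nodestore API. Every caller in a batch gets back the node as it looks after the whole batch has been written, so no separate read is needed:

```go
package main

import (
	"fmt"
	"sync"
)

// Node is a simplified, hypothetical stand-in for a stored node.
type Node struct {
	ID        int
	Hostname  string
	GivenName string
}

// Update is one queued change to a node (hypothetical type).
type Update struct {
	ID int
	Fn func(*Node)
}

// Store is an illustrative in-memory store, not headscale's NodeStore.
type Store struct {
	mu    sync.Mutex
	nodes map[int]Node
}

func NewStore() *Store {
	return &Store{nodes: make(map[int]Node)}
}

// ApplyBatch applies every update under a single lock and returns, for each
// update, the node as stored once the whole batch has been written. Callers
// use the returned value instead of reading the node back afterwards.
func (s *Store) ApplyBatch(updates []Update) []Node {
	s.mu.Lock()
	defer s.mu.Unlock()

	for _, u := range updates {
		n := s.nodes[u.ID] // zero Node if the ID is new
		n.ID = u.ID
		u.Fn(&n)
		s.nodes[u.ID] = n
	}

	results := make([]Node, len(updates))
	for i, u := range updates {
		results[i] = s.nodes[u.ID]
	}
	return results
}

func main() {
	s := NewStore()
	results := s.ApplyBatch([]Update{
		{ID: 1, Fn: func(n *Node) { n.Hostname = "laptop" }},
		{ID: 1, Fn: func(n *Node) { n.GivenName = "alice-laptop" }},
	})
	// Both callers see the node with both changes from the batch applied.
	for _, n := range results {
		fmt.Printf("%+v\n", n)
	}
}
```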
auth:
Authentication paths have been unified and simplified. It removes a lot of bad branches and ensures we only do the minimal work.
A comprehensive auth test set has been created so we do not have to run integration tests to validate auth. It has also allowed us to generate test cases for all the branches we currently know of.
integration:
Added a lot more tooling and checks to validate that nodes reach the expected state when they come up and down. Standardised between the different auth models. A lot of this is to support or detect issues in the changes to nodestore (races) and auth (inconsistencies after login and reaching the correct state).
This PR was assisted, particularly the tests, by Claude Code.
- tailscale client gets a new AuthUrl and sets entry in the regcache
- regcache entry expires
- client doesn't know about that
- client keeps polling the followup request and gets an error
When the user clicks "Login" in the app (after cache expiry), they visit
an invalid URL and get "node not found in registration cache". Some clients,
for example on Windows, can't get a new AuthUrl without restarting the app.
To fix that, we can issue a new reg id and return a new valid AuthUrl to
the user.
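The fix can be sketched roughly as follows. This is an illustration under assumptions, not the actual handler: `regCache`, `followup`, `defaultTTL`, `fakeClock` and the URL layout are all hypothetical. A followup poll with an expired or unknown reg id gets a freshly issued id and AuthUrl instead of an error:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// defaultTTL is an assumed lifetime for a registration cache entry.
const defaultTTL = 15 * time.Minute

// fakeClock is a manually advanced clock so expiry can be shown without sleeping.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time          { return c.t }
func (c *fakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// regCache maps registration IDs to their expiry time.
// It is not safe for concurrent use; a real cache would need locking.
type regCache struct {
	ttl     time.Duration
	now     func() time.Time
	entries map[string]time.Time
}

func newRegCache(ttl time.Duration, now func() time.Time) *regCache {
	return &regCache{ttl: ttl, now: now, entries: make(map[string]time.Time)}
}

// authURL builds a hypothetical registration URL for an ID.
func authURL(id string) string {
	return "https://headscale.example.com/register/" + id
}

// issue stores a fresh random registration ID and returns it with its URL.
func (c *regCache) issue() (id, url string) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	id = hex.EncodeToString(b)
	c.entries[id] = c.now().Add(c.ttl)
	return id, authURL(id)
}

// followup handles a client's followup poll. A live entry is returned as-is;
// an expired or unknown ID is replaced with a newly issued one.
func (c *regCache) followup(id string) (newID, url string, renewed bool) {
	if exp, ok := c.entries[id]; ok && c.now().Before(exp) {
		return id, authURL(id), false
	}
	delete(c.entries, id)
	newID, url = c.issue()
	return newID, url, true
}

func main() {
	clk := &fakeClock{}
	cache := newRegCache(defaultTTL, clk.Now)

	id, url := cache.issue()
	fmt.Println("initial:", url)

	clk.Advance(2 * defaultTTL) // the entry expires while the client keeps polling
	_, url, renewed := cache.followup(id)
	fmt.Println("after expiry:", url, "renewed:", renewed)
}
```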
RegisterNode is refactored to be created with NewRegisterNode(), which
creates the channel and other fields automatically.
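A rough sketch of why a constructor helps, with assumed field names (`Node`, `Registered`) that may not match the real struct. A send on a nil channel blocks forever, so creating the channel in one place avoids callers forgetting it:

```go
package main

import "fmt"

// Node is a simplified, hypothetical node type.
type Node struct{ Hostname string }

// RegisterNode pairs a pending node with a channel used to hand over the
// registered node. The field names are assumptions for this sketch.
type RegisterNode struct {
	Node       Node
	Registered chan *Node
}

// NewRegisterNode always creates the channel, so no caller can end up
// sending on a nil channel.
func NewRegisterNode(node Node) RegisterNode {
	return RegisterNode{
		Node:       node,
		Registered: make(chan *Node),
	}
}

func main() {
	rn := NewRegisterNode(Node{Hostname: "laptop"})

	// Simulate the registration flow completing in another goroutine.
	go func() {
		registered := rn.Node
		rn.Registered <- &registered
	}()

	fmt.Println("registered:", (<-rn.Registered).Hostname)
}
```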
We are already being punished by GitHub Actions, and there seems to be
little value in running all the tests for both databases, so only
run a few key tests to check that postgres isn't broken.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* cmd/hi: add integration test runner CLI tool
Add a new CLI tool 'hi' for running headscale integration tests
with Docker automation. The tool replaces manual Docker command
composition with an automated solution.
Features:
- Run integration tests in golang:1.24 containers
- Docker context detection (supports colima and other contexts)
- Test isolation with unique run IDs and isolated control_logs
- Automatic Docker image pulling and container management
- Comprehensive cleanup operations for containers, networks, images
- Docker volume caching for Go modules
- Verbose logging and detailed test artifact reporting
- Support for PostgreSQL/SQLite selection and various test flags
Usage: go run ./cmd/hi run TestPingAllByIP --verbose
The tool uses creachadair/command and flax for CLI parsing and
provides cleanup subcommands for Docker resource management.
Updates flake.nix vendorHash for new Go dependencies.
* ci: update integration tests to use hi CLI tool
Replace manual Docker command composition in GitHub Actions
workflow with the new hi CLI tool for running integration tests.
Changes:
- Replace complex docker run command with simple 'go run ./cmd/hi run'
- Remove manual environment variable setup (handled by hi tool)
- Update artifact paths for new timestamped log directory structure
- Simplify command from 15+ lines to 3 lines
- Maintain all existing functionality (postgres/sqlite, timeout, test patterns)
The hi tool automatically handles Docker context detection, container
management, volume mounting, and environment variable setup that was
previously done manually in the workflow.
* makefile: remove test integration
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* fix issue auto approve route on register bug
This commit fixes an issue where routes were not approved
on a node during registration. This caused the auto approval
to require the node to readvertise the routes.
Fixes #2497
Fixes #2485
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
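As an illustration of approving routes in the registration step itself, here is a minimal sketch. The `Node` type, `register`, `approveRoutes` and the containment rule are assumptions for this example, not the actual policy code:

```go
package main

import (
	"fmt"
	"net/netip"
)

// mustPrefixes parses CIDR strings and panics on invalid input (demo helper).
func mustPrefixes(cidrs ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		out = append(out, netip.MustParsePrefix(c).Masked())
	}
	return out
}

// covers reports whether route lies entirely within approver.
func covers(approver, route netip.Prefix) bool {
	return approver.Bits() <= route.Bits() && approver.Contains(route.Addr())
}

// approveRoutes returns the advertised routes covered by any approver prefix.
func approveRoutes(advertised, approvers []netip.Prefix) []netip.Prefix {
	var approved []netip.Prefix
	for _, r := range advertised {
		for _, a := range approvers {
			if covers(a, r) {
				approved = append(approved, r)
				break
			}
		}
	}
	return approved
}

// Node is a simplified, hypothetical node with advertised and approved routes.
type Node struct {
	Hostname   string
	Advertised []netip.Prefix
	Approved   []netip.Prefix
}

// register approves routes as part of registration, so the node does not
// have to advertise them a second time to trigger auto approval.
func register(n *Node, approvers []netip.Prefix) {
	n.Approved = approveRoutes(n.Advertised, approvers)
}

func main() {
	n := &Node{
		Hostname:   "router",
		Advertised: mustPrefixes("10.0.1.0/24", "192.168.0.0/16"),
	}
	register(n, mustPrefixes("10.0.0.0/8"))
	fmt.Println("approved on registration:", n.Approved)
}
```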
* hsic: only set db policy if it exists
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* policy: calculate changed based on policy and filter
v1 is a bit simpler than v2: it does not precalculate the auto approver map,
so we cannot tell if it has changed.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
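One way to picture "changed based on policy and filter" is to hash both and compare against the previous hashes. This is a sketch under assumptions: `manager`, `Rule` and the SHA-256 over JSON approach are illustrative choices and not necessarily how the policy manager computes it:

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// Rule is a simplified, hypothetical compiled filter rule.
type Rule struct {
	SrcIPs   []string
	DstPorts []string
}

// manager remembers hashes of the last policy text and compiled filter.
type manager struct {
	policyHash [32]byte
	filterHash [32]byte
}

// update records the new policy and filter and reports whether either one
// differs from the previous call. The filter can change while the policy
// text stays the same, for example when a node gets a new address.
func (m *manager) update(policy string, filter []Rule) (bool, error) {
	raw, err := json.Marshal(filter)
	if err != nil {
		return false, err
	}
	ph := sha256.Sum256([]byte(policy))
	fh := sha256.Sum256(raw)

	changed := ph != m.policyHash || fh != m.filterHash
	m.policyHash, m.filterHash = ph, fh
	return changed, nil
}

func main() {
	m := &manager{}
	policy := `{"acls": []}`
	filter := []Rule{{SrcIPs: []string{"100.64.0.1"}, DstPorts: []string{"*:22"}}}

	first, _ := m.update(policy, filter)
	second, _ := m.update(policy, filter)
	filter[0].SrcIPs = append(filter[0].SrcIPs, "100.64.0.2")
	third, _ := m.update(policy, filter)

	fmt.Println(first, second, third)
}
```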
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* utility iterator for ipset
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy -> policy and v1
This commit splits out the common policy logic and policy implementation
into separate packages.
policy contains functions that are independent of the policy implementation,
this typically means logic that works on tailcfg types and generic formats.
In addition, it defines the PolicyManager interface which the v1 implements.
v1 is a subpackage which implements the PolicyManager using the "original"
policy implementation.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use policyv1 definitions in integration tests
These can be marshalled back into JSON, which the
new format might not be able to do.
Also, just don't change it all to JSON strings for now.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* formatter: breaks lines
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove compareprefix, use tsaddr version
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* remove getacl test, add back autoapprover
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* use policy manager tag handling
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* rename display helper for user
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* introduce policy v2 package
policy v2 is built from the ground up to be stricter
and follow the same pattern for all types of resolvers.
TODO introduce
aliass
resolver
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* wire up policyv2 in integration testing
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* split policy v2 tests into separate workflow to work around GitHub limit
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* add policy manager output to /debug
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* update changelog
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
* set state and nonce in oidc to prevent csrf
Fixes #2276
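A minimal sketch of the state and nonce check, using only the standard library. `authSession`, `newAuthSession` and `verify` are hypothetical names; the real code would use its OIDC library's verification hooks and a per-browser session store:

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/base64"
	"errors"
	"fmt"
)

// randomToken returns 32 random bytes encoded as a URL-safe string.
func randomToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// authSession holds the values generated when the login flow starts.
type authSession struct {
	State string
	Nonce string
}

func newAuthSession() (authSession, error) {
	state, err := randomToken()
	if err != nil {
		return authSession{}, err
	}
	nonce, err := randomToken()
	if err != nil {
		return authSession{}, err
	}
	return authSession{State: state, Nonce: nonce}, nil
}

// equal compares two strings in constant time.
func equal(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

// verify checks the state returned on the callback and the nonce claim from
// the ID token against the values stored for this session.
func (s authSession) verify(callbackState, idTokenNonce string) error {
	if !equal(s.State, callbackState) {
		return errors.New("state mismatch: possible CSRF")
	}
	if !equal(s.Nonce, idTokenNonce) {
		return errors.New("nonce mismatch: ID token not issued for this session")
	}
	return nil
}

func main() {
	s, err := newAuthSession()
	if err != nil {
		panic(err)
	}
	fmt.Println("legitimate callback:", s.verify(s.State, s.Nonce))
	fmt.Println("forged callback:", s.verify("attacker-chosen", s.Nonce))
}
```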
* try to fix new postgres issue
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
---------
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Docker released a patch release which changed the permissions required to use tun devices in containers. This caused all containers to fail in tests, which in turn caused all tests to fail. This fixes it and adds some tools for debugging in the future.
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Set up mike to provide versioned builds of the documentation.
The goal is to have versioned docs for stable releases (0.23.0, 0.24.0)
and development docs that can progress along with the code. This allows
us to tailor docs to the next upcoming version as we no longer need to
care about divergence between rendered docs and the latest release.
Versions:
* development (alias: unstable) on each push to the main branch
* MAJOR.MINOR.PATCH (alias: stable, latest for the newest version)
  * for each "final" release tag
  * for each push to doc maintenance branches: doc/MAJOR.MINOR.PATCH
The default version should be the current stable version. The doc
maintenance branches may be used to update the version specific
documentation when issues arise after a release.
* #2140 Fixed updating of hostname and givenName when it is updated in HostInfo
* #2140 Added integration tests
* #2140 Fix unit tests
* Changed IsAutomaticNameMode to GivenNameHasBeenChanged. Fixed errors in files according to golangci-lint rules