mirror of
https://github.com/juanfont/headscale.git
synced 2025-12-08 00:32:23 -05:00
make tags first class node owner (#2885)
This PR changes tags to be something that exists on nodes in addition to users, to being its own thing. It is part of moving our tags support towards the correct tailscale compatible implementation. There are probably rough edges in this PR, but the intention is to get it in, and then start fixing bugs from 0.28.0 milestone (long standing tags issue) to discover what works and what doesnt. Updates #2417 Closes #2619
This commit is contained in:
336
AGENTS.md
336
AGENTS.md
@@ -237,6 +237,21 @@ headscale/
|
||||
- `policy.go`: Policy storage and retrieval
|
||||
- Schema migrations in `schema.sql` with extensive test data coverage
|
||||
|
||||
**CRITICAL DATABASE MIGRATION RULES**:
|
||||
|
||||
1. **NEVER reorder existing migrations** - Migration order is immutable once committed
|
||||
2. **ONLY add new migrations to the END** of the migrations array
|
||||
3. **NEVER disable foreign keys** in new migrations - no new migrations should be added to `migrationsRequiringFKDisabled`
|
||||
4. **Migration ID format**: `YYYYMMDDHHSS-short-description` (timestamp + descriptive suffix)
|
||||
- Example: `202511131500-add-user-roles`
|
||||
- The timestamp must be chronologically ordered
|
||||
5. **New migrations go after the comment** "As of 2025-07-02, no new IDs should be added here"
|
||||
6. If you need to rename a column that other migrations depend on:
|
||||
- Accept that the old column name will exist in intermediate migration states
|
||||
- Update code to work with the new column name
|
||||
- Let AutoMigrate create the new column if needed
|
||||
- Do NOT try to rename columns that later migrations reference
|
||||
|
||||
**Policy Engine (`hscontrol/policy/`)**
|
||||
|
||||
- `policy.go`: Core ACL evaluation logic, HuJSON parsing
|
||||
@@ -687,6 +702,326 @@ assert.EventuallyWithT(t, func(c *assert.CollectT) {
|
||||
}, 10*time.Second, 500*time.Millisecond, "mixed operations")
|
||||
```
|
||||
|
||||
## Tags-as-Identity Architecture
|
||||
|
||||
### Overview
|
||||
|
||||
Headscale implements a **tags-as-identity** model where tags and user ownership are mutually exclusive ways to identify nodes. This is a fundamental architectural principle that affects node registration, ownership, ACL evaluation, and API behavior.
|
||||
|
||||
### Core Principle: Tags XOR User Ownership
|
||||
|
||||
Every node in Headscale is **either** tagged **or** user-owned, never both:
|
||||
|
||||
- **Tagged Nodes**: Ownership is defined by tags (e.g., `tag:server`, `tag:database`)
|
||||
- Tags are set during registration via tagged PreAuthKey
|
||||
- Tags are immutable after registration (cannot be changed via API)
|
||||
- May have `UserID` set for "created by" tracking, but ownership is via tags
|
||||
- Identified by: `node.IsTagged()` returns `true`
|
||||
|
||||
- **User-Owned Nodes**: Ownership is defined by user assignment
|
||||
- Registered via OIDC, web auth, or untagged PreAuthKey
|
||||
- Node belongs to a specific user's namespace
|
||||
- No tags (empty tags array)
|
||||
- Identified by: `node.UserID().Valid() && !node.IsTagged()`
|
||||
|
||||
### Critical Implementation Details
|
||||
|
||||
#### Node Identification Methods
|
||||
|
||||
```go
|
||||
// Primary methods for determining node ownership
|
||||
node.IsTagged() // Returns true if node has tags OR AuthKey.Tags
|
||||
node.HasTag(tag) // Returns true if node has specific tag
|
||||
node.IsUserOwned() // Returns true if UserID set AND not tagged
|
||||
|
||||
// IMPORTANT: UserID can be set on tagged nodes for tracking!
|
||||
// Always use IsTagged() to determine actual ownership, not just UserID.Valid()
|
||||
```
|
||||
|
||||
#### UserID Field Semantics
|
||||
|
||||
**Critical distinction**: `UserID` has different meanings depending on node type:
|
||||
|
||||
- **Tagged nodes**: `UserID` is optional "created by" tracking
|
||||
- Indicates which user created the tagged PreAuthKey
|
||||
- Does NOT define ownership (tags define ownership)
|
||||
- Example: User "alice" creates tagged PreAuthKey with `tag:server`, node gets `UserID=alice.ID` + `Tags=["tag:server"]`
|
||||
|
||||
- **User-owned nodes**: `UserID` defines ownership
|
||||
- Required field for non-tagged nodes
|
||||
- Defines which user namespace the node belongs to
|
||||
- Example: User "bob" registers via OIDC, node gets `UserID=bob.ID` + `Tags=[]`
|
||||
|
||||
#### Mapper Behavior (mapper/tail.go)
|
||||
|
||||
The mapper converts internal nodes to Tailscale protocol format, handling the TaggedDevices special user:
|
||||
|
||||
```go
|
||||
// From mapper/tail.go:102-116
|
||||
User: func() tailcfg.UserID {
|
||||
// IMPORTANT: Tags-as-identity model
|
||||
// Tagged nodes ALWAYS use TaggedDevices user, even if UserID is set
|
||||
if node.IsTagged() {
|
||||
return tailcfg.UserID(int64(types.TaggedDevices.ID))
|
||||
}
|
||||
// User-owned nodes: use the actual user ID
|
||||
return tailcfg.UserID(int64(node.UserID().Get()))
|
||||
}()
|
||||
```
|
||||
|
||||
**TaggedDevices constant** (`types.TaggedDevices.ID = 2147455555`): Special user ID for all tagged nodes in MapResponse protocol.
|
||||
|
||||
#### Registration Flow
|
||||
|
||||
**Tagged Node Registration** (via tagged PreAuthKey):
|
||||
|
||||
1. User creates PreAuthKey with tags: `pak.Tags = ["tag:server"]`
|
||||
2. Node registers with PreAuthKey
|
||||
3. Node gets: `Tags = ["tag:server"]`, `UserID = user.ID` (optional tracking), `AuthKeyID = pak.ID`
|
||||
4. `IsTagged()` returns `true` (ownership via tags)
|
||||
5. MapResponse sends `User = TaggedDevices.ID`
|
||||
|
||||
**User-Owned Node Registration** (via OIDC/web/untagged PreAuthKey):
|
||||
|
||||
1. User authenticates or uses untagged PreAuthKey
|
||||
2. Node registers
|
||||
3. Node gets: `Tags = []`, `UserID = user.ID` (required)
|
||||
4. `IsTagged()` returns `false` (ownership via user)
|
||||
5. MapResponse sends `User = user.ID`
|
||||
|
||||
#### API Validation (SetTags)
|
||||
|
||||
The SetTags gRPC API enforces tags-as-identity rules:
|
||||
|
||||
```go
|
||||
// From grpcv1.go:340-347
|
||||
// User-owned nodes are nodes with UserID that are NOT tagged
|
||||
isUserOwned := nodeView.UserID().Valid() && !nodeView.IsTagged()
|
||||
if isUserOwned && len(request.GetTags()) > 0 {
|
||||
return error("cannot set tags on user-owned nodes")
|
||||
}
|
||||
```
|
||||
|
||||
**Key validation rules**:
|
||||
|
||||
- ✅ Can call SetTags on tagged nodes (tags already define ownership)
|
||||
- ❌ Cannot set tags on user-owned nodes (would violate XOR rule)
|
||||
- ❌ Cannot remove all tags from tagged nodes (would orphan the node)
|
||||
|
||||
#### Database Layer (db/node.go)
|
||||
|
||||
**Tag storage**: Tags are stored in PostgreSQL ARRAY column and SQLite JSON column:
|
||||
|
||||
```sql
|
||||
-- From schema.sql
|
||||
tags TEXT[] DEFAULT '{}' NOT NULL, -- PostgreSQL
|
||||
tags TEXT DEFAULT '[]' NOT NULL, -- SQLite (JSON array)
|
||||
```
|
||||
|
||||
**Validation** (`state/tags.go`):
|
||||
|
||||
- `validateNodeOwnership()`: Enforces tags XOR user rule
|
||||
- `validateAndNormalizeTags()`: Validates tag format (`tag:name`) and uniqueness
|
||||
|
||||
#### Policy Layer
|
||||
|
||||
**Tag Ownership** (policy/v2/policy.go):
|
||||
|
||||
```go
|
||||
func NodeCanHaveTag(node types.NodeView, tag string) bool {
|
||||
// Checks if node's IP is in the tagOwnerMap IP set
|
||||
// This is IP-based authorization, not UserID-based
|
||||
if ips, ok := pm.tagOwnerMap[Tag(tag)]; ok {
|
||||
if slices.ContainsFunc(node.IPs(), ips.Contains) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
```
|
||||
|
||||
**Important**: Tag authorization is based on IP ranges in ACL, not UserID. Tags define identity, ACL authorizes that identity.
|
||||
|
||||
### Testing Tags-as-Identity
|
||||
|
||||
**Unit Tests** (`hscontrol/types/node_tags_test.go`):
|
||||
|
||||
- `TestNodeIsTagged`: Validates IsTagged() for various scenarios
|
||||
- `TestNodeOwnershipModel`: Tests tags XOR user ownership
|
||||
- `TestUserTypedID`: Helper method validation
|
||||
|
||||
**API Tests** (`hscontrol/grpcv1_test.go`):
|
||||
|
||||
- `TestSetTags_UserXORTags`: Validates rejection of setting tags on user-owned nodes
|
||||
- `TestSetTags_TaggedNode`: Validates that tagged nodes (even with UserID) are not rejected
|
||||
|
||||
**Auth Tests** (`hscontrol/auth_test.go:890-928`):
|
||||
|
||||
- Tests node registration with tagged PreAuthKey
|
||||
- Validates tags are applied during registration
|
||||
|
||||
### Common Pitfalls
|
||||
|
||||
1. **Don't check only `UserID.Valid()` to determine user ownership**
|
||||
- ❌ Wrong: `if node.UserID().Valid() { /* user-owned */ }`
|
||||
- ✅ Correct: `if node.UserID().Valid() && !node.IsTagged() { /* user-owned */ }`
|
||||
|
||||
2. **Don't assume tagged nodes never have UserID set**
|
||||
- Tagged nodes MAY have UserID for "created by" tracking
|
||||
- Always use `IsTagged()` to determine ownership type
|
||||
|
||||
3. **Don't allow setting tags on user-owned nodes**
|
||||
- This violates the tags XOR user principle
|
||||
- Use API validation to prevent this
|
||||
|
||||
4. **Don't forget TaggedDevices in mapper**
|
||||
- All tagged nodes MUST use `TaggedDevices.ID` in MapResponse
|
||||
- User ID is only for actual user-owned nodes
|
||||
|
||||
### Migration Considerations
|
||||
|
||||
When nodes transition between ownership models:
|
||||
|
||||
- **No automatic migration**: Tags-as-identity is set at registration and immutable
|
||||
- **Re-registration required**: To change from user-owned to tagged (or vice versa), node must be deleted and re-registered
|
||||
- **UserID persistence**: UserID on tagged nodes is informational and not cleared
|
||||
|
||||
### Architecture Benefits
|
||||
|
||||
The tags-as-identity model provides:
|
||||
|
||||
1. **Clear ownership semantics**: No ambiguity about who/what owns a node
|
||||
2. **ACL simplicity**: Tag-based access control without user conflicts
|
||||
3. **API safety**: Validation prevents invalid ownership states
|
||||
4. **Protocol compatibility**: TaggedDevices special user aligns with Tailscale's model
|
||||
|
||||
## Logging Patterns
|
||||
|
||||
### Incremental Log Event Building
|
||||
|
||||
When building log statements with multiple fields, especially with conditional fields, use the **incremental log event pattern** instead of long single-line chains. This improves readability and allows conditional field addition.
|
||||
|
||||
**Pattern:**
|
||||
|
||||
```go
|
||||
// GOOD: Incremental building with conditional fields
|
||||
logEvent := log.Debug().
|
||||
Str("node", node.Hostname).
|
||||
Str("machine_key", node.MachineKey.ShortString()).
|
||||
Str("node_key", node.NodeKey.ShortString())
|
||||
|
||||
if node.User != nil {
|
||||
logEvent = logEvent.Str("user", node.User.Username())
|
||||
} else if node.UserID != nil {
|
||||
logEvent = logEvent.Uint("user_id", *node.UserID)
|
||||
} else {
|
||||
logEvent = logEvent.Str("user", "none")
|
||||
}
|
||||
|
||||
logEvent.Msg("Registering node")
|
||||
```
|
||||
|
||||
**Key rules:**
|
||||
|
||||
1. **Assign chained calls back to the variable**: `logEvent = logEvent.Str(...)` - zerolog methods return a new event, so you must capture the return value
|
||||
2. **Use for conditional fields**: When fields depend on runtime conditions, build incrementally
|
||||
3. **Use for long log lines**: When a log line exceeds ~100 characters, split it for readability
|
||||
4. **Call `.Msg()` at the end**: The final `.Msg()` or `.Msgf()` sends the log event
|
||||
|
||||
**Anti-pattern to avoid:**
|
||||
|
||||
```go
|
||||
// BAD: Long single-line chains are hard to read and can't have conditional fields
|
||||
log.Debug().Caller().Str("node", node.Hostname).Str("machine_key", node.MachineKey.ShortString()).Str("node_key", node.NodeKey.ShortString()).Str("user", node.User.Username()).Msg("Registering node")
|
||||
|
||||
// BAD: Forgetting to assign the return value (field is lost!)
|
||||
logEvent := log.Debug().Str("node", node.Hostname)
|
||||
logEvent.Str("user", username) // This field is LOST - not assigned back
|
||||
logEvent.Msg("message") // Only has "node" field
|
||||
```
|
||||
|
||||
**When to use this pattern:**
|
||||
|
||||
- Log statements with 4+ fields
|
||||
- Any log with conditional fields
|
||||
- Complex logging in loops or error handling
|
||||
- When you need to add context incrementally
|
||||
|
||||
**Example from codebase** (`hscontrol/db/node.go`):
|
||||
|
||||
```go
|
||||
logEvent := log.Debug().
|
||||
Str("node", node.Hostname).
|
||||
Str("machine_key", node.MachineKey.ShortString()).
|
||||
Str("node_key", node.NodeKey.ShortString())
|
||||
|
||||
if node.User != nil {
|
||||
logEvent = logEvent.Str("user", node.User.Username())
|
||||
} else if node.UserID != nil {
|
||||
logEvent = logEvent.Uint("user_id", *node.UserID)
|
||||
} else {
|
||||
logEvent = logEvent.Str("user", "none")
|
||||
}
|
||||
|
||||
logEvent.Msg("Registering test node")
|
||||
```
|
||||
|
||||
### Avoiding Log Helper Functions
|
||||
|
||||
Prefer the incremental log event pattern over creating helper functions that return multiple logging closures. Helper functions like `logPollFunc` create unnecessary indirection and allocate closures.
|
||||
|
||||
**Instead of:**
|
||||
|
||||
```go
|
||||
// AVOID: Helper function returning closures
|
||||
func logPollFunc(req tailcfg.MapRequest, node *types.Node) (
|
||||
func(string, ...any), // warnf
|
||||
func(string, ...any), // infof
|
||||
func(string, ...any), // tracef
|
||||
func(error, string, ...any), // errf
|
||||
) {
|
||||
return func(msg string, a ...any) {
|
||||
log.Warn().
|
||||
Caller().
|
||||
Bool("omitPeers", req.OmitPeers).
|
||||
Bool("stream", req.Stream).
|
||||
Uint64("node.id", node.ID.Uint64()).
|
||||
Str("node.name", node.Hostname).
|
||||
Msgf(msg, a...)
|
||||
},
|
||||
// ... more closures
|
||||
}
|
||||
```
|
||||
|
||||
**Prefer:**
|
||||
|
||||
```go
|
||||
// BETTER: Build log events inline with shared context
|
||||
func (m *mapSession) logTrace(msg string) {
|
||||
log.Trace().
|
||||
Caller().
|
||||
Bool("omitPeers", m.req.OmitPeers).
|
||||
Bool("stream", m.req.Stream).
|
||||
Uint64("node.id", m.node.ID.Uint64()).
|
||||
Str("node.name", m.node.Hostname).
|
||||
Msg(msg)
|
||||
}
|
||||
|
||||
// Or use incremental building for complex cases
|
||||
logEvent := log.Trace().
|
||||
Caller().
|
||||
Bool("omitPeers", m.req.OmitPeers).
|
||||
Bool("stream", m.req.Stream).
|
||||
Uint64("node.id", m.node.ID.Uint64()).
|
||||
Str("node.name", m.node.Hostname)
|
||||
|
||||
if additionalContext {
|
||||
logEvent = logEvent.Str("extra", value)
|
||||
}
|
||||
|
||||
logEvent.Msg("Operation completed")
|
||||
```
|
||||
|
||||
## Important Notes
|
||||
|
||||
- **Dependencies**: Use `nix develop` for consistent toolchain (Go, buf, protobuf tools, linting)
|
||||
@@ -697,3 +1032,4 @@ assert.EventuallyWithT(t, func(c *assert.CollectT) {
|
||||
- **Integration Tests**: Require Docker and can consume significant disk space - use headscale-integration-tester agent
|
||||
- **Performance**: NodeStore optimizations are critical for scale - be careful with changes to state management
|
||||
- **Quality Assurance**: Always use appropriate specialized agents for testing and validation tasks
|
||||
- **Tags-as-Identity**: Tags and user ownership are mutually exclusive - always use `IsTagged()` to determine ownership
|
||||
|
||||
Reference in New Issue
Block a user