Networking
The networking layer is what makes a Rune cluster feel like one machine. It gives every service a stable virtual IP (VIP), a DNS name your containers can resolve, optional default-deny network policy, and an edge HTTPS ingress with automatic certificate issuance. None of this requires sidecars, eBPF, or a separate control plane: it ships in the same `runed` binary.
This page covers the design at a level that’s useful for operators and curious developers. If you just want to expose a service or write a network policy, jump to the corresponding guide.
The pieces

    ┌──────────────────────────────────────────────────────────────────┐
    │  control plane                                                   │
    │                                                                  │
    │  Service ──► Endpoints ──► OrderedLog (Badger today, Raft soon)  │
    │    │                            │                                │
    │    └──────► VIP allocator ──────┘                                │
    └──────────────────────────────────────────────────────────────────┘
                     │  (per-node watch)
                     ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │  agent                                                           │
    │                                                                  │
    │   embedded DNS ◀──── containers ──── policy enforcer             │
    │        │                 │                   │                   │
    │        ▼                 ▼                   ▼                   │
    │ <svc>.<ns>.rune    container IP      nftables (Linux)            │
    │        │                 │                   │                   │
    │        └────────► data plane proxy ◀─────────┘                   │
    │                                                                  │
    │   ingress (edge nodes) ──► ACME orchestrator                     │
    └──────────────────────────────────────────────────────────────────┘

Five subsystems, all driven by an append-only OrderedLog that the agent watches:
- VIP allocator: bootstraps a `ClusterNetwork` from a CIDR and hands every service a stable `/32` from it. Allocations are persisted; a service keeps its VIP across cast cycles.
- Endpoints: the orchestrator publishes `<service, namespace> ⇒ {pod IP, port, node}` triples whenever an instance starts, stops, or fails health checks.
- Embedded DNS: every node runs a tiny authoritative resolver on `127.0.0.123:53`. Containers reach it via Docker’s `--dns`/`--dns-search` injection. Anything ending in `.rune` resolves to a service VIP; everything else forwards to the `/etc/resolv.conf` upstreams.
- Network policy: `ServiceNetworkPolicy` objects compile to per-VIP rule tables. The agent enforces them at the data plane proxy and (on Linux) installs matching `nftables` chains for kernel-level dropping.
- Ingress + ACME: on edge nodes, `runed` owns `:80`/`:443`, terminates TLS, and asks the ACME orchestrator (HTTP-01) to keep certificates fresh. A single node is always the leader; multi-node leader election lands with Phase 2.
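To make the VIP allocator's "stable `/32` per service" behaviour concrete, here is a minimal Python sketch. It is not Rune's implementation: the `VipAllocator` class, its `persisted` argument, and the first-free allocation order are assumptions made for illustration only.

```python
import ipaddress


class VipAllocator:
    """Toy allocator: one stable /32 per (namespace, service) from a cluster CIDR."""

    def __init__(self, cidr, persisted=None):
        self.network = ipaddress.ip_network(cidr)
        # `persisted` stands in for allocations reloaded from the store (hypothetical).
        self.by_service = dict(persisted or {})

    def allocate(self, namespace, service):
        key = (namespace, service)
        if key in self.by_service:
            # Already allocated: the service keeps its VIP across repeated casts.
            return self.by_service[key]
        in_use = set(self.by_service.values())
        # hosts() skips the network and broadcast addresses of the CIDR.
        for candidate in self.network.hosts():
            if candidate not in in_use:
                self.by_service[key] = candidate
                return candidate
        raise RuntimeError(f"cluster CIDR {self.network} is exhausted")


alloc = VipAllocator("10.96.0.0/16")
first = alloc.allocate("default", "api")
again = alloc.allocate("default", "api")   # same service, same VIP
other = alloc.allocate("default", "web")   # different service, different VIP
print(first, again, other)
```

The only property that matters here is that the lookup happens before allocation, so a repeated request for the same service never consumes a new address.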
The OrderedLog seam
Every networking write goes through `OrderedLog.Propose`. Today the backend is Badger (single-node); Phase 2 swaps it for Raft (multi-node) without touching any subsystem code. The seam is enforced by `scripts/check_orderedlog_seam.sh`: direct Badger writes to the protected key prefixes (`network/`, `endpoints/`, `local_instances/`, `policy/`) fail CI.
Practically:
- Stale-budget protection. Each subsystem tracks the last applied OrderedLog sequence; if the watch falls behind by more than a configurable budget (default 30s), the subsystem fails closed instead of serving stale data. You’ll see `rune_dataplane_stale_seconds` climb on `/metrics` before things break.
- Hot reload. Updating a `ServiceNetworkPolicy` is one OrderedLog write. The agent compiles it, atomically swaps the rule table, and the next packet sees the new policy. No restarts.
- No split brain on cast. `cast` flips the desired state in the store; the orchestrator publishes endpoints; the agent picks them up. There is no out-of-band signalling.
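The stale-budget rule above can be sketched as a small guard object. This is an illustrative Python sketch under stated assumptions, not Rune's code: the `StaleBudgetGuard` name, the `on_caught_up` hook, and measuring staleness as "seconds since the watch last confirmed it had reached the log tail" are all hypothetical.

```python
import time


class StaleBudgetGuard:
    """Fail closed when the watch has not caught up within the budget."""

    def __init__(self, budget_seconds=30.0, clock=time.monotonic):
        self.budget_seconds = budget_seconds
        self.clock = clock
        self.last_applied_seq = 0
        self.last_caught_up_at = clock()

    def on_caught_up(self, seq):
        """Record that the watch has applied everything up to `seq` (the log tail)."""
        self.last_applied_seq = max(self.last_applied_seq, seq)
        self.last_caught_up_at = self.clock()

    def stale_seconds(self):
        return self.clock() - self.last_caught_up_at

    def check(self):
        """Raise instead of letting the caller serve data older than the budget."""
        if self.stale_seconds() > self.budget_seconds:
            raise RuntimeError(
                f"watch is {self.stale_seconds():.0f}s behind "
                f"(budget {self.budget_seconds:.0f}s); failing closed"
            )


now = [0.0]                                   # hand-driven clock for the demo
guard = StaleBudgetGuard(clock=lambda: now[0])
guard.on_caught_up(seq=41)
now[0] = 10.0
guard.check()                                 # within budget, no error
now[0] = 45.0
try:
    guard.check()                             # over budget: fails closed
except RuntimeError as err:
    print(err)
```

Injecting the clock keeps the guard testable; a real subsystem would call `check()` on the serving path and export `stale_seconds()` as the metric.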
VIPs and DNS
A service VIP is a stable `/32` from the cluster CIDR (default `10.96.0.0/16`; only RFC 1918 ranges or `100.64.0.0/10` are accepted). VIPs are not pingable and don’t appear on any host interface; they exist purely as the answer the embedded DNS hands out and as the destination the data plane proxy intercepts.

    $ dig +short api.default.rune
    10.96.0.42

    $ # inside any container on the cluster:
    $ curl http://api.default.rune:8080/healthz
    ok

The DNS subsystem is gated on data-plane readiness: it won’t start serving answers until the data path can carry the resulting traffic. This avoids the classic “DNS works, connections hang” failure mode during cold start.
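The resolver's decision logic can be sketched in a few lines of Python. This is a simplified illustration, not the embedded resolver itself: the `resolve` function, its return tags (`"refuse"`, `"forward"`, `"nxdomain"`, `"vip"`), and the exact behaviour before readiness are assumptions.

```python
def resolve(name, vips, data_plane_ready):
    """Decide what to do with a query for `name`.

    `vips` maps (namespace, service) to a VIP string.
    Returns a (decision, ip) tuple; `ip` is set only for "vip".
    """
    if not data_plane_ready:
        # Gate on readiness: no answers until the data path can carry traffic.
        return ("refuse", None)
    labels = name.rstrip(".").lower().split(".")
    if labels[-1] != "rune":
        # Not a cluster name: hand off to the /etc/resolv.conf upstreams.
        return ("forward", None)
    if len(labels) != 3:
        return ("nxdomain", None)
    service, namespace = labels[0], labels[1]
    ip = vips.get((namespace, service))
    return ("vip", ip) if ip else ("nxdomain", None)


vips = {("default", "api"): "10.96.0.42"}
print(resolve("api.default.rune", vips, data_plane_ready=True))
print(resolve("example.com", vips, data_plane_ready=True))
print(resolve("api.default.rune", vips, data_plane_ready=False))
```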
Dev mode
`--dev-mode` (or `networking.dev_mode = true`) flips four switches:
- Implies `--node-role=edge`: the ingress controller and ACME orchestrator come up automatically. You don’t need to pass both flags; a single `--dev-mode` is enough for the full laptop experience. Set `--node-role=""` if you explicitly want dev mode without edge.
- DNS resolves `*.rune` to `127.0.0.1` so you can hit services from your host with a port-forward.
- The ingress controller binds `:8080`/`:8443` instead of `:80`/`:443` (no `sudo` needed).
- nftables is skipped on Linux. Policy still evaluates in user space at the proxy.
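The switches above can be read as a small function from flags to effective settings. The Python sketch below is only a restatement of that list: the `effective_settings` function, its dictionary keys, and the assumption that a node without `--node-role` has an empty role outside dev mode are all hypothetical.

```python
def effective_settings(dev_mode, node_role=None):
    """Compute effective networking settings.

    node_role=None means the flag was not passed; "" means it was
    explicitly cleared (dev mode without edge).
    """
    if node_role is None:
        # Dev mode implies the edge role unless the operator overrides it.
        node_role = "edge" if dev_mode else ""
    return {
        "node_role": node_role,
        "ingress_ports": (8080, 8443) if dev_mode else (80, 443),
        "rune_dns_answer": "127.0.0.1" if dev_mode else "service VIP",
        # Dev mode skips nftables; policy is still evaluated at the proxy.
        "install_nftables": not dev_mode,
    }


print(effective_settings(dev_mode=True))
print(effective_settings(dev_mode=True, node_role=""))
```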
Dev mode is a laptop convenience, not a production stance. Real clusters always run in production mode.
Network policy
`ServiceNetworkPolicy` is embedded in `Service` under `networkPolicy:`. You ship it as part of the service spec:

    service:
      name: api
      image: ghcr.io/example/api:1.4.0
      ports:
        - name: http
          port: 8080
      networkPolicy:
        ingress:
          - from:
              - service: web
              - service: worker
                namespace: jobs
              - cidr: 10.0.0.0/8   # office network
            ports: [http]

Two rules to keep in mind:
- Default deny activates per-service. As soon as a service has a `networkPolicy` block for ingress or egress, that direction flips to default-deny. Services without policies remain default-allow. This is intentional: it lets you adopt policies one workload at a time.
- Service-name selectors are same-node only in v1. Cross-node identity (so `from: { service: web }` matches web pods on other nodes) lands with Phase 2 alongside Raft. CIDR selectors work everywhere today.
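The default-allow versus default-deny behaviour is easiest to see in a toy evaluator. The Python sketch below is an assumption-laden illustration, not the compiled rule tables Rune uses: the `ingress_allowed` function is hypothetical, a selector without `namespace` is assumed to mean the destination's namespace, a rule without `ports` is assumed to match every port, and the same-node restriction on service selectors is ignored.

```python
import ipaddress


def ingress_allowed(policy, src_service, src_namespace, src_ip, port_name,
                    dst_namespace="default"):
    """Return True if traffic from the given source may reach `port_name`."""
    if policy is None or "ingress" not in policy:
        return True   # no ingress policy on this service: default-allow
    for rule in policy["ingress"]:
        if "ports" in rule and port_name not in rule["ports"]:
            continue
        for peer in rule.get("from", []):
            if "cidr" in peer:
                if ipaddress.ip_address(src_ip) in ipaddress.ip_network(peer["cidr"]):
                    return True
            elif "service" in peer:
                namespace = peer.get("namespace", dst_namespace)
                if (peer["service"], namespace) == (src_service, src_namespace):
                    return True
    return False      # a policy exists and nothing matched: default-deny


# The networkPolicy block from the example spec, as a Python dict.
policy = {
    "ingress": [
        {
            "from": [
                {"service": "web"},
                {"service": "worker", "namespace": "jobs"},
                {"cidr": "10.0.0.0/8"},
            ],
            "ports": ["http"],
        }
    ]
}

print(ingress_allowed(policy, "web", "default", "172.17.0.5", "http"))
print(ingress_allowed(policy, "worker", "default", "172.17.0.6", "http"))
```

Note that the second call is denied only because `worker` is selected in the `jobs` namespace; the same source would be allowed if its IP fell inside `10.0.0.0/8`.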
Dropped packets increment `rune_policy_drops_total{service,namespace,policy,reason}` so you can alert on policy bites in production.
Ingress + automatic TLS
A service becomes internet-reachable by adding an `expose:` block:

    service:
      name: api
      image: ghcr.io/example/api:1.4.0
      scale: 2
      expose:
        host: api.example.com
        port: http
        tls:
          auto: true   # ACME (Let's Encrypt); see acme.* config

On edge nodes (`--node-role=edge`), `runed`:
- Adds `api.example.com` to the ingress router.
- Submits a certificate request to the ACME orchestrator.
- Serves the HTTP-01 challenge from `/.well-known/acme-challenge/` on port 80.
- Once issued, hot-reloads the `tls.Config` so the next handshake on `:443` uses the new cert.
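The HTTP-01 step in the list above amounts to answering one well-known path with a value the ACME client supplies. The Python sketch below shows that lookup under stated assumptions: `ChallengeStore` and its methods are hypothetical names, the token and key authorization are made-up values, and returning `None` for other paths (so normal routing can take over) is an illustrative choice rather than Rune's documented behaviour.

```python
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


class ChallengeStore:
    """Holds pending HTTP-01 challenges, keyed by token."""

    def __init__(self):
        self._pending = {}

    def put(self, token, key_authorization):
        self._pending[token] = key_authorization

    def remove(self, token):
        self._pending.pop(token, None)

    def respond(self, path):
        """Return (status, body) for a challenge path, or None for any other path."""
        if not path.startswith(CHALLENGE_PREFIX):
            return None                      # not a challenge request
        token = path[len(CHALLENGE_PREFIX):]
        body = self._pending.get(token)
        if body is None:
            return (404, "")
        return (200, body)


store = ChallengeStore()
# Hypothetical values; a real key authorization is "<token>.<account key thumbprint>".
store.put("tok123", "tok123.thumbprint")
print(store.respond("/.well-known/acme-challenge/tok123"))
store.remove("tok123")                       # clean up once the cert is issued
print(store.respond("/.well-known/acme-challenge/tok123"))
```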
You can watch progress with `rune get ingresses`:

    $ rune get ingresses
    NAMESPACE   SERVICE   HOST              TLS    CERT    EXPIRES
    default     api       api.example.com   acme   ready   89d

`mode=manual` lets you point at a pre-existing TLS secret instead of using ACME, which is useful for wildcard certs you manage out-of-band.
Observability
Every networking-layer subsystem registers metrics with the Prometheus registry served on `--metrics-addr` (default `127.0.0.1:9100`). The high-signal metrics:
| Metric | What it means |
|---|---|
| `rune_vip_allocations_total` | VIPs handed out; jumps on cast |
| `rune_endpoints_published_total{service,namespace}` | Endpoint set rewrites; one per scale/health change |
| `rune_dataplane_stale_seconds` | Lag between agent watch and OrderedLog tail |
| `rune_policy_drops_total{service,namespace,reason}` | Packets dropped by policy; alert when non-zero |
| `rune_policy_local_instances` | Container-IP rows on this node; sanity check |
| `rune_ingress_requests_total{host,status}` | Edge HTTP/HTTPS request rate (edge nodes only) |
| `rune_acme_certificates{state}` | Certs by lifecycle state: pending/ready/error |
Pair these with an alert on `rune_dataplane_stale_seconds > 30` (the default stale budget) to catch a data plane that is falling behind.
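As a rough illustration of those two alert conditions, the Python sketch below scans a Prometheus text-format scrape and reports which would fire. It is a deliberately simplified parser (no timestamps, no escaped label values), and the `parse_samples` and `firing_alerts` helpers, the sample scrape, and the `reason="no_match"` label value are all hypothetical.

```python
def parse_samples(text):
    """Parse simple Prometheus text exposition lines into (name, labels, value)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                         # skip blanks, HELP and TYPE lines
        series, _, value = line.rpartition(" ")
        name, _, labels = series.partition("{")
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples


def firing_alerts(text, stale_budget=30.0):
    alerts = []
    for name, labels, value in parse_samples(text):
        if name == "rune_dataplane_stale_seconds" and value > stale_budget:
            alerts.append("data plane is falling behind")
        if name == "rune_policy_drops_total" and value > 0:
            alerts.append("policy drops: " + labels)
    return alerts


scrape = """\
# TYPE rune_dataplane_stale_seconds gauge
rune_dataplane_stale_seconds 42.5
rune_policy_drops_total{service="api",namespace="default",reason="no_match"} 3
rune_vip_allocations_total 7
"""
for alert in firing_alerts(scrape):
    print(alert)
```

In practice these conditions would normally live in Prometheus alerting rules; the sketch only spells out the thresholds.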
What ships, what’s next
Phase 1 (today) ships VIPs, endpoints, embedded DNS, network policy, single-node ingress, and HTTP-01 ACME. The OrderedLog seam is in place so Phase 2 can land Raft, multi-node ingress, and cross-node identity selectors without rewriting the subsystems above.
If you’re operating a single droplet or VM, Phase 1 is the entire story you need. The DigitalOcean deploy guide is the fastest path from zero to a production HTTPS service.