Networking

The networking layer is what makes a Rune cluster feel like one machine. Every service gets a stable virtual IP (VIP), a DNS name your containers can resolve, optional default-deny network policy, and an edge HTTPS ingress with automatic certificate issuance. None of this requires sidecars, eBPF, or a separate control plane: it all ships in the same runed binary.

This page covers the design at a level that’s useful for operators and curious developers. If you just want to expose a service or write a network policy, jump to the corresponding guide.

┌──────────────────────────────────────────────────────────────────┐
│                          control plane                           │
│                                                                  │
│   Service ──► Endpoints ──► OrderedLog (Badger today, Raft soon) │
│      │                            │                              │
│      └──────► VIP allocator ──────┘                              │
└──────────────────────────────────────────────────────────────────┘
                          (per-node watch)
┌──────────────────────────────────────────────────────────────────┐
│                              agent                               │
│                                                                  │
│    embedded DNS ◀──── containers ──── policy enforcer            │
│         │                 │                   │                  │
│         ▼                 ▼                   ▼                  │
│  <svc>.<ns>.rune    container IP      nftables (Linux)           │
│         │                 │                   │                  │
│         └────────► data plane proxy ◀─────────┘                  │
│                                                                  │
│    ingress (edge nodes) ──► ACME orchestrator                    │
└──────────────────────────────────────────────────────────────────┘

Five subsystems, all driven by an append-only OrderedLog that the agent watches:

  1. VIP allocator — bootstraps a ClusterNetwork from a CIDR, hands every service a stable /32 from it. Allocations are persisted; a service keeps its VIP across cast cycles.
  2. Endpoints — the orchestrator publishes <service, namespace> ⇒ {pod IP, port, node} triples whenever an instance starts, stops, or fails health checks.
  3. Embedded DNS — every node runs a tiny authoritative resolver on 127.0.0.123:53. Containers reach it via Docker’s --dns/--dns-search injection. Anything ending in .rune resolves to a service VIP; everything else forwards to /etc/resolv.conf upstreams.
  4. Network policy: ServiceNetworkPolicy objects compile to per-VIP rule tables. The agent enforces them at the data plane proxy and (on Linux) installs matching nftables chains for kernel-level dropping.
  5. Ingress + ACME — on edge nodes, runed owns :80/:443, terminates TLS, and asks the ACME orchestrator (HTTP-01) to keep certificates fresh. Single-node = always leader; multi-node leader election lands with Phase 2.
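
To make the first item concrete, here is a minimal sketch of a VIP allocator that hands out stable /32 addresses from a CIDR. The class name VipAllocator, its methods, the namespace/service key format, and the first-free-address strategy are assumptions for illustration; the dictionary stands in for whatever persistence Rune actually uses.

```python
import ipaddress
from typing import Dict, Optional


class VipAllocator:
    """Hypothetical allocator: one stable address per namespace/service pair."""

    def __init__(self, cidr: str, persisted: Optional[Dict[str, str]] = None) -> None:
        self.network = ipaddress.ip_network(cidr)
        # Stand-in for persisted allocations; a real system stores these durably.
        self.assigned: Dict[str, str] = dict(persisted or {})

    def allocate(self, namespace: str, service: str) -> str:
        key = f"{namespace}/{service}"
        if key in self.assigned:
            # A known service keeps the VIP it already has.
            return self.assigned[key]
        used = set(self.assigned.values())
        for host in self.network.hosts():
            candidate = str(host)
            if candidate not in used:
                self.assigned[key] = candidate
                return candidate
        raise RuntimeError(f"no free addresses left in {self.network}")


alloc = VipAllocator("10.96.0.0/16")
first = alloc.allocate("default", "api")
again = alloc.allocate("default", "api")   # same VIP as before
other = alloc.allocate("default", "web")   # a different VIP

# Rebuilding the allocator from saved state keeps the assignment stable.
restored = VipAllocator("10.96.0.0/16", persisted=alloc.assigned)
after_restart = restored.allocate("default", "api")
```

The linear scan is fine for a sketch; a real allocator would likely track free addresses more efficiently.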

Every networking write goes through OrderedLog.Propose. Today the backend is Badger (single-node); Phase 2 swaps it for Raft (multi-node) without touching any subsystem code. The seam is enforced by scripts/check_orderedlog_seam.sh — direct Badger writes to the protected key prefixes (network/, endpoints/, local_instances/, policy/) fail CI.
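
The idea behind the seam can be sketched as follows: subsystem code depends only on a small interface with a propose operation, so the backend can change without the caller noticing. Everything here is illustrative. The Python names, the in-memory backend, and the endpoints/<namespace>/<service> key layout are assumptions; only the endpoints/ prefix and the existence of a Propose call come from the text above.

```python
from typing import List, Protocol, Tuple


class OrderedLog(Protocol):
    def propose(self, key: str, value: bytes) -> int:
        """Append a write and return its sequence number."""
        ...


class InMemoryLog:
    """Stand-in backend; any object with the same method can replace it."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, str, bytes]] = []

    def propose(self, key: str, value: bytes) -> int:
        seq = len(self.entries) + 1
        self.entries.append((seq, key, value))
        return seq


def publish_endpoint(log: OrderedLog, namespace: str, service: str, addr: str) -> int:
    # The subsystem only sees the OrderedLog interface, never the backend.
    return log.propose(f"endpoints/{namespace}/{service}", addr.encode())


log = InMemoryLog()
seq1 = publish_endpoint(log, "default", "api", "172.17.0.5:8080")
seq2 = publish_endpoint(log, "default", "api", "172.17.0.6:8080")
```

Swapping InMemoryLog for another backend leaves publish_endpoint unchanged, which is the property the CI check is described as protecting.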

Practically:

  • Stale-budget protection. Each subsystem tracks the last applied OrderedLog sequence; if the watch falls behind by more than a configurable budget (default 30s), the subsystem fails closed instead of serving stale data. You’ll see rune_dataplane_stale_seconds climb on /metrics before things break.
  • Hot reload. Updating a ServiceNetworkPolicy is one OrderedLog write. The agent compiles it, atomically swaps the rule table, and the next packet sees the new policy. No restarts.
  • No split brain on cast. cast flips the desired state in the store; the orchestrator publishes endpoints; the agent picks them up. There is no out-of-band signalling.
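
The stale-budget behaviour can be illustrated with a small guard object. This is a sketch under assumptions: the class and method names are hypothetical, and it assumes the watch calls synced() each time it confirms it has caught up with the log, which may not match how Rune measures lag.

```python
import time
from typing import Callable


class StaleBudgetGuard:
    """Hypothetical guard: stop serving once the watch lags past the budget."""

    def __init__(self, budget_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.budget_seconds = budget_seconds
        self.clock = clock
        self.last_applied_seq = 0
        self.last_synced_at = clock()

    def synced(self, seq: int) -> None:
        # Assumed to be called whenever the watch has caught up to `seq`.
        self.last_applied_seq = seq
        self.last_synced_at = self.clock()

    def stale_seconds(self) -> float:
        return self.clock() - self.last_synced_at

    def can_serve(self) -> bool:
        # Fail closed: refuse to serve data older than the budget.
        return self.stale_seconds() <= self.budget_seconds


# A fake clock makes the behaviour easy to follow.
now = [100.0]
guard = StaleBudgetGuard(budget_seconds=30.0, clock=lambda: now[0])
guard.synced(seq=7)
now[0] = 120.0
fresh = guard.can_serve()   # 20s behind, still within budget
now[0] = 131.0
stale = guard.can_serve()   # 31s behind, fails closed
```

A value like stale_seconds() is what a gauge such as rune_dataplane_stale_seconds would plausibly report.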

A service VIP is a stable /32 from the cluster CIDR (default 10.96.0.0/16 — RFC1918 or 100.64/10 only). VIPs are not pingable and don’t appear on any host interface; they exist purely as the answer the embedded DNS hands out and as the destination the data plane proxy intercepts.

$ dig +short api.default.rune
10.96.0.42
$ # inside any container on the cluster:
$ curl http://api.default.rune:8080/healthz
ok

The DNS subsystem is gated on data-plane readiness — it won’t start serving answers until the data path can carry the resulting traffic. This avoids the classic “DNS works, connections hang” failure mode during cold start.
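
The resolution rules described so far fit in a few lines. The sketch below is an assumption-laden illustration, not the embedded resolver: the function name, the returned status strings, and the choice to report "not_ready" for every query before the data plane is up are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def resolve(qname: str,
            vips: Dict[Tuple[str, str], str],
            data_plane_ready: bool) -> Tuple[str, Optional[str]]:
    """Decide how to answer a query for <svc>.<ns>.rune or any other name."""
    if not data_plane_ready:
        # Gate on readiness so DNS never answers before traffic can flow.
        return ("not_ready", None)
    name = qname.rstrip(".").lower()
    if not name.endswith(".rune"):
        # Everything outside .rune goes to the upstream resolvers.
        return ("forward", None)
    labels = name.split(".")
    if len(labels) != 3:
        return ("nxdomain", None)
    service, namespace, _ = labels
    vip = vips.get((namespace, service))
    if vip is None:
        return ("nxdomain", None)
    return ("answer", vip)


vips = {("default", "api"): "10.96.0.42"}
hit = resolve("api.default.rune", vips, data_plane_ready=True)
external = resolve("example.com", vips, data_plane_ready=True)
cold = resolve("api.default.rune", vips, data_plane_ready=False)
```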

--dev-mode (or networking.dev_mode = true) flips four switches:

  • Implies --node-role=edge — the ingress controller and ACME orchestrator come up automatically. You don’t need to pass both flags; a single --dev-mode is enough for the full laptop experience. Set --node-role="" if you explicitly want dev mode without edge.
  • DNS resolves *.rune to 127.0.0.1 so you can hit services from your host with a port-forward.
  • The ingress controller binds :8080/:8443 instead of :80/:443 (no sudo needed).
  • nftables is skipped on Linux. Policy still evaluates in user space at the proxy.

Dev mode is a laptop convenience, not a production stance. Real clusters always run in production mode.

ServiceNetworkPolicy is embedded in Service under networkPolicy:. You ship it as part of the service spec:

service:
  name: api
  image: ghcr.io/example/api:1.4.0
  ports:
    - name: http
      port: 8080
  networkPolicy:
    ingress:
      - from:
          - service: web
          - service: worker
            namespace: jobs
          - cidr: 10.0.0.0/8 # office network
        ports: [http]

Two rules to keep in mind:

  • Default deny activates per-service. As soon as a service has a networkPolicy block for ingress or egress, that direction flips to default-deny. Services without policies remain default-allow. This is intentional — it lets you adopt policies one workload at a time.
  • Service-name selectors are same-node only in v1. Cross-node identity (so from: { service: web } matches web pods on other nodes) lands with Phase 2 alongside Raft. CIDR selectors work everywhere today.
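
The default-allow versus default-deny rule can be sketched as a small evaluator. The types and the allows function are hypothetical, and two behaviours are assumptions rather than documented facts: a selector without a namespace is taken to mean the policy owner's namespace, and a rule with no ports matches every port. The same-node restriction on service selectors is not modelled.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Peer:
    service: Optional[str] = None
    namespace: Optional[str] = None
    cidr: Optional[str] = None


@dataclass
class IngressRule:
    peers: List[Peer]
    ports: List[str] = field(default_factory=list)


def allows(rules: Optional[List[IngressRule]], src_ip: str,
           src_service: Optional[str], src_namespace: Optional[str],
           port: str, own_namespace: str = "default") -> bool:
    if rules is None:
        return True  # no policy for this direction: default-allow
    for rule in rules:
        if rule.ports and port not in rule.ports:
            continue
        for peer in rule.peers:
            if peer.cidr is not None:
                if ipaddress.ip_address(src_ip) in ipaddress.ip_network(peer.cidr):
                    return True
            elif (peer.service is not None
                  and peer.service == src_service
                  and (peer.namespace or own_namespace) == src_namespace):
                return True
    return False  # a policy exists and nothing matched: default-deny


# Mirrors the YAML example: web, worker in jobs, and an office CIDR on http.
rules = [IngressRule(
    peers=[Peer(service="web"),
           Peer(service="worker", namespace="jobs"),
           Peer(cidr="10.0.0.0/8")],
    ports=["http"],
)]
from_web = allows(rules, "172.17.0.9", "web", "default", "http")
from_office = allows(rules, "10.1.2.3", None, None, "http")
from_stranger = allows(rules, "172.17.0.9", "billing", "default", "http")
no_policy = allows(None, "192.168.1.5", None, None, "http")
```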

Dropped packets increment rune_policy_drops_total{service,namespace,policy,reason} so you can alert on policy bites in production.

A service becomes internet-reachable by adding an expose: block:

service:
  name: api
  image: ghcr.io/example/api:1.4.0
  scale: 2
  expose:
    host: api.example.com
    port: http
    tls:
      auto: true # ACME (Let's Encrypt); see acme.* config

On edge nodes (--node-role=edge), runed:

  1. Adds api.example.com to the ingress router.
  2. Submits a certificate request to the ACME orchestrator.
  3. Serves the HTTP-01 challenge from /.well-known/acme-challenge/ on port 80.
  4. Once issued, hot-reloads the tls.Config so the next handshake on :443 uses the new cert.
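
Step 3 is the part of HTTP-01 that an ingress has to serve itself: for each pending challenge, a GET on /.well-known/acme-challenge/<token> must return the key authorization for that token. A minimal sketch of such a responder follows. The ChallengeStore class and its methods are hypothetical and do not reflect how runed is implemented, and the token and key authorization values are placeholders.

```python
from typing import Dict, Tuple

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


class ChallengeStore:
    """Hypothetical in-memory map of HTTP-01 tokens to key authorizations."""

    def __init__(self) -> None:
        self._tokens: Dict[str, str] = {}

    def put(self, token: str, key_authorization: str) -> None:
        self._tokens[token] = key_authorization

    def remove(self, token: str) -> None:
        # Drop the token once the challenge has been validated or abandoned.
        self._tokens.pop(token, None)

    def respond(self, path: str) -> Tuple[int, str]:
        """Return (status, body) for a request path on port 80."""
        if not path.startswith(CHALLENGE_PREFIX):
            return (404, "")
        token = path[len(CHALLENGE_PREFIX):]
        body = self._tokens.get(token)
        if body is None:
            return (404, "")
        return (200, body)


store = ChallengeStore()
store.put("tok123", "tok123.placeholder-thumbprint")
during = store.respond("/.well-known/acme-challenge/tok123")
store.remove("tok123")
after = store.respond("/.well-known/acme-challenge/tok123")
```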

You can watch progress with rune get ingresses:

$ rune get ingresses
NAMESPACE   SERVICE   HOST              TLS    CERT    EXPIRES
default     api       api.example.com   acme   ready   89d

mode=manual lets you point at a pre-existing TLS secret instead of using ACME — useful for wildcard certs you manage out-of-band.

Every networking-layer subsystem registers metrics with the Prometheus registry served on --metrics-addr (default 127.0.0.1:9100). The high-signal counters:

| Metric | What it means |
| --- | --- |
| rune_vip_allocations_total | VIPs handed out; jumps on cast |
| rune_endpoints_published_total{service,namespace} | Endpoint set rewrites; one per scale/health change |
| rune_dataplane_stale_seconds | Lag between agent watch and OrderedLog tail |
| rune_policy_drops_total{service,namespace,reason} | Packets dropped by policy; alert when non-zero |
| rune_policy_local_instances | Container-IP rows on this node; sanity check |
| rune_ingress_requests_total{host,status} | Edge HTTP/HTTPS request rate (edge nodes only) |
| rune_acme_certificates{state} | Certs by lifecycle state: pending/ready/error |

Combine these with an alert on rune_dataplane_stale_seconds > 30 to catch a data plane that is falling behind.
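
For a quick check without a full alerting stack, the two conditions can be evaluated straight from the /metrics text. This is a sketch under assumptions: it handles only simple Prometheus text-format samples without timestamps, the sample payload and its label values are made up, and the function names are hypothetical.

```python
from typing import Dict, List


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse 'name{labels} value' lines, skipping comments and blanks."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumes no trailing timestamp, so the value is the last field.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def evaluate(samples: Dict[str, float], stale_budget: float = 30.0) -> List[str]:
    fired: List[str] = []
    if samples.get("rune_dataplane_stale_seconds", 0.0) > stale_budget:
        fired.append("data plane is falling behind")
    drops = sum(v for k, v in samples.items()
                if k.startswith("rune_policy_drops_total"))
    if drops > 0:
        fired.append("policy is dropping packets")
    return fired


# Made-up scrape output for illustration only.
SAMPLE = """\
# TYPE rune_dataplane_stale_seconds gauge
rune_dataplane_stale_seconds 42
rune_policy_drops_total{service="api",namespace="default",reason="example"} 3
"""

alerts = evaluate(parse_metrics(SAMPLE))
healthy = evaluate(parse_metrics("rune_dataplane_stale_seconds 2\n"))
```

In production the same conditions would more naturally live in Prometheus alerting rules; the script only shows the logic.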

Phase 1 (today) ships VIPs, endpoints, embedded DNS, network policy, single-node ingress, and HTTP-01 ACME. The OrderedLog seam is in place so Phase 2 can land Raft, multi-node ingress, and cross-node identity selectors without rewriting the subsystems above.

If you’re operating a single droplet or VM, Phase 1 is the entire story you need. The DigitalOcean deploy guide is the fastest path from zero to a production HTTPS service.