Upgrades
Rune ships single binaries — upgrades are mostly “swap and restart.” This page covers the corners that matter in production: which path to use, what actually pauses during the swap, the file-capability gotcha, and the Terraform module behaviour that’s bitten operators.
Pick an upgrade path
| Scenario | Path | Notes |
|---|---|---|
| Routine runed version bump on an existing host | scripts/upgrade-server.sh | Swaps binaries, re-applies cap_net_bind_service, restarts, rolls back on failure. Doesn’t touch the systemd unit, runefile, or data dir. |
| First-time install, or you want the systemd unit refreshed to current | scripts/install-server.sh | Greenfield path. Re-running it on an existing host will rewrite the unit and re-setcap, which is useful when the on-disk unit has drifted behind the installer template. |
| Provisioned via terraform-digitalocean-rune (or similar) | upgrade-server.sh over SSH, not terraform apply with a bumped rune_version | See the Terraform-managed deployments section. |
| Just the CLI on a developer machine | scripts/install-cli.sh | Doesn’t touch runed. |
Version skew
Client and server share a generated proto package. Mismatched versions usually still work for compatible RPCs, but new features only show up when both sides are upgraded.

- CLI ahead of server: missing fields in responses, possible unknown field warnings on requests. Mostly fine.
- CLI behind server: missing client-side support for new flags. Update the CLI.

Since v0.0.1-dev.38, rune version prints both client and server build info. Use it to spot skew at a glance.

    $ rune version
    Client:
      Version: v0.0.1-dev.44
      Commit: 32ab1d04
    Server:
      Version: v0.0.1-dev.43
      Commit: ffde251c

Pin to the same version on both sides for production.
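To make the skew check scriptable, the following is a minimal sketch that compares the two Version lines. It assumes the output layout shown above (a Client: header and a Server: header, each followed by a Version: line); the function name versions_match is hypothetical and not part of Rune.

```shell
# Sketch: exit 0 only when the Client and Server versions are both present and equal.
# Assumes "Client:" / "Server:" headers, each followed by a "Version: <value>" field.
versions_match() {
  awk '{
    for (i = 1; i <= NF; i++) {
      if ($i == "Client:") side = "client"
      else if ($i == "Server:") side = "server"
      else if ($i == "Version:" && i < NF) v[side] = $(i + 1)
    }
  }
  END { exit !(v["client"] != "" && v["client"] == v["server"]) }'
}
```

A possible use in a deploy script: `rune version | versions_match || echo "client/server version skew" >&2`.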
Upgrade runed — the script
Since v0.0.1-dev.44, the recommended in-place upgrade path is scripts/upgrade-server.sh. Run it as root on the host:

    sudo bash <(curl -fsSL https://raw.githubusercontent.com/runestack/rune/main/scripts/upgrade-server.sh) \
      --version v0.0.1-dev.44

What it does:

- Downloads rune_linux_<arch>.tar.gz for the requested version.
- Notes whether the current runed has cap_net_bind_service set (via getcap).
- Backs up the current binaries to /usr/local/bin/.{rune,runed}.bak.
- Stops runed, atomically replaces the binaries, re-applies the file capability when applicable, starts runed.
- Polls systemctl is-active for up to 15s.
- On verification failure: restores the backup binaries and restarts. The EXIT trap covers any failure point.
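The backup, swap, verify, and rollback steps above follow a common shell pattern. The sketch below illustrates that pattern on an arbitrary file; it is not the contents of upgrade-server.sh, and the function name, the .bak/.tmp suffixes, and the verify-command argument are all hypothetical.

```shell
# Sketch of the backup / swap / verify / rollback pattern. Not the real script.
# Usage: swap_with_rollback TARGET NEW_FILE VERIFY_CMD
swap_with_rollback() {
  (
    target=$1 new=$2 verify=$3
    cp -p "$target" "$target.bak" || exit 1
    # From here on, any exit from this subshell restores the backup.
    trap 'cp -p "$target.bak" "$target"' EXIT
    # Copy next to the target, then rename: a rename within one filesystem is atomic.
    cp "$new" "$target.tmp" && mv "$target.tmp" "$target" || exit 1
    # Run the caller-supplied check; a non-zero status triggers the rollback.
    "$verify" "$target" || exit 1
    # Verified: disarm the rollback and keep the backup on disk.
    trap - EXIT
  )
}
```

Running the body in a subshell keeps the trap local to one swap. For a binary that relies on a file capability, a rollback along these lines would presumably also need to re-apply it after restoring the backup.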
Flags worth knowing:
- --skip-restart: replace binaries without restarting (for scripted maintenance windows).
- --skip-caps: don’t re-apply cap_net_bind_service. Use if you’ve moved low-port binding entirely to systemd’s AmbientCapabilities.
- --no-keep-backup: remove the backup files after success. Default keeps them so you can roll back by hand.
- --refresh-unit (since v0.0.1-dev.45): replace the on-disk runed.service with a fresh one from runed print-systemd. See Refreshing the systemd unit below.
Refreshing the systemd unit
The on-disk /etc/systemd/system/runed.service is written by install-server.sh at first boot and never updated again unless you do something about it. The Rune team adds directives to that template over time, such as AmbientCapabilities=CAP_NET_BIND_SERVICE (the back-stop for the file-capability trap), resource limits, and OOM tuning, and hosts that were provisioned from older installers don’t pick those up automatically.
Since v0.0.1-dev.45, the runed binary itself emits its canonical unit via runed print-systemd, and upgrade-server.sh --refresh-unit uses that to swap in a current unit during the upgrade.

    sudo bash <(curl -fsSL https://raw.githubusercontent.com/runestack/rune/main/scripts/upgrade-server.sh) \
      --version v0.0.1-dev.45 \
      --refresh-unit

The flow during a --refresh-unit run:

- Binaries are swapped first (same as a plain upgrade).
- The new runed is invoked: /usr/local/bin/runed print-systemd > runed.service.new. Rendering uses the new binary, so the unit always matches what this version of runed expects.
- The old unit is backed up to /etc/systemd/system/runed.service.bak.
- The new unit is installed and systemctl daemon-reload runs.
- runed is restarted; the verification path is the same as the plain upgrade.
- On verification failure, the EXIT trap restores both the binary and the previous unit, reloads, and restarts.
You can inspect what runed would write before committing to the refresh:

    # What would the new unit look like?
    runed print-systemd

    # Diff against what's deployed:
    diff <(runed print-systemd) /etc/systemd/system/runed.service

runed print-systemd accepts --user, --group, --binary, and --config if your install uses non-default paths. With no flags it emits the same unit install-server.sh would write today.
Caveat for customized units. --refresh-unit replaces the whole base unit. If you’ve edited it by hand to add (say) Environment= lines or a custom RestartSec, those go to the .bak file and don’t carry forward. Two safer patterns:
- Drop-ins: put your customisations in /etc/systemd/system/runed.service.d/*.conf instead of editing the base unit. Drop-ins aren’t touched by --refresh-unit. This is the pattern install-server.sh already uses for SupplementaryGroups=docker when the docker group exists.
- Diff first: run diff <(runed print-systemd) /etc/systemd/system/runed.service before --refresh-unit and adapt the customisations into drop-ins ahead of the refresh.
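As an illustration of the drop-in pattern, here is a minimal sketch that writes one drop-in file. The file name 10-local.conf, the function name, and the Environment= and RestartSec values are placeholders, not settings Rune requires.

```shell
# Sketch: keep customisations in a drop-in instead of editing the base unit.
# The file name and the directive values below are placeholders.
write_runed_dropin() {
  dir=${1:-/etc/systemd/system/runed.service.d}
  mkdir -p "$dir" || return 1
  cat > "$dir/10-local.conf" <<'EOF'
[Service]
Environment=EXAMPLE_VAR=value
RestartSec=5
EOF
}
# On a real host, as root:
#   write_runed_dropin && systemctl daemon-reload && systemctl restart runed
```

systemd merges drop-ins on top of the base unit, so directives written this way stay in effect when the base unit file is replaced.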
The file-capability trap
This is the one to remember. cap_net_bind_service is set via setcap and stored as an extended attribute on the binary file. It does not survive replacing that file with cp / mv / install: any binary-only replacement strips it, and runed (running as the non-root rune user) then fails to bind :80 / :443 / :53 with bind: permission denied.
upgrade-server.sh handles this automatically. If you’re doing a manual swap (see below), you need to re-apply:

    sudo setcap cap_net_bind_service=+ep /usr/local/bin/runed

The systemd unit shipped by install-server.sh also sets AmbientCapabilities=CAP_NET_BIND_SERVICE and CapabilityBoundingSet=CAP_NET_BIND_SERVICE as a belt-and-braces measure. On modern systemd those alone should be sufficient, but on hosts provisioned from older install-server.sh versions the unit may pre-date those directives. If you suspect that’s you, refresh the unit by re-running install-server.sh once.
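For custom upgrade flows, the capability check can be scripted. The sketch below only inspects a string of getcap output; the function name has_bind_cap is hypothetical, and the match is a plain substring test that does not validate the flag set (for example =ep versus =i).

```shell
# Sketch: decide from getcap output whether cap_net_bind_service is present.
# Substring match, so it accepts both the "= cap_net_bind_service+ep" form
# printed by older libcap releases and the "cap_net_bind_service=ep" form
# printed by newer ones. It does not check which flags are set.
has_bind_cap() {
  case "$1" in
    *cap_net_bind_service*) return 0 ;;
    *) return 1 ;;
  esac
}
# On a host:
#   has_bind_cap "$(getcap /usr/local/bin/runed)" || \
#     sudo setcap cap_net_bind_service=+ep /usr/local/bin/runed
```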
Upgrade runed — manual swap
If you want to know every step or you’re building a custom upgrade flow:

    VER=v0.0.1-dev.44
    ARCH=$(uname -m)
    case "$ARCH" in
      x86_64) ARCH=amd64 ;;
      aarch64|arm64) ARCH=arm64 ;;
      *) echo "Unsupported"; exit 1 ;;
    esac

    # Backup
    sudo cp /usr/local/bin/rune /usr/local/bin/.rune.bak
    sudo cp /usr/local/bin/runed /usr/local/bin/.runed.bak

    # Swap (-f makes curl fail on an HTTP error instead of saving the error page)
    sudo systemctl stop runed
    curl -fL -o /tmp/rune.tgz \
      "https://github.com/runestack/rune/releases/download/$VER/rune_linux_${ARCH}.tar.gz"
    sudo tar -C /usr/local/bin -xzf /tmp/rune.tgz rune runed

    # Re-apply file capability (REQUIRED for edge nodes binding :80/:443/:53)
    sudo setcap cap_net_bind_service=+ep /usr/local/bin/runed

    sudo systemctl start runed
    runed --version
    sudo systemctl status runed --no-pager | cat

upgrade-server.sh is the same flow with rollback and verification baked in, so prefer it.
Upgrade the CLI only

    curl -fsSL https://raw.githubusercontent.com/runestack/rune/main/scripts/install-cli.sh | bash
    rune version

What actually pauses during the swap
runed is the API server, the orchestrator, the ingress proxy, the embedded DNS resolver, and (when configured) the ACME runner, all in one process. While it’s stopped, four things are unavailable:
| Surface | Behaviour during the ~5–15s window |
|---|---|
| gRPC control plane | rune CLI calls fail (connection refused). No new services/instances can start. |
| Ingress on :80/:443 | Listener is in-process. External HTTP/HTTPS traffic to ingress-exposed services drops. Already-established connections may stall. |
| Embedded DNS | *.rune name resolution between containers breaks. Already-resolved connections stay up; new lookups fail. |
| Health probes | Runner-driven probes don’t fire. Services with tight failureThreshold × intervalSeconds may briefly flip to Degraded and recover when runed returns. |
Service workload containers keep running — they’re independent Docker containers, not in runed’s data path. Their TCP listeners stay up; whatever was talking to them via container IPs continues unaffected.
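For the health-probe row in the table above, a rough way to judge whether a service is likely to flip to Degraded is to compare failureThreshold × intervalSeconds with the length of the restart window. The sketch below is only that arithmetic; the function name is hypothetical, and the 15-second default is the upper end of the window quoted in the table.

```shell
# Sketch: does a probe's failure budget outlast the restart window?
# Usage: probe_survives FAILURE_THRESHOLD INTERVAL_SECONDS [WINDOW_SECONDS]
probe_survives() {
  budget=$(( $1 * $2 ))   # seconds of consecutive failures tolerated
  window=${3:-15}         # assumed worst case, from the ~5-15s window
  [ "$budget" -gt "$window" ]
}
```

For example, failureThreshold 3 with intervalSeconds 10 gives a 30-second budget, which exceeds the window, while 2 × 5 gives 10 seconds and does not.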
True zero-downtime upgrades (and a story for ingress that survives runed restart) require multi-node Raft, on the roadmap as RUNE-025.
Terraform-managed deployments
If you’re using terraform-digitalocean-rune (or a similar module), there’s a sharp edge worth knowing about.
The module renders var.rune_version into the droplet’s user_data (cloud-init). Cloud-init runs only on first boot — bumping rune_version in code does not re-run the installer on an existing droplet. Until v0.0.5 of the module, the default Terraform behaviour on a user_data change was to mark the droplet for replacement (destroy + create), which would wipe /var/lib/rune (KEK, BadgerDB store, host-local volumes).
Since v0.0.6 the module sets lifecycle { ignore_changes = [user_data] } on the droplet so the variable can advance freely in code without triggering a destroy.
The correct flow with the TF module:

    # 1. SSH to the droplet and upgrade in place:
    sudo bash <(curl -fsSL https://raw.githubusercontent.com/runestack/rune/main/scripts/upgrade-server.sh) \
      --version v0.0.1-dev.44

    # 2. Bump var.rune_version in your TF code so new droplets (DR rebuild,
    #    region migration, etc.) start at the same version. The apply will
    #    be a no-op for the existing droplet because of ignore_changes.
    terraform apply

If you genuinely want a fresh droplet at a new version (e.g. for a deliberate DR rebuild or a disposable preview environment), use -replace:

    terraform apply -replace=module.rune.digitalocean_droplet.this

This bypasses ignore_changes and recreates the droplet. Expect data loss on the destroyed host. Floating IPs and externally-attached DO Block Storage volumes survive; everything on the droplet’s root disk does not.
Pre-upgrade checklist
- Back up the data dir (default /var/lib/rune) and the KEK separately.
- Read the release notes; breaking changes are flagged.
- Run on a staging host first if your environment supports it.
- Confirm reachability of all your image registries from the host.
- Note the current version (rune version) so you have a target if rollback is needed.
Post-upgrade checks

    rune version                                 # both blocks should match
    sudo systemctl status runed --no-pager
    sudo journalctl -u runed -n 100 --no-pager   # look for binding errors

    rune whoami             # API responsive, server version line
    rune get services -A    # full inventory, look for Failed
    rune status

For edge nodes:

    # Confirm the file capability is still in place after upgrade.
    getcap /usr/local/bin/runed
    # → /usr/local/bin/runed cap_net_bind_service=ep

If services come back as Failed, check probe configuration. Schema validation sometimes tightens between minor versions (see the Init steps troubleshooting section for examples).
Rollback
If the new version misbehaves and upgrade-server.sh has already declared success (the failure surfaced later), restore by hand:

    sudo systemctl stop runed
    sudo cp /usr/local/bin/.rune.bak /usr/local/bin/rune
    sudo cp /usr/local/bin/.runed.bak /usr/local/bin/runed
    sudo setcap cap_net_bind_service=+ep /usr/local/bin/runed   # if edge
    sudo systemctl start runed

The backups are at /usr/local/bin/.{rune,runed}.bak unless you ran with --no-keep-backup.
If upgrade-server.sh itself fails verification, it has already rolled back via its EXIT trap before exiting non-zero — no manual restore needed.
Data on disk is forward- and backward-compatible across patch versions. Across minor versions, breaking schema migrations are flagged in release notes — back up before, and only roll forward unless the notes say otherwise.
Schema migrations
Most upgrades are pure binary swaps with no schema migration. When a migration is needed, runed runs it on first boot of the new version. If a migration fails:
- The server refuses to serve until the migration completes or you restore from backup.
- The journal will tell you exactly which step failed.
- Restore from backup, downgrade, file an issue.
Upgrading the CLI on every developer’s machine
For teams, ship the CLI version as a managed dependency:
- Homebrew tap (planned).
- CI step that downloads a known version into the runner.
- Devcontainer / asdf plugin for local dev.
Avoid relying on curl | bash ad hoc — pin a version per environment.
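One way a CI step could pin the version is to build the release URL from an explicit version string rather than fetching an installer from main. The sketch below reuses the release URL pattern and architecture mapping from the manual swap section; the function name, the RUNE_VERSION variable, and the $HOME/bin destination are hypothetical, and it assumes a Linux runner.

```shell
# Sketch: build a pinned release URL for a Linux CI runner.
# Usage: rune_release_url VERSION [MACHINE]   (MACHINE defaults to `uname -m`)
rune_release_url() {
  ver=$1
  case "${2:-$(uname -m)}" in
    x86_64) arch=amd64 ;;
    aarch64|arm64) arch=arm64 ;;
    *) echo "Unsupported architecture" >&2; return 1 ;;
  esac
  echo "https://github.com/runestack/rune/releases/download/$ver/rune_linux_${arch}.tar.gz"
}
# In CI, with RUNE_VERSION set per environment (hypothetical variable name):
#   curl -fsSL "$(rune_release_url "$RUNE_VERSION")" | tar -xz -C "$HOME/bin" rune
```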