
Storage

Rune ships a small, pluggable storage subsystem so a service can outlive the container it runs in. The model is intentionally close to Kubernetes (StorageClass / PersistentVolumeClaim / VolumeSnapshot) but flattened to fit a single-binary control plane.

| Resource | Scope | Purpose |
|---|---|---|
| StorageClass | cluster | Names a driver + a set of parameters. Operator-defined. |
| Volume | namespaced | A unit of durable storage. Provisioned by a driver. |
| Snapshot | namespaced | A point-in-time copy of a volume; can be restored anywhere. |

A Volume always references a StorageClass, which in turn names a driver (local, local-host, do-volume, …). The driver is the only piece that talks to the underlying storage — adding a new backend means writing a driver, not patching the controller.

Two storage classes are seeded automatically on first boot:

| Name | Driver | Default | Snapshots | Access modes | Notes |
|---|---|---|---|---|---|
| local | local | yes | yes | RWO | Rune-managed directory tree under localVolumeRoot. |
| local-host | local-host | no | no | RWO, ROX | Operator-pinned host paths; allowlisted in runefile. |

local is the default — a Volume (or claimTemplate) that omits storageClassName resolves to local. Override the boot default with [storage] defaultStorageClass = "..." in the runefile.
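The fallback order described above can be sketched in a few lines of Go. This is only an illustration: `ResolveStorageClass` and the class name `do-nyc3` are hypothetical, and the real resolution lives inside Rune's API server.

```go
package main

import "fmt"

// seededDefault is the class that is default after first boot, per the table above.
const seededDefault = "local"

// ResolveStorageClass is a hypothetical helper. It returns the class a Volume
// (or claimTemplate) ends up with when storageClassName may be empty.
// configuredDefault stands in for [storage] defaultStorageClass in the runefile.
func ResolveStorageClass(requested, configuredDefault string) string {
	if requested != "" {
		return requested // an explicit storageClassName always wins
	}
	if configuredDefault != "" {
		return configuredDefault // operator override of the boot default
	}
	return seededDefault
}

func main() {
	fmt.Println(ResolveStorageClass("", ""))           // local
	fmt.Println(ResolveStorageClass("", "do-nyc3"))    // do-nyc3
	fmt.Println(ResolveStorageClass("local-host", "")) // local-host
}
```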

The cloud-backed do-volume driver (DigitalOcean Volumes) is shipped in-tree but does not seed a class — operators define one explicitly with their API token reference and region.

    Pending ──▶ Provisioning ──▶ Available ──▶ Bound
                     │               │           │
                     ▼               ▼           ▼
                   Failed         Released    Released
                     │               │
                     ▼               ▼
                  Stalled      Retain | Delete

  • Pending: the row exists, but the controller hasn’t picked it up yet.
  • Provisioning: the driver’s Provision call is in flight.
  • Available: provisioned, with no instance attached.
  • Bound: attached to an instance (BoundClaim/BoundNode set).
  • Released: the instance went away and the volume awaits its reclaim policy.
  • Failed, then Stalled: Provision failed. The controller retries with backoff, then freezes the row in Stalled so a human can intervene with rune volume retry-provision.
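One way to picture the lifecycle is as a transition table. The sketch below is an assumption-heavy reading of the states listed above, not Rune's controller code: the `Phase` type and `CanTransition` are hypothetical, the Failed → Provisioning edge models the retry with backoff, and the Stalled → Provisioning edge models `rune volume retry-provision`.

```go
package main

import "fmt"

// Phase is a hypothetical name for a Volume's lifecycle state.
type Phase string

const (
	Pending      Phase = "Pending"
	Provisioning Phase = "Provisioning"
	Available    Phase = "Available"
	Bound        Phase = "Bound"
	Released     Phase = "Released"
	Failed       Phase = "Failed"
	Stalled      Phase = "Stalled"
)

// transitions lists the edges assumed for this sketch.
// Released has no outgoing edge here: the reclaim policy takes over from there.
var transitions = map[Phase][]Phase{
	Pending:      {Provisioning},
	Provisioning: {Available, Failed},
	Available:    {Bound, Released},
	Bound:        {Released},
	Failed:       {Provisioning, Stalled}, // retry with backoff, then freeze
	Stalled:      {Provisioning},          // manual retry-provision (assumed)
}

// CanTransition reports whether moving from one phase to another is allowed.
func CanTransition(from, to Phase) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Provisioning, Available)) // true
	fmt.Println(CanTransition(Pending, Bound))          // false
}
```

Keeping the edges in one table makes illegal jumps (for example Pending straight to Bound) easy to reject in a single place.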

A service mounts a volume with one of two shapes — they differ in who owns the volume row.

claim — service points at an existing Volume

    service:
      name: web
      scale: 1
      volumes:
        - name: data
          mountPath: /var/lib/web
          claim:
            name: web-data   # an existing Volume in the same namespace

  • One volume, one mount, shared across replicas.
  • For RWO drivers, this requires scale: 1; anything else is a cast-time error.
  • Cross-namespace claims use the FQDN form: name: shared.common.rune.

claimTemplate — Rune auto-provisions one volume per replica

    service:
      name: postgres
      scale: 3
      volumes:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
          claimTemplate:
            storageClassName: local   # optional; falls back to default class
            size: 10Gi
            accessMode: ReadWriteOnce

  • Rune creates pgdata-postgres-0, pgdata-postgres-1, pgdata-postgres-2 on first reconcile.
  • Names are stable across reconciles — replica-N always re-binds to its own volume.
  • The volume row carries an OwnerService reference, so when the service is deleted (with --cascade) the volumes are cleaned up by the VolumeCleanupFinalizer. Nothing else reclaims them: instance death alone never removes a volume.
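The per-replica names follow the pattern visible in the example (volume name, service name, replica index). A minimal sketch of that naming, assuming the pattern is exactly `<volume>-<service>-<index>`; `ReplicaVolumeNames` is a hypothetical helper, not part of Rune:

```go
package main

import "fmt"

// ReplicaVolumeNames derives one stable volume name per replica.
// The format is inferred from the pgdata-postgres-N example above.
func ReplicaVolumeNames(volume, service string, scale int) []string {
	if scale < 0 {
		scale = 0 // defensive: a negative scale yields no volumes
	}
	names := make([]string, 0, scale)
	for i := 0; i < scale; i++ {
		names = append(names, fmt.Sprintf("%s-%s-%d", volume, service, i))
	}
	return names
}

func main() {
	fmt.Println(ReplicaVolumeNames("pgdata", "postgres", 3))
	// [pgdata-postgres-0 pgdata-postgres-1 pgdata-postgres-2]
}
```

Because the name depends only on the template name, the service name, and the index, recomputing it on every reconcile yields the same result, which is what lets replica-N find its own volume again.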

| Mode | Abbreviation | Meaning |
|---|---|---|
| ReadWriteOnce | RWO | One writer at a time. The default for block storage. |
| ReadOnlyMany | ROX | Many concurrent read-only mounters. |
| ReadWriteMany | RWX | Many concurrent read-write mounters (NFS-class). |

Drivers advertise the modes they support via Capabilities.AccessModes. The API server rejects writes that demand a mode the chosen driver doesn’t advertise — there is no “I’ll try” path.
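A rough sketch of such a check follows. Only the `Capabilities.AccessModes` name comes from the text; the struct layout, the `advertised` map, and `CheckAccessMode` are assumptions for illustration, with the per-driver modes taken from the driver tables on this page.

```go
package main

import "fmt"

// AccessMode is a hypothetical string type for the three modes above.
type AccessMode string

const (
	ReadWriteOnce AccessMode = "ReadWriteOnce"
	ReadOnlyMany  AccessMode = "ReadOnlyMany"
	ReadWriteMany AccessMode = "ReadWriteMany"
)

// Capabilities mirrors the Capabilities.AccessModes field named in the text.
type Capabilities struct {
	AccessModes []AccessMode
}

// advertised holds what each in-tree driver supports, per the driver tables.
var advertised = map[string]Capabilities{
	"local":      {AccessModes: []AccessMode{ReadWriteOnce}},
	"local-host": {AccessModes: []AccessMode{ReadWriteOnce, ReadOnlyMany}},
	"do-volume":  {AccessModes: []AccessMode{ReadWriteOnce}},
}

// CheckAccessMode rejects a request for a mode the driver does not advertise.
func CheckAccessMode(driver string, want AccessMode) error {
	caps, ok := advertised[driver]
	if !ok {
		return fmt.Errorf("unknown driver %q", driver)
	}
	for _, mode := range caps.AccessModes {
		if mode == want {
			return nil
		}
	}
	return fmt.Errorf("driver %q does not advertise access mode %s", driver, want)
}

func main() {
	fmt.Println(CheckAccessMode("local-host", ReadOnlyMany)) // <nil>
	fmt.Println(CheckAccessMode("local", ReadWriteMany))     // non-nil error
}
```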

Each Volume has a reclaimPolicy, inherited from its StorageClass but overridable per-volume:

  • retain (default) — when the Volume row is deleted, the underlying storage is left intact.
  • delete — the driver’s Delete runs. For local, that’s rm -rf of the managed directory.

local-host rejects delete outright — the operator owns the path. runefile.[storage].preserveOnDelete = true is a belt-and-braces switch that turns delete into retain for the local driver only.
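The rules above compose as follows. This is a sketch under stated assumptions rather than Rune's actual code: `EffectivePolicy` and its parameters are hypothetical, and an empty policy is taken to mean "not set at this level".

```go
package main

import (
	"errors"
	"fmt"
)

// ReclaimPolicy is a hypothetical string type for the two policies above.
type ReclaimPolicy string

const (
	Retain ReclaimPolicy = "retain"
	Delete ReclaimPolicy = "delete"
)

// ErrDeleteNotAllowed is returned when local-host is asked to delete.
var ErrDeleteNotAllowed = errors.New("local-host does not allow reclaimPolicy delete")

// EffectivePolicy resolves what happens to the storage when a Volume row is deleted.
// preserveOnDelete stands in for runefile.[storage].preserveOnDelete.
func EffectivePolicy(driver string, classPolicy, volumePolicy ReclaimPolicy, preserveOnDelete bool) (ReclaimPolicy, error) {
	policy := volumePolicy // a per-volume override wins
	if policy == "" {
		policy = classPolicy // otherwise inherit from the StorageClass
	}
	if policy == "" {
		policy = Retain // documented default
	}
	if policy == Delete {
		switch driver {
		case "local-host":
			return "", ErrDeleteNotAllowed // the operator owns the path
		case "local":
			if preserveOnDelete {
				return Retain, nil // the switch applies to the local driver only
			}
		}
	}
	return policy, nil
}

func main() {
	fmt.Println(EffectivePolicy("local", Delete, "", true))          // retain <nil>
	fmt.Println(EffectivePolicy("do-volume", Retain, Delete, true)) // delete <nil>
}
```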

A Snapshot is a point-in-time copy of one volume:

    snapshot:
      name: pgdata-2025-11-15
      namespace: prod
      source:
        volume: pgdata-postgres-0

Snapshots have their own state machine: Pending → Creating → Ready (or Failed). Deletion is two-phase (Deleting → row removed) so the driver can clean up the underlying snapshot first.

Restore creates a new volume populated from the snapshot:

    rune volume restore web-data-restored \
      --from-snapshot pgdata-2025-11-15 \
      --storage-class local

The local driver implements snapshots as filesystem copies (cp -a). local-host does not support snapshots; the API rejects the write up-front. do-volume uses DigitalOcean’s snapshot API.

| Driver | Type | Snapshots | Access modes | Notes |
|---|---|---|---|---|
| local | filesystem | yes | RWO | Default; managed directory per volume. |
| local-host | bind-mount | no | RWO, ROX | Pre-existing host paths only. |
| do-volume | block (cloud) | yes | RWO | DigitalOcean Volumes; auth via secret. |

Adding a backend (Hetzner, AWS EBS, GCP PD, …) is a single Go package implementing the driver.Driver interface — no controller, scheduler, runner, API or CLI changes required. See RUNE-069 design notes for the contract and conformance suite.
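To give a feel for the shape of a backend, here is a toy in-memory driver. Everything in it is an assumption made for illustration: the text only confirms that drivers expose Provision, Delete, and Capabilities.AccessModes, so the `Driver` interface, `ProvisionRequest`, and the method signatures below are hypothetical and almost certainly smaller than the real driver.Driver contract (which also has to cover snapshots). Consult the RUNE-069 design notes for the actual interface.

```go
package main

import (
	"errors"
	"fmt"
)

// Capabilities: only the AccessModes field name comes from the docs.
type Capabilities struct {
	AccessModes []string
	Snapshots   bool // hypothetical field
}

// ProvisionRequest is a hypothetical request shape.
type ProvisionRequest struct {
	Name      string
	SizeBytes int64
	Topology  map[string]string // e.g. "rune.io/region": "nyc3"
}

// Driver is a hypothetical, reduced stand-in for driver.Driver.
type Driver interface {
	Capabilities() Capabilities
	Provision(req ProvisionRequest) (handle string, err error)
	Delete(handle string) error
}

// memDriver keeps "volumes" in a map; it exists only to exercise the interface.
type memDriver struct {
	volumes map[string]ProvisionRequest
}

func newMemDriver() *memDriver {
	return &memDriver{volumes: map[string]ProvisionRequest{}}
}

func (d *memDriver) Capabilities() Capabilities {
	return Capabilities{AccessModes: []string{"ReadWriteOnce"}}
}

func (d *memDriver) Provision(req ProvisionRequest) (string, error) {
	// A placement-aware backend would read its topology labels here.
	if req.Topology["rune.io/region"] == "" {
		return "", errors.New("mem: rune.io/region label is required")
	}
	handle := "mem-" + req.Name
	if _, exists := d.volumes[handle]; exists {
		return "", fmt.Errorf("mem: volume %q already exists", req.Name)
	}
	d.volumes[handle] = req
	return handle, nil
}

func (d *memDriver) Delete(handle string) error {
	if _, ok := d.volumes[handle]; !ok {
		return fmt.Errorf("mem: unknown handle %q", handle)
	}
	delete(d.volumes, handle)
	return nil
}

// Compile-time check that memDriver satisfies the sketch interface.
var _ Driver = (*memDriver)(nil)

func main() {
	var d Driver = newMemDriver()
	handle, err := d.Provision(ProvisionRequest{
		Name:     "pgdata-postgres-0",
		Topology: map[string]string{"rune.io/region": "nyc3"},
	})
	fmt.Println(handle, err)
	fmt.Println(d.Delete(handle))
}
```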

Driver Provision requests carry topology labels for placement-aware backends:

| Label | Used by | Meaning |
|---|---|---|
| rune.io/region | do-volume | Cloud region (e.g. nyc3). |
| rune.io/zone | (cloud) | Availability zone. |
| rune.io/host-path-root | local-host | Pins a local-host volume to a node’s allowlist root. |

StorageClass.allowedTopologies constrains placement at the class level.

  • Bind-mount blocklist: mountPath cannot be /, /etc, /proc, /sys, or /var/run/docker.sock. This is a cast-time lint.
  • Host path allowlist: local-host rejects any hostPath not under a prefix in runefile.[storage].hostPathAllowlist.
  • createIfMissing is opt-in: local-host will not create a missing host directory unless runefile.[storage].allowCreateMissing = true. runed --dev-mode overlays both knobs to a developer-friendly default (~/.rune/volumes allowlist, allowCreateMissing = true).
  • Default-class invariant: exactly one StorageClass may carry default: true. The API server demotes the previous default on write.
  • RBAC: promoting a StorageClass to default and --cascade deletes are admin-only.
  • Driver capability lint at write: the API server cross-checks every Volume / Snapshot write against the named driver’s capabilities and rejects requests for modes/snapshots the driver doesn’t support.
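The first two checks in this list are path tests, and they are easy to get subtly wrong with plain string prefixes (`/srv/database` starts with `/srv/data`). The sketch below shows one careful way to write them. It is not Rune's implementation: `LintMountPath` and `HostPathAllowed` are hypothetical names, the blocklist is assumed to match exact paths after cleaning, and a hostPath equal to an allowlisted prefix is assumed to be acceptable.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// blockedMountPaths is the blocklist from the text.
var blockedMountPaths = map[string]bool{
	"/":                    true,
	"/etc":                 true,
	"/proc":                true,
	"/sys":                 true,
	"/var/run/docker.sock": true,
}

// LintMountPath rejects a mountPath that, once cleaned, is on the blocklist.
func LintMountPath(mountPath string) error {
	clean := filepath.Clean(mountPath)
	if blockedMountPaths[clean] {
		return fmt.Errorf("mountPath %q is not allowed", clean)
	}
	return nil
}

// HostPathAllowed reports whether hostPath sits at or below an allowlisted prefix.
// It compares path components, so "/srv/database" is not under "/srv/data",
// and ".." segments are resolved before the comparison.
func HostPathAllowed(hostPath string, allowlist []string) bool {
	clean := filepath.Clean(hostPath)
	sep := string(filepath.Separator)
	for _, prefix := range allowlist {
		rel, err := filepath.Rel(filepath.Clean(prefix), clean)
		if err != nil {
			continue
		}
		if rel != ".." && !strings.HasPrefix(rel, ".."+sep) {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"/srv/data"}
	fmt.Println(LintMountPath("/etc/"))                     // non-nil error
	fmt.Println(HostPathAllowed("/srv/data/pg", allow))     // true
	fmt.Println(HostPathAllowed("/srv/database", allow))    // false
	fmt.Println(HostPathAllowed("/srv/data/../etc", allow)) // false
}
```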