Velero is an open-source Kubernetes backup and disaster recovery tool that captures cluster state — API objects, namespaces, and PersistentVolume data — and stores it in object storage for later restore or migration. For platform teams running stateful applications, Velero is the standard mechanism for point-in-time backup, namespace cloning across clusters, and recovery from accidental deletion or cluster failures.
Velero operates through a controller deployed in the cluster and a CLI for triggering backups and managing backup schedules. It captures Kubernetes API objects (Deployments, Services, PVCs, ConfigMaps, Secrets) and, optionally, the actual PV data. Data backup uses either the CSI VolumeSnapshot API or file-level agents (restic or kopia), depending on what the storage backend supports.
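As a minimal sketch of a schedule-driven backup, the snippet below builds a Velero `Schedule` manifest as a plain Python dict and prints it as JSON, which `kubectl apply -f` accepts. The schedule name, the `orders` namespace, the cron expression, and the `velero` install namespace are assumptions chosen for illustration.

```python
import json


def velero_schedule(name, namespaces, cron, ttl="72h0m0s"):
    """Build a Velero Schedule manifest as a plain dict."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumes Velero is installed in the "velero" namespace.
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": cron,  # standard cron syntax
            "template": {
                "includedNamespaces": list(namespaces),
                "snapshotVolumes": True,  # include PV data, not only API objects
                "ttl": ttl,  # how long each backup is retained
            },
        },
    }


# Hypothetical example: back up the "orders" namespace every six hours.
manifest = velero_schedule("orders-every-6h", ["orders"], "0 */6 * * *")
print(json.dumps(manifest, indent=2))
```

The backup interval chosen here directly bounds how much data can be lost between backups.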
How Velero Backs Up Persistent Volumes
Velero supports two distinct paths for PV data:
CSI snapshot path (preferred): Velero triggers a VolumeSnapshot via the Kubernetes CSI snapshot API. The CSI driver creates a storage-level snapshot of the volume, and Velero stores the snapshot metadata and associated API objects in object storage (typically S3-compatible). Restoring creates a new PVC from the snapshot, which the CSI driver provisions as a usable volume. This path is fast because the snapshot is taken at the storage layer rather than by streaming all volume data over the network.
File-level path (restic/kopia): When the CSI driver does not support VolumeSnapshots, Velero can use restic or kopia agents running as privileged DaemonSet pods to perform file-level backups of PV mount paths. This approach is slower and more CPU-intensive, and it adds I/O load to the node during backup windows. It works with any storage backend but should be considered a fallback.
The CSI snapshot path requires a functioning CSI external snapshotter in the cluster, including the VolumeSnapshotClass, VolumeSnapshotContent, and VolumeSnapshot CRDs.
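To make the prerequisite concrete, here is a hedged sketch of a `VolumeSnapshotClass` manifest built in Python. The driver name `csi.example.com` and the class name are placeholders, not real values; the `velero.io/csi-volumesnapshot-class` label is how Velero's CSI support selects a snapshot class for a given driver.

```python
import json


def snapshot_class(name, driver, deletion_policy="Retain"):
    """Build a VolumeSnapshotClass manifest that Velero can select."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshotClass",
        "metadata": {
            "name": name,
            # Label Velero looks for when choosing a class for this driver.
            "labels": {"velero.io/csi-volumesnapshot-class": "true"},
        },
        "driver": driver,  # must match the CSI driver behind the PVC's StorageClass
        "deletionPolicy": deletion_policy,
    }


# "csi.example.com" is a placeholder driver name for illustration only.
vsc = snapshot_class("velero-snapclass", "csi.example.com")
print(json.dumps(vsc, indent=2))
```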
Velero vs. Storage Replication vs. Snapshots Alone
A common source of confusion is treating Velero, storage-level replication, and snapshots as interchangeable DR tools. They are not:
| Mechanism | Granularity | RPO capability | RTO capability | Best for |
|---|---|---|---|---|
| Velero backup | Namespace + PV data | Minutes to hours (schedule-driven) | Minutes to hours (depends on volume size and restore path) | Accidental deletion, cluster migration, DR copy |
| Storage replication (async) | Volume-level | Near-zero (continuous) | Fast failover to replica | Site failover, HA across zones |
| Storage replication (sync) | Volume-level | Zero (no data loss) | Seconds to minutes | Zero-RPO requirements, financial data |
| CSI snapshot alone | Volume-level point-in-time | Depends on snapshot schedule | Fast (local restore) | Rollback, data cloning |
Velero complements replication and snapshots. It is the right tool for namespace-level backup, cross-cluster restore, and audit-trail copies in object storage. It is not a replacement for continuous replication when RPO must be near-zero.
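The schedule-driven RPO in the table can be worked through with simple arithmetic. The sketch below assumes the worst case is a failure just before an in-flight backup completes, so recovery falls back to the previous backup; the six-hour interval, ten-minute backup duration, and one-hour target are illustrative numbers only.

```python
from datetime import timedelta


def worst_case_data_loss(interval, backup_duration):
    """Upper bound on lost data for scheduled backups.

    If the cluster fails just before the current backup finishes,
    the newest usable backup is the previous one.
    """
    return interval + backup_duration


def meets_rpo(interval, backup_duration, rpo_target):
    """True when the schedule alone can satisfy the RPO target."""
    return worst_case_data_loss(interval, backup_duration) <= rpo_target


loss = worst_case_data_loss(timedelta(hours=6), timedelta(minutes=10))
print(loss)  # 6:10:00
print(meets_rpo(timedelta(hours=6), timedelta(minutes=10), timedelta(hours=1)))  # False
```

When the check fails for a realistic schedule, that is the signal to pair Velero with replication rather than to shorten the backup interval indefinitely.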
Velero and the CSI Snapshot Architecture
For teams using the CSI path, the full flow involves several components:
- Velero’s backup controller creates a `VolumeSnapshot` object.
- The CSI snapshot controller watches for VolumeSnapshot objects and calls the CSI driver’s `CreateSnapshot` RPC.
- The CSI driver creates a storage-level snapshot and reports back the snapshot handle.
- The snapshot controller updates the `VolumeSnapshotContent` with the handle, marking the snapshot ready.
- Velero stores the VolumeSnapshotContent metadata in object storage alongside the API object backup.
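The readiness step in this flow can be sketched as a check on the `VolumeSnapshotContent` status. The function below is an illustration rather than Velero's own logic: it treats a snapshot as usable once `status.readyToUse` is true and a `status.snapshotHandle` has been reported. The sample objects and the handle value are made up.

```python
def snapshot_ready(content):
    """Return True once the CSI driver has reported a usable snapshot."""
    status = content.get("status") or {}
    # readyToUse is set by the snapshot controller; snapshotHandle is the
    # storage-side identifier returned by the driver's CreateSnapshot call.
    return bool(status.get("readyToUse")) and bool(status.get("snapshotHandle"))


# Hypothetical objects as they might look before and after the driver responds.
pending = {"kind": "VolumeSnapshotContent", "status": {"readyToUse": False}}
ready = {
    "kind": "VolumeSnapshotContent",
    "status": {"readyToUse": True, "snapshotHandle": "snap-0123"},
}

print(snapshot_ready(pending))  # False
print(snapshot_ready(ready))    # True
```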
On restore, Velero recreates the VolumeSnapshot and VolumeSnapshotContent objects and creates a new PVC whose data source is that VolumeSnapshot; the CSI driver then provisions a volume from the snapshot. Restore speed depends on whether the CSI driver uses thin cloning (instant) or full data copy (minutes to hours for large volumes).
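A PVC that is provisioned from a snapshot names the snapshot in its `dataSource` field. The sketch below builds such a PVC manifest; the PVC name, snapshot name, StorageClass name, and size are hypothetical, and in a real restore Velero generates the equivalent object itself.

```python
import json


def pvc_from_snapshot(name, snapshot_name, storage_class, size):
    """Build a PVC manifest that asks the CSI driver to restore from a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            # The data source points at a VolumeSnapshot in the same namespace.
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
        },
    }


# All names below are placeholders for illustration.
pvc = pvc_from_snapshot("data-postgres-0", "velero-data-postgres-0", "fast-block", "100Gi")
print(json.dumps(pvc, indent=2))
```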
Restoring a StatefulSet with Velero
Restoring a StatefulSet from a Velero backup involves recreating both the API objects (the StatefulSet spec, Services, ConfigMaps) and the associated PVC data. Velero handles this in a single restore operation when the backup includes both layers. Key steps:
- Velero creates the PVCs first, triggering the CSI driver to restore volumes from snapshots.
- Once PVCs are bound, Velero creates the StatefulSet, which picks up the existing PVCs by name.
- Pod startup proceeds normally once volumes are attached.
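The StatefulSet can pick up restored PVCs by name because Kubernetes derives each PVC name from the volume claim template, the StatefulSet name, and the pod ordinal. The helper below computes the names a restore should produce; `data` and `postgres` are example values.

```python
def statefulset_pvc_names(claim_template, statefulset, replicas):
    """PVC names a StatefulSet expects: <template>-<statefulset>-<ordinal>."""
    return [f"{claim_template}-{statefulset}-{i}" for i in range(replicas)]


# Hypothetical three-replica StatefulSet "postgres" with a claim template "data".
expected = statefulset_pvc_names("data", "postgres", 3)
print(expected)  # ['data-postgres-0', 'data-postgres-1', 'data-postgres-2']
```

Comparing this list against the PVCs present after a restore is a quick way to confirm that every replica will find its volume.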
RTO depends on volume size, snapshot restore mechanism (thin clone vs. full copy), and the application’s own startup time. For databases, a restored StatefulSet may also need recovery log replay after the PVC is mounted.
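A rough RTO estimate follows directly from these factors. The model below is a back-of-the-envelope sketch, not a benchmark: every number (500 GiB volume, 200 MiB/s copy throughput, 5 s thin-clone time, 60 s startup, 120 s log replay) is an assumed input for illustration.

```python
def estimate_rto_seconds(
    size_gib,
    thin_clone,
    copy_throughput_mib_s=200.0,
    thin_clone_s=5.0,
    app_startup_s=60.0,
    log_replay_s=0.0,
):
    """Estimate restore time as volume restore + app startup + log replay."""
    if thin_clone:
        # Thin clones share blocks with the snapshot, so size does not matter.
        volume_restore_s = thin_clone_s
    else:
        # Full copy scales linearly with volume size.
        volume_restore_s = size_gib * 1024 / copy_throughput_mib_s
    return volume_restore_s + app_startup_s + log_replay_s


full_copy = estimate_rto_seconds(500, thin_clone=False, log_replay_s=120.0)
thin = estimate_rto_seconds(500, thin_clone=True, log_replay_s=120.0)
print(full_copy)  # 2740.0
print(thin)       # 185.0
```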
Velero with Simplyblock
Simplyblock’s CSI driver supports the Kubernetes VolumeSnapshot API, enabling the fast CSI snapshot path for Velero backups. Relevant capabilities:
- Instant snapshot creation: simplyblock uses copy-on-write snapshots, so `CreateSnapshot` completes in milliseconds regardless of volume size. This makes Velero backup windows extremely short.
- Thin clones for restore testing: after a Velero backup, teams can restore to a test namespace using thin clones from the snapshot. The clone shares underlying data blocks with the source, consuming minimal extra space until diverged.
- NVMe/TCP and NVMe/RoCE transport: snapshot and restore I/O runs over the same high-throughput fabric used for live workload data, keeping restore times predictable.
- Integration with snapshot vs. clone workflows: platform teams can use simplyblock snapshots directly for fast local rollback while delegating cross-cluster backup copies to Velero.
Related Terms
These glossary entries cover the components and concepts that work alongside Velero in Kubernetes DR architectures.
- CSI External Snapshotter
- CSI Snapshot Architecture
- What Is RPO
- What Is RTO
- Snapshot vs. Clone in Storage
Questions and Answers
How does Velero back up persistent volumes?
Velero backs up persistent volumes using one of two methods. The preferred method is the CSI VolumeSnapshot path: Velero triggers a VolumeSnapshot, the CSI driver creates a storage-level snapshot, and Velero stores the snapshot metadata in object storage alongside the Kubernetes API objects. The fallback method uses restic or kopia agents to perform file-level backups by reading PV mount paths directly, which is slower and more resource-intensive. The CSI path requires that the storage backend’s CSI driver supports the VolumeSnapshot API.
Does Velero work with CSI drivers?
Yes. Velero has first-class support for CSI VolumeSnapshots via its CSI plugin. When CSI support is enabled in Velero and a VolumeSnapshotClass exists for the CSI driver behind a PVC's StorageClass, Velero uses the CSI snapshot path for that volume. The cluster must have the VolumeSnapshot CRDs installed and the CSI external-snapshotter controller running. Simplyblock’s CSI driver supports VolumeSnapshots and is fully compatible with Velero’s CSI plugin.
What is the difference between Velero and storage replication?
Velero is a point-in-time backup tool that captures cluster state and PV data at scheduled intervals, storing backups in object storage. Storage replication continuously mirrors data from a source volume to a replica, either synchronously (zero data loss) or asynchronously (near-zero RPO). Velero suits scenarios like accidental deletion recovery and cluster migration; replication suits scenarios requiring fast failover and minimal data loss. The two are complementary — many production setups run both.
How do I restore a Kubernetes StatefulSet with Velero?
Run `velero restore create --from-backup <backup-name>`. Velero first recreates the PVCs by triggering CSI snapshot restores, waits for volumes to bind, then recreates the StatefulSet and associated API objects. The StatefulSet pods start once PVCs are bound and the volumes are attached. For large volumes, restore time depends on whether the CSI driver performs a thin clone (seconds) or a full data copy. After restore, verify application health, then check replication lag or recovery log state for databases.
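For scripted restores or restore tests, the command can be assembled programmatically. The sketch below only builds and prints the argument list and does not invoke Velero; the backup name and namespaces are hypothetical. `--namespace-mappings` restores into a different namespace, and `--wait` blocks until the restore finishes.

```python
import shlex


def restore_command(backup, namespace_mappings=None, wait=True):
    """Build the argv for `velero restore create` without running it."""
    argv = ["velero", "restore", "create", "--from-backup", backup]
    if namespace_mappings:
        # Format is source:destination, comma-separated for multiple pairs.
        pairs = ",".join(f"{src}:{dst}" for src, dst in namespace_mappings.items())
        argv += ["--namespace-mappings", pairs]
    if wait:
        argv.append("--wait")
    return argv


# Hypothetical restore test: restore "orders" into a scratch namespace.
cmd = restore_command("orders-backup", {"orders": "orders-restore-test"})
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` would execute it on a machine where the `velero` CLI is installed and configured.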