Break Glass #08: Repository Out of Space Mid-Job: Recovering and Getting Backups Running Again
Why This Happens
Repositories fill up for predictable reasons that compound quietly. Data growth from the protected environment is the slow burn. VMs get bigger, more VMs get added to jobs, change rates climb, and nobody adjusts retention or storage capacity to keep pace. The sharp trigger is usually a synthetic full transformation. Veeam needs temporary space to write the new full before it can delete the old one. Veeam best practices recommend sizing your repository to hold at least 1.25 times the size of a full backup as additional headroom beyond your normal backup data. On a nearly full repository, there is no room for the transformation and the job fails partway through the write.
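The 1.25 times guidance can be turned into a quick arithmetic check. The sketch below is illustrative only: the function name and the sample sizes are hypothetical, and it assumes you already know the size of one full backup and the current free space on the repository.

```python
GIB = 1024 ** 3

def transform_headroom_shortfall(free_bytes, full_backup_bytes, factor=1.25):
    """Return how many more bytes must be freed to reach factor x full backup size.

    Returns 0 when the repository already has enough free space.
    """
    required = int(full_backup_bytes * factor)
    return max(0, required - free_bytes)

# Hypothetical numbers: an 800 GiB full backup and 600 GiB currently free.
shortfall = transform_headroom_shortfall(600 * GIB, 800 * GIB)
print(shortfall // GIB)  # 400
```

With these sample numbers the repository needs 1000 GiB free and has 600 GiB, so a synthetic full transformation would be expected to run out of room.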
Retention enforcement fires after the backup completes, not before. Veeam writes the new restore point first, then applies retention to delete the oldest ones. On a full repository the new write fails because there is nowhere to put it. Jobs start failing and the oldest restore points do not get cleaned up because that cleanup never gets a chance to run.
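The ordering matters more than it first appears, and a toy model makes it concrete. The sketch below is not Veeam's implementation: it treats restore points as independent sizes in a list, ignores chain dependencies, and uses hypothetical capacity and retention values. It only illustrates why a full repository does not clean itself up.

```python
def run_job(points, new_size, capacity, retention):
    """Toy model of one job run: write first, apply retention only after success."""
    if sum(points) + new_size > capacity:
        # The write fails, so the retention step below is never reached.
        return list(points), False
    points = list(points) + [new_size]
    # Retention runs only after a successful write: drop the oldest points.
    while len(points) > retention:
        points.pop(0)
    return points, True

# Hypothetical repository: 100 GB capacity, retention of 10 points, 10 GB each.
full_repo = [10] * 10
print(run_job(full_repo, 10, 100, 10))      # write fails, all 10 points remain
print(run_job(full_repo[2:], 10, 100, 10))  # after freeing 2 points the write succeeds
```

However many times the first call is repeated, the result is the same, which is why space has to be freed manually before the next run can succeed.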
Someone deleting files directly from the repository filesystem outside of VBR creates a different kind of problem. The OS level free space returns but VBR still thinks those files exist. Jobs fail with "file does not exist" errors because VBR tries to reference a chain that is no longer complete on disk.
The SOBR has an additional behavior to know about. VBR uses an estimated free space calculation to prevent concurrent jobs from racing to fill an extent simultaneously. By default, VBR only refreshes that estimate when no tasks are assigned to an extent. In a busy SOBR there is almost always a task running, so the cached estimate drifts away from actual free space. Jobs then fail with "no scale-out repository extents have sufficient disk space" even though extents are not actually full. This is an estimation issue, not genuine out of space, and the fix is different.
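The drift can be pictured with a deliberately simplified model. Everything here is an assumption made for illustration: the `Extent` class, the idea that each task reserves more than it ends up writing, and the way a forced refresh simply resets the estimate to actual free space. It is not how VBR computes its estimate internally. It only shows how an estimate that refreshes solely when the extent is idle can fall well below real free space.

```python
class Extent:
    """Toy extent with an actual free space value and a cached estimate."""

    def __init__(self, actual_free, force_update=False):
        self.actual_free = actual_free
        self.estimated_free = actual_free
        self.active_tasks = 0
        self.force_update = force_update  # stands in for SobrForceExtentSpaceUpdate = 1

    def assign_task(self, reserved):
        if reserved > self.estimated_free:
            raise RuntimeError("no extents have sufficient disk space (by estimate)")
        self.estimated_free -= reserved
        self.active_tasks += 1

    def finish_task(self, written):
        self.actual_free -= written
        self.active_tasks -= 1
        # Default behavior: refresh only when the extent is idle.
        if self.active_tasks == 0 or self.force_update:
            self.estimated_free = self.actual_free

def simulate(force_update, tasks=8, reserved=100, written=20):
    """Run overlapping tasks so the extent is never idle; return (estimate, actual)."""
    extent = Extent(actual_free=1000, force_update=force_update)
    extent.assign_task(reserved)
    for _ in range(tasks - 1):
        extent.assign_task(reserved)  # next task starts before the previous one ends
        extent.finish_task(written)   # previous task ends, one task is still active
    return extent.estimated_free, extent.actual_free

print(simulate(force_update=False))  # (200, 860)
print(simulate(force_update=True))   # (860, 860)
```

In the default case the estimate has dropped to 200 while 860 is really free, so a few more assignments would be refused even though the extent is far from full.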
Triage
1. Confirm actual free space at the OS level. RDP or SSH to the repository host and check the filesystem directly:
   df -h /path/to/backup/directory
   On Windows, check drive Properties. Compare this to what VBR shows in Backup Infrastructure, Backup Repositories. A mismatch between OS reported space and VBR reported space points at an estimation issue.
2. Disable all jobs targeting the full repository. Do not let another job attempt to write while you are working on the space issue. It will fail again and may break a chain that is currently still intact.
3. Look for partially written files on the repository host: VBK or VIB files with very recent modification timestamps and unusually small file sizes compared to your typical incremental or full backup sizes. Note them. Do not delete them yet.
4. Run a Rescan on the repository. Right click the repository in Backup Infrastructure and select Rescan. After the rescan, check Home, Backups for grayed out or unavailable restore points. These indicate VBR is aware that something is missing or incomplete.
5. For a SOBR reporting "no scale-out repository extents have sufficient disk space" when OS level free space is available: this is the estimation issue, and the fix is different from a genuine out of space situation. See the SOBR estimation step in The Recovery Path and the Gotchas section.
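The first and third triage steps can be scripted as a read-only check. This is a minimal sketch, not a Veeam tool: the function names, the 24 hour window, and the "smaller than 10 percent of a typical file" rule are all assumptions to adjust for your environment. It only reports and never deletes anything.

```python
import shutil
import time
from pathlib import Path

def free_space_report(path):
    """Return total bytes, free bytes, and free percentage for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total,
        "free": usage.free,
        "free_pct": 100.0 * usage.free / usage.total,
    }

def find_suspect_files(repo_dir, typical_bytes, recent_hours=24, small_fraction=0.1):
    """List .vbk/.vib files modified recently that are much smaller than typical.

    Read-only: candidates are returned for review, nothing is removed.
    """
    cutoff = time.time() - recent_hours * 3600
    suspects = []
    for p in sorted(Path(repo_dir).rglob("*")):
        if p.is_file() and p.suffix.lower() in {".vbk", ".vib"}:
            st = p.stat()
            if st.st_mtime >= cutoff and st.st_size < typical_bytes * small_fraction:
                suspects.append((p, st.st_size))
    return suspects

# Example usage (path and typical size are placeholders):
#   print(free_space_report("/path/to/backup/directory"))
#   for path, size in find_suspect_files("/path/to/backup/directory", typical_bytes=50 * 1024**3):
#       print(path, size)
```

Compare the reported free space with what VBR shows for the repository, and keep the list of suspect files as notes for the recovery steps.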
The Recovery Path
1. Free space through the VBR console. Go to Home, Backups, and right click the oldest backup chains. Select Delete from Disk. VBR removes the files from disk and updates its database in a single coordinated operation. This is the correct way to free space. Do not delete files directly from the filesystem.
2. If files were already deleted directly from the filesystem outside of VBR: run a Rescan on the repository. After the rescan, unavailable restore points appear grayed out in VBR. Right click an unavailable restore point in the backup Properties and choose Forget to remove the VBR database record while leaving any remaining files untouched, or Remove to remove both the database record and any remaining physical files. Use Forget when you are not certain and want to preserve what is there. Use Remove when you are sure the chain is broken beyond recovery.
3. Free enough space to comfortably accommodate the next job run. The general guidance from Veeam best practices is to maintain free space equal to at least 1.25 times the size of a full backup. Freeing just enough for one incremental is not enough. A synthetic full transformation will fail again if there is no headroom beyond the normal write.
4. Run a Rescan on the repository after freeing space. This updates VBR's free space calculation and clears any stale estimates.
5. Review the job retention settings before turning the jobs back on. Open each disabled job, check the Storage tab, and confirm the restore points value makes sense for the storage you have. If you have been running 30 restore points on a 2 TB repository and data growth hit the wall, reducing retention or adding storage capacity are both valid options. But pick one before bringing the jobs back online.
6. For the SOBR estimation issue described in KB2282: add the following registry value on the VBR server and restart the Veeam Backup Service:
   Key: HKLM\SOFTWARE\Veeam\Veeam Backup and Replication\
   Value: SobrForceExtentSpaceUpdate
   Type: DWORD (32-bit)
   Data: 1
   The default behavior (Data: 0) only refreshes the cached free space when no tasks are assigned to an extent. Setting Data: 1 enables periodic recalculation of estimated free space while tasks are active, which is what fixes the drift. Per KB2282, this should only be enabled where the SOBR is configured to use Per-Machine Backup Files. Per-Machine is the SOBR default, but verify your SOBR is set that way before applying the key.
7. Turn jobs back on one at a time. Start with the most critical and watch the first run complete successfully before enabling the next.
8. Enable the backup file health check on any job that had an interrupted write. In job Properties, Storage, Advanced, on the Maintenance tab, in the Storage-level corruption guard section, check Perform backup files health check and click Configure to set the schedule. This verifies the chain you are now building from is intact before you rely on it for a restore.
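For the retention review step, a rough estimate helps decide whether the configured restore point count can fit at all. The sketch below is a simplification under stated assumptions: it models a chain as one full plus incrementals of a uniform size, ignores extra fulls kept by periodic synthetic or active fulls, and uses hypothetical sizes in GB. The function names are illustrative.

```python
def chain_footprint(full_size, incr_size, restore_points):
    """Rough chain size: one full plus (restore_points - 1) incrementals."""
    return full_size + max(0, restore_points - 1) * incr_size

def max_restore_points(capacity, full_size, incr_size, headroom_factor=1.25):
    """Largest restore point count that still leaves headroom_factor x full free."""
    budget = capacity - full_size * headroom_factor - full_size
    if budget < 0:
        return 0  # not even one full plus headroom fits
    return 1 + int(budget // incr_size)

# Hypothetical job on a 2 TB (2000 GB) repository: 500 GB full, 40 GB incrementals.
print(chain_footprint(500, 40, 30))       # 1660
print(max_restore_points(2000, 500, 40))  # 22
```

With these sample numbers, 30 restore points plus 625 GB of headroom would need 2285 GB, which does not fit in 2000 GB, while 22 restore points would. Real chains vary, so treat the result as a starting point for the retention or capacity decision.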
Gotchas
Prevention Checklist
- Configure free space alerting in VBR's repository settings. In the repository properties, VBR has a built in option to alert when free space falls below a threshold. Set this and make sure the notifications are reaching someone. Alert at 20 percent remaining, not 5 percent.
- Use Veeam ONE for capacity planning. The Capacity Planning for Backup Repositories report shows projected days remaining before each repository runs out of space, based on your current growth trend. This gives you time to act before jobs start failing.
- Size repositories to hold your backup data plus 25 percent additional headroom for synthetic full transformations. A repository sized to hold exactly your current backup data has no room to breathe.
- For SOBRs, add a capacity tier using object storage. Moving older restore points to object storage frees performance tier space for active chains without deleting data.
- Never delete backup files directly from the repository filesystem. Put this in your team's runbook as a hard rule. The correct path is always through the VBR console.
- Enable Storage-level corruption guard on all jobs. An out of space event that interrupts a write partway through a chain is exactly the situation this feature is designed to detect and recover from.
- Disable all jobs targeting the full repository before doing anything else
- Confirm OS level free space first. SOBR estimation drift can look like genuine out of space
- Free space through VBR console only. Never by deleting files directly from the filesystem
- Files already deleted outside VBR: Rescan, then Forget (keeps files) or Remove (deletes files)
- Veeam writes first then applies retention. You must free space manually before the next run succeeds
- Immutable repo: cannot force delete during the immutability window. Add storage or wait
- SOBR estimation drift: SobrForceExtentSpaceUpdate DWORD = 1, restart Veeam Backup Service. Per KB2282, only enable when SOBR uses Per-Machine Backup Files (the SOBR default)
- Free 1.25 times full backup size worth of headroom, not just enough for one incremental
- Partial VBK from interrupted synthetic full: let VBR handle it on next run, do not manually delete
- Enable health check on jobs that had interrupted writes before relying on that chain for a restore
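The alerting and capacity planning items from the prevention checklist come down to two small calculations. The sketch below is a naive linear projection with hypothetical function names and sample values. It is not how Veeam ONE builds its report, only a way to sanity check the numbers by hand.

```python
def below_alert_threshold(free_bytes, total_bytes, threshold_pct=20.0):
    """True when free space has dropped below the alert threshold percentage."""
    return 100.0 * free_bytes / total_bytes < threshold_pct

def days_until_full(free_bytes, daily_growth_bytes):
    """Linear projection of days remaining; None if the repository is not growing."""
    if daily_growth_bytes <= 0:
        return None
    return free_bytes / daily_growth_bytes

# Hypothetical repository (units in GB): 2000 total, 300 free, growing 25 per day.
print(below_alert_threshold(300, 2000))  # True
print(days_until_full(300, 25))          # 12.0
```

At 15 percent free the 20 percent alert would already have fired, leaving roughly twelve days to act at the assumed growth rate.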