Break Glass #04: Accidental Backup Deletion - Recovering After Retention Misconfiguration or Manual Deletion
Why This Happens
Retention misconfigurations are almost always human error amplified by Veeam's willingness to act immediately. Someone reduces the retention setting from 14 days to 3 days while investigating a space alert. The next job run applies the new retention and removes every restore point outside that 3-day window without prompting. Done. The storage alert is resolved and the restore window is gone. In VBR v13, retention is days-based only. Environments upgraded from v12 may still show restore-point-count retention on existing jobs, but the behavior is the same: change the number, and the next run enforces it.
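To make the effect of that change concrete, here is a minimal sketch in Python. It is a simplified model rather than Veeam's retention engine: it assumes one restore point per day and ignores chain dependencies between fulls and incrementals. The function name `points_outside_window` and the dates are hypothetical.

```python
from datetime import date, timedelta


def points_outside_window(restore_points, retention_days, today):
    """Return the restore points that fall outside a days-based window.

    Simplified model (an assumption): a point is kept only if it is
    fewer than `retention_days` days old on `today`.
    """
    return [p for p in restore_points if (today - p).days >= retention_days]


today = date(2024, 6, 15)
# Fourteen daily restore points, newest first.
points = [today - timedelta(days=n) for n in range(14)]

lost = points_outside_window(points, retention_days=3, today=today)
print(len(lost))  # 11
```

Under these assumptions, dropping the setting from 14 to 3 removes 11 of the 14 restore points on the next run.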
Direct deletion is the other path. An administrator right-clicks a backup in the VBR console and selects Delete from Disk. There is a confirmation prompt. It gets clicked through. The .vbk and all dependent .vib files are deleted from the repository. There is no recycle bin for Veeam backup data. If the repository has immutability enabled, this is blocked. If it does not, it is irreversible.
GFS (Grandfather-Father-Son) misconfiguration is more subtle. GFS transforms use existing restore points to build weekly, monthly, or yearly fulls. A misconfigured GFS job can consume the most recent restore points to synthesize a monthly full, leaving gaps in the active chain. The job reports success. The gaps only appear when someone tries to restore to a specific date.
Retention for removed VMs is another trap. The "Remove deleted items data after" setting controls what happens when a VM disappears from a job's scope. By default, this option is disabled, which means Veeam keeps the backup data forever. That sounds safe until someone enables it with a short retention period. Set to 1 day, it deletes the backup files on the next job run after the VM disappears from scope. A VM migration that removes and re-adds a VM to inventory can silently trigger this. Veeam recommends setting this to at least 7 days. In practice, 30 days is safer for any environment where VM migrations are common.
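The following Python sketch illustrates why the value matters. It is a rough model under stated assumptions, not Veeam's actual logic: it treats the data as eligible for deletion once the VM has been out of scope for at least the configured number of days, and it models the disabled option as `None`. The function name `removed_vm_data_expired` is hypothetical.

```python
from datetime import date


def removed_vm_data_expired(last_in_scope, today, setting_days=None):
    """Return True if a removed VM's data would be eligible for deletion.

    setting_days=None models the option being disabled (data kept forever).
    """
    if setting_days is None:
        return False
    return (today - last_in_scope).days >= setting_days


# A migration takes the VM out of the job's scope for two days.
last_seen = date(2024, 6, 10)
run_day = date(2024, 6, 12)
for days in (None, 1, 7, 30):
    print(days, removed_vm_data_expired(last_seen, run_day, days))
```

In this model only the 1-day setting loses data during a two-day migration gap. The 7-day and 30-day settings, and the disabled option, all leave it in place.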
Triage
1. Stop the backup job that performed the deletion immediately. Right-click the job in the VBR console and select Disable. Do not let it run again until you understand what happened. Another run with the same (wrong) retention settings will delete more restore points.
2. Identify what was deleted and when. Open Home in the VBR console. In the left pane, expand Backups. Select the backup in question. The right pane shows available restore points. Compare this to what you expected to have. Note the dates of available restore points and the gap.
3. Check the repository for orphaned files. Navigate to Backup Infrastructure, then Backup Repositories. Right-click the repository and select Rescan. After the rescan completes, go to Home and check under Backups, Disk (Orphaned). Files that VBR deleted from its database but that still exist on disk appear here. This is rare but can happen when a deletion is interrupted.
4. Check the repository file system directly. SSH or RDP to the repository host and look at the backup directory. If the .vbk or .vib files still exist on disk but VBR does not show them, you can recover them via import. If the files are gone, they are gone.
5. Check for a backup copy job. If a backup copy job was running against this backup, the copy target may have more restore points than the primary. Open the backup copy job and check its restore point count. This is often what saves the situation.
6. Check object storage (capacity tier). If the SOBR has a capacity tier pointing at object storage, older restore points that were offloaded may still exist there even if the performance tier copies were deleted. VBR can restore directly from the capacity tier.
7. Check for a replication job. A replica VM created from the backup may have its own restore points. In the VBR console, go to Home, Replicas. Look for the VM in question.
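For the file system check in step 4, a small read-only Python sketch like the one below can list what is still on disk. It only walks the directory tree and groups files by extension. It does not talk to VBR. The function name `inventory_backup_files` is hypothetical, and the script assumes Python is available on the repository host.

```python
from collections import defaultdict
from pathlib import Path

# File types named in this runbook: full, incremental, metadata.
BACKUP_SUFFIXES = {".vbk", ".vib", ".vbm"}


def inventory_backup_files(repo_root):
    """Group backup files under repo_root by folder and extension (read-only)."""
    found = defaultdict(lambda: defaultdict(list))
    for path in Path(repo_root).rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in BACKUP_SUFFIXES:
            found[str(path.parent)][suffix].append(path.name)
    # Convert to plain dicts with sorted names for stable output.
    return {
        folder: {ext: sorted(names) for ext, names in by_ext.items()}
        for folder, by_ext in found.items()
    }
```

Calling `inventory_backup_files` with the repository's backup directory returns one entry per job folder. A folder that lists .vib files but no .vbk is a chain without its full.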
Recovery Path A: Files on Disk, Not in VBR Database
1. Confirm the files exist on the repository host. The full backup file (.vbk) must be present. Without the full, the incremental chain (.vib files) cannot be used for restore even if they exist.
2. In the VBR console, go to Home. On the ribbon, click Import Backup. Browse to the .vbm metadata file for the backup chain. The .vbm is in the same directory as the .vbk file and has the same base name. If the .vbm is missing but the .vbk exists, VBR can still import the .vbk directly. VBR will rebuild the metadata.
3. After import, the backup appears under Backups, Disk (Imported) in the Home view. Right-click any restore point and perform the required restore operation.
4. After the restore is complete, fix the retention settings on the original job before re-enabling it. Open the job properties, go to the Storage tab, and set the correct retention policy value (days in v13, or restore points on a job carried over from v12). Confirm GFS settings are intentional.
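The file selection in steps 1 and 2 can be sketched as a small decision function. This Python sketch only inspects a directory and picks the file to point Import Backup at. It does not perform the import. The name `pick_import_target` is hypothetical, and taking the first match by name is a simplification for folders that hold several chains.

```python
from pathlib import Path


def pick_import_target(backup_dir):
    """Choose the file to browse to in Import Backup, or None if unusable.

    Rules taken from the steps above:
    - no .vbk present: the chain cannot be restored, so return None
    - a .vbm present: prefer the metadata file
    - otherwise: fall back to the .vbk itself
    """
    files = [p for p in Path(backup_dir).iterdir() if p.is_file()]
    fulls = sorted(p for p in files if p.suffix.lower() == ".vbk")
    if not fulls:
        return None
    metadata = sorted(p for p in files if p.suffix.lower() == ".vbm")
    # Simplification: first match by name when several candidates exist.
    return metadata[0] if metadata else fulls[0]
```

A `None` result means Path A is closed for that folder, and the next place to look is a backup copy or the capacity tier.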
Recovery Path B: Restore from Backup Copy or Object Storage
1. In the VBR console, go to Home. Expand Backups. Look under Disk (the backup copy job target) or under Object Storage (capacity tier). Find the VM you need to restore.
2. Right-click the restore point and select the appropriate restore type. For a VM restore from a backup copy, the process is identical to restoring from a primary backup. For a capacity tier restore, VBR handles the download from object storage automatically. The process is slower but the steps are the same.
3. If restoring from capacity tier (object storage), expect significantly longer restore times. Object storage throughput is limited compared to local disk. For a large VM, consider restoring to a staging location first and then migrating to production rather than doing a direct Instant VM Recovery from object storage.
4. After the restore is complete, address the root cause on the primary backup job. Fix retention settings, re-enable the job, and verify the next run completes without deleting additional data.
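To set expectations for step 3, a back-of-the-envelope estimate helps. The Python sketch below is plain arithmetic: data size divided by sustained throughput. The 2 TB size and both throughput figures are illustrative assumptions, not measured or vendor numbers, so substitute what your own link and object store actually sustain.

```python
def estimated_restore_hours(size_gb, throughput_mb_s):
    """Estimate transfer time in hours for size_gb at a sustained MB/s rate."""
    seconds = size_gb * 1024 / throughput_mb_s
    return seconds / 3600


# Hypothetical 2 TB VM at two assumed sustained rates.
print(round(estimated_restore_hours(2048, 100), 1))   # 5.8
print(round(estimated_restore_hours(2048, 1000), 1))  # 0.6
```

The estimate ignores rehydration overhead, request latency, and throttling, so treat it as a lower bound when deciding whether to restore to staging first.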
Fixing a Retention Misconfiguration Going Forward
1. Open the job properties. Go to the Storage tab. Verify the retention policy value is correct. In VBR v13, retention is configured in days. Confirm the number of days matches your intended restore window. If the job was upgraded from v12 and still shows restore-point-count retention, consider converting it to days-based retention to align with v13 defaults.
2. If GFS was the cause, open the job properties, go to Storage, Advanced, and review the GFS settings. GFS weekly/monthly/yearly fulls consume restore points from the active chain. Confirm the GFS retention counts are additive to your active chain, not replacing it.
3. Check the "Remove deleted items data after" setting. In the job properties, go to Storage, Advanced, Maintenance. If this option is enabled, confirm the retention period is appropriate. Setting this to a low value like 1 day means Veeam deletes backup data for a VM shortly after it disappears from the job scope. Veeam recommends at least 7 days. Set it to 30 days in any environment where VM migrations are common. If the option is disabled, removed VM data is kept indefinitely.
4. Re-enable the job. Monitor the first run carefully. Check the session log for any "Removing restore point" messages and confirm they match expectations before declaring recovery complete.
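If the session log from step 4 has been copied or exported to plain text, a short filter makes the removal messages easier to review. This Python sketch assumes only that each message sits on its own line and contains the phrase quoted above. The exact log layout is not specified here, and the function name `removal_messages` is hypothetical.

```python
def removal_messages(log_text, phrase="Removing restore point"):
    """Return the log lines that mention the phrase, matched case-insensitively."""
    needle = phrase.lower()
    return [
        line.strip()
        for line in log_text.splitlines()
        if needle in line.lower()
    ]
```

An empty result on the first run after the fix is the expected outcome when the corrected window still covers every remaining restore point.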
The Hard Stop
Gotchas
Prevention Checklist
- Run all backup target repositories with immutability enabled. A Hardened Linux Repo, Object First OOTBI, or S3 with Object Lock prevents Delete from Disk from taking effect during the immutability window.
- Require change control for any retention setting change. A one-line config change that reduces restore points from 14 to 3 should go through the same process as any other infrastructure change.
- Run a backup copy job for every critical workload. The copy job is your safety net when the primary job's retention is misconfigured.
- Enable and set "Remove deleted items data after" to at least 30 days on every job where VM migrations are possible. This option is disabled by default. If you have not enabled it, removed VM data is kept forever. If you have enabled it, verify the value is not set dangerously low.
- Use RBAC to restrict who can modify job properties and who can delete from disk. Operators do not need write access to retention settings.
- Enable a capacity tier (object storage) on SOBRs for additional long-term retention without increasing on-prem storage costs.
- Test restores from your backup copy jobs quarterly. Many environments discover the copy job was misconfigured only when they try to use it.
Key Takeaways
- Disable the job immediately. Another run with wrong retention settings deletes more
- Rescan the repository. Orphaned files sometimes survive a deletion and can be imported
- Check backup copy, capacity tier, and replicas before accepting data loss
- Files on disk but not in VBR: use Import Backup on the .vbm or .vbk file
- Files deleted from disk: no VBR-side recovery. Immutability would have blocked this
- Retention change takes effect on next run, not immediately. You have a window to revert
- GFS transformations consume existing restore points. Configure additive, not replacing, retention
- Removed VM retention is disabled by default (data kept forever). If enabled with a short period, it deletes data fast
- Performance policy SOBR splits full and incrementals. Losing one extent loses the whole chain
- Immutability is the only reliable protection against accidental or intentional deletion