Break Glass #10: Backup Chain Corruption: Recovering a Broken Forward Incremental Chain in Veeam v13

Break Glass // Scenario 10
A restore fails with "Backup files are unavailable." Or a job reports that restore points are missing in the chain. Or the health check fired and flagged corruption in an incremental VIB file. You have a forward incremental chain that is broken somewhere in the middle and you need to figure out what you can still recover and how to get the job healthy again.

Why This Happens

A forward incremental chain is a sequence of files: one full backup (VBK) followed by one or more incrementals (VIB). Each incremental depends on the file before it. Break any link in that chain and every restore point after the break becomes inaccessible. The full and everything before the break may still be fine.
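The dependency rule is easy to model. The sketch below is a toy illustration, not anything VBR exposes: the `BackupFile` class, the file names, and the `healthy` flag are all hypothetical. It walks a chain in order and keeps only the restore points whose full and every preceding incremental are intact.

```python
from dataclasses import dataclass


@dataclass
class BackupFile:
    name: str      # hypothetical file name, e.g. "mon.vbk" or "tue.vib"
    kind: str      # "full" (VBK) or "inc" (VIB)
    healthy: bool  # False if the file is missing or corrupt


def recoverable_points(chain):
    """Return names of restore points that can still be restored.

    A restore point is usable only if its full backup and every
    incremental between that full and the point itself are healthy.
    """
    usable = []
    intact = False  # True while the current section is unbroken so far
    for f in chain:
        if f.kind == "full":
            intact = f.healthy  # a new full starts a new section
        else:
            intact = intact and f.healthy
        if intact:
            usable.append(f.name)
    return usable


chain = [
    BackupFile("mon.vbk", "full", True),
    BackupFile("tue.vib", "inc", True),
    BackupFile("wed.vib", "inc", False),  # the broken link
    BackupFile("thu.vib", "inc", True),
    BackupFile("fri.vib", "inc", True),
]
print(recoverable_points(chain))  # ['mon.vbk', 'tue.vib']
```

Thursday and Friday are healthy files, but they are unreachable because Wednesday is not.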

Storage hardware failure partway through a write is the most direct cause. A disk fails, a RAID array degrades, the repository host crashes while a VIB file is being written. The resulting file is truncated or corrupt. The health check finds it on the next run. Without health check enabled, the corruption sits silently until someone tries to restore from a point past the break.

Retention database issues are a subtler cause. During a retention enforcement pass, if the VBR server or database is interrupted partway through an operation, the database record for a chain can end up in an inconsistent state. The files on disk are fine but VBR thinks incrementals exist without a valid full at the start of the chain. Jobs fail with errors like "Full Storage Not Found" or complaints about corrupted storage metadata. These errors point to a database inconsistency, not storage corruption, and Veeam Support can often resolve the issue without data loss.

Manual file deletion is the third cause. Someone deletes a VBK or VIB file directly from the repository filesystem. VBR still tracks the chain in its database. Restore attempts fail because the file no longer exists. Jobs also fail because VBR finds the chain is incomplete before writing a new incremental.

Forever incremental chains amplify the blast radius. A chain with no periodic synthetic fulls that runs for months has a very long dependency chain. A single corrupted incremental six months in breaks every restore point after that date. Periodic synthetic fulls divide the chain into separate sections and limit the corruption blast radius to the section containing the corrupted file.
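A rough way to put numbers on the blast radius, assuming one restore point per day and ignoring retention merges. The function name and parameters are made up for this illustration.

```python
def lost_restore_points(total_days, corrupt_day, full_every=None):
    """Count daily restore points made unusable by one corrupt file.

    total_days  -- number of daily restore points, numbered 0..total_days-1
    corrupt_day -- index of the corrupted restore point
    full_every  -- days between periodic fulls, or None for a single
                   full on day 0 followed only by incrementals
    """
    if full_every is None:
        section_end = total_days
    else:
        # The next full after the corruption starts an independent section.
        section_end = min(total_days, (corrupt_day // full_every + 1) * full_every)
    return section_end - corrupt_day


print(lost_restore_points(365, 180))                # 185
print(lost_restore_points(365, 180, full_every=7))  # 2
```

With a single full at the start, corruption on day 180 of a 365 day chain costs 185 restore points. With a full every 7 days, the same corruption costs 2.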

Triage

  1. Open the VBR console. Go to Home, Backups, Disk. Find the backup for the affected job. Right click the backup and select Properties. Look at the restore point list. Unavailable or grayed out restore points have missing or corrupt backing files. Note the date of the first unavailable restore point. Everything after that date in the same section of the chain is also broken.
  2. Rescan the repository. Right click the repository in Backup Infrastructure and select Rescan. After the rescan, refresh the backup properties view. Rescanning sometimes resolves apparent corruption that was a database sync issue rather than actual file damage.
  3. Check whether the backup files physically exist on the repository host. SSH or RDP to the repo and navigate to the backup directory. Confirm the VBK and VIB files listed in VBR are actually present on disk. Missing files confirm physical deletion or storage failure. Files that are present on disk but shown as unavailable in VBR suggest a database issue.
  4. Check the job session log for the error message type. "Backup files are unavailable" or "File does not exist" points to missing files. Errors about corrupted storage metadata or "Full Storage Not Found" point to a database side inconsistency. These have different recovery approaches.
  5. Determine what restore points you can still recover from. The VBK full and all VIB files before the corrupted one are typically still good. In the Properties window, attempt a test restore from the last available restore point before the break. If that succeeds, note the date. That is your current recovery window.
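The on-disk check in step 3 can be scripted if the chain is long. This is a minimal sketch that assumes you have copied the expected file names out of the VBR Properties view by hand. The function name and the sample file names are hypothetical, and the demo uses a throwaway directory so it runs anywhere.

```python
import tempfile
from pathlib import Path


def check_chain_files(repo_dir, expected_names):
    """Classify expected backup files as present, empty, or missing."""
    repo = Path(repo_dir)
    report = {"present": [], "empty": [], "missing": []}
    for name in expected_names:
        path = repo / name
        if not path.is_file():
            report["missing"].append(name)
        elif path.stat().st_size == 0:
            report["empty"].append(name)  # zero bytes: likely a truncated write
        else:
            report["present"].append(name)
    return report


# Demo against a temporary directory standing in for the repository path.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "job.vbk").write_bytes(b"full backup data")
    Path(tmp, "job_0001.vib").write_bytes(b"")
    report = check_chain_files(tmp, ["job.vbk", "job_0001.vib", "job_0002.vib"])

print(report)
```

A file that is present and non-empty is not proven intact. This only separates "file is gone" from "file is there", which is the distinction that picks the recovery path.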
Decision Point
  • Files missing from disk (storage failure, manual deletion): Recovery Path A.
  • Files exist but VBR shows metadata corruption or database inconsistency: Recovery Path B.
  • Health check found corruption in a VIB file: Recovery Path C.
  • Entire chain is unrecoverable and you need to start over: Recovery Path D.
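The decision point, written out as a small sketch. The function names and return labels are hypothetical. The only message strings matched are the ones quoted in the triage steps, so add whatever metadata corruption wording shows up in your own session logs.

```python
MISSING_FILE_ERRORS = ("backup files are unavailable", "file does not exist")
DATABASE_ERRORS = ("full storage not found",)  # extend with messages from your logs


def classify_error(message):
    """Map a session log message to a rough symptom class."""
    text = message.lower()
    if any(e in text for e in MISSING_FILE_ERRORS):
        return "missing-files"
    if any(e in text for e in DATABASE_ERRORS):
        return "database"
    return "unknown"


def choose_recovery_path(files_on_disk, full_usable=True, health_check_flagged=False):
    """Pick Recovery Path A-D from the triage findings."""
    if not full_usable:
        return "D"  # no usable full: the chain cannot be rebuilt
    if health_check_flagged:
        return "C"  # health check reported corruption in a VIB
    if not files_on_disk:
        return "A"  # files are physically gone
    return "B"      # files exist, so suspect the database records


print(classify_error("Error: Full Storage Not Found"))
print(choose_recovery_path(files_on_disk=True))
```

The order of the checks matters. A missing or corrupt full overrides everything else, because no other path can succeed without it.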

Recovery Path A: Files Missing from Disk

  1. In VBR, open the backup Properties. Right click the first unavailable restore point. You have two options. Forget removes the record from the VBR database but leaves any remaining files on disk untouched. Choose this if you are unsure and want to investigate further. Remove from disk removes the record and deletes the backing files from disk. Choose this only when you are certain the chain from that point forward is not recoverable.
  2. When prompted, select "This and dependent backups" to remove the broken restore point and every restore point that depends on it. Selecting "All unavailable backups" removes every unavailable restore point across the entire backup, which may be more than you intend.
  3. After the unavailable restore points are removed, run the backup job manually. VBR should pick up the chain from the last intact restore point and continue incrementally. Watch the session log to confirm it runs without errors.
  4. If the last intact restore point is the VBK full itself, and the chain forward from there is broken, VBR will start a new incremental chain from the existing full. The full restore point remains valid and accessible.
  5. If the VBK full itself is missing or corrupt, the entire chain is unrecoverable. Go to Recovery Path D.
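The difference between the two prompt options in step 2 is easier to see on a small example. This sketch models the semantics as described above and has not been checked against VBR itself. The tuple layout and function names are assumptions for illustration.

```python
def this_and_dependent(points, selected):
    """Indices removed when the point at `selected` and its dependents go.

    `points` is a list of (kind, available) tuples in chain order, where
    kind is "full" or "inc". Dependents are the later incrementals up to,
    but not including, the next full.
    """
    removed = [selected]
    for i in range(selected + 1, len(points)):
        if points[i][0] == "full":
            break
        removed.append(i)
    return removed


def all_unavailable(points):
    """Indices of every unavailable restore point in the whole backup."""
    return [i for i, (_, available) in enumerate(points) if not available]


points = [
    ("full", True), ("inc", True), ("inc", False), ("inc", True),  # section 1
    ("full", True), ("inc", False), ("inc", True),                 # section 2
]
print(this_and_dependent(points, 2))  # [2, 3]
print(all_unavailable(points))        # [2, 5]
```

The first option stays inside the section you clicked on. The second reaches into section 2 as well, which is the "more than you intend" case.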

Recovery Path B: Database Inconsistency (Files Exist, Metadata Corrupt)

  1. Do not delete anything yet. Errors like "Full Storage Not Found" or complaints about corrupted storage metadata, where the files are physically present on disk, are usually a VBR database issue rather than storage corruption. Veeam Support can typically resolve this without data loss by correcting the database records. Open a support case and export logs per KB1832 (Help, Support Information in the VBR console, then walk through the Export Logs wizard).
  2. While waiting for support, create a parallel backup job for the same VMs targeting a different repository. This ensures you have a fresh, clean chain building in parallel. Do not delete the original chain until support has assessed whether it is recoverable.
  3. If you cannot wait for support and need the job running, rescan the repository, then use Forget on the unavailable restore points to clear the broken database records. VBR will resume the job from the last intact point. The files on disk remain untouched. Forget only removes database records. If support later determines the chain is recoverable, the files are still there.

Recovery Path C: Health Check Detected Corruption in a VIB File

Veeam's storage level corruption guard (health check) detected corrupted data during a scheduled health check pass. VBR completes the backup job with the Error status and starts a health check retry process to rebuild the chain.

  1. Let the health check retry complete. The retry starts as a separate backup job session. Its behavior depends on where the corruption was found. If corrupted metadata was found in an incremental, VBR removes records of that incremental and every subsequent incremental from the configuration database, then transports new incremental data relative to the latest valid restore point and writes a new incremental file. If corrupted data blocks were found in a full or incremental file, VBR marks the affected restore point and subsequent points as corrupted and transports data blocks from the source datastore during the retry. Either way, the next time you look at the chain, it should be healthy from the retry point forward.
  2. Note: on Hardened Linux Repositories, health check detection works but automatic repair does not. The official Veeam documentation states: "Linux immutable repositories do not support repair. If the health check detects corrupted data, Veeam Backup and Replication marks the restore point as corrupted in the configuration database and finishes the health check session." You must run an active full backup to start a new chain. If you do not, every subsequent incremental will complete with Error status.
  3. After the retry completes, check the session log and confirm the job session shows Success or Warning with no further corruption messages. Verify that restores from the repaired chain work correctly before trusting it for production recovery.
  4. Investigate the root cause of the corruption. A single corrupted VIB is a warning sign. If storage level issues caused it, expect more corruption. Check the repository host's storage health: SMART data for drives, RAID controller event log, and filesystem integrity.
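For the SMART part of step 4 on a Linux repository host, a sketch along these lines can poll each drive. It assumes smartmontools 7.0 or later (for the `--json` option) and that the output carries a `smart_status.passed` field. The function names and the sample JSON are made up for illustration, and `query_drive` is defined but not run here because it needs real hardware and usually root.

```python
import json
import subprocess


def smart_passed(smartctl_json_text):
    """Return True/False from smartctl JSON, or None if no verdict is present."""
    data = json.loads(smartctl_json_text)
    status = data.get("smart_status")
    if not isinstance(status, dict) or "passed" not in status:
        return None
    return bool(status["passed"])


def query_drive(device):
    """Run smartctl health check for one device, e.g. "/dev/sda"."""
    result = subprocess.run(
        ["smartctl", "--json", "-H", device],
        capture_output=True, text=True,
        check=False,  # smartctl signals drive problems through its exit code
    )
    return smart_passed(result.stdout)


# Hand-written sample standing in for real smartctl output.
sample = '{"device": {"name": "/dev/sda"}, "smart_status": {"passed": false}}'
print(smart_passed(sample))  # False
```

A passing SMART verdict does not clear the drive. Drives behind a hardware RAID controller often need controller-specific options or the vendor's own tooling, so treat this as one input alongside the RAID event log and filesystem checks.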

Recovery Path D: Entire Chain Unrecoverable

The VBK is missing, corrupt, or the metadata is so broken that VBR cannot reconstruct a usable chain. All restore points from this chain are lost.

  1. Check for alternate restore sources before accepting total loss: backup copy job target, SOBR capacity tier, replication target, tape archive.
  2. Remove the broken chain from VBR. In Home, Backups, Disk, right click the broken backup and select Remove from disk. This clears the database records and removes the physical files. If the physical files are already gone, use Forget instead.
  3. Investigate the storage failure that caused the VBK corruption before you start a new chain. A new chain written to a storage system that is actively degrading will reproduce the corruption.
  4. Run the job, enabling it first if it was disabled. With no existing chain to continue, VBR creates a new full backup on the next run and starts a fresh chain. The first run will take significantly longer than normal because it is a full backup, not an incremental.

Gotchas

Forget vs Remove from Disk. Know Which One You Are Using
Forget removes the database record only. The files remain on disk. Use this when you are not certain and want to investigate further. Remove from disk deletes both the database record and the physical files. Once removed from disk, the data is gone. If the repository has immutability enabled, Remove from disk cannot delete files that are still within the immutability window. It removes only the database record in that case. Read the VBR confirmation dialog carefully before clicking.
Corruption Before Health Check Runs Is Silent
Without storage level corruption guard enabled on the job, corruption in a VIB file goes undetected. VBR calculates checksums for the blocks it receives and stores them with the data. If the storage later returns different bytes than were written, nothing notices until those blocks are read again. The health check rereads the blocks from the backup file, recalculates their checksums, and compares them with the checksums stored when the backup was created. This is how it catches storage level corruption that happened silently. Without health check, the corruption sits in the chain until a restore is attempted. Enable health check on all jobs.
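The general idea behind checksum-based verification fits in a few lines. This is an illustration of the technique only, not Veeam's on-disk format or algorithm: the block size, function names, and use of CRC32 are assumptions chosen for the demo.

```python
import zlib

BLOCK_SIZE = 4  # tiny block size for the demo; real block sizes are far larger


def split_blocks(data, size=BLOCK_SIZE):
    return [data[i:i + size] for i in range(0, len(data), size)]


def write_backup(data):
    """Simulate a write: keep the blocks plus a CRC32 recorded for each."""
    blocks = split_blocks(data)
    checksums = [zlib.crc32(b) for b in blocks]
    return blocks, checksums


def health_check(blocks, checksums):
    """Reread each block, recompute its CRC32, return indices that differ."""
    return [i for i, (b, c) in enumerate(zip(blocks, checksums))
            if zlib.crc32(b) != c]


blocks, checksums = write_backup(b"ABCDEFGHIJKL")
clean = health_check(blocks, checksums)
blocks[1] = b"EFxH"  # simulate silent damage on disk
damaged = health_check(blocks, checksums)
print(clean, damaged)  # [] [1]
```

Nothing in `write_backup` can notice the later damage. Only a pass that rereads the stored blocks finds it, which is why a chain with no health check stays "green" until restore day.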
Immutable Repositories Cannot Be Repaired
The health check repair process requires writing new data over the corrupted restore point. Immutable files cannot be modified during the immutability window. If health check detects corruption in an immutable repo, VBR marks the restore point corrupted and stops. The only ways forward are running an active full backup to start a new chain or waiting for the immutability window to expire. Plan for this: immutable repos should have backup copy jobs to a second location for exactly this scenario.
Forever Incremental Chains Have Unbounded Blast Radius
A forward incremental chain with no periodic fulls is one long dependency chain that keeps growing for as long as retention allows. Every restore point after the VBK is a link in that single chain. One corrupted VIB six months into the chain makes every restore point after it inaccessible. The VBK and the first six months of VIBs may still be fine, but everything from the corruption forward is gone. Periodic synthetic fulls divide the chain into separate sections. Corruption in one section does not affect other sections. Configure a synthetic or active full at least weekly for any production workload.
Database Inconsistency Errors Are Not Always Storage Corruption
Errors like "Full Storage Not Found" and complaints about corrupted storage metadata sound alarming, but they are often VBR database issues where the chain records are out of sync with the files on disk, not actual data corruption. The files may be perfectly intact. Do not delete the chain immediately. Rescan the repository and open a Veeam support case before removing anything. Veeam Support can correct database inconsistencies without touching the backup files.

Prevention Checklist

  • Enable storage level corruption guard (backup file health check) on every job. Schedule it to run weekly. This is the difference between catching corruption during a health check and discovering it during a production restore.
  • Configure periodic synthetic or active fulls on all jobs. Weekly is the standard. Forever incremental chains have unbounded blast radius when corruption hits.
  • Run SureBackup jobs against your most critical VMs. SureBackup is the only way to verify that the application inside the backup is actually recoverable, not just that the backup files are intact.
  • For immutable repositories, run a backup copy job to a separate target. When health check detects corruption on an immutable repo, you need an alternate recovery path that does not require the immutability window to expire.
  • Keep the repository storage healthy. Monitor SMART data on drives, RAID controller event logs, and repository host system logs. Chain corruption is usually a symptom of a storage health problem that will repeat.
  • Never delete backup files directly from the repository filesystem. Always use the VBR console.
Break Glass Recap
  • Open backup Properties in VBR to see which restore points are unavailable and where the break is
  • Rescan the repository first. Some apparent corruption resolves as a sync issue
  • Files missing: Forget (keep files) or Remove from disk (delete files) on unavailable restore points
  • Choose "This and dependent backups" to remove only the broken section, or "All unavailable backups" to remove every unavailable restore point in the backup
  • Files present but metadata corrupt: open a Veeam support case before deleting anything
  • Health check detected corruption: let the health check retry complete. On Linux immutable repos, run an active full instead
  • Entire chain gone: check backup copy and capacity tier before accepting total loss
  • Forget only removes the database record. Files stay on disk
  • Remove from disk deletes both database record and files. Irreversible
  • Immutable repos: repair is not available. Run an active full to start a new chain, or wait for the immutability window to expire
  • Forever incremental chains have unbounded blast radius. Use periodic synthetic fulls
