How to Fix a Corrupted Time Machine Backup

The other day Time Machine stopped working for me and kept giving me errors about mounting the backup image. After firing up Console.app I found that the sparsebundle in which Time Machine stores the backup data was corrupted. This was a bad thing: if I couldn’t find a way to fix the image, all my backups for the last 6 months would be lost :(

My first step on the road to fixing the problem was to open the bundle with “Disk Utility” and see if “Repair Disk” would work. To do this, drag the backup image (the .sparsebundle) and drop it on Disk Utility. Then select the image and press “Repair Disk”. Usually, if you have some disk corruption, this fixes it, but not for me.

After working for a couple of minutes, Disk Utility failed to repair the disk because it found an “Invalid Sibling Link”. The disk was corrupted and Disk Utility had decided it wasn’t going to repair it. Worried that my backups were gone for good, I did some Googling and found that this problem isn’t uncommon. It seems that Disk Utility is just fairly bad at repairing disks, and that using the “real” disk repair utility, fsck_hfs, might actually fix the problem.

Of course there’s one problem: fsck_hfs has to be run directly on a block device, not on a “sparsebundle” disk image. Not only that, but the volume you run it on cannot be mounted, and OS X always tries to mount the image when you click on it. The solution? Use the command line tool hdiutil to “attach” but not “mount” the image. This makes OS X create a /dev/ device for the sparse bundle but doesn’t actually mount the filesystem.

So combining hdiutil and fsck_hfs we might have a way to fix the disk image. Note: Make sure you turn off Time Machine in the System Preferences panel. We don’t want Time Machine trying to mount our image and back up to it until we’re done making all the repairs.

First I ran hdiutil:

hdiutil attach -nomount -readwrite Bhaal_0011247e3338.sparsebundle

After a minute or two of thinking we had success:

/dev/disk1              Apple_partition_scheme
/dev/disk1s1            Apple_partition_map
/dev/disk1s2            Apple_HFSX
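
If you'd rather not pick the HFS slice out of that listing by eye, a small filter can do it. This is only a sketch: hfs_device is a made-up helper name, not part of hdiutil, and it assumes the two-column output shown above (device node, then partition type). Fed the three lines above, it prints /dev/disk1s2.

```shell
#!/bin/sh
# Hypothetical helper (not part of hdiutil): read the output of
# `hdiutil attach -nomount` on stdin and print the first device node
# whose partition type is Apple_HFS or Apple_HFSX.
hfs_device() {
    awk '$2 ~ /^Apple_HFSX?$/ { print $1; exit }'
}

# Usage on the Mac (left commented out so loading this file runs nothing):
#   dev=$(hdiutil attach -nomount -readwrite Bhaal_0011247e3338.sparsebundle | hfs_device)
#   echo "$dev"
```

The filter matches on the partition type rather than assuming the slice is always s2, since the slice number depends on the partition scheme of the image.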

The next step was to run fsck_hfs on the main volume.

fsck_hfs -rf /dev/disk1s2

-f is required to force a check since this is a journaled file system. I also used -r to have it rebuild the filesystem catalog; for the “invalid sibling link” error this is required (thanks Dan). At this point I went out for breakfast, since I was running a disk repair utility over my wireless network to my Linux Samba server. After coming back from breakfast we had success!

** /dev/rdisk1s2
** Checking Journaled HFS Plus volume.
** Detected a case-sensitive catalog.
** Checking Extents Overflow file.
** Checking Catalog file.
** Rebuilding Catalog B-tree.
** Rechecking volume.
** Checking Journaled HFS Plus volume.
** Detected a case-sensitive catalog.
** Checking Extents Overflow file.
** Checking Catalog file.
   Incorrect number of thread records
(4, 13716)
** Checking multi-linked files.
** Checking Catalog hierarchy.
   Invalid directory item count
   (It should be 0 instead of 1)
   Invalid directory item count
   (It should be 3 instead of 4)
   Incorrect folder count in a directory (id = 3795486)
   (It should be 0 instead of 1)
** Checking Extended Attributes file.
** Checking multi-linked directories.
** Checking volume bitmap.
** Checking volume information.
   Invalid volume free block count
   (It should be 37267681 instead of 37310834)
   Volume Header needs minor repair
(2, 0)
** Repairing volume.
** Rechecking volume.
** Checking Journaled HFS Plus volume.
** Detected a case-sensitive catalog.
** Checking Extents Overflow file.
** Checking Catalog file.
** Checking multi-linked files.
** Checking Catalog hierarchy.
** Checking Extended Attributes file.
** Checking multi-linked directories.
** Checking volume bitmap.
** Checking volume information.
** The volume Backup of Bhaal was repaired successfully.
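
One step the log doesn't show is cleaning up: the image is still attached as /dev/disk1, so it needs to be detached before Time Machine is switched back on. Here is a rough sketch of how that could be scripted. The helper names fsck_ok and whole_disk are invented for this example, and it assumes you saved the fsck_hfs output to a file (fsck.log here is just a placeholder name).

```shell
#!/bin/sh
# Hypothetical helpers for wrapping up; the names are made up for this sketch.

# Succeeds if the last line of a saved fsck_hfs log (on stdin) is one of the
# two messages fsck_hfs prints for a healthy or a repaired volume.
fsck_ok() {
    tail -n 1 | grep -Eq 'was repaired successfully|appears to be OK'
}

# Strip the slice suffix from a device node: /dev/disk1s2 -> /dev/disk1
whole_disk() {
    printf '%s\n' "$1" | sed -E 's/s[0-9]+$//'
}

# Usage on the Mac (left commented out so loading this file runs nothing):
#   fsck_hfs -rf /dev/disk1s2 | tee fsck.log
#   fsck_ok < fsck.log && hdiutil detach "$(whole_disk /dev/disk1s2)"
```

Only the final line of the log is checked because fsck_hfs rechecks the volume after each repair pass, so earlier passes can report errors even when the run ends well. Once the detach succeeds, Time Machine can be turned back on in System Preferences.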