Linux notes

Recovering data from a failing IDE disk drive using dd_rescue

It's a common nightmare - a hard drive containing precious data starts playing up and eventually won't even boot the computer any more. Is your data lost? Is the only recourse to send it out to an expensive data-recovery company, with no guarantee of success?

We'll look at how to attempt some effective data recovery using Linux. The tool we'll base our attempts around is called dd_rescue, written by Kurt Garloff. This program tries to read all the data from one 'block device' and copy it to another. The term 'block device' is explained in the Concepts section below.

Concepts

There are several concepts you need to know about before we dive into the 'fun' of data recovery.

Block device

Although purists would disagree, for our purposes a block device can be considered as a partition on a hard disk drive. In our case, we are assuming the hard disk drive is an IDE device, so we can start our search for them by listing entries in the directory /proc/ide, like this:
> ls /proc/ide
drivers  hda  hdc  ide0  ide1
IDE drives are given the prefix hd followed by a single letter, so you can see there are two distinct physical devices attached to this machine. One is the master device on the primary interface of the first IDE controller (hda); the other is the master device on the secondary interface of the same controller (hdc). So we have one IDE controller, two interfaces (primary and secondary, shown as ide0 and ide1) and two IDE devices connected, each one configured as the master device on its particular interface.
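
The naming scheme described above can be sketched as a small lookup. This is only an illustration: ide_position is a hypothetical helper name, and it assumes the conventional assignment of hda to hdd across the two interfaces of the first controller.

```shell
#!/bin/sh
# Map a legacy IDE device name to its position on the first controller.
# ide_position is a hypothetical helper, not part of any standard tool.
ide_position() {
    case "$1" in
        hda) echo "primary master" ;;
        hdb) echo "primary slave" ;;
        hdc) echo "secondary master" ;;
        hdd) echo "secondary slave" ;;
        *)   echo "unknown" ;;
    esac
}

# The two devices seen in the listing above.
for dev in hda hdc; do
    printf '%s: %s\n' "$dev" "$(ide_position "$dev")"
done
```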

To get more of a clue, let's take a look at the contents of the file fstab.

> cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
/dev/hdc1       /               ext3    defaults,errors=remount-ro 0       1
/dev/hdc6       /home           ext3    defaults        0       2
/dev/hdc7       /recovery       ext3    defaults        0       2
/dev/hdc5       none            swap    sw              0       0
/dev/hda        /media/cdrom0   udf,iso9660 user,noauto     0       0
The block devices are listed in the left-hand column, with the /dev prefix, e.g. /dev/hda. The next column gives the mount point, i.e. the directory you must access in order to see the contents of the device. In the case of /dev/hda, the mount point is /media/cdrom0, which makes it apparent that this device is a CDROM drive (rather than a hard disk drive).

The other block device listed that also showed up in /proc/ide is /dev/hdc. Now however, numbers are appended giving hdc1, hdc5, hdc6 and hdc7. Strictly speaking, these new entries refer to partitions of the physical hard disk drive, hdc. Although they all exist on the same physical hard disk, they are distinct as far as Linux is concerned, and they are still 'block devices'. It is these block devices that we specify to dd_rescue as being either the source or destination of the data we are trying to recover.
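
As a rough sketch of the relationship between fstab entries, physical drives and partitions, the two helpers below pull the /dev entries out of an fstab-style file and split a name such as hdc7 into its drive and partition number. Both function names are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Print "device mountpoint" for every fstab line whose first field
# starts with /dev/. Comment lines and entries such as proc are skipped.
list_fstab_devices() {
    awk '$1 ~ /^\/dev\// { print $1, $2 }' "$1"
}

# Split a name like /dev/hdc7 (or hdc7) into drive and partition number.
# A whole-drive name such as hda yields an empty partition field.
split_partition() {
    name=${1#/dev/}            # drop the /dev/ prefix if present
    disk=${name%%[0-9]*}       # everything before the first digit
    part=${name#"$disk"}       # whatever digits remain
    printf 'disk=%s partition=%s\n' "$disk" "$part"
}

# Example, using a name from the fstab shown above:
split_partition /dev/hdc7
```

Running list_fstab_devices /etc/fstab on the machine described here should list /dev/hdc1, /dev/hdc6, /dev/hdc7, /dev/hdc5 and /dev/hda with their mount points.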

Building a recovery system

Assuming that the machine we build for our recovery work will not have a CDROM attached to it, there will once again be two drives connected. One will contain the Linux installation and the other will be the one from which we are trying to recover the data. For now, let's further assume that our 'operating system' drive will be hdc and that the one we are trying to get data from will be hda. dd_rescue needs to have somewhere to put all the data it recovers from hda - where will it go?

The answer is that it needs another partition of sufficient size on hdc. This is the partition you specify as the 'destination' to dd_rescue. Naturally, this partition should be at least as big as (and preferably bigger than) the drive you are trying to recover.
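
The size requirement can be checked before starting. The sketch below compares two sizes in bytes; destination_is_big_enough is a hypothetical name, and the figures in the example are made up for illustration. On a real system the sizes could be obtained with blockdev --getsize64 (run as root), as shown in the comments.

```shell
#!/bin/sh
# Succeed when the destination holds at least as many bytes as the source.
# Arguments: source size in bytes, destination size in bytes.
destination_is_big_enough() {
    src_bytes=$1
    dest_bytes=$2
    [ "$dest_bytes" -ge "$src_bytes" ]
}

# On the recovery machine the real sizes could be read like this (as root):
#   src=$(blockdev --getsize64 /dev/hda)
#   dest=$(blockdev --getsize64 /dev/hdc7)
# Illustrative figures only: an 80 GB source and a 120 GB destination.
if destination_is_big_enough 80000000000 120000000000; then
    echo "destination is large enough"
else
    echo "destination is too small"
fi
```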

Attaching the damaged drive

I noticed some weirdness when swapping out my CDROM drive - seen in the fstab file as hda - for the damaged drive. The system had been configured to attempt to boot first from the CDROM (hda) and then from what the BIOS describes as HDD-0 (the 'first' hard disk). After I removed the CDROM and replaced it with the damaged hard drive, the system promptly tried to boot from it! In fact, this does make sense; the damaged drive would now appear as hda, and technically speaking this would now be the 'first' hard drive, rather than hdc. The fix was to change the BIOS settings to make the first boot device HDD-1 (which in our case is hdc).

To avoid conflicting with the CDROM entry in fstab (hda), I made the newly attached hard drive a slave by changing its jumper settings. Thus, after removing the CDROM and adding the damaged drive, plus changing the BIOS as described above, I get:

> ls /proc/ide
drivers  hdb  hdc  ide0  ide1
Note that hda - the CDROM, formerly - has gone, to be replaced by hdb. This is the block device corresponding to the damaged disk.
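
One way to double-check that hdb really is the hard disk, and not some other device, is to read its media entry under /proc/ide, which the legacy IDE driver fills in with a value such as disk or cdrom. The helper below is a minimal sketch; ide_media is a hypothetical name, and the optional second argument exists only so that the function can be pointed at a different directory.

```shell
#!/bin/sh
# Print the media type recorded for an IDE device, or "absent" if the
# device has no entry. Arguments: device name, optional /proc/ide root.
ide_media() {
    proc_root=${2:-/proc/ide}
    cat "$proc_root/$1/media" 2>/dev/null || echo "absent"
}

# Example: report on the devices discussed above.
for dev in hda hdb hdc; do
    printf '%s: %s\n' "$dev" "$(ide_media "$dev")"
done
```

On the machine described here, hda would be expected to show as absent and hdb and hdc as disk.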

Attempting the recovery

With our corrupt disk visible to Linux as hdb and our operating system and storage disk visible as hdc, we just need to know the proper names of our block device partitions and we are ready to go.
> dd_rescue /dev/hdb /dev/hdc7
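
Because the destination here is a raw partition, the copy overwrites whatever filesystem that partition currently holds, and the fstab shown earlier mounts /dev/hdc7 at /recovery. It may therefore be worth confirming that the destination is not mounted before running the command. The sketch below is one way to do that; is_mounted is a hypothetical helper that looks the device up in /proc/mounts (or in another file given as a second argument).

```shell
#!/bin/sh
# Succeed if the given device appears in the first column of the mounts file.
# Arguments: device path, optional mounts file (defaults to /proc/mounts).
is_mounted() {
    mounts_file=${2:-/proc/mounts}
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "$mounts_file"
}

# Possible use on the recovery machine (as root), left commented out here:
#   if is_mounted /dev/hdc7; then
#       umount /dev/hdc7
#   fi
#   dd_rescue /dev/hdb /dev/hdc7
```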