PCLinuxOS Magazine

Hard Drive Care & Repair


by Phil

The hard drive is a modern mechanical marvel. The disk platters are coated with ferromagnetic material and typically rotate at 5,400 or 7,200 rpm. The heads fly only nanometres above the platter surface, and a modern drive can read data at well over 100 MB per second. Because hard drives are mechanical devices, they will eventually fail. If you are lucky, you will get warnings and can do something about it. Otherwise, you may suffer an instant, catastrophic failure and everything on that drive may be lost.



Hard drives are a key component of a system. Luckily for Linux users, the system regularly checks and maintains the hard drive. You can expect a typical hard drive to last around 5 years. Some will fail quickly, and others seem to last for decades (there are plenty of 10-year-old drives in use).

Your hard drive will fail. If you have valuable files, you should back them up to another drive, or elsewhere. If they are really valuable, back them up multiple times, and check that your backed-up files really do exist. Furthermore, make sure important backups are completely separate from your system, as malicious crackers and pedantic machines can and will wantonly obliterate backups for their amusement.


What to Look for from a Failing Drive

Your system may boot more slowly than usual and show errors while doing so. You may hear strange noises, or no noise at all. Files may no longer be readable, or you may not be able to save a file. The disk may not be mountable, or the system may say the disk does not exist. Reading or saving a file may take much longer than usual, or the system may start freezing or crashing.


Monitoring and Checking Your Drive

SMART

You can check the wellbeing of your disk drives using SMART. This is a monitoring system built into modern drives which keeps an eye on them, and can give a warning of a failing drive. It is very simple to set up. In Synaptic, install the following packages:

smartmontools
gsmartcontrol
gnome-disk-utility
task-mate (the above work best in a Gnomish desktop)

If you log in to a MATE desktop by changing the session type on the login screen, you can review the integrity of your drives and read and order their health reports. You may get an early warning, or your drive may crash anyway with no warning at all.
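
If you prefer a terminal, the smartmontools package also provides the smartctl command. The following is only a sketch: /dev/sda is an assumed device name, and smart_says_failing is a made-up helper, not part of smartmontools. It relies on smartctl returning its exit status as a bit mask, in which bit 3 (value 8) means the drive reports itself as failing.

```shell
#!/bin/sh
# Typical smartctl calls (run as root; /dev/sda is an assumed device name):
#   smartctl -H /dev/sda        # overall health verdict
#   smartctl -a /dev/sda        # full report, including attributes and logs
#   smartctl -t short /dev/sda  # start a short self-test; read the result later with -a

# Hypothetical helper: smartctl's exit status is a bit mask, and bit 3
# (value 8) is set when the drive itself reports that it is failing.
smart_says_failing() {
    # $1 = exit status returned by smartctl
    [ $(( $1 & 8 )) -ne 0 ]
}

# Example use (commented out so that nothing runs by accident):
#   smartctl -H /dev/sda; rc=$?
#   if smart_says_failing "$rc"; then echo "Back up now!"; fi
```

A clean exit status is no guarantee of a healthy drive, so treat this as one warning sign among several.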


fsck

Your system automatically checks your drives every so often. You can also check your drives and partitions at any time with fsck, which is worth doing if you are worried or if there has been, say, a power outage.

NOTE: The drives and partitions you are checking should be unmounted. The best way is to use a live disk, maybe one of the light ones.

Change to a root terminal (remembering to take care now that you are root).

fdisk -l (lists all your drives and partitions, mounted or not)

fsck /dev/sda1 (check a partition)

fsck -fy /dev/sda1 (NOTE: check and repair, which will amend the file structure without intervention. If there are a lot of errors on a partition, you may not want to fix the drive, and should switch to file backup and disk recovery instead)
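
Since the partition must be unmounted, it may help to have the shell confirm that before fsck runs. This is a minimal sketch, assuming a Linux system where /proc/mounts lists mounted devices by the same name you pass in. The names is_mounted and safe_check are invented for illustration. The -n option is handed on to the file system checker, and for ext2/3/4 it means "open read-only and answer no to every question"; other file system checkers may treat it differently.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device in $1 is listed as mounted.
# $2 is the mount table to read; it defaults to /proc/mounts on Linux.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical wrapper: refuse to check a mounted partition, otherwise
# do a read-only pass first so that nothing on the disk is changed.
safe_check() {
    if is_mounted "$1"; then
        echo "$1 is mounted; unmount it or boot a live disk first" >&2
        return 1
    fi
    fsck -n "$1"
}

# Example use, as root: safe_check /dev/sda1
```

If the read-only pass reports only a few problems, fsck -fy can follow; if it reports many, back up first.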


Disk/Partition Full?

Disk full? A golden oldie. It is not uncommon to inadvertently fill up your root / partition, after which your desktop or system may not start. If, for example, your KDE desktop will not start, change to a different session, maybe an LXDE desktop which may work, or switch to a live disk. As a rule of thumb, a partition should be no more than 90% full.

On KDE, More Applications > Monitoring > KDiskFree is worth a look from time to time.

To find what is filling up a disk (a large file, perhaps?), change to a root terminal (remembering to take care now that you are the root user).

df -al (This shows your mounted drives and how much space they have used)
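
To apply the 90% rule without reading the whole table by eye, the output of df can be filtered. This is a rough sketch: over_limit is a made-up name, and it assumes the one-line-per-filesystem layout that df -P produces, with mount points that contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` style output on standard input and
# print the mount point and usage of every filesystem above a limit.
# $1 = limit in percent (defaults to the 90% rule of thumb).
over_limit() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5                 # fifth column is the "Capacity" figure, e.g. 95%
        sub(/%/, "", pct)        # strip the percent sign so it compares as a number
        if (pct + 0 > limit + 0) print $6, $5
    }'
}

# Example use: df -Pl | over_limit 90
```

No output means nothing is over the limit.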

If your / disk is full, it is time to play hunt the rabbit:

cd /
du -hsx * | sort -rh | head -10

Keep changing directories to trace the culprit(s). If it's just one rogue file, delete it and your job is done. If the partition is too small, ask on the forum about resizing a partition.
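
An alternative to walking down the tree by hand is to let find list the biggest files in one go. This is a sketch only: largest_files is an invented name, and it assumes GNU find (for -printf), which is what most Linux distributions ship.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest regular files under a directory
# without crossing into other filesystems (-xdev). Sizes are in bytes.
# Relies on GNU find's -printf, which is an assumption about your system.
largest_files() {
    # $1 = directory to search, $2 = how many files to show (default 10)
    find "$1" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -rn | head -n "${2:-10}"
}

# Example use, as root: largest_files / 10
```

Look before you delete: a large file is not necessarily a rogue one.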


Looking After Your Drive

If you have a laptop, do not walk around with it or move it while it is running. Gyroscopic forces will apply strain to the drive, and with clearances in nanometres, that is not sensible.

Make sure you shut down your system correctly. Do not just pull the plug.

Have some surge protection to protect against power spikes. Dirty power is a killer of drives, along with many other components of your system.

If you have a power outage, a brownout, or lightning strike nearby, check your drives.

If your machine is vibrating, do not bang it to fix the issue.

Back up your data, as your drive will eventually fail. If your files are not backed up when this happens, they will most likely be lost.
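
A backup is only useful if the copy really exists and matches the original, so it is worth checking straight after copying. This is a minimal sketch using cp and diff. The function names and the example paths are invented for illustration, and a dedicated backup tool may suit you better.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the two directory trees hold the
# same files with the same contents (diff -r walks both directories).
verify_backup() {
    # $1 = source directory, $2 = backup directory
    diff -rq "$1" "$2" >/dev/null
}

# Hypothetical helper: copy a directory tree, then verify the copy.
backup_and_verify() {
    mkdir -p "$2" &&
    cp -a "$1"/. "$2"/ &&
    verify_backup "$1" "$2"
}

# Example use: backup_and_verify "$HOME/Documents" /mnt/backup/Documents
```

The backup directory should live on a different drive, otherwise a single failure takes both copies with it.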


Recovery and File Extraction

If you suspect your drive is failing, run some diagnostic tests on partitions.

NOTE: when dealing with disk drives and partitions, it is imperative that they are unmounted, particularly the root / partition. Use a live disk with a light desktop.

Here is a rough checklist:

To list your disks, enter fdisk -l at a command prompt.

For each suspect partition, enter the following command at a command prompt:

fsck /dev/sdNn (e.g. /dev/sda1)

If it has a lot of issues, then immediately back up the files, if you can.

If there are only a few issues, back up your files and try to fix the drive errors, entering the following command at a command prompt:

fsck -fy /dev/sdNn (e.g. /dev/sda1)

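If you get an error about "superblocks," enter the following command at a command prompt, replacing superblock with the block number of one of the backup superblocks, such as those listed below: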

fsck -fy -b superblock /dev/sdNn

        Superblock backups stored on blocks:

        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872
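
A list in the format above can be turned into one block number per line, ready to try in turn. This is a sketch under stated assumptions: backup_superblocks is a made-up helper that only parses text in the layout shown here, and the ext2/3/4 tools are assumed. Such a list can be printed with mke2fs -n, which only shows what it would do, but be very careful: without the -n, mke2fs creates a new file system and destroys the old one. The numbers are only right if the partition was originally created with the same settings.

```shell
#!/bin/sh
# Hypothetical helper: read text containing a "Superblock backups stored
# on blocks:" list on standard input and print one block number per line.
backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { grab = 1; next }
        grab && /[0-9]/ {
            gsub(/,/, " ")                       # commas become field separators
            for (i = 1; i <= NF; i++) print $i
            seen = 1
            next
        }
        grab && seen { exit }                    # first line after the list ends it
    '
}

# Example use, as root, on an UNMOUNTED ext2/3/4 partition:
#   mke2fs -n /dev/sdNn | backup_superblocks     # -n is essential here
#   e2fsck -fy -b 32768 /dev/sdNn                # try the first backup
```

If the first backup superblock does not help, the next one in the list is worth a try.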

If the partition is badly mangled and you cannot get any files off, then more invasive measures are needed.

Make an image of the partition, using Clonezilla, dd, or the GNOME disk utility.
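
With dd, an image can be made roughly as follows. This is a sketch, not a recipe: make_image is an invented wrapper, and the device and path names are placeholders. The conv=noerror,sync options tell dd to carry on after a read error and to pad unreadable or short blocks with zeros, so the image can end up slightly larger than the source if its size is not a multiple of the block size.

```shell
#!/bin/sh
# Hypothetical helper: copy a partition (or any file) to an image file.
# conv=noerror keeps going after read errors; conv=sync pads short or
# unreadable blocks with zeros so that later data stays at the right offset.
make_image() {
    # $1 = source (e.g. /dev/sdNn), $2 = image file, $3 = block size
    dd if="$1" of="$2" bs="${3:-64K}" conv=noerror,sync
}

# Example use, as root, writing to a DIFFERENT, healthy disk:
#   make_image /dev/sda1 /mnt/rescue/sda1.img
```

Double-check which name is the source and which is the destination before pressing Enter, as dd will happily overwrite either. Once the image exists, work on the image rather than on the failing drive.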

Check the cable connections to the drive, and perhaps change them. Check the memory with memtest, which is one of the boot options.

In a root terminal run testdisk:

http://www.cgsecurity.org/wiki/TestDisk

This will attempt to salvage partitions on a disk. It will make changes to the entire drive if you allow it, so be warned. If you are lucky, you may get your drive back, in which case extract all your valuable files now.

If you have a failed drive and wish to try to extract any files from it, your last resort is photorec. This searches a broken drive for files (you can refine its search) and dumps everything it finds into a folder of your choice. The result is a mess of unnamed files which you need to sort through (filtering by size and extension helps) and then rename to whatever you want.
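
Sorting the recovered mess by extension can be automated. This is a minimal sketch: sort_by_extension and the by_ext folder name are invented for illustration, and it only handles the files sitting directly in the directory you give it.

```shell
#!/bin/sh
# Hypothetical helper: move every regular file in a directory into a
# subfolder named after its extension, e.g. by_ext/jpg or by_ext/pdf.
sort_by_extension() {
    # $1 = directory holding the recovered files
    for sbe_file in "$1"/*; do
        [ -f "$sbe_file" ] || continue          # skip folders and empty globs
        sbe_name=${sbe_file##*/}                # file name without its path
        case "$sbe_name" in
            *.*) sbe_ext=${sbe_name##*.} ;;     # text after the last dot
            *)   sbe_ext=no_extension ;;
        esac
        mkdir -p "$1/by_ext/$sbe_ext" &&
        mv -- "$sbe_file" "$1/by_ext/$sbe_ext/"
    done
}

# Example use: sort_by_extension /mnt/rescue/recovered
```

From there, sorting each folder by size makes the thumbnails and fragments easy to separate from the files you actually want.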

You can try to format and partition the broken drive to put it back into working order. PCC may work, otherwise fdisk and mkfs.ext4 will help. A disk which has failed once should always be treated as dubious, so do not rely on any such disk.


