How to recover "dead" disk drives (sometimes).

dos2unix · Apr 28, 2026

Warning: This is mostly for advanced users. Yes, some of these commands can fix a broken drive. On the other hand, some of these commands can break a fixed drive. I'm not responsible for broken drives; run these at your own risk. There is no recovery from running some of these commands the wrong way or on the wrong device. Moderators: if this is too dangerous, delete it.
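
Since the worst case here is pointing a command at the wrong device, a small guard can help. This is only a sketch: the name is_mounted and the optional second argument (an alternate mounts table, handy for testing) are my own inventions, not part of any tool in this post. It checks whether a disk or any of its partitions is currently mounted.

```shell
# Sketch: exit 0 if DEVICE or one of its partitions is listed as mounted.
# $2 is a hypothetical override for the mounts table (default /proc/mounts).
is_mounted() {
    awk -v dev="$1" '
        # Match the whole disk (/dev/sdb) and its partitions
        # (/dev/sdb1, /dev/nvme0n1p2), but not /dev/sdbc1.
        ($1 == dev) || (index($1, dev) == 1 && substr($1, length(dev) + 1) ~ /^p?[0-9]+$/) { found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}

# Usage (commented out so nothing runs by accident):
# is_mounted /dev/sdX && { echo "/dev/sdX is in use, stopping" >&2; exit 1; }
```

It only looks at mounts, so a disk that belongs to LVM, md RAID, or swap would still slip through.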

sgdisk

Part of the gdisk package, sgdisk is the scriptable command-line version of GPT fdisk. Where fdisk gets confused or corrupted by GPT weirdness, sgdisk speaks native GPT.

Print partition table, non-destructive, always start here:

Code:

sgdisk -p /dev/sdX

Verify GPT integrity:

Code:

sgdisk --verify /dev/sdX

Back up the partition table to a file. Do this before changing anything:

Code:

sgdisk --backup=/root/sdX_partition_table.bak /dev/sdX

Restore from backup:

Code:

sgdisk --load-backup=/root/sdX_partition_table.bak /dev/sdX

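Attempt to recover a damaged primary GPT from the secondary (backup) header. sgdisk -e does not do this; it only relocates the backup structures. The interactive gdisk recovery menu does. Run gdisk, press r for the recovery menu, b to use the backup GPT header and rebuild the main one (c also loads the backup partition table), then w to write:

Code:

gdisk /dev/sdX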

Move secondary GPT to end of disk, useful after a resize:

Code:

sgdisk -e /dev/sdX

Zap everything, the nuclear option:

Code:

sgdisk --zap-all /dev/sdX

Clone the partition table from sdX onto sdY (the --replicate argument is the target, so double-check the order), then randomize GUIDs on the target:

Code:

sgdisk --replicate=/dev/sdY /dev/sdX

Code:

sgdisk --randomize-guids /dev/sdY

When it saves you: corrupted primary GPT header but secondary GPT is intact, misaligned partition tables after a sector-size change like 512e to 4Kn, partition table lost after a dd mishap.

hdparm

The Swiss Army knife for ATA/SATA drives. Less relevant for NVMe, use nvme-cli there, but still essential for spinning rust and older SSDs.

Drive identity, model, firmware, supported features:

Code:

hdparm -I /dev/sdX

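Read speed benchmark. -t times sequential reads from the device itself with no prior caching; -T times reads from the Linux buffer cache as a baseline: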

Code:

hdparm -tT /dev/sdX

Check power state:

Code:

hdparm -C /dev/sdX

Check APM level:

Code:

hdparm -B /dev/sdX

Set APM for maximum performance. 254 is the highest performance level with APM still active; 255 disables APM entirely:

Code:

hdparm -B 255 /dev/sdX

Disable spindown:

Code:

hdparm -S 0 /dev/sdX

Disable power-up-in-standby:

Code:

hdparm -s 0 /dev/sdX

Check ATA security lock status:

Code:

hdparm -I /dev/sdX | grep -i security

Unlock a locked drive:

Code:

hdparm --security-unlock PASSWORD /dev/sdX

Disable ATA security entirely:

Code:

hdparm --security-disable PASSWORD /dev/sdX

If the drive shows frozen, suspend the machine to RAM, then bring it back. That clears the frozen bit without a full power cycle on some systems, then retry the unlock.
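
The security flags are easy to misread because hdparm prints "not" in a separate column. A rough sketch of a filter that turns them into yes/no pairs, assuming the usual hdparm -I layout (a "Security:" heading followed by indented lines such as "not enabled" or "frozen"). The function name is made up, and the layout may differ between hdparm versions.

```shell
# Sketch: summarise the ATA security flags from `hdparm -I` output on stdin.
# Assumes an unindented "Security:" heading followed by indented flag lines.
ata_security_flags() {
    awk '
        /^Security:/ { in_sec = 1; next }
        in_sec && /^[^ \t]/ { in_sec = 0 }   # next unindented heading ends the section
        in_sec && $NF ~ /^(enabled|locked|frozen)$/ {
            # A leading "not" means the flag is clear.
            printf "%s=%s\n", $NF, ($1 == "not" ? "no" : "yes")
        }
    '
}

# Usage (commented out):
# hdparm -I /dev/sdX | ata_security_flags
```

If it reports frozen=yes, the unlock and erase commands will be rejected until the frozen bit is cleared.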

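ATA Secure Erase, useful for restoring write performance on a degraded SSD. It destroys all data on the drive, so this is for drives you are reusing, not drives you are rescuing: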

Code:

hdparm --user-master u --security-set-pass TEMPPASS /dev/sdX

Code:

hdparm --user-master u --security-erase TEMPPASS /dev/sdX

This resets all NAND cells to erased state and can recover dramatically degraded SSD write performance.

Imaging First, Always

Before doing anything else, get an image off the drive if it is still readable at all. ddrescue is the king here.

Pass 1, fast pass, skip the slow scraping phase (-n) and get what you can:

Code:

ddrescue -d -n /dev/sdX /mnt/rescue/image.img /mnt/rescue/image.map

Pass 2, trim and scrape the areas skipped in pass 1, then retry bad sectors up to 3 times:

Code:

ddrescue -d -r3 /dev/sdX /mnt/rescue/image.img /mnt/rescue/image.map

Pass 3, retry what is still bad in reverse (-R), approaching the bad spots from the other side:

Code:

ddrescue -d -r3 -R /dev/sdX /mnt/rescue/image.img /mnt/rescue/image.map

The mapfile is critical. It lets you resume interrupted rescues. Never skip it. dd_rescue (the older, separate tool, note the underscore) and dcfldd are alternatives but GNU ddrescue handles failing drives better because of that mapfile resume capability.
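
The mapfile is plain text, so you can check progress between passes without extra tools. A minimal sketch, assuming the documented mapfile layout: comment lines start with #, the first data line is the current-position status line, and every following line is "pos size status" in hex with + meaning rescued. The function name is my own. ddrescuelog -t from the ddrescue package gives a fuller report.

```shell
# Sketch: total up rescued vs. not-yet-rescued bytes from a ddrescue mapfile.
mapfile_summary() (
    rescued=0; pending=0; seen_status=0
    while read -r pos size status rest; do
        # Skip blank lines and comments.
        case $pos in ''|'#'*) continue ;; esac
        # The first non-comment line is the current-position line, not a block.
        if [ "$seen_status" -eq 0 ]; then seen_status=1; continue; fi
        if [ "$status" = "+" ]; then
            rescued=$(( rescued + $size ))
        else
            # Covers non-tried (?), non-trimmed (*), non-scraped (/) and bad (-).
            pending=$(( pending + $size ))
        fi
    done < "$1"
    echo "rescued=$rescued not_rescued=$pending"
)

# Usage (commented out):
# mapfile_summary /mnt/rescue/image.map
```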

Filesystem Repair

ext4, force check, verbose, fix:

Code:

e2fsck -fvy /dev/sdX1

ext4 with alternate superblock if the primary is gone (32768 is the first backup on filesystems with 4 KiB blocks):

Code:

e2fsck -fvy -b 32768 /dev/sdX1

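Find alternate superblock locations without writing anything (-n is a dry run; the list only matches if the filesystem was created with the same block size and defaults):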

Code:

mke2fs -n /dev/sdX1
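
The backup locations come out as a comma-separated list wrapped over several lines. A small sketch for turning that into one block number per line, assuming the usual "Superblock backups stored on blocks:" heading in the mke2fs output; the function name is hypothetical.

```shell
# Sketch: extract backup superblock numbers from `mke2fs -n` output on stdin.
backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { grab = 1; next }
        grab && /[0-9]/ { gsub(/[^0-9]+/, "\n"); print; next }
        grab { exit }    # first line without digits ends the list
    ' | sed '/^$/d'
}

# Usage (commented out): try each backup read-only (-n) until one checks out.
# mke2fs -n /dev/sdX1 | backup_superblocks | while read -r sb; do
#     e2fsck -n -b "$sb" /dev/sdX1 && { echo "usable backup at $sb"; break; }
# done
```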

XFS standard repair:

Code:

xfs_repair /dev/sdX1

XFS zero log, loses in-flight transactions but gets it mountable:

Code:

xfs_repair -L /dev/sdX1

Btrfs check:

Code:

btrfs check /dev/sdX1

Btrfs zero log:

Code:

btrfs rescue zero-log /dev/sdX1

Btrfs restore, pulls files even from an unmountable filesystem:

Code:

btrfs restore /dev/sdX1 /mnt/recovered/

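NTFS quick fix for Windows drives. ntfsfix is not a chkdsk replacement; it repairs a few basic inconsistencies, resets the journal, and schedules a consistency check for the next Windows boot: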

Code:

ntfsfix /dev/sdX1

NVMe

Install nvme-cli on RHEL/Fedora:

Code:

dnf install nvme-cli

SMART equivalent health log:

Code:

nvme smart-log /dev/nvme0

Error log:

Code:

nvme error-log /dev/nvme0

Identify controller:

Code:

nvme id-ctrl /dev/nvme0

Identify namespace:

Code:

nvme id-ns /dev/nvme0n1

Sanitize, block erase, roughly equivalent to ATA secure erase. It destroys all data on the drive:

Code:

nvme sanitize /dev/nvme0 --sanact=2

Watch Critical Warning, Available Spare, and Percentage Used in the smart-log output. Those three tell you most of what you need to know about NVMe health at a glance.
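
Those three fields can be checked mechanically. A minimal sketch, assuming the plain-text smart-log output uses the field names critical_warning, available_spare, available_spare_threshold and percentage_used in "name : value" form. The function name and the cut-offs (spare at or below threshold, 100% used) are my own choices.

```shell
# Sketch: print a line for each worrying value in `nvme smart-log` output on stdin.
# Prints nothing when all three checks pass.
nvme_health_flags() {
    awk -F ':' '
        { key = $1; val = $2
          gsub(/[ \t]/, "", key); gsub(/[ \t%]/, "", val) }
        key == "critical_warning"          { cw = val + 0 }
        key == "available_spare"           { spare = val + 0; have_spare = 1 }
        key == "available_spare_threshold" { thresh = val + 0 }
        key == "percentage_used"           { used = val + 0 }
        END {
            if (cw != 0)                       print "critical_warning is set (" cw ")"
            if (have_spare && spare <= thresh) print "available_spare " spare "% is at or below threshold " thresh "%"
            if (used >= 100)                   print "percentage_used is " used "%"
        }
    '
}

# Usage (commented out):
# nvme smart-log /dev/nvme0 | nvme_health_flags
```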

SMART via smartmontools

Quick health check:

Code:

smartctl -H /dev/sdX

Full attribute dump:

Code:

smartctl -a /dev/sdX

Short self-test, about two minutes:

Code:

smartctl -t short /dev/sdX

Long self-test, hours on large drives:

Code:

smartctl -t long /dev/sdX

Poll self-test results:

Code:

smartctl -l selftest /dev/sdX

USB drives with SAT bridge:

Code:

smartctl -d sat -a /dev/sdX

USB drives with JMicron bridge chips:

Code:

smartctl -d usbjmicron -a /dev/sdX

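Key SMART attributes to watch (hex ID first, then the decimal ID that smartctl prints): 05 (5) is Reallocated Sector Count, and any nonzero raw value is trouble. C5 (197) is Current Pending Sector, sectors waiting to be remapped. C6 (198) is Offline Uncorrectable sector count. BB (187) is Reported Uncorrectable Errors, which Seagate and some other vendors expose. 01 (1) is Raw Read Error Rate; compare it against the vendor's baseline before panicking, because some vendors show huge raw values on healthy drives.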
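
To pick those attributes out of the table automatically, here is a short sketch. It assumes the standard ATA attribute table from smartctl -A, where the first column is the decimal ID and the tenth is the raw value. The function name is made up, and Raw Read Error Rate is left out on purpose because its raw value is vendor-specific.

```shell
# Sketch: list reallocated/pending/uncorrectable attributes with a nonzero raw value.
# Reads `smartctl -A` output on stdin; column 1 = decimal ID, column 10 = RAW_VALUE.
smart_red_flags() {
    awk '$1 ~ /^(5|187|197|198)$/ && $10 + 0 > 0 { print $1, $2, "raw=" $10 }'
}

# Usage (commented out):
# smartctl -A /dev/sdX | smart_red_flags
```

No output means none of the four counters has moved off zero.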

USB Drive Resurrection

USB drives are their own special pain. The bridge controller chip matters as much as the NAND itself.

Identify the USB bridge controller:

Code:

lsusb -v | grep -A5 "Mass Storage"

Install f3 to fight fake-capacity drives:

Code:

dnf install f3

Fast probe (it writes test blocks to the device, so do not run it on a drive you still need to image):

Code:

f3probe /dev/sdX

Full write test:

Code:

f3write /mnt/usbdrive

Full read verify:

Code:

f3read /mnt/usbdrive

Interactive partition table recovery with testdisk:

Code:

testdisk /dev/sdX

File carving with photorec, ignores the filesystem entirely and finds files by signature:

Code:

photorec /dev/sdX

For completely dead USB drives where the controller has bricked itself, the last resort is NAND chip-off recovery. You physically desolder the flash chip and read it directly. That is lab territory and requires specialized hardware, but it exists and it works when everything else fails.

Resurrection Priority Order

1. SMART check. Is the hardware even talking?
2. ddrescue image. Preserve what you have before anything else.
3. sgdisk --verify or testdisk. Is the partition table intact?
4. fsck, xfs_repair, or btrfs rescue. Hit the filesystem layer.
5. photorec or foremost. File carving if the filesystem is completely gone.
6. Mount read-only regardless and see what you can see (noload is the ext3/ext4 option that skips journal replay; XFS uses norecovery instead):

Code:

mount -o ro,noatime,noload /dev/sdX1 /mnt/recovery

7. ATA secure erase or NVMe sanitize. Last resort for SSD performance recovery, and it wipes the drive.
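
The non-destructive part of that order can be written down as a dry-run checklist. This sketch only prints commands and runs nothing. The function name and its three arguments (whole disk, partition, rescue directory) are hypothetical, and the flags are the ones used earlier in this post.

```shell
# Sketch: print the read-only triage steps in priority order. Nothing is executed.
# rescue_plan DISK PARTITION RESCUE_DIR
rescue_plan() {
    cat <<EOF
smartctl -H $1
ddrescue -d $1 $3/image.img $3/image.map
sgdisk --verify $1
mount -o ro,noatime,noload $2 /mnt/recovery
EOF
}

# Usage (commented out):
# rescue_plan /dev/sdX /dev/sdX1 /mnt/rescue
```

Reading the printed commands back before typing any of them is a cheap way to catch a wrong device letter.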

The cardinal rule: never write to a dying drive until you have an image. Every write attempt on a marginal drive risks turning unrecovered sectors into permanent losses.

wizardfromoz · Apr 28, 2026

Looks like a good article, Ray

However, I think I would probably move

dos2unix said:
The cardinal rule: never write to a dying drive until you have an image. Every write attempt on a marginal drive risks turning unrecovered sectors into permanent losses.

to the top, and then feature the image part.

Just a suggestion

Chris
