Complete filesystem failure, S.M.A.R.T report OK

GeckoLinux

Hi there, a friend contacted me with a Thinkpad T440 laptop that was running the latest Ubuntu LTS. They closed the lid (presumably went into hibernation) and came back a little while later and opened it to find a GRUB rescue command line with "unknown filesystem". Before that they were just using Chrome.

I booted it with a live Linux USB and it shows /dev/sda1 has a 512MB FAT32 partition, which was presumably for EFI. It's mountable, but completely empty. I tried an fsck on it but still nothing. Then, /dev/sda2 is an extended partition, and inside that there is a large /dev/sda5 partition which is an "unknown filesystem". I'm currently running Photorec on the drive, and it's recovering a huge amount of files. Also, the S.M.A.R.T. report on the disk says everything is OK.

So first of all, after recovering whatever possible with Photorec, what tricks would you recommend to try to recover the root filesystem? It was installed by a non-technical user, so I assume it was just the default EXT4 that Ubuntu uses. This is what I was considering trying:
- http://forums.debian.net/viewtopic.php?f=10&t=143054
- https://unix.stackexchange.com/questions/114429/short-read-while-trying-to-open-partition
- https://superuser.com/questions/1097009/ext4-partition-type-unknow-after-restart-and-cannot-be-mount

And secondly, how can I figure out what hardware exactly is failing? Can the S.M.A.R.T. report be trusted? Or could it have bad RAM?

Thanks a lot!
 


Is there any faint possibility that Timeshift was installed and set up, so that a snapshot would be available to restore the failed system? The failure would be easier to diagnose if the system was up and running.
 
G'day Sam. You could get him to install Gecko, it's pretty reliable :)

Just kidding, but you can install Timeshift on Gecko. It's worth its weight in gold.

I'm currently running Photorec on the drive, and it's recovering a huge amount of files.

Just Photorec, or TestDisk?

Both TestDisk and PhotoRec are by Christophe Grenier and usually go hand in hand, but Photorec is particularly for salvaging pictures/photos.

On an Ubuntu Live USB, memtest86+ could be used to check for memory faults.

Regarding the actual error itself, rather than the three links you have provided, I would be inclined to use the following

https://www.quora.com/How-do-I-fix-a-grub-rescue-unknown-file-system-error

From the look of your 2nd paragraph's findings, I am inclined to think that the computer is either not UEFI, or else is running in Legacy mode, because that partitioning reflects BIOS/MBR rather than UEFI/GPT. That involves the 4-Partition Rule, although your friend is hamstringing himself by having the extended partition at /dev/sda2, and so /dev/sda5 becomes the first available partition for his Linux Root system files.

Extended should always be /dev/sda4, and then /dev/sda2 and /dev/sda3 can be used for storage or other purposes.
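
Whether the machine is really in Legacy/MBR mode can be checked from the live USB. This is only a rough sketch: the directory `/sys/firmware/efi` exists only when the running kernel was booted via UEFI, and `lsblk` can report the partition table type (`dos` for MBR, `gpt` for GPT). Keep in mind that the first check reflects how the live USB itself was booted, which may differ from how the installed system boots. The function name `boot_mode` and its optional argument are made up for illustration; `/dev/sda` is the disk from the first post.

```shell
#!/bin/sh
# Print "UEFI" or "BIOS" depending on whether the EFI sysfs directory exists.
# The optional argument is a hypothetical override of the sysfs root (default /sys).
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

echo "Live session boot mode: $(boot_mode)"

# Show the partition table type and filesystems, if lsblk and the disk are present.
disk=/dev/sda   # assumption: the disk named in the first post
if command -v lsblk >/dev/null 2>&1 && [ -b "$disk" ]; then
    lsblk -o NAME,PTTYPE,FSTYPE,SIZE "$disk"
fi
```

A `dos` partition table together with an extended partition, as described in the first post, would be consistent with a BIOS/MBR install.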

Bottom line is that the Quora option is best advised.

Cheers

Wizard

Edit - added BTW

BTW if the above fails, I would look to use a Live Ubuntu with Persistence enabled, to salvage his Home folder content (it will be a folder rather than a partition by the sound of it) and then fresh install.

Sing out if you need help if you choose to use the stick to chroot into his system. :)
 
@Condobloke Hi, thanks for the reply! Nope, they definitely didn't have any backups, but I'm sure they'll start now once they have a squeaky clean new system up and running! ;-P

@wizardfromoz Hi there! Good to hear from you.

You could get him to install Gecko
That will be the next step. }:)

Both TestDisk and PhotoRec are by Christophe Grenier and usually go hand in hand, but Photorec is particularly for salvaging pictures/photos.

Photorec is actually a bit of a misnomer. It recovers basically all types of files. The bad thing is that it scrapes them off the disk with no respect for the filesystem (because the filesystem is broken), so it creates thousands of sequentially named folders with thousands of generically named files. Like dumping a filing cabinet onto the floor, taking all the papers out of their folders, and then throwing all the papers into a huge box.
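
To make that "huge box" a bit easier to dig through, the recovered files can be regrouped by file extension. The following is a minimal sketch, assuming PhotoRec wrote its output into the usual `recup_dir.N` folders; the function name `sort_recovered` and the `noext` folder are invented for this example, and existing files in the destination are never overwritten.

```shell
#!/bin/sh
# sort_recovered SRC DEST
# Copy every file found under SRC/recup_dir.* into DEST/<extension>/.
# Files without an extension go to DEST/noext/.
sort_recovered() {
    src="$1"
    dest="$2"
    find "$src" -type f -path '*/recup_dir.*/*' | while IFS= read -r f; do
        name=$(basename "$f")
        case "$name" in
            *.*) ext="${name##*.}" ;;
            *)   ext="noext" ;;
        esac
        mkdir -p "$dest/$ext"
        # Skip the copy if a file of the same name is already there.
        [ -e "$dest/$ext/$name" ] || cp "$f" "$dest/$ext/$name"
    done
}

# Example (paths are placeholders):
# sort_recovered /mnt/external/photorec_out /mnt/external/sorted
```

This only groups files; the original names and folder structure are still lost, which is why a filesystem-aware tool is preferable when it works.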

As for TestDisk.... where has this program been all my life?? :) Thanks for tip sir! This one must have found a backup superblock for the EXT4 filesystem, because it found the files and directories and is letting me copy them onto another disk. Very nice!!!

The data is the only thing I care about right now. Once that's done I'm going to try to run a memtest and run another more extended smartmontools test to see what exactly is failing. I don't really care about recovering the OS, that's the beauty of Linux, the OS is so easy and cheap (free) to reinstall that I usually only worry about the data and then do a fresh install. I'll post an update ASAP.
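
For the smartmontools part, something along these lines should work from a live session as root. It is a sketch, not a definitive procedure: the wrapper name `smart_report` and its return codes are made up here, and `/dev/sda` is assumed to be the disk in question.

```shell
#!/bin/sh
# smart_report DISK
# Print the SMART health verdict, the attribute table and the self-test log.
smart_report() {
    disk="$1"
    [ -b "$disk" ] || { echo "not a block device: $disk" >&2; return 2; }
    command -v smartctl >/dev/null 2>&1 || { echo "smartctl not installed" >&2; return 3; }
    smartctl -H "$disk"            # overall health self-assessment
    smartctl -A "$disk"            # vendor-specific attributes (reallocated sectors etc.)
    smartctl -l selftest "$disk"   # results of earlier self-tests
}

# Start an extended self-test; it runs inside the drive in the background.
# Re-run smart_report afterwards to read the result from the self-test log.
# smartctl -t long /dev/sda
# smart_report /dev/sda
```

A passing health verdict only says the drive's own thresholds were not crossed, so the attribute table and the self-test log are worth reading as well.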

Thanks again!
 
I at one point recovered someone's data from a crashed disk where it was still accessible. I did something with debugfs and was able to recover most of the important files, but don't ask me how I did it. Maybe you can try looking into that.
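
For reference, one way debugfs can be used for this kind of rescue is its `rdump` request, which copies a directory tree out of an ext2/3/4 filesystem without mounting it. This is a hedged sketch and not necessarily what was done in the case above: the function name `rescue_dir`, its return codes and the example paths are assumptions, and paths containing spaces are not handled.

```shell
#!/bin/sh
# rescue_dir DEVICE FS_PATH OUT_DIR
# Copy FS_PATH (a path inside the damaged filesystem) into OUT_DIR.
# -c opens the filesystem in "catastrophic" mode, which is read-only.
rescue_dir() {
    dev="$1"
    fs_path="$2"
    out="$3"
    [ -e "$dev" ] || { echo "no such device or image: $dev" >&2; return 2; }
    command -v debugfs >/dev/null 2>&1 || { echo "debugfs not installed" >&2; return 3; }
    mkdir -p "$out"
    debugfs -c -R "rdump $fs_path $out" "$dev"
}

# Example (device from the first post, output on a separate disk):
# rescue_dir /dev/sda5 /home /mnt/external/rescue
```

The output directory should be on a different disk from the one being rescued.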
 
OK, so with TestDisk I was able to pull off most of the data and even restore the backup superblock to make the filesystem mountable and bootable again. It seems to work fine. But... memtest always freezes. I tried it first with the installed Ubuntu 20.04 memtest in GRUB. I read somewhere that the Ubuntu 20.04 version is buggy, so I tried it in a Fedora live CD. Same thing. The + in Memtest86+ 5.01 still flashes, but it can't get past 60% of the first test, 0% of the pass. But then again I've read that memtest is buggy with modern EFI firmware. How can I determine what the real problem is?
 
Regrets, Sam, I have come to this late in my day, after trying to help settle a duel with pistols at 20 paces, it seems. :)

I'll respond in more detail on my tomorrow, dinner time here in Oz.

Wizard
 
As far as I know, Memtest86+ not finishing is in itself indicative of malfunctioning RAM.

If it's a single stick of RAM, that's easily resolved. If it is multiple sticks of RAM, the process of elimination will help get you sorted.
 
@KGIII

Makes sense. I just sort of lost confidence in memtest86+ from the amount of Google hits for "memtest86+ stuck". The usual response is that it doesn't work well with EFI systems. But maybe all those users reporting the problem also had bad RAM. ;-)

I'm currently testing this userspace RAM testing tool, although it won't be as conclusive:
 
I've never tried that one before. I always test (when I do test) outside the OS. I usually just grab their latest .iso and check that way.

 
I just use the free edition at said site. I don't really need any of the pro features.
 
Well, I can't make any sense of it. Ran the memtest live USB all day, and no errors found. Also ran an extended test on the SSD, and no problems reported. I sent the computer back to my friend and told them to not trust it, and I'll be helping them with a frequent data backup strategy.
 
OK, so if

OK, so with TestDisk I was able to pull off most of the data and even restore the backup superblock to make the filesystem mountable and bootable again. It seems to work fine.

then his Ubuntu is working OK now, but there are unanswered questions on the reliability of the rig.

Is that the case?

If so, I would run a complete snapshot with Timeshift (if he is using 20.04 then it is in the Software Store), stored externally, if that can be done.
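
From a terminal, an on-demand snapshot can be taken roughly like this. It is a sketch under assumptions: the wrapper name `snapshot_now` and its return codes are invented for the example, and the snapshot goes to whichever device Timeshift is already configured to use (the `--snapshot-device` option can point it at an external disk instead).

```shell
#!/bin/sh
# snapshot_now [COMMENT]
# Create an on-demand Timeshift snapshot and list the existing snapshots.
snapshot_now() {
    comment="${1:-manual snapshot}"
    command -v timeshift >/dev/null 2>&1 || { echo "timeshift not installed" >&2; return 3; }
    [ "$(id -u)" -eq 0 ] || { echo "must be run as root" >&2; return 4; }
    timeshift --create --comments "$comment" --tags O   # O = on-demand
    timeshift --list
}

# Example:
# sudo sh -c '. ./snapshot.sh; snapshot_now "after filesystem repair"'
```

Timeshift is aimed at system files, so personal data in the Home folder still needs its own backup.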

I'll just insert a bit of blurb I give to new Timeshift users to save me winging it each time.

REFERENCE URLs

Tony George pages

https://github.com/teejee2008/timeshift

https://github.com/teejee2008/Timeshift/releases

OTHER GUIDES

https://itsfoss.com/backup-restore-linux-timeshift/

https://www.fosslinux.com/34377/how-to-backup-and-restore-ubuntu-with-timeshift.htm

https://www.fossmint.com/backup-restore-linux-with-timeshift/

and this from Linux Lite Help Manual, and MX-Linux

https://www.linuxliteos.com/manual/tutorials.html#timeshift

https://mxlinux.org/wiki/applications/timeshift/

or read about it and ask questions at my Thread here


https://www.linux.org/threads/timeshift-similar-solutions-safeguard-recover-your-linux.15241/

If there is anything else we can help with, Sam, just sing out, Mate.

Avagudweegend

Chris
 
then his Ubuntu is working OK now, but there are unanswered questions on the reliability of the rig.

Hi there! Yes, that's correct. Really can't make sense of what happened, but it was surely a catastrophic hardware fault, and therefore it can't be trusted anymore.

Thanks for the Timeshift suggestion, looks good. Would you recommend using it with Btrfs, or is the rsync fallback good enough?
 
I haven't any Btrfs Distros in my stable, so all my experience with it (since around 1 October 2014) has been on the Rsync option.

I have possibly used it 2,000 times, and in that experience only had one problem, in the early days, where I managed to morph Ubuntu 14.04 and Zorin OS9 into one hybrid distro. I believe it was because I was too quick on the trigger finger, but Tony George did not think it was to do with Timeshift.

But I won't go Off Topic with the details, for now.

Cheers

Chris
 
@GeckoLinux - Hey Sam, just a heads up, on TestDisk.

As for TestDisk.... where has this program been all my life??

Not often I get to steal a march off a Dev :)

It's in the Repos of most Linux distros, that is in Debian, 'buntu-based, Fedora, Arch, Arch-based and so on, but guess what?

This from my Gecko Tumbleweeds, ran

zypper refresh

first.

Code:
chris@GeckoPlasma-HDD:~> zypper if testdisk
Loading repository data...
Reading installed packages...


Information for package testdisk:
---------------------------------
Repository     : Tumbleweed_OSS
Name           : testdisk
Version        : 7.1-4.1
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 665.8 KiB
Installed      : No
Status         : not installed
Source package : testdisk-7.1-4.1.src
Summary        : Tool to Recover and Fix Partitions
Description    :
    TestDisk is a data recovery software primarily designed to help recover lost
    partitions and/or make non-booting disks bootable again.

... so it's covered with the openSUSE connection.

The version available to most Distros is currently 7.1, but more recent versions may be on rescue disks such as Hiren's BootCD, Rescatux and so on.

Cheers

Chris
 
Hey Chris,

Yep, TestDisk installed and worked perfectly for me on the GeckoLinux live USB that I used to recover the system. As a matter of fact my friend would have lost their entire digital life and possibly their business if it weren't for TestDisk. It's an impressive piece of software. I first used it to search for missing filesystems, which it quickly found, and I immediately copied the data off with the convenient copy features built into TestDisk. When that was done, I used TestDisk to search for the EXT4 backup superblocks that are scattered around the disk, and then I ran the fsck.ext4 command that it suggested. That actually restored the filesystem and let GRUB boot it again.
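
For anyone who wants to do the superblock step by hand, the backup locations can also be listed with `mke2fs -n`, which only simulates creating a filesystem. This is a sketch with two loud caveats: the `-n` flag is essential (without it mke2fs formats the partition), and the listed locations only match if the filesystem was originally created with the same parameters, which I'm assuming because it was a default Ubuntu install. The helper name `backup_superblocks` is made up; it just extracts the block numbers from the mke2fs output.

```shell
#!/bin/sh
# backup_superblocks
# Read `mke2fs -n` output on stdin and print the backup superblock
# block numbers, one per line.
backup_superblocks() {
    sed -n '/Superblock backups stored on blocks:/,/^$/p' \
        | sed 's/.*stored on blocks://' \
        | tr -cs '0-9' '\n' \
        | sed '/^$/d'
}

# Example, as root on the live USB, with the data already copied off:
# mke2fs -n /dev/sda5 | backup_superblocks    # -n: dry run, writes nothing
# e2fsck -b 32768 /dev/sda5                   # 32768 is only an example value;
#                                             # use a number printed above
```

Copying the data off first, as described above, is the safer order, because e2fsck with a backup superblock writes to the disk.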

Thanks again! This forum is far and away the most helpful and above all friendly one for Linux.
 