kibasnowpaw
I am trying to figure out what is actually wrong with this drive and whether anyone here has seen the same thing before.
The drive is:
Seagate Exos X20 20TB
Model: ST20000NM007D
Part Number: 3DJ103-011
Firmware: SN01
Serial: ZVT5R78G
I got it from a friend because he could not figure it out either. He had the same type of issue on Windows 10 before I got it, so this did not start on Linux.
The problem is random startup/detection failure. Sometimes the drive works normally for days, passes tests, handles large transfers, and looks fine. Other times, after the PC has been powered off for a while, the drive does not come up properly at boot.
What makes this annoying is that it is not a dead drive in the usual sense. It does not behave like a drive with obvious bad sectors or a drive that is constantly throwing read/write errors.
What I have tested so far:
- different SATA data cables
- different SATA power connectors / power leads
- different PCs
- full format to ext4
- large data transfers to and from the drive
- SeaTools short test
- SMART checks with smartctl
- hdparm feature check
SMART media-side values still look clean:
- Reallocated_Sector_Ct = 0
- Current_Pending_Sector = 0
- Offline_Uncorrectable = 0
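For anyone who wants to repeat the same check, here is a minimal Python sketch that pulls the raw values out of `smartctl -A` style table text and flags any of the three media attributes that are not zero. It assumes the usual ten-column attribute table; the `SAMPLE` text is illustrative only: the raw values match what is reported in this post, but the FLAG/VALUE/WORST/THRESH columns are placeholders, not output from this drive. The function names are made up for the example.

```python
import re

# Attributes this post treats as "media-side" health indicators.
MEDIA_ATTRS = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Illustrative text only: raw values are from the post, other columns are placeholders.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       116
"""


def parse_raw_values(smartctl_text):
    """Map attribute name -> integer raw value from attribute table rows."""
    raw = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column (extra text may follow it).
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                raw[fields[1]] = int(match.group())
    return raw


def media_problems(raw):
    """Return the media attributes whose raw value is non-zero or missing."""
    return {name: raw.get(name) for name in MEDIA_ATTRS if raw.get(name) != 0}


raw = parse_raw_values(SAMPLE)
print(media_problems(raw))            # {} means all three are zero
print(raw["UDMA_CRC_Error_Count"])    # 116
```

In practice the text would come from running `smartctl -A` against the drive instead of the hard-coded sample.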
Once the drive is detected and running, it can move large amounts of data without losing files. I have already used it for moving data back and forth while reformatting another HDD to ext4, and I have not had data loss.
But I still had one real failure myself:
My PC had been running for about 3 days without problems. Then I shut it fully off for around 10 hours. On next boot the drive did not come up properly. After that, once it was back, it worked again.
The SMART value that bothers me is this:
UDMA_CRC_Error_Count was already 114 when I got the drive from my friend.
After about a week of my own testing it has gone to 116.
So I am not saying all 116 happened on my system. I am only saying it increased by 2 while I had it.
That makes me think this is more of an interface / link / startup problem than a normal media failure.
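To make the CRC counter more useful as evidence, it helps to note the value at known moments (for example right after a cold boot) and look at when it actually moves. A rough Python sketch of that bookkeeping is below; the `CrcReading` class and `crc_increments` function are hypothetical names for this example, and the two readings are simply the numbers stated above.

```python
from dataclasses import dataclass


@dataclass
class CrcReading:
    label: str   # free-text note about when the reading was taken
    count: int   # raw UDMA_CRC_Error_Count at that moment


def crc_increments(readings):
    """Return (label, delta) for every reading where the counter went up."""
    increments = []
    for previous, current in zip(readings, readings[1:]):
        delta = current.count - previous.count
        if delta > 0:
            increments.append((current.label, delta))
    return increments


readings = [
    CrcReading("received from friend", 114),
    CrcReading("after about a week of testing", 116),
]
print(crc_increments(readings))  # [('after about a week of testing', 2)]
```

With more frequent readings (one per boot, say), the labels would show whether the increments line up with the failed cold starts or with normal running.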
I also checked hdparm and got this:
- Power-Up In Standby feature set
- SET_FEATURES required to spinup after power up
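As far as I understand the `hdparm -I` feature list, a leading `*` on a line means the feature is currently enabled, while a line without it is only supported. That distinction seems relevant for Power-Up In Standby, so here is a small Python sketch that classifies a feature line. The `SAMPLE` text is hypothetical: the two feature names are the ones listed above, but whether they carried an asterisk on this drive is not shown here, and the SMART line is only there for contrast. `feature_state` is a made-up helper name.

```python
# Hypothetical excerpt in the layout of the `hdparm -I` feature list.
SAMPLE = (
    "Commands/features:\n"
    "\tEnabled\tSupported:\n"
    "\t   *\tSMART feature set\n"
    "\t    \tPower-Up In Standby feature set\n"
    "\t    \tSET_FEATURES required to spinup after power up\n"
)


def feature_state(hdparm_text, feature):
    """Return 'enabled', 'supported', or 'absent' for one feature name."""
    for line in hdparm_text.splitlines():
        stripped = line.strip()
        if stripped.endswith(feature):
            # An asterisk in the first column marks the feature as enabled.
            return "enabled" if stripped.startswith("*") else "supported"
    return "absent"


for name in (
    "Power-Up In Standby feature set",
    "SET_FEATURES required to spinup after power up",
):
    print(name, "->", feature_state(SAMPLE, name))
```

If the real output shows Power-Up In Standby as enabled rather than just supported, that would be worth mentioning in the thread, since the drive would then be waiting for the host to spin it up.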
Another thing that caught my eye: this exact variant is 3DJ103-011 with SN01, and it does not seem to match the public firmware listings I found for other ST20000NM007D variants like 3DJ103-006 / 3DJ103-720 / 3DJ103-790 that show newer branches like SN03/SN06. That makes me wonder if this specific revision has some odd firmware behavior.
One person on a Danish hardware forum suggested Linux SATA power management, specifically the med_power_with_dipm / medium_power angle. I do not fully buy that as the root cause, because the same problem was already present on Windows 10 before I got the drive. I still tried checking that path on my Linux system, but my controller/kernel will not let me change link_power_management_policy at runtime. I only get Operation not supported or I/O error, so I cannot really confirm or rule that theory out on this machine.
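Even when writing the policy fails, reading it should still work, so the current setting per SATA host can at least be recorded. A minimal Python sketch, assuming the usual sysfs location `/sys/class/scsi_host/host*/link_power_management_policy`; the function name and the `root` parameter are my own additions so the code can be pointed at a test directory.

```python
from pathlib import Path


def read_lpm_policies(root="/sys/class/scsi_host"):
    """Return {host name: current link power management policy} (read-only)."""
    policies = {}
    for policy_file in sorted(Path(root).glob("host*/link_power_management_policy")):
        host = policy_file.parent.name
        try:
            policies[host] = policy_file.read_text().strip()
        except OSError as exc:
            # Record the failure instead of stopping, so other hosts are still listed.
            policies[host] = f"unreadable ({exc.strerror})"
    return policies
```

Calling `read_lpm_policies()` on the affected machine and printing the result would show which policy each host is actually using, which is a start even without being able to change it at runtime.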
My current system is:
Ubuntu Resolute Raccoon dev branch
Kernel 7.0.0-13-generic
KDE Plasma on X11
Intel i7-6850K
NVIDIA RTX 2070 SUPER
At this point my own conclusion is:
The drive does not look dead on the media side.
But it also does not look fully healthy or trustworthy.
It feels more like an intermittent startup / interface / firmware / PCB issue than a classic bad-sector failure.
I have sent an email to Seagate support asking whether SN01 is correct for 3DJ103-011 and whether there is any known issue or firmware update for this variant, but I am not expecting much and I would not be surprised if the answer is just “go use the contact page”.
So I am asking here:
Has anyone seen this exact kind of behavior on an Exos X20?
Especially a drive that looks fine in SMART and under load, but randomly fails to come up properly after power-off?
And does this sound more like firmware/startup behavior, SATA link initialization, PCB/controller trouble, or something else?
If needed I can post full SMART output as well.

