Very odd issue with new RAID 5 HDDs. Some data keeps disappearing when running large copies.

wcramblitt (New Member, joined Mar 8, 2023)

I have two almost identical servers, both running Ubuntu 20.04.5 LTS, each with the same brand of HDDs set up in a RAID 5 configuration.

I have around 10TB of data I am attempting to copy to both servers. Everything worked fine on the first server.

However, on the second server, the copy script seems to run fine until it reaches around 100-500G. Then, all of a sudden and with no warning, the partition drops most of the copied files, leaving around 9G. This repeats until the copy script finishes, at which point only the last 9G of files exist on the partition.

Some background on the partition:
- It is a RAID 5 configuration totaling around 15 TB.
- The device is /dev/sdc1 and the filesystem is ext4.
- I can check the disk while the copy operation is running, and the drive doesn't even reach 1% capacity.
- I have deleted the partition and remade it several times.
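
One thing that may be worth ruling out from the list above: /dev/sdc1 is a device name, and the list does not say which directory it is mounted on. Below is a minimal sketch that prints which filesystem actually holds the copy destination. It assumes GNU coreutils `df`, as shipped with Ubuntu; `show_backing_fs` is a hypothetical helper name, and `/mnt/raid` in the comment is a placeholder path, not something taken from the post.

```shell
# show_backing_fs PATH
# Hypothetical helper: prints the device, mount point, filesystem type,
# total size and used space of the filesystem that contains PATH.
# Example call (placeholder path): show_backing_fs /mnt/raid
show_backing_fs() {
    df --output=source,target,fstype,size,used -h -- "$1"
}
```

If the first column shows something other than /dev/sdc1 for the destination directory, the copy is probably landing on a different filesystem than intended.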

I've tried a variety of copy methods, including scp and rsync over LAN and a direct copy from an external drive.

Here are some examples of the scripts:

scp -rv user@ip:/folder/ destination
rsync -rvPW --ignore-existing user@ip:/folder/ destination

They never have any error messages or warnings of any kind.

I have to actively monitor the disk usage on the HDD to see the issue happen.
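
To capture the moment the usage drops rather than watching by hand, a small logging loop could be left running alongside the copy. This is only a sketch, assuming GNU `df` and `date`; the function names and the 10-second default interval are illustrative choices, not part of the setup described.

```shell
# used_kb PATH: print the used space, in 1K blocks, of the filesystem
# that contains PATH.
used_kb() {
    df --output=used -k -- "$1" | tail -n 1 | tr -d ' '
}

# dropped PREV CUR: succeed if CUR is smaller than PREV.
dropped() {
    [ "$2" -lt "$1" ]
}

# watch_usage PATH [INTERVAL]: sample usage every INTERVAL seconds
# (default 10) and print a timestamped line whenever usage decreases.
# Runs until interrupted.
watch_usage() {
    wu_path=$1
    wu_interval=${2:-10}
    wu_prev=$(used_kb "$wu_path")
    while sleep "$wu_interval"; do
        wu_cur=$(used_kb "$wu_path")
        if dropped "$wu_prev" "$wu_cur"; then
            printf '%s usage dropped: %s KB -> %s KB\n' \
                "$(date '+%F %T')" "$wu_prev" "$wu_cur"
        fi
        wu_prev=$wu_cur
    done
}
```

Running `watch_usage` on the destination directory in a second terminal, with the output redirected to a file, would leave a timestamped record of each drop that can be lined up against `dmesg` or syslog entries from the same moment.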

Background on servers:

Server 1 (working):

Memory: [screenshot 1678314136449.png]
Disks: [screenshot 1678314262442.png]

Server 2 (HDD issues):

Memory: [screenshot 1678314373602.png]
Disks: [screenshot 1678314321950.png]

Any help would be appreciated!
 

Attachment: 16TBHD.png (183.7 KB)


Someone else will probably be able to help you better with this, but it would be useful if you shared the script you use. Do you get an error message when the script exits? Also, how much RAM is in each of the servers, and are there any other differences between the two systems?
 
It may also help if you check your details and correct them.

Ubuntu release numbers use a year.month format (with 2000 subtracted from the year), so your release details would mean the 14th month of 2020?
 
My bad, I was half asleep when I originally created the thread. Fixed.
 
Updated the post.

The nature of the script doesn't seem to matter: scp, rsync, and cp all run fine with no errors or warnings.
If you actively monitor the HDD, you'll see usage climb past 100G and then, all of a sudden, drop down to 9G or so. It will do this over and over until the script finishes.

At the end, there should be around 8TB of data copied, but I'm left with only the last 9G of files.
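
To quantify how much of the copy survived, the file count and total size on the destination can be compared with the same numbers taken on the source. Below is a minimal sketch using `find` and `du`; `inventory` is a hypothetical helper name, and the count assumes file names do not contain newlines.

```shell
# inventory DIR
# Hypothetical helper: prints the number of regular files under DIR and
# the total size of DIR in KiB, in the form "files=N kib=M".
inventory() {
    inv_files=$(find "$1" -type f | wc -l | tr -d ' ')
    inv_kib=$(du -sk -- "$1" | cut -f1)
    printf 'files=%s kib=%s\n' "$inv_files" "$inv_kib"
}
```

Running it on the source directory and again on the destination once the copy finishes gives two lines that should roughly match if nothing was lost. Sizes can differ slightly between filesystems, so the file count is the more reliable of the two numbers.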

I put the differences between the servers in the post.
 

