Me and Deedee, it's complicated.

dos2unix

I use the "dd" command quite often. I don't think I've ever gone more than a month in the last 20 years without using it.
What is "dd"? The name is often expanded as "data duplicator" or "disk duplicator", although it more likely comes from the DD (Data Definition) statement in IBM's JCL. It is primarily used for low-level copying and conversion of raw data. It copies bytes exactly as it finds them without interpreting them, which is particularly useful for tasks that require precise data replication.

dd can create exact copies of disks and partitions, making it ideal for backups and cloning.
It can create .img and .iso files from disks, partitions, and optical discs. (It can't image a directory directly; dd reads files and devices, not directory trees.)
dd can convert data formats during the copying process, such as changing the byte order or converting between ASCII and EBCDIC.
(Unless you're using an IBM mainframe, chances are you don't care about EBCDIC).

dd reads from an input file (if) and writes to an output file (of). By default, it reads and writes in blocks of 512 bytes, but this can be adjusted using various options.
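
To see the block size at work without touching a real disk, here is a small sketch. It assumes GNU dd, and the scratch file names are made up for the example.

```shell
# Scratch directory so nothing here goes near a real device.
workdir=$(mktemp -d)

# No bs= given: 8 blocks of the default 512 bytes.
dd if=/dev/zero of="$workdir/default.bin" count=8 2>/dev/null
default_size=$(wc -c < "$workdir/default.bin" | tr -d ' ')

# Same count, but with 4096-byte blocks.
dd if=/dev/zero of="$workdir/bigger.bin" bs=4096 count=8 2>/dev/null
bigger_size=$(wc -c < "$workdir/bigger.bin" | tr -d ' ')

echo "default blocks: $default_size bytes"   # 8 * 512  = 4096
echo "bs=4096 blocks: $bigger_size bytes"    # 8 * 4096 = 32768

rm -rf "$workdir"
```

The count is in blocks, not bytes, so changing bs changes how much data gets copied.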

Here is how you would copy the partition of a disk. In all the examples below I am using sdX and sdY. Change those
values to your specific situation.
Code:
sudo dd if=/dev/sdX1 of=/dev/sdY1 bs=4M

In that example, I'm copying from one disk partition to another. (They don't have to be on the same physical disk.)
I'm using 4 MiB blocks to speed things up. More about that later; you can go too far with that concept.
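
If you want to convince yourself that the copy really is byte-for-byte, you can practice on ordinary files first. This is only a sketch: the two .img files are stand-ins for /dev/sdX1 and /dev/sdY1, and iflag=fullblock assumes GNU dd.

```shell
workdir=$(mktemp -d)
src="$workdir/source.img"     # stand-in for /dev/sdX1
dst="$workdir/target.img"     # stand-in for /dev/sdY1

# Build a 2 MiB pretend partition full of random bytes.
dd if=/dev/urandom of="$src" bs=1M count=2 iflag=fullblock 2>/dev/null

# The copy itself, same shape as the partition-to-partition command.
dd if="$src" of="$dst" bs=4M 2>/dev/null

# cmp -s prints nothing and returns 0 only if every byte matches.
if cmp -s "$src" "$dst"; then
    copy_result="identical"
else
    copy_result="different"
fi
echo "copy is $copy_result"

rm -rf "$workdir"
```

On real partitions the same cmp check works, as long as the target is at least as big as the source and you compare only the source's length.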

This is how to create an .img file from your entire disk.
Code:
sudo dd if=/dev/sdX of=/path/to/image.img bs=4M

If this disk is bootable, then your target image will also be bootable. The disadvantage is that you get ALL the bytes of your disk,
even the empty, unused ones that hold no data. If my disk is 128GB, my image will be 128GB also. There is a way around this.

How to create an .iso file from an optical disc.
Code:
sudo dd if=/dev/cdrom of=/path/to/image.iso bs=4M

Which brings up another subject. It seems people are confused about the difference between .img and .iso files.
For the most part, dd doesn't really care. You could accidentally use the wrong one, and chances are it won't really
make any difference. But for the record...

.img files are typically raw disk images that contain an exact bit-by-bit copy of a storage device, such as a hard drive, USB drive, or SD card.
They can include the entire file system, including boot sectors, partitions, and file data.
Commonly used for creating backups, cloning disks, or transferring the entire contents of a storage device.
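An image of a single partition can be mounted in Linux with the mount command through a loop device. An image of a whole disk contains a partition table, so you first attach it with losetup (the -P option exposes its partitions) and then mount one of the partitions.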

.iso files are specifically designed to be images of optical discs, such as CDs, DVDs, or Blu-ray discs.
They usually contain a file system that is compatible with optical media, such as ISO 9660 or UDF.
Commonly used for distributing software, operating system installation media, and creating bootable discs.
They can be mounted as a loop device in Linux using the mount command with the -o loop option.

But nobody really follows those rules. Everyone burns iso images to USB thumb drives, and in the end, it doesn't really matter.
But for the record...

.img: General-purpose disk images for various storage devices.
.iso: Specifically for optical disc images.
.img: Can contain any file system, including those used by hard drives and USB drives.
.iso: Typically contains file systems used by optical media (ISO 9660, UDF).
.img: Used for backups, cloning, and transferring entire storage devices.
.iso: Used for distributing software, creating bootable media, and installing operating systems.

How's that for confusing?

I mentioned earlier that using larger block sizes will speed your copy up, and it will, but...
(why is there always a but?) For example, we can control the block size like this...
Code:
sudo dd if=/dev/sdX of=/dev/sdY ibs=1M obs=512k

Well that's great, but why not just use ibs=8G or obs=8G? dd will try to do that, as long as it can allocate a buffer that big. I don't know if there is an
upper limit. However...

Large block sizes require more memory. If your system has limited RAM, using very large block sizes could lead to memory exhaustion or swapping, which can degrade performance.
When ibs and obs are set to large values, the system's I/O buffering might not be as efficient. This can sometimes lead to suboptimal performance, especially if the underlying storage device has its own optimal block size.
With larger block sizes, if an error occurs during the read or write process, more data might be lost or corrupted compared to using smaller block sizes. This is because each block contains more data.
Some older or less capable storage devices might not handle very large block sizes efficiently, leading to potential compatibility issues.

So there is a trade-off. Going too big can cause some problems. As a rule, I try to limit the blocks to 4M or 8M. That's usually a
pretty good compromise.
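
If you'd rather measure than guess, a loop like this one gives a rough feel for how block size affects speed. It's a sketch under some assumptions: GNU dd, a 16 MiB scratch file instead of a real device, and made-up file names. The page cache makes the numbers look better than a real disk would, so treat them as relative only.

```shell
workdir=$(mktemp -d)
src="$workdir/source.bin"
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null   # 16 MiB test file

all_match="yes"
for bs in 512 4K 64K 1M 4M; do
    # The last line dd prints on stderr holds the elapsed time and throughput.
    summary=$(LC_ALL=C dd if="$src" of="$workdir/copy.bin" bs="$bs" 2>&1 | tail -n 1)
    echo "bs=$bs -> $summary"
    # Whatever the block size, the result should be the same bytes.
    cmp -s "$src" "$workdir/copy.bin" || all_match="no"
done
echo "all copies identical: $all_match"

rm -rf "$workdir"
```

The block size only changes how the data is moved, never what ends up in the output.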

But I'm getting myself side-tracked here. Who's writing this article anyway? Back to dd.

I mentioned earlier that by default dd does a bit-by-bit copy, even of the unused bits. But you can keep dd from writing out the
empty (all-zero) blocks.

Code:
sudo dd if=/dev/sdX of=/path/to/sparse.img conv=sparse

But I have noticed this doesn't work on every distro. Why? They aren't all shipping the same dd: conv=sparse is a GNU coreutils extension, so a system with BusyBox dd or a very old coreutils may not accept it.
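
Here is a way to watch conv=sparse do its thing on a scratch file. This is a sketch that assumes GNU dd and a filesystem that supports holes; the file names are invented. The contents and the apparent size stay the same, and only the space actually used on disk should shrink.

```shell
workdir=$(mktemp -d)
src="$workdir/disk.img"

# 1 MiB of data, 8 MiB of zeros, then 1 MiB of data again.
dd if=/dev/urandom of="$src" bs=1M count=1 iflag=fullblock 2>/dev/null
dd if=/dev/zero    of="$src" bs=1M count=8 seek=1 conv=notrunc 2>/dev/null
dd if=/dev/urandom of="$src" bs=1M count=1 seek=9 conv=notrunc iflag=fullblock 2>/dev/null

# One ordinary copy and one sparse copy of the same source.
dd if="$src" of="$workdir/dense.img"  bs=64K 2>/dev/null
dd if="$src" of="$workdir/sparse.img" bs=64K conv=sparse 2>/dev/null

# wc -c shows the apparent size; du -k shows what is really allocated on disk.
dense_size=$(wc -c < "$workdir/dense.img" | tr -d ' ')
sparse_size=$(wc -c < "$workdir/sparse.img" | tr -d ' ')
echo "apparent size: dense=$dense_size sparse=$sparse_size"
echo "on disk (KiB): dense=$(du -k "$workdir/dense.img" | cut -f1) sparse=$(du -k "$workdir/sparse.img" | cut -f1)"

# The holes read back as zeros, so the two copies compare equal.
cmp -s "$workdir/dense.img" "$workdir/sparse.img" && same_content="yes" || same_content="no"
echo "same content: $same_content"

rm -rf "$workdir"
```

How much the on-disk number drops depends on the filesystem, so I haven't written an expected value for it.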

Now sometimes, these copies can take a long time, depending on the speed of your computer and the speed of the devices
you are reading from and writing to. I have literally sat for an hour or two waiting for a large (256GB) image to copy.
However, there is a way to see how far along the copy is.
Code:
sudo dd if=/dev/sdX of=/dev/sdY bs=4M status=progress

That will tell you how far along your copy is. That way you know if you have time to go grab a burger before it's done.
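
One thing worth knowing is that status=progress writes to stderr, not stdout. A small sketch, assuming a GNU dd new enough to have status=progress and using scratch files with made-up names, that keeps only the final summary line:

```shell
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/source.bin" bs=1M count=8 2>/dev/null

# The progress updates are separated by carriage returns on stderr.
# Turning those into newlines lets us keep just the last line, the summary.
summary=$(LC_ALL=C dd if="$workdir/source.bin" of="$workdir/copy.bin" \
            bs=4M status=progress 2>&1 | tr '\r' '\n' | tail -n 1)
echo "$summary"

rm -rf "$workdir"
```

On a copy this small you will probably only see the summary, since the progress line is refreshed about once a second.
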
Looking through the man pages, I see that dd has some interesting options that I never use. But for the sake of completeness
I will add them here.

count=N: Copy only N input blocks
skip=N: Skip N input blocks before starting to copy
seek=N: Skip N output blocks before starting to copy
conv=sync: Pad every input block to the input buffer size

What's the difference between skip and seek? skip jumps ahead in the input file before reading; seek jumps ahead in the output file before writing. Here is how you would use count in a command.
Code:
sudo dd if=/dev/sdX of=/dev/sdY bs=4M count=100 status=progress
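
The skip, seek, and conv=sync options are easier to follow on tiny text files than on disks. A minimal sketch, with made-up file names and bs=1 so that each block is a single byte:

```shell
workdir=$(mktemp -d)
printf 'ABCDEFGHIJ' > "$workdir/in.txt"
printf '0123456789' > "$workdir/out.txt"

# skip works on the INPUT: jump over 3 one-byte blocks, then copy 4.
skipped=$(dd if="$workdir/in.txt" bs=1 skip=3 count=4 2>/dev/null)
echo "$skipped"                      # prints DEFG

# seek works on the OUTPUT: jump over 4 one-byte blocks, then write.
# conv=notrunc keeps the rest of the existing file instead of truncating it.
printf 'XY' | dd of="$workdir/out.txt" bs=1 seek=4 conv=notrunc 2>/dev/null
sought=$(cat "$workdir/out.txt")
echo "$sought"                       # prints 0123XY6789

# conv=sync pads a short input block with NUL bytes up to the block size.
padded_len=$(printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c | tr -d ' ')
echo "$padded_len"                   # prints 8

rm -rf "$workdir"
```

Both skip and seek count in blocks, so with bs=4M a skip=1 jumps 4 MiB, not one byte.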

How would you change the byte order?
Code:
dd if=inputfile of=outputfile conv=swab
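
To see what conv=swab actually does, feed it a few bytes on stdin. It swaps each pair of adjacent bytes. A quick sketch:

```shell
# Text makes the pairwise swap easy to see.
swapped=$(printf 'abcdef' | dd conv=swab 2>/dev/null)
echo "$swapped"        # prints badcfe

# Same thing on raw bytes 01 02 03 04, shown as hex.
swapped_hex=$(printf '\001\002\003\004' | dd conv=swab 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "$swapped_hex"    # prints 02010403
```

That's a 16-bit swap, so it is the right tool for data made of 2-byte words and not for reversing 32-bit or 64-bit values.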

For you mainframe users out there, how would you convert ASCII to EBCDIC ?
Code:
dd if=ascii_file.txt of=ebcdic_file.txt conv=ebcdic

...and you can go back the other way also.

Code:
dd if=ebcdic_file.txt of=ascii_file.txt conv=ascii

You can combine all of this.
Code:
dd if=ascii_file.txt of=ebcdic_swabbed_file.txt conv=ebcdic,swab
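
You don't need a mainframe to check the conversions. This sketch pipes a little text through dd and looks at the result; the sample strings are just for illustration.

```shell
# ASCII 'A' is 0x41; in EBCDIC it becomes 0xC1.
ebcdic_hex=$(printf 'A' | dd conv=ebcdic 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "$ebcdic_hex"     # prints c1

# Converting to EBCDIC and back again should give the original text.
round_trip=$(printf 'HELLO 123' | dd conv=ebcdic 2>/dev/null | dd conv=ascii 2>/dev/null)
echo "$round_trip"     # prints HELLO 123
```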

I have gone through the entire LFS exercise. Then I created a bootable distro for that using dd.
It's the most reliable disk/image copier there is... bar none!!! It's not as easy to use as Rufus, MediaCreator,
Etcher, or some others. But it always works, even when those others don't.

Do I get a commission for that plug?
 


One last comment about burning images to USB thumb drives:

Linux might tell you the copy is done, but it isn't always true. If you remove the drive the moment the command prompt returns, your copy might be corrupted.

Have you ever noticed the "Safely remove drive" popup in Windows and Linux? Why is it necessary? You might think you can pull the drive whenever you want, but that's not the case.

The data is actually written to a disk cache, a part of memory that pretends to be the disk. When you click "Safely remove," it writes the data from the cache to the disk before you remove it.

But there's no "Safely remove" button in the command line. So how do you know when it's really done? One trick is to try to unmount the drive. If it's still in use, it won't let you unmount it. Another method is to use the sync command, which writes whatever is in the cache to the disk.

Some people reboot their system, but this isn't always convenient. A normal shutdown flushes the cache and won't finish until the write is done, so you know the copy is complete when the computer reboots.

Additionally, you can use tools like hdparm to check and manage write cache settings, or monitor I/O operations with tools like iostat to check if all data is written.
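
Putting that together, here is a sketch of a write that doesn't hand the prompt back until the data has been flushed. It assumes GNU dd on Linux; the two files are stand-ins for an image and a USB stick, and the names are made up.

```shell
workdir=$(mktemp -d)
src="$workdir/image.img"        # stand-in for the image being written
dst="$workdir/usb-stick.img"    # stand-in for /dev/sdY
dd if=/dev/urandom of="$src" bs=1M count=4 iflag=fullblock 2>/dev/null

# conv=fsync makes dd flush the output before it exits, so the prompt
# does not come back while data is still sitting in the cache.
dd if="$src" of="$dst" bs=4M conv=fsync 2>/dev/null

# Belt and braces: sync flushes whatever is still dirty, system-wide.
sync

# On Linux, these counters show how much cached data is still waiting to be written.
if [ -r /proc/meminfo ]; then
    grep -E '^(Dirty|Writeback):' /proc/meminfo || true
fi

cmp -s "$src" "$dst" && flushed_copy="identical" || flushed_copy="different"
echo "copy is $flushed_copy"

rm -rf "$workdir"
```

With a real thumb drive, the dd line with conv=fsync is the part that can take a while to return.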
 
After running dd you can use the sync command, which does the same cache-flushing job as "Safely remove".
 

@dos2unix :-

I tend to use the unmount 'trick' described above.

Part of Puppy's desktop layout has always been to give the user a 'listing' of attached drives, which show as 'drive icons' along the bottom left-hand side of the screen.

In my case, I've moved the entire drive-icon 'line-up' along a short distance, so it sits better with my desktop layout. All my Pups are like this, with home-made 'docks' for various groups of icons drawn directly onto the background wallpaper.

[Attached image: Screenshot-485.png]


(You'll notice that some drives have a small 'X' in the top right corner. These are drives that are mounted, and 'in use'. Puppy achieves this by means of a transparent overlay with a small 'executable' area in the top-right corner; it's just a quick way for running the 'umount' command (makes things simpler for noobs). There's one for each drive type; HDD/SSD, USB (stick, HDD or SSD), floppy, SD card, or optical).


So I simply click on the small "X" for the appropriate drive.....and wait for it to disappear. Then, I know 'dd' has finished..!


Mike. ;)
 
Here is how you would copy the partition of a disk. In all the examples below I am using sdX and sdY. Change those
values to your specific situation.
Code:
sudo dd if=/dev/sdX of=/dev/sdY bs=4M

In that example, I'm copying from a disk partition to a different disk partition. (it doesn't have to be the same physical disk).
I'm using 4megabyte blocks to speed things up. More about that later, you can go too far with that concept.
Thanks for the rundown on dd.

Just to avoid code that may be misleading: if one is copying partitions, the device names would be /dev/sdX#, where # is a number, as distinct from /dev/sdX, which is the device name of the drive itself, where X is usually a letter such as a or b.

It's probably worth emphasising that dd doesn't differentiate between filesystems or partitions, rather it just copies raw data from one location to another, as mentioned. The user just needs to ascertain that the target device is correctly partitioned and formatted if needed in the case of using dd with partitions.
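
Since dd will happily write wherever it is pointed, a small guard in front of it can help. The function below is a hypothetical helper, not something that ships with dd; it assumes Linux (it reads /proc/mounts) and only checks that the target is a block device with nothing mounted from it.

```shell
# Hypothetical helper: refuse to continue unless the target looks safe to overwrite.
check_dd_target() {
    target=$1
    if [ ! -b "$target" ]; then
        echo "refusing: $target is not a block device" >&2
        return 1
    fi
    # Prefix match on purpose: /dev/sdb also catches a mounted /dev/sdb1.
    if grep -q "^$target" /proc/mounts 2>/dev/null; then
        echo "refusing: $target (or one of its partitions) is mounted" >&2
        return 2
    fi
    return 0
}

# Demo with something that is clearly not a disk.
not_a_disk=$(mktemp)
if check_dd_target "$not_a_disk" 2>/dev/null; then
    verdict="ok to write"
else
    verdict="refused"
fi
echo "$verdict"     # prints refused
rm -f "$not_a_disk"
```

It is still worth looking at lsblk before pressing Enter; a check like this catches only the most obvious mistakes.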
 
The dd command has laid many a system to rest because people didn't get the target right.

She's a ruthless mistress. She takes no prisoners because she does exactly what you tell her to do.
 
A couple of important features of dd need to be mentioned. You can use iflag=direct or oflag=direct for direct I/O with the device instead of buffered I/O. Direct I/O lets you read from or write to the device without using up system buffer space. If you run something like top, you will see that kswapd0 is not near the top of the list when you are using direct I/O.

sync may take quite a while, especially if you're using a flash drive. You can use top to see whether the kernel is still spending a lot of time in a wait state, waiting on hardware. sync should not return until all of the buffered data is written to the storage device. Direct I/O writes straight to the target device, so syncing afterwards should not be necessary. I prefer to get flash drives with the little LED on them so I can see it blinking when the drive is in use.

Signed,

Matthew Campbell
 
Code:
dd if=/path/to/input/file of=/path/to/output/file iflag=direct oflag=direct bs=4M

I tested this... again, it doesn't seem to work on all distros????
It slows things down a bit because it bypasses the cache.
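
One possible reason for the "doesn't work everywhere" result is that direct I/O depends on the filesystem or device underneath rather than on the distro. That is an assumption on my part; tmpfs on older kernels and some network or FUSE mounts are the usual suspects. A sketch that tries direct I/O and falls back to a flushed, buffered copy, assuming GNU dd and made-up scratch files:

```shell
workdir=$(mktemp -d)
src="$workdir/input.bin"
dst="$workdir/output.bin"
dd if=/dev/urandom of="$src" bs=1M count=4 iflag=fullblock 2>/dev/null

# Try direct I/O first. If the filesystem rejects it (typically with
# "Invalid argument"), fall back to a buffered copy flushed by conv=fsync.
if dd if="$src" of="$dst" bs=4M oflag=direct 2>/dev/null; then
    io_mode="direct"
else
    dd if="$src" of="$dst" bs=4M conv=fsync 2>/dev/null
    io_mode="buffered+fsync"
fi
echo "copied using $io_mode I/O"

cmp -s "$src" "$dst" && direct_copy="identical" || direct_copy="different"
echo "copy is $direct_copy"

rm -rf "$workdir"
```

Which branch runs depends on where the scratch directory lives, so the first line of output can differ from one machine to the next.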
 
It can slow things down, for dd anyway, while leaving the system buffers available for everything else. I have had sync hang for more than 10 minutes before when using dd to write to a USB flash drive. I would expect dd to call open(2) with something like O_DIRECT, and perhaps O_DSYNC, to avoid the system buffers when oflag=direct is given. I have been told functions like tcsetattr(3) don't work on CentOS either. Perhaps some Linux distributions are a bit deficient.

Signed,

Matthew Campbell
 

