File compression doesn't seem to work - what am I missing?

Priest_Apostate

I am currently learning to set up tarballs as part of my LPIC studies. I am testing this with 16 files totaling 1.6GB. From what I researched online, it seemed that the xz program offered the best compression. After 30 minutes of compressing the archive, I checked how much the files had been compressed, only to find that the result wasn't all that great:


1751832890722.png


1751832935664.png

That doesn't seem like much compression.

What am I missing with regard to creating tarballs?
 


I'm not familiar with the xz program, so I looked it up.
You might find the notes on the 'compression options' helpful.

I tend to use the command line for most things.

For videos you could use Video Compressor, or you could try HandBrake.

Our members @dos2unix, @wizardfromoz & @JasKinasis may have other helpful alternatives.
 
I tried using these two actions to test:


1. tar -cvf to create the tar file - and xz to create the compressed file.

then

2. tar -cvf, then xz -v9 to improve the compression ratio. The screenshots provided show the results of that xz -v9 option.

xz seems to have the highest compression ratio (as opposed to gzip and bzip2), which is why I chose that one for testing.
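
For anyone following along, the two steps above might look roughly like this. It is only a sketch: the directory and archive names (sample_files, archive.tar) are made up for illustration and are not the files from the screenshots.

```shell
#!/bin/sh
# Sketch of the two-step approach; "sample_files" and "archive.tar" are hypothetical names.
set -eu
cd "$(mktemp -d)"
mkdir sample_files
printf 'hello tarball\n' > sample_files/a.txt
printf 'more sample text\n' > sample_files/b.txt

# Step 1: bundle the files into an uncompressed tar archive.
tar -cvf archive.tar sample_files

# Step 2: compress the archive with xz at the highest preset.
# -k keeps archive.tar so the two sizes can be compared afterwards.
xz -9 -v -k archive.tar

# Compare the sizes of the plain and compressed archives.
ls -l archive.tar archive.tar.xz

# xz can also report the ratio it achieved.
xz -l archive.tar.xz
```

Comparing the two sizes from ls -l (or reading the ratio from xz -l) shows how much was actually saved.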
 
Binary files often do not compress much. This is especially true of mp4 and mpa files: they are already compressed, so you are trying to compress data that has already been compressed.
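
A quick way to see this for yourself is sketched below. It assumes random bytes are a fair stand-in for media that is already compressed (both look like noise to a compressor), and the file names are made up. It uses xz if it is installed and falls back to gzip otherwise.

```shell
#!/bin/sh
# Sketch: compare how repetitive text and noise-like data respond to compression.
# text.txt and random.bin are hypothetical file names.
set -eu
workdir=$(mktemp -d)

# Highly repetitive text, similar in spirit to source code or logs.
yes "the quick brown fox jumps over the lazy dog" | head -n 20000 > "$workdir/text.txt"

# Random bytes as a stand-in (assumption) for already-compressed media such as mp4 or mp3.
head -c 1000000 /dev/urandom > "$workdir/random.bin"

# Use xz if installed, otherwise fall back to gzip; the effect is the same.
if command -v xz >/dev/null 2>&1; then comp=xz; else comp=gzip; fi

text_before=$(wc -c < "$workdir/text.txt")
text_after=$("$comp" -9 -c "$workdir/text.txt" | wc -c)
random_before=$(wc -c < "$workdir/random.bin")
random_after=$("$comp" -9 -c "$workdir/random.bin" | wc -c)

echo "text.txt:   $text_before bytes -> $text_after bytes ($comp -9)"
echo "random.bin: $random_before bytes -> $random_after bytes ($comp -9)"

rm -rf "$workdir"
```

The text file should shrink dramatically, while the noise-like file should stay about the same size or even grow slightly.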
Okay, another question: how/where did you learn that? Asking as this is the first time I've heard of mp4/mpa files already being compressed.
 
G'day

You may be surprised to learn that that has been the case for over 30 years.

In 1991/1992 the Joint Photographic Experts Group came up with the .jpeg/.jpg standard to transfer image files more efficiently over the Internet than could be done with .bmp (bitmap).

Likewise, around that time the Moving Picture Experts Group defined the .mpeg standard for video, as an alternative to .avi, .mov and so on.

.mp3 (MPEG Audio Layer III) was added as an audio layer for music.

These formats gained momentum from around 1995 to 2000 as the Internet gained in popularity.

HTH

Wizard
 
I guess my confusion stems from my understanding that at the bottom line, all files could be reduced to binary - and thus considered as such.
Is that in error?

Technically, you are right... everything is really binary. But in Linux, what we really mean is: was it compiled with a compiler (C, C++, Rust, Java, etc.), or is it a text-like file meant to be run by an interpreter (Bash, Python, Perl, etc.)?

Now these are files that do something, run something or change something.

Then you have static files. They don't really do anything they are just data.
This could be a text file.

Bash, Python, Perl, text: these files are really just ASCII text and they compress pretty well.
mp3s, jpgs, gifs, mp4a's... these are also static files. They don't do anything on their own. They are just data.
You need a graphics or media program to view them, play them or listen to them (as the case may be).

Typically these files are already compressed pretty well.
There are several different compression algorithms and some compress better than others.
It's possible you might get slightly better compression... but then they wouldn't be compatible with the application that uses them. You would have to compress and uncompress the file every time you used it. It's a lot of work for hardly any gain.

Normally in Linux, you just compress text files, (bash, python, source code, perl, etc..).

I don't really remember how I learned this, or when.. but it's been that way for a very long time.
Even in the Windows world, compression programs can't compress gifs, jpegs, or mp3s. (at least not very much).
 
In the example above you are using tar with compression. I do that a lot also myself.
Mostly just out of habit, more than any real space savings.
tar itself is just an archiving program. It doesn't compress anything on its own.

When you add the gzip, bzip2, or xz flags (-z, -j, or -J in GNU tar), it uses a separate external program to compress the archive.
tar is great for archiving. Not so great for compressing binary and already compressed files.
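
As a rough illustration of those flags, the sketch below archives the same directory with tar alone and then with each compressor. The notes directory and archive names are made up, and each compressed variant is skipped if that compressor isn't installed.

```shell
#!/bin/sh
# Sketch: one directory archived with plain tar and with each compression flag.
# "notes" and the archive names are hypothetical.
set -eu
cd "$(mktemp -d)"
mkdir notes
yes "tar only bundles files; the compressor does the shrinking" | head -n 5000 > notes/notes.txt

# No compression flag: tar only bundles the files.
tar -cf notes.tar notes

# Each flag hands the archive to a different compressor, if it is available.
command -v gzip  >/dev/null 2>&1 && tar -czf notes.tar.gz  notes   # -z: gzip
command -v bzip2 >/dev/null 2>&1 && tar -cjf notes.tar.bz2 notes   # -j: bzip2
command -v xz    >/dev/null 2>&1 && tar -cJf notes.tar.xz  notes   # -J: xz

# List whatever was created so the sizes can be compared.
ls -l notes.tar*
```

With text like this, every compressed archive should come out far smaller than notes.tar. With media files the sizes would be nearly identical.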
 

