Why Is LZMA Compression So SLOW??..... :<

blackneos940
I understand that Compression Algorithms check for redundant Data in Files and then compress them to save space on the Disk, but man, using Ark to compress my Virtual Machines from VirtualBox is so painfully SLOW!!..... :( I think I'll go back to playing Need For Speed, while compressing my *BSD and TAILS VMs..... :( Thanks for any answers guys
[image: Infinity_Geometry.gif]
..... :3

Edit: That Geometry infinitum is how I feel right now while compressing..... :(
 


Ark sucks. Arch is good as a distro, but it ships the lamest archive manager. I prefer using the option in my distro to right-click the files I want to archive and press "Compress", then select 7z. Or, if I want to set a compression level, the 7-zip for Windows (7zFM.exe) runs perfectly with Wine - you can compress whatever you want, at whatever level you want, wherever you want.
 
Generally speaking, most modern compression algorithms give roughly the same compression, and with regard to the number of cores you can use at once, it is up to you to decide how many you want to use. 7-zip itself is free and open source. The 7z format supports encryption with the AES algorithm and a 256-bit key. And if you give 7-zip a maximum volume size, it will split a large archive into multiple files automatically, such as integration_serviceLog.zip.001, integration_serviceLog.zip.002, etc. (Way back when, PKZIP used this to span zip files across multiple floppy disks.) You'll need all of the parts to be present to unzip them. The 7z format also provides the option to encrypt the filenames of a 7z archive.
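For example, something like this should work (a sketch using standard 7z switches - the archive name, folder, and volume size here are just placeholders):
Code:
# Create an AES-256 encrypted archive, encrypt the filenames too (-mhe=on),
# and split it into 100MB volumes (-v100m). -p on its own prompts for a password:
7z a -p -mhe=on -v100m backup.7z ~/Documents

# To extract, all of the parts (backup.7z.001, backup.7z.002, ...) must be present:
7z x backup.7z.001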
 
Ark sucks. Arch is good as a distro, but it ships the lamest archive manager.

But the Arch installer only gives you a basic, terminal-based environment. Arch doesn't have a default archive manager or a default desktop environment! After the initial installation - the graphical environment and all of the other software selections are completely up to the user to configure and install.

However, some of the Arch-derived distros that provide a fully configured Arch desktop system might supply Ark by default - usually only the ones that install KDE/Plasma as the default desktop. Any that install a Gnome-based system will typically use file-roller. But in my experience, both archive managers integrate well with the file managers in their respective Desktop Environments. I've never had a problem with either of them. Both can handle all of the most common compression formats - as long as you have the appropriate tools installed for each file-type.

That said, I'm a bit of a terminal junkie. I tend to use the terminal for most things. I find it a lot more convenient to issue a quick one-liner, use a bash alias, or fire off a script (I write scripts for everything) than to clumsily fire up some GUI program, wait for it to load, and point and click through a bunch of options before starting whatever task I want to get done.
Just switch to a terminal (I always have at least one running), two seconds of typing and BAM! Job done! Or at the very least, job running in the background whilst I get on with other things! XD

the 7-zip for Windows (7zFM.exe) runs perfectly with Wine - you can compress whatever you want, at whatever level you want, wherever you want.
Why are you running the Windows version through Wine? If you are running Arch - there should be a native Linux build of 7-zip available - if not in the repos - then in the AUR..... At least there always used to be - I haven't used Arch in a good number of years though, so perhaps this situation has changed. The command-line tools were always part of the p7zip package.
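If it's still there, using it is a one-liner anyway - a quick sketch (package and archive names as I remember them; check your repos):
Code:
# Install the native 7-zip command-line tools on Arch:
sudo pacman -S p7zip

# Compress a folder at the Ultra level (-mx=9):
7z a -mx=9 archive.7z myfolder/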

Going back to @blackneos940's post - if you are compressing large disk images of virtual machines - it's going to take a long time and potentially use up a lot of system resources to do so. And from what I recall, your PC was fairly low-spec.

The amount of time taken to compress/decompress will depend on things like: the algorithm used, the compression ratio, and the available system resources of the computer performing the compression/decompression (memory, processor speed, the number of processor cores, the number of other processes running, etc.).
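On the processor-core point - both xz and 7z can be told to use all of your cores, which helps a lot on multi-core machines. A quick sketch (directory and archive names are just examples):
Code:
# How many cores do we have to play with?
nproc

# xz: -T0 means "use all available cores":
tar cf - bigdir | xz -T0 > bigdir.tar.xz

# 7z: -mmt=on enables multithreading:
7z a -mmt=on bigdir.7z bigdir/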

The other thing to bear in mind is that the amount of compression that is applied to an archive also depends on the type of files that you are compressing.

If you have a lot of files that are already in a compressed format (e.g. certain video/audio files, data files, or other archive files) it may be impossible to compress those files much further. So no matter what compression algorithm you're using - the archive size may not be much smaller than the original files it contains.
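You can see this for yourself with a quick test - highly redundant data compresses to almost nothing, while random data (which is effectively what already-compressed files look like) barely compresses at all. A sketch:
Code:
# 10MB of zeros (maximum redundancy) vs 10MB of random bytes (no redundancy):
dd if=/dev/zero of=zeros.bin bs=1M count=10
dd if=/dev/urandom of=random.bin bs=1M count=10

# Compress both, keeping the originals (-k):
gzip -k zeros.bin random.bin

# zeros.bin.gz should come out at a few KB;
# random.bin.gz will be roughly the same size as the original:
ls -lh zeros.bin* random.bin*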

Re: @rado84's recommendation of 7zip:
Avoid using .7z on its own for compressing backups on Linux. The .7z format does NOT preserve file-permissions (the owner and group permissions) for files.

It says so right there in the man-page for 7z:
Backup and limitations
DO NOT USE the 7-zip format for backup purpose on Linux/Unix because :
- 7-zip does not store the owner/group of the file.

On Linux/Unix, in order to backup directories you must use tar :
- to backup a directory : tar cf - directory | 7za a -si directory.tar.7z
- to restore your backup : 7za x -so directory.tar.7z | tar xf -

7z is a great format and can provide compact archives, but I only use 7z on its own if I don't care about preserving file-permissions. For example, if I'm compressing a folder containing some files to send to somebody, then I might use .7z.

I don't use .tar.7z (as per the quote from the 7z man-page) because decompression/extraction becomes a two-stage operation in the GUI. It's a two-stage operation in the terminal too, but at least in the terminal you can chain the two commands together into one operation - as per the example quoted from the man-page.

Personally, I often use tar.xz for backups, or .tar.gz if I want it done quickly and when file-size is not an issue. With .tar.xz and .tar.gz - it's still technically a two stage operation to extract, but unlike .tar.7z - both steps can be performed in a single operation/command in the GUI (and in the terminal using one tar command).
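For example, extraction is a single command (and modern GNU tar can even auto-detect the compression format, so the exact flag often doesn't matter):
Code:
tar xzf backup.tar.gz    # gzip
tar xJf backup.tar.xz    # xz
tar xf  backup.tar.xz    # GNU tar works out the compressor itself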

Also, the .xz and .7z compression algorithms are very comparable in terms of speed and file-size. Both provide nice, compact archives (smaller file-sizes), but they do use a little more time and memory to compress and decompress files than gzip, which is by far the fastest algorithm.

With xz, depending on the compression ratio and the types of files being compressed, you can typically get the archive's file-size down to between 8% and 18% of the original size - which again is comparable to .7z.

If you wanted the smallest possible file-sizes, and the memory usage and time taken are not an issue - you could use xz with the -e option (extreme) and the compression ratio set to -9 - which tries to get the file-size as small as possible. But that will take much, much longer to compress and decompress.
You can do the same with 7z too.
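A quick sketch of both (archive and directory names are placeholders):
Code:
# xz at maximum ratio (-9) plus extreme mode (-e):
tar cf - mydir | xz -9e > mydir.tar.xz

# 7z at Ultra (-mx=9), reading the tarball from stdin:
tar cf - mydir | 7za a -si -mx=9 mydir.tar.7z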

By contrast - with gzip, you're typically looking at the archive being approximately 12%-26% of its original size. And depending on the type of data being compressed/decompressed and the compression ratio - it is typically 3 to 6 times faster than 7z, or xz.

So if you want fast compression, and file-size is not so much of an issue - gzip is your best bet. gzip is the fastest algorithm for compression and decompression and uses less memory/system resources, but yields slightly larger files. In that scenario - I'd recommend .tar.gz.

If you want the smallest files - at the cost of longer compression/decompression times I'd go with .tar.xz

Or if you want something slightly faster than xz or 7z, but with better compression than gzip, another good option is bzip2 (using the .tar.bz2 extension).
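So, side by side (the directory name is just an example):
Code:
tar czf backup.tar.gz  mydir/   # gzip:  fastest, largest file
tar cjf backup.tar.bz2 mydir/   # bzip2: the middle ground
tar cJf backup.tar.xz  mydir/   # xz:    slowest, smallest file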

Wrapping things up - personally - I tend to use .tar.xz or .tar.gz for backups, or whenever it's important to preserve permissions.
For quick and dirty archives to send to other people, I'll use .7z when preservation of permissions is not a priority, or just use plain old zip!
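e.g. (file names are placeholders):
Code:
7z a notes.7z notes/      # compact, but doesn't preserve owner/group
zip -r notes.zip notes/   # larger, but opens pretty much anywhere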

Just to back up everything I've said, here's a quick experiment:
I have a folder which contains 45MB of C++ source code. All UTF-8 text files with .h or .cpp file extensions, plus some other random text files containing notes and things. There may also be some git repo data in there too.
Code:
$ du -sh JasProjects
45M     JasProjects

Creating a few archives - using the default compression ratios on each and using the time command to time each operation:
Code:
$ time tar czf JasProjects.tar.gz JasProjects

real    0m1.310s
user    0m1.075s
sys     0m0.576s

$ time tar cJf JasProjects.tar.xz JasProjects

real    0m3.104s
user    0m2.775s
sys     0m0.560s

$ time ( tar cf - JasProjects | 7za a -si JasProjects.tar.7z )

# Verbose output from 7za snipped..

real    0m8.424s
user    0m11.231s
sys     0m1.416s
That created a .tar.gz, a .tar.xz and a .tar.7z and we can see how long each operation took.

Now listing the files, so we can see their sizes:
Code:
$ ls -lh JasProjects.tar.*
-rw-r--r-- 1 jason jason 4.3M Apr  8 16:29 JasProjects.tar.7z
-rw-r--r-- 1 jason jason 6.1M Apr  8 16:29 JasProjects.tar.gz
-rw-r--r-- 1 jason jason 4.3M Apr  8 16:29 JasProjects.tar.xz

So looking at the output above - I get the following results:
From fastest to slowest, with the final size and the percentage of the original directory size:
.tar.gz in 1.3 seconds => 6.1MB => 13.5% of the original size
.tar.xz in 3.1 seconds => 4.3MB => 9.5% of the original size
.tar.7z in 8.4 seconds => 4.3MB => 9.5% of the original size

So in this case - the results were fairly close for all three methods in terms of file-size. gzip was by far the fastest algorithm - roughly two and a half times faster than xz and more than six times faster than 7z, which was a big surprise. But gz yielded the largest file.

xz and 7z yielded identical sized files, but xz was more than twice as fast as 7z. Which is interesting, because even I thought 7z would be a bit faster than that!

Interesting!
 
Why are you running the Windows version through Wine? If you are running Arch

Re: @rado84's recommendation of 7zip:
Avoid using .7z on its own for compressing backups on Linux. The .7z format does NOT preserve file-permissions (the owner and group permissions) for files.

I'm not running Arch. I tried it once or twice but it didn't agree with me, so I went back to the distro I know and love - Mint, which is in my signature.

The backups I make are mostly files from ~/.config. The other purpose is to compress .scs files, which are mods for American Truck Simulator. With the Windows 7-zip, a 100 MB file easily becomes 10 MB at the Ultra compression level. I've never had a single problem with permissions, since I solved that quite easily by simply opening fstab and adding


Code:
uid=1000

to the "Options" column for both partitions where I store my files. Thus, all files and folders automatically become my property when the system boots up.
 
BIG like on the above, Jas ... thanks for sharing :), I have bookmarked.

If you have a lot of files that are already in a compressed format (e.g. certain video/audio files, data files, or other archive files) it may be impossible to compress those files much further. So no matter what compression algorithm you're using - the archive size may not be much smaller than the original files it contains.

In my old days with Windows archiving, we used to refer to this as "white space" within the files.

Cheers

Wizard
 

