Commit 6a8edd9

added more about compression, also demo video and reference card links available
1 parent 281c8c2 commit 6a8edd9

File tree

1 file changed

+60
-16
lines changed

docs/16.md

Lines changed: 60 additions & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -10,19 +10,52 @@ comments: true
1010

1111
As a system administrator, you need to be able to confidently work with compressed “archives” of files. In particular, two of your key responsibilities, installing new software and managing backups, often require this.
1212

13+
On other operating systems, applications like WinZip, and `pkzip` before it, have long been used to gather a series of files and folders into one compressed file - with a `.zip` extension.
14+
15+
Archiving and compressing, however, are actually distinct processes that are frequently used together. Archiving consolidates multiple files into one "box" while preserving their structure and attributes, but it does not change the total data size. Compression uses sophisticated algorithms to encode data efficiently, saving storage space by shrinking that file.
16+
1317
## YOUR TASKS TODAY
1418

15-
* Create a tarball
16-
* Create a compressed tarball and compare sizes
19+
* Compress a file and compare sizes
20+
* Archive the contents of a folder
21+
* Create a compressed "tarball"
1722
* Extract files from a tarball
1823

19-
## CREATING ARCHIVES
24+
Check out the [demo](https://asciinema.org/a/765120)
25+
26+
_Help maintain the course by purchasing the [reference card](https://buymeacoffee.com/livialima/e/493808) for this lesson._
27+
28+
## COMPRESSING FILES
29+
30+
The goal of compression is to reduce file size. Smaller files use less bandwidth and can be transmitted over networks faster. Different tools use different algorithms to actually shrink that data.
31+
32+
You can compress a file with GZip like this:
33+
34+
`gzip my-big-file`
35+
36+
...which will create `my-big-file.gz` but it will replace the original file. If you want to preserve the uncompressed file, try:
37+
38+
`gzip -vk my-big-file`
39+
40+
This uses `-v` to make the command "verbose" and `-k` to keep the original file.
41+
42+
When it's time to decompress, do it with the `-d` switch, like this:
2043

21-
On other operating systems, applications like WinZip, and pkzip before it, have long been used to gather a series of files and folders into one compressed file - with a .zip extension. Linux takes a slightly different approach, with the "gathering" of files and folders done in one step, and the compression in another.
44+
`gzip -d my-big-file.gz`
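
To see these switches working together, here is a minimal sketch that runs in a throwaway directory. The sample data and the variable names (`workdir`, `orig_size`, `gz_size`) are made up for illustration, and it assumes a `gzip` recent enough to support `-k`.

```shell
# Work in a temporary directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir"

# Build a repetitive text file (hypothetical sample data)
for i in $(seq 1 2000); do
  echo "line $i: the quick brown fox jumps over the lazy dog"
done > my-big-file

# Compress but keep the original, then compare sizes in bytes
gzip -k my-big-file
orig_size=$(wc -c < my-big-file)
gz_size=$(wc -c < my-big-file.gz)
echo "original: $orig_size bytes, compressed: $gz_size bytes"

# Decompress to stdout (-c) and confirm the data is unchanged
gzip -dc my-big-file.gz | cmp - my-big-file && echo "round trip OK"
```

Using `-dc` here leaves both files in place, which is handy when you only want to check an archive rather than unpack it.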
45+
46+
Popular Tools:
47+
48+
* [gzip](https://www.gnu.org/software/gzip/manual/gzip.html): Very common and fast.
49+
* [bzip2](https://manned.org/bzip2): Slower but generally creates smaller files.
50+
* [xz](https://manned.org/xz): Usually the most space-efficient of these three, but often the slowest to compress.
51+
52+
Text usually compresses well regardless of the tool. JPEG/MP4 files may not shrink much, because those formats are already compressed.
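
A rough way to see the difference between compressible and already-compressed data is sketched below. It uses random bytes from `/dev/urandom` as a stand-in for an already-compressed file such as a JPEG, which is an assumption made for the demo; the file names are hypothetical.

```shell
# Work in a temporary directory
workdir=$(mktemp -d)
cd "$workdir"

# Highly repetitive text: should shrink a lot
for i in $(seq 1 5000); do
  echo "some repetitive log line"
done > text.txt

# Random bytes stand in for already-compressed data
head -c 100000 /dev/urandom > random.bin

# Compress both, keeping the originals for comparison
gzip -k text.txt random.bin

text_size=$(wc -c < text.txt)
text_gz=$(wc -c < text.txt.gz)
random_size=$(wc -c < random.bin)
random_gz=$(wc -c < random.bin.gz)

echo "text:   $text_size -> $text_gz bytes"
echo "random: $random_size -> $random_gz bytes"
```

The random file typically comes out about the same size or slightly larger, since gzip adds a small header and finds no patterns to exploit.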
53+
54+
## CREATING ARCHIVES
2255

2356
So, you could create a "snapshot" of the current files in your _/etc/init.d_ folder like this:
2457

25-
`tar -cvf myinits.tar /etc/init.d/`
58+
`tar -cvf myinits.tar /etc/init.d/`
2659

2760
This creates _myinits.tar_ in your current directory.
2861

@@ -38,35 +71,46 @@ You could then compress this file with GnuZip like this:
3871

3972
...which will create `myinits.tar.gz`. A compressed tar archive like this is known as a "tarball". You will also sometimes see tarballs with a _.tgz_ extension - at the Linux commandline this doesn't have any meaning to the system, but is simply helpful to humans.
4073

41-
In practice you can do the two steps in one with the "-z" switch, like this:
74+
In practice you can do the two steps in one with the `-z` switch, like this:
4275

4376
`tar -cvzf myinits.tgz /etc/init.d/`
4477

4578
This uses the `-c` switch to say that we're creating an archive; `-v` to make the command "verbose"; `-z` to compress the result - and `-f` to specify the output file.
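
If you would rather practise on scratch data than on _/etc/init.d_, a sketch along these lines compares a plain archive with a compressed one. The `sample` folder and its contents are invented for the example.

```shell
# Work in a temporary directory
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample folder with a few text files
mkdir sample
for n in 1 2 3; do
  for i in $(seq 1 1000); do
    echo "config entry $i = value"
  done > "sample/file$n.conf"
done

# Archive only, then archive plus gzip compression
tar -cf sample.tar sample/
tar -czf sample.tgz sample/

# Compare the two sizes in bytes
tar_size=$(wc -c < sample.tar)
tgz_size=$(wc -c < sample.tgz)
echo "sample.tar: $tar_size bytes"
echo "sample.tgz: $tgz_size bytes"
```

The uncompressed `.tar` is usually a little larger than the files it contains, because tar adds headers and padding for each entry.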
4679

47-
## TASKS FOR TODAY
80+
## EXTRACTING ARCHIVES
81+
82+
To "explode" an archive and retrieve your files, use the `-x` (extract) flag:
4883

49-
* Check the links under "Resources" to better understand this - and to find out how to extract files from an archive!
50-
* Use `tar` to create an archive copy of some files and check the resulting size
51-
* Run the same command, but this time use `-z` to compress - and check the file size
52-
* Copy your archives to _/tmp_ (with: `cp`) and extract each there to test that it works
84+
`tar -xvf archive.tar.gz`
5385

86+
**Safety First:** Before extracting, it is good practice to preview the contents using the `-t` (list) flag:
5487

55-
## POSTING YOUR PROGRESS
88+
`tar -tf archive.tar.gz`
5689

57-
Nothing to post today - but make sure you understand this stuff, because we'll be using it for real in the next day's session!
90+
That gives you an idea of what the file structure will look like after extracting. If the list shows many files without a common top-level folder (entries like `file` rather than `folder/file`), you have found a _tarbomb_. No, not [this one](https://xkcd.com/1168/). Extracting it may spew files directly into your current directory, potentially overwriting existing files.
91+
92+
You can use the `-C` flag to extract files into a specified target directory to keep things tidy.
93+
94+
`tar -xvf archive.tar.gz -C target_dir/`
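
The preview-then-extract routine can be sketched as follows. The `project` folder is a made-up example, and counting the distinct top-level entries is just one simple heuristic for spotting a tarbomb, not an official check.

```shell
# Work in a temporary directory
workdir=$(mktemp -d)
cd "$workdir"

# Build a small, well-behaved archive to practise on
mkdir project
echo "hello" > project/readme.txt
echo "world" > project/notes.txt
tar -czf archive.tar.gz project/

# Preview: count distinct top-level entries.
# A single entry suggests everything lives under one folder.
top_level=$(tar -tf archive.tar.gz | cut -d/ -f1 | sort -u | wc -l)
echo "top-level entries: $top_level"

# -C expects the target directory to exist already
mkdir target_dir
tar -xf archive.tar.gz -C target_dir/
ls -R target_dir
```

Extracting into a fresh, empty directory like this means that even a tarbomb cannot overwrite anything you care about.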
5895

5996
## EXTENSION
6097

61-
* What is a .bz2 file - and how would you extract the files from it?
62-
* Research how absolute and relative paths are handled in tar - and why you need to be careful extracting from archives when logged in as root
63-
* You might notice that some tutorials write "tar cvf" rather than "tar -cvf" with the switch character - do you know why?
98+
You might notice that some tutorials write `tar cvf` rather than `tar -cvf` with the switch character - do you know why? [Hint](https://unix.stackexchange.com/questions/28403/tar-cvf-or-tar-cvf): It’s related to old "tape archive" styles and modern compatibility.
99+
100+
A note about **compression levels** - Most tools allow you to adjust the balance between speed and size, using a numeric scale (1–9):
101+
102+
* Level 1: Fast, but provides low compression.
103+
* Level 6: The default setting; a balance of speed and size.
104+
* Level 9: Best compression (smallest file size) but takes the longest time.
105+
106+
However, when `tar` calls a compression tool for you while creating a tarball (for example with `-z`), it uses that tool's default level, which is 6 for gzip, unless you tell it otherwise.
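
To try the levels yourself, and to choose one for a tarball, a portable approach is to have `tar` write the archive to standard output and pipe it into the compressor. The sketch below uses invented sample data and file names.

```shell
# Work in a temporary directory
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data with some variation in it
mkdir data
for i in $(seq 1 20000); do
  echo "entry $i: $((i * 7919 % 1000)) some moderately varied text"
done > data/log.txt

# -c writes to stdout, so the original file stays in place
gzip -1 -c data/log.txt > fast.gz
gzip -9 -c data/log.txt > best.gz

# tar writes the archive to stdout (-f -), gzip -9 compresses it
tar -cf - data/ | gzip -9 > data-best.tar.gz

fast_size=$(wc -c < fast.gz)
best_size=$(wc -c < best.gz)
echo "level 1: $fast_size bytes"
echo "level 9: $best_size bytes"
```

On small files the gap between levels is often modest; the time cost of level 9 tends to matter more on large inputs.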
64107

65108
## RESOURCES
66109

67110
* [18 Tar Command Examples in Linux](https://www.tecmint.com/18-tar-command-examples-in-linux/)
68111
* [Linux TAR Command](http://linuxbasiccommands.wordpress.com/2008/04/04/linux-tar-command/)
69112
* [Linux tar command tutorial](https://www.youtube.com/watch?v=CUdwDEKlDrw) (video)
113+
* [Between xz, gzip, and bzip2, which compression algorithim is the most efficient?](https://superuser.com/questions/581035/between-xz-gzip-and-bzip2-which-compression-algorithim-is-the-most-efficient)
70114

71115
Some rights reserved. Check the license terms
72116
[here](https://github.com/livialima/linuxupskillchallenge/blob/master/LICENSE)
