Compressing and Archiving

By Juliet
Published on


Compressing Files

gzip

The gzip program compresses one or more files. When executed, it replaces each original file with a compressed version of it.

Juliet@ArchJ ~$ ls -l /etc > test.txt
Juliet@ArchJ ~$ gzip test.txt
Juliet@ArchJ ~$ ls -l test.*
-rw-r--r-- Juliet Juliet 3230 2025-10-23 08:27 test.txt.gz

The corresponding command, gunzip, restores a compressed file to its original form:

Juliet@ArchJ ~$ gunzip test.txt
Juliet@ArchJ ~$ ls -l test.*
-rw-r--r-- Juliet Juliet 15738 2025-10-23 08:27 test.txt

gzip has many options. Some commonly used ones are listed here:

Option  Long option    Description
-d      --decompress   Decompress; acts like gunzip
-l      --list         List compression statistics for each compressed file
-v      --verbose      Display verbose messages while compressing
-t      --test         Check the integrity of a compressed file
-r      --recursive    If the arguments include directories, recursively compress the files inside them
-k      --keep         Keep the original files

Run gzip --help or man gzip for more information.
Examples:

gzip -tv test.txt.gz  # check the file's integrity and report the result
gzip -d test.txt.gz   # same as gunzip: decompress

To compress all the files in a directory recursively:

Juliet@ArchJ ~$ tree ./example
example/
 ├── a.txt
 ├── b.log
 └── sub/
     └── c.csv
Juliet@ArchJ ~$ gzip -rv ./example
Juliet@ArchJ ~$ tree ./example
example/
 ├── a.txt.gz
 ├── b.log.gz
 └── sub/
     └── c.csv.gz
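The options above can be combined into a quick round-trip check. The following is a minimal sketch, not a definitive recipe: the temporary directory and the file name numbers.txt are made up for illustration, and it also uses -c (--stdout), which is not in the table above and makes gzip write to standard output.

```shell
#!/bin/sh
# Sketch: compress with -k so the original survives, then verify the round trip.
# The working directory and file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Create a sample file with repetitive content so it compresses well
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i"
    i=$((i + 1))
done > numbers.txt

gzip -k numbers.txt       # writes numbers.txt.gz and keeps numbers.txt
gzip -l numbers.txt.gz    # print compressed size, uncompressed size and ratio
gzip -t numbers.txt.gz    # exit status 0 means the compressed file is intact

# Decompress to standard output (-c) and compare with the original
gzip -dc numbers.txt.gz | cmp - numbers.txt && echo "round trip OK"
```

Because -k leaves the original in place, the comparison at the end can confirm that nothing was lost before the uncompressed copy is deleted.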

Archiving Files

tar

Basic Usage

tar (short for tape archive) is the classic tool for archiving files.
The command syntax looks like this:

tar mode[options] pathname

Some commonly used modes:

Mode  Description
c     Create an archive from a list of files or directories
x     Extract an archive
r     Append the specified pathnames to the end of an archive
t     List the contents of an archive

Note that you can only choose one mode while using tar, and the mode must be specified before any other option.

Some commonly used options:

Option  Long option             Description
f       --file=example.tar      Specify the name of the archive file
v       --verbose               Display verbose messages while archiving
p       --preserve-permissions  Preserve the original files' permissions
z       --gzip                  Archive and compress with gzip
j       --bzip2                 Archive and compress with bzip2

tar cf example.tar example

This command creates an archive named example.tar from example.
Note that the mode and options do not need a leading dash.

tar tvf example.tar

This command lists the contents of the archive.

tar xf example.tar

This command extracts example.tar into the present working directory.
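The three modes can be tried together in a throwaway directory. This is a minimal sketch under assumed names (example/, restore/ and the file contents are made up), and it adds the z option from the table so the archive is gzip-compressed.

```shell
#!/bin/sh
# Sketch: create, list and extract a gzip-compressed archive.
# All directory and file names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p example/sub
echo "alpha" > example/a.txt
echo "beta"  > example/sub/b.txt

tar czf example.tar.gz example    # c = create, z = gzip, f = archive name
tar tzf example.tar.gz            # t = list the stored pathnames

mkdir restore
cd restore
tar xzf ../example.tar.gz         # x = extract into the current directory
cd ..

# The extracted tree should be identical to the original
diff -r example restore/example && echo "archive round trip OK"
```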

Pathname Process

tar handles pathnames in an interesting way: by default it stores them as relative rather than absolute. When creating an archive, it does this by simply removing the leading slash / from each pathname. For example:

Juliet@ArchJ ~$ mkdir -p ~/example1/test
Juliet@ArchJ ~$ mkdir ~/example2
Juliet@ArchJ ~$ cd ~
Juliet@ArchJ ~$ tar cf example1.tar /home/Juliet/example1
Juliet@ArchJ ~$ tree
/home/Juliet/
├── example1
│   └── test
├── example2
└── ...
Juliet@ArchJ ~$ cd ~/example2
Juliet@ArchJ ~$ tar xf /home/Juliet/example1.tar
Juliet@ArchJ ~$ tree
/home/Juliet/example2/
└── home
    └── Juliet
        └── example1
            └── test

This happens because the leading slash / was removed from the pathnames stored in example1.tar, so when the archive is extracted in ~/example2, tar recreates home/Juliet/... beneath that directory.
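If the extra home/Juliet/ levels are unwanted, one common approach is the -C (--directory) option, which makes tar change into a directory before it processes the names that follow. The sketch below is an illustration with made-up paths under a temporary directory, not the only way to do it.

```shell
#!/bin/sh
# Sketch: use -C so member names are stored relative to a chosen directory.
# The src/ and dest/ directories are hypothetical.
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/src/example1/test" "$workdir/dest"

# tar changes into $workdir/src first, so the member is stored as
# "example1/..." instead of the full path with its leading slash removed
tar cf "$workdir/example1.tar" -C "$workdir/src" example1

# Inspect the stored names before extracting
tar tf "$workdir/example1.tar"

# -C works for extraction too: unpack into dest without an explicit cd
tar xf "$workdir/example1.tar" -C "$workdir/dest"
ls "$workdir/dest"
```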

Advanced Usage

When extracting an archive, you can limit what is extracted. For example, to extract only one or more specific files:

tar xvf ./example.tar <pathname1> <pathname2> <pathname3>

Each <pathname> must be the full path exactly as it is stored in the archive; use tar tf ./example.tar to check the stored pathnames.
tar also supports the --wildcards option:

tar xvf ./example.tar --wildcards 'example1/directory*'
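Both forms of selective extraction can be compared side by side. This sketch assumes GNU tar (other implementations may not accept --wildcards), and every file and directory name in it is invented for the demonstration.

```shell
#!/bin/sh
# Sketch: extract a single member, then a group of members by pattern.
# Assumes GNU tar; all names are hypothetical.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p example/docs example/logs
echo "readme" > example/docs/readme.txt
echo "notes"  > example/docs/notes.txt
echo "log"    > example/logs/app.log
tar cf example.tar example

mkdir out1 out2

# One member, named exactly as "tar tf example.tar" prints it
tar xf example.tar -C out1 example/docs/readme.txt

# Every member whose stored name matches the shell-style pattern
tar xf example.tar -C out2 --wildcards 'example/docs/*'

# Show what ended up in each output directory
find out1 out2 -type f | sort
```

Quoting the pattern matters: it stops the shell from expanding the * itself, so tar receives the pattern unchanged.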

tar can also make use of both standard input and output:

find ./example -name 'file-A' | tar czf example.tar.gz -T -

Here -T - tells tar to read the list of files to archive from standard input.
This is equivalent to:

find ./example -name 'file-A' | tar cf - --files-from=- | gzip > example.tar.gz

Here tar cf - writes the archive to standard output instead of a file, and --files-from=- reads the list of files to archive from standard input.
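Another use of the same idea is to connect two tar processes with a pipe, so that a directory tree is copied without an intermediate archive file. A minimal sketch with made-up directories src/ and dst/:

```shell
#!/bin/sh
# Sketch: copy a directory tree through a pipe between two tar processes.
# The src/ and dst/ directories are hypothetical.
set -e
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub" "$workdir/dst"
echo "hello" > "$workdir/src/sub/file-A"

# The first tar writes the archive to standard output ("f -");
# the second reads it from standard input and unpacks it under dst
tar cf - -C "$workdir/src" . | tar xf - -C "$workdir/dst"

diff -r "$workdir/src" "$workdir/dst" && echo "copy OK"
```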

Use tar Safely

Extracting an archive without checking it first can be dangerous, because a malicious archive may overwrite or contaminate system files.
A safer approach is to list the archive contents before extracting:

tar tvf example.tar

If you don't trust the source, first create an empty directory and extract the archive there:

mkdir ~/safe_extract
cd ~/safe_extract
tar xvf /path/to/example.tar

You can also use --one-top-level to confine the extracted files to a single directory:

tar xvf /path/to/example.tar --one-top-level

This creates a new directory named after the archive (without its extension) and stores the extracted files in it.
Also be careful when running tar with root privileges, unless you know exactly what you are doing and what will happen!
You can add --keep-old-files so that existing files are never overwritten (tar reports them as errors), or --skip-old-files to silently skip files that already exist:

tar xvf /path/to/example.tar --keep-old-files
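The manual check of the listing can also be scripted. The sketch below is one possible approach rather than a complete defence: the helper names_look_safe is made up for this example, and it only rejects member names that are absolute or contain a ".." component before extracting into a separate directory.

```shell
#!/bin/sh
# Sketch: refuse to extract when member names look suspicious.
# names_look_safe is a hypothetical helper; all file names are made up.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p payload
echo "data" > payload/file.txt
tar cf sample.tar payload

# Reads member names on standard input; succeeds only if none of them
# starts with "/" or contains ".." as a path component
names_look_safe() {
    ! grep -Eq '^/|(^|/)\.\.(/|$)'
}

if tar tf sample.tar | names_look_safe; then
    mkdir -p safe_extract
    tar xf sample.tar -C safe_extract --keep-old-files
    echo "extracted into safe_extract"
else
    echo "refusing to extract: suspicious member names" >&2
fi
```

A name check like this does not cover every trick (symbolic links inside an archive, for instance), so extracting into an empty directory is still worth doing.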

Other Useful Tools

zip

The zip program is both a compression tool and an archiver.
In its most basic usage, zip is invoked like this:

zip [options] zipfile file

For example, to make a zip archive of a home directory:

cd /home
zip -r Juliet.zip ./Juliet

Use unzip to extract a zip file:

cd /example
unzip /path/to/example.zip

Before extracting, you can use the -l option (or -v for more detail) to list the contents of the zip file:

unzip -l /path/to/example.zip
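A complete zip round trip might look like the sketch below. It assumes the zip and unzip programs are installed (it skips the demonstration otherwise), uses made-up names such as project/ and restore/, and relies on unzip's -t (test) and -d (target directory) options, which are not covered above.

```shell
#!/bin/sh
# Sketch: create, list, test and extract a zip archive.
# All names are hypothetical; zip and unzip may not be installed everywhere.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p project/src
echo "sample" > project/src/main.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    zip -rq project.zip project        # -r recurse into directories, -q quiet
    unzip -l project.zip               # list the contents
    unzip -tq project.zip              # test the archive's integrity
    unzip -q project.zip -d restore    # extract into ./restore
    diff -r project restore/project && echo "zip round trip OK"
else
    echo "zip or unzip is not installed; skipping"
fi
```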

7zip



Article License

Compressing and Archiving

Author Juliet Published License: CC BY-SA 4.0