Backing up FreeBSD Part 3
by David Lay <[email protected]>
Introduction
This is the third article in the series on data backup under FreeBSD.
Previous articles covered sizing up
and device setup. This installment
investigates basic tape drive operation.
Talking to tape devices
mt(1), the magnetic tape manipulating program, is perhaps
the most basic command line utility for performing simple tape drive
operations. While most people will probably end up using backup software
which provides a higher level interface (such as
Amanda), there may come a day when
catastrophe strikes, taking the filesystem containing your high level
backup software with it. Under these circumstances, resorting to the low
level tools like mt(1) on a fixit floppy may be the only way
to recover the data stored in your backup archives. So, a basic working
knowledge of mt(1) might come in handy in a crisis.
The syntax of the mt(1) command in a nutshell is:
mt command [arguments]
There are several possible commands documented in the man page for
mt(1), but I'm only going to cover some of the more commonly
used ones, based on my limited experience with DDS tape drives.
If you examine the file permission modes on the device
special files for your tape drive, you'll probably notice that the
default permissions are set so that only the user root
and group operator have read/write permission. To execute
these commands on your system, you may have to add yourself to the
operator group, or become root.
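A quick way to find out whether your current account can use the drive is to test the device special file directly. The following is only a minimal sketch in Bourne shell syntax: can_use_tape is a hypothetical helper name, and the device path is the default used throughout this article.

```shell
# Hypothetical helper: succeed if the current user may both read and
# write the device special file given as the first argument.
can_use_tape() {
    [ -r "$1" ] && [ -w "$1" ]
}

# Report on the default device used in this article.
if can_use_tape /dev/nrsa0; then
    echo "tape device is accessible"
else
    echo "no access: join the operator group or become root"
fi
```

The same test fails if the device special file doesn't exist at all, so a "no access" result doesn't necessarily mean that permissions are the problem.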
Default Tape Device
Before we get too far with mt(1) we need to make sure
that mt(1) will be operating with the correct tape device,
especially if there are several different tape devices on your system.
mt(1) will operate on /dev/nrsa0 by default.
If you don't have a SCSI tape drive, or you wish to work with a different
device, you can override this default in either of two ways: by setting the
TAPE environment variable, or passing the -f
argument to mt(1) on the command line. In either case, you
need to supply the path to the device special file for the desired
tape drive. The default is suitable in my case, so I won't go into the
details of this (see the mt(1) man page for help).
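For what it's worth, here is a rough sketch of the two override methods in Bourne shell syntax. The second drive's path (/dev/nrsa1) and the tape_device helper are hypothetical, invented purely for illustration; the helper assumes mt(1) uses TAPE whenever it is set and falls back to the default otherwise.

```shell
# Hypothetical helper: print the device mt(1) is expected to use when no
# -f argument is given: $TAPE if it is set, otherwise the default above.
tape_device() {
    printf '%s\n' "${TAPE-/dev/nrsa0}"
}

# Two ways to point mt(1) at a second drive (hypothetical path):
#   TAPE=/dev/nrsa1 mt status
#   mt -f /dev/nrsa1 status
```

Setting TAPE once in your shell startup file saves typing if you always work with the same non-default drive, while -f is handier for one-off commands.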
Device status
As the name suggests, the mt(1) status
command reports the tape drive status. The status command will fail with
a rather cryptic message if there is no tape loaded (under FreeBSD
3.4-RELEASE it reports "Device not configured", an error which usually only
occurs when you've failed to add the device to your kernel), so make sure
you've got a tape loaded before you try this at home.
dave@disco:~ % mt status
Mode Density Blocksize bpi Compression
Current: 0x13:X3B5/88-185A variable 61000 disabled
---------available modes---------
0: 0x13:X3B5/88-185A variable 61000 DCLZ
1: 0x13:X3B5/88-185A variable 61000 DCLZ
2: 0x13:X3B5/88-185A variable 61000 DCLZ
3: 0x13:X3B5/88-185A variable 61000 DCLZ
---------------------------------
Current Driver State: at rest.
---------------------------------
File Number: 0 Record Number: 0
dave@disco:~ %
My tape drive is a Sony SDT-9000 SCSI DDS-3 drive, but I'm only
using DDS-1 media. I've also got hardware compression jumpered off on
my drive (for reasons which will become apparent when I get around to
discussing Amanda in a future article).
So, the status reported by mt(1) above makes sense, given
my configuration. The full list of the different media density and
compression codes is available in the mt(1) man page.
Rewinding and ejecting tapes
There are two commands which control tape rewinding: rewind
and offline. The rewind command rewinds
the tape back to the beginning, and the offline command
rewinds the tape and then ejects it. Neither of these two mt(1)
commands produces any output, and since rewinding and/or ejecting can take
several seconds, mt(1) does not exit and return you to the
shell prompt until the tape drive has completed the operation. So, don't
be alarmed if these commands appear to hang; just be patient.
Reading and writing files
Before covering the mt(1) file navigation commands, we need
to write some dummy files to tape so that we have a reference point. One
utility which can be used to read and write files to/from tape is
dd(1). dd(1) is a generic low level file I/O
utility, and isn't inherently connected to tape drives. You may have
come across dd(1) before if you've ever had to make your
own FreeBSD boot/install floppies from the raw disk image files provided
in the FreeBSD software distribution archive.
First, we need to create some dummy files on the local
filesystem:
dave@disco:~ % echo 'This is the first dummy file' > file1
dave@disco:~ % cp /kernel file2
dave@disco:~ % dd if=/dev/urandom of=file3 bs=1024 count=2
2+0 records in
2+0 records out
2048 bytes transferred in 0.004905 secs (417524 bytes/sec)
dave@disco:~ %
The means of creating the first two dummy files should be familiar to
most Unix users, but the use of dd(1) to create the third
may not be. Briefly explaining the dd(1) command line
arguments from left to right: we use /dev/urandom as the
input file, file3 as the output file, a block size of 1024
bytes for read/write operations, and a block count of 2. This is directing
dd(1) to read 2048 bytes from /dev/urandom and
write those 2048 bytes to file3. dd(1) will
create file3 in the current directory, and it has no qualms
about overwriting existing files, so be careful with output filenames.
A full explanation of the arguments accepted by dd(1) is
available in the man page.
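If you'd like to convince yourself of the bs and count arithmetic without risking an existing file, something along these lines should do. It's only a sketch in Bourne shell syntax; the scratch directory and the variable names are my own choices, not anything dd(1) requires.

```shell
# Work in a scratch directory so that no existing file can be overwritten.
workdir=$(mktemp -d)

# Read 2 blocks of 1024 bytes each from /dev/urandom into file3.
dd if=/dev/urandom of="$workdir/file3" bs=1024 count=2

# The file size should equal bs * count = 1024 * 2 bytes.
size=$(( $(wc -c < "$workdir/file3") ))
echo "file3 is $size bytes"

# Clean up the scratch directory.
rm -r "$workdir"
```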
Reviewing our freshly created files:
dave@disco:~ % ls -l
total 9460
-rw-r--r-- 1 dave disco 29 May 13 17:42 file1
-r-xr-xr-x 1 dave disco 1348399 May 13 17:42 file2
-rw-r--r-- 1 dave disco 2048 May 13 17:42 file3
dave@disco:~ %
Now we want to write these three files to tape. Note that I instruct
dd(1) to access the tape drive via the non-rewinding
device special file interface (see the previous article in this series for more information):
dave@disco:~ % dd if=file1 of=/dev/nrsa0
0+1 records in
0+1 records out
29 bytes transferred in 0.208732 secs (139 bytes/sec)
dave@disco:~ % dd if=file2 of=/dev/nrsa0
2633+1 records in
2633+1 records out
1348399 bytes transferred in 6.482802 secs (207996 bytes/sec)
dave@disco:~ % dd if=file3 of=/dev/nrsa0
4+0 records in
4+0 records out
2048 bytes transferred in 0.011236 secs (182270 bytes/sec)
dave@disco:~ %
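The "records in" and "records out" figures in the transcript above follow from the block size: since no bs argument was given, dd(1) used its default of 512 bytes, and it reports the number of full blocks plus the number of partial blocks. The arithmetic can be sketched as follows (dd_records is a hypothetical helper, written in Bourne shell syntax purely for illustration):

```shell
# Hypothetical helper: print "FULL+PARTIAL" as dd(1) reports it for SIZE
# bytes copied from a regular file in blocks of BS bytes (default 512).
dd_records() {
    size=$1
    bs=${2:-512}
    full=$((size / bs))
    if [ $((size % bs)) -gt 0 ]; then
        partial=1
    else
        partial=0
    fi
    echo "$full+$partial"
}

dd_records 29         # file1 → 0+1
dd_records 1348399    # file2 → 2633+1
dd_records 2048       # file3 → 4+0
```

This also explains why file3 shows up as 4+0 here but as 2+0 when it was created with bs=1024.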
Now that we've written all three files to tape, a status check shows
the tape is currently positioned at file number 3:
dave@disco:~ % mt status
Mode Density Blocksize bpi Compression
Current: 0x13:X3B5/88-185A variable 61000 disabled
---------available modes---------
0: 0x13:X3B5/88-185A variable 61000 DCLZ
1: 0x13:X3B5/88-185A variable 61000 DCLZ
2: 0x13:X3B5/88-185A variable 61000 DCLZ
3: 0x13:X3B5/88-185A variable 61000 DCLZ
---------------------------------
Current Driver State: at rest.
---------------------------------
File Number: 3 Record Number: 0
dave@disco:~ %
File navigation
The purpose of creating and writing these files to tape was to get to
a point where we can investigate the file navigation commands of
mt(1).
One significant difference between disk and tape storage media is that
disks usually contain filesystems, while tapes do not. This stems from
the difference in media access methods supported. Disks are random access
devices. You can read or write data to any block on a disk in any
particular order at any time. Disks have a mechanism to support such
"random" access. Tapes are sequential access devices. This means that
you can only access those blocks on tape in sequential order. If you
write 1000 blocks of data to tape, and you later need to read back block
number 1000, you must read the first 999 blocks before block number 1000
is positioned under the tape drive's read/write head. This sequential access
characteristic makes tape based filesystems impractical. Filesystems
produce patterns of random access when updating file data and metadata.
Where disks can seek between arbitrary blocks in a matter of milliseconds,
tapes can take several minutes to move from the beginning of the tape to
the end.
So, although you can store several files on a tape, there is no hierarchical
file and directory structure like the one you take for granted when
working with disk based filesystems. Your files are written to the
tape, and the tape drive writes end of file markers between successive
files (so it can locate the start of each file later), but there is no
directory listing as such. When working with multiple files on a single
tape you will need to keep track of which files have been written and in
which order. To retrieve the third file on a tape, you need to start
from the beginning of the tape and skip past the contents of the first
two files before you can start reading the contents of the third file.
If you're old enough to remember buying pre-recorded music on audio
cassette, this process is probably familiar.
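One way to keep track of what has been written, and in which order, is a small manifest file kept on disk alongside your other records. The sketch below is in Bourne shell syntax; the helper names and the manifest format (one "number name" pair per line, numbered from zero) are my own invention, and it assumes filenames without whitespace.

```shell
# Hypothetical helper: append NAME to MANIFEST, numbering entries from 0
# in the order they were written to tape.
manifest_add() {
    manifest=$1
    name=$2
    touch "$manifest"
    number=$(( $(wc -l < "$manifest") ))
    echo "$number $name" >> "$manifest"
}

# Hypothetical helper: print the tape file number recorded for NAME,
# or fail if NAME is not listed in MANIFEST.
manifest_lookup() {
    awk -v name="$2" '
        $2 == name { print $1; found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

You would call manifest_add immediately after each successful write to the tape, so that the manifest and the tape stay in step. The number printed by manifest_lookup is then the file number at which that file begins on the tape.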
Moving on to the next example, let's say we need to retrieve file2
from the tape we've just written. The status command executed
after the files were written showed us that the tape was positioned at
file number 3, but remember that the file numbering scheme begins with
zero (like so many computing numbering schemes). Since file2
was the second file written to tape (i.e. file number 1), we need to
rewind the tape to the beginning and skip past the first file (i.e. file
number 0). Here's how to do it:
dave@disco:~ % mt rewind
dave@disco:~ % mt status
[...]
---------------------------------
File Number: 0 Record Number: 0
dave@disco:~ % mt fsf 1
dave@disco:~ % mt status
[...]
---------------------------------
File Number: 1 Record Number: 0
dave@disco:~ %
The output of the status commands has been edited to show only the
information relevant to file positioning.
The new mt(1) command introduced here is the fsf
command. The fsf (forward space files) command accepts
a single numeric argument, which is the number of files to skip forward over.
Remember that our goal was to position the tape such that the second file
is ready for reading. If we wanted to read the third file, we would have
needed to move forward 2 files (i.e. mt fsf 2).
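The rewind-then-skip sequence shown above is easy to wrap up in a small shell function. This is just a sketch in Bourne shell syntax; tape_seek is a hypothetical name, and it relies on mt(1) picking up the tape device in the usual way.

```shell
# Hypothetical helper: position the tape at the start of file number N
# (numbered from 0), using only the rewind and fsf commands.
tape_seek() {
    mt rewind || return 1
    # File number 0 starts at the beginning of the tape, so there is
    # nothing to skip in that case.
    if [ "$1" -gt 0 ]; then
        mt fsf "$1"
    fi
}
```

With this defined, tape_seek 1 performs the same two steps as the transcript above, and tape_seek 2 would position the tape at the third file.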
Now that the tape has been positioned, we can read the file using dd(1)
and compare its MD5 signature with that of file2
to make sure we've got what we expected:
dave@disco:~ % dd if=/dev/nrsa0 of=output1
2633+1 records in
2633+1 records out
1348399 bytes transferred in 7.265276 secs (185595 bytes/sec)
dave@disco:~ % md5 output1
MD5 (output1) = 216a27734f17a775d62182f3b84a5b14
dave@disco:~ % md5 file2
MD5 (file2) = 216a27734f17a775d62182f3b84a5b14
dave@disco:~ %
The MD5 signatures match, so the file read from tape is actually the same
as file2, which is all according to plan.
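If all you want is a yes or no answer in a script, an alternative to comparing MD5 signatures by eye is cmp(1), which compares two files byte by byte rather than computing checksums. A minimal sketch in Bourne shell syntax (same_contents is a hypothetical helper name):

```shell
# Hypothetical helper: succeed if the two named files have identical
# contents. cmp -s compares silently and reports only via exit status.
same_contents() {
    cmp -s "$1" "$2"
}
```

For the example above, same_contents output1 file2 should succeed whenever the two MD5 signatures match.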
mt(1) has a complementary command for moving the tape
transport backwards a number of files. This command is bsf
and it works in the same manner as the fsf command. Had
I mentioned this a moment ago, we could have simply moved backwards
from file number 3 to locate the copy of file2 on the tape, rather than
rewinding and skipping forward, but that would have been far too easy.
Sometimes the journey is more important than the destination.
There are many other mt(1) commands available, and they're
all documented in the mt(1) manual page, so feel free to
experiment.
A brief word about block sizes
Something else that you might want to experiment with when playing with
dd(1) and your tape drive is to try reading and writing
large files using various block sizes. Different read/write block sizes
can produce vastly different read/write throughput on different tape
drives and media densities. If you have a significant amount of data
that you need to push through your tape drive, you should try to find
the block size "sweet spot" which maximises throughput. Optimising
your throughput can make a dramatic difference to the elapsed time
of your backup runs, especially when you've got gigabytes of data
to dump.
Here's a brief example that should get you started. largefile
is a 10 MB file which is written to tape several times, each time varying
the bs argument to dd(1):
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=512
20480+0 records in
20480+0 records out
10485760 bytes transferred in 62.828581 secs (166895 bytes/sec)
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=1024
10240+0 records in
10240+0 records out
10485760 bytes transferred in 31.019588 secs (338037 bytes/sec)
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=2048
5120+0 records in
5120+0 records out
10485760 bytes transferred in 26.172557 secs (400639 bytes/sec)
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=4096
2560+0 records in
2560+0 records out
10485760 bytes transferred in 25.033055 secs (418877 bytes/sec)
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=8192
1280+0 records in
1280+0 records out
10485760 bytes transferred in 25.319326 secs (414141 bytes/sec)
dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=16384
640+0 records in
640+0 records out
10485760 bytes transferred in 25.379657 secs (413156 bytes/sec)
dave@disco:~ %
On my particular tape drive, the optimal block size seems to be 4096 bytes
(throughput of about 409 KB/sec), although 8192 and 16384 bytes perform
almost identically. The manufacturer's specifications quote a
maximum throughput of around 1 MB/sec, but that maximum rate is probably
only achieved on higher density DDS-3 media with the drive's hardware
compression feature enabled (I'm using DDS-1 media with compression
disabled). With the right media and tweaking, you may be able to approach
your drive's rated maximum throughput.
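Rather than typing each dd(1) command by hand, the test runs above could be scripted along the following lines. This is only a sketch in Bourne shell syntax: bs_sweep is a hypothetical helper, and the list of block sizes simply mirrors the ones I tried.

```shell
# Hypothetical helper: write FILE to DEVICE once for each block size,
# printing the block size before dd(1) reports its own statistics.
bs_sweep() {
    file=$1
    device=$2
    for bs in 512 1024 2048 4096 8192 16384; do
        echo "bs=$bs"
        dd if="$file" of="$device" bs="$bs" || return 1
    done
}
```

Running bs_sweep largefile /dev/nrsa0 writes six copies of the file to tape, one after another, starting at the current tape position, so use a scratch tape for this kind of experiment.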
Conclusion
That wraps up this installment. Since we touched on the lack of filesystem
and directory structure information with tape media in this article, the
way is paved to discuss the tape archive utility tar(1)
in the next installment. See you then.