The FreeBSD 'zine : Backing up FreeBSD Part 3

May 2000 : System Administration

Backing up FreeBSD Part 3
by David Lay <[email protected]>

Introduction

This is the third article in the series on data backup under FreeBSD. Previous articles covered sizing up and device setup. This installment investigates basic tape drive operation.

Talking to tape devices

mt(1), the magnetic tape manipulating program, is perhaps the most basic command line utility for performing simple tape drive operations. While most people will probably end up using backup software which provides a higher level interface (such as Amanda), there may come a day when catastrophe strikes, taking the filesystem containing your high level backup software with it. Under these circumstances, resorting to the low level tools like mt(1) on a fixit floppy may be the only way to recover the data stored in your backup archives. So, a basic working knowledge of mt(1) might come in handy in a crisis.

The syntax of the mt(1) command in a nutshell is:

mt command [arguments]

There are several possible commands documented in the man page for mt(1), but I'm only going to cover some of the more commonly used ones, based on my limited experience with DDS tape drives.

If you examine the file permission modes on the device special files for your tape drive, you'll probably notice that the default permissions are set so that only the user root and group operator have read/write permission. To execute these commands on your system, you may have to add yourself to the operator group, or become root.
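
As a rough sketch of that permission check, the shell function below reports whether the current user can read and write a given device node. The name can_use_tape is a made-up helper, not a FreeBSD utility, and it only tests the permission bits described above.

```shell
# can_use_tape is a hypothetical helper, not part of FreeBSD.
# It succeeds only if the current user can both read and write the device node.
can_use_tape() {
    dev="$1"
    if [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "ok: $dev is readable and writable"
    else
        echo "no access: $dev (try the operator group or root)"
        return 1
    fi
}
```

You would call it as can_use_tape /dev/nrsa0. Running ls -l on the device file and id on yourself shows the owner, group and modes behind the answer.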

Default Tape Device

Before we get too far with mt(1) we need to make sure that mt(1) will be operating with the correct tape device, especially if there are several different tape devices on your system. mt(1) will operate on /dev/nrsa0 by default. If you don't have a SCSI tape drive, or you wish to work with a different device you can override this default in either of two ways: by setting the TAPE environment variable, or passing the -f argument to mt(1) on the command line. In either case, you need to supply the path to the device special file for the desired tape drive. The default is suitable in my case, so I won't go into the details of this (see the mt(1) man page for help).
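
For readers who do need to override the default, here is a minimal sketch of the two methods. The device path /dev/nrsa1 is only an assumed example of a second drive, and tape_dev is a made-up helper that mirrors the fallback described above for use in your own scripts.

```shell
# Two ways to point mt(1) at another drive (the device path is an example only):
#   TAPE=/dev/nrsa1 mt status     # via the TAPE environment variable
#   mt -f /dev/nrsa1 status       # via the -f option
#
# tape_dev is a hypothetical helper: print $TAPE when it is set,
# otherwise fall back to the default /dev/nrsa0.
tape_dev() {
    echo "${TAPE:-/dev/nrsa0}"
}
```

A script can then write dd if=file1 of="$(tape_dev)" and follow whatever device the user has selected.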

Device status

As the name suggests, the mt(1) status command reports the tape drive status. The status command will fail with a rather cryptic message if there is no tape loaded (under FreeBSD 3.4-RELEASE it reports "Device not configured", an error which otherwise usually means you've failed to add the device to your kernel), so make sure you've got a tape loaded before you try this at home.

  dave@disco:~ % mt status
  Mode      Density              Blocksize      bpi      Compression
  Current:  0x13:X3B5/88-185A    variable       61000    disabled
  ---------available modes---------
  0:        0x13:X3B5/88-185A    variable       61000    DCLZ
  1:        0x13:X3B5/88-185A    variable       61000    DCLZ
  2:        0x13:X3B5/88-185A    variable       61000    DCLZ
  3:        0x13:X3B5/88-185A    variable       61000    DCLZ
  ---------------------------------
  Current Driver State: at rest.
  ---------------------------------
  File Number: 0	Record Number: 0
  dave@disco:~ %

My tape drive is a Sony SDT-9000 SCSI DDS-3 drive, but I'm only using DDS-1 media. I've also got hardware compression jumpered off on my drive (for reasons which will become apparent when I get around to discussing Amanda in a future article). So, the status reported by mt(1) above makes sense, given my configuration. The full list of the different media density and compression codes is available in the mt(1) man page.

Rewinding and ejecting tapes

There are two commands which control tape rewinding: rewind and offline. The rewind command rewinds the tape back to the beginning, and the offline command rewinds the tape and then ejects it. Neither of these two mt(1) commands produces any output, and since rewinding and/or ejecting can take several seconds, mt(1) does not exit and return you to the shell prompt until the tape drive has completed the operation. So, don't be alarmed if these commands appear to hang; just be patient.

Reading and writing files

Before covering the mt(1) file navigation commands, we need to write some dummy files to tape so that we have a reference point. One utility which can be used to read and write files to/from tape is dd(1). dd(1) is a generic low level file I/O utility, and isn't inherently connected to tape drives. You may have come across dd(1) before if you've ever had to make your own FreeBSD boot/install floppies from the raw disk image files provided in the FreeBSD software distribution archive.

First, we need to create some dummy files on the local filesystem:

  dave@disco:~ % echo 'This is the first dummy file' > file1
  dave@disco:~ % cp /kernel file2
  dave@disco:~ % dd if=/dev/urandom of=file3 bs=1024 count=2
  2+0 records in
  2+0 records out
  2048 bytes transferred in 0.004905 secs (417524 bytes/sec)
  dave@disco:~ %

The means of creating the first two dummy files should be familiar to most Unix users, but the use of dd(1) to create the third may not be. Briefly explaining the dd(1) command line arguments from left to right: we use /dev/urandom as the input file, file3 as the output file, a block size of 1024 bytes for read/write operations, and a block count of 2. This is directing dd(1) to read 2048 bytes from /dev/urandom and write those 2048 bytes to file3. dd(1) will create file3 in the current directory, and it has no qualms about overwriting existing files, so be careful with output filenames. A full explanation of the arguments accepted by dd(1) is available in the man page.
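
The relationship between bs, count and the number of bytes written can be checked with a small sketch like the one below. It reads from /dev/zero rather than /dev/urandom and writes to a scratch file, so it is safe to run anywhere. The function name dd_demo_size and the scratch file name are made up for illustration.

```shell
# dd_demo_size is a hypothetical helper: write COUNT blocks of BS bytes
# from /dev/zero to a scratch file and print the resulting size in bytes.
dd_demo_size() {
    bs="$1"; count="$2"
    scratch="${TMPDIR:-/tmp}/dd-demo.$$"
    # dd reports its record counts on stderr; discard them here.
    dd if=/dev/zero of="$scratch" bs="$bs" count="$count" 2>/dev/null
    wc -c < "$scratch" | tr -d ' '
    rm -f "$scratch"
}
```

Calling dd_demo_size 1024 2 prints 2048, which is simply the block size multiplied by the block count.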

Reviewing our freshly created files:

  dave@disco:~ % ls -l
  total 9460
  -rw-r--r--  1 dave  disco       29 May 13 17:42 file1
  -r-xr-xr-x  1 dave  disco  1348399 May 13 17:42 file2
  -rw-r--r--  1 dave  disco     2048 May 13 17:42 file3
  dave@disco:~ %

Now we want to write these three files to tape. Note that I instruct dd(1) to access the tape drive via the non-rewinding device special file interface (see the previous article in this series for more information):

  dave@disco:~ % dd if=file1 of=/dev/nrsa0
  0+1 records in
  0+1 records out
  29 bytes transferred in 0.208732 secs (139 bytes/sec)
  dave@disco:~ % dd if=file2 of=/dev/nrsa0
  2633+1 records in
  2633+1 records out
  1348399 bytes transferred in 6.482802 secs (207996 bytes/sec)
  dave@disco:~ % dd if=file3 of=/dev/nrsa0
  4+0 records in
  4+0 records out
  2048 bytes transferred in 0.011236 secs (182270 bytes/sec)
  dave@disco:~ %

Now that we've written all three files to tape, a status check shows the tape is currently positioned at file number 3:

  dave@disco:~ % mt status
  Mode      Density              Blocksize      bpi      Compression
  Current:  0x13:X3B5/88-185A    variable       61000    disabled
  ---------available modes---------
  0:        0x13:X3B5/88-185A    variable       61000    DCLZ
  1:        0x13:X3B5/88-185A    variable       61000    DCLZ
  2:        0x13:X3B5/88-185A    variable       61000    DCLZ
  3:        0x13:X3B5/88-185A    variable       61000    DCLZ
  ---------------------------------
  Current Driver State: at rest.
  ---------------------------------
  File Number: 3	Record Number: 0
  dave@disco:~ %
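
If you want to pick the file number out of that output in a script, something along these lines should do. It is only a sketch: tape_file_number is a made-up name, and it assumes the "File Number:" line looks exactly like the FreeBSD 3.4 output shown above, which other releases may not share.

```shell
# tape_file_number is a hypothetical helper. It reads "mt status" style
# output on stdin and prints the number that follows "File Number:".
tape_file_number() {
    sed -n 's/.*File Number: \([0-9][0-9]*\).*/\1/p'
}
```

On a live system it would be used as mt status | tape_file_number.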

File navigation

The purpose of creating and writing these files to tape was to get to a point where we can investigate the file navigation commands of mt(1).

One significant difference between disk and tape storage media is that disks usually contain filesystems, while tapes do not. This stems from the difference in media access methods supported. Disks are random access devices. You can read or write data to any block on a disk in any particular order at any time. Disks have a mechanism to support such "random" access. Tapes are sequential access devices. This means that you can only access those blocks on tape in sequential order. If you write 1000 blocks of data to tape, and you later need to read back block number 1000, you must pass over the first 999 blocks before block number 1000 is positioned under the drive's read/write head. This sequential access characteristic makes tape based filesystems impractical. Filesystems produce patterns of random access when updating file data and metadata. Where disks can seek between arbitrary blocks in a matter of milliseconds, tapes can take several minutes to move from the beginning of the tape to the end.

So, although you can store several files on a tape, there is no hierarchical file and directory structure, like that you can take for granted when working with disk based filesystems. Your files are written to the tape, and the tape drive writes end of file markers between successive files (so it can locate the start of each file later), but there is no directory listing as such. When working with multiple files on a single tape you will need to keep track of which files have been written and in which order. To retrieve the third file on a tape, you need to start from the beginning of the tape and skip past the contents of the first two files before you can start reading the contents of the third file. If you're old enough to remember buying pre-recorded music on audio cassette, this process is probably familiar.
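
One simple way to keep track is a plain text manifest on disk, with one file name per line in the order the files were written. The sketch below looks a name up in such a manifest and prints its tape file number. Both the manifest format and the name tape_index are assumptions made for illustration, not something the tape drive or mt(1) provides.

```shell
# tape_index is a hypothetical helper. MANIFEST lists one file name per line,
# in the order the files were written; the first line is tape file number 0.
# It prints the file number, or fails if the name is not listed.
tape_index() {
    manifest="$1"; name="$2"
    awk -v want="$name" '
        $0 == want { print NR - 1; found = 1; exit }
        END        { if (!found) exit 1 }
    ' "$manifest"
}
```

With a manifest containing file1, file2 and file3 on successive lines, tape_index manifest file2 prints 1.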

Getting to the next example, let's say we need to retrieve file2 from the tape we've just written. The status command executed after the files were written showed us that the tape was positioned at file number 3, but remember that the file numbering scheme begins with zero (like so many computing numbering schemes). Since file2 was the second file written to tape (ie. file number 1), we need to rewind the tape to the beginning and skip past the first file (ie. file number 0). Here's how to do it:

  dave@disco:~ % mt rewind
  dave@disco:~ % mt status
  [...]
  ---------------------------------
  File Number: 0	Record Number: 0
  dave@disco:~ % mt fsf 1
  dave@disco:~ % mt status
  [...]
  ---------------------------------
  File Number: 1	Record Number: 0
  dave@disco:~ %

The output of the status commands has been edited to show only the information relevant to file positioning.

The new mt(1) command introduced here is the fsf command. The fsf (forward space files) command accepts a single numeric argument, which is the number of files to forward through. Remember that our goal was to position the tape such that the second file is ready for reading. If we wanted to read the third file, we would have needed to move forward 2 files (ie. mt fsf 2).
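
The rewind-then-skip sequence can be wrapped up in a small function. This is a sketch only: seek_tape_file is a made-up name, and it relies on mt(1) picking up the right device from its default or from the TAPE environment variable.

```shell
# seek_tape_file is a hypothetical helper: rewind, then skip forward N files,
# leaving the tape at the start of file number N (numbering starts at 0).
seek_tape_file() {
    n="$1"
    mt rewind || return 1
    # After a rewind the tape is already at file 0, so only skip when N > 0.
    if [ "$n" -gt 0 ]; then
        mt fsf "$n" || return 1
    fi
}
```

So seek_tape_file 1 positions the tape at the second file, and seek_tape_file 2 at the third.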

Now that the tape has been positioned, we can read the file using dd(1) and compare its MD5 signature with that of file2 to make sure we've got what we expected:

  dave@disco:~ % dd if=/dev/nrsa0 of=output1
  2633+1 records in
  2633+1 records out
  1348399 bytes transferred in 7.265276 secs (185595 bytes/sec)
  dave@disco:~ % md5 output1
  MD5 (output1) = 216a27734f17a775d62182f3b84a5b14
  dave@disco:~ % md5 file2
  MD5 (file2) = 216a27734f17a775d62182f3b84a5b14
  dave@disco:~ %

The MD5 signatures match, so the file read from tape is actually the same as file2, which is all according to plan.
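
If you would rather have the shell do the comparison than compare signatures by eye, a byte-for-byte check with cmp(1) is one alternative. Note that this swaps in a different tool from the md5(1) used above, and same_contents is a made-up helper name.

```shell
# same_contents is a hypothetical helper built on cmp(1), a byte-for-byte
# comparison, used here in place of comparing two md5(1) signatures by eye.
same_contents() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

For the example above, the call would be same_contents output1 file2.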

mt(1) has a complementary command for moving the tape transport backwards a number of files. This command is bsf and it works in the same manner as the fsf command. Had I mentioned this a moment ago, we could have simply moved backwards one file to locate the copy of file2 on the tape, rather than rewinding and skipping forward, but that would have been far too easy. Sometimes the journey is more important than the destination.

There are many other mt(1) commands available, and they're all documented in the mt(1) manual page, so feel free to experiment.

A brief word about block sizes

Something else that you might want to experiment with when playing with dd(1) and your tape drive is to try reading and writing large files using various block sizes. Different read/write block sizes can produce vastly different read/write throughput on different tape drives and media densities. If you have a significant amount of data that you need to push through your tape drive, you should try to find the block size "sweet spot" which maximises throughput. Optimising your throughput can make a dramatic difference to the elapsed time of your backup runs, especially when you've got gigabytes of data to dump.

Here's a brief example that should get you started. largefile is a 10 MB file which is written to tape several times, each time varying the bs argument to dd(1):

  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=512
  20480+0 records in
  20480+0 records out
  10485760 bytes transferred in 62.828581 secs (166895 bytes/sec)
  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=1024
  10240+0 records in
  10240+0 records out
  10485760 bytes transferred in 31.019588 secs (338037 bytes/sec)
  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=2048
  5120+0 records in
  5120+0 records out
  10485760 bytes transferred in 26.172557 secs (400639 bytes/sec)
  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=4096
  2560+0 records in
  2560+0 records out
  10485760 bytes transferred in 25.033055 secs (418877 bytes/sec)
  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=8192
  1280+0 records in
  1280+0 records out
  10485760 bytes transferred in 25.319326 secs (414141 bytes/sec)
  dave@disco:~ % dd if=largefile of=/dev/nrsa0 bs=16384
  640+0 records in
  640+0 records out
  10485760 bytes transferred in 25.379657 secs (413156 bytes/sec)
  dave@disco:~ %
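
The same series of runs can be scripted rather than typed by hand. The function below is a sketch, with try_block_sizes being a made-up name: it writes the source file to the destination once per block size and prints the last line of dd(1)'s report for each run. Bear in mind that with a non-rewinding tape device each run adds another file to the tape.

```shell
# try_block_sizes is a hypothetical helper. It writes SRC to DST once per
# block size and prints the final line of dd's report for each run.
try_block_sizes() {
    src="$1"; dst="$2"
    for bs in 512 1024 2048 4096 8192 16384; do
        printf 'bs=%s: ' "$bs"
        # dd writes its statistics to stderr, so fold that into the pipe.
        dd if="$src" of="$dst" bs="$bs" 2>&1 | tail -n 1
    done
}
```

To repeat the test above you would run try_block_sizes largefile /dev/nrsa0.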

On my particular tape drive, the optimal block size seems to be 4096 bytes (throughput of 409 KB/sec). The manufacturer's specifications quote a maximum throughput of around 1 MB/sec, but that maximum rate is probably only achieved on higher density DDS-3 media with the drive's hardware compression feature enabled (I'm using DDS-1 media with compression disabled). You can probably expect to achieve your drive's rated maximum throughput with the right tweaking.
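
The KB/sec figure is just the byte count divided by the elapsed time and then by 1024. As a small sketch of that arithmetic (kb_per_sec is a made-up helper name, and the result is truncated to a whole number):

```shell
# kb_per_sec is a hypothetical helper: bytes / seconds / 1024, truncated
# to a whole number of kilobytes per second.
kb_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%d\n", b / s / 1024 }'
}
```

Feeding it the numbers from the 4096 byte run, kb_per_sec 10485760 25.033055 prints 409.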

Conclusion

That wraps up this installment. Since we touched on the lack of filesystem and directory structure information with tape media in this article, the way is paved to discuss the tape archive utility tar(1) in the next installment. See you then.
