Snapshots as a Backup for Research Computing NAS Filesystems

Table of Contents

What is a Snapshot?

What about Disaster Recovery?

What is the purpose of snapshot?

How does this all really work?

Snapshot schedules and retention

Additional Helpful Notes

Additional Help

What is a Snapshot?

A snapshot is a read-only, point-in-time set of pointers to the filesystem blocks in a volume, and it can be used as a backup mechanism. On a volume that uses snapshot technology, whenever data changes (through deletion, renaming, or updating), the changes are written to new blocks. The old blocks are kept read-only because the snapshot still points to them, which allows files or whole volumes to be restored to a particular point in time. Snapshots therefore consume disk space on the same volume as the data. The greater the rate of change on a filesystem, the more disk space the snapshots consume.

What about Disaster Recovery?

A snapshot by itself is only a backup method. It does not provide disaster recovery the way a traditional tape backup system does when tapes are moved off-site to a safe, secure location.

However, snapshots can also be used to move data to a safe, secure location. Snapshot copies are the building block for disk mirroring, which is the disaster recovery method we pair with snapshots. Mirroring copies snapshots from one filesystem volume to another filesystem volume at a remote location. Our remote filesystem volume is currently hosted at ITS Franklin.

What is the purpose of snapshot?

Traditional tape backup becomes increasingly cumbersome for backups and can be a complete nightmare for restores. Tape backup is becoming more difficult because of the amount of data being generated (or changed) and the time it takes to back up that data.

Snapshots operate on blocks (not files), so they are more conservative in space consumption. Snapshots are almost instantaneous, taking only seconds to produce. In addition, copying changed blocks via a disk mirror to a remote system is much faster than tape backup, since less data is transferred and disk is faster than tape.

How does this all really work?

As mentioned previously, snapshots use disk space in the filesystem, so they can be accessed from the filesystem itself. File restores are therefore self-service and available at any time: you do not need a systems administrator to do the restore (though if you need assistance, we are glad to help).

Snapshots are accessed by listing or changing directories to the “.snapshot” directory. To restore files, you simply go to the snapshot that has the file you need; then copy that file either to the same location it existed previously or to a new location (so as not to overwrite the existing file in the current filesystem space).

Let’s look at an example:

On the Kure and KillDevil clusters, your home directory and starting point upon login is the /nas02/home/o/n/onyen/ directory. You have a 10 GB quota and are encouraged to save any applications, scripts, or smaller input/output files in this location. Please note that the following example works for all directories and files under /nas01 and /nas02 (i.e. Research Computing’s NAS filesystems), including home space, depts space, and other spaces.

[chammitt@kure-login1 ~]$ pwd

/nas02/home/c/h/chammitt

 

List of snapshots on /nas02 volume:

 

Note that in the output below only the past week is displayed.

[chammitt@kure-login1 ~]$ ls -tlu .snapshot

total 200

drwxrwxrwx 23 chammitt root 8192 Oct  8 11:03 hourly.0

drwxrwxrwx 23 chammitt root 8192 Oct  8 00:00 nightly.0

drwxrwxrwx 23 chammitt root 8192 Oct  7 16:03 hourly.1

drwxrwxrwx 23 chammitt root 8192 Oct  7 00:00 nightly.1

drwxrwxrwx 23 chammitt root 8192 Oct  6 00:00 nightly.2

drwxrwxrwx 23 chammitt root 8192 Oct  5 00:00 nightly.3

drwxrwxrwx 23 chammitt root 8192 Oct  4 00:00 nightly.4

drwxrwxrwx 23 chammitt root 8192 Oct  3 00:03 nightly.5

drwxrwxrwx 23 chammitt root 8192 Oct  2 00:03 nightly.6

 

Active Filesystem:

 

The following lists the entries in my home directory whose names contain “test”:

 

[chammitt@kure-login1 ~]$ ls -lhu|grep test

drwxrwx--- 2 root     root     4.0K Oct  8 03:49 test

drwxr-xr-x 2 chammitt employee 4.0K Oct  8 11:24 testsnapshot

 

Snapshot Filesystem:

 

The following is from the hourly.0 snapshot directory (see the snapshot listing above); the snapshot schedule is discussed below.

[chammitt@kure-login1 ~]$ ls -lth .snapshot/hourly.0|grep test

-rw-r--r-- 1 chammitt employee 2.6G Oct  3 07:46 testwestportdd6

-rw-r--r-- 1 chammitt employee 977M Oct  3 07:45 testwestportdd66

-rw-r--r-- 1 chammitt employee 3.9G Oct  2 11:11 testwestportdd2

-rw-r--r-- 1 chammitt employee 977M Sep 27 11:21 test5dd

-rw-r--r-- 1 chammitt employee 977M Sep 27 11:21 test4dd

drwxrwx--- 2 root     root     4.0K Aug  5 11:18 test

 

Notice there are several more files in the snapshot than in the active filesystem, as some files have been deleted from the active filesystem.  Also note that the directory “testsnapshot” is missing from the snapshot, which is because that directory was created after the snapshot was taken.
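The comparison described above can also be scripted. The following is a minimal sketch, not part of the original procedure: missing_from_active is a hypothetical helper name, and it only compares top-level, non-hidden entries.

```shell
#!/bin/sh
# missing_from_active SNAPSHOT_DIR [ACTIVE_DIR]
# Hypothetical helper: print the name of each entry that exists in the
# snapshot directory but not in the active directory (default: the
# current directory). Only top-level, non-hidden entries are checked.
missing_from_active() {
    snap=$1
    active=${2:-.}
    for path in "$snap"/*; do
        # Skip the unexpanded glob pattern if the snapshot directory is empty
        [ -e "$path" ] || continue
        name=$(basename "$path")
        [ -e "$active/$name" ] || echo "$name"
    done
}

# Example, run from your home directory:
#   missing_from_active .snapshot/hourly.0
```

Each name it prints is a candidate for restoring from that snapshot.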

Now, let’s say that file “test4dd” is the file I wish to recover to my home directory. 

[chammitt@kure-login1 ~]$ cp -p .snapshot/hourly.0/test4dd .

 

Notice the “-p” flag on the cp command above, which preserves file attributes such as permissions and timestamps. Without this flag, the copy is treated as a new write, so the timestamp is the current time and the permissions are based on your umask and primary UNIX GID.

[chammitt@kure-login1 ~]$ ls -lht|grep test

drwxr-xr-x 2 chammitt employee 4.0K Oct  8 11:24 testsnapshot

-rw-r--r-- 1 chammitt employee 977M Sep 27 11:21 test4dd

drwxrwx--- 2 root     root     4.0K Aug  5 11:18 test
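If the file still exists in the active filesystem and you do not want to overwrite it, you can restore the snapshot copy under a different name. The following is a minimal sketch under that assumption; restore_copy and the ".restored" suffix are hypothetical choices for illustration, not an established convention.

```shell
#!/bin/sh
# restore_copy SNAPSHOT_FILE
# Hypothetical helper: copy a file out of a snapshot into the current
# directory as NAME.restored, refusing to overwrite an existing file.
restore_copy() {
    src=$1
    dest="./$(basename "$src").restored"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # -p preserves permissions and timestamps, as in the example above
    cp -p "$src" "$dest"
}

# Example, run from your home directory:
#   restore_copy .snapshot/hourly.0/test4dd
```

You can then compare the restored copy with the current file and rename it once you are sure it is the version you want.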

Snapshot schedules and retention

Snapshots can be hourly, nightly, or weekly. Hourly snapshots are taken at fixed times every day. Nightly snapshots are taken at midnight Monday through Saturday. The weekly snapshot is taken at midnight on Sundays.

 

Our hourly snapshots are taken at 11am and 4pm, and we keep two of them. Hourly snapshots are useful for mistakes you recognize immediately, e.g. when you accidentally delete a file, update the wrong file, or overwrite a file with a copy. You can quickly correct your mistake by copying files out of one of the two hourly snapshots. There are always only two hourly snapshots: the 11am snapshot replaces the previous day’s 11am snapshot while preserving the previous day’s 4pm snapshot, and the 4pm snapshot then replaces the previous day’s 4pm snapshot.

 

So, if you need to restore a version of a file that is more than 24 hours old, you will be interested in the nightly snapshots. The plan is to retain up to 24 nightly snapshots, or four weeks’ worth of backups.
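With several hourly and nightly snapshots available, it helps to see which of them contain the file you are looking for. The following is a minimal sketch; list_versions is a hypothetical helper name, and it assumes the snapshots appear as subdirectories of .snapshot, as in the listings above.

```shell
#!/bin/sh
# list_versions FILE [SNAPSHOT_ROOT]
# Hypothetical helper: show a long listing of FILE in every snapshot that
# contains it. SNAPSHOT_ROOT defaults to ./.snapshot. FILE is a path
# relative to the directory whose .snapshot is being searched.
list_versions() {
    file=$1
    snapdir=${2:-.snapshot}
    for snap in "$snapdir"/*/; do
        # $snap ends with a slash, e.g. ".snapshot/nightly.3/"
        if [ -e "$snap$file" ]; then
            ls -ld "$snap$file"
        fi
    done
}

# Example, run from your home directory:
#   list_versions test4dd
```

The sizes and modification times in the output can help you pick the snapshot that holds the version you need before copying it back with cp -p.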

 

A caveat can come into play here. As already mentioned, disk space consumption is required for snapshots based on the rate-of-change within the filesystem. If a high rate-of-change or odd use of the filesystem causes the filesystem to run low on disk space, there is a mechanism in place to delete the oldest snapshot in order to free up space in the filesystem. This should rarely (if ever) happen.

Additional Helpful Notes

 

  • When listing snapshots, use the -u flag with the ls command to see the access times, which show when each snapshot was created. Otherwise, you would have to go by the snapshot name.

[chammitt@kure-login1 ~]$ ls -lt .snapshot

total 200

drwxrwxrwx 23 chammitt root 8192 Oct  5 08:51 hourly.0

drwxrwxrwx 23 chammitt root 8192 Oct  5 08:51 hourly.1

drwxrwxrwx 23 chammitt root 8192 Oct  5 08:51 nightly.0

drwxrwxrwx 23 chammitt root 8192 Oct  5 08:51 nightly.1

drwxrwxrwx 23 chammitt root 8192 Oct  5 08:51 nightly.2

drwxrwxrwx 23 chammitt root 8192 Oct  3 07:45 nightly.3

drwxrwxrwx 23 chammitt root 8192 Oct  3 07:45 nightly.4

drwxrwxrwx 23 chammitt root 8192 Oct  2 11:21 nightly.5

drwxrwxrwx 23 chammitt root 8192 Sep 28 10:06 nightly.6

 

As opposed to

[chammitt@kure-login1 ~]$ ls -ltu .snapshot

total 200

drwxrwxrwx 23 chammitt root 8192 Oct  8 11:03 hourly.0

drwxrwxrwx 23 chammitt root 8192 Oct  8 00:00 nightly.0

drwxrwxrwx 23 chammitt root 8192 Oct  7 16:03 hourly.1

drwxrwxrwx 23 chammitt root 8192 Oct  7 00:00 nightly.1

drwxrwxrwx 23 chammitt root 8192 Oct  6 00:00 nightly.2

drwxrwxrwx 23 chammitt root 8192 Oct  5 00:00 nightly.3

drwxrwxrwx 23 chammitt root 8192 Oct  4 00:00 nightly.4

drwxrwxrwx 23 chammitt root 8192 Oct  3 00:03 nightly.5

drwxrwxrwx 23 chammitt root 8192 Oct  2 00:03 nightly.6

 

  • To preserve file attributes, use cp -p or rsync -a when copying from snapshots to the active filesystem.
  • If copying lots of files or directories, use rsync instead of cp.
  • Snapshot permissions are the same as filesystem permissions; if you don’t have access to a file in the active filesystem, you don’t have access to it in a snapshot.
  • The .snapshot directory is available at every level of the directory tree. You can start at the root of a volume, or far down the directory tree. For example, these locations are the same:

 

[chammitt@kure-login1 ~]$ ls -lhtu /nas02/home/.snapshot/hourly.0/c/h/chammitt/|grep test

drwxrwx--- 2 root     root     4.0K Oct  8 11:03 test

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 test4dd

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 test5dd

-rw-r--r-- 1 chammitt employee 3.9G Oct  8 11:03 testwestportdd2

-rw-r--r-- 1 chammitt employee 2.6G Oct  8 11:03 testwestportdd6

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 testwestportdd66

[chammitt@kure-login1 ~]$ ls -lhtu /nas02/home/c/h/chammitt/.snapshot/hourly.0|grep test

drwxrwx--- 2 root     root     4.0K Oct  8 11:03 test

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 test4dd

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 test5dd

-rw-r--r-- 1 chammitt employee 3.9G Oct  8 11:03 testwestportdd2

-rw-r--r-- 1 chammitt employee 2.6G Oct  8 11:03 testwestportdd6

-rw-r--r-- 1 chammitt employee 977M Oct  8 11:03 testwestportdd66
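As a sketch of the rsync advice in the notes above, the following restores a whole directory from a snapshot into a new location. restore_dir and the directory name mydir are hypothetical, and the fallback to cp -Rp when rsync is not installed is an assumption added for this example.

```shell
#!/bin/sh
# restore_dir SNAPSHOT_DIR DEST_DIR
# Hypothetical helper: copy the contents of a directory in a snapshot
# into DEST_DIR, preserving permissions and timestamps.
restore_dir() {
    src=$1    # e.g. .snapshot/nightly.0/mydir
    dest=$2   # e.g. ./mydir.restored
    if command -v rsync >/dev/null 2>&1; then
        # The trailing slash on the source copies its contents into dest
        rsync -a "$src"/ "$dest"/
    else
        # Fallback for systems without rsync
        mkdir -p "$dest" && cp -Rp "$src"/. "$dest"/
    fi
}

# Example, run from your home directory:
#   restore_dir .snapshot/nightly.0/mydir ./mydir.restored
```

Restoring into a new directory rather than over the existing one leaves the active copy untouched until you have checked the restored files.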

Additional Help

Be sure to check the Research Computing home page for information about other resources available to you.