This chapter describes Retrospect’s main concepts. This manual and the program itself refer constantly to these basic ideas, so it is important to understand them to get the most out of using Retrospect. In this chapter, you’ll learn how Retrospect works; about the different kinds of Media Sets you can use to back up your data; and about the backup actions you can take with Media Sets.
How Retrospect Works
Retrospect uses an archival method of backup that ensures backed up files are not deleted or written over until you request it. That way, they stay on the backup media indefinitely. For example, if you have been working on a particular document over a period of time, Retrospect backs up a different version of the document each time you back up. If necessary, Retrospect lets you retrieve a previous version of the file from any point in time it was backed up.
Retrospect always performs Smart Incremental backups. A Smart Incremental backup intelligently copies only files that aren’t already present on the current Media Set being used for backups (typically those files that are new or have changed since the previous backup). You don’t have to specify whether you want a “full” or “incremental” backup. Retrospect, by default, copies any and all of the files it hasn’t already backed up.
Because Retrospect only needs to add one instance of each unique file to the backup, it saves space on the backup media that would otherwise be used up storing duplicate copies of files. This space-saving technique is known as file-level deduplication or single-instance storage.
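The selection behind a Smart Incremental backup can be pictured with a minimal sketch. It assumes a file counts as "already backed up" when its name, size, and modification time all match an entry in the Media Set; that matching key, and the names FileRecord and smart_incremental, are illustrative assumptions and not Retrospect’s actual criteria or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical matching key; the real matching criteria may differ."""
    name: str
    size: int
    modified: float  # modification time, seconds since the epoch


def smart_incremental(source_files, catalog):
    """Return the source files that have no exact match in the catalog."""
    to_copy = []
    for record in source_files:
        if record not in catalog:
            to_copy.append(record)
            catalog.add(record)  # a later identical file is stored only once
    return to_copy


catalog = set()
monday = [FileRecord("report.doc", 1200, 100.0), FileRecord("logo.png", 900, 50.0)]
tuesday = [FileRecord("report.doc", 1350, 200.0), FileRecord("logo.png", 900, 50.0)]

first = smart_incremental(monday, catalog)    # empty Media Set: both files are copied
second = smart_incremental(tuesday, catalog)  # only the edited report is copied
```

In this sketch the first run behaves like a "full" backup and the second like an "incremental" one, without either mode being requested, and both versions of report.doc remain in the set.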
All of the backup, copy, and restore operations in Retrospect require a source and a destination. For a backup, the source is generally a hard drive or a folder on a hard drive (Retrospect calls these sources and Favorite Folders, respectively). The destination is generally a Media Set stored on backup media such as disk or tape.
Retrospect uses a Catalog file, an index of the files and folders contained in a Media Set, to keep track of the different generations of modified files in a Media Set. The Catalog lets you quickly search for files without having to actually search the backup media itself, which would be considerably slower, especially with media like digital tape. By default, Catalog files are stored on the Retrospect server computer, in
Sources are the disk volumes, folders on disk volumes, and networked clients that you want to back up. Each source you want to back up needs to be added to the Sources list.
It’s important to understand that Retrospect uses the term sources to refer to the volumes and folders that you want to back up, and also to refer to hard disk volumes that those backups will be written to. For example, you can back up a client’s hard disk called My Disk (that’s a source) to a Disk or File Media Set that resides on a hard disk named Backup Disk (because it is a hard disk that could also be backed up, Backup Disk also appears in the Sources list).
Above the Sources list, the toolbar allows you to add or remove sources, add Favorite Folders, and otherwise work with items in the Sources list. Below the list, a tabbed area allows you to see important details on a source that you have selected in the list.
Media Sets are the destinations for files and folders that you back up. A Media Set consists of one or more disks, tapes, optical discs, or a single file. Individual pieces of media (for example, tapes, optical discs, or hard disks) are members of a Media Set. A Media Set can be made up of almost any sort of storage media: hard drives, disk arrays, tape, and even flash memory.
You can back up as many source volumes as you like to a single Media Set. For example, you could have a single Media Set as the backup destination for your computer’s internal hard disk, your external hard disk, a coworker’s hard disk on a computer with installed Retrospect Client software, and even a Mac OS X Server or Windows Server. All of your Media Sets appear in Retrospect’s Media Sets list.
Above the list, the toolbar allows you to work with Media Sets, including functions such as adding, removing, copying, and verifying a Media Set. Below the list, tabs give you more information on the Media Set you have selected.
When a disk or tape fills with data, Retrospect asks for a new member, adds it to the Media Set and continues appending data. It automatically uses any available new or erased media. If the media has the name Retrospect is looking for, Retrospect will erase and reuse it. However, Retrospect will never automatically use a medium with the wrong name if it has data on it.
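The rule for which media Retrospect will use without asking can be summarized in a few lines. This is only a sketch of the behavior described above; the Medium type and the function name are hypothetical, and the member name in the example is made up.

```python
from dataclasses import dataclass


@dataclass
class Medium:
    name: str
    has_data: bool  # False for new or erased media


def can_use_automatically(medium, expected_name):
    """True if the medium may be used without asking, per the rule above."""
    if not medium.has_data:
        return True  # new or erased media is always acceptable
    # Media that holds data is erased and reused only when it has the expected name.
    return medium.name == expected_name


blank = Medium("Untitled", has_data=False)
expected = Medium("2-Media Set A", has_data=True)
other = Medium("Family Photos", has_data=True)
```

With an expected name of "2-Media Set A", the first two media qualify and the third is left alone.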
When you create a Media Set in Retrospect, it can be one of the following types:
Disk Media Sets are Retrospect’s most flexible Media Set. They allow backups to span across multiple, random-access storage devices, including hard disks, Network Attached Storage (NAS), removable cartridges, and even flash media. Older backups can be groomed from Disk Media Sets to reclaim space, and it’s even possible to perform a restore from a Disk Media Set that’s in use by a backup operation. Disk Media Sets should be your most-used backup destination if you are not using tape backup. A Disk Media Set writes a folder containing a series of files to the destination media, with each file being no larger than 600 MB (which can be useful in environments where these files are replicated to additional storage, such as an off-site vault). Retrospect considers the folder containing the backup files to be a single member of the Disk Media Set. Disk Media Sets replace the less-flexible Removable Disk sets present in older versions of Retrospect. Catalogs for Disk Media Sets are usually stored on the Retrospect server’s hard drive.
Tape Media Sets use tape drives and backup tapes as the storage medium. Retrospect supports many types of tape drives, including DAT drives, LTO drives, AIT drives, VXA drives, and DLT drives. See the Retrospect website for a complete list of supported drives. Some drives, such as tape libraries (which can accommodate and automatically load multiple tapes) may require a license for the Advanced Tape Support add-on. Catalogs for Tape Media Sets are usually stored on the Retrospect server’s hard drive.
Tape WORM Media Sets are similar to Tape Media Sets, except that the tapes they use are WORM (Write Once, Read Many). As the name suggests, WORM tapes cannot be erased or reused once data is written to them. They are used for archival purposes and to comply with government regulations requiring document retention. Catalogs for Tape WORM Media Sets are usually stored on the Retrospect server’s hard drive.
File Media Sets combine the Catalog file and the backed-up data into a single file stored on a volume. They can be saved anywhere a Disk Media Set can be saved, but they are limited by the size of the volume on which they are stored, and also by the maximum file size of the file system (FAT32, NTFS, HFS+, etc.). Backups in a File Media Set cannot span across media. File Media Sets are useful for small jobs where everything (the Catalog and the backed-up data) is self-contained in a single file, but in most cases, you should use Disk Media Sets.
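As a rough illustration of the 600 MB file limit mentioned for Disk Media Sets, the following sketch estimates the smallest number of backup files a given amount of data would need. It assumes 1 MB means 1,000,000 bytes and ignores any overhead, so treat the result as a lower bound rather than an exact count.

```python
MAX_FILE_BYTES = 600 * 1000 * 1000  # assumption: 1 MB = 1,000,000 bytes


def minimum_backup_files(backup_bytes):
    """Smallest number of files of at most 600 MB that can hold the data."""
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (backup_bytes + MAX_FILE_BYTES - 1) // MAX_FILE_BYTES


files_for_2_5_gb = minimum_backup_files(2_500_000_000)
```

Under these assumptions, 2.5 GB of backed-up data needs at least five files in the member folder.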
Storage Groups protect your entire backup environment up to 16x faster with a single, centralized disk or cloud destination that Retrospect can use simultaneously. With Storage Groups, you can run parallel backups to the same disk destination with a single ProactiveAI script. Scheduled scripts support Storage Groups as destinations, but the backups run on a single execution and not in parallel.
Storage Groups support the same workflows that you are accustomed to for backup, restore, transfer, grooming, and catalog rebuild, while providing far better performance and simplicity. You can treat the Storage Group like a Backup Set that allows simultaneous writes to it.
When creating a backup set, you will see "Create as Storage Group" as an option for disk sets and cloud sets. You can create one and use it as a destination for ProactiveAI scripts. If you select multiple sources for the script, they will run backups in parallel to the storage group.
The standard Retrospect Backup workflows for backups, restores, transfers, grooming, and rebuilds are the same for Storage Groups.
Storage Groups appear under the Backup Sets dialog on Windows and in the Media Sets tab on Mac.
For backup, a Storage Group can be treated like a backup set. Select the Storage Group as the destination in your ProactiveAI script.
For restore in Retrospect for Mac, a Storage Group can be treated like a media set.
For transfer in Retrospect for Mac, a Storage Group can be treated like a media set.
For verify in Retrospect for Mac, a Storage Group can be treated like a media set. The user interface is the same.
Under the Hood
Under the hood, a Storage Group is a container for per-volume backup sets. This architecture is why Retrospect lets you keep the same workflows that you are accustomed to for backup, restore, transfer, grooming, and catalog rebuild, while providing far better performance and simplicity.
The architecture for Storage Groups allows simultaneous operations to the same destination because each volume is a different backup set under the hood. However, this architecture also prevents data deduplication across volumes.
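A toy model makes the trade-off concrete. Here a Storage Group is just a dictionary that holds one independent backup set per volume; the class, method, and volume names are hypothetical and only illustrate the container idea described above.

```python
class StorageGroup:
    """Toy model: one independent backup set (a set of file keys) per volume."""

    def __init__(self):
        self.sets = {}  # volume name -> file keys already stored for that volume

    def back_up(self, volume, file_keys):
        """Store the files this volume's own set lacks and return them."""
        stored = self.sets.setdefault(volume, set())
        new = [key for key in file_keys if key not in stored]
        stored.update(new)
        return new


group = StorageGroup()
a = group.back_up("Alice HD", ["installer.dmg", "notes.txt"])
b = group.back_up("Bob HD", ["installer.dmg"])                   # copied again
c = group.back_up("Alice HD", ["installer.dmg", "notes.txt"])   # nothing new
```

Because each volume writes only to its own set, the two backups never contend for the same catalog and could run at the same time. The cost is that installer.dmg is stored once per volume instead of once overall.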
Whenever you run a backup script manually, or when you set a script to later run automatically, you have the option of using one of four media actions. Each media action tells Retrospect how to handle the physical media, which in turn has an effect on which files are backed up.
Retrospect’s four media actions are:
No media action, which is the default choice, tells Retrospect that it doesn’t need to do anything special with media during the current backup. As usual, Retrospect will perform a Smart Incremental backup, which saves time and media space by not copying files that already exist in the Media Set. In other words, Retrospect will copy only files which are new or newly modified since the last backup to the same Media Set.
Skip to new member causes Retrospect to create a new member within the current Media Set. Retrospect will display a dialog requesting a new piece of media, so that you can insert it for use in the next backup operation. This media action is useful when a piece of media that you have previously used for a particular Media Set is not available.
Start new Media Set tells Retrospect to create a new destination Media Set (with a name similar to the old one) of the chosen type. Depending on the type of Media Set, Retrospect will use a new or erased disk or tape. For Disk Media Sets, Retrospect will create a new folder on the disk, and backed-up data will be written as a series of 600 MB backup files inside that folder. Use the Start new Media Set media action so that you can take your old media off-site for safe storage.
Recycle Media Set first clears the catalog contents (if any) of the destination Media Set, so it appears no files are backed up. Then it looks for the first media member of the Media Set and erases it if it is available. If the first member is not available, Retrospect uses any available new or erased piece of media appropriate for the Media Set type. All selected files and folders from the source are then backed up to the Media Set. Use the Recycle Media Set action when you want to reuse one or more pieces of media.
Note: As long as matching is left on (the default), Retrospect will always perform a Smart Incremental backup, adding only those files that don’t exactly match already-backed-up files. If a Media Set and its catalog file are empty, then Retrospect’s Smart Incremental backup will automatically add all the files necessary to restore each backed-up source.
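The effect of each media action on what gets copied can be sketched as follows. The model reduces a Media Set to a catalog and a member count; the names MediaSet, apply_action, and files_to_copy, and the naming of the new set, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class MediaAction(Enum):
    NONE = "No media action"
    SKIP_TO_NEW_MEMBER = "Skip to new member"
    START_NEW_MEDIA_SET = "Start new Media Set"
    RECYCLE = "Recycle Media Set"


@dataclass
class MediaSet:
    name: str
    catalog: set = field(default_factory=set)
    member_count: int = 1


def apply_action(media_set, action):
    """Return the Media Set the backup writes to after the media action."""
    if action is MediaAction.SKIP_TO_NEW_MEMBER:
        media_set.member_count += 1   # catalog kept; writing moves to new media
    elif action is MediaAction.START_NEW_MEDIA_SET:
        return MediaSet(name=media_set.name + " [001]")  # hypothetical naming
    elif action is MediaAction.RECYCLE:
        media_set.catalog.clear()     # the set now appears empty
        media_set.member_count = 1    # start again from the first member
    return media_set


def files_to_copy(source, media_set):
    """Smart Incremental selection: files not already in the catalog."""
    return sorted(f for f in source if f not in media_set.catalog)


source = {"a.txt", "b.txt", "c.txt"}
set_a = MediaSet("Set A", catalog={"a.txt", "b.txt"})

normal = files_to_copy(source, apply_action(set_a, MediaAction.NONE))
new_set = apply_action(set_a, MediaAction.START_NEW_MEDIA_SET)
fresh = files_to_copy(source, new_set)
recycled = files_to_copy(source, apply_action(set_a, MediaAction.RECYCLE))
```

With no media action only c.txt is copied. Starting a new Media Set or recycling copies all three files, because in both cases the destination catalog is empty; the difference is that a new Media Set leaves the old set and its media intact.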
Retrospect uses a separate catalog file (usually stored in /Library/Application Support/Retrospect/ on the Retrospect server) to keep track of all of the files and folders in a Media Set. You can think of the catalog as an index or table of contents of the files on the backup media. The catalog lets you view the contents of a Media Set without requiring the media to be inserted in the backup device, greatly speeding up finding and retrieving files.
A catalog file is required for all operations that copy files to and from a Media Set. Retrospect can repair damaged catalogs, using the Repair button in the list view toolbar under Media Sets. If the catalog is lost or damaged too severely for the repair operation, Retrospect can rebuild it by reading and reindexing the media.
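To see why an index speeds up searching, consider a small in-memory stand-in for a catalog. It records, for each path, which backups contain a version of the file and on which member; the structure, the date strings, and the member names are illustrative assumptions and say nothing about the real catalog file format.

```python
from collections import defaultdict


class Catalog:
    """Toy index: path -> list of (backup_date, member) entries."""

    def __init__(self):
        self._entries = defaultdict(list)

    def record(self, path, backup_date, member):
        self._entries[path].append((backup_date, member))

    def versions(self, path):
        """All backed-up versions of a file, oldest first."""
        return sorted(self._entries.get(path, []))

    def search(self, text):
        """Paths containing the given text, found without reading any media."""
        return sorted(p for p in self._entries if text in p)


catalog = Catalog()
catalog.record("/Users/alice/report.doc", "2024-05-01", "1-Set A")
catalog.record("/Users/alice/report.doc", "2024-05-02", "2-Set A")
catalog.record("/Users/alice/logo.png", "2024-05-01", "1-Set A")

history = catalog.versions("/Users/alice/report.doc")
matches = catalog.search("report")
```

Both lookups are answered from the index alone. Only an actual restore would need the member that the chosen entry points to.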
Retrospect can back up any drive that can be mounted on the Macintosh desktop, whether that drive is local or a shared networked volume.
Retrospect Clients extend the backup and restore capabilities of Retrospect to other computers on your network. A computer equipped with the Retrospect Client software is known as a Retrospect client computer, or simply a client. Retrospect can back up clients on the network without the need for installing file servers, starting file sharing, or mounting volumes, and it does so with full administrator privileges on those systems.
ProactiveAI is the next generation of Retrospect’s Proactive scheduling engine. With ProactiveAI, backup scripts will optimize the backup window for the entire environment based on a decision tree algorithm and linear regression to ensure every source is protected as often as possible. BackupBot is Retrospect’s first step (with ProactiveAI, Storage Predictions, and 1-Click Backup) toward leveraging that information for better data protection.
Artificial intelligence might seem exotic and complicated, but if you break it down, many machine learning algorithms that fall under that category are straightforward to understand. ProactiveAI is no different. Under the hood, ProactiveAI’s algorithm is a list of choices (a decision tree) that takes into account past events (linear regression).
ProactiveAI walks through the following decision tree algorithm to prioritize what to back up next:
Verify backup window: ProactiveAI only runs when it’s allowed to. To restrict the backup window, go to the script’s schedule.
Verify an execution unit is available: ProactiveAI only runs when an execution unit is available.
Ignore last backup time: Retrospect can back up every hour, every day, every Sunday, or any other schedule. As soon as ProactiveAI sees a new backup window (i.e. a new day), it will attempt to back up the sources. In contrast, previous versions of Retrospect would respect the time at which the last backup occurred. See "Backup Window" for more details.
Ignore unavailable sources: If a source is unavailable, Retrospect will not attempt to reach it again until every potentially available source has been contacted. This list includes Wake-on-LAN sources. See "Wake-on-LAN" for more details.
Prioritize by next day: For all available or potentially available sources, Retrospect divides them into buckets for what day they are scheduled to be backed up next.
Using a future date might seem strange, but it can be in the past as well. This sorting algorithm ensures Retrospect prioritizes initial backups and then overdue backups. Think of it as last backup day combined with the script’s schedule. As an example, Script A with weekly backups and Script B with daily backups would calculate the next backup date differently.
Prioritize by last time checked: When Retrospect reaches out to a source, it marks that time in its configuration. ProactiveAI uses this time to ensure it doesn’t re-check sources that it already checked but couldn’t find, so that the script can get through the entire list of sources before circling back.
Prioritize by the last backup’s duration: Now that Retrospect is down to sources within the same day of priority, ProactiveAI sorts them using a linear regression algorithm based on the last backup’s duration. Sources with faster previous backups will be backed up sooner than sources with slower previous backups.
As a real-life example, incremental backups of email services are fast, so those would be prioritized over a longer server backup. Because of this sorting, Retrospect will protect more sources throughout the day, but if a long server backup does not happen on a given day, its backup will be automatically given higher priority because its next backup was the day before.
Our Engineering team experimented with more data points in the linear regression, but the resulting sort order was too prone to hysteresis. In other words, if Retrospect includes more past data, including backup durations that were anomalies, the future prioritization continued to be affected for longer than we thought was useful.
Default to prior order: If there is no duration, ProactiveAI uses the prior order. For instance, if it’s the first set of backups, they will occur as sources are available.
Connect to the next source: Retrospect will attempt to back up the selected source. If it’s not available, Retrospect marks that time and moves on. If Retrospect times out and the client and script have Wake-on-LAN (WAL) set, Retrospect sends a WAL packet, waits three minutes, then tries to connect again. If that connection times out, Retrospect marks the source as unavailable and moves on.
Record next backup date: After a successful backup, Retrospect marks the next backup date for the source and moves on. As discussed earlier, this future date varies based on the script’s schedule.
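The three prioritization steps above can be approximated with a single sort. This sketch is a simplification under stated assumptions: it orders sources directly by the last backup’s duration instead of fitting a linear regression, it treats a source with no next backup date as an initial backup, and it places sources with no recorded duration ahead of the others within the same day. The Source type and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class Source:
    name: str
    next_backup: Optional[date]        # None: never backed up
    last_checked: Optional[datetime]   # None: not contacted yet
    last_duration: Optional[float]     # seconds; None: no previous backup


def priority_key(source):
    next_day = source.next_backup if source.next_backup is not None else date.min
    checked = source.last_checked if source.last_checked is not None else datetime.min
    duration = source.last_duration if source.last_duration is not None else 0.0
    # Earlier next-backup day first, then least recently contacted,
    # then the faster previous backup.
    return (next_day, checked, duration)


def prioritize(sources):
    # sorted() is stable, so sources with equal keys keep their prior order.
    return sorted(sources, key=priority_key)


today = date(2024, 5, 2)
sources = [
    Source("file server", today, None, 7200.0),
    Source("mail", today, None, 300.0),
    Source("old server", date(2024, 5, 1), None, 9000.0),  # overdue since yesterday
    Source("new laptop", None, None, None),                # initial backup
    Source("absent laptop", today, datetime(2024, 5, 2, 9, 0), 60.0),
]
order = [s.name for s in prioritize(sources)]
```

In this example the initial backup comes first, the overdue server second, and the fast mail backup ahead of the slow file server. The absent laptop, which was already contacted today, waits until the rest of the list has been tried.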
Retrospect begins a backup as soon as a source becomes available. If Alice’s laptop was backed up at 2:30pm yesterday, ProactiveAI will attempt to back up her laptop as soon as it comes online today, even if that’s before 2:30pm.
This change corrects a long-standing issue with drift, and for existing customers, this new schedule represents a significant change from previous versions. In the past, Proactive used the "Last Backup Time" to determine when to next back up a source. If Alice’s laptop was backed up at 2:30pm yesterday, an older version of Proactive would wait until 2:30pm today to attempt the next backup, regardless of whether it was idle and Alice’s laptop was available.
Alice might have only opened her laptop at 2:30pm yesterday, but every other day, she is online at 9am. Without this change, every future backup would have been at 2:30pm or later until she missed a day. Instead, her laptop is protected as soon as it’s available for each backup window. For fine-grained scheduling, customers can use multiple ProactiveAI scripts with different schedules.
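The difference between the two behaviors fits in two small functions. The sketch assumes a daily schedule in which each calendar day is one backup window, and it models the older behavior as "24 hours after the last backup"; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def due_by_window(last_backup, now):
    """Current behavior as described: due as soon as a new day has started."""
    return now.date() > last_backup.date()


def due_by_last_backup_time(last_backup, now):
    """Older behavior as described: due 24 hours after the last backup."""
    return now - last_backup >= timedelta(days=1)


last = datetime(2024, 5, 1, 14, 30)     # Alice was backed up at 2:30pm yesterday
morning = datetime(2024, 5, 2, 9, 0)    # she comes online at 9am today
afternoon = datetime(2024, 5, 2, 14, 30)

new_at_9am = due_by_window(last, morning)
old_at_9am = due_by_last_backup_time(last, morning)
```

At 9am the window-based check already considers Alice’s laptop due, while the time-based check holds off until 2:30pm.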
ProactiveAI is better optimized for handling Wake-on-LAN (WAL) sources. If the source has WAL enabled or the script has WAL enabled, ProactiveAI will include WAL packets in its operation. For each WAL source, Retrospect attempts a connection. If that times out after one minute, it sends a WAL packet, waits three minutes, and then attempts another connection. If that times out after one minute, ProactiveAI marks the source as unavailable, moves on, and will not attempt another connection until it has contacted each subsequent source.
In previous versions, Proactive would continue to attempt to wake up unresponsive or absent machines. For environments that had many laptops or otherwise unavailable machines, this workflow meant that Retrospect would spend a disproportionate amount of time looking for machines instead of backing up available machines.
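The Wake-on-LAN sequence described above (connect, wake, wait, connect once more) can be sketched with the network operations passed in as functions. The try_connect and send_wake_packet callables are hypothetical stand-ins, not Retrospect functions, and the demo replaces the three-minute wait with a stub so that it runs instantly.

```python
import time


def contact_source(try_connect, send_wake_packet, wake_enabled, sleep=time.sleep):
    """Return True if the source was reached, False if it should be marked unavailable."""
    if try_connect(60):          # first attempt, one-minute timeout
        return True
    if not wake_enabled:
        return False
    send_wake_packet()
    sleep(180)                   # give the machine three minutes to wake up
    return try_connect(60)       # one more attempt, then give up and move on


# Demo with stand-ins: the machine is asleep at first and answers after the wake packet.
events = []
replies = iter([False, True])


def fake_connect(timeout_seconds):
    events.append(f"connect({timeout_seconds})")
    return next(replies)


reached = contact_source(
    fake_connect,
    send_wake_packet=lambda: events.append("wake"),
    wake_enabled=True,
    sleep=lambda seconds: events.append(f"wait({seconds})"),
)
```

The function makes at most two connection attempts per source, so the time spent on an absent machine is bounded before the script moves on to the next source.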
ProactiveAI includes detailed logging to help you understand the choices it’s making to optimize the backup window:
Engine Log Level 4: What ProactiveAI is doing
Engine Log Level 5: What ProactiveAI is considering
See Advanced Logging Options for details about enabling logging.