Veeam Backup to AWS Virtual Tape Library (VTL)

When I was still working as a backup administrator, one of my biggest day-to-day concerns was finding enough space to save all the "archive" level data (a.k.a. infinite retention junk). Over time, this kind of data can consume an absolutely terrifying amount of storage space.

In one of the environments I was managing, the problem was simple - the archive space was running out and management was tired of reinventing the wheel every couple of years by basically rebuilding their archiving storage system. They wanted to switch to tape...

...without most of the costs associated with "going tape". They refused to purchase additional equipment (tape drives, racks, libraries etc.) and demanded that the storage previously used by backups (NAS) be returned to the production environment ASAP. Crazy? Yep.

Anyway, I did not get a chance to see how this story ended, as I switched jobs and became the mostly VMware/Linux guy I am today. That said, last week I decided to come back to this problem (as a thinking exercise) and try to figure out what the perfect solution would be in this scenario.

The answer turned out to be... virtual tape.

Cost Analysis

If we look at it purely from a cost perspective, the breakdown is as follows:

  • Virtual Tape Storage (S3) - used as the initial storage where tape backups land when they are uploaded, before they go to Glacier - $0.025/GB/month
  • Virtual Tape Storage Archive (Glacier) - used when tapes are archived - $0.005/GB/month

On top of that, there is also the cost of the gateway appliance itself. The way this works is, you are billed for data written to the gateway only up to a monthly cap of $125.00, and anything after that is free of further charges, or as Amazon puts it:

Up to a maximum of $125.00 per gateway per month. The first 100 GB per account is free.

So in summary, we have a cost of at most $125.00 per gateway, per month, no matter how far past the free 100GB our monthly uploads to AWS go (and the client we're discussing obviously went past it).
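
To put the numbers above together, here is a minimal sketch of a monthly estimate. It uses only the prices quoted in this post and treats the gateway fee as the full $125.00 cap, so it is a worst-case figure rather than an exact bill. The function name and the example volumes are made up for illustration.

```python
# Prices quoted in this post (USD). Check the current AWS pricing page
# before relying on them.
VTL_PRICE_PER_GB = 0.025      # virtual tape storage, per GB per month
ARCHIVE_PRICE_PER_GB = 0.005  # archived tapes in Glacier, per GB per month
GATEWAY_FEE_CAP = 125.00      # maximum gateway charge, per gateway per month


def monthly_cost(vtl_gb, archive_gb, gateways=1):
    """Worst-case monthly bill: storage plus the capped gateway fee."""
    storage = vtl_gb * VTL_PRICE_PER_GB + archive_gb * ARCHIVE_PRICE_PER_GB
    return round(storage + gateways * GATEWAY_FEE_CAP, 2)


if __name__ == "__main__":
    # Hypothetical volumes: 300 GB still in the VTL, 10,000 GB archived.
    print(monthly_cost(vtl_gb=300, archive_gb=10_000))
```

Retrieval and data transfer charges are not covered by this sketch, as the post does not quote prices for them.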

If you add this all up, the cost of AWS VTL looks almost too good to be true. I was mostly surprised by how cheap the gateway itself is. If you're interested in the details, you can check out their full pricing here.

Setup

In this setup guide, we're going to assume the client's backup is based on Veeam Backup and Replication 9.5 U3 and VMware vSphere 6.7.

We are also going to assume the client already has access to an AWS account.

1. Download and configure the VTL appliance

  1. After logging in to AWS, open up the services menu
  2. In the AWS services menu, select the Storage Gateway service
  3. The Create Gateway wizard should pop up automatically at this stage; if it doesn't, select the Create Gateway option
  4. Select Tape Gateway
  5. Select VMware ESXi and download the image.

The image will be saved in .ovf format. The deployment process is pretty straightforward:

  1. Log in to your vSphere environment (vCenter)
  2. Right-click on the datacenter which should be hosting the appliance
  3. Select "Deploy OVF Template"
  4. Follow the steps in the deployment wizard to finish

Once the image has been deployed to your environment, open up the console and finish the setup. The two most important things to configure here are:

  1. The default username is sguser and the password is sgpassword; you might want to change these to something more secure
  2. The appliance probably received an IP address from DHCP. The VTL Gateway has to be in the same subnet as the Veeam B&R server, so you might need to adjust that.
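
If you are not sure whether the DHCP-assigned address ended up in the right place, a quick check along these lines can help. This is only a sketch using Python's standard ipaddress module. The addresses and the /24 prefix are made-up examples, so substitute your own.

```python
import ipaddress


def same_subnet(gateway_ip, veeam_ip, prefix_len=24):
    """Return True if both addresses fall inside the same IPv4 network."""
    # ip_interface accepts a host address with a prefix and exposes its network.
    gateway_net = ipaddress.ip_interface(f"{gateway_ip}/{prefix_len}").network
    return ipaddress.ip_address(veeam_ip) in gateway_net


if __name__ == "__main__":
    # Hypothetical addresses for the VTL Gateway and the Veeam B&R server.
    print(same_subnet("192.168.10.20", "192.168.10.45"))
    print(same_subnet("192.168.10.20", "192.168.11.45"))
```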

With this part of the configuration finished, we can move on to the storage. The VTL Gateway requires at least two virtual disks (cache and upload buffer) and you need to add those manually from vSphere. I recommend creating at least a 512GB vmdk for the cache and a 256GB vmdk for the upload buffer.

  1. Exit the VTL console screen and right-click on its object (VM) in the vCenter inventory
  2. Open the hardware tab and select Add new hardware
  3. Select Add new disk from the list of available hardware
  4. Specify the size of the new disk and adjust other relevant settings
  5. Repeat this operation for every disk needed.
  6. When finished adding the disks, select Save Configuration

Once the drives are added, we can reboot the appliance.

After a few minutes, if we go back to the AWS console, it should be asking for an IP address to connect to. Enter the LAN IP address of the appliance - do not worry about port configuration, as the connection will be established over HTTPS and then always initiated from within your network.

After configuring a connection between the local appliance and AWS, we can finally create some tapes. As I didn't want to invest too much into this project, I created only three 100GB tapes, but you can go higher if needed. The 100GB tape is the smallest available option.
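
I created my tapes in the AWS console, but the same thing can be scripted. Below is a hedged sketch using boto3's Storage Gateway client and its create_tapes call. The gateway ARN and the barcode prefix are placeholders I made up, and the 100GB minimum is taken from this post. The request builder is kept separate so it can be inspected without touching AWS.

```python
import uuid

GIB = 1024 ** 3  # the API expects tape sizes aligned to whole gibibytes


def build_create_tapes_request(gateway_arn, tape_count=3, tape_size_gib=100,
                               barcode_prefix="VEEM"):
    """Build the keyword arguments for the Storage Gateway CreateTapes call."""
    if tape_size_gib < 100:
        raise ValueError("100 GiB is the smallest tape size")
    return {
        "GatewayARN": gateway_arn,
        "TapeSizeInBytes": tape_size_gib * GIB,
        "ClientToken": str(uuid.uuid4()),     # unique token to make retries safe
        "NumTapesToCreate": tape_count,
        "TapeBarcodePrefix": barcode_prefix,  # hypothetical prefix, 1-4 letters
    }


def create_tapes(gateway_arn, **kwargs):
    """Send the request. Needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("storagegateway")
    request = build_create_tapes_request(gateway_arn, **kwargs)
    return client.create_tapes(**request)["TapeARNs"]


if __name__ == "__main__":
    # Placeholder ARN. Replace it with the ARN of your own gateway.
    arn = "arn:aws:storagegateway:eu-west-1:123456789012:gateway/sgw-EXAMPLE"
    print(build_create_tapes_request(arn))
```

Calling create_tapes(arn) would then create the three tapes for real, so double-check the size and count before running it.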

2. Configuring Veeam

  1. Log in to your Veeam management server
  2. After logging in, go to the control panel and select iSCSI connections
  3. Add your AWS VTL Gateway Appliance (by IP address) and select quick connect
  4. Veeam should think for a second and finally discover multiple iSCSI targets
  5. Connect to every discovered target. These are your tape drives
  6. After connecting to all targets, close the iSCSI connections screen
  7. Back in the main Veeam console, select Tape Infrastructure and add a new tape server
  8. Run through the wizard - it's very straightforward and shouldn't cause any problems.
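
If no targets show up in step 4, it is worth confirming that the Veeam server can reach the gateway on the standard iSCSI port (TCP 3260) before digging any deeper. A minimal sketch of such a check, run from the Veeam server, could look like this. The IP address in the example is made up.

```python
import socket

ISCSI_PORT = 3260  # standard iSCSI target port


def iscsi_reachable(host, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection to the iSCSI port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical LAN address of the VTL Gateway appliance.
    print(iscsi_reachable("192.168.10.20"))
```

This only proves that the port is open. It does not log in to the targets or validate the tape devices.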

Your new tape server should now be ready for use in Veeam. Before we can run any backups or backup copy jobs to it, we still need to do a few more things...

3. GFS Media Pools

GFS media pools control how we are going to store our tape media. This is a very important step in our setup, as we'll be using both AWS S3 and AWS Glacier, so these rules are critical for deciding how data is split between these two storage tiers.

  1. First, log in to the Veeam Console and select Tape Infrastructure, if you're not already there
  2. Right-click on Media Pools under Tape Infrastructure and select Add GFS Media Pool
  3. On the next page, select the free tapes which should belong to that particular pool.

As AWS is going to charge per tape in case of an eventual recovery operation, it is best to split the pools into groups based on the type of VM each pool is going to hold backups for. This way, you won't have to pull a massive number of tapes for a single VM recovery. At the same time, if the backup going to tape is of the DR type (e.g. a backup of an entire Active Directory environment), it might be better not to split the pool, as recovering tapes from different pools takes longer than recovering multiple tapes from a single one.

  1. After selecting all the required tapes, click Next
  2. On the following page, select the retention settings needed and then open up the Advanced window
  3. In the Advanced window, there is one very important setting which we want to enable - Move all offline tapes into the following media vault. Selecting this will make Veeam move archived (finished) backups to AWS Glacier
  4. Next, close the Advanced window and proceed to the next page
  5. On the last page - Encryption - choose which kind of extra encryption Veeam should use for this storage pool. Using some kind of encryption is recommended, as backing up to AWS is considered an off-site backup.

4. Creating GFS Tape Jobs

Finally, now that we have created our media pool and vault, we can move on to creating a tape backup job.

A tape backup job will start and then wait for the next backup to occur, just like a regular backup copy job. This means the tape controller checks every hour for new, complete (i.e. not in progress) backup files. Once a backup job finishes, the tape job controller will send an API request to the library and the job will start up.

  1. To create a new Tape Backup job, select Tape Job and New Backup to Tape Job from the main window of the Veeam Console
  2. On the next page, under Backup Files, select the backup job(s) which should be saved onto tape (remember to select the last full or differential backup)
  3. Click Next and on the following page, select the media pool created in the previous step
  4. On the next page, select both the Eject Media upon completion and Export the following media sets upon job completion. Skipping this part will prevent the tapes from being transferred over to AWS Glacier.
  5. Move on to the last page and configure the schedule.

With all of that done, we can just relax and wait for the backup to be saved onto our virtual tapes in AWS.

What do you think?

I must admit, when I first heard about backing up archive data to the cloud or to "virtual tapes", I wasn't too optimistic about it. Seeing the price of cloud storage - reasonable for anything under 1TB and straight-out crazy for anything else - I was certain that tape jobs could only happen locally.

Well, not for the first time and definitely not the last, I was wrong in my predictions. Backing up to AWS seems like a reasonably priced alternative if the backup infrastructure in the environment has access to a fast enough uplink. Obviously, it's not for everyone (remote sites, huge data backups etc.), but it's certainly perfect for clients such as the one I described in the intro - desperate to off-load the upkeep of the local infrastructure and adamant about stabilizing the costs on a monthly basis.

Have you ever used any virtual tapes in your backup environment? Send me your opinions @wilk_it_wizard or #sshguru
