Backup and Restore Exchange on VMware using NetBackup with GRT – Part1


In this blog series we will explore how to protect Microsoft applications installed on VMware using NetBackup. This first part covers how to configure NetBackup to protect Exchange 2013 installed on VMware ESXi 5.5.

Although it might look straightforward, it is not: you must understand some requirements and prerequisites in order to protect Microsoft applications installed inside VMware VMs using Symantec NetBackup.

This lab will assume that you have:

  • 1 Domain Controller installed.
  • 1 Server running ESXi 5.5
  • 1 Server running Exchange 2013 installed on a VM on the ESXi host.
  • 1 Server running NetBackup software for Windows (the configuration differs for a Linux installation).

 

So let us get started:

Introduction:

Symantec NetBackup 7.6.x can protect VMs and perform VM-level backups by offloading them to a VMware backup host. This backup method accelerates backups and moves the backup load off the production server.

Backups performed at the VM level are quiesced using VSS for a VM-consistent backup, and VMware snapshot technology lets you perform the backup while the machine is running.

Often, Microsoft gurus confuse using snapshots for VM protection with using snapshots to perform backups. I believe that no one can explain it better than AbdulRasheed, who wrote a great article about it here: http://www.symantec.com/connect/blogs/nuts-and-bolts-netbackup-vmware-virtual-machine-snapshots-backing-business-critical-applicatio

So in summary, using snapshots for backups is not the same as using snapshots to protect an Exchange VM.

Now, although we can perform VM-level backups at the host level, with no agent installed on the ESXi host or the VM, by connecting directly to vCenter, this backup method does not support application-level file recovery or GRT (you can perform a regular file restore, but not a mailbox restore, or a database restore for SQL, for example). To perform application-aware backups, you will need to install the Symantec NetBackup client on the Exchange or SQL VM.

Note: as of the date of publishing this article (13/4/2015), Symantec does not support GRT for Exchange 2013, but GRT is supported for earlier versions using either VM policies or Exchange policies.

A word about SAN transport:

One element to be aware of is the SAN transport option. Traditionally, if you back up a VM using the backup agent, you transport the data over the IP network, but what if you have large data sets…very large ones?

In that case you can use FC (SAN) transport, where you back up the data directly over the SAN network (either to SAN storage or to a tape library).

In VMware, you can perform agentless or agent-assisted VM backups and transport the data over FC, which can increase speed by up to 4 times. All you need to do is present the LUNs to the VMware backup host as offline LUNs and configure the policy to use SAN transport. Nice, huh?

Prerequisites:

  • Install the NetBackup Agent inside the VM
  • Install Symantec VSS provider
  • Install and configure NFS to browse backup images for GRT (for Exchange 2007/2010).

You can refer to the documentation for how to perform the above steps.

NetBackup Configuration:

Assuming that you have everything configured, including installing the NetBackup agent on the Exchange VM, you can start by connecting to the vCenter:

Add VMware Virtual Machine Server

 

Enter the FQDN of the vCenter server

 

Add the vCenter information and the account credentials used to connect, and specify the backup host; in my case it is the master server:


One note: make sure to add the account in the domain\username format, because the GUI does not accept user@domain.com.

 

Now you can proceed with configuring a policy, so launch the Policy Configuration Wizard:


Specify a policy name


For the policy storage, select your storage destination, unless you have many storage units and want to load-balance the backup jobs across them:


In the virtual machine options, make sure to specify the VMware backup host and to enable Exchange recovery.


Note: for the primary VM identifier you can select the VM host name, but this requires VMware Tools to be installed and forward and reverse DNS lookups to be working; for simplicity, I like to choose the VM display name.


To select a VM for application protection you must use a query-based selection, or you will get the error “Application Protection options for VMware policies are only valid when using the query option for virtual machine selection in the clients tab”. So you need to create a query in the VM selection that includes the required VM:
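As a rough sketch of what such a selection looks like (the display name EXCH2013 is a made-up example), the filter built by the query builder on the Clients tab takes this general form; on the master server, the nbdiscover utility can preview which VMs a policy query selects (see the NetBackup for VMware guide for its exact invocation):

```shell
# Hypothetical display name -- substitute the exact name shown in vCenter.
VM_NAME="EXCH2013"

# A minimal filter in the style of the query builder on the Clients tab:
# it matches a single VM by its display name.
QUERY="Displayname Equal \"${VM_NAME}\""

echo "$QUERY"
```

Because the policy selects by display name here, renaming the VM in vCenter would silently drop it from the backup, which is one reason to keep display names stable once a query-based policy is in place.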


For the backup selection, you must choose full backups; note that you cannot perform application-level backups using incremental backups. Backups of applications at the VM level must use full backups.


Then specify the desired interval, retention, and schedule.


If everything is configured correctly, you should see the policy kick in, a snapshot being taken, and backups being performed.


Note regarding the policy schedule:
If you right-click the policy and choose to run a manual backup, the policy will kick in, but the backup job will be equivalent to a copy backup, meaning that no Exchange logs will be truncated. To perform a true application-level backup, you have to wait for the scheduled policy run.
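For completeness, a run can also be kicked off from the master server's command line with bpbackup; a minimal sketch, assuming a policy named Exchange-VM-Policy with a full schedule named Full (both names are hypothetical examples):

```shell
# -i = start an immediate policy-based backup
# -p = policy name, -s = schedule name (names below are hypothetical)
if command -v bpbackup >/dev/null 2>&1; then
  bpbackup -i -p Exchange-VM-Policy -s Full
else
  # bpbackup ships with NetBackup and lives under its install path
  echo "bpbackup not found: run this on the NetBackup master server"
fi
```

Whether a CLI-initiated run truncates the Exchange logs is subject to the same caveat as the GUI manual backup described above, so verify the behavior against a scheduled run before relying on it.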

Policy Schedule and Policy Window

Another point that many NBU admins struggle with is policy frequency, window, and retention, so let me elaborate:

Policy frequency: how often the policy kicks in. In the schedule configured above, the policy starts every week, which is equivalent to weekly backups.

Policy window: the window during which the policy is allowed to start. Depending on the configured, running, and queued policies, a policy will start when its window opens, or wait until the required resources are freed if they are not all available (a free tape, for example).

If the window ends without resources becoming available, the policy will not start and you will miss the backup window. Once the policy starts, however, it can safely exceed the policy window: the window affects only when a policy may start and will not end a running policy.
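The start-window rule can be illustrated with a toy model (the 18:00–22:00 window and the whole-hour granularity are arbitrary simplifications, not NetBackup defaults):

```shell
# Toy model of a backup start window: jobs may only START inside the
# window; a job that is already running is not killed when it closes.
WINDOW_OPEN=18    # window opens at 18:00
WINDOW_CLOSE=22   # window closes at 22:00

can_start() {
  # succeeds only when the given hour falls inside the open window
  [ "$1" -ge "$WINDOW_OPEN" ] && [ "$1" -lt "$WINDOW_CLOSE" ]
}

can_start 19 && echo "19:00: job may start"
can_start 23 || echo "23:00: window missed, job will not start"
# A job that started at 21:00 simply keeps running past 22:00.
```

The same model explains a missed window: if a needed resource (a tape drive, say) only frees up after the window closes, the start check fails and the job waits for the next window.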

In part 2 of this series we will see how to perform a restore from the backup we have taken.

 

Response to SQLmag.com article “Should I Be Using SAN Snapshots as a Backup Solution?”


Last week on Twitter, I spotted someone posting a link to the sqlmag.com article “Should I Be Using SAN Snapshots as a Backup Solution?” http://sqlmag.com/blog/should-i-be-using-san-snapshots-backup-solution

Personally, I have a very aggressive attitude toward false and ignorant information published by known authors on the internet. People listen to us and look to us as their source of information, which is why people like us (known authors, MVPs, bExperts, and vExperts) should take extreme care when publishing our opinions on the internet, because people will form all sorts of concepts and decisions based on our feedback.

The above article is one of those rare articles backed by such extensive false information and massive SAN ignorance that it delivers wrong concepts, which lead to wrong decisions and wrong understanding.

I teamed up with Mark Arnold from NetApp, a fellow MVP and SAN sage, and we responded to the above article, providing evidence and correct information to the readers. I provided generic information and Mark provided the NetApp-specific details. Although I worked for a NetApp partner for 3 years, Mark left me little chance to respond or add anything, as his comments were very detailed. I am fortunate to have worked with him on this article.

Below is our detailed response; we hope we can correct some misconceptions.

Article:

My response is pretty basic. As the DBA I want to control the database backups and I don’t want to be using SAN Snapshots as a backup, because no matter what any vendor says a Snapshot is not a backup. Here are some of my reasons why I wouldn’t want to use SAN snapshots instead of database backups.

Mark Arnold:

With just about all solutions you as the DBA get full and complete control of your backups. You get to schedule them and keep them as maintenance jobs or Windows Task Scheduler jobs. You get to determine what you want to back up and how often. You get full control over Transaction Log backups for databases in Full recovery mode and also control over Simple recovery mode databases. You are correct, however, in stating that a snapshot is not a backup. You do something else with that snapshot. You either replicate it to a mirrored location, a long term retention location, a VTL or other mechanism. The point is that you take the snapshot as an efficient mechanism to take the final backup.

Mahmoud Magdy:

You are speaking about old SAN technologies. Most modern SANs are capable of integrating SAN snapshots with SQL by enabling full SQL backups via the SAN snapshot, and some backup applications can integrate with the SAN snapshots as well.

Article:

If there’s a problem and the database needs to be restored it’s the DBA that’s going to be thrown under the bus not the storage admin. If I’m going to be the one getting the blame, I’m going to be in control of the situation.

Mark Arnold:

All solutions will allow you to do a restore in any way you deem it. If you look at, for example, SnapManager for SQL Server the DBA is the person in charge. The storage admin simply has no way to help you. It’s your tool and it’s your job to do the snapshots and restores. You will be responsible for telling the storage admin what you want done with the snapshots and that’s about your only interaction on this subject with the storage admin.

Mahmoud Magdy:

Where did that come from? This is 2013, not 1980. Modern SANs provide their own snapshot tools and applications to offload the work from the storage admin and empower application admins.

Article:

If one or two database pages get corrupt do I really want to restore the entire database to the last snapshot and lose all the changes since then? What if the page became corrupt a month ago and it wasn’t found until now? Now we have to lose a month’s worth of data to restore the corrupt page? I want the ability to restore just that page using the native page level restore features which require having actual SQL Server backups.

Mark Arnold:

The process that you have in mind misses fundamentally basic storage capabilities. You are able to keep hundreds of snapshots on line or at worst near line. You are able to execute rapid cloning (NetApp call it FlexClone and the other vendors have their own name for broadly the same thing) so that you take your snapshot, make a zero-space read/write copy (clone) and then present it to the SQL server. You as the DBA then go into the temporarily created database and decide what you need to do. You most certainly do not do a full database restore and then roll-forward the logs.

Mahmoud Magdy:

SAN snapshots are not dumb; they are integrated with the application, so you can treat them as a normal backup, and some backup applications can interpret the snapshot and extract the required data from it.

Article:

You are limited to the times when the snapshot was taken. If I want to roll the database to a point between two snapshots that isn’t possible. When it comes to point in time restore I need the ability to control to which exact point in time the restore happens. Telling me that it’ll be restored to whatever point in time it was when the snapshot was taken isn’t good enough a lot of the time. I need to be able to restore to a specific millisecond.

Response:

This is just flat-out wrong. Backup solutions can conduct transaction log backups, and their GUIs give you the ability to take a given database backup and roll forward logs to any point you want. SnapManager for SQL will do that for you, as will the competitive but complementary offerings from EMC etc.

Article:

If the LUN which the database is on fails, all the backups are lost (they are snapshots, not clones). And the excuse that it won’t happen isn’t a valid excuse. Anything that can fail, will.

When you are taking snapshots you are assuming that the LUN hosting the original data will still be there. If that LUN goes away for some reason (failed disk, human error, etc.) we’ve just lost the snapshots as well which means we have no backups.

Response:

Murphy’s law does indeed apply. As has been said, snapshots aren’t backups until such time as you do something with that snapshot. Modern systems (FAS, VNX, Isilon, etc.) all give advanced capabilities. Their RAID subsystems make the likelihood of a failure incredibly small – though not impossible! The point is that the storage systems do “something” with the snapshots so that, if the worst happens, you can get the data back.

Article:

5. The backups are now stored on the same device as the production data. If the device fails you’ve lost access to your backups until the device is restored.

Response:

Again, false in a correctly designed environment. Replicated data (RecoverPoint, SnapMirror et al) enable full recovery and return to service – both in a Business Continuity and Disaster Recovery context. Solutions such as MetroCluster even provide customers who have a zero downtime requirement a synchronously mirrored solution so that if the primary storage system fails the secondary will take up the load instantaneously and seamlessly to the client server.

Remember, snapshots aren’t backups until you do something with those snapshots and a properly designed environment does just that.

But what exactly will fail? The path (which is redundant)? The SAN (which has redundant controllers)? The disks (which are in a RAID)? What device failure are we talking about, exactly?

Article:

6. The backups can’t be compressed.

Those snapshots are going to get large, fast. With native SQL Server backups I’ve got backup compression (assuming you are running the right version and edition of SQL Server) and I’ve got 3rd party tools which I can use to compress the backups.

Response:

Absolutely false. Three things come into play here. Snapshots don’t generally get large, because on the storage they are incremental changes to the physical disk map. Only the deltas are part of the snapshot, and only the deltas are replicated to the backup solution. Data can be deduplicated by walking the disk and identifying blocks with the same checksum, then walking the bits to make sure there hasn’t been a checksum clash, which is much faster than any software compression/dedup implementation. Identical blocks are then deduplicated. Each vendor has its own way of doing this, some in real time, some on a schedule, but it is broadly the same concept in all cases. Finally, compression: yes, modern storage solutions offer compression either instead of or as a supplement to deduplication.

Article:

7. As the DBA I have no control as to how many backups are kept.

While it’s awesome that the storage array can keep 500 backups, that means that we are responsible for making sure that 500 additional copies of our data aren’t being lost, stolen, copied to another company, copied to another server, mounted to the wrong server, etc. One of the reasons that DBAs only want a small number of backups on site at any one time is so that we don’t have to keep track of so many backups and who’s touching them.

Response:

Absolutely false. The Snapshot-taking products give you absolute control over this. They let you say if and when the replication to BC/DR takes place, if and when the replication to longer-term storage takes place and how many snapshots you want to maintain on line / near line. You do however want to talk to the storage/backup teams as to how long they keep the data that they have streamed to tape (for example). That’s a trivial piece of interaction with your colleagues. The take-away here is that you get to determine how many of these backups (quantity and/or days back) you want to keep under your direct control (for cloning, restore purposes etc).

Article:

8. Taking a recoverable snapshot requires pausing the IO within the SQL Server every time the snapshot is taken which can lead to inconsistent performance for the end users.

In order to snap the databases to get these database snapshots we have to checkpoint the database and pause all IO while the snapshot is being taken. For any users who are writing to the database while this is happening they will see their sessions hang for up to 10 seconds while this is happening. They then complain to the DBA that the database is slow when in fact it’s the snapshot which is causing the problem.

Response:

All vendors, including Microsoft, take their snapshots using the same APIs. The APIs are designed to prevent the behaviour you describe. What you have written, in the way that you wrote it, is pure scaremongering. Remember, this is 2013.

Article:

9. If there’s a problem with a backup there’s no way to know without attaching the backup to a SQL Server, usually after there’s been a major problem. With native SQL backups I can easily restore the backups to another server to test them rolling transaction logs forward as I see fit.

When we take backups, remember that you don’t have a good backup until that backup has been restored. This means that someone needs to take the backup, restore it to a SQL Server, and verify that the database can be restored. With native backups I can do this very easily and roll the logs forward as much as I’d like, all without any risk of performance impact to the production systems. When taking snapshots as backups, we have to attach every snapshot to test it, as each backup is totally independent. This requires attaching the snapshot to another server and attaching the databases, which, depending on how much active data in the transaction logs needs to be rolled forward or backward, could put a lot of stress on the production disks that are being shared with the snapshot (see number 4 above).

Response:

See previous comments about cloning a zero-space copy to see if your backup was any good. You are correct in that a backup is worthless until you have proven its viability. The aforementioned technologies afford you those capabilities and do so without consumption of additional disk space and entirely under your control. All of these activities can be done on the data that has been replicated to the BC/DR site or other named spindles. None of the work has to be done on the production spindles. Again, this point (9) is either scaremongering or a demonstration of a misunderstanding of available technologies.

Article:

10. If I want to encrypt the backups, I don’t have that option with SAN snapshots.

Given that the database backups will at some point be leaving the secure data center they need to be encrypted so that if the tapes are lost the database backup is useless to whoever finds the tapes. As the SAN snapshots can’t be encrypted this means that we have to rely on the encryption process within the tape backup vendor who may or may not be doing encryption correctly, and they may or may not put the keys in the same place as the backup. While SQL Server doesn’t have an encryption option (other than TDE) as a native feature there are several third party backup products which can encrypt the database backups as they are taken which are known to be secure.

Response:

Completely false. The snapshots are simply ones and zeros on disk. The SAN vendor doesn’t care about this; you as the SQL professional are responsible for it. If you cloned that snapshot to another server, it would be useless without the decryption keys that you provide. In actual fact you are doubly wrong, because storage vendors can do disk-level encryption, so snapshots can be encrypted because the physical disks are encrypted, which is far more efficient and faster than any software implementation. Don’t, however, go implementing both levels of encryption without discussing it with your infrastructure teams.

Article:

In conclusion, I usually recommend that my SQL Server clients do not use database snapshots. Hopefully if you are being pushed into using SAN based backups like this person was, you can use this as some reasons not to.

Mark Arnold:

Of course, everyone is entitled to their opinion but those sharing their opinion ought to seek out some help from those people who have an understanding of some of the capabilities that the storage vendors can offer. Your article does read like a laundry list of scary stories that would lead an inexperienced admin down the route that DAS is the only way forward, or that, at the very least, not to do snapshots. That would be a mistake because, as has been pointed out, much of this article is false, but also because the advanced cloning and replication tools afford the admin a rapid way to present data to test & dev type servers without consuming additional disk, wasting time and spending cycles liaising with colleagues in the storage or backup teams who are busy doing their own work.

Mahmoud Magdy:

You would be correct if your assumptions and information were correct, or if this article had been published 10 years ago. The article was built on totally false information and ignorance of modern SAN storage capabilities. My issue with this article is that it speaks to the thousands, even millions, of people following that respectable site, and with such an amount of false information it caused more confusion and deviated from the initial intention.

Final Word:

I was fortunate in my life to realize that the world is not Microsoft-only. I am a Microsoft lover and fan, no doubt, but Microsoft runs on top of other technologies such as servers, networking, and SAN. Thus, I recommend that every Microsoft consultant and architect out there learn those technologies, or at least understand the basics, so they can judge and make the proper integration decisions, and not repeat the mistake made in the above article.