SQL Server - Migration and BCDR Guide

August2016

Version 3.0a

Prepared by

1

SQL Server - Migration and BCDR Guide, , Version 3.0a
Prepared by
"Migration and BCDR - SQL Server Guide.docx", Template Version 1.0

Table of Contents

1Introduction

1.1Purpose

1.2Scenario Overview

2Scenario 1: On-Premises SQL Server Protected by Microsoft Azure Site Recovery Service (Hyper-V Replica)

2.1High-Level Scenario Overview

2.2Dependencies

2.3Design and Deployment Considerations

2.4Configuration and Walkthrough Steps

2.4.1Setup On-Premises SQL Server for Replication

2.4.2Create a SQL Server Recovery Plan

3Scenario 2: On-Premises SQL Server Protected by On-Premises Always-On Availability Groups and Microsoft Azure Site Recovery Service

3.1High-Level Scenario Overview

3.2Dependencies

3.3Design and Deployment Considerations

3.4Configuration and Walkthrough Steps

3.4.1Create a SQL Recovery Plan

4Scenario 3: SQL Server Backup with Microsoft Azure Storage

4.1High-Level Scenario Overview

4.2Concepts

4.3Design and Deployment Considerations

4.3.1Backup to URL

4.3.2VHD Upload

4.3.3SQL Server Backup to Azure Tool

4.4Configuration and Walkthrough Steps

4.4.1Backup to URL Walkthrough

4.4.2Backup to URL with SQL Server 2014 Walkthrough

4.4.3VHD Upload Walkthrough

4.4.4Backup Tool Walkthrough

5Scenario 4: On-Premises SQL Server Protected by On-Premises and Microsoft Azure AlwaysOn Availability Groups

5.1High-Level Scenario Overview

5.2Dependencies

5.3Design and Deployment Considerations

5.4Configuration and Walkthrough Steps

5.4.1Step 1: Setup File Share for Cluster Witness, Failover Clustering feature

5.4.2Step 2A: Configure Firewalls, Azure Endpoints – Using Cloud Service

5.4.3Step 2B: Configure Firewalls, Azure Endpoints – Using Internal Load Balancer

5.4.4Step 3: Configure Cluster

5.4.5Install SQL Server on the on-premises and Azure servers

5.4.6Enable AlwaysOn High Availability

5.4.7Configure SQL AlwaysOn Availability Group

5.4.8Troubleshooting Always On Availability Creation

6Scenario 5: SQL Server Protected by Database Mirroring with the Mirror in Azure VM

6.1High-Level Scenario Overview

6.2Design and Deployment Considerations

6.3Configuration and Walkthrough Steps

1

SQL Server - Migration and BCDR Guide, , Version 3.0a
Prepared by
"Migration and BCDR - SQL Server Guide.docx"

1Introduction

This guide is intended to provide technical details for supporting the planning and configuration of Business Continuity and Disaster Recovery (BC/DR) and Server Migration, for workloads which use SQL Server. It includes sections outlined by technical scenario and is generalized to support several types of workload deployments.

1.1Purpose

The purpose of the guide is to support lab and production configurations during customer engagements. It may not align exactly with the customer infrastructure, but the aim of the document is to simplify and outline common configuration steps associated with each scenario.

This guide is intended to be used in conjunction with the following Services Offers:

1.Business Continuity and Disaster Recovery

2.Consolidation and Migration

1.2Scenario Overview

The aforementioned options can be applied to each workload using a series of scenarios. For SQL Server, the following scenarios are defined:

1.On-Premises SQL Server Protected by Microsoft Azure Site Recovery Service: This scenario outlines protecting a standalone On-Premises SQL Server with Microsoft Azure Site Recovery Service.

2.On-Premises SQL Server Protected by On-Premises Always-On Availability Groups and Microsoft Azure Site Recovery Service: This scenario outlines protecting On-Premises Always-On Availability Groups with Microsoft Azure Site Recovery Service

3.SQL Server Backup with Microsoft Azure Storage: This scenario outlines scenarios related to backup and restore of SQL Server databases to Microsoft Azure Storage, and are equally applicable to SQL Server running on-premises and in Microsoft Azure IaaS Virtual Machines.

4.On-Premises SQL Server Protected by On-Premises and Microsoft Azure Always-On Availability Groups: This scenario outlines DR capabilities for a SQL Server database or a group of related databases are provided by using AlwaysOn Availability Groups, a SQL Server feature that was introduced in SQL Server 2012.

5.SQL Server Protected by Database Mirroring with the Mirror in Azure VM: This scenario outlines DR capabilities for a SQL Server database in a production datacenter, which can be either on-premises or in the Azure.

Scenario / Replication Technology / Source / Destination
Scenario 1 / Hyper-V Replica / Hyper-V / Hyper-V
Scenario 2 / Always-On / Hyper-V / Hyper-V
Scenario 3 / Azure Backup / SQL Server on-premise
Microsoft Azure IaaS VM / Azure
Scenario 4
Scenario 5

What about Migration to IaaS?

What about Migration to PaaS?

While these do not encompass all of the potential possible scenarios one could establish for BC/DR of SQL Server using cloud infrastructures, it provides a basis for the most common scenarios which would be encountered. These scenarios will be expanded as newer data and cloud platform capabilities come available.

The following sections provide step-by-step examples of how these scenarios can be established in a cloud environment. This documentation assumes that the reader has access to and a working knowledge of the Windows Server Hyper-V and System Center private cloud environment and has access to a Microsoft Azure subscription.

Tips for migrating SQL Server databases:

  • If you can get enough of an outage window (hours in all liklihood), you can shutdown the SQL Services, let ASR catch up on the replication and then migrate.
  • Another thing that is helpful with SQL Server is reconfigure TEMPDB to a separate, dedicated disk if it isn’t already. You can then exclude this disk from ASR replication. In some of the systems that I’ve done, a large part of the disk churn was TEMPDB and this allowed us to use ASR pretty much as you would use it for other applications.
  • Please note that exclude is currently available for VMware – we are building for Hyper-V as well. Particularly for migration scenario, (i.e. short term, one time) you can use the Physical scenario regardless of the underlying hypervisor, cloud, hardware vendor, etc. Then you would have exclude capability.

2Scenario 1: On-Premises SQL Server Protected by Microsoft Azure Site Recovery Service (Hyper-V Replica)

This section describes the scenario of protecting a standalone On-Premises SQL Server with Microsoft Azure Site Recovery Service.

2.1High-Level Scenario Overview

Microsoft Azure Site Recovery Service coordinates and manages the protection of Hyper-V virtual machines located in private clouds on Virtual Machine Manager (VMM) servers in System Center 2012 Service Pack 1 (SP1), or System Center 2012 R2. Site Recovery Serviceis used to orchestrate failover of these On-Premises standalone virtual machines from one on-premises Hyper-V host server to another. Site Recovery Service vaults in Microsoft Azure are used to store your protection configuration.

Figure 1: High-level Solution Architecture

2.2Dependencies

Implementation of Microsoft Azure Hyper-V Recovery Manager before setting up the Recovery plan.

2.3Design and Deployment Considerations

Microsoft Knowledge Base Article 956893 notes that SQL Server on Hyper-V Replica is supported provided the EnableWriteOrderPreservationAcrossDisksflag is set. For pre-Windows Server 2012 R2 Virtual Machines you must manually set this flag. For Windows Server 2012 R2 Virtual Machines this flag is already set when you enable replication.

If you want virtual machines be connected to a VM network after failover, you configure network mapping in Site Recovery Service. Before beginning deployment verify the following:

6.The virtual machine in the cloud on the source VMM server is connected to a VM network. That VM network should be linked to a logical network that is associated with the cloud.

7.The target cloud on the destination VMM server has a corresponding VM network. That VM network should be linked to a corresponding logical network that is associated with the target cloud.

Note: In the case of a Test Failover, a VM Network that is not mapped to a logical network in the target cloud will be needed.

To learn more about network mapping read Prepare for network mapping in the Planning Guide.

2.4Configuration and Walkthrough Steps

2.4.1Setup On-Premises SQL Server for Replication

Note: Skip this Section if the On-Premises Virtual Machine is running Windows Server 2012 R2.

  1. Open Powershell as an Administrator and importing the Hyper-V module by typing Import-Module Hyper-V.
  2. Type, Set-VMReplication –VMName > -EnableWriteOrderPreservationAcrossDisks 1 and press Enter.

2.4.2Create a SQL Server Recovery Plan

  1. From the Microsoft Azure subscription, in the Recovery Services Node, click the previously created vault, and select Recovery Plans, and then select Create Recovery Plan.
  1. In Specify the Recovery Plan Name and Target pane, specify a descriptive name in the Name textbox. Select the Source and Target VMM servers (created in earlier steps) then select the Next arrow.
  1. In Select Virtual Machines pane, select the virtual machines which will be added to the recovery plan then select the OK button.

3Scenario 2: On-Premises SQL Server Protected by On-Premises Always-On Availability Groups and Microsoft Azure Site Recovery Service

This section describes the scenario of protecting On-Premises Always-On Availability Groups with Microsoft Azure Site Recovery Service.

3.1High-Level Scenario Overview

Microsoft Azure Site Recovery Service coordinates and manages the protection of Hyper-V virtual machines located in private clouds on Virtual Machine Manager (VMM) servers in System Center 2012 Service Pack 1 (SP1), or System Center 2012 R2. Site Recovery Service is used to orchestrate failover of these On-Premises Always-On Availability Group virtual machines from one on-premises Hyper-V host server to another. Site Recovery Service vaults in Microsoft Azure are used to store your protection configuration.

Unlike Scenario 1, where Hyper-V replication was used to create VM replicas for the entire application stack including SQL Server machines, in this scenario SQL Server is instead protected by using AlwaysOn Availability Groups. SQL Server machines in the production datacenter are excluded from Hyper-V replication (and may in fact be physical machines). For each SQL Server database used by an application and placed into an availability group on a SQL Server instance in the production datacenter, an availability replica is created on a SQL Server instance in the DR Datacenter.

The benefits of using Availability Groups vs. Hyper-V Replicas include:

  • Physical SQL Server machines can be used
  • Synchronous replication may be possible (at cost of reduction in performance)
  • Failover times measured in seconds
  • Replica is accessible for reads
  • Multiple replicas are supported
  • Replication to Azure VMs is supported

The benefits of using Hyper-V Replicas vs. Availability Groups include:

  • A common and consistent DR approach for all machines in the application stack
  • Application agnostic approach, reusable across different types of applications
  • Common Active Directory infrastructure between production and DR is not required
  • Compute costs and additional SQL Server license costs in the DR datacenter are avoided

Using AlwaysOn Availability Groups to provide DR capabilities for SQL Server is described in detail in Scenario 4. The distinguishing feature of using this approach in conjunction with using Hyper-V replication to establish DR capabilities for other machines in the application stack is that the entire site failover can be orchestrated with the Microsoft Azure Recovery Services. This expands the use cases of Recovery Services to include Tier 2 and some of Tier 1 SQL Server workloads, which often require the features and performance of Availability Groups, rather than Hyper-V Replicas, to implement DR capabilities.

At a high level, a failover to a DR Datacenter in this approach consists of the following steps, orchestrated by the Site Recovery Service:

  1. Shut down application VMs in the production datacenter.
  2. Fail over SQL Server availability group(s) to the DR Datacenter.
  3. Start application VMs in the DR Datacenter.

In step 2, a PowerShell script is used to connect to the SQL Server instance hosting the secondary replica in the DR Datacenter, and initiate the failover of the availability group to that replica.

Figure 2: High-level Solution Architecture

3.2Dependencies

Implementation of Microsoft Azure Site Recovery Manager before setting up the Recovery plan.

3.3Design and Deployment Considerations

As mentioned in the scenario overview a PowerShell will be needed to conduct the AG Listener failover. In the simplest case, this PowerShell script may be similar to the following:

Import-Modulesqlps

$DriveName="SqlDrive"

$SqlInstanceRoot="SqlServer:\Sql\SQLMachineName\SQLInstanceName"

New-PSDrive$DriveName-PSProviderSqlServer-Root$SqlInstanceRoot

cdSqlDrive:\AvailabilityReplicas

dir|Switch-SqlAvailabilityGroup

This assumes that the secondary replica in the DR Datacenter is a synchronous replica, allowing a failover without data loss. In reality, using a synchronous replica in a remote datacenter for DR purposes is rarely feasible, due to major performance impact that a synchronous replica connected over a WAN would have on the SQL Server workload in the production datacenter.

A much more likely scenario is when the secondary replica in the DR Datacenter is an asynchronous replica. However, in that case, a failover to that replica will likely incur some data loss, and therefore must be explicitly forced. In this case, the PowerShell script has to be adjusted as follows:

Import-Modulesqlps

$DriveName="SqlDrive"

$SqlInstanceRoot="SqlServer:\Sql\SQLMachineName\SQLInstanceName"

New-PSDrive$DriveName-PSProviderSqlServer-Root$SqlInstanceRoot

cdSqlDrive:\AvailabilityReplicas

dir|Switch-SqlAvailabilityGroup-AllowDataLoss-Force

Another assumption in the above scripts is that the account executing them can use Windows Authentication to connect to the SQL Server instance (which implies common Active Directory infrastructure), and is a member of sysadmin server role on that instance. If Windows Authentication cannot be used, then the script may be modified as follows to use SQL Authentication:

Import-Modulesqlps

$DriveName="SqlDrive"

$SqlInstanceRoot="SqlServer:\Sql\SQLMachineName\SQLInstanceName"

$Login="SQLAdmin"

$Pwd=ConvertTo-SecureString-AsPlainText"P@ssword!"-Force

$Cred=New-ObjectSystem.Management.Automation.PSCredential-ArgumentList$Login,$Pwd

New-PSDrive$DriveName-PSProviderSqlServer-Root$SqlInstanceRoot-Credential$Cred

cdSqlDrive:\AvailabilityReplicas

dir|Switch-SqlAvailabilityGroup-AllowDataLoss-Force

Here, SQLAdmin is the name of a SQL Authentication login that is a member of the sysadmin server role.

To be clear, the scripts above are simplified examples. The actual failover script will likely be much more complex. Among other things, it may need to handle multiple SQL Server instances, multiple availability groups on a single SQL Server instance, implement a decision tree based on the current state of the available secondary replicas, attempt to modify the mode of the secondary replica to synchronous commit prior to failover in order to avoid data loss, etc. Above all, the actual failover script must be thoroughly tested, and maintained as the application and infrastructure landscape changes.

In general, performing a forced failover of an availability group can be a complex task, with many prerequisites, limitations, and caveats (getting familiar with the linked documentation topic is a must). Automating this task with a script may be possible, especially for relatively simple deployments of SQL Server. However, for many real life enterprise applications, the complexity of the script and the required maintenance and testing effort will be high. In those cases, it is advised to weigh that complexity against an approach where the availability group failover is performed manually by qualified personnel, after accessing the current state of the application and various infrastructure components during a DR event, and making informed decisions that optimize RPO and RTO.

3.4Configuration and Walkthrough Steps

3.4.1Create a SQL Recovery Plan

Perform the following steps in Microsoft Azure

  1. From the Microsoft Azure subscription, in the Recovery Services Node, click the previously created vault, and select Recovery Plans, and then select Create Recovery Plan.
  1. In Specify the Recovery Plan Name and Target pane, specify a descriptive name in the Name textbox. Select the Source and Target VMM servers (created in earlier steps) then select the Next arrow.
  1. In Select Virtual Machines pane, select the virtual machines which will be added to the recovery plan then select the OK button.
  1. Once complete the status bar will show that the recovery plan was successfully created. In the status bar, click Customize to modify the recovery plan.
  1. In the case of SQL Always on Availability groups, you need to implement a script to redirect the AG Listener to the new primary as called out above. This requires the creation of a PowerShell Workflow based script that is implemented as an Azure Automation Runbook. In order to do this here is an example SQLFailover.ps1 to move the primary replica from on premises to Azure.

<#

======

Copyright 2014 (c) Microsoft Corporation. All rights reserved.

File:SQLFAILOVER.PS1

Purpose:Azure Automation workflow script to use with Azure Site Recovery to failover a SQL Always On Availability Group cluster

Version: 1.0

======

#>

workflow SQLFailover

{

Param

(

[Parameter(Mandatory=$false)]

[String]$LogFile = "C:\Temp\SQLFailover.log"

)

# Region FUNCTIONS

<#

======

Script Functions

InstallWinRMCertificateForVM - Installs the WinRM Cert from Azure to enable secure remote session over public endpoint

LogWrite- Write values to defined log

======

#>

# Initialize helper functions

Function LogWrite

{

Param ([string]$logstring)

Add-content $Logfile -value $logstring

}

Function InstallWinRMCertificateForVM()

{

[OutputType([System.Uri])]

param([string] $CloudServiceName, [string] $Name)

$VerbosePreference=”Continue”

Write-Verbose "Installing WINRM Certificate"

$WinRMCert = (Get-AzureVM -ServiceName $CloudServiceName -Name $Name | select -ExpandProperty vm).DefaultWinRMCertificateThumbprint

$AzureX509cert = Get-AzureCertificate -ServiceName $CloudServiceName -Thumbprint $WinRMCert -ThumbprintAlgorithm sha1

# Add the VM certificate into the LocalMachine

if ((Test-Path Cert:\LocalMachine\Root\$WinRMCert) -eq $false)

{

Write-Progress "VM certificate is not in local machine certificate store - adding it"

$certByteArray = [System.Convert]::fromBase64String($AzureX509cert.Data)

$CertToImport = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 -ArgumentList (,$certByteArray)

$store = New-Object System.Security.Cryptography.X509Certificates.X509Store "Root", "LocalMachine"

$store.Open([System.Security.Cryptography.X509Certificates.OpenFlags]::ReadWrite)

$store.Add($CertToImport)

$store.Close()

}

# Return the WinRM Uri so that it can be used to connect to this VM

Get-AzureWinRMUri -ServiceName $CloudServiceName -Name $Name

}

# EndRegion FUNCTIONS

#------

$ServiceName = "corpcs"

$VMName= "CorpSQL2"

$Aggroup = "AG1"

$AdminPassword = ""#VM Admin Password

$AdminUserName = "" #VM Admin UserName

$path = "sqlserver:\sql\$vmname\default\availabilitygroups\$Aggroup"

$mode = "async"

#------

# Setup Azure Environment

#------

# Get the Azure connection asset that is stored in the Automation service based on the name that was passed into the runbook