Skype for Business pool failover & failback

Version / Date / Author / Change Description
1.0 / 15 Jun 2017 / Shankar Paulraj / Initial draft
1.1 / 17 Jun 2017 / Shankar Paulraj / Include Failover logs
1.2 / 29 Aug 2017 / Shankar Paulraj / Include Pool Status check commands

Contents

Summary

Active CMS location

Edge Federation Route

DNS Consideration

Response Group

PSTN Gateway Configuration

Lab Topology

Planned Failover from sgpool to klpool

Verify Pool State

Failover CMS

Reconfigure Edge Federation Route

Frontend Pool Failover

Verify Pool Failover

Failback sgpool

Summary

This guide details the steps involved in site failover and failback.

A number of factors need to be considered before a site failover:

Active CMS location

During a pool failover that involves the pool hosting the Central Management store, you must fail over the Central Management store before you fail over the Front-End pool.
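One way to see where the Central Management store currently lives, and which pool is paired with it, is sketched below. It assumes the Skype for Business Server Management Shell and reuses the lab pool name sgpool.mylab.com from later in this guide; it is an illustration, not a required step.

```powershell
# Show the SQL Server instance currently hosting the Central Management database.
Get-CsManagementConnection

# Show which pool is paired with sgpool (lab name used in this guide).
Get-CsPoolBackupRelationship -PoolFqdn 'sgpool.mylab.com'
```

If the SQL Server reported by the first command belongs to the pool you plan to fail over, the CMS has to move first.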

Edge Federation Route

For federation relationships with other organizations running Lync Server, inbound federation requests will continue to work as long as you have configured each Edge pool to have a different priority in your SRV records. Any federation requests that come to an Edge pool that is down will fail back and then connect to an Edge pool which is running.

Outbound federation is always set up through one published Edge pool or Edge Server in the organization. If this Edge pool has gone down, you must use Topology Builder to change the outbound federation route to use an Edge pool which is still running.

DNS Consideration

To configure DNS to redirect Skype for Business Server 2015 web traffic to your disaster recovery (DR) and failover sites, you need to use a DNS provider that supports GeoDNS. You can set up your DNS records to support disaster recovery, so that features that use web services continue even if one entire Front End pool goes down. This DR feature supports the Autodiscover, Meet and Dial-in simple URLs.

The lab environment does not have GeoDNS in place, and the web services point to the source pool “sgpool”. You therefore need to manually reconfigure DNS to point the web services to the destination pool “klpool”. Both internal and external DNS records need to be considered.
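After changing the records by hand, a quick lookup can confirm what clients will actually resolve. The sketch below uses Resolve-DnsName from the Windows DnsClient module; the record names are hypothetical examples for the mylab.com lab domain and should be replaced with the simple URLs and web service FQDNs of your own deployment.

```powershell
# Hypothetical record names; substitute the names used in your deployment.
$names = 'meet.mylab.com', 'dialin.mylab.com', 'lyncdiscover.mylab.com', 'lyncdiscoverinternal.mylab.com'

foreach ($name in $names) {
    # SilentlyContinue keeps the loop going if a record does not exist.
    Resolve-DnsName -Name $name -ErrorAction SilentlyContinue |
        Select-Object Name, Type, IPAddress, NameHost
}
```

Run it once from an internal client and once from an external one (or add -Server with a public resolver) so that both sets of records are checked.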

Response Group

Response groups do not fail over during a pool failover. Back up the response group configuration regularly, and restore it on the destination pool only after the failover.
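The backup and restore can be done with the Response Group export and import cmdlets. The following is a minimal sketch using the lab pool names; the file path C:\Backup\Rgs-sgpool.zip is a hypothetical location.

```powershell
# Before failover: export the Response Group configuration from the source pool.
# The file path is a hypothetical example.
Export-CsRgsConfiguration -Source 'service:ApplicationServer:sgpool.mylab.com' -FileName 'C:\Backup\Rgs-sgpool.zip'

# After failover: import the configuration on the destination pool.
# -OverwriteOwner assigns ownership of the imported response groups to the destination pool.
Import-CsRgsConfiguration -Destination 'service:ApplicationServer:klpool.mylab.com' -FileName 'C:\Backup\Rgs-sgpool.zip' -OverwriteOwner
```

Scheduling the export as a regular task keeps the most recent configuration available if the source pool becomes unreachable.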

PSTN Gateway Configuration

Once the Front End pool has failed over, all SfB services are shut down on the source pool. Ensure the PSTN gateway can fail over calls to the destination pool.
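Inbound call failover is configured on the gateway itself, outside Skype for Business. On the Skype for Business side, the sketch below only lists which gateways each voice route can use, which helps confirm that outbound calls still have a usable path once the source pool is down. It assumes voice routes and PSTN gateways are already defined in the deployment.

```powershell
# List each voice route together with the trunks/gateways it can use.
Get-CsVoiceRoute | Select-Object Identity, PstnGatewayList

# List the PSTN gateways defined in the topology and the site each belongs to.
Get-CsService -PstnGateway | Select-Object Identity, PoolFqdn, SiteId
```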

Lab Topology

The Singapore Edge pool has federation turned on.

Outbound federation traffic is routed through the Singapore site, using the Singapore Edge pool.

Planned Failover from sgpool to klpool

The command “Invoke-CsPoolFailOver” performs a series of checks before performing the actual failover. It is best to run those checks manually and complete any required topology reconfiguration before attempting the Front End pool failover.

Verify Pool State

Verify that the source and target pools are both in the Active state.

Get-CsRegistrarConfiguration -Identity 'service:Registrar:klpool.mylab.com'

Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com'
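The two commands above return the full registrar configuration. A shorter view that shows only the pool state for both lab pools might look like this:

```powershell
# Print only the identity and pool state for each lab pool.
foreach ($pool in 'sgpool.mylab.com', 'klpool.mylab.com') {
    Get-CsRegistrarConfiguration -Identity "service:Registrar:$pool" |
        Select-Object Identity, PoolState
}
```

Both pools should report Active before a planned failover is started.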

Failover CMS

Check Backup service status
Find the Active CMS Pool location

Invoke-CsManagementServerFailover -WhatIf

Based on the output, sgpool hosts the active CMS, which means the CMS must be failed over before attempting the Front End pool failover.
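For the “Check Backup service status” step, one possible check is to query the Backup Service on both paired pools. This is a sketch using the lab pool names.

```powershell
# Report the Backup Service export/import status for each paired pool.
foreach ($pool in 'sgpool.mylab.com', 'klpool.mylab.com') {
    Get-CsBackupServiceStatus -PoolFqdn $pool
}
```

If either pool reports an error state here, it is worth resolving before moving the CMS.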

Failover CMS

Invoke-CsManagementServerFailover

Confirm CMS Failover

Invoke-CsManagementServerFailover -WhatIf

CMS failover was successful.
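As an additional confirmation, the replication status cmdlet can report the active master and show whether every replica has caught up with the new CMS location. A minimal sketch:

```powershell
# Summarize the Central Management store state, including the active master.
Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus

# List any replicas that have not yet caught up.
Get-CsManagementStoreReplicationStatus |
    Where-Object { -not $_.UpToDate } |
    Select-Object ReplicaFqdn, UpToDate, LastStatusReport
```

An empty result from the second command means all replicas are up to date.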

Reconfigure Edge Federation Route

Currently the federation traffic is routed through the Singapore Edge pool. Make the following changes:

Enable federation for the Malaysia Edge pool
Modify the federation route to use the Malaysia Edge pool
Disable federation for the Singapore Edge pool
Publish the topology
Run the local SfB setup program on all Edge servers (if Topology Builder asks you to do so)
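The route change itself is made in Topology Builder, but the result can be read back from the Management Shell after publishing. The sketch below assumes the topology has already been published.

```powershell
# Show the federation route assigned to each site after publishing.
Get-CsSite | Select-Object Identity, FederationRoute

# Confirm that federation is still allowed in the Access Edge configuration.
Get-CsAccessEdgeConfiguration | Select-Object Identity, AllowFederatedUsers
```

The FederationRoute value should now reference the Malaysia Edge pool rather than the Singapore one.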

Frontend Pool Failover

Invoke-CsPoolFailOver -PoolFqdn sgpool.mylab.com

Skype for Business Server 2015 Deployment Log
Action / Action Information / Time Logged / Execution Result
▼Invoke-CsPoolFailOver / Completed with warnings
└▼Fail over / 16/6/2017 12:42:25 AM / Completed with warnings
└ / Checking to confirm that the target pool klpool.mylab.com is in Active state. / 16/6/2017 12:42:26 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:klpool.mylab.com' / 16/6/2017 12:42:26 AM
└ / Checking the state of the source pool sgpool.mylab.com. / 16/6/2017 12:42:26 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:42:26 AM
└ / Checking the interface compatibility of front end services on machines belonging to the source pool sgpool.mylab.com. / 16/6/2017 12:42:27 AM
└ / Checking the interface compatibility of front end services on machines belonging to the target pool klpool.mylab.com. / 16/6/2017 12:42:27 AM
└ / Checking if Central Management Store is in the source pool sgpool.mylab.com. / 16/6/2017 12:42:27 AM
└ / Checking if the source pool sgpool.mylab.com is not a next hop for the access edge server. / 16/6/2017 12:42:27 AM
└ / Verifying the replication status of the back end databases between the two pools. / 16/6/2017 12:42:48 AM
└ / Flushing data to backup store from pool sgpool.mylab.com. / 16/6/2017 12:42:48 AM
└ / Synchronize the user store of target pool klpool.mylab.com, so that it has up-to-date data for users homed on source pool sgpool.mylab.com. / 16/6/2017 12:42:48 AM
└ / Backup-CsPool -PoolFqdn sgpool.mylab.com -SteadyState -Category UserData / 16/6/2017 12:42:48 AM
└ / Hydrating the data for routing groups on source pool sgpool.mylab.com to the front ends in pool klpool.mylab.com. / 16/6/2017 12:42:48 AM
└ / Sync-CsUserData -PoolFqdn sgpool.mylab.com -Target / 16/6/2017 12:42:48 AM
└ / Hydrating Routing Groups 2 out of 2 hydrated with 0 documents and 0 batches. / 16/6/2017 12:42:48 AM
└ / Skipping.....Flush storage service data into Steady State for the source pool sgpool.mylab.com. / 16/6/2017 12:43:08 AM
└ / Starting squid replication and confirming that the squid replicas on all the front ends on that pool are up-to-date. / 16/6/2017 12:43:08 AM
└ / Invoke-CsManagementStoreReplication -ReplicaFqdn SGFE01.mylab.com / 16/6/2017 12:43:08 AM
└ / Get-CsManagementStoreReplicationStatus -ReplicaFqdn SGFE01.mylab.com / 16/6/2017 12:43:08 AM
└ / Warning: Local Time: 16/6/2017 12:43:08 AM - Users of pool sgpool.mylab.com will have limited services now. / 16/6/2017 12:43:08 AM / Warning
└ / Putting user store on target pool klpool.mylab.com in read-only mode for all the failing over users, so that we can guarantee data consistency. / 16/6/2017 12:43:08 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:09 AM
└ / Putting user store on source pool sgpool.mylab.com in read-only mode for all the failing over users, so that we can guarantee data consistency. / 16/6/2017 12:43:09 AM
└ / Warning: Preparing for failover on pool sgpool.mylab.com. / 16/6/2017 12:43:09 AM / Warning
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:09 AM
└ / Flush the directory journal entries for Data MCU to the disk. / 16/6/2017 12:43:09 AM
└ / Setting the state as FailingOver in Central Management Store. / 16/6/2017 12:43:14 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:14 AM
└ / Set-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' -PoolState FailingOver / 16/6/2017 12:43:14 AM
└ / Resynchronize the data from user store of source pool sgpool.mylab.com to User Store of target pool klpool.mylab.com. This step takes care of the small number of changes that might have come about after target pool was brought into steady state. / 16/6/2017 12:43:14 AM
└ / Backup-CsPool -PoolFqdn sgpool.mylab.com -FullBackup -Category UserData / 16/6/2017 12:43:14 AM
└ / Hydrating the data for routing groups on source pool sgpool.mylab.com to the front ends in pool klpool.mylab.com. / 16/6/2017 12:43:14 AM
└ / Sync-CsUserData -PoolFqdn sgpool.mylab.com -Target / 16/6/2017 12:43:14 AM
└ / Hydrating Routing Groups 2 out of 2 hydrated with 0 documents and 0 batches. / 16/6/2017 12:43:14 AM
└ / Inform target pool klpool.mylab.com about the failover, so that it can start servicing failed over users in full-mode. / 16/6/2017 12:43:34 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:34 AM
└ / Completing the failover process on pool sgpool.mylab.com. / 16/6/2017 12:43:34 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:34 AM
└ / Warning: Local Time: 16/6/2017 12:43:35 AM - Users of pool sgpool.mylab.com will have full services now. / 16/6/2017 12:43:35 AM / Warning
└ / Mark that source pool sgpool.mylab.com has failed over in Central Management Store. / 16/6/2017 12:43:35 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 12:43:35 AM
└ / Set-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' -PoolState FailedOver / 16/6/2017 12:43:35 AM
└ / Skipping.....Flush storage service data for the source pool sgpool.mylab.com. / 16/6/2017 12:43:35 AM
└ / Shutdown RTCSRV service in source pool sgpool.mylab.com. / 16/6/2017 12:43:35 AM
└ / Reset-CsPoolRegistrarState -PoolFqdn sgpool.mylab.com -ResetType ServiceReset -Force -NoReStart -ServicesStopDelayMins 10 / 16/6/2017 12:43:35 AM
└ / Shutdown the services in source pool sgpool.mylab.com. / 16/6/2017 12:43:55 AM
└ / Stop-CsWindowsService -ComputerName SGFE01.mylab.com -LeaveClsAgentRunning -NoWait -Verbose / 16/6/2017 12:43:55 AM
└ / Stop-CsWindowsService -ComputerName SGFE01.mylab.com -LeaveClsAgentRunning -Verbose / 16/6/2017 12:43:58 AM


Verify Pool Failover

Verify that the source pool “sgpool” failed over.

Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com'
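To read only the relevant field, and to see which pool is now serving a user, something like the following can be used. The SIP address is a hypothetical test user assumed to be homed on sgpool.

```powershell
# The source pool should now report FailedOver.
Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' |
    Select-Object Identity, PoolState

# Hypothetical test user homed on sgpool; shows the primary and backup pool details for that user.
Get-CsUserPoolInfo -Identity 'sip:testuser@mylab.com'
```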

Failback sgpool

Invoke-CsPoolFailBack -PoolFqdn sgpool.mylab.com

It is not mandatory to fail over the CMS during failback.

Reconfigure Federation Route & Edge Federation if required.
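The failback log shows that the cmdlet first checks that the source pool is in the FailedOver or FailingBack state. The same check can be made by hand before running the failback, as in this sketch using the lab pool name.

```powershell
# Read the current state of the source pool as a string.
$state = [string](Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com').PoolState

if ($state -in 'FailedOver', 'FailingBack') {
    Write-Host "sgpool is in state '$state'; failback can be attempted."
} else {
    Write-Host "sgpool is in state '$state'; failback is not expected to proceed."
}
```

After the failback completes, the same query should report Active for sgpool.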

Skype for Business Server 2015 Deployment Log
Action / Action Information / Time Logged / Execution Result
▼Invoke-CsPoolFailBack / Completed with warnings
└▼Fail back / 16/6/2017 10:25:45 AM / Completed with warnings
└ / Make sure that source pool sgpool.mylab.com is in FailedOver or FailingBack state. / 16/6/2017 10:26:11 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:11 AM
└ / Checking the interface compatibility of front end services on machines belonging to the target pool klpool.mylab.com. / 16/6/2017 10:26:11 AM
└ / Checking the interface compatibility of front end services on machines belonging to the source pool sgpool.mylab.com. / 16/6/2017 10:26:12 AM
└ / Not stopping services for machines before failback for version check. / 16/6/2017 10:26:12 AM
└ / Forcefully set pool state in blob store. / 16/6/2017 10:26:12 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:klpool.mylab.com' / 16/6/2017 10:26:12 AM
└ / Starting the backup and replica services on source pool sgpool.mylab.com. / 16/6/2017 10:26:12 AM
└ / Start-CsWindowsService -Name LYNCBACKUP -ComputerName SGFE01.mylab.com -Verbose / 16/6/2017 10:26:12 AM
└ / Start-CsWindowsService -Name REPLICA -ComputerName SGFE01.mylab.com -Verbose / 16/6/2017 10:26:12 AM
└ / Starting squid replication and confirming that the squid replicas on all the front ends on that pool are up-to-date. / 16/6/2017 10:26:13 AM
└ / Invoke-CsManagementStoreReplication -ReplicaFqdn SGFE01.mylab.com / 16/6/2017 10:26:13 AM
└ / Get-CsManagementStoreReplicationStatus -ReplicaFqdn SGFE01.mylab.com / 16/6/2017 10:26:13 AM
└ / Synchronize user store of source pool sgpool.mylab.com, so that it has up-to-date data for the users who will be failed back to it / 16/6/2017 10:26:13 AM
└ / Backup-CsPool -PoolFqdn klpool.mylab.com -SteadyState -Category UserData -FailedOverPoolOnly / 16/6/2017 10:26:13 AM
└ / Starting all Skype For Business services on all front ends of the source pool asynchronously. / 16/6/2017 10:26:14 AM
└ / Start-CsWindowsService -Name RTCSRV -ComputerName SGFE01.mylab.com -NoWait -Verbose / 16/6/2017 10:26:14 AM
└ / Checking the interface compatibility of front end services on machines belonging to the source pool sgpool.mylab.com. / 16/6/2017 10:26:14 AM
└ / Start the services in source pool sgpool.mylab.com. / 16/6/2017 10:26:14 AM
└ / Start-CsWindowsService -ComputerName SGFE01.mylab.com -NoWait -Verbose / 16/6/2017 10:26:14 AM
└ / Start-CsWindowsService -ComputerName SGFE01.mylab.com -Verbose / 16/6/2017 10:26:14 AM
└ / Verifying the replication status of the back end databases between the two pools. / 16/6/2017 10:26:15 AM
└ / Flushing data to backup store from pool klpool.mylab.com. / 16/6/2017 10:26:15 AM
└ / Synchronize user store of source pool sgpool.mylab.com, so that it has up-to-date data for the users who will be failed back to it / 16/6/2017 10:26:15 AM
└ / Backup-CsPool -PoolFqdn klpool.mylab.com -SteadyState -Category UserData -FailedOverPoolOnly / 16/6/2017 10:26:15 AM
└ / Hydrating the data for routing groups on source pool sgpool.mylab.com to the front ends in pool klpool.mylab.com. / 16/6/2017 10:26:16 AM
└ / Sync-CsUserData -PoolFqdn sgpool.mylab.com / 16/6/2017 10:26:16 AM
└ / Hydrating Routing Groups 2 out of 2 hydrated with 0 documents and 0 batches. / 16/6/2017 10:26:16 AM
└ / Warning: Local Time: 16/6/2017 10:26:36 AM - Users of pool sgpool.mylab.com will have limited services now. / 16/6/2017 10:26:36 AM / Warning
└ / Putting user store on source pool sgpool.mylab.com in read only mode. / 16/6/2017 10:26:36 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:36 AM
└ / Putting user store on target pool klpool.mylab.com in read only mode. / 16/6/2017 10:26:36 AM
└ / Warning: Preparing for failback on pool klpool.mylab.com. / 16/6/2017 10:26:36 AM / Warning
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:36 AM
└ / Put source pool sgpool.mylab.com in failing back state in in Central Management Store. / 16/6/2017 10:26:36 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:36 AM
└ / Set-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' -PoolState FailingBack / 16/6/2017 10:26:36 AM
└ / Synchronize user stores on target pool and source pool again to make sure that data is consistent. / 16/6/2017 10:26:37 AM
└ / Backup-CsPool -PoolFqdn klpool.mylab.com -FullBackup -Category UserData -FailedOverPoolOnly / 16/6/2017 10:26:37 AM
└ / Hydrating the data for routing groups on source pool sgpool.mylab.com to the front ends in pool sgpool.mylab.com. / 16/6/2017 10:26:37 AM
└ / Sync-CsUserData -PoolFqdn sgpool.mylab.com / 16/6/2017 10:26:37 AM
└ / Hydrating Routing Groups 2 out of 2 hydrated with 0 documents and 0 batches. / 16/6/2017 10:26:37 AM
└ / Marking source pool sgpool.mylab.com as Active, so that it can start servicing users in full mode. / 16/6/2017 10:26:57 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:57 AM
└ / Inform target pool klpool.mylab.com that failback is complete. / 16/6/2017 10:26:58 AM
└ / Warning: Completing failback process on pool klpool.mylab.com. / 16/6/2017 10:26:58 AM / Warning
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:58 AM
└ / Warning: Local Time: 16/6/2017 10:26:58 AM - Users of pool sgpool.mylab.com will have full services now. / 16/6/2017 10:26:58 AM / Warning
└ / Setting source pool sgpool.mylab.com state to Active in the Central Management Store. / 16/6/2017 10:26:58 AM
└ / Get-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' / 16/6/2017 10:26:58 AM
└ / Set-CsRegistrarConfiguration -Identity 'service:Registrar:sgpool.mylab.com' -PoolState Active / 16/6/2017 10:26:58 AM
