Posted on Thursday, March 20, 2014 10:19 PM
One of the most significant “IT Pro” or infrastructure related announcements at the recent SharePoint Conference in Las Vegas was related to a change in supportability for using SQL Server Always On for SharePoint databases, and in particular the use of Asynchronous replication in Business Continuity Management (BCM) scenarios.
This is a HUGE deal. Of course, it’s not sexy, it doesn’t directly provide SharePoint IT Pros with a new tool in their belt, and it doesn’t expand deployment scenarios like the announcement relating to 1TB site collections in Office 365. However it is perhaps the single most important piece of infrastructure related information in the entire show for customers and partners operating real farms.
I tend to avoid pimping such announcements unless they have real impact, but this is one of the cases where the “news” deserves much broader exposure. Let me be entirely clear: I was in no way involved in this change, nor in any of the work done to achieve the end goal. My only, and extremely minor, contribution was to provide feedback over the last year or so that such supportability would be greatly appreciated and extend the ability of organisations to implement appropriate and durable Operational Service Management for SharePoint deployments. My only role here is to help get the word out.
Extreme props are due here to the SharePoint Product Group, and specifically the Office 365 Customer Advisory Team (CAT) for making this happen. When community naysayers or indeed customers complain unfairly about Microsoft’s commitment to on-premises and the SharePoint IT Pro in particular, it really gets my goat up. Sometimes the criticism is valid, but that is very much the exception to the rule, and this work is a demonstration of just how deeply committed Microsoft is to its customers and the partners that support them in the marketplace.
Here follows a Q&A on the details of the supportability change, and its impact on your designs or implementations.
Q. Which databases support which Always On replication modes?
A.
Database | Sync Supported | Async Supported | Comments |
Central Admin Content | Yes | No | Farm specific database |
App Management | Yes | Yes | |
BDC | Yes | Yes | |
Farm Configuration | Yes | No | Farm specific database |
Content | Yes | Yes | |
Managed Metadata | Yes | Yes | |
PerformancePoint | Yes | Yes | |
PowerPivot | Not Tested | Not Tested | TBD, goal is to be supported. Additional work in progress |
Project | Yes | Yes | |
Search Analytic Reporting | Yes | No | See Search notes below |
Search Admin | Yes | No | See Search notes below |
Search Crawl | Yes | No | See Search notes below |
Search Links | Yes | No | See Search notes below |
Secure Store | Yes | Yes | |
State Service | Yes | No | Farm specific database |
Subscription Settings | Yes | Yes | |
Translation Services | Yes | Yes | |
UPA Profile | Yes | Yes | |
UPA Social | Yes | Yes | |
UPA Sync | Yes | No | Backup and restore or recreate, see UPA notes below |
Usage | Yes– NR | No | Farm specific and unsupported for attach DR. Could be used for data mining only. |
Word Automation | Yes | Yes | |
Commentary: that’s a lot of green “yes”! A few notes are present. Clearly farm specific databases do not support Async, and to do so would be pointless in a DR scenario anyway as they store transient or dynamically generated data. There are a couple of standouts which require further consideration (Search and UPA), and these are addressed below in more detail.
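If you want to sanity check a DR design against the table above, the matrix is easy to encode. The following is only a minimal sketch in Python, not anything shipped by Microsoft: the dictionary is transcribed from the Async Supported column, and the names `ASYNC_SUPPORT` and `split_for_async` are made up for illustration.

```python
# Async-support matrix transcribed from the table above
# (as tested on SharePoint 2013 April 2014 CU).
# True = async supported, False = not supported, None = not tested.
# ASYNC_SUPPORT and split_for_async are hypothetical names for this sketch.
ASYNC_SUPPORT = {
    "Central Admin Content": False,
    "App Management": True,
    "BDC": True,
    "Farm Configuration": False,
    "Content": True,
    "Managed Metadata": True,
    "PerformancePoint": True,
    "PowerPivot": None,
    "Project": True,
    "Search Analytic Reporting": False,
    "Search Admin": False,
    "Search Crawl": False,
    "Search Links": False,
    "Secure Store": True,
    "State Service": False,
    "Subscription Settings": True,
    "Translation Services": True,
    "UPA Profile": True,
    "UPA Social": True,
    "UPA Sync": False,
    "Usage": False,
    "Word Automation": True,
}


def split_for_async(database_types):
    """Partition database types into (async_ok, needs_other_plan).

    Anything that is not an explicit "Yes" in the table, including
    "Not Tested", lands in the second list.
    """
    async_ok, needs_other_plan = [], []
    for name in database_types:
        if name not in ASYNC_SUPPORT:
            raise KeyError(f"Unknown database type: {name}")
        if ASYNC_SUPPORT[name] is True:
            async_ok.append(name)
        else:
            needs_other_plan.append(name)
    return async_ok, needs_other_plan
```

Calling `split_for_async(["Content", "Search Admin", "UPA Sync"])` puts only Content in the first list; the other two need the separate handling described in the Search and UPA notes.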
Q. Which version of SharePoint was tested?
A. The testing was carried out on SharePoint 2013 April 2014 CU (Post SP1). Get your farms patched people!
Q. Can I implement Always On Async replication on a version of SharePoint 2013 prior to the tested version?
A. While we have every reason to believe that the async replication capabilities will work just fine on a version of SharePoint 2013 prior to the tested version, we do not recommend it. This does not make it unsupported, however; just consider that in the event of raising a support case your customer is likely to be asked to install the SharePoint 2013 April 2014 CU if the support case is in any way related.
Q. Can I implement Always On Async replication on a version of SharePoint prior to SharePoint 2013?
A. We have not tested or considered support for prior versions of SharePoint and as such the official stance here has to be, unsupported on versions prior to SharePoint 2013.
Q. What has changed in the product to support Always On Async replication in SharePoint 2013?
A. Nothing has fundamentally changed in the data transfer or connection layer. We have added a new property to the SPDatabase object. AlwaysOnAvailabilityGroup is a new property that is populated when you execute the new PowerShell command Add-DatabaseToAvailabilityGroup.
Q. What new commands have been added to support Always On in SharePoint 2013?
A. There are three new commands:
Add-DatabaseToAvailabilityGroup, this adds a database to an availability group by database and availability group name
Remove-DatabaseFromAvailabilityGroup, this removes a database from an availability group and has options to prevent data loss on the secondaries and also to force the removal if needed
Get-AvailabilityGroupStatus, this allows you to interrogate a known availability group and check which SQL nodes it contains and the replication status of each.
Q. Are there any known limitations when using availability groups in sync or async mode?
A. There are multiple reports of being unable to create new databases against listener names or having difficulties patching farms when using Sync and Async replication modes. Work is on-going to document the limitations on TechNet in the coming months.
Special considerations for the UPA Sync database
In the table above, backup and restore (or recreate) are noted as the appropriate steps to take in a BCM scenario for this particular database. This database has its schema and some data provisioned at the point of starting the User Profile Synchronization service instance. In addition, when updates are applied any changes to its schema are only provisioned following a service instance restart. This means that Async replication mode is not appropriate and indeed offers no value. There is a compromise to be made between the backup/restore approach, which involves other components such as encryption keys and certificates, and the recreate approach. For many customers backup/restore is a significant burden and they will choose the “cleanest” approach, which is to simply recreate and then provision the UPS service instance. Be aware however that this approach requires a reconfiguration of any Sync connections, filters and additional work with custom property mappings.
Special considerations for the Search service application databases
You will note that the search databases are specifically called out as not supported for Async replication. This is due to the requirement to maintain synchronization between the search index files on disk and the search databases. With async replication this coordination cannot be guaranteed and the possibility of search index corruption or certainly instability is extremely high.
Thus the question remains, what to do about search?
If you cast your mind back to SharePoint 2010 we had three options.
1. For a high degree of search freshness you could crawl read only databases on the DR side or crawl the production farm from a DR search service application.
2. Use a log shipped copy of the search admin database to recreate the search service in DR and re-crawl the content. This has the advantage of bringing over the search configuration, but the index needs to be rebuilt.
3. Backup and Restore of the search service application. This is a high fidelity restore but may be unacceptable due to an extended restore time.
Two fundamental differences exist between SharePoint 2010 and SharePoint 2013 in this regard. In SharePoint 2013 several enhancements have been made to support tenant and site level search administration and this complicates the search DR story. These enhancements, while surfaced at the site collection and web level, actually have their configuration stored in the search administration database. The second key change is the use of the search engine to process and manage analytical data, augmenting the search index and relevance of search results based on usage statistics and click-throughs. This information resides inside the search index itself.
These changes give us a challenge for DR. The information in the search administration database can be retained for DR by recreating the service application from a copy of the production search admin database, similar to how we did this task in SP2010. The downside is that we cannot constantly update the DR side and so this has to be done at the point of failover followed by a full crawl. The analytics information however cannot be replicated in any way except by a full service application backup and restore. This means we have different options for search DR in SP2013, each with its own pros and cons.
Option | Advantage | Disadvantage |
Crawl Read Only DBs in DR | Can maintain high degree of search content freshness at point of failover. | Cannot maintain search configuration settings below service app level. Also loses analytical influences from production |
Recreate service app in DR from Admin Database | Brings across all the search configuration settings | Have to re-crawl all content. Also loses analytical influences from production |
Search service application backup and restore | Brings across all the search settings and the analytical influences | May take a while to get search operational. Search is offline until the restore completes |
Depending on your requirements for search you will need to make a selection from the above. Alternatively you could use a combination. For example, if your primary need for search was to discover content with a high degree of freshness then you could select the first option. At the same time, just on the off chance that failing back to production was not possible (perhaps due to a major incident), you could also take a backup from production periodically. You could then restore it in DR to recover full fidelity, perhaps after a trigger period of a number of hours after failover.
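To make the trade-off concrete, here is a small Python sketch that encodes the three options from the table and filters them by requirement. It is purely illustrative: the class, the field names and the true/false values are my reading of the Advantage and Disadvantage columns, not an official decision tool.

```python
from dataclasses import dataclass


# Hypothetical model of the search DR options table above.
@dataclass(frozen=True)
class SearchDrOption:
    name: str
    searchable_at_failover: bool   # usable index in DR straight away
    keeps_site_level_config: bool  # config below service app level survives
    keeps_analytics: bool          # analytical influences survive


OPTIONS = [
    SearchDrOption("Crawl read only DBs in DR", True, False, False),
    SearchDrOption("Recreate service app in DR from admin database", False, True, False),
    SearchDrOption("Search service application backup and restore", False, True, True),
]


def candidates(need_freshness=False, need_config=False, need_analytics=False):
    """Return the names of the options that satisfy every stated requirement."""
    return [
        option.name
        for option in OPTIONS
        if (option.searchable_at_failover or not need_freshness)
        and (option.keeps_site_level_config or not need_config)
        and (option.keeps_analytics or not need_analytics)
    ]
```

Asking for both freshness and configuration, `candidates(need_freshness=True, need_config=True)`, returns an empty list. No single option covers both, which is exactly why a combination of options is worth considering.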
SharePoint CAT is working on a more in depth whitepaper to describe the steps for Search DR in more detail. This whitepaper will be published in due course.
So there you go. If you are in the business of designing or providing BCM for enterprise SharePoint deployments, you will surely agree that the importance of these changes should not be underplayed. If you want more information on the background of this space, and to re-live the actual announcement, head on over to Channel 9 (yes, it’s for IT geeks as well) and watch my good buddy, Neil Hodgkinson, deliver the goods in his excellent presentation on BCM for SharePoint 2013.
I’m outta here like I just failed over my data centre.
s.