One of the features that makes SQL Server Merge Replication superior to other sync technologies is something called Precomputed Partitions.
SQL Server creates and maintains distinct data partitions/subsets for each unique user or other type of filter value. Other sync technologies figure out “what’s-changed” on the fly when a Subscriber synchronizes. This means all change-tracking tables must be evaluated to figure out the data-delta while the user (not so patiently) waits. Merge Replication with Precomputed Partitions does all this hard work in advance, so the Subscribers can start downloading changes instantly. Not making your customer wait contributes to a good user experience (UX).
To pull this off, SQL Server has to use a more advanced set of tracking tables + additional trigger logic than simpler sync systems. Since it does more work in advance, it goes without saying that the Merge Agents are busily creating extra contention through locking and blocking of all those extra tracking tables. Luckily, you have some control over how many extra tracking tables are used depending on the partition choices you make. The fewer tables you have to use, while still getting the functionality you need is one of the keys to reducing SQL Server contention.
Let’s take a quick look at the tracking tables to give you some context:
- MSmerge_contents (A row is inserted for each user table insert or update)
- MSmerge_tombstone (A row is inserted for user table delete)
- MSmerge_genhistory (A grouping of the above inserts called a generation)
- MSmerge_partition_groups (One row for each distinct partition defined)
- MSmerge_current_ partition_mappings (One row for each unique combination of rows in MSmerge_contents and MSmerge_partition_groups)
- MSmerge_past_partition_mappings (One row for each row that no longer belongs in a given partition)
Wow, that’s a bunch of tracking tables!
So imagine a bunch of Merge Agents performing SELECTS, INSERTS, UPDATES, and DELETES against all these tables in order to make the magic happen for the end user. The more Merge Agents you having running concurrently, the more locking and blocking the system will experience which degrades performance.
What if I told you that you could reduce contention by eliminating 2 tracking tables from the equation?
Anyone who has setup Merge Replication might recognize the Add Filter dialog below:
You will see one of these dialog boxes for every filter you create. For the purposes of this article, I just want you to focus on the 2 radio buttons at the bottom. You get to choose that a row from a table will sync with multiple subscriptions or just one subscription. If multiple users share the same distinct partition, you select the top one, because it’s an overlapping partition and it will need to use all 6 of the tracking tables I mentioned previously. This means multiple users send the exact same HOST_NAME() filter value when they sync. This might be a scenario where a group of sales professionals share the same region and clients, but nothing specific to themselves.
On the other hand, if you know that only one user subscribes to a given partition, then you select the bottom radio button because it’s a non-overlapping partition, and only uses the top 4 tracking tables I mentioned above. This is actually the most common scenario I see with my customers. Imagine a delivery driver that syncs down the data he needs to tell him where to go, what to do, and then uploads data he’s captured about deliveries.
So the big takeaway here, is to always select the bottom radio button so that SQL Server will create more efficient, non-overlapping partitions and use fewer tracking tables.
But wait, there’s more!
Non-overlapping partitions are less likely to create conflicts that have to be resolved by the built-in conflict resolvers. If you’re not updating other users rows, then you’re not crashing into each other which leads to faster, smoother operation.
Back at the beginning of the article, I mentioned that there’s a quite of lot of code executed by triggers and associated stored procedures to make this sophisticated machine work. Depending on your schema, the amount of data in the tracking tables, and the types of filters you’ve created, a potentially slow 5-way join is executed by sp_MSenumchanges. You can turn that into a 3-way join and speed up the system by sticking to non-overlapping partitions. That was easy.
It’s all about UX, and trust me, your users will thank you for caring about this stuff!
Sharing my knowledge and helping others never stops, so connect with me on my blog at http://robtiffany.com , follow me on Twitter at https://twitter.com/RobTiffany and on LinkedIn at https://www.linkedin.com/in/robtiffany