Mobile Merge Replication Performance and Scalability Cheat Sheet

| April 22, 2010 | 4 Comments

If your Mobile Enterprise Application Platform (MEAP) is using SQL Server Merge Replication to provide the mobile middleware and reliable wireless wire protocol for SQL Server Compact (SSCE) running on Windows Mobile and Windows Embedded Handheld devices + Windows XP/Vista/7 tablets, laptops, desktops, slates, and netbooks; below is a guide to help you build the fastest, most scalable systems:

Active Directory

  • Since your clients will be passing in their Domain\username + password credentials when they sync, both IIS and SQL Server will make auth requests of the Domain Controller. Ensure that you have at least a primary and backup Domain Controller, that the NTDS.dit disk drives are big enough to handle the creation of a large number of new AD DS objects (mobile users and groups), and that your servers have enough RAM to cache all those objects in memory.

Database Schema

  • Ensure your schema is sufficiently de-normalized so that you never have to perform more than a 4-way JOIN across tables. This affects server-side JOIN filters as well as SSCE performance.
  • To ensure uniqueness across all nodes participating in the sync infrastructure, use GUIDs for your primary keys so that SQL Server doesn’t have to deal with the overhead of managing Identity ranges. Make sure to mark your GUIDs as ROWGUIDCOL for that table so that Merge won’t try to add an additional Uniqueidentifier column to the table.  Don’t create clustered indexes when using GUIDs as primary keys because they will suffer horrible fragmentation that will rapidly degrade performance.  Just use a normal index.
  • Create clustered indexes for your primary keys when using Indentity columns, Datetime, or other natural keys.  Ensure that every column in every table that participates in a WHERE clause is indexed.

Distributor

  • If your network connection is fast and reliable like Wi-Fi or Ethernet, your SSCE client has more than 32 MB of free RAM, and SQL Server isn’t experiencing any deadlocks due to contention with ETL operations or too many concurrent Merge Agents, create a new Merge Agent Profile based on the High Volume Server-to-Server Profile so that SQL Server will perform more work per round-trip and speed up your synchronizations.
  • If you’re using a 2G/3G Wireless Wide Area Network connection, create a Merge Agent Profile based on the Default Profile so that SQL Server will perform less work and use fewer threads per round-trip during synchronization than the High Volume Server to Server Profile which will help to reduce server locking contention and perform less work per round trip which will make your synchronizations more likely to succeed.
  • In order to prevent SQL Server from performing Metadata Cleanup every time a Subscriber synchronizes, set the –MetadataRetentionCleanup parameter to 0.
  • As SQL Server has to scale up to handle a higher number of concurrent users in the future, locking contention will increase due to more Merge Agents trying to perform work at the same time.  When this happens, adjust the parameters of the Default Profile so that both  –SrcThreads and –DestThreads are equal to 1.

Publication

  • When defining the Articles you’re going to sync, only check the minimum tables and columns needed by the Subscriber to successfully perform its work.
  • For Lookup/Reference tables that aren’t modified by the Subscriber, mark those as Download-only to prevent change-tracking metadata from being sent to the Subscriber.
  • Despite the fact the column-level tracking sends less data over the air, stick with row-level tracking so SQL Server won’t have to do as much work to track the changes.
  • Use the default conflict resolver where the “Server wins” unless you absolutely need a different manner of picking a winner during a conflict.
  • Use Static Filters to reduce the amount of server data going out to all Subscribers.
  • Make limited use of Parameterized Filters which are designed to reduce and further specify the subset of data going out to a particular Subscriber based on a HOST_NAME() which creates data partitions.  This powerful feature slows performance and reduces scalability with each additional filter, so it must be used sparingly.
  • Keep filter queries simple and don’t use IN clauses, sub-selects or any kind of circular logic.
  • Strive to always create “well-partitioned” Articles where all changes that are uploaded/downloaded are mapped to only the single partition ID for best performance and scalability.
    • When using Parameterized Filters, always create non-overlapping data partitions where each row from a filtered table only goes to a single Subscriber instead of more than one which will avoid the use of certain Merge metadata tables.
    • Each Article in this scenario can only be pubished to a single Publication
    • A Subscriber cannot insert rows that do not belong to its partition ID.
    • A Subscriber cannot update columns that are involved in filtering.
    • In a join filter hierarchy, a regular article cannot be the parent of a “well-partitioned” article.
    • The join filter in which a well-partitioned article is the child must have the join_unique_key set to a value of 1 which relates to the Unique key check box of the Add Join dialog.  This means there’s a one-to-one or one-to-many relationship with the foreign key.
    • Each “well-partitioned” Article can have only one subset or join filter. The article can have a subset filter and be the parent of a join filter, but cannot have a subset filter and be the child of a join filter.
  • Never extend a filter out to more than 4 joined tables.
  • Do not filter tables that are primarily lookup/reference tables, small tables, and tables with data that does not change.
  • Schedule the Snapshot Agent to run once per day to create an unfiltered schema Snapshot.
  • Set your Subscriptions to expire as soon as possible to keep the amount change-tracking metadata SQL Server has to manage to an absolute minimum. Normally, set the value to 2 to accommodate 3-day weekends since 24 hours are automatically added to the time to account for multiple time zones. If server-side change tracking isn’t needed and Subscribers are pulling down a new database every day and aren’t uploading data, then set the expiration value to 1.
  • Set Allow parameterized filters equal to True.
  • Set Validate Subscribers equal to HOST_NAME().
  • Set Precompute partitions equal to True to allow SQL Server to optimize synchronization by computing in advance which data rows belong in which partitions.
  • Set Optimize synchronization equal to False if Precompute partitions is equal to True.  Otherwise set it to True to optimize filtered Subscriptions by storing more metadata at the Publisher.
  • Set Limit concurrent processes equal to True.
  • Set Maximum concurrent processes equal to the number of SQL Server processor cores.  If exceesive locking contention occurs, reduce the number of concurrent processes until the problem is fixed.
  • Set Replicate schema changes equal to True.
  • Check Automatically define a partition and generate a snapshot if needed when a new Subscriber tries to synchronize. This will reduce Initialization times since SQL Server creates and applies snapshots using the fast BCP utility instead of a series of slower SELECT and INSERT statements.
  • Add data partitions based on unique HOST_NAMEs and schedule the Snapshot Agent to create those filtered Snapshots nightly or on the weekend so they’ll be built using the fast BCP utility and waiting for new Subscribers to download in the morning.
  • Ensure that SQL Server has 1 processor core and 2 GB of RAM for every 100 concurrent Subscribers utilizing bi-directional sync. Add 1 core and 2 GB of RAM server for every additional 100 concurrent Subscribers you want to add to the system.  Never add more Subscribers and/or IIS servers without also adding new cores and RAM to the Publisher.
  • Turn off Hyperthreading in the BIOS of the SQL Server as it has been known to degrade SQL Server performance.
  • Do not add your own user-defined triggers to tables on a Published database since Merge places 3 triggers on each table already.
  • Add one or more Filegroups to your database to contain multiple, secondary database files spread out across many physical disks.
  • Limit use of large object types such as text, ntext, image, varchar(max), nvarchar(max) or varbinary(max) as they require a significant memory allocation and will negatively impact performance.
  • Set SQL Servers’s minimum and maximum memory usage to within 2 GB of total system memory so it doesn’t have to allocate more memory on-demand.
  • Always use SQL Server 2008 R2 and Windows Server 2008 R2 since they work better together because they take advantage of the next generation networking stack which dramatically increases network throughput. They can also scale up as high as 256 cores.
  • Due to how Merge Replication tracks changes with triggers, Merge Agents, and tracking tables, it will create locking contention withDML/ ETL operations.  This contention degrades server performance which negatively impacts sync times with devices.  This contention should be mitgated by performing large INSERT/UPDATE/DELETE DML/ETL operations during a nightly maintenance window when Subscribers aren’t synchronizing.
  • Since Published databases result in slower DML/ETL operations, perform changes in bulk by using XML Stored Procedures to boost performance.
  • To improve the performance of pre-computed partitions when DML/ETL operations result in lots of data changes, ensure that changes to a Parent table in a join filter are made before corresponding changes in the child tables.  This means that when DML/ETL operations are pushing new data into SQL Server, they must add master data to the parent filter table first, and then add detail data to all the related child tables second, in order for that data to be pre-computed and optimized for sync.
  • Create filter partitions based on things that don’t change every day.  Partitions that are added and deleted from SQL Server and Subscribers that move from one partition to another is very disruptive to the performance of Merge Replication.
  • Always perform initializations and re-initializations over Wi-Fi or Ethernet when the device is docked because this is the slowest operation where the entire database must be downloaded instead of just deltas.  To determine rough estimates for initialization, multiply the size of the resulting SSCE .sdf file x the bandwidth speed available to the device.  A file copy over the expected network will also yield estimates for mininum sync times.  These times don’t include the work SQL Server and IIS must perform to provide the data or data INSERT times on SSCE.
  • If your SQL Server Publisher hits a saturation point with too many concurrent mobile Subscribers, you can scale it out creating a Server/Push Republishing hierarchy. Put the primary SQL Server Publisher at the top of the pyramid and have two or more SQL Servers subscribe to it. These can be unfiltered Subscriptions where all SQL Servers get the same data or the Subscribers can filter their data feeds by region for example. Then have the Subscribing SQL Servers Publish their Subscription for consumption by mobile SSCE clients.
  • Create just a single Publication.

Internet Information Services

  • Use the x64 version of the SQL Server Compact 3.5 SP2 Server Tools with Windows Server 2008 R2 running inside IIS 7.5.
  • Use a single Server Agent in a single Virtual Directory.
  • Ensure the IIS Virtual Directory where the Server Agent resides is on a fast solid-state drive that’s separate from the disk where Windows Server is installed to better support file I/O.
  • Use a low-end server with 2 processor cores and 2 GB of RAM to support 400 concurrent Subscribers queued at the same time.
  • Set the MAX_THREADS_PER_POOL Server Agent registry key equal to 2 to match the IIS processor cores and RAM. Do not set this value to a higher number than the number of cores.
  • Set the MAX_PENDING_REQUEST Server Agent registry key equal to 400 which means the Server Agent will queue up to 400 concurrent Subscribers waiting for one of the 2 worker threads to become available to sync with SQL Server.
  • Set the IIS Connection Limits property to 400 to prevent an unlimited number of connections reaching the Server Agent.
  • Add a new load-balanced IIS server for every additional 400 concurrent Subscribers you want to add to the system.

Subscriber

  • Use the appropriate x64, x86 or ARM version of SQL Server Compact 3.5 SP2 to take advantage of the PostSyncCleanup property of the SqlCeReplication object that can reduce the time it takes to perform an initial synchronization. Set the PostSyncCleanup property equal to 3 where neither UpdateStats nor CleanByRetention are performed.
  • Increase the Max Buffer Size connection string parameter to 1024 on a phone and 4096 on a PC to boost both replication and SQL query processing performance. If you have more RAM available, set those values even higher until you reach the law of diminishing returns.
  • Keep your SSCE database compact and fast by setting the Autoshrink Threshold connection string parameter to 10 so it starts reclaiming empty data pages once the database has become 10% fragmented.
  • Replication performance testing must be performed using actual PDAs to observe how available RAM, storage space and CPU speed affect moving data into the device’s memory area and how quickly this data is inserted into the SSCE database tables.  Since the SSCE database doubles in size during replication, the device must have enough storage available or the operation will fail.  Having plenty of available RAM is important so that SSCE can utilize its memory buffer to complete a Merge Replication operation more quickly.  With plenty of available RAM and storage, a fast CPU will make all operations faster.
  • The PDA must have at least an extra 32 MB of available free RAM that can be used by the .NET Compact Framework (NETCF) application.  If additional applications are running on the device at the same time, even more RAM is needed.  If a NETCF application has insufficient RAM is will discard its compiled code and run in interpreted mode which will slow the application down.  If the NETCF app is still under memory pressure after discarding compiled code, Windows Mobile will first tell the application to return free memory to the operating system and then will terminate the app if needed.
  • Set the CompressionLevel property of the SqlCeReplication object to 0 for fast connections and increment it from 1 to 6 on slower connections like GPRS to increase speed and reduce bandwidth consumption.
  • Tune the ConnectionRetryTimeout, ConnectTimeout, ReceiveTimeout and SendTimeout properties of the SqlCeReplication object based on expected bandwidth speeds:
Property High Bandwidth Medium Bandwidth Low Bandwidth
ConnectionRetryTimeout 30 60 120
ConnectTimeout 3000 6000 12000
ReceiveTimeout 1000 3000 6000
SendTimeout 1000 3000 6000
  • You can decrease potentially slow SSCE file I/O by adjusting the Flush Interval connection string parameter to write committed transactions to disk less often than the default of every 10 seconds.  Test longer intervals between flushes like 20 or 30 seconds. Keep in mind that these transactions can be lost if the disk or system fails before flushing occurs so be careful.
  • When replicating data that has been captured in the field by the device, perform Upload-only syncs to shorten the duration.

Storage

  • Use a Fibre Channel SAN with 15k RPM or solid-state disks for best I/O performance.
  • Databases should reside on a RAID 10, unshared LUN comprised of at least 6 disks.
  • Database logs should reside on a RAID 10, unshared LUN comprised of at least 6 disks.
  • Tempdb should reside on a RAID 10, unshared LUN comprised of at least 6 disks.
  • The Tempdb log should reside on a RAID 10, unshared LUN comprised of at least 6 disks.
  • The Snapshot share should reside on a RAID 10, unshared LUN comprised of at least 6 disks.  This disk array should be large enough to accommodate a growing number of filtered Snapshots. Snapshot folders for Subscribers that no longer use the system must be manually deleted.
  • Merge Replication metadata tables should reside on a RAID 10, unshared LUN comprised of at least 6 disks.
  • Increase your Host Bus Adapter (HBA) queue depths to 64.
  • Your Publication database should be broken up into half the number of files as the SQL Server has processor cores. Each file must be the same size.
  • Tempdb should be pre-sized with an auto-growth increment of 10%. It should be broken up into the same number of files as the SQL Server has processor cores. Each file must be the same size.

High Availability

  • Load-balance the IIS servers to scale them out. Enable Server Affinity (stickiness) since the Replication Session Control Blocks that transmit data between the Server Agent and SSCE are stateful. Test to ensure that your load-balancer is actually sending equal amounts of Subscriber sync traffic to each IIS server.  Some load-balancers can erroneously send all traffic to a single IIS server if not properly configured.
  • Implement Windows Clustering so that SQL Server can failover to a second node.
  • Using SQL Server Mirroring so that your Published database will failover to a standby server.
  • Make a second SQL Server into an unfiltered Subscriber to your Publisher so that it can take over Merge Replication duties for mobile clients as a Republisher if the primary SQL Server fails. SSCE clients would just have to reinitialize their Subscriptions to begin synchronizing with the new Republisher.

Ongoing Maintenance

  • Use the Replication Monitor to have a real-time view of the synchronization performance of all your Subscribers.
  • Use the web-based SQL Server Compact Server Agent Statistics and Diagnostics tools to monitor the health and activity of the Server Agent running on IIS.
  • Create a SQL Job to execute the sp_MSmakegeneration stored procedure after large DML operations. Regular execution after INSERTING/UPDATING/DELETING data from either DML/ETL operations or after receiving lots of changes from Subscribers will maintain subsequent sync performance. Executing this stored procedure from the server-side is preferable to having it executed as a result of a Subscriber sync which would block all other Subscribers.
  • During your nightly maintenance window, rebuild the indexes and update the statistics of the following Merge Replication metadata tables:
    • MSmerge_contents
    • MSmerge_tombstone
    • MSmerge_genhistory
    • MSmerge_current_partition_mappings
    • MSmerge_past_partition_mappings
    • MSmerge_generation_partition_mappings
  • If you notice performance degradation during the day due to a large number of Subscribers making large changes to the database, you can updates the statistics (with fullscan) of the Merge Replication metadata tables more frequently throughout the day to force stored proc recompiles to get a better query plan.
    • UPDATE STATISTICS MSmerge_generation_partition_mappings WITH FULLSCAN
    • UPDATE STATISTICS MSmerge_genhistory WITH FULLSCAN
  • Rebuild/defrag indexes on your database tables and Merge Replication metadata tables throughout the day to reduce locking contention and maintain performance.
  • Use the Missing Indexes feature of SQL Server to tell you which indexes you could add that would give your system a performance boost. Do not add recommended indexes to Merge system tables.
  • Use the Database Engine Tuning Advisor to give you comprehensive performance tuning recommendations that cover every aspect of SQL Server.
  • Monitor the performance of the following counters:
    • Processor Object: % Processor Time: This counter represents the percentage of processor utilization. A value over 80% is a CPU bottleneck.
    • System Object: Processor Queue Length: This counter represents the number of threads that are delayed in the processor Ready Queue and waiting to be scheduled for execution. A value over 2 is bottleneck and shows that there is more work available than the processor can handle. Remember to divide the value by the number of processor cores on your server.
    • Memory Object: Available Mbytes: This counter represents the amount of physical memory available for allocation to a process or for system use. Values below 10% of total system RAM indicate that you need to add additional RAM to your server.
    • PhysicalDisk Object: % Disk Time: This counter represents the percentage of time that the selected disk is busy responding to read or write requests. A value greater than 50% is an I/O bottleneck.
    • PhysicalDisk Object: Average Disk Queue Length: This counter represents the average number of read/write requests that are queued on a given physical disk. If your disk queue length is greater than 2, you’ve got an I/O bottleneck with too many read/write operations waiting to be performed.
    • PhysicalDisk Object: Average Disk Seconds/Read and Disk Seconds/Write: These counters represent the average time in seconds of a read or write of data to and from a disk. A value of less than 10 ms is what you’re shooting for in terms of best performance. You can get by with subpar values between 10 – 20 ms but anything above that is considered slow. Times above 50 ms represent a very serious I/O bottleneck.
    • PhysicalDisk Object: Average Disk Reads/Second and Disk Writes/Second: These counters represent the rate of read and write operations against a given disk. You need to ensure that these values stay below 85% of a disk’s capacity by adding disks or reducing the load from SQL Server. Disk access times will increase exponentially when you get beyond 85% capacity.
  • A limited number of database schema changes can be made and synchronized down to SSCE Subscribers without any code changes which makes it easier to update your system as it evolves over time.
  • Use a Merge Replication Test Harness to stress test the entire system.  The ability to simulate hundreds or thousands of concurrent synchronizing Subscribers allows you to monitor performance and the load on the system.  This is helpful in properly configuring and tuning SQL Server, IIS, and the Windows Mobile devices.  It will tell you where you’re having problems and it will let you predict how much server hardware you will need to support growing numbers of Subscribers over time.  It’s also very important to simulate worst-case scenarios that you never expect to happen.

I hope this information sufficiently empowers you to take on the largest MEAP solutions that involve SQL Server Merge Replication and SQL Server Compact.  If you need a deeper dive, go check out my book on Enterprise Data Synchronization http://www.amazon.com/Enterprise-Synchronization-Microsoft-Compact-Replication/dp/0979891213/ref=sr_1_1?ie=UTF8&s=books&qid=1271964573&sr=1-1 over at Amazon.  Now go build a fast and scalable solution for your company or your customers.

Best Regards,

Rob

P.S.  If your solution doesn’t require all the advanced features found in Merge Replication, I highly recommend you use Remote Data Access (RDA).  This is a much simpler sync technology that’s extremely fast, scalable, and easier to manage.

Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Category: Sync

About the Author ()

A mobile strategist and cloud architect at Microsoft, Rob has spent his career as an entrepreneur, advisor, teacher, developer, speaker, and author of bestselling books on mobile and wireless technologies. A pioneer of the smartphone revolution, he drove the development of the mobile app ecosystem from its earliest days and co-founded the world’s first cloud-based mobile device management company.

Comments (4)

Trackback URL | Comments RSS Feed

  1. Kenny Lam says:

    Thank you very much for compiling this cheat sheet. It will certainly be a great guide in helping the business I work for optimize its application’s merge replication performance.

  2. Paul Millar says:

    Good doc, excellent !

Leave a Reply