Monday, January 3, 2022

Fix Email Campaign Pausing: Sitecore Email Experience Manager 3.x Retry Data Provider


Background

My company uses Email Experience Manager (EXM) to send several million emails a day, and we have been facing issues where our large campaigns would pause mid-send.

We have a scaled EXM environment with two dedicated dispatch servers and a separate SQL Server, all with appropriate resources, so hardware was not the issue. We also ensured that the databases were kept in tiptop condition (proper maintenance plans with statistics being updated) and that configurations were optimal for our environment.


The cause of the pausing

After digging in, I discovered that the pausing was caused by SQL deadlocks, which resulted from the massive number of records and the volume of CRUD activity on the EXM SQL databases.

Sample Exception:

 ERROR Transaction (Process ID 116) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.  
 Exception: System.Data.SqlClient.SqlException  
 Message: Transaction (Process ID 116) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.  
 Source: .Net SqlClient Data Provider  
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)  
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)  
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)  
   at System.Data.SqlClient.SqlDataReader.TryHasMoreRows(Boolean& moreRows)  
   at System.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)  
   at System.Data.SqlClient.SqlDataReader.Read()  
   at System.Data.SqlClient.SqlCommand.CompleteExecuteScalar(SqlDataReader ds, Boolean returnSqlValue)  
   at System.Data.SqlClient.SqlCommand.ExecuteScalar()  
   at Sitecore.Modules.EmailCampaign.Core.Data.SqlDbEcmDataProvider.CountRecipientsInDispatchQueue(Guid messageId, RecipientQueue[] queueStates)  
   at Sitecore.Modules.EmailCampaign.Core.Gateways.DefaultEcmDataGateway.CountRecipientsInDispatchQueue(Guid messageId, RecipientQueue[] queueStates)  
   at Sitecore.Modules.EmailCampaign.Core.Analytics.MessageStatistics.get_Unprocessed()  
   at Sitecore.Modules.EmailCampaign.Core.Analytics.MessageStatistics.get_Processed()  
   at Sitecore.Modules.EmailCampaign.Core.MessageStateInfo.InitializeSendingState()  
   at Sitecore.Modules.EmailCampaign.Core.MessageStateInfo.InitializeMessageStateInfo()  
   at Sitecore.Modules.EmailCampaign.Factory.GetMessageStateInfo(String messageItemId, String contextLanguage)  
   at Sitecore.EmailCampaign.Server.Services.MessageInfoService.Get(String messageId, String contextLanguage)  
   at Sitecore.EmailCampaign.Server.Controllers.MessageInfo.MessageInfoController.MessageInfo(MessageInfoContext data)  
   

How does this new data provider fix the problem?

The new data provider adds SQL deadlock handling. When a deadlock is detected, it waits 5 seconds and then retries the transaction. By default, it tries to execute a deadlocked transaction up to 3 times.
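To make the wait-and-retry idea concrete, here is a minimal, language-agnostic sketch in Python. It is not the provider's actual code (that lives in the GitHub repository linked at the end of this post). The `SqlError` class, the `run_with_deadlock_retry` function, and the injectable `sleep` parameter are hypothetical names used only for illustration. The sketch also assumes that the retry count is the total number of attempts. The one hard fact it relies on is that SQL Server reports a deadlock victim with error number 1205.

```python
import time

# SQL Server raises error 1205 when a session is chosen as the deadlock victim.
DEADLOCK_ERROR_NUMBER = 1205


class SqlError(Exception):
    """Hypothetical stand-in for a database driver exception with an error number."""

    def __init__(self, number, message=""):
        super().__init__(message)
        self.number = number


def run_with_deadlock_retry(operation, retry_count=3, delay_seconds=5, sleep=time.sleep):
    """Call operation(); if it fails with a deadlock, wait and try again.

    Assumption: retry_count is the total number of attempts, not the number
    of retries after the first failure.
    """
    attempt = 1
    while True:
        try:
            return operation()
        except SqlError as error:
            # Re-raise anything that is not a deadlock, or a deadlock on the last attempt.
            if error.number != DEADLOCK_ERROR_NUMBER or attempt >= retry_count:
                raise
            sleep(delay_seconds)
            attempt += 1
```

Only deadlock errors are retried in this sketch. Any other failure, or a deadlock on the final attempt, is re-raised so that the caller still sees the original exception.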

Configuration

By default, the provider waits 5 seconds before each retry and makes a maximum of 3 retry attempts. The DelaySeconds and RetryCount settings can be modified to suit your needs.

 <configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">  
  <sitecore>  
   <ecmDataProvider defaultProvider="sqlretry">  
    <providers>  
     <clear/>  
     <add name="sqlretry" type="Sitecore.EmailCampaign.RetryDataProvider.RetrySqlDbEcmDataProvider, Sitecore.EmailCampaign.RetryDataProvider" connectionStringName="exm.master">  
      <Logger type="Sitecore.ExM.Framework.Diagnostics.Logger, Sitecore.ExM.Framework" factoryMethod="get_Instance"/>  
      <DelaySeconds>5</DelaySeconds>  
      <RetryCount>3</RetryCount>  
     </add>  
     <add name="sqlbase" type="Sitecore.Modules.EmailCampaign.Core.Data.SqlDbEcmDataProvider, Sitecore.EmailCampaign" connectionStringName="exm.master">  
      <Logger type="Sitecore.ExM.Framework.Diagnostics.Logger, Sitecore.ExM.Framework" factoryMethod="get_Instance"/>  
     </add>  
    </providers>  
   </ecmDataProvider>  
  </sitecore>  
 </configuration>  

Source Code and Documentation

Full source code, documentation, and a package download are available from my GitHub repository:

https://github.com/martinrayenglish/Sitecore-EXM-3.x-Retry-Data-Provider