Connection resiliency in Entity Framework 6

The EF team has recently announced the new features added to EF 6 beta 1, and among them is one I found interesting: connection resiliency.
Despite the fancy name, it’s a simple concept: having built-in support for automatically retrying an EF operation, if the cause of the failure is a transient one, that has a high probablity to be “fixed” in a short interval of time.

While such “transient failures” are quite rare when using a single database server, located near the client (ex.: web application), on the same high-speed LAN, they are expected when the database server is located in a huge server farm, like Azure or Amazon ones. Expected not only once every blue moon, but quite often as servers are automatically relocated or connections are throttled when the load is high.
It’s true that such “retry” functionality it’s not so hard to build manually in our application, but it’s a repetitive process that I would rather have already available when needed.

As expected, currently this feature is available only on the EF provider for SQL Azure, and only in EF 6 beta 1 or newer. Since anyway I am on the bleeding edge on this case, I used the nightly EF6 build, taken with NuGet from MyGet feed (https://entityframework.codeplex.com/wikipage?title=Nightly%20Builds) instead of the official one.

Let’s write some code in a console application:

    class Program
    {
        static void Main(string[] args)
        {
            Database.SetInitializer<AdventureWorksContext>(null);

            Console.WriteLine("Persons starting with T: ...");
            using (var context = new AdventureWorksContext())
            {
                context.Database.Log = Console.Write;

                var personsStartingWithT = from p in context.Persons
                                           where p.FirstName.StartsWith("T")
                                           select p;

                foreach (Person person in personsStartingWithT)
                {
                    Console.WriteLine("{0} {1} {2}", person.Suffix, person.FirstName, person.LastName);
                }

                Console.WriteLine("\n\rPress any key to continue..");
                Console.ReadLine();

            }

        }
    }

So far nothing new.
I used the AdventureWorks2012 database, on a SQL Azure instance, so the connection string looks like this:

  <connectionStrings>
    <add name="AdventureWorksContext" providerName="System.Data.SqlClient" 
connectionString="Server=tcp:[.....].database.windows.net,1433;Database=AdventureWorks2012;User ID=tudor@[.......];Password=[.......];Trusted_Connection=False;Encrypt=True;Connection Timeout=30;" />
  </connectionStrings>

All we need to do in order to enable the connection resiliency for all actions is to add the SqlAzureExecutionStrategy to our custom DbConfiguration class, that is automatically discovered by EF at startup:

namespace TestConnectionResiliency.Models
{
    public class SqlAzureDbConfiguration : DbConfiguration
    {
        public SqlAzureDbConfiguration()
        {
            AddExecutionStrategy(() => new SqlAzureExecutionStrategy());
        }
    }
}

To test if it works, since it’s a bit hard to wait for a transient failure to happen on SQL Azure, I just disabled the network connection on the client just before the query was about to execute, and I enabled it back a few seconds later. The log on the debug console say it all:

A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
A first chance exception of type 'System.Data.SqlClient.SqlException' occurred in System.Data.dll
'TestConnectionResiliency.vshost.exe' (Managed (v4.0.30319)): Loaded 'EntityFrameworkDynamicProxies-TestConnectionResiliency'
...
The thread '<No Name>' (0x2020) has exited with code 0 (0x0).
The program '[8832] TestConnectionResiliency.vshost.exe: Managed (v4.0.30319)' has exited with code 0 (0x0).

The timeout exception was thrown by ADO.NET, but EF caught it and retried the operation until it was successful.

What exactly is the EF provider for SQL Azure doing in this case?
Easy to find out – we have the source code for DbExecutionStrategy.cs and SqlAzureRetriableExceptionDetector.cs.

If a SqlException with a certain error code (full details in SqlAzureRetriableExceptionDetector.cs), or a TimeoutException, is thrown by the underlying provider, the operation is retried no more than 5 times until is successful, using exponentially increasing (random) delays between retries.
The formula used is: MIN(random(1, 1.1) * (2 ^ retryCount - 1), maxDelay)
where retryCount is by default 5, and maxDelay is by default 30s.
In other words, it retries slower and slower each time, but no slower than 30 seconds.
This is an exponential backoff algorithm, that is clasically used in such cases.
The list of SQl error codes that trigger a retry is similar with the ones from http://windowsazurecat.com/2010/10/best-practices-for-handling-transient-conditions-in-sql-azure-client-applications/ .

Some context: even since Azure was launched, various teams within Microsoft have tried to implement something similar:
– the ‘Transient Fault Handling Framework for SQL Azure, Windows Azure Storage, Service Bus & Cache’ (http://windowsazurecat.com/2011/02/transient-fault-handling-framework/) from the former Azure CAT (Customer Advisory Team)
– followed by Enterprise Library Transient Fault Handling Application Block (http://msdn.microsoft.com/en-us/library/hh680934(v=pandp.50).aspx) included in Enterprise Library 5.0 Integration Pack for Windows Azure by P&P team
– now EF 6 will includ this built-in: https://entityframework.codeplex.com/wikipage?title=Connection%20Resiliency%20Spec
– the Windows ODBC Driver for SQL Server/Azure offers something similar: ‘Connection Resiliency in the Windows ODBC Driver
– and in .NET 4.5.1 preview, it might seem that something similar will be included inside ADO.NET itself, as ‘ADO.NET idle connection resiliency’ (I found no details about this one yet)

Azure container

Source: Microsoft: http://news.cnet.com/2300-10805_3-10001679-1.html (Chicago Data Center)

Advertisements
This entry was posted in .NET, Azure, Entity Framework, SQL Server and tagged , , , . Bookmark the permalink.

One Response to Connection resiliency in Entity Framework 6

  1. Pingback: ADO.NET connection resiliency | Tudor Turcu – blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s