*Windows Azure SQL Database makes the blog post title just too darned long!
Followup post: Testing out EF6 Connection Resiliency
I’ve been messing around with the DbExecutionStrategy types in EF6 and trying to force the strategy to kick in.
I spent a horrid amount of time going down the wrong path, so I thought I would elaborate on the one I've been banging on: SqlAzureExecutionStrategy.
I thought that by disconnecting my network connection at the right moment I would be able to see this work. But it turned out that my results were no different in EF5 or in EF6 using its DefaultSqlExecutionStrategy (which does not retry).
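For context, enabling the strategy looks roughly like this; the configuration class name is arbitrary, and the SetExecutionStrategy call is the piece that matters. EF6 should pick the class up automatically when it lives in the same assembly as your context. (This is just a sketch of the wiring, not anything clever.)

using System.Data.Entity;
using System.Data.Entity.SqlServer;

public class MyDbConfiguration : DbConfiguration
{
    public MyDbConfiguration()
    {
        // Tell EF6 to use the retrying strategy for SQL Server / SQL Azure connections
        SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy());
    }
}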
I finally looked yet again at the description and at some older articles on connection retries for SQL Azure, and homed in on a critical phrase:
transient failure
A transient failure is one that will most likely correct itself pretty quickly.
Looking at the code for the SqlAzureExecutionStrategy, you can see the following comment:
/// This execution strategy will retry the operation on <see cref="T:System.TimeoutException"/>
/// and <see cref="T:System.Data.SqlClient.SqlException"/>
/// if the <see cref="P:System.Data.SqlClient.SqlException.Errors"/> contains any of the following error numbers:
/// 40613, 40501, 40197, 10929, 10928, 10060, 10054, 10053, 233, 64 and 20
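In other words, the strategy only steps in when the exception actually carries one of those error numbers. If you wanted to hand-roll that same check, it would look something like this rough sketch (the class and method names here are mine, not EF's):

using System.Data.SqlClient;
using System.Linq;

static class TransientErrorDetector
{
    // The error numbers listed in the SqlAzureExecutionStrategy comment above
    static readonly int[] TransientErrorNumbers =
        { 40613, 40501, 40197, 10929, 10928, 10060, 10054, 10053, 233, 64, 20 };

    public static bool IsTransient(SqlException exception)
    {
        return exception.Errors.Cast<SqlError>()
                        .Any(error => TransientErrorNumbers.Contains(error.Number));
    }
}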
Focusing on these errors led me to a number of articles aimed at dealing with this same list of transient failures.
Here is a code sample from the Windows Server AppFabric Customer Advisory Team that contains brief descriptions of the errors:
// SQL Error Code: 40197
// The service has encountered an error processing your request. Please try again.
case 40197:
// SQL Error Code: 40501
// The service is currently busy. Retry the request after 10 seconds.
case 40501:
// SQL Error Code: 10053
// A transport-level error has occurred when receiving results from the server.
// An established connection was aborted by the software in your host machine.
case 10053:
// SQL Error Code: 10054
// A transport-level error has occurred when sending the request to the server.
// (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)
case 10054:
// SQL Error Code: 10060
// A network-related or instance-specific error occurred while establishing a connection to SQL Server.
// The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server
// is configured to allow remote connections. (provider: TCP Provider, error: 0 - A connection attempt failed
// because the connected party did not properly respond after a period of time, or established connection failed
// because connected host has failed to respond.)
case 10060:
// SQL Error Code: 40613
// Database XXXX on server YYYY is not currently available. Please retry the connection later. If the problem persists, contact customer
// support, and provide them the session tracing ID of ZZZZZ.
case 40613:
// SQL Error Code: 40143
// The service has encountered an error processing your request. Please try again.
case 40143:
// SQL Error Code: 233
// The client was unable to establish a connection because of an error during connection initialization process before login.
// Possible causes include the following: the client tried to connect to an unsupported version of SQL Server; the server was too busy
// to accept new connections; or there was a resource limitation (insufficient memory or maximum allowed connections) on the server.
// (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)
case 233:
// SQL Error Code: 64
// A connection was successfully established with the server, but then an error occurred during the login process.
// (provider: TCP Provider, error: 0 - The specified network name is no longer available.)
case 64:
// DBNETLIB Error Code: 20
// The instance of SQL Server you attempted to connect to does not support encryption.
case (int)ProcessNetLibErrorCode.EncryptionNotSupported:
    return true;
A TechNet wiki article on handling transient failures from Windows Azure SQL Database warns:
One thing that’s tricky about retry logic is actually inducing a transient error for testing.
Oh, now they tell me! (I am a pitbull and had already tried so many things.)
Their solution is to intentionally cause a deadlock.
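I won't reproduce their code here, but the general shape of forcing a deadlock is two transactions updating two tables in opposite order, with a pause in between, so they end up waiting on each other and SQL Server kills one of them with error 1205. Something along these lines; the table names and connection string are placeholders, and this is only a testing trick, not production code:

using System;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

class DeadlockInducer
{
    // connectionString, TableA and TableB are placeholders for illustration only
    public static void ForceDeadlock(string connectionString)
    {
        var first = Task.Run(() => UpdateInOrder(connectionString, "TableA", "TableB"));
        var second = Task.Run(() => UpdateInOrder(connectionString, "TableB", "TableA"));
        try
        {
            Task.WaitAll(first, second);
        }
        catch (AggregateException ex)
        {
            // One of the two transactions gets chosen as the deadlock victim (SQL error 1205)
            foreach (var inner in ex.InnerExceptions)
            {
                Console.WriteLine(inner.Message);
            }
        }
    }

    private static void UpdateInOrder(string connectionString, string firstTable, string secondTable)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            {
                Execute(connection, transaction, "UPDATE " + firstTable + " SET Name = Name WHERE Id = 1");
                Thread.Sleep(2000); // give the other transaction time to lock its first table
                Execute(connection, transaction, "UPDATE " + secondTable + " SET Name = Name WHERE Id = 1");
                transaction.Commit();
            }
        }
    }

    private static void Execute(SqlConnection connection, SqlTransaction transaction, string sql)
    {
        using (var command = new SqlCommand(sql, connection, transaction))
        {
            command.ExecuteNonQuery();
        }
    }
}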
So the SqlAzureExecutionStrategy builds in the same retry logic that, until now, we've had to hand code (a la the example in the TechNet article or the Customer Advisory Team code sample).
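For comparison, that hand-coded pattern usually boils down to something like the sketch below. The ExecuteWithRetry name, the retry count and the fixed delay are all made up for illustration; this is not what EF6 does internally, just the kind of boilerplate the strategy replaces.

using System;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;

static class RetryHelper
{
    // Same transient error numbers as in the SqlAzureExecutionStrategy comment earlier
    static readonly int[] TransientErrorNumbers =
        { 40613, 40501, 40197, 10929, 10928, 10060, 10054, 10053, 233, 64, 20 };

    public static T ExecuteWithRetry<T>(Func<T> operation, int maxRetries = 3)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (SqlException ex)
            {
                var isTransient = ex.Errors.Cast<SqlError>()
                                    .Any(error => TransientErrorNumbers.Contains(error.Number));
                if (!isTransient || attempt >= maxRetries)
                {
                    throw;
                }
                // Crude fixed back-off; real hand-rolled versions typically use an exponential delay
                Thread.Sleep(TimeSpan.FromSeconds(5));
            }
        }
    }
}

You would then wrap each query in it, e.g. var customers = RetryHelper.ExecuteWithRetry(() => context.Customers.ToList()); which is exactly the ceremony the execution strategy now takes care of for you.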
I have no question that the strategy will work. I just really like to demonstrate before and after when I can, rather than just regurgitate what I've read somewhere. 🙂