Always Retry

When multiple people connect to a SQL Server database and query or update rows, blocking occurs. This is normal, and we expect a certain amount of it, usually very short-lived. When it isn't short-lived, having tools that monitor your database and can quickly tell you which connection to kill becomes important.

There are also issues like network hiccups and deadlocks, which can cause a transaction to fail and roll back. In these cases, the application should retry the query, often quickly, without bothering the user. Most developers don't code this into their applications, though they should. But is this really something that lots and lots of developers ought to learn and re-implement over and over?
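The retry logic developers keep hand-rolling is usually a small loop like the one below. This is a minimal sketch in Python for illustration; the names (`run_with_retry`, `TransientError`, `flaky_query`) are hypothetical, standing in for whatever transient-error type your database driver raises.

```python
import time

class TransientError(Exception):
    """Stands in for a transient failure such as a deadlock or dropped connection."""

def run_with_retry(operation, tries=3, delay=0.05):
    """Call operation, retrying a few times on transient errors before giving up."""
    for attempt in range(1, tries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == tries:
                raise  # out of retries, surface the error to the caller
            time.sleep(delay)

# Simulated query that fails twice, then succeeds on the third attempt.
calls = {"count": 0}

def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("chosen as deadlock victim")
    return "rows"

result = run_with_retry(flaky_query)  # succeeds on the third attempt
```

The user never sees the two transient failures; the loop absorbs them in a fraction of a second, which is exactly the behavior most applications want and few actually implement.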

Microsoft has released a preview of configurable retry logic in the SqlClient driver for .NET (with other drivers coming soon). With this enhancement, developers can tell the driver how to react to certain types of connectivity errors and, perhaps, resubmit the query. There are various options you can read about, and you ought to test carefully any options you decide to use before deploying them to production.
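The shape of a configurable retry policy is straightforward: a handful of knobs (how many tries, how long to wait, which errors count as transient) plus a loop that honors them. The real driver exposes its own option names; what follows is a Python sketch of the idea with hypothetical names (`RetryOptions`, `execute_with_policy`), using two real SQL Server error codes for illustration: 1205 (deadlock victim) and 40613 (Azure SQL database unavailable).

```python
import time
from dataclasses import dataclass

class QueryError(Exception):
    """Hypothetical driver error carrying a numeric server error code."""
    def __init__(self, code):
        super().__init__(f"server error {code}")
        self.code = code

@dataclass
class RetryOptions:
    """Stand-ins for the kinds of knobs a configurable retry provider exposes."""
    number_of_tries: int = 3
    delay_seconds: float = 0.05
    backoff: float = 2.0
    # Error codes treated as transient: 1205 (deadlock), 40613 (db unavailable).
    transient_errors: frozenset = frozenset({1205, 40613})

def execute_with_policy(query_fn, opts=None):
    """Run query_fn, retrying only errors the policy classifies as transient."""
    opts = opts or RetryOptions()
    delay = opts.delay_seconds
    for attempt in range(1, opts.number_of_tries + 1):
        try:
            return query_fn()
        except QueryError as err:
            if err.code not in opts.transient_errors or attempt == opts.number_of_tries:
                raise  # non-transient error, or out of tries
            time.sleep(delay)
            delay *= opts.backoff  # wait a little longer each time

# A query that is deadlocked twice, then commits.
attempts = {"count": 0}

def deadlocked_then_ok():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise QueryError(1205)
    return "committed"

outcome = execute_with_policy(deadlocked_then_ok)  # "committed" after two retries
```

The important design point, which the driver-level feature gets right, is that this policy lives in configuration rather than scattered through application code: you declare which errors are retryable and how patient to be, and every query benefits.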

To me, this is long overdue. Software ought to work for us and make our work easier, including easier for software developers. I've often heard vendors point to the configuration options and flexibility of their software, tool, or framework, while placing the burden on developers to write a lot of the code to actually take advantage of it. Whenever possible, we ought to make the preferred choice, the best practice, the most common code, easy to add to a system. Give people flexibility when they need it, but make it easy for them to see the benefits of your software quickly.

Steve Jones

Editor, SQLServerCentral

1 Response to Always Retry

  1. Greg Moore says:

    Similarly, I have most SQL jobs retry. Had one particular job that would fail like once every 100 runs. (I forget the details, but basically it could end up being blocked by something else).

    I got tired of going in in the morning and rerunning the job.

    After debugging and figuring out what the problem was and why it kept recurring, I did some math and figured that with a retry it would now fail completely only once every 10,000 runs (a 1-in-100 failure happening twice in a row), or about once every 27 years, and if it did fail a second time, it was probably a bigger issue.

    I figure in 27 years it’ll be someone else’s problem. But it solved mine and my client’s in the meantime.

