Here’s a simple question: how are more of your application issues detected, by people or systems? I bet most of you initially think of your monitoring systems and the automated messages or pages that are sent out regularly as detecting most, or even all, of your problems. Have you stopped to think how many times a phone call lets you know about an issue? Do you consider the ways in which a code review or human tester brings up a concern?
I try to think about all software problems, both the ones that reach production and the ones that are prevented early. If I catch a SELECT * in a view during development, I can prevent a problem months later when a table adds a column and no one refreshes the view in production. Those potential issues that never get to the customer are wins for me, and I think this is something we should be proud of as software engineers and testers.
Can you move the numbers, though? Is there a way to find problems before people find them? I think there is, and it’s with better monitoring and better testing. For monitoring, we need better, and more, instrumentation that measures what we expect, looks for deviations, and (low level) alerts someone. This is an area where I think machine learning and better analysis will help. Those ML models can be hard to setup, so I’m hoping that some individuals or projects will start some work here. Microsoft is doing some of this in Azure, and I hope they share some knowledge with us.
Testing is really the way to catch more issues before humans do. We’ve known this for decades in software development, but so many developers have been resistant to the idea of building some sort of formal test for their code. It’s not fun, it’s hard to maintain, and really, it’s just hard for most people to start writing tests.
I think things are getting better with testing frameworks that make building and executing tests easier. We have frameworks for all major application languages, and even quite a few for T-SQL. We’ve also learned more about the types of tests to write, which type to ignore, and how to avoid building so many brittle tests that testing is more work than coding features. If you know nothing about testing, you owe it to yourself to spend a little time learning about unit testing and practicing writing tests.
Now that we are collecting more and more data about our applications, we have the opportunity to really build software that better meets the goals and needs of our customers. However, we have to take advantage of this data, and the advances in testing, to ensure that we build the best software we can.