Today I had to do some debugging, and it reminded me of a book I read a while ago, so I'm writing a review of it. The book is "Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Software and Hardware Problems" by David J. Agans.
The book advocates a set of rules that make debugging more methodical, save you hours of arduous labor, help you avoid wild goose chases, and refine your analytical skills. Playing by these rules does not guarantee success in every case, but it will help you avoid the common pitfalls. The toughest challenge with debugging has always been that each issue (we call them bugs) has to be handled case by case, and the worst hazard is the assumptions and generalizations that creep into the investigation. These rules hone your investigation skills, and their value is evident both from the real-world examples quoted in the book and from documented investigations that went wrong.
The rules are not that hard, and they are listed below:
Rule 1: "Understand the system" You should read the manual, know the expected behavior, and know the road map and your tools. Ignorance leads to frustration, and you cannot control a system if you don't know how it is supposed to behave.
Rule 2: "Make it fail" A very common obstacle to understanding what is wrong is not knowing what it takes to make it fail. If you know the steps that reproduce a symptom, to the point that the failure happens deterministically and repeatedly, you have armed yourself with the knowledge of what to investigate. Start at the beginning, which means using a baseline or a control as the starting point, and then find the steps from that point on. The toughest part of this rule is that some bugs are not easy to reproduce. These often get labeled as timing issues, bad luck, lies and statistics, and so on. But this rule encourages you to persist and first learn how to make it fail.
Rule 3: "Quit thinking and look" The very first thing we do when we get a bug is put on our thinking caps. However sound our reasoning may seem, we don't know until we have taken a look. There may be data that surfaces only because we looked. Looking is hard, but it is important enough to deserve a rule of its own. It can be hard because the system is not transparent or is complex (we call such systems black boxes), but even so we may be able to open up logs or traces, or in some cases add instrumentation, to see what the system is actually doing.
Rule 4: "Divide and Conquer" The divide and conquer approach narrows the search space so that we spend less time on what doesn't matter and more on what does. You can also fix the bugs you already know about first and silence the noise, so that you can take a hard look at the bad bug. You can narrow things down by layer vertically or by component horizontally to get to the bug.
Rule 5: "Change one thing at a time" The rules above are about observing the system and learning from it, including ways to get more information out of the system than it normally gives you. This rule is about the prudence of making one change at a time. The equivalent phrases for this are "use a rifle, not a shotgun" and "grab the brass bar with both hands". The recommendation is to change one thing at a time and compare the results.
Rule 6: "Keep an audit trail" Write down every change you make to the system, in order, along with the results. Doing this helps you correlate events, revisit ground you have already covered, and learn from it. The equivalent phrase for this is "the shortest pencil is longer than the longest memory".
Rule 7: "Check the plug" This rule helps you avoid making assumptions or, even better, question them. You should not start from square three but from square one. If the power switch is not on, you are not going to see any response. "Check the plug" is a rule that says to go about your steps methodically from the beginning, and that the starting point includes checking the basics.
Rule 8: "Get a fresh view" As in a game of chess, we sometimes pigeonhole ourselves into one way of looking at the very thing we want to solve. Sometimes we are too shy to ask for help or consult an expert. We don't have to be sure of anything; we can report the symptoms and get a fresh view.
Rule 9: "If you didn't fix it, it ain't fixed" This rule says that if you think you made a fix and the symptoms stopped, you cannot rest until you know that it's really fixed and that it's your fix that fixed it. Ignoring this rule means you are pretending that the problem has gone away. It never just goes away by itself. Fix the cause and the process, but be doubly sure that it's your fix that fixed it.
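To make Rules 4 and 6 a little more concrete, here is a minimal sketch of my own (it is not taken from the book). It assumes you have an ordered list of candidates, for example build numbers, where every candidate before some point is good and every one from that point on is bad, and that the last one is known to fail. The names `find_first_bad`, `is_bad`, and the build numbers are all hypothetical. The function halves the search space on each probe (divide and conquer) and records every probe and its result in order (the audit trail).

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def find_first_bad(
    candidates: Sequence[T],
    is_bad: Callable[[T], bool],
) -> Tuple[T, List[Tuple[T, bool]]]:
    """Binary-search for the first bad candidate, keeping an audit trail.

    Assumptions (not checked here): candidates are ordered so that all
    good ones come before all bad ones, and the last candidate is bad.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    trail: List[Tuple[T, bool]] = []   # every probe and its result, in order
    lo, hi = 0, len(candidates) - 1    # candidates[hi] is assumed to be bad

    while lo < hi:
        mid = (lo + hi) // 2
        result = is_bad(candidates[mid])
        trail.append((candidates[mid], result))
        if result:
            hi = mid        # first bad candidate is at mid or earlier
        else:
            lo = mid + 1    # first bad candidate is after mid

    return candidates[lo], trail


# Hypothetical example: builds 1..16, where the bug first appears in build 11.
builds = list(range(1, 17))
first_bad, trail = find_first_bad(builds, lambda build: build >= 11)

print("first bad build:", first_bad)
for build, failed in trail:
    print(f"probed build {build}: {'FAIL' if failed else 'ok'}")
```

In this made-up run, four probes are enough to single out one build among sixteen, and the printed trail shows exactly which builds were tried and in what order. In real life `is_bad` would be whatever reliably makes the failure happen (Rule 2), which is usually the hard part.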