How To Find A Bug Without Pulling Your Hair Out
Working with code you’re unclear about is like wading through a swamp. You should attempt to get yourself on solid ground as quickly as possible.
– Jon Skeet
One of the most frustrating things about software bugs is the pressure that comes with them: a bug report lands, someone is breathing down your neck to get it fixed, customers are upset, and everyone thinks it should be easy to find and fix.
We’re lucky if the problem is precisely where the error says or the user’s bug report has enough details. Often we are stuck digging into the code and trying things ourselves to figure out where the problem is. It’s rarely as easy as jumping to the line of code everything points to and changing something. We need to find the “root cause” to make sure we aren’t just fixing the symptoms of another bug.
Root cause: the initiating cause of a bug, as opposed to the symptoms it produces.
Debugging is just like investigating a case. There are some initial steps you need to take. You can’t investigate a crime if you don’t know what it is and where it occurred. The same thing applies to fixing a bug.
Time to put on our detective hats to find the root cause.
Gather Details
Before trying anything, before changing any code, you need to gather as many details as possible. Review the bug report or error, then start asking questions to uncover missing details. If it is a bug report, you can ask the user, QA tester, or whoever submitted it. If it’s an automated error report, you may need to ask yourself these questions by digging in the code, your systems, and online.
Here are some of the questions you should ask:
Is it easily reproduced?
If so, what are the exact steps to reproduce it?
Have others run into this issue?
What device and system did the error occur on?
What changed recently? Any code deployments, configuration, or hardware changes?
What should the code do when it works successfully?
Is there any validation, error handling, etc., in place that should be running?
In addition to asking questions, you also want to make sure to research any additional information that could help you.
Check logs and error reporting software for related errors and events that happened around the same time.
Google any error messages, both to make sure you understand them and to see if others have found solutions.
Check git logs and deployment history for recent code and configuration changes, and find out whether any hardware changed. Keep any recent change in mind, even if it seems unrelated.
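As a rough illustration of checking logs for events around the time of an error, here is a minimal Python sketch. It assumes a made-up log format where each line starts with an ISO timestamp followed by a space. The function name `events_near`, the sample lines, and the five-minute window are all hypothetical, so adjust them to whatever your logging setup actually produces.

```python
from datetime import datetime, timedelta


def events_near(log_lines, error_time, window=timedelta(minutes=5)):
    """Return the log lines whose timestamp falls within `window` of `error_time`."""
    nearby = []
    for line in log_lines:
        # Assumed format: "<ISO timestamp> <rest of the line>"
        stamp, _, _rest = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if abs(when - error_time) <= window:
            nearby.append(line)
    return nearby


# Hypothetical sample data, not taken from a real system.
sample_log = [
    "2024-05-01T11:40:00 INFO nightly job finished",
    "2024-05-01T11:58:12 WARN config reloaded",
    "2024-05-01T12:00:03 ERROR checkout failed",
    "Traceback (most recent call last):",
    "2024-05-01T12:30:00 INFO cache warmed",
]
error_time = datetime(2024, 5, 1, 12, 0, 3)

for line in events_near(sample_log, error_time):
    print(line)
```

With this sample data, the config reload shortly before the error shows up next to the error itself. That is the kind of "seems unrelated" change worth keeping in mind.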
Reproduce The Problem
After you learn everything you can about the error, the next step is to reproduce it yourself. Ideally, you will reproduce this in the local environment you usually code in. The fewer chances you have to break other things while debugging, the better.
One of the worst things we can do when debugging is to change code or ship a fix for a bug we couldn’t reproduce ourselves. Without a reproduction, we can’t confirm that the fix works. This is why, if we can’t reproduce it locally, we want to try to reproduce it in production safely.
We need to be careful of the potential issues and requirements we need to maintain in production, but sometimes the only choice is to debug in production. I’ll write another article soon giving tips to limit needing to debug in production and how to do it successfully.
Reproducing the bug is easy if you have steps from a bug report or error reporting tool. If you don’t have the steps, you will need some trial and error until you figure out the exact steps to reproduce the bug. Reproducing bugs gets easier as you learn more about a system.
Once you know the steps required to reproduce a bug, try to reduce them to the absolute minimum. The fewer steps needed to reproduce a bug, the quicker fixes can be tested. This is also a significant benefit when writing automated tests (if you have them) for the changes: fewer steps mean less setup and tests that are easier to write.
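Reducing steps can be done by hand, but the idea is simple enough to sketch in Python. The version below is a naive greedy approach: drop one step at a time and keep the shorter list whenever the bug still shows up. It is not a full delta-debugging algorithm. The `reproduces` function is a hypothetical stand-in for actually running the steps, and the step names are invented for illustration.

```python
def minimize_steps(steps, reproduces):
    """Greedily drop steps that are not needed to trigger the bug."""
    current = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if reproduces(candidate):
                # The bug still appears without this step, so leave it out.
                current = candidate
                changed = True
                break
    return current


def reproduces(steps):
    # Stand-in for really running the steps: this pretend bug appears only
    # when an item is added to the cart and a coupon is applied afterwards.
    return (
        "add item" in steps
        and "apply coupon" in steps
        and steps.index("add item") < steps.index("apply coupon")
    )


reported = [
    "log in",
    "open profile",
    "add item",
    "change currency",
    "apply coupon",
    "check out",
]
print(minimize_steps(reported, reproduces))  # ['add item', 'apply coupon']
```

In a real project, `reproduces` would run the steps against your application, so each check can be slow. Even then, getting six steps down to two pays off every time you retest a fix.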
If you don’t have automated tests, I highly recommend investing the time to set them up and add tests for all your code. There are plenty of resources for each platform and language to get started with automated testing. Automated testing offers many benefits, including speeding up debugging and ensuring bugs don’t come back later.
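To show how a test keeps a bug from coming back, here is a small sketch using Python’s built-in `unittest` module. The `apply_coupon` function and the bug it guards against (a coupon over 100% producing a negative total) are invented for this example. In your own code the test would import the real function instead of defining it alongside the test.

```python
import unittest


def apply_coupon(total_cents, percent_off):
    """Hypothetical function under test: price after a percentage discount."""
    if not 0 <= percent_off <= 100:
        raise ValueError("percent_off must be between 0 and 100")
    return total_cents - (total_cents * percent_off) // 100


class ApplyCouponRegressionTest(unittest.TestCase):
    def test_full_discount_gives_zero(self):
        self.assertEqual(apply_coupon(1999, 100), 0)

    def test_out_of_range_discount_is_rejected(self):
        # The imagined bug report: a 150% coupon produced a negative total.
        with self.assertRaises(ValueError):
            apply_coupon(1999, 150)
```

Saved in a test file, this can be run with `python -m unittest`. The second test encodes the minimal reproduction steps, so if the bug ever returns, the test fails before a user has to report it again.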
Ask Someone Else
One of the first steps should always be to ask if someone else ran into this bug before. Don’t be afraid to ask QA testers, other developers, and sometimes even users about the bug. Asking can be the quickest way to fix a tricky bug if someone is already familiar with it.
You never know who may have run into the problem before. There is a chance this person will:
Know how to fix it.
Have a workaround that can be used for now.
Know whether a permanent fix has already been filed.
Have already completed research that helps speed up debugging.
By tackling each of these steps first, you will find fixing bugs to be less stressful. When you know the root cause and how to reproduce a bug, fixing it becomes a much quicker and less painful process.
If you would like more tips and tricks to level up your debugging and make fixing bugs less stressful, my book Level Up Your Debugging is up for early access pre-orders on Gumroad.