I love great debugging stories. There's the classic one about the uninsulated floor tile, of course, but I have a good one of my own to share.
This takes us back a decade or more to when I was a senior in college. A team of six of us were building a race car both to compete in Formula SAE and to serve as our senior project. My responsibility was the engine computer, electronics, and wiring. It was a small team, though, so we typically worked in pairs or threes on different parts of the car.
One of my teammates was responsible for building and tuning the engine. My engine computer was ready right around the time his engine was built, so we spent a few hours installing everything onto the test stand. We were both delighted that the motor actually turned over, but also surprised at how poorly it ran!
An hour or so of tuning later, we finally had the engine idling nicely. But then it died inexplicably! We fiddled with the numbers again, but it seemed we had already found the best timings for idling, so that wasn't the problem. But time and time again the engine died. Sometimes it died after a few seconds. Other times we got 30 seconds of measurements out of it. On the rare occasion, it seemed to work perfectly.
My teammate and I went back and tore our components apart and rebuilt them piece by piece, checking everything. We did find some little issues, so that was a good exercise in and of itself, but ultimately we still had the same problem.
After the second day this happened, I set up a few scopes to monitor different parts of the electronics. Each time the engine died, the ECU had a voltage spike and then settled into a safety state, requiring a reboot. The ECU, not the engine, was to blame.
But for days I couldn't find the source of the issue. I focused on interference and power conditioning, figuring that maybe (hand-wavy) some magnetic field from the motor was causing a ground loop in the ECU. But shielding the ECU and the cabling didn't help. Using batteries and other conditioned power sources didn't help. I changed every single wire and cable at least twice, checking each one. The ground was a literal earth ground: I used the iron engine stand, which itself was embedded in the concrete, in the workshop basement, underground.
I was at my wits' end. My teammate was understandably frustrated with me. He needed to run the engine a lot, at different loads and throttle positions, in order to design the spark and fuel timing maps. But because of my electronics, his engine wouldn't run reliably for more than a minute or so.
After another failure, and feeling completely lost, I said to him, "I don't know, man, the only thing it could be is magic." And right at that very moment a familiar odor drifted in. The magical scent of ozone.
He and I charged into the room next door to find a third teammate happily welding away, working on the car's frame. There was no way the welding rig could interfere with the electronics, right? Not through a couple of walls and 30 feet of earth ground?
We asked him to stop welding and started the engine again. It ran. And it kept running quite happily. We waited a few minutes until we were satisfied, then asked him to weld something. The engine died instantly. It turns out the welding rig could interfere with our electronics through a few walls and 30 feet of earth ground.
To this day I'm still not 100% sure if the interference was RF or a ground current. I think a ground current (some may call it a ground loop) is much more likely, but the fact that our welding rig could dump that much current into ground was pretty astonishing at the time.
Anyway, just goes to show that even when you think you have a handle on your environment, you may not.