I wanted to share a funny story that happened to us in the office today: I’m a new developer on the Typemock team, and after running all the unit tests locally, going through peer code review, etc., I happily committed my first changes to our source repository.
Once our continuous integration system caught up with the changes, it started the build process, and then everything broke: the build got stuck for over an hour and the tests would not run.
Naturally I got that nasty feeling in the pit of my stomach: it’s my 4th day at the company and I killed the build server. After briefly imagining my conversation with Lior the following morning, I grabbed Menahem to help and we started investigating. As the build process worked perfectly on my machine, we performed a rollback on the build server and tried to build again. No luck. Aside from the immediate relief I felt (as it was not my code breaking the build), we were still left with a broken build.
After investigating further, we found the culprit: a C++ project in our codebase was taking ten times longer to build than usual. Other C++ projects compiled just fine, as did the .NET projects. We (the entire dev team by now) speculated about different reasons for the C++ compiler to misbehave, but came up with nothing. Finally, we got around to manually going over the project configuration and comparing it to other projects, and then we nailed it: it turns out the C++ project configuration had additional include directories pointing to “d:\software\something”. This is probably a leftover from way back; all the company’s computers now have a single c: drive, and d: is a CD-ROM. This means that each time the preprocessor encountered an #include directive (a couple of dozen times) it went to the CD-ROM drive to look for the files! This took 1-2 minutes for each #include, bogging down the build process and ultimately causing it to fail. After we removed these include directories, the build process went back to normal.
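A check like the following could have flagged the stale setting before it bit us. It is only a minimal sketch, not something we actually run: the function name `find_suspicious_includes`, the `allowed_drives` parameter and the sample path are all made up for illustration. It assumes the include directories live in an `AdditionalIncludeDirectories` setting, as they do in Visual C++ project files, and it simply reports any entry that names a drive (or network share) other than the ones you expect.

```python
import re
from pathlib import PureWindowsPath

# The setting appears as an element in MSBuild-style .vcxproj files and as an
# attribute in the older .vcproj format, so this sketch looks for both forms.
_INCLUDE_PATTERNS = [
    re.compile(
        r"<AdditionalIncludeDirectories>(.*?)</AdditionalIncludeDirectories>",
        re.S,
    ),
    re.compile(r'AdditionalIncludeDirectories="([^"]*)"'),
]


def find_suspicious_includes(project_text, allowed_drives=("c:",)):
    """Return include entries rooted on a drive outside allowed_drives.

    Relative paths and macro-based entries such as $(ProjectDir)include have
    no drive component and are skipped. UNC shares count as a "drive" here,
    so they are reported too unless explicitly allowed.
    """
    suspicious = []
    for pattern in _INCLUDE_PATTERNS:
        for match in pattern.finditer(project_text):
            # Entries are separated by semicolons (commas in some old files).
            for entry in re.split(r"[;,]", match.group(1)):
                entry = entry.strip().strip('"')
                if not entry:
                    continue
                drive = PureWindowsPath(entry).drive.lower()
                if drive and drive not in allowed_drives:
                    suspicious.append(entry)
    return suspicious


if __name__ == "__main__":
    # Hypothetical project fragment, for illustration only.
    sample = (
        "<AdditionalIncludeDirectories>"
        r"$(ProjectDir)include;D:\legacy\include;%(AdditionalIncludeDirectories)"
        "</AdditionalIncludeDirectories>"
    )
    for entry in find_suspicious_includes(sample):
        print("suspicious include directory:", entry)
```

Because it only inspects the text of the project file, the check gives the same answer on any machine, whether or not there happens to be a disc in the drive.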
The only mystery left was: what had changed? These unnecessary include directories had been in the configuration for a while, and only today did they break the build… It turns out that someone installed something on the build server yesterday and left a CD in the drive. In the past the preprocessor looked for the include directories, found that they were on an unavailable drive and gracefully ignored them. Because there was a CD in the drive this time, each lookup caused the CD to spin up and seek files that were not there, taking a minute or two before the build process recovered each time…