If you run an infrastructure, there’s a good chance you have some debt tucked in your system somewhere. There’s also a good chance that you’re not getting enough time to fix those debts. There will most likely be a good reason why something is done in the way it is. This is just how things are in general. After I joined Gluster, I’ve worked with my fellow sysadmin to tackle our large infrastructure technical debt over the course of time. It goes like this:
That is in no way and exhaustive list. But we’ve managed to tackle 2.5 items from the list. Here’s what we did in order:
If I look at it, it almost looks like I’ve failed. But again, like dealing with most infrastructure debt, you touch one thing and you realize it’s broken in someway and someone depended on that breakage. What I’ve done is I’ve had to pick and prioritize what things I would spend my time on. At the end of the day, I have to justify my time in terms of moving the project forward. Fixing the infrastructure debt for Gerrit was a great example. I could actually focus on it with everyone’s support. Fixing Jenkins was a priority since we wanted to use some of the newer features, again I had backing to do that. Moving things to our hardware is where things get tricky. There’s some financial goals we can hit if we make the move, but outside of that, we have no reason to move. But long-term, we want to me mostly in our hardware, since we spent money on it. This is, understandably going slow. There’s a subtle capacity difference and the noisy neighbor problem affects us quite strongly when we try to do anything in this regard.