Tuesday, December 10, 2013

Pulling The Plug - Software End-of-Life

Recently, the company I work for underwent a serious hardware overhaul, upgrading a large swath of 32-bit servers to 64-bit processors and updating to the latest operating system and application versions. All in all, this upgrade went very smoothly. We planned out the transition and tested our software in our development environment before deploying to production. We ironed out a few issues that cropped up within the first few days. The problems ranged from minor incompatibilities to simple human error. Unfortunately, hiding in the shadows was a long-forgotten application that was still in use by a handful of clients.

This piece of software is the kind of application legends are made of. Written many years ago, it had performed well enough to be forgotten about and ignored. The original developer and documentation are long gone, and it took a great deal of asking around just to find out who used this software and why. Unfortunately, the software, which had been running smoothly for years, did not transition well onto the new servers.

One of our developers (not me) was tasked with getting this service up and running on the new architecture. After spending a few hours digging through the old source code he was able to get it compiling and passing a few basic test cases. We figured that we could move ahead with a production deployment and shift our focus to more pressing matters. We figured wrong.

The next day, the support calls came in once more. Our software worked well enough for a few hours, but something had gone wrong overnight. The application was fully locked up and unresponsive to any requests. A quick restart did the trick, but this was only a temporary band-aid fix and was not acceptable as a long-term solution. The same developer began trying to reproduce the behavior and fix the problem for good. This became the routine over the next few days.

Now, the part I haven't yet told you is that this system has long since been replaced with a newer, shinier, and more fully-featured solution. The new solution is in active use by many customers and receives regular maintenance and enhancements. Only a handful of customers remained on the old system. Their reason for staying was simple: don't fix what ain't broke. The old system had continued to serve their needs for all these years with no updates or known issues.

The way we saw things, we were left with three choices:
  • Install the old server hardware back into our production systems to support these customers
  • Assign developers to debug the old code and get it working on the new hardware
  • Offer these customers the choice between upgrading to the new system and losing this service (with a reasonable amount of time to decide and transition their systems)

The idea of reverting to the old hardware was dismissed pretty quickly. The reasons we decided to upgrade our hardware were related to security, consistency, and future maintenance. We did not want to have to support 31 flavors of hardware in our production environment.

The choice between the two remaining options seemed to come down to simple math. Weigh the revenues against the costs and you'll find the correct decision. Is the math really so simple? How do you come up with a formula for this? I don't know the answer myself, but I'll take a shot at offering a few observations.

Let's start with some obvious factors. Software developers don't come cheap, and having developers spend a few days fighting with a bug can quickly run into the hundreds and even thousands of dollars. Estimate the required effort in time and multiply that by the developer's pay rate to come up with the cost of fixing this bug. On the other side of the ledger, losing a paying customer is contrary to the goals of running a successful business. Continuing to support this solution means that the company can continue to feed off of these revenue streams. Keep in mind that these customers have been using these services with minimal maintenance for many years. Multiply the regular fees by the expected service time to calculate revenue. As with all things, estimates may not accurately reflect reality.
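For what it's worth, here is a minimal sketch of that back-of-the-envelope math in Python. The function names and every figure in it (hours, pay rate, monthly fee, service time, customer count) are made up for illustration; they are not the numbers from our actual decision.

```python
# All figures below are hypothetical placeholders, not real numbers
# from the decision described in this post.

def fix_cost(developer_hours, hourly_rate):
    """Cost of the fix: estimated effort multiplied by the developer's pay rate."""
    return developer_hours * hourly_rate


def expected_revenue(fee_per_month, months_of_service, customer_count):
    """Revenue: regular fees multiplied by the expected service time, per customer."""
    return fee_per_month * months_of_service * customer_count


def net_value_of_fixing(developer_hours, hourly_rate,
                        fee_per_month, months_of_service, customer_count):
    """Naive estimate: expected revenue minus the one-time cost of the fix."""
    revenue = expected_revenue(fee_per_month, months_of_service, customer_count)
    return revenue - fix_cost(developer_hours, hourly_rate)


if __name__ == "__main__":
    estimate = net_value_of_fixing(
        developer_hours=24, hourly_rate=75,          # assumed effort and pay rate
        fee_per_month=200, months_of_service=36,     # assumed fees and service time
        customer_count=5,                            # assumed "handful" of customers
    )
    print(f"Naive net value of fixing the bug: ${estimate:,}")
```

Swapping in your own guesses for the hours, rates, and fees gives a rough first number to argue about.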

Simple revenues and resource costs don't really cover some of the more complicated factors. If cancelling the service means losing a customer, it may also mean losing future streams of revenue from that customer. When the customer needs to expand their services in the future, who will they turn to? Will they look to maintain a relationship with a company that they already know and trust? If so, terminating that relationship today may mean waving goodbye to new opportunities in the future. What about word of mouth? Damaging your company's reputation may have wide-ranging effects through one of the most powerful and underestimated forms of marketing. What about future bug fixes and maintenance? Who is to say that the next round of updates won't expose more bugs, turning the old system into a money pit of maintenance work for only a handful of customers? What about the confusion caused by the need to support two similar but disparate systems?

To be honest, I am no expert on this matter. The choice was not mine to make, and in the end the company decided to pull the plug on this legacy application. From a development perspective, I was pleased with this choice. Out with the old and in with the new, I say! I'd much rather have everyone up to date and on the same platform instead of trying to support a creaky old piece of legacy software. From the perspective of a businessman, I'm still not sure.

Have you or your company pulled the plug on legacy software? What were the circumstances? What triggered your decision? What factors did you consider before making the decision? Let me know in the comments.

Cheers,

Joshua Ganes