Reviving Legacy Applications Without Starting from Scratch
Many legacy systems and software development projects reach a point where there is so much technical debt and organisational knowledge loss that it becomes almost impossible to implement new features. I like to dub these “Frankenstein’s Monster” systems because they have grown over the years, with scope creep and parts being sewn on and adjusted far beyond what the system was originally designed for. The codebase has been hacked around repeatedly until it reaches the point of no return.
A solution to move forward!
Teams have often tried to rework parts of the application, but this normally ends up affecting things you weren’t expecting, causing more headaches. Maybe wrapping the existing code with unit tests before making the change is the answer, but the code wasn’t designed to be testable in the first place! Maybe you should just lock the development branches and not touch it anymore (if it uses code repositories at all). How do you make code changes to a system that is impossible to maintain, without making it worse?
Why “rewriting from scratch” doesn’t work
Code can be risky to change and expensive to refactor. In this situation, it might seem logical to rewrite it from scratch. This is normally how it goes:
After many long, intense meetings, you convince Leadership that stopping new features and rewriting the existing app is the right strategy.
You estimate that the rewrite will take 6 months to cover existing functionality of the app.
A few months in, a huge bug is discovered and needs an urgent, crucial fix in the old codebase. So, you patch the old and new systems.
A few weeks later, a new feature has been sold to the client. It CANNOT WAIT for the new system and must be implemented in the old code. You need to spend time adding it to the old system and add a TODO to implement the feature in the new system.
5 months in, you realise that the project will be late. The old system was doing way more things than you realised and affected things nobody knew about.
After 7 months of intense development, you start testing the new system. QA raises a lot of things that need fixing.
9 months in, the business can’t stand “not developing features” anymore. Leadership is not happy with the situation; you are tired and burned out. You are forced to start making changes to the old system whilst also trying to keep up with the rewrite.
Eventually, both systems end up in production. The long-term goal is still to retire the old system, but the new system is not ready yet. Every feature needs to be implemented twice.
For many, this may sound like an unrealistic, fictional scenario. For those of us who have lived through this or have inherited “2 systems in production”, this is very familiar. It’s a very common mistake that can only be identified as one in hindsight.
A real-life scenario
A client of ours had 2 systems working in parallel: cart and booking. In fact, booking was supposed to replace cart.
The project started a few years ago but was never finished. booking is better than cart but is not as complete. Some flows use booking, while others still use cart. In this case, new features cost twice as much to implement.
The real fun part of this scenario is that, because cart was not designed to support the new features we want and booking is too out of date, it was suggested that we “rewrite the cart system properly”. If we went down that road, we’d soon have 3 systems running in parallel in production, all trying to do the same thing. We won’t go there. This is where an efficient technique for working around a legacy system comes in.
Using “The ship of Theseus” to rewrite a legacy codebase.
The Ship of Theseus is a thought experiment: if parts of a ship are replaced as they wear or rot, is it still the same ship once all the original parts have been replaced? If you progressively replace your codebase, can the users tell?
We can apply this thought experiment to rewriting our legacy system. The strategy is simple: progressively delete the old codebase in favour of the new one.
The goal is to avoid the pitfall of a never-ending rewrite and instead take an incremental approach.
Here is the plan:
New code acts as a proxy for the old code. Users use the new system, but it redirects to the old system without the users knowing.
Re-implement each behaviour in the new codebase without a change from the end user’s perspective.
Progressively fade away the old codebase by making users consume the new behaviour. Delete the old, unused code.
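The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StranglerFacade`, the `quote` and `refund` operations and the handler functions are hypothetical names invented for the example, and the `source` field exists purely to show which codebase answered.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]


class StranglerFacade:
    """Single entry point that users call instead of the old system (hypothetical)."""

    def __init__(self, legacy_handlers: Dict[str, Handler]) -> None:
        # Step 1: every operation starts out proxied to the old codebase.
        self._legacy = dict(legacy_handlers)
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, operation: str, new_handler: Handler) -> None:
        # Step 2: a behaviour has been re-implemented in the new codebase.
        if operation not in self._legacy:
            raise KeyError(f"unknown operation: {operation}")
        self._migrated[operation] = new_handler

    def handle(self, operation: str, request: dict) -> dict:
        # Prefer the new implementation; otherwise fall back to the old one.
        handler = self._migrated.get(operation) or self._legacy[operation]
        return handler(request)

    def still_on_legacy(self) -> List[str]:
        # Step 3: operations listed here still need the old code;
        # the old code behind every other operation is unused and can be deleted.
        return sorted(set(self._legacy) - set(self._migrated))


# Hypothetical old-system handlers, standing in for real legacy code.
def legacy_quote(request: dict) -> dict:
    return {"total": request["price"] * request["quantity"], "source": "legacy"}


def legacy_refund(request: dict) -> dict:
    return {"refunded": request["amount"], "source": "legacy"}


# Hypothetical re-implementation that produces the same total.
def new_quote(request: dict) -> dict:
    return {"total": request["price"] * request["quantity"], "source": "new"}


facade = StranglerFacade({"quote": legacy_quote, "refund": legacy_refund})
before = facade.handle("quote", {"price": 10, "quantity": 3})
facade.migrate("quote", new_quote)
after = facade.handle("quote", {"price": 10, "quantity": 3})
```

Callers only ever talk to `facade.handle`, so switching `quote` from the old handler to the new one needs no change on their side, and `still_on_legacy()` gives a rough view of how much of the old system is left to replace.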
What it looks like in practice
Consider our system mentioned previously. We had a cart module that used to handle payments.
A rewrite was attempted. The idea was to create a new and shiny booking module that would handle payments far better than cart. This project was never fully delivered. The rewrite took too long, and we had to keep developing new features on the old cart.
Eventually, both modules ended up in production.
Let’s try that again, progressively replacing the cart module instead.
We can introduce the new booking module as a proxy.
It would be relatively easy to set up and deliver to production, without duplicating the payment processing logic. Then, progressively, we could start migrating the payment logic to the new booking module.
As we migrate the logic, we get rid of the unused code in the cart module.
This can take time. But progressively, we move toward the goal of replacing the old, unmaintainable cart with the new, shiny booking.
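To make the two stages concrete, here is a rough sketch of what booking could look like first as a pure proxy and then after one piece of payment logic has moved across. The real cart and booking modules are not shown in this article, so every class and method name below (`CartModule`, `BookingProxy`, `BookingWithOwnTotals`, `calculate_total`, `charge`, `pay`) is an assumption made up for illustration.

```python
class CartModule:
    """Stand-in for the old cart module (hypothetical, heavily simplified)."""

    def __init__(self):
        self.charges = []

    def calculate_total(self, items):
        return sum(price * quantity for price, quantity in items)

    def charge(self, customer_id, amount):
        self.charges.append((customer_id, amount))
        return {"customer": customer_id, "charged": amount}

    def pay(self, customer_id, items):
        return self.charge(customer_id, self.calculate_total(items))


class BookingProxy:
    """Stage 1: booking is the entry point but owns no payment logic yet."""

    def __init__(self, cart):
        self._cart = cart

    def pay(self, customer_id, items):
        # Pure delegation: nothing is duplicated, cart still does all the work.
        return self._cart.pay(customer_id, items)


class BookingWithOwnTotals(BookingProxy):
    """Stage 2: totals are computed in booking; charging is still done by cart.

    Written as a subclass only so both stages fit in one listing. In a real
    codebase this would simply be the next version of the same class.
    """

    def pay(self, customer_id, items):
        total = sum(price * quantity for price, quantity in items)
        return self._cart.charge(customer_id, total)


items = [(20, 2), (5, 1)]  # (price, quantity) pairs
stage1 = BookingProxy(CartModule()).pay("c-1", items)
stage2 = BookingWithOwnTotals(CartModule()).pay("c-1", items)
```

Both stages return the same result for the same input. Once stage 2 is live, nothing calls `CartModule.pay` or `CartModule.calculate_total` anymore, so in this sketch they are the first pieces of cart that could be deleted.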
Phase out > Rewrite
The benefit of this method is that it solves the problem of delivering new features whilst also rewriting the old codebase.
No duplication of features between the old system and the new
The new system is put into production as soon as possible
Feedback is received sooner rather than later, which means less work and less chance of things breaking
The rewrite can be done gradually, with no need to freeze new features for months
This is not new
This is referred to as the “Strangler Fig” pattern. Coined by Martin Fowler, it refers to “[the huge strangler figs that] grow into fantastic and beautiful shapes, meanwhile strangling and killing the tree that was their host.”
The idea is to slowly get rid of the old system, rather than a complete cut-over, which is far riskier.
This approach is also advocated by Michael Feathers in “Working Effectively with Legacy Code”.
The Wrap Class technique is a way to add new behaviour to the system without changing existing code. You wrap the existing code in a new class and add the new behaviour around it.
It puts some distance between new responsibilities and old ones. It can be the first step towards a better design when the old code is particularly hard to work with.
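As a minimal sketch of the idea (not code taken from the book), a wrapper might look like this. `LegacyPaymentProcessor`, `AuditedPaymentProcessor` and the audit log are hypothetical names chosen for the example.

```python
from typing import List


class LegacyPaymentProcessor:
    """Stand-in for existing code that we do not want to modify (hypothetical)."""

    def pay(self, customer_id: str, amount: int) -> str:
        return f"paid {amount} for {customer_id}"


class AuditedPaymentProcessor:
    """Wrapper with the same pay() signature that adds behaviour around the old call."""

    def __init__(self, wrapped: LegacyPaymentProcessor) -> None:
        self._wrapped = wrapped
        self.audit_log: List[str] = []

    def pay(self, customer_id: str, amount: int) -> str:
        # The new responsibility lives here, outside the legacy class.
        self.audit_log.append(f"payment requested: {customer_id} {amount}")
        result = self._wrapped.pay(customer_id, amount)  # old behaviour, untouched
        self.audit_log.append(f"payment done: {result}")
        return result


processor = AuditedPaymentProcessor(LegacyPaymentProcessor())
receipt = processor.pay("c-1", 45)
```

Because the wrapper exposes the same `pay` method, callers can be switched over to it one at a time while the legacy class stays exactly as it was.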
It is not always easy working with legacy systems. They were created by developers with the best intentions who had to ‘make it work’ over long periods of time and under a lot of pressure. Hopefully, the method above will make rewriting your legacy system easier until, one day, you can switch it off for good!
Hello, I'm Nathan Goosen, Software Development Tech Lead at First Digital. With over a decade in the technology sector, I've had the opportunity to contribute to a wide array of projects. Often considered the 'Swiss Army knife' of the tech world, my role spans coding to team leadership to business development and client management. I enjoy fostering an environment that encourages continuous learning and mutual respect, as it lays the foundation for any successful team.
“Working Effectively with Legacy Code” - Michael Feathers
“Strangler Fig” pattern - Martin Fowler
“Strangler Fig pattern” - MS Learn