At my current job, an error has been identified in one service in our live code base.
We have found a relatively small code change that would fix the issue, and it has been confirmed to work in a test environment.
However, this service is quite old, and there are plans to phase it out over the next 12 months or so and migrate everything to the newer services. Because of this, an architectural decision has been made to make no more changes to the current service. There are exceptions for extreme cases that need only minor config changes, but our fix is being classed as a bigger change.
The alternative fix is to migrate and redevelop the existing code in the new service. However, this is a much larger chunk of work that will need more extensive testing, and it also means the live production errors will remain until the work is done.
Has anyone encountered something like this before? What reasons would there be, in architectural terms, not to fix code that is currently running in your live system?
The time spent working on the fix has to be weighed against the time it would take to solve the problem in the new service.
The architects may decide that the time would be better spent developing a new, more robust service (as you say, everything will be migrated soon anyway) rather than working on the same thing twice in two different ways.
Another factor to take into consideration is that the current code base is old and may be difficult to work with. You mention that the fix works in a test environment, but is there anything to show that it will not end up breaking even more of your system unless a full suite of regression tests is run? That testing would mean still more time and effort spent on something that will be phased out soon.