Technical Summary
Key takeaways:

The article indicates that refactoring an industrial application makes sense when the cost and uncertainty of minor changes grow faster than their business value. The key is to distinguish structural cleanup from a technical change that affects the process or safety.

  • Refactoring a legacy application is about production continuity, costs, and accountability, not just code quality.
  • The risk increases when a change affects signals, states, the sequence of actions, or process transition conditions.
  • Seemingly technical changes can alter start, stop, fault reset, and the response to power loss and loss of communication.
  • If the sequences or responses of the protective circuits need to be revalidated, this is no longer routine code maintenance.
  • Safe refactoring requires defining the scope of the change, acceptance criteria, and a risk assessment for the process.

Why this matters today

Refactoring a legacy industrial application is no longer just a matter of code aesthetics or maintenance convenience. Today, it is a decision that affects production continuity, cost predictability, and the scope of responsibility on the system owner’s side. In many plants, control applications, operator tools, and communication layers have evolved over years without a single coherent architecture, often built around devices, libraries, and integration mechanisms with limited or discontinued support. That situation may be manageable for a time, but only until each further change starts costing more than the functionality it is meant to deliver. At that point, the question is no longer whether to touch the legacy application, but whether the organization still has control over its behavior under production conditions.

This matters because, in industrial systems, technical debt quickly turns into operational debt. If an application is hard to reproduce, depends on a few individuals, lacks reliable regression tests, or combines production functions with safety-related and diagnostic functions, then any incident will be more expensive than a similar issue in an office system. The impact is not limited to downtime. There is also the cost of maintenance work, the risk of incorrect workarounds applied under time pressure, difficulty demonstrating due diligence after a change, and the challenge of separating a pre-existing fault from the effects of the project team’s intervention. For a manager or product owner, the practical criterion is simple: if the time and uncertainty involved in deploying further minor changes are growing faster than their business value, the application has reached a point where refactoring must be a conscious decision, not something postponed until the first critical failure.

The most common mistakes arise when refactoring is treated as an upgrade “with no impact on the process,” even though it actually changes how the system makes decisions. In practice, even a seemingly minor intervention is enough: replacing a communication component, rebuilding the task schedule, changing the sensor data buffering logic, or tidying up the startup sequence after a restart. On paper, these look like technical housekeeping. On the shop floor, however, they can change signal timing, the order in which interlocks are released, the response to loss of communication, or the application’s behavior after a power loss. This is exactly where refactoring becomes a matter of practical change risk assessment: the point is not whether the code is “better,” but whether, after the change, the machine, line, or workstation still behaves predictably in normal operation, fault conditions, and after restart.

A good test of decision maturity is whether the team can define the boundary between a change to the application’s internal structure and a change to a function that is significant for the process or for safety. If that boundary cannot be described in terms of signals, states, and transition conditions, the project carries risk regardless of the contractor’s quality. In an industrial environment, situations are especially sensitive when the application is involved in startup, shutdown, fault reset, alarm acknowledgement, or interacts with energy isolation systems and interlocks. At that point, the issue is no longer just software architecture, but also protection against unexpected start-up and whether the analysis also covers the electrical installation, control logic, and dependencies between devices. This is exactly the point at which an apparently local refactoring effort stops being an IT task and becomes a technical change that requires a full decision-making regime.
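
The boundary test described above can be made concrete by writing the relevant fragment down as states and transition conditions and diffing the two versions. A minimal sketch; all state and signal names are hypothetical, and the point is the comparison, not the model:

```python
# Each transition: (source_state, required_signals, target_state)
BEFORE = {
    ("Idle", frozenset({"start_cmd", "guard_closed"}), "Starting"),
    ("Starting", frozenset({"pump_ready"}), "Running"),
    ("Running", frozenset({"stop_cmd"}), "Stopping"),
}

AFTER = {
    ("Idle", frozenset({"start_cmd", "guard_closed"}), "Starting"),
    # the "cleanup" quietly added a guard signal to this transition
    ("Starting", frozenset({"pump_ready", "valve_open"}), "Running"),
    ("Running", frozenset({"stop_cmd"}), "Stopping"),
}

def behavioral_delta(before, after):
    """Transitions added or removed by the change. An empty delta means
    the refactoring is structural only, at this level of description."""
    return (before - after) | (after - before)
```

A non-empty delta here shows that the change alters when the machine may enter Running, so it falls on the process side of the boundary regardless of how small the code diff looks.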

Reference to normative requirements becomes relevant only at this stage, because standards do not replace a design decision, but they do help define its scope. If a change may affect startup conditions, stopping conditions, recovery after a disturbance, or protective measures, it must be assessed as a risk change, not as routine code maintenance. If the intervention affects logic that interacts with energy isolation, interlocks, or the safe access sequence, it also naturally opens the area of requirements related to protection against unexpected start-up. From a liability perspective, the key issue is therefore not simply “whether to refactor,” but whether the organization can demonstrate that it understands the boundaries of the change, has acceptance criteria based on process behavior, and can distinguish between system cleanup and a modification that requires a full risk assessment and coordination with installation design and on-site testing.

Where cost or risk most often increases

The biggest cost increase when refactoring a legacy industrial application rarely comes from the code itself. The problem usually starts with misclassifying the change: the team treats it as a cleanup of the program structure, while in reality it changes the system’s behavior over time, the order of operations, or the conditions for transitions between states. In a production environment, that mistake has direct project consequences. The schedule no longer matches the actual scope, tests are planned around software functionality rather than process behavior, and responsibility for the outcome becomes blurred between maintenance, automation, and the software supplier. The practical criterion here is simple: if the change requires the startup sequence, shutdown sequence, recovery after a disturbance, or the response to signals from protective circuits to be confirmed again, then it is no longer a “safe refactoring” in organizational terms, but a change that may create production risk and require a different approval path.
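
The classification rule in the paragraph above can be written down explicitly, so that it is applied before the schedule is drawn up rather than after commissioning. A sketch; the category names and the set of process-critical behaviors are illustrative:

```python
# Behaviors whose re-confirmation moves a change out of "safe refactoring".
PROCESS_CRITICAL = {
    "startup_sequence",
    "shutdown_sequence",
    "disturbance_recovery",
    "protective_circuit_response",
}

def classify_change(must_reconfirm):
    """must_reconfirm: the set of process behaviors that would need to be
    confirmed again after the change. Any overlap with the critical set
    means the full approval path applies, not routine maintenance."""
    if set(must_reconfirm) & PROCESS_CRITICAL:
        return "risk_change"
    return "structural_refactoring"
```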

A second common source of rising cost is a design decision made without a complete picture of the dependencies. Legacy industrial applications are often tightly interwoven with controller configuration, actuators, visualization, data archiving, and operator procedures. In the documentation, this may look like a single system, but in practice these are layers developed over many years by different teams. A refactoring intended to improve code readability or make future maintenance easier can quietly change the meaning of delays, interlock conditions, default values after restart, or the way communication faults are handled. The result is then not just a technical defect to correct, but also the cost of downtime, additional on-site trials, and disputes over whether the defect already existed or was introduced by the change. That is why, before making a decision, it is worth assessing not the size of the modification itself, but the number and criticality of the interfaces: how many signals, recipes, operating modes, and operational workarounds depend on the section of code to be rebuilt. The more such interfaces there are, the less sense it makes to refactor “while you are at it” as part of another task.

In practice, the most expensive projects are often those where the team discovers the real requirements only during commissioning. A typical example is rebuilding a sequential module that, according to the description, “does the same thing, just more cleanly.” After deployment, however, it turns out that the previous version contained undocumented behaviors that compensated for imperfections in the installation: a brief signal hold, tolerance for a late sensor, a specific alarm reset order, or a condition that determines whether service access is possible. In the code, this looked like a bug or technical debt, but for the process it was part of the stabilization logic. If refactoring removes such mechanisms without understanding their function, the cost appears immediately: the number of post-startup interventions increases, acceptance takes longer, and the logic has to be recreated under the pressure of a running plant. That is why the value of refactoring should also be judged by whether the current system behavior can be reproduced. If the organization does not have an event log, reliable descriptions of operating modes, and test scenarios based on the real process, then the foundation for assessment must be built first, including hazard identification for the changed control logic, and only then should a rebuild decision be made.
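
The kind of compensating mechanism described above can be only a few lines long. A hypothetical sketch: a signal hold that reads like technical debt but tolerates a sensor that reports late:

```python
class HeldSignal:
    """Keeps a True reading alive for `hold_scans` further cycles, so a
    briefly dropped position sensor does not abort the sequence. In the
    code this looks like needless state; for the process it is part of
    the stabilization logic. (Illustrative only.)"""

    def __init__(self, hold_scans=3):
        self.hold_scans = hold_scans
        self.counter = 0

    def update(self, raw_reading):
        if raw_reading:
            self.counter = self.hold_scans
        elif self.counter > 0:
            self.counter -= 1
        return self.counter > 0

held = HeldSignal(hold_scans=3)
trace = [held.update(r) for r in [True, False, False, False, False]]
# trace == [True, True, True, False, False]: the hold bridges two scans
# of a dropped signal; a refactoring that removes it aborts the step.
```

Deleting `counter` as "dead state" would pass every code review and fail on the first slow sensor.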

This leads directly into a practical risk assessment of changes when the modification affects protective functions, safe access sequences, control of actuator movement, or the behavior of the installation after a power loss and restoration. In that scope, the cost of an error is not limited to programming corrections, because the question of responsibility for releasing the change for operation also arises. If the application interacts with hydraulic systems, pneumatic systems, or solutions such as two-hand control, then the line between refactoring and a technical change becomes even narrower and requires verification that the design assumptions of the protective measures have not been compromised. Only at this point is it justified to refer to structured risk assessment methods, including the approach used in practice under ISO/TR 14121-2, and, for hydraulic systems, also to the design requirements organized by EN ISO 4413. This is not about formalism for its own sake, but about a simple decision rule: if the change can affect process or operator safety, then its cost must be calculated together with validation, on-site testing, and the accountability regime, not solely with the programmer’s working time.

How to approach this in practice

In practice, the value of refactoring a legacy industrial application is judged not by the technological appeal of the change, but by whether it can reduce operational risk while keeping the implementation under control. For a manager or product owner, this means a simple shift in perspective: the question is not whether the code is “worth cleaning up,” but whether the application’s current state is genuinely making maintenance, testing, troubleshooting, or compliant implementation of further changes more difficult. If the answer is yes, refactoring makes sense, but only to the extent that it can be separated from production operations and assessed on the basis of measurable outcomes. A good decision criterion here is to compare two costs: the cost of leaving the application as it is, including downtime, diagnostic time, dependence on individual people, and the risk of an incorrect change, and the cost of a controlled rebuild together with testing, validation, and commissioning. Without that comparison, the project usually gets out of control, because the team ends up funding code cleanup from a budget intended for functionality, while responsibility for the consequences on the plant floor remains undefined.
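
The comparison of the two costs can be kept deliberately crude; what matters is that both sides are written down before the decision. A sketch with entirely hypothetical placeholder figures, not benchmarks:

```python
def yearly_cost_of_leaving_it(downtime_h, downtime_rate_per_h,
                              diagnosis_h, labor_rate_per_h,
                              incident_probability, incident_cost):
    # cost of the status quo: downtime, diagnostic time, incident risk
    return (downtime_h * downtime_rate_per_h
            + diagnosis_h * labor_rate_per_h
            + incident_probability * incident_cost)

def cost_of_controlled_rebuild(engineering_h, labor_rate_per_h,
                               validation_and_trials_cost):
    # rebuild cost including testing, validation, and commissioning
    return engineering_h * labor_rate_per_h + validation_and_trials_cost

status_quo = yearly_cost_of_leaving_it(40, 2_000, 120, 150, 0.2, 250_000)
rebuild = cost_of_controlled_rebuild(700, 150, 30_000)
payback_years = rebuild / status_quo  # roughly 0.9 with these inputs
```

If the payback horizon is acceptable to the business, the rebuild is fundable from an identified budget instead of being smuggled into feature work.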

For that reason, the first decision should not be “rewrite it” or “leave it alone,” but where to draw the boundary of the change. In a mature approach, the part that concerns only the software structure is separated from the part that affects process logic, start-up and shutdown sequences, operating modes, communication with drives, and behavior after a power disturbance. This distinction has a direct impact on both cost and organization. A change limited to the code-structuring layer can be carried out in a shorter cycle and with less involvement from maintenance. A change that affects the behavior of the machine or line already requires an on-site test plan, a service window, a rollback procedure, and a clear indication of who approves release for operation. It is also worth measuring not only the time needed for development work, but also the time required to restore the system after a failed attempt, the number of areas covered by regression testing, and the time needed to diagnose a deviation after start-up. These are the indicators that show whether refactoring is actually reducing project risk rather than simply improving the working comfort of the development team.

A practical example is typical of older control applications: the code contains multiple duplicated sections responsible for motion interlocks, alarm handling, and transitions between manual and automatic mode. The team wants to standardize them because the current layout makes further development difficult and causes inconsistencies between stations. That decision only makes sense after checking whether the standardization will change the conditions under which an actuator is permitted to move, and whether a different state restoration sequence will appear after a controller restart. If the application also controls valves, drives, or systems with stored energy, then even an apparently “internal” refactoring may move into the area of risk assessment under ISO 12100 and require analysis of protection against unexpected start-up in accordance with ISO 14118. In that case, a sensible approach is to carry out the refactoring in stages: first reproduce the behavior in a test environment, then separate the modules without changing the logic, and then verify the result on site with a prepared rollback scenario. This limits operational liability and makes it possible to stop the implementation before the problem affects production.
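
Step one of the staged approach, reproducing the current behavior before touching structure, is usually done with characterization ("golden") tests. A sketch, with `legacy_interlock` standing in for one of the duplicated sections; the modes and signals are hypothetical:

```python
def legacy_interlock(mode, guard_closed, drive_ready):
    """Behavior to be preserved exactly, including the quirk that manual
    mode permits motion before the drive reports ready."""
    if mode == "auto":
        return guard_closed and drive_ready
    if mode == "manual":
        return guard_closed
    return False  # unknown or fault mode: no motion

# Cases enumerated from the real process, restart states included.
GOLDEN_CASES = [
    (("auto", True, True), True),
    (("auto", True, False), False),
    (("auto", False, True), False),
    (("manual", True, False), True),
    (("fault", True, True), False),
]

def matches_golden(candidate):
    """True only if the candidate reproduces every recorded behavior."""
    return all(candidate(*args) == expected
               for args, expected in GOLDEN_CASES)
```

Only once `matches_golden` passes for the restructured version does an on-site trial make sense; a failing case marks exactly the boundary between cleanup and a behavioral change.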

Only at this stage is a reference to standards needed, because standards do not replace an engineering decision, but they do help define the point at which a change stops being purely programming work. If the refactoring affects protective measures, conditions for safe access, energy isolation, or the behavior of systems after stopping and restarting, it naturally falls within the scope of a practical change risk assessment, carried out in a structured way, also using the approach known from ISO/TR 14121-2. Where there is a risk of unexpected start-up, it is necessary to check not only the code itself, but also the logic for isolating and restoring energy, which leads directly to issues associated with ISO 14118. If the application is also linked to hydraulics or pneumatics, the assessment cannot ignore the design assumptions of those systems, because an incorrect control sequence may compromise their safe operation regardless of whether the program itself is correct; in that case, it is also justified to refer to requirements that structure the design of hydraulic systems. In practice, this means one thing: the scope of refactoring is determined not by the elegance of the solution, but by the boundary of responsibility for the safe behavior of the installation after the change.

What to watch for during implementation

Implementing the refactoring of a legacy industrial application is the point at which even a sound architectural decision can turn into an operational problem. The whole effort stops making sense when the change improves the code but reduces the predictability of how the installation operates, or extends the team’s responsibility beyond what has been identified and approved. The most common mistake is to treat implementation as a routine release of a new version. In a production environment, what matters is not only whether the application works, but also whether all transient states behave identically after the change: start-up after power loss, communication restart, recipe restoration, handling of alarms, interlocks, and manual modes. The practical criterion is simple: if the team cannot clearly describe which behaviors must remain unchanged after implementation, then the conditions for safe commissioning are not yet in place.
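
One way to satisfy that criterion is to write the invariant behaviors down as data before commissioning, so that "nothing changed" stops being an impression. A sketch; the entries are illustrative examples, not a complete list:

```python
# Behaviors that must be confirmed unchanged before release for operation.
INVARIANT_BEHAVIORS = {
    "restart_after_power_loss": "line comes up in Idle, no automatic motion",
    "communication_restart": "last valid recipe retained, alarms re-announced",
    "alarm_handling": "acknowledge, clear, re-enable, in that order",
    "manual_mode_entry": "all motion stops before manual control is granted",
}

def ready_to_commission(confirmed):
    """Commissioning is prepared only when every listed behavior has been
    confirmed on the installation or on a representative test rig."""
    missing = sorted(set(INVARIANT_BEHAVIORS) - set(confirmed))
    return len(missing) == 0, missing
```

An open entry in `missing` is a named, reviewable gap rather than a surprise during the first restart under load.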

At the implementation decision stage, you need to distinguish between a technically reversible change and one that, once commissioned, creates a new baseline and makes rollback difficult. This has direct implications for cost and schedule. A refactoring effort that requires simultaneous updates to controllers, visualization, the data server, and interfaces to higher-level systems is no longer a single programming task; it becomes a coordinated production change with multiple points of failure. That is why, before deployment, it is worth adopting an acceptance criterion based not on the statement “the tests passed,” but on the ability to roll back the change in a controlled manner within a timeframe acceptable to the process. If there is no credible rollback procedure, there is no basis for claiming that the risk has been brought under control. In practice, it is better to measure not an abstract “deployment quality,” but indicators such as the time needed to restore the previous version, the number of interfaces dependent on the change, and the number of functions whose correctness can be confirmed on the installation without interfering with production.
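
The rollback-based acceptance criterion can be stated as a go/no-go check. A sketch; the inputs and thresholds are assumptions to be set per process, not recommendations:

```python
def rollback_is_credible(restore_minutes, tolerated_minutes, rehearsed,
                         interfaces_touched, interfaces_with_rollback):
    """A rollback procedure counts only if it has actually been rehearsed,
    fits inside the window the process can tolerate, and covers every
    interface the change touches."""
    return (rehearsed
            and restore_minutes <= tolerated_minutes
            and interfaces_with_rollback >= interfaces_touched)

# Without this, "the tests passed" is not an acceptance criterion.
assert rollback_is_credible(25, 45, True, 4, 4)
assert not rollback_is_credible(25, 45, False, 4, 4)  # never rehearsed
assert not rollback_is_credible(60, 45, True, 4, 4)   # exceeds the window
```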

A good example is a situation where refactoring tidies up exception handling and error messages, but at the same time changes the initialization order after a system restart. On a test bench, everything looks correct because the devices are available immediately and the process is not running under load. In the plant, however, the same code may trigger the sequence at a different moment than before, leading to loss of synchronization with drives, incorrect interpretation of ready signals, or a batch of material being left in an intermediate state. Such an incident does not have to mean a failure in the technical sense, but it generates the cost of downtime, scrap, restart, and additional responsibility for the decision to resume operation. This is exactly where refactoring becomes a practical change risk assessment issue: not when the change is large, but when its effects can no longer be confined to the software layer.
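
The restart hazard described above is easy to state in miniature. A hypothetical sketch: the same two devices, two initialization orders, and one device that is slower on site than on the bench:

```python
def seen_ready(order, ready_at_scan, read_at_scan):
    """Devices whose 'ready' flag is already True at the scan on which
    the startup sequence happens to read it."""
    return [dev for dev in order if ready_at_scan[dev] <= read_at_scan[dev]]

# On site the drive reports ready only after 5 scans (on the bench: 0).
ready_at_scan = {"comms": 0, "drive": 5}

# The legacy order happened to poll the drive late enough; the tidied-up
# order reads it earlier and starts out of sync with the drives.
legacy = seen_ready(["comms", "drive"], ready_at_scan,
                    {"comms": 1, "drive": 6})
refactored = seen_ready(["drive", "comms"], ready_at_scan,
                        {"drive": 2, "comms": 3})
# legacy sees both devices ready; refactored misses the drive entirely.
```

Nothing in the refactored code is wrong in isolation; the defect lives in the timing assumption that only the plant, not the bench, can falsify.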

The boundary of responsibility becomes even clearer where the application affects protective functions, permissive logic, unloading sequences, stopping, and restart after a disturbance. In such a case, comparing code versions or relying on a functional test performed by the integrator is no longer enough. What is needed is a structured assessment of whether the change modifies the level of risk and whether it undermines the assumptions for safe operation of the machine or installation. This is the natural point at which to move into risk assessment according to ISO 12100 and the practices used for assessing change-related risk, while in more complex cases the methodological approach known from ISO/TR 14121-2 can also be helpful. If the application controls hydraulic or pneumatic systems, you must also verify whether the new logic changes the conditions for safe energy control and the sequence of movements; in that case, the relevant design requirements for those systems matter as well, not just the correctness of the program itself. For the project team, this means one thing: implementation can be considered prepared only when the scope of technical, operational, and compliance responsibility has been defined before commissioning, not only after the first incident.

Refactoring a legacy industrial application: when does it make sense, and how can you do it without risking production?

Refactoring makes sense when the cost and uncertainty of implementing minor changes rise faster than their business value. This is a sign that technical debt is starting to affect production continuity and operating costs.

It stops being a purely structural exercise when a change affects signals, states, transition conditions, or startup, shutdown, and restart sequences. At that point, it is no longer solely a matter of architecture, but a technical change requiring risk assessment.

Cost and risk most often increase in areas where the system’s behavior changes over time: task scheduling, the sequence of operations, buffering logic, and the response after loss of communication or power failure. In such cases, even a minor change can alter the predictability of the machine’s or line’s operation.

To avoid risking production, a clear boundary must be defined between a change to the application structure and a change to a function that is critical to the process or safety. Acceptance criteria should be based on process behavior, and testing should cover normal operation, fault conditions, and restart.

Finally, when the intervention affects logic related to start-up, shutdown, fault reset, interlocks, energy isolation, or safe access, the change should be treated as a risk-related change, not routine code maintenance.
