Researchers Haoyu Wang, Andrea Alfonsi, Roberto Ponciroli, and Richard Vilim have introduced a data-driven approach to selecting state variables for dynamical systems, with applications to Digital Twins and control systems. Their work, titled “From Features to States: Data-Driven Selection of Measured State Variables via RFE-DMDc,” presents a method that combines Recursive Feature Elimination (RFE) with Dynamic Mode Decomposition with Control (DMDc) to identify a compact set of state variables for monitoring and modeling complex systems.
In many engineering applications, identifying state variables through first-principles, model-based methods is often impractical. This is where data-driven approaches come into play, providing a more feasible alternative for developing Digital Twins used in control and diagnostics. The researchers’ proposed workflow, RFE-DMDc, addresses this need by using RFE to select a minimal yet meaningful set of variables to monitor. This selection process is crucial as it helps in capturing the behavior of a dynamical system under given inputs, which is essential for effective control and diagnostics.
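To make the DMDc half of the workflow concrete, here is a minimal sketch of the standard least-squares DMDc fit, which estimates matrices A and B such that x[k+1] ≈ A x[k] + B u[k] from snapshot data. The function name `fit_dmdc` and the toy system are illustrative assumptions, not taken from the paper, and the paper's implementation may add steps such as rank truncation that are omitted here.

```python
import numpy as np

def fit_dmdc(X, U):
    """Least-squares DMDc fit (no rank truncation).

    X: (n_states, T) state snapshots, columns ordered in time
    U: (n_inputs, T - 1) inputs applied between consecutive snapshots
    Returns A (n_states, n_states) and B (n_states, n_inputs).
    """
    X0, X1 = X[:, :-1], X[:, 1:]           # current and next-step states
    Omega = np.vstack([X0, U])             # stacked regressors [x_k; u_k]
    G = X1 @ np.linalg.pinv(Omega)         # G = [A B]
    n = X.shape[0]
    return G[:, :n], G[:, n:]

# Toy data from a known linear system (hypothetical values).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
U = rng.standard_normal((1, 50))
X = np.empty((2, 51))
X[:, 0] = [1.0, -1.0]
for k in range(50):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]

A_hat, B_hat = fit_dmdc(X, U)
```

On noise-free data from a linear system, the fit should recover the true matrices up to floating-point error; with real measurements the quality of the fit depends on which variables are included in X, which is the question the selection step addresses.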
The RFE-DMDc workflow includes a cross-subsystem selection step designed to mitigate feature overshadowing in multi-component systems. This step ensures that the selected variables are not only minimal but also physically interpretable and relevant across different subsystems. To validate their approach, the researchers implemented a GA-DMDc baseline, which jointly optimizes the state set and model fit under a common accuracy cost on states and outputs. This baseline serves as a benchmark to evaluate the performance of RFE-DMDc.
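The article does not spell out the ranking criterion or the exact cross-subsystem rule, so the following is only a sketch under stated assumptions. It ranks candidate variables by the magnitude of their coefficients in a linear one-step predictor, eliminates the weakest one per round, and applies a per-subsystem quota so that no single subsystem crowds out the others. The names `rfe_select`, `select_per_subsystem`, and `groups`, as well as the quota rule itself, are hypothetical.

```python
import numpy as np

def rfe_select(Z, U, Y, n_keep):
    """Recursive feature elimination with a linear predictor.

    Z: (T, n_candidates) candidate measurements
    U: (T, n_inputs) control inputs
    Y: (T, n_targets) quantities to predict
    Returns the column indices of Z that survive elimination.
    """
    # Standardize candidates so coefficient sizes are comparable across units.
    Zs = (Z - Z.mean(axis=0)) / (Z.std(axis=0) + 1e-12)
    Uc = U - U.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    active = list(range(Z.shape[1]))
    while len(active) > n_keep:
        design = np.hstack([Zs[:, active], Uc])
        coef, *_ = np.linalg.lstsq(design, Yc, rcond=None)
        # Importance of each candidate: norm of its coefficients over targets.
        importance = np.linalg.norm(coef[:len(active)], axis=1)
        active.pop(int(np.argmin(importance)))   # drop the weakest candidate
    return active

def select_per_subsystem(Z, U, Y, groups, n_keep_each):
    """Assumed quota rule: run RFE inside each subsystem, then take the union."""
    selected = []
    for cols in groups.values():
        kept = rfe_select(Z[:, cols], U, Y, n_keep_each)
        selected.extend(cols[i] for i in kept)
    return sorted(selected)

# Synthetic example: only columns 1 and 4 actually drive the targets.
rng = np.random.default_rng(1)
Z = rng.standard_normal((300, 6))
U = rng.standard_normal((300, 1))
Y = np.column_stack([2.0 * Z[:, 1] + 0.5 * U[:, 0], -3.0 * Z[:, 4]])
groups = {"subsystem_a": [0, 1, 2], "subsystem_b": [3, 4, 5]}

global_pick = rfe_select(Z, U, Y, n_keep=2)
quota_pick = select_per_subsystem(Z, U, Y, groups, n_keep_each=1)
```

The quota guarantees that every subsystem contributes at least one variable, which is one plausible way to avoid feature overshadowing; the authors' actual step may work differently.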
The researchers tested their method on two distinct systems: an RLC benchmark and a realistic Integrated Energy System (IES). The RLC benchmark, with known ground truth, allowed for a straightforward evaluation of the method’s accuracy. The IES, on the other hand, presented a more complex scenario with multiple thermally coupled components and thousands of candidate variables. Across both systems, RFE-DMDc consistently identified compact state sets comprising approximately ten variables. These sets achieved test errors comparable to those of GA-DMDc while requiring roughly an order of magnitude less computation time.
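As an illustration of what evaluating on a benchmark with known ground truth can look like, the sketch below builds a series RLC circuit, fits a linear model on one input sequence, and measures rollout error on a held-out sequence. The circuit values, the forward-Euler discretization, and the relative-error metric are all assumptions made for this example; the paper's RLC setup and error definition may differ.

```python
import numpy as np

def rlc_matrices(R=1.0, L=0.5, C=0.2, dt=0.01):
    """Series RLC with states [inductor current, capacitor voltage]."""
    Ac = np.array([[-R / L, -1.0 / L], [1.0 / C, 0.0]])
    Bc = np.array([[1.0 / L], [0.0]])
    # Forward-Euler discretization (an assumption chosen for simplicity).
    return np.eye(2) + dt * Ac, dt * Bc

def rollout(A, B, x0, U):
    """Simulate x[k+1] = A x[k] + B u[k] over all columns of U."""
    X = np.empty((A.shape[0], U.shape[1] + 1))
    X[:, 0] = x0
    for k in range(U.shape[1]):
        X[:, k + 1] = A @ X[:, k] + B @ U[:, k]
    return X

def relative_error(X_true, X_pred):
    return np.linalg.norm(X_true - X_pred) / np.linalg.norm(X_true)

rng = np.random.default_rng(0)
A_true, B_true = rlc_matrices()
U_train = rng.standard_normal((1, 400))
U_test = rng.standard_normal((1, 200))
X_train = rollout(A_true, B_true, np.zeros(2), U_train)
X_test = rollout(A_true, B_true, np.zeros(2), U_test)

# Fit [A B] on the training trajectory by least squares.
Omega = np.vstack([X_train[:, :-1], U_train])
G = X_train[:, 1:] @ np.linalg.pinv(Omega)
A_hat, B_hat = G[:, :2], G[:, 2:]

# Multi-step prediction on unseen inputs, compared with the true response.
X_pred = rollout(A_hat, B_hat, X_test[:, 0], U_test)
test_error = relative_error(X_test, X_pred)
```

Because this toy data is noise-free and exactly linear, the held-out error is near zero; the comparison becomes informative once noise, redundant candidate variables, or unmodeled dynamics are present.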
The selected variables retained clear physical interpretations across subsystems, demonstrating the method’s ability to identify meaningful and relevant state variables. The resulting models derived from these variables showed competitive predictive accuracy, computational efficiency, and robustness to overfitting. This robustness is particularly important in practical applications where models need to perform reliably under varying conditions.
The practical implications of this research are substantial. In the marine sector, for instance, the ability to accurately model and monitor the state of complex systems such as ship engines, propulsion systems, and environmental monitoring systems can lead to significant improvements in efficiency, maintenance, and overall performance. By identifying a minimal set of state variables that capture the essential dynamics of these systems, operators can focus their monitoring efforts more effectively, reducing the need for extensive and costly sensor arrays.
Moreover, the computational efficiency of RFE-DMDc makes it particularly suitable for real-time applications. In scenarios where rapid decision-making is crucial, such as in autonomous vessels or emergency response situations, the ability to quickly and accurately model system behavior can be a game-changer. This efficiency also extends to the development and deployment of Digital Twins, which can be used for simulation, training, and predictive maintenance, further enhancing the operational capabilities of marine systems.
In conclusion, the work of Haoyu Wang, Andrea Alfonsi, Roberto Ponciroli, and Richard Vilim represents a significant step forward in the field of data-driven modeling and control. Their RFE-DMDc method offers a practical and efficient solution for selecting state variables in complex dynamical systems, with wide-ranging applications in various industries, including the marine sector. By leveraging advanced data-driven techniques, they have demonstrated the potential to enhance system monitoring, control, and overall performance, paving the way for more intelligent and efficient maritime operations.

