Engineering Safety and Disaster Prevention
LIU Cong-cong, ZHANG Feng, HU Chao, ZHANG Qi-ling, GUO Yong-cheng
[Objective] Dam deformation is jointly influenced by multiple factors such as water level, temperature, and time-dependent effects, and exhibits the characteristics of a nonlinear time series. Currently, traditional and single models struggle to fully capture the complexity and diversity of dam deformation data, resulting in limited predictive performance and interpretability. To address these problems, this study proposes an efficient and interpretable dam deformation prediction method through the combination and optimization of multiple prediction models. [Methods] First, the least absolute shrinkage and selection operator (LASSO) was used to efficiently screen numerous environmental variables, both simplifying model input and making the factor selection explainable. Then, the long short-term memory (LSTM) network was employed to predict dam deformation, and an attention mechanism was introduced to enhance the extraction of important information. Finally, the bagging algorithm was used to integrate the prediction results of multiple models, further improving the accuracy, stability, and generalization ability of the overall prediction. By combining the advantages of LASSO regression feature selection, the LSTM model with an attention mechanism, and the bagging ensemble algorithm, a multi-model coupled method was proposed. [Results] To validate the effectiveness and applicability of the coupled model, this study took the deformation monitoring data of a roller-compacted concrete gravity dam as the research object for prediction analysis. When the number of features was relatively large, the LASSO variable selection method reduced model complexity by adding an L1 regularization term and selected the features with an important influence on dam displacement, enhancing the interpretability of factor selection.
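As an illustration of how an L1 penalty screens factors, the following is a minimal sketch of LASSO fitted by coordinate descent with soft thresholding. It is not the implementation used in this study: the data are synthetic, the factor names in `FACTOR_NAMES` are hypothetical, the inputs are assumed to be roughly standardized with no intercept, and the penalty weight `lam` is chosen arbitrarily.

```python
import random

# Hypothetical factor names, for illustration only (not taken from the study).
FACTOR_NAMES = ["water_level", "water_level_sq", "temp_sin",
                "temp_cos", "time", "log_time"]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 (no intercept term)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic data: only factors 0 and 2 actually drive the response.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in FACTOR_NAMES] for _ in range(300)]
y = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0.0, 0.05) for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected factors:", [FACTOR_NAMES[j] for j in selected])
```

Factors whose coefficients are driven exactly to zero are dropped from the model input, and the surviving nonzero coefficients indicate which factors the fit relies on. In practice `lam` would typically be tuned, for example by cross-validation.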
Combined with this method, multiple LSTM models with an attention mechanism were integrated for parallel training and prediction, reducing potential overfitting problems in single models and improving the generalization ability and prediction efficiency of the coupled model. The trained model was used to predict and validate the test set data. The residual values of the coupled model were small, and the residual distribution showed strong randomness, indicating high prediction accuracy. The fitting results of each measurement point were smooth and agreed well with the measured data, and the prediction results were stable without showing any “distortion” phenomenon. Using the same dataset and an identical division ratio, the LSTM multi-factor model, the stepwise regression prediction model, the LASSO regression model, the LASSO-LSTM model, and the coupled model were compared and analyzed. The results showed that the coupled model proposed in this study significantly outperformed the other models in both the overall prediction trend and the prediction accuracy of local fluctuations. The average MAE, MSE, and RMSE across the measurement points were 0.052 mm, 0.005 mm², and 0.067 mm, respectively. The coupled model could more accurately capture the dynamic changes of dam deformation, providing a simple and efficient method for prediction model research. [Conclusion] This study constructs a coupled prediction model with high accuracy, stability, and interpretability. The main innovation lies in three aspects: the effective selection of key environmental variables through LASSO, which simplifies model input and improves its interpretability; the use of LSTM to capture the time-series features of dam displacement data, with the incorporated attention mechanism helping the model focus on important features in the time series; and the use of the bagging algorithm, which significantly improves the generalization ability of the model by training multiple sub-models in parallel.
Based on actual case analysis, the coupled model not only demonstrates higher accuracy in dam deformation prediction, but also outperforms commonly used models in interpretability and stability. The coupled model based on interpretable variable selection provides a reference for the optimization of subsequent combined models. Future research directions can shift from single measurement points in different dam sections to multiple measurement points in the same dam section. This will involve analyzing the location of measurement points and the relationships between different points, thereby enabling the construction of a comprehensive multi-point coupled model.
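The bagging step and the error metrics reported above (MAE, MSE, RMSE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: a one-variable least-squares line (`fit_line`, a hypothetical helper) stands in for each attention-based LSTM sub-model, the data are synthetic, and plain bootstrap resampling is assumed because the abstract does not state how the training data of each sub-model are drawn.

```python
import math
import random


def fit_line(xs, ys):
    """Stand-in base learner: ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0.0 else 0.0
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def bagging_fit(xs, ys, n_models, rng):
    """Train each sub-model on a bootstrap sample (drawn with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Aggregate by averaging the sub-model predictions."""
    return sum(model(x) for model in models) / len(models)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))


# Synthetic displacement-like series in mm (illustrative values only).
rng = random.Random(42)
xs = [i / 10.0 for i in range(100)]
ys = [0.5 * x + 1.0 + rng.gauss(0.0, 0.05) for x in xs]

# Chronological split: first 80 points for training, last 20 for testing.
train_x, test_x = xs[:80], xs[80:]
train_y, test_y = ys[:80], ys[80:]

models = bagging_fit(train_x, train_y, n_models=10, rng=rng)
pred_y = [bagging_predict(models, x) for x in test_x]

mae_val = mae(test_y, pred_y)
mse_val = mse(test_y, pred_y)
rmse_val = rmse(test_y, pred_y)
print(f"MAE={mae_val:.3f} mm  MSE={mse_val:.4f} mm^2  RMSE={rmse_val:.3f} mm")
```

Because MSE is a squared error, it carries squared units (mm²), while MAE and RMSE are in mm. Plain bootstrap resampling ignores temporal ordering, so for sequence models such as LSTM a resampling scheme that preserves input windows may be more appropriate.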