Application of a Hybrid Model Based on CEEMDAN and IMSA in Water Quality Prediction
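Water quality prediction is an important part of water pollution prevention and control, but water quality series are strongly random and non-stationary. To further improve the accuracy of surface water quality prediction, a new hybrid water quality prediction model is proposed. First, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) decomposes the original water quality series. Fuzzy dispersion entropy (FuzzDE) then divides the resulting components into high-, medium-, and low-complexity groups. Next, a bidirectional long short-term memory network (BiLSTM), least squares support vector regression (LSSVR), and an extreme learning machine (ELM), each optimized by the improved mantis search algorithm (IMSA), predict the high-, medium-, and low-complexity components, respectively, and the predictions are combined and reconstructed. Finally, a BiLSTM error correction model corrects the residual error to give the final prediction. Simulations using dissolved oxygen concentrations at two sections of the Youshui River, a tributary of the Yuanjiang River, and pH values at one section in the Xiangjiang River basin give R² above 90%, showing that the hybrid model predicts more accurately than the other comparison models.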
[Objectives] To enhance water quality prediction accuracy, this study addresses the following challenges: (1) Traditional prediction methods often rely on elementary decomposition techniques, limiting their ability to extract meaningful data features. (2) Single models and basic optimization algorithms result in low prediction accuracy. (3) Most approaches fail to leverage the advantages of different networks to analyze components of varying complexity, leading to inefficient model utilization. (4) Few studies incorporate error correction after prediction. This study proposes a novel hybrid model for water quality prediction. [Methods] First, the original water quality sequence was decomposed using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). Next, Fuzzy Dispersion Entropy (FuzzDE) categorized the components into high-, medium-, and low-complexity subsequences. Then, an Improved Mantis Search Algorithm (IMSA) optimized three distinct models: Bidirectional Long Short-Term Memory (BiLSTM) for high-complexity components, Least Squares Support Vector Regression (LSSVR) for medium-complexity components, and Extreme Learning Machine (ELM) for low-complexity components. The predictions were combined and reconstructed, and a BiLSTM-based error correction model further corrected the errors, yielding the final prediction results.
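The routing and reconstruction steps described under [Methods] can be illustrated with a minimal Python sketch. It assumes the FuzzDE value of each component has already been computed. The cut-off values `low_cut` and `high_cut`, the function names, and the choice to recombine forecasts by element-wise summation and to add the predicted error are all illustrative assumptions; the abstract does not state how the thresholds were chosen or how the predictions were reconstructed.

```python
from typing import Dict, List, Sequence


def group_by_complexity(
    entropies: Sequence[float], low_cut: float, high_cut: float
) -> Dict[str, List[int]]:
    """Assign component indices to 'high', 'medium', or 'low' by entropy.

    low_cut and high_cut are hypothetical thresholds, not values from the study.
    """
    groups: Dict[str, List[int]] = {"high": [], "medium": [], "low": []}
    for idx, value in enumerate(entropies):
        if value >= high_cut:
            groups["high"].append(idx)      # would go to the BiLSTM predictor
        elif value >= low_cut:
            groups["medium"].append(idx)    # would go to the LSSVR predictor
        else:
            groups["low"].append(idx)       # would go to the ELM predictor
    return groups


def reconstruct(
    group_forecasts: Sequence[Sequence[float]], error_forecast: Sequence[float]
) -> List[float]:
    """Sum the per-group forecasts step by step, then add the predicted error.

    Additive recombination is an assumption made for this sketch.
    """
    combined = [sum(step) for step in zip(*group_forecasts)]
    return [c + e for c, e in zip(combined, error_forecast)]


if __name__ == "__main__":
    # Made-up entropy values for six components, for illustration only.
    print(group_by_complexity([0.92, 0.81, 0.55, 0.40, 0.12, 0.05], 0.3, 0.7))
    print(reconstruct([[1.0, 2.0], [0.5, 0.5], [0.25, 0.25]], [0.1, -0.1]))
```

In this sketch each group would be handed to its own predictor, and only the final combination step is shown; the predictors themselves are outside its scope.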
[Results] The study introduced four key innovations to the original Mantis Search Algorithm (MSA): (1) a combined Logistic-Tent chaotic map for population initialization, ensuring a uniform and random distribution of initial solutions to enhance global search capability and convergence speed; (2) a nonlinear acceleration factor that refines MSA’s core update formula so that the search transitions from global exploration to local exploitation, mitigating entrapment in local optima; (3) an elite-guided adaptive update strategy that addresses the excessive randomness of the position update when a mantis attack fails, improving late-stage search efficiency while preserving some randomness; (4) opposition-based learning, which generates individuals opposite to the current individual to enhance global optimization. IMSA’s performance was validated using benchmark functions (Rosenbrock for unimodal, Michalewicz for multimodal), confirming improved global search and convergence precision. After the network hyperparameters were determined, ablation experiments were conducted to analyze how much each strategy contributed to prediction performance. Finally, the assignment of models to components was validated: FuzzDE was used to calculate the complexity of each component, creating high-, medium-, and low-complexity subsequences, and the learning capabilities of different networks for these subsequences were verified, with BiLSTM used to predict high-complexity components, LSSVR for medium-complexity components, and ELM for low-complexity components. [Conclusions] This study performed a simulation verification using dissolved oxygen (DO) concentrations from two sections of the Youshui River (a tributary of the Yuanjiang River) and pH values from one station in the Xiangjiang River Basin. Missing values were addressed via linear interpolation.
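Two of the four IMSA strategies listed under [Results], chaotic population initialization and opposition-based learning, can be sketched in a few lines of Python. The Logistic-Tent form used here is one commonly published combination of the logistic and tent maps; the abstract does not give the exact map, the control parameter `r`, or the seed, so all three are assumptions, as are the function names. The opposite point `lb + ub - x` is the standard definition of opposition-based learning.

```python
from typing import List


def logistic_tent(x: float, r: float = 3.99) -> float:
    """One step of a combined Logistic-Tent map on [0, 1).

    This form and the default r are assumptions, not taken from the study.
    """
    if x < 0.5:
        return (r * x * (1.0 - x) + (4.0 - r) * x / 2.0) % 1.0
    return (r * x * (1.0 - x) + (4.0 - r) * (1.0 - x) / 2.0) % 1.0


def chaotic_population(
    n: int, dim: int, lb: float, ub: float, seed: float = 0.37
) -> List[List[float]]:
    """Build n individuals of length dim by iterating the chaotic map.

    Each chaotic value in [0, 1) is scaled into the search range [lb, ub).
    """
    x = seed
    population: List[List[float]] = []
    for _ in range(n):
        individual = []
        for _ in range(dim):
            x = logistic_tent(x)
            individual.append(lb + x * (ub - lb))
        population.append(individual)
    return population


def opposite(individual: List[float], lb: float, ub: float) -> List[float]:
    """Return the opposition-based counterpart: lb + ub - x per dimension."""
    return [lb + ub - v for v in individual]


if __name__ == "__main__":
    pop = chaotic_population(n=5, dim=3, lb=-5.0, ub=5.0)
    for ind in pop:
        print(ind, opposite(ind, -5.0, 5.0))
```

A typical use is to evaluate both each individual and its opposite and keep the fitter of the two; that selection rule is a common convention and is not confirmed by the abstract.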
For outlier treatment, the study considered that outliers in the data might be caused by sudden pollution events and discontinuous non-point source pollution. Directly removing them could lead to information loss, so outliers were retained. After integrating decomposition, entropy-based grouping, algorithm optimization, and error correction models, eleven comparative experiments were set up to evaluate the effectiveness of each optimization method. The hybrid model’s effectiveness was validated using the RMSE, R², and MAPE metrics. Ultimately, R² exceeded 90%, demonstrating that the prediction accuracy of the hybrid model was higher than that of the comparison models.
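For reference, the three evaluation metrics named above can be computed as in the following minimal Python sketch. It uses the standard textbook definitions, and the sample values are made up for illustration and are not data from the study. R² is returned as a fraction, so the "over 90%" reported in the abstract corresponds to a value above 0.90.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Requires non-zero observations, which holds for positive DO and pH values.
    """
    n = len(y_true)
    return 100.0 / n * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred))


if __name__ == "__main__":
    observed = [8.0, 7.5, 9.0, 8.5]    # illustrative values only
    predicted = [7.8, 7.6, 9.1, 8.4]
    print(rmse(observed, predicted), r2(observed, predicted), mape(observed, predicted))
```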
water quality prediction / CEEMDAN decomposition / fuzzy dispersion entropy / Mantis Search Algorithm / hybrid model