Predicting COD in Effluent from Wastewater Treatment Plants Using BiLSTM with Attention Mechanism

LIU Yu, WANG Ze-xin, LÜ Chen-kai, MEI Chang-song, ZHAN Hao-dong, JIANG Yun-peng, LU Xi

Journal of Changjiang River Scientific Research Institute ›› 2025, Vol. 42 ›› Issue (12) : 198-206.

DOI: 10.11988/ckyyb.20250531
Urban Water Environmental Treatment Technologies for Middle-Lower Yangtze River

Abstract

[Objective] The wastewater treatment process is highly non-linear, time-varying, and multivariable-coupled, which makes it difficult for traditional prediction methods to capture its complex spatiotemporal dependencies. A unidirectional LSTM uses only historical information and therefore cannot fully exploit bidirectional temporal features. This study aims to construct a deep learning model that combines a bidirectional long short-term memory network (BiLSTM) with an attention mechanism to achieve high-precision prediction of effluent chemical oxygen demand (COD) in wastewater treatment plants.

[Methods] This study proposed a deep learning architecture integrating BiLSTM and a multi-layer attention mechanism. The model adopted a hierarchical design. First, sine-cosine positional encoding was used to embed time-step position information. A feature attention mechanism, built from a fully connected network and the softmax function, learned adaptive weights for the different water quality parameters. Then, a single-layer bidirectional LSTM captured forward and backward temporal dependencies simultaneously, and a multi-head attention mechanism captured complex interaction patterns between time steps. Subsequently, a time-step importance weighting mechanism used a quadratic growth curve to assign higher weights to recent time steps, and an attention-gated fusion strategy dynamically combined the LSTM output with the attention output. Finally, the prediction was produced by global average pooling followed by a fully connected network. Model training employed the Adam optimizer, Dropout regularization, L2 regularization, and an early stopping strategy. Prediction performance was compared with baseline models including unidirectional LSTM, BiLSTM, and 1D-CNN.

[Results] Experimental verification showed that the BiLSTM-attention model significantly outperformed the other models in effluent COD prediction. Compared with the BiLSTM model, the root mean square error decreased from 1.17 mg/L to 1.01 mg/L, a reduction of 13.5%; the mean absolute error decreased from 0.92 mg/L to 0.80 mg/L, a reduction of 13.8%; and the mean absolute percentage error decreased from 9.79% to 8.29%, a reduction of 15.3%. The validation set loss converged well during training. Visualization of the attention weights revealed the model's decision-making mechanism. Feature attention identified dissolved oxygen in the process section and sludge concentration as the key influencing parameters. Temporal attention showed that the model assigned higher weights to recent time steps, which is consistent with the physical laws of time-series prediction. The different heads of the multi-head attention captured different temporal dependency patterns, achieving complementary feature extraction.

[Conclusion] This study constructs an effluent COD prediction model for wastewater treatment plants based on BiLSTM and a multi-layer attention mechanism. The innovations are threefold: (1) a hierarchical deep learning architecture that integrates positional encoding, feature attention, multi-head attention, and gated fusion; (2) a bidirectional LSTM structure that leverages forward and backward temporal information simultaneously, which reduces the error by over 10% compared to unidirectional models; and (3) time-step importance weighting and gated fusion mechanisms that achieve refined modeling of temporal information.
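The abstract names the building blocks of the model but does not give their formulas, so the following NumPy sketch is only an illustration of three of them under stated assumptions, not the authors' implementation. The positional encoding uses the standard sine-cosine form; the quadratic weighting is assumed to be (t/T)^2 normalized to sum to 1; the gate is assumed to be a sigmoid over the concatenated LSTM and attention outputs. All function names, shapes, and the random demo data are hypothetical.

```python
import numpy as np


def sincos_positional_encoding(n_steps: int, d_model: int) -> np.ndarray:
    """Standard sine-cosine positional encoding, shape (n_steps, d_model)."""
    positions = np.arange(n_steps)[:, None]          # (T, 1)
    dims = np.arange(d_model)[None, :]               # (1, D)
    # Each sin/cos pair of columns shares one frequency.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                 # (T, D)
    pe = np.zeros((n_steps, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])            # even columns: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])            # odd columns: cosine
    return pe


def quadratic_time_weights(n_steps: int) -> np.ndarray:
    """Assumed quadratic growth curve: w_t proportional to (t / T)^2.

    The most recent step (t = T) gets the largest weight; weights sum to 1.
    """
    t = np.arange(1, n_steps + 1, dtype=float)
    w = (t / n_steps) ** 2
    return w / w.sum()


def gated_fusion(h_lstm, h_attn, W_g, b_g):
    """Assumed gate: g = sigmoid([h_lstm; h_attn] @ W_g + b_g).

    Returns the element-wise convex combination g*h_lstm + (1-g)*h_attn.
    Shapes: h_lstm, h_attn (T, D); W_g (2D, D); b_g (D,).
    """
    z = np.concatenate([h_lstm, h_attn], axis=-1) @ W_g + b_g
    gate = 1.0 / (1.0 + np.exp(-z))
    return gate * h_lstm + (1.0 - gate) * h_attn


# Demo with random stand-ins for the BiLSTM and attention outputs.
rng = np.random.default_rng(0)
T, D = 24, 8                                         # hypothetical window and width
h_lstm = rng.normal(size=(T, D))
h_attn = rng.normal(size=(T, D))
W_g = rng.normal(scale=0.1, size=(2 * D, D))
b_g = np.zeros(D)

pe = sincos_positional_encoding(T, D)
weights = quadratic_time_weights(T)
fused = gated_fusion(h_lstm, h_attn, W_g, b_g)
pooled = fused.mean(axis=0)                          # global average pooling over time

print(pe.shape, fused.shape, pooled.shape)
print("last/first time-step weight ratio:", weights[-1] / weights[0])
```

With this assumed curve, the newest step in a 24-step window is weighted 24^2 times as heavily as the oldest one, which shows how strongly a quadratic curve favors recent observations. How the weights enter the network (for example, by scaling the attention output before fusion) is not specified in the abstract and is left open here.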

Key words

bidirectional long short-term memory network / multi-head attention mechanism / gated fusion / wastewater treatment plant / chemical oxygen demand prediction / deep learning / BiLSTM-Attention Mechanism

Cite this article

LIU Yu, WANG Ze-xin, LÜ Chen-kai, et al. Predicting COD in Effluent from Wastewater Treatment Plants Using BiLSTM with Attention Mechanism[J]. Journal of Changjiang River Scientific Research Institute, 2025, 42(12): 198-206. https://doi.org/10.11988/ckyyb.20250531
