Prediction of Influent Ammonia Nitrogen in Wastewater Treatment Plants Based on Transfer Learning under Few-Shot Scenarios
LÜ Chen-kai, JIANG Yun-peng, LIU Yu, WANG Hao-bo, LU Xi, WANG Ze-xin
Journal of Changjiang River Scientific Research Institute, 2025, Vol. 42, Issue 12: 180-187.
[Objective] Accurate prediction of influent ammonia nitrogen concentration is key to keeping the biological treatment process in wastewater treatment plants stable and to achieving low-carbon, efficient nitrogen removal. However, target plants often lack effective samples because of insufficient monitoring equipment and limited data collection, so traditional models tend to underfit or overfit and predict poorly. This study constructs a 1DCNN-LSTM deep hybrid model based on transfer learning to achieve accurate influent ammonia nitrogen prediction under few-shot scenarios.
[Methods] Two wastewater treatment plants using the same process in southeastern China were selected as the source domain and the target domain. The source domain comprised hourly monitoring data from February to November 2024, including influent flow rate (Q), pH, chemical oxygen demand (COD), etc., while the target domain consisted of scarce data from November 10 to 30, 2024. A 1DCNN-LSTM model was constructed whose input combined historical data with multi-scale autocorrelation features and first-order difference features of ammonia nitrogen. The model hyperparameters were determined by Bayesian optimization. The model was first trained on the source domain and then transferred to the target domain with a two-stage strategy. In the first stage, the convolutional and LSTM layers of the pre-trained source-domain model were frozen and only the fully connected layers were trained on target-domain data. In the second stage, the entire model was fine-tuned with a small learning rate. Performance was evaluated using indicators such as RMSE, MAPE, and R².
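The two-stage schedule described in the Methods can be illustrated with a minimal sketch. It is not the study's implementation: a toy scalar model stands in for the 1DCNN-LSTM, with `feat_w` playing the role of the convolutional and LSTM layers and `head_w`/`head_b` the fully connected layers, and finite-difference gradients replace backpropagation. The functions `predict`, `mse`, and `train`, the synthetic data, and all learning rates and epoch counts are hypothetical choices made for illustration.

```python
import math

def predict(params, x):
    # Toy stand-in: feat_w plays the conv/LSTM feature layers,
    # head_w and head_b play the fully connected layers.
    return params["head_w"] * math.tanh(params["feat_w"] * x) + params["head_b"]

def mse(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def train(params, data, trainable, lr, epochs, eps=1e-6):
    """Gradient descent that updates only the parameters named in `trainable`."""
    params = dict(params)  # copy so the caller's weights are left untouched
    for _ in range(epochs):
        grads = {}
        for name in trainable:
            up, down = dict(params), dict(params)
            up[name] += eps
            down[name] -= eps
            # Central finite difference replaces backpropagation in this toy.
            grads[name] = (mse(up, data) - mse(down, data)) / (2 * eps)
        for name in trainable:
            params[name] -= lr * grads[name]
    return params

ALL = ("feat_w", "head_w", "head_b")
HEAD = ("head_w", "head_b")

# Hypothetical data: a larger source set and a five-sample target set whose
# input-output relationship is similar to the source but shifted.
source = [(i / 10, 2.0 * math.tanh(1.5 * i / 10) + 0.5) for i in range(-20, 21)]
target = [(x, 2.4 * math.tanh(1.8 * x) + 0.8) for x in (-1.5, -0.6, 0.2, 0.9, 1.7)]

init = {"feat_w": 1.0, "head_w": 1.0, "head_b": 0.0}

# Pre-training: all parameters are trained on the source domain.
pretrained = train(init, source, ALL, lr=0.05, epochs=2000)
# Stage 1: feature parameter frozen, only the head is trained on the target.
stage1 = train(pretrained, target, HEAD, lr=0.05, epochs=500)
# Stage 2: everything is unfrozen and fine-tuned with a smaller learning rate.
stage2 = train(stage1, target, ALL, lr=0.005, epochs=500)

for name, p in (("pretrained", pretrained), ("stage 1", stage1), ("stage 2", stage2)):
    print(f"{name:>10}: target MSE = {mse(p, target):.5f}")
```

Passing the set of trainable parameter names to `train` is what expresses freezing here; in a deep learning framework the same effect would typically come from disabling gradient updates for the frozen layers.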
[Results] The data distributions of the source and target domains were similar but not identical, which matches the application scenario of transfer learning. The source domain model performed well, with RMSE=1.65, MAPE=4.60%, and R²=0.91 on the test set, and it accurately captured both the short-term fluctuations and the long-term trends of ammonia nitrogen concentration. In the target domain, the transfer learning model performed significantly better than the directly trained model: RMSE decreased from 1.650 to 1.515, a reduction of 8.18%; MAPE decreased from 5.62% to 5.21%, a reduction of 7.23%; and R² increased from 0.635 to 0.692, an increase of 9.02%. The prediction curve of the transfer model was smoother and aligned more closely with the measured values, demonstrating stronger adaptability and stability, particularly during sudden changes in ammonia nitrogen concentration.
[Conclusion] The core innovations of this study lie in two aspects. First, it proposes a 1DCNN-LSTM hybrid architecture that combines local feature extraction with long-term dependency modeling, overcoming the limitations of single models in capturing the complex dynamics of ammonia nitrogen. Second, it designs a two-stage transfer strategy that preserves the general knowledge learned from the source domain while adapting to the differences of the target domain through fine-tuning. This strategy addresses the problems of small samples and domain shift and avoids the accuracy loss caused by applying the source domain model directly. The results confirm that the 1DCNN-LSTM model can reliably capture the variation patterns of ammonia nitrogen and that transfer learning can significantly improve prediction accuracy and generalization ability under few-shot scenarios. This provides a reliable technical pathway for wastewater treatment plants to regulate process parameters precisely and optimize chemical dosing, and it offers a new perspective on the problem of scarce water quality monitoring data, which is of great significance for promoting intelligent and precise wastewater treatment.
influent ammonia nitrogen prediction / transfer learning / 1DCNN-LSTM / few-shot learning
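The evaluation indicators named in the abstract can be written out as a short sketch. The abstract does not state the formulas, so the standard definitions of RMSE, MAPE, and R² are assumed here; the helper `relative_change` and the sample concentrations are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return 100.0 * (after - before) / before

# Hypothetical measured and predicted concentrations, for illustration only.
measured = [20.0, 25.0, 30.0, 25.0]
predicted = [22.0, 24.0, 29.0, 26.0]
print(f"RMSE={rmse(measured, predicted):.3f}  "
      f"MAPE={mape(measured, predicted):.2f}%  "
      f"R2={r2(measured, predicted):.2f}")

# RMSE values quoted in the abstract: 1.650 (direct training) and 1.515 (transfer).
print(f"RMSE change: {relative_change(1.650, 1.515):.2f}%")  # -8.18%
```

The last line reproduces the 8.18% RMSE reduction reported for the target domain from the two quoted RMSE values.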