Deep Learning for Downstream Water Level Prediction in Complex Hydrology Systems: An LSTM Approach


1. Introduction

Accurate forecasting of flow in downstream reservoirs and irrigation districts during the flood season is essential for effective flood management and water resource planning [1]-[3]. Over the years, various approaches have been developed to improve the precision of flow predictions, ranging from traditional statistical models to advanced data-driven techniques [4].

Early efforts in flow forecasting primarily relied on statistical models such as Autoregressive Integrated Moving Average (ARIMA) and Multiple Linear Regression (MLR), which provided reasonable accuracy for linear and stationary datasets like monthly flow [5]-[7]. However, these models often struggle to capture the complex, nonlinear relationships inherent in hydrological systems, particularly during flood events where rapid changes in flow patterns occur.

Physical-based hydrological models, such as the Hydrologic Engineering Center’s Hydrologic Modeling System (HEC-HMS) and Soil and Water Assessment Tool (SWAT), have been widely applied for flood forecasting [8]-[11]. While these models can simulate hydrological processes with physical accuracy, their effectiveness is highly dependent on the availability of high-quality input data and extensive calibration. Moreover, their computational complexity can be a limitation when dealing with real-time forecasting.

In recent years, data-driven techniques, particularly machine learning and deep learning models, have gained significant attention for their ability to handle nonlinear and dynamic patterns in hydrological systems. Machine learning techniques, such as decision trees, support vector machines, and random forests, have been applied to predict flood events by analyzing historical data, meteorological variables, and river flow patterns [12]-[15]. These methods excel at detecting patterns in data, but they often require extensive feature engineering and may not capture temporal dependencies as effectively. Deep learning, particularly through Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, has further advanced forecasting by learning from sequential data, making it well-suited for time-series prediction tasks. LSTM networks are capable of modeling long-term dependencies, enabling the prediction of flood events with greater accuracy and longer lead times. These techniques allow for more precise forecasting by leveraging vast amounts of real-time data and accounting for complex interactions in hydrological and climatic factors, thus improving the accuracy and reliability of flood prediction systems [11], [16]-[18].

Despite these advancements, challenges remain in developing robust and reliable forecasting models specifically tailored for the flood season. Issues such as data scarcity, model overfitting, and the need for real-time predictions require further investigation. This study proposes a comprehensive forecasting framework designed to enhance water level prediction accuracy at critical control points during the flood season. The approach integrates optimized search techniques with a carefully selected set of relevant and often sparse input features, aiming to improve the reliability and performance of flood forecasting under data-limited conditions.

2. Case Study and Methodology

2.1. Case Study

The Tich River basin is a critical hydrological region within the Red River system, located in northern Vietnam. As a first tributary of the Bui River, the Tich River originates from the Ba Vi Mountain range and flows predominantly in a northwest-to-southeast direction, running parallel to the right bank of the Day River. This basin covers a total area of approximately 1330 km² and stretches over a length of 110 km, with the main river channel extending 91 km. The Tich River basin encompasses several important administrative areas, including Ba Vi, Son Tay, Phuc Tho, Thach That, Quoc Oai, and Chuong My districts of Hanoi, as well as parts of Luong Son district in Hoa Binh province. The region is bordered by the Da River and Red River to the north, the Day River to the east and south, and the Hoang Long River basin to the west (Figure 1).

Figure 1. Tich Bui River basin. (Source: the authors)

The basin area is characterized by diverse topographical features, ranging from mountainous and semi-mountainous regions to low-lying deltas, which significantly influence hydrological patterns and flood risks.

This area plays a vital role in local agricultural production, water resource management, and flood control. It includes several large reservoirs such as Dong Mo-Ngai Son (with a storage capacity of 61.3 million cubic meters), Suoi Hai (46.8 million cubic meters), and Xuan Khanh (104 hectares), along with smaller reservoirs like Tan Xa, Co Dung, Dong So, Lua, and Linh Khieu. These reservoirs, together with the intricate network of rivers and streams, form a complex hydraulic system essential for irrigation, water supply, and flood mitigation.

The Tich-Bui River basin is particularly vulnerable to flooding, especially in areas situated along the right side of the Bui River in Chuong My District and Huu Tich of Quoc Oai District. During heavy rainfall events, these regions are prone to severe flooding and waterlogging, which pose significant challenges to agricultural activities, infrastructure, and local communities (see Figure 2).

Figure 2. Flood-prone area in the downstream reach of the Tich-Bui River. (Source: the authors)

Given the increasing frequency of extreme weather events attributed to climate change, developing accurate and reliable flood forecasting models for the Tich River basin has become a critical priority. This study aims to apply the Long Short-Term Memory (LSTM) deep learning technique to enhance flood forecasting accuracy, providing valuable insights for disaster management and sustainable water resource planning in the region.

2.2. Methodology

2.2.1. Data Collection and Preprocessing

The data set used in this study comprises historical rainfall data and water level measurements collected from four monitoring stations distributed across the complex system of reservoirs and irrigation districts. In addition, the total rainfall from all gauged stations at the same time step is calculated to determine whether rainfall is occurring over the entire basin or only in specific areas. This additional input feature helps the model capture spatial rainfall patterns that may influence water levels. Further input variables, referred to as the “encoding-date” features, were generated to capture seasonal and cyclical patterns of rainfall and water levels. For these inputs, the calendar date (day of the year, ranging from 1 to 365) is converted into sine and cosine values. This transformation allows the model to recognize yearly cycles and improve prediction accuracy.
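The two derived inputs described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the zero-based angle convention, and the sample rainfall values are assumptions introduced here.

```python
import math

def encode_day_of_year(day, period=365):
    """Map a day of year (1..period) to a (sine, cosine) pair on the unit circle."""
    # Assumption: day 1 maps to angle 0; the source does not state the offset.
    angle = 2.0 * math.pi * (day - 1) / period
    return math.sin(angle), math.cos(angle)

def total_rainfall(station_rain):
    """Sum rainfall over all gauged stations for one time step."""
    return sum(station_rain)

# Hypothetical 12-hour rainfall totals (mm) for one time step.
rain = {"Ha Dong": 12.0, "Ba Tha": 0.0, "Son Tay": 30.5, "Ba Vi": 41.0}

# Feature vector: four station values, the basin total, then the date encoding.
features = list(rain.values())
features.append(total_rainfall(rain.values()))
features.extend(encode_day_of_year(200))
```

Using both sine and cosine keeps 31 December and 1 January close together in feature space, which a raw day number from 1 to 365 would not.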

The data span the most recent five years to ensure adequate representation of different hydrological conditions during flood seasons. Notably, an extreme event occurred in 2024, when the water level at the control station exceeded the top of its dike, providing valuable data for model training and evaluation.

The dataset undergoes several preprocessing steps to enhance the performance of the LSTM model. First, normalization is applied, where all input features are scaled between 0 and 1 using min-max normalization to ensure compatibility with the model and improve training efficiency. Next, data segmentation is performed, dividing the dataset into training (60%), validation (20%), and testing (20%) subsets. The training set is used for model learning. The validation set helps in monitoring and tuning, while the testing set is reserved for final evaluation. Lastly, a sequence preparation step is implemented using a sliding window approach to segment the time series data into smaller sequences, enabling the model to effectively capture temporal dependencies. Each sequence consists of input features along with their corresponding target values, representing future water levels.
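A minimal sketch of these three preprocessing steps follows. It assumes a chronological split (the source does not say whether the subsets were taken in time order) and uses hypothetical water levels; the function names and the window length of 4 are illustrative choices. With the 12-hour time step described in the Data Set section, a horizon of 2 steps corresponds to a 24-hour forecast and 4 steps to a 48-hour forecast.

```python
def min_max_scale(values, lo=None, hi=None):
    """Scale values to [0, 1]; lo and hi default to the min and max of `values`."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

def chronological_split(n, train_pct=60, val_pct=20):
    """Return index ranges for a train/validation/test split in time order."""
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return (range(0, n_train),
            range(n_train, n_train + n_val),
            range(n_train + n_val, n))

def sliding_windows(series, window, horizon):
    """Pair each `window`-step input with the value `horizon` steps after its end."""
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        pairs.append((series[start:end], series[end + horizon - 1]))
    return pairs

# Hypothetical 12-hourly water levels (m).
levels = [5.0, 5.2, 5.6, 6.1, 6.4, 6.0, 5.7, 5.5, 5.3, 5.1]
scaled = min_max_scale(levels)
train_idx, val_idx, test_idx = chronological_split(len(scaled))
windows = sliding_windows(scaled, window=4, horizon=2)  # 2 steps = 24 hours
```

In practice the scaling bounds would usually be taken from the training subset only and passed as `lo` and `hi` when scaling the validation and testing subsets, so that no information from later data leaks into training.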

2.2.2. Model Design

The proposed model is built using a Long Short-Term Memory (LSTM) neural network, which is highly suitable for sequential data processing due to its ability to capture both short-term and long-term dependencies. The architecture consists of three key components.

The input layer processes multidimensional data, including rainfall measurements from individual gauged stations (Ha Dong, Ba Tha, Son Tay, and Ba Vi), total rainfall over the basin (aggregating data from these four stations to account for extreme events), and historical water level data from the Tri Thuy station on the Bui River.

The hidden layers comprise multiple LSTM layers, each with a specified number of neurons, optimized to balance model complexity and performance, while dropout layers are incorporated to prevent overfitting. The details are as follows:

Finally, the output layer consists of a dense layer with a linear activation function, which is responsible for predicting future water levels to support flood forecasting.
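For reference, the widely used formulation of an LSTM cell followed by a linear output layer can be written as below. The source does not state which LSTM variant was used, so the symbols here (input x_t, hidden state h_t, cell state c_t, weight matrices W and U, biases b, and output weights w_y, b_y) are assumptions introduced for illustration.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
g_t &= \tanh\left(W_g x_t + U_g h_{t-1} + b_g\right) && \text{(candidate state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)} \\
\hat{y} &= w_y^{\top} h_T + b_y && \text{(linear output layer)}
\end{aligned}
```

The additive update of c_t is what lets the network carry information across many time steps, while the gates control how much of the rainfall and water level history is kept or discarded.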

2.2.3. Training Algorithms

Two optimization algorithms were employed and compared for training the LSTM model: the Adam optimizer, known for its ability to accelerate convergence and improve generalization by adaptively adjusting the learning rate, and the Stochastic Gradient Descent with Momentum (SGDM), which helps the model converge more efficiently and avoid local minima.

For the SGDM-based model, training was performed for up to 20,000 epochs. Validation data were used to monitor generalization, and the model achieving the lowest validation loss was retained. The initial learning rate was set to 0.001, with a mini-batch size of 64 to ensure stable gradient updates. Data were shuffled at the start of each epoch to reduce overfitting and bias. Input sequences were automatically trimmed to the shortest length in each batch (SequenceLength = “shortest”), and root mean square error (RMSE) was used as the evaluation metric.

For the model trained with the Adam optimizer, the initial learning rate was set to 0.005. The number of training epochs was also set to 20,000, using the same mini-batch size of 64. To prevent gradient explosion, a gradient threshold of 1 was applied. As with SGDM, input sequences were trimmed to the shortest length in each mini-batch, and validation was conducted using a held-out dataset.
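The two update rules and the gradient threshold can be sketched as follows. This is an illustrative stdlib-only version, not the authors' training code: the momentum value of 0.9, the Adam constants (0.9, 0.999, 1e-8), and clipping by the L2 norm are common defaults assumed here, since the source gives only the learning rates and the threshold.

```python
import math

def clip_gradient(grads, threshold=1.0):
    """Rescale the gradient if its L2 norm exceeds `threshold` (assumed method)."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        return list(grads)
    return [g * threshold / norm for g in grads]

def sgdm_step(params, grads, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update; returns the new parameters and velocity."""
    velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    params = [p + v for p, v in zip(params, velocity)]
    return params, velocity

def adam_step(params, grads, m, v, t, lr=0.005,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction; `t` is the 1-based step count."""
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    params = [p - lr * mh / (math.sqrt(vh) + eps)
              for p, mh, vh in zip(params, m_hat, v_hat)]
    return params, m, v

# Toy problem: one step on f(w) = (w - 3)^2 starting from w = 0.
grad = [2.0 * (0.0 - 3.0)]
w_sgdm, vel = sgdm_step([0.0], grad, [0.0])
w_adam, m1, v1 = adam_step([0.0], clip_gradient(grad), [0.0], [0.0], t=1)
```

On the first step Adam moves each parameter by roughly the learning rate regardless of the gradient's size, whereas the SGDM step scales with the gradient, which is one reason the two optimizers can behave quite differently on the same data.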

2.2.4. Evaluation Metrics

The model’s performance is evaluated using several key indicators. The Coefficient of Determination (R²) assesses the goodness of fit, representing the proportion of variance in observed data that can be predicted from the input features. The Root Mean Square Error (RMSE) measures the average magnitude of prediction errors, offering an indication of how accurately the model forecasts future water levels. The Mean Square Error (MSE) also evaluates the goodness of fit but places greater emphasis on larger errors by squaring them, making it particularly useful when penalizing significant deviations from actual values. Lastly, the Accuracy Coefficient (P) quantifies the linear correlation between predicted and observed values, providing insights into the strength and direction of their relationship.
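The first three indicators can be computed as sketched below, using their standard definitions. R² is taken here as one minus the ratio of residual to total sum of squares, which is an assumption because the source does not give its formula. The accuracy coefficient P is omitted for the same reason. The sample values are hypothetical.

```python
import math

def mse(observed, predicted):
    """Mean of the squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Square root of the MSE, in the same unit as the data."""
    return math.sqrt(mse(observed, predicted))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted water levels (m).
obs = [5.8, 6.4, 7.1, 6.9, 6.2]
pred = [5.9, 6.3, 6.8, 7.0, 6.2]
scores = {"RMSE": rmse(obs, pred), "MSE": mse(obs, pred), "R2": r_squared(obs, pred)}
```

Because MSE is simply the square of RMSE, the two rank models identically; RMSE is easier to interpret because it is expressed in the unit of the water level itself.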

2.3. Data Set

There are four water level stations along the Tich-Bui River; however, their irregular records make them challenging to use as direct inputs for the model. Among these stations, only Tri Thuy has a long-term, consistent time series, making it the primary focus for water level forecasting. Consequently, an LSTM model has been developed to predict the water level at Tri Thuy. Subsequently, a stage relationship between Tri Thuy and nearby stations will be established, enabling the extension of forecasts to other critical downstream stations, such as Yen Duyet.

Therefore, in this study, water levels at the Tri Thuy station were examined and forecasted to support flood risk management in the downstream Tich-Bui area. The alarm levels at Tri Thuy station are set at 6.3 m (first alarm), 6.8 m (second alarm), and 7.3 m (third alarm).
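As a small illustration of how a forecast value maps onto these thresholds, the sketch below returns the alarm level reached by a given water level. Treating a level equal to a threshold as having reached that alarm is an assumption, and the function name is hypothetical.

```python
# Alarm thresholds at Tri Thuy (m), as stated in the text.
ALARM_LEVELS_M = (6.3, 6.8, 7.3)

def alarm_level(water_level_m, thresholds=ALARM_LEVELS_M):
    """Return 0 (below the first alarm) up to 3 (third alarm reached)."""
    # Assumption: a level exactly at a threshold counts as reaching it.
    return sum(water_level_m >= t for t in thresholds)

peak_2024 = alarm_level(8.21)  # the 2024 maximum reported for Tri Thuy
```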

Since 2020, the dike system along the Tich-Bui River has been upgraded to enhance flood protection, with the left and right levees reaching elevations of 7.72 m and 8.25 m at Tri Thuy. To accommodate these improvements, data was collected from 2020 to 2024 with a 12-hour time step for model tuning.

Although the historical extreme water level at Tri Thuy was recorded at 7.80 m in 2018, corresponding to a flood frequency of 5% and being the largest event since the station’s activation in 1972, more recent data shows increased flood risks. In 2024, the maximum water level at Tri Thuy reached 8.21 m, exceeding the crest of its right dike (Figure 3).

Figure 3. The Tri Thuy water level from 2020-2024.

These observations indicate that the period from 2020 to 2024 effectively captures significant flood events at both stations, providing valuable data for flood risk management and model calibration. The input data collected for this purpose includes 12-hour rainfall measurements from four stations (Son Tay, Ba Vi, Ba Tha, and Ha Dong), along with water level recordings at Tri Thuy and Yen Duyet (Figure 1). The data were used to train the LSTM model for forecasts with 24-hour and 48-hour lead times.

3. Results and Discussion

3.1. LSTM with Adam Optimizer

The model is configured with a learning rate of 0.01, 200 neurons, and a maximum of 20,000 epochs. The results indicate that while the Adam optimizer effectively optimizes the training process, it struggles with the validation process, suggesting potential overfitting or insufficient generalization to unseen data (Figure 4).

Figure 4. Training and validation performance of ADAM-LSTM model.

From Figure 4, it is evident that the training RMSE (blue line in the top figure) fluctuates significantly but remains generally lower than the validation RMSE (orange line in the top figure). This indicates that the model captures consistent patterns with occasional spikes. Additionally, the gap between the training and validation RMSE suggests overfitting, as the validation error does not decrease proportionally to the training error. Figure 5 further confirms this overfitting issue, as the model achieves an almost “perfect” fit with the training dataset, where the recorded time series and model outputs align exceptionally well, leading to a forecast accuracy of 99%. However, while the model performs well on the training dataset, its performance deteriorates in validation and testing. The P value for validation is 43%.

Figure 5. Comparison of predicted and recorded water levels at Tri Thuy during training period with 24-hour forecast model.

To address this problem, an L2 regularization value of 0.01 was added to the model. While this approach improved the results, issues with overfitting still exist. The performance indicators are shown in Table 1.

Table 1. The performance of LSTM with Adam optimizer.

Forecast	Dataset	RMSE	Error Allowance	MSE	R²	P (%)
24 h	Training	0.23	0.28	0.05	0.98	86.94
24 h	Validation	0.39	0.28	0.15	0.94	71.81
24 h	Testing	0.31	0.28	0.10	0.85	79.06
48 h	Training	0.36	0.46	0.13	0.94	88.68
48 h	Validation	0.62	0.46	0.38	0.86	77.09
48 h	Testing	0.42	0.46	0.17	0.73	81.54

The model’s performance was evaluated across three datasets (training, validation, and testing) for both 24-hour and 48-hour forecasts. The 24-hour forecast consistently demonstrated better accuracy and stability, as indicated by higher R² values and lower RMSE and MSE across all phases. Specifically, the training R² was 0.98 for the 24-hour forecast and 0.94 for the 48-hour forecast, showing that the model fits shorter-term predictions better. In validation and testing, both forecasts show a decline in R², with the 48-hour forecast dropping to 0.73 in testing, indicating weaker generalization for longer lead times, although still within an acceptable range.

Additionally, the error allowance for the 24-hour forecast model is smaller than that of the 48-hour forecast model, contributing to a higher P value in the testing dataset. This suggests that while the 48-hour forecast experiences greater uncertainty and accumulated errors over time, it still provides useful predictions. However, further improvements, such as enhanced data preprocessing or temporal smoothing techniques, could help refine the 48-hour model’s accuracy and stability (Figure 6).

3.2. LSTM with SGDM Optimizer

The model is configured with a learning rate of 0.001, utilizing 200 neurons and a maximum of 20,000 epochs to ensure thorough training. To mitigate overfitting, L2 regularization is applied with a regularization coefficient of 0.1, which helps prevent excessive complexity in the model while maintaining generalization. This setup balances learning efficiency with model stability, ensuring the network can capture meaningful patterns without overfitting the training data (Figure 7).
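The effect of the L2 coefficient (0.01 for the Adam model and 0.1 for the SGDM model) can be illustrated with a short sketch. It assumes the common convention in which the penalty is half the coefficient times the sum of squared weights; the source does not state the exact form, and the weights and loss value below are hypothetical.

```python
def l2_penalty(weights, coeff):
    """Penalty term: 0.5 * coeff * sum of squared weights (assumed convention)."""
    return 0.5 * coeff * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, coeff):
    """Data loss plus the L2 penalty on the weights."""
    return data_loss + l2_penalty(weights, coeff)

def regularized_gradient(grads, weights, coeff):
    """Gradient of the regularized loss: each weight is pulled toward zero."""
    return [g + coeff * w for g, w in zip(grads, weights)]

# Hypothetical weights and data loss, compared under the two coefficients.
weights = [1.0, -2.0, 0.5]
loss_adam_setting = regularized_loss(0.05, weights, coeff=0.01)
loss_sgdm_setting = regularized_loss(0.05, weights, coeff=0.1)
```

A larger coefficient penalizes large weights more strongly, which limits how closely the network can fit noise in the training set at the cost of a somewhat higher training error.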

The results from the SGDM-LSTM model show good convergence, with both RMSE and MSE plots rapidly decreasing at the start and stabilizing at low values around 20,000 iterations. The training and validation curves are closely aligned throughout the training process, suggesting that the model is generalizing well and not overfitting.

Figure 6. Comparison of performance indicators between the two lead-time forecast models.

Figure 7. Training and validation performance of SGDM-LSTM model.

The results of the LSTM model trained using the SGDM (Stochastic Gradient Descent with Momentum) optimizer demonstrate a reasonable ability to forecast water levels across the training, validation, and testing datasets. The model’s predictions (depicted by the blue line) closely follow the recorded values (depicted by the orange line), particularly during periods of gradual fluctuation. However, during peak water levels, slight deviations are observed, suggesting that the model may have limitations in capturing extreme events or rapid changes in water levels (Figure 8).

Figure 8. Comparison of predicted and recorded water levels during training (top figure), validation (middle figure) and testing (bottom figure) with 24-hour forecast SGDM-LSTM model.

The similarity in the patterns across the training, validation, and testing graphs indicates that the model generalizes well and does not overfit the training dataset. Additionally, the model appears to be effective at capturing the general trend of the water levels, although further improvements are needed to enhance its accuracy during peak conditions. This performance will be further quantified through the evaluation metrics of RMSE, R², and P, which will provide a more objective comparison of the model’s predictive capability across the datasets.

The performance of the LSTM model trained using the SGDM algorithm is evaluated through training, validation, and testing phases. The convergence graphs of RMSE and MSE indicate successful training with both training and validation loss curves rapidly decreasing and stabilizing at low values, demonstrating effective learning and minimizing overfitting. The comparison between the predicted and recorded water levels for the training and validation periods reveals a strong agreement between the model and the observed data, as shown by the close alignment of the curves.

Table 2. The performance of LSTM with SGDM optimizer.

Forecast	Dataset	RMSE	Error Allowance	MSE	R²	P (%)
24 h	Training	0.36	0.28	0.13	0.95	77.29
24 h	Validation	0.38	0.28	0.14	0.95	73.12
24 h	Testing	0.23	0.28	0.05	0.92	84.85
48 h	Training	0.59	0.46	0.35	0.87	77.29
48 h	Validation	0.50	0.46	0.25	0.91	80.63
48 h	Testing	0.35	0.46	0.12	0.81	87.60

During the testing phase, the model’s performance remains robust, accurately capturing the general patterns and peaks of the water level changes, although some discrepancies occur during sharp increases or decreases. This slight deviation can be attributed to the complexity of rapidly changing hydrological processes and possible limitations of the model’s ability to generalize well to unseen data.

Overall, the model demonstrates strong predictive ability, with satisfactory evaluation metrics for all phases (Table 2). The performance comparison, particularly the testing phase, confirms the model’s capability to generalize the learned relationships from the training data to new, unseen conditions. However, some further improvements may be required to enhance accuracy during rapid water level changes.

3.3. Comparison and Discussion

The comparison between the Adam and SGDM algorithms for flood forecasting reveals some important differences in performance. Three graphs were provided for each model: training, validation, and testing. The purpose of comparing these graphs is to evaluate which optimization algorithm provides better predictive performance over all phases of the modeling process.

3.3.1. Adam Algorithm

The training, validation, and testing graphs of the Adam algorithm show relatively good alignment between the model predictions and the recorded water levels. However, the model struggles with overfitting even when L2 regularization is applied. The testing forecast accuracy coefficient P reaches only 43%.

3.3.2. SGDM Algorithm

The graphs obtained from the SGDM algorithm display a closer fit to the recorded water levels than the Adam algorithm. During both the training and validation phases, the predicted curves follow the recorded data more accurately. The SGDM model demonstrates higher precision in capturing peaks and troughs, which is particularly evident during the testing phase. The overall performance of the SGDM algorithm is superior in terms of tracking the pattern of actual water levels, including extreme values.

When comparing the two algorithms, SGDM performs better than Adam in terms of predictive accuracy. The SGDM model provides smoother and more accurate predictions across all phases of the modelling process. Its ability to effectively capture peak values during testing indicates a more robust generalization capability.

On the other hand, the Adam algorithm shows some weaknesses in accurately predicting extreme values, likely due to its tendency to make larger updates in parameter space, which may result in overshooting or underfitting the target data.

The comparison suggests that for this specific flood forecasting problem, the SGDM algorithm provides better performance than the Adam algorithm. This is likely due to its more stable gradient descent process, which contributes to improved accuracy in capturing both general trends and extreme values. Future work could focus on further enhancing the SGDM model or combining it with other optimization techniques to achieve even better forecasting performance.

4. Conclusions

This study successfully developed and applied hydrological forecasting and flood warning technologies for small river basins in the northwest region of Vietnam. The LSTM model was trained and tested using two optimization algorithms: Adam and SGDM. The results demonstrated that the SGDM algorithm provided better accuracy and stability than the Adam algorithm, as indicated by lower RMSE values and higher R² and P values in the validation and testing phases (Tables 1 and 2), even though Adam fitted the training data more closely. The performance comparison also highlighted the model’s ability to capture peak flow events effectively, a critical aspect of flood forecasting systems.

Additionally, the incorporation of the total rainfall of all gauged stations improved the model’s accuracy by considering basin-wide rainfall events. However, some discrepancies between modelled and recorded data indicate potential areas for further improvement, such as enhancing the model architecture, increasing training data, or applying hybrid models.

The findings contribute to improving flood forecasting accuracy and reliability, offering valuable insights for early warning systems in small river basins, particularly in regions with limited data availability. Future work will focus on refining the model’s performance and applying the approach to a broader range of hydrological systems to enhance flood preparedness and risk management.

Acknowledgements

This research received funding from the Vietnam Ministry of Natural Resources and Environment entitled “To Study and Develop Hydrological Forecasting and Flood Warning Technologies for Small River Basins, Applying the Tests to Some Small River Basins in the North-West Region”, Code number: TNMT.2023.06.14.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1]	Montanari, A. (2012) Hydrology of the Po River: Looking for Changing Patterns in River Discharge. Hydrology and Earth System Sciences, 16, 3739-3747. https://doi.org/10.5194/hess-16-3739-2012
[2]	Papacharalampous, G. and Tyralis, H. (2020) Hydrological Time Series Forecasting Using Simple Combinations: Big Data Testing and Investigations on One-Year Ahead River Flow Predictability. Journal of Hydrology, 590, Article ID: 125205. https://doi.org/10.1016/j.jhydrol.2020.125205
[3]	Piadeh, F., Behzadian, K. and Alani, A.M. (2022) A Critical Review of Real-Time Modelling of Flood Forecasting in Urban Drainage Systems. Journal of Hydrology, 607, Article ID: 127476. https://doi.org/10.1016/j.jhydrol.2022.127476
[4]	Jain, S.K., et al. (2017) A Brief Review of Flood Forecasting Techniques and Their Applications. International Journal of River Basin Management, 16, 329-344.
[5]	Agaj, T., Budka, A., Janicka, E. and Bytyqi, V. (2024) Using ARIMA and ETS Models for Forecasting Water Level Changes for Sustainable Environmental Management. Scientific Reports, 14, Article No. 22444. https://doi.org/10.1038/s41598-024-73405-9
[6]	Wang, W., Chau, K., Xu, D. and Chen, X. (2015) Improving Forecasting Accuracy of Annual Runoff Time Series Using ARIMA Based on EEMD Decomposition. Water Resources Management, 29, 2655-2675. https://doi.org/10.1007/s11269-015-0962-6
[7]	Katimon, A., Shahid, S. and Mohsenipour, M. (2017) Modeling Water Quality and Hydrological Variables Using ARIMA: A Case Study of Johor River, Malaysia. Sustainable Water Resources Management, 4, 991-998. https://doi.org/10.1007/s40899-017-0202-8
[8]	Oleyiblo, J.O. and Li, Z.J. (2010) Application of HEC-HMS for Flood Forecasting in Misai and Wan’an Catchments in China. Water Science and Engineering, 3, 14-22.
[9]	Manoj, N. (2016) Development of a Flood Forecasting Model Using HEC-HMS. National Conference on Water Resources & Flood Management with Special Reference to Flood Modelling, SVNIT Surat, 14-15 October 2016, 10-19.
[10]	Tan, M.L., Gassman, P.W., Yang, X. and Haywood, J. (2020) A Review of SWAT Applications, Performance and Future Needs for Simulation of Hydro-Climatic Extremes. Advances in Water Resources, 143, Article ID: 103662. https://doi.org/10.1016/j.advwatres.2020.103662
[11]	Yu, D., Xie, P., Dong, X., Hu, X., Liu, J., Li, Y., et al. (2018) Improvement of the SWAT Model for Event-Based Flood Simulation on a Sub-Daily Timescale. Hydrology and Earth System Sciences, 22, 5001-5019. https://doi.org/10.5194/hess-22-5001-2018
[12]	Rostami, A. and Gholizadeh, N. (2023) Machine Learning and Deep Learning Approaches for River Flow Forecasting. 22nd Iranian Conference on Hydraulics, Maragheh, 8-9 November 2023, 1-10.
[13]	Adnan, R.M., Kisi, O., Mostafa, R.R., Ahmed, A.N. and El-Shafie, A. (2022) The Potential of a Novel Support Vector Machine Trained with Modified Mayfly Optimization Algorithm for Streamflow Prediction. Hydrological Sciences Journal, 67, 161-174. https://doi.org/10.1080/02626667.2021.2012182
[14]	Luppichini, M., Vailati, G., Fontana, L. and Bini, M. (2024) Machine Learning Models for River Flow Forecasting in Small Catchments. Scientific Reports, 14, Article No. 26740. https://doi.org/10.1038/s41598-024-78012-2
[15]	Duy Nguyen, H. (2022) Daily Streamflow Forecasting by Machine Learning in Tra Khuc River in Vietnam. Vietnam Journal of Earth Sciences, 45, 82-97. https://doi.org/10.15625/2615-9783/17914
[16]	Irving, K., Kuemmerlen, M., Kiesel, J., Kakouei, K., Domisch, S. and Jähnig, S.C. (2018) A High-Resolution Streamflow and Hydrological Metrics Dataset for Ecological Modeling Using a Regression Model. Scientific Data, 5, Article No. 180224. https://doi.org/10.1038/sdata.2018.224
[17]	Wee, W.J., Zaini, N.B., Ahmed, A.N. and El-Shafie, A. (2021) A Review of Models for Water Level Forecasting Based on Machine Learning. Earth Science Informatics, 14, 1707-1728. https://doi.org/10.1007/s12145-021-00664-9
[18]	Anh, T.V. (2023) Artificial Intelligence Technique in Hydrological Forecasts Supporting for Water Resources Management of a Large River Basin in Vietnam. Open Journal of Modern Hydrology, 13, 246-258. https://doi.org/10.4236/ojmh.2023.134014
