Bedload transport rate prediction: Application of novel hybrid data mining techniques

Khosravi, Khabat, Cooper, James R ORCID: 0000-0003-4957-2774, Daggupati, Prasad, Thai Pham, Binh and Tien Bui, Dieu
(2020) Bedload transport rate prediction: Application of novel hybrid data mining techniques. Journal of Hydrology, 585. p. 124774.

Bedload.pdf - Author Accepted Manuscript


The accurate prediction of bedload transport in gravel-bed rivers remains a significant challenge in river science. However, the potential for data mining algorithms to provide models of bedload transport has yet to be explored. This study provides the first quantification of the predictive power of a range of standalone and hybrid data mining models. Using bedload transport data collected in laboratory flume experiments, the performance of four types of recently developed standalone data mining techniques - the M5P, random tree (RT), random forest (RF) and reduced error pruning tree (REPT) - is assessed, along with four types of hybrid algorithms trained with a Bagging (BA) data mining algorithm (BA-M5P, BA-RF, BA-RT and BA-REPT). The main findings are four-fold. First, the BA-M5P model had the highest predictive power (R² = 0.943; RMSE = 0.061 kg m−1 s−1; MAE = 0.040 kg m−1 s−1; NSE = 0.945; PBIAS = −1.60) followed by M5P, BA-RT, RT, BA-RF, RF, BA-REPT, and REPT. All models displayed ‘very good’ performance except the BA-REPT and REPT models, which were ‘satisfactory’. Second, the M5P, BA-RT, and RT models underestimated, and the BA-M5P, BA-RF, RF, BA-REPT and REPT models overestimated, bedload transport rates. Third, flow velocity had the most significant impact on bedload transport rate (PCC = 0.760) followed by shear stress (PCC = 0.709), discharge (PCC = 0.668), bed shear velocity (PCC = 0.663), bed slope (PCC = 0.490), flow depth (PCC = 0.303), median sediment diameter (PCC = 0.247), and relative roughness (PCC = 0.003). Fourth, the maximum depth of tree was the most sensitive operator in decision tree-based algorithms, whereas batch size, number of execution slots and number of decimal places had no impact on the models' predictive power. Overall, the results revealed that hybrid data mining techniques provide more accurate predictions of bedload transport rate than standalone data mining models. In particular, M5P models, trained with a Bagging data mining algorithm, have great potential to produce robust predictions of bedload transport in gravel-bed rivers.
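
The abstract reports five goodness-of-fit statistics (R², RMSE, MAE, NSE, PBIAS) and the Pearson correlation coefficient (PCC) without giving their formulas. The following is a minimal sketch of how such statistics are commonly computed, not the authors' implementation. It assumes that R² is the squared Pearson correlation and that PBIAS follows the convention 100 × Σ(observed − predicted) / Σ(observed), under which a negative value indicates overestimation. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(obs, sim):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percent bias. Assumed convention: negative means overestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def pcc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def r2(obs, sim):
    """Coefficient of determination, assumed here to be the squared PCC."""
    return pcc(obs, sim) ** 2


# Hypothetical bedload transport rates (kg m^-1 s^-1), for illustration only.
observed = [0.10, 0.25, 0.40, 0.80]
predicted = [0.12, 0.24, 0.45, 0.78]

for name, metric in [("R2", r2), ("RMSE", rmse), ("MAE", mae),
                     ("NSE", nse), ("PBIAS", pbias)]:
    print(name, round(metric(observed, predicted), 3))
```

The same `pcc` function could be applied between each input variable (for example flow velocity) and the measured transport rate to rank inputs in the way the abstract describes.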

Item Type: Article
Uncontrolled Keywords: Bedload, Flume experiment, Data mining, River, Artificial intelligence
Depositing User: Symplectic Admin
Date Deposited: 03 Mar 2020 08:53
Last Modified: 18 Jan 2023 23:59
DOI: 10.1016/j.jhydrol.2020.124774