Improving maritime accident severity prediction accuracy: A holistic machine learning framework with data balancing and explainability techniques



Cao, Wenjie, Wang, Xinjian, Feng, Yuanjun ORCID: 0000-0002-3918-2564, Zhou, Jingen ORCID: 0000-0001-9091-5535 and Yang, Zaili ORCID: 0000-0003-1385-493X
(2026) Improving maritime accident severity prediction accuracy: A holistic machine learning framework with data balancing and explainability techniques. Reliability Engineering & System Safety, 266. p. 111648. ISSN 0951-8320

Abstract

Accurately predicting the severity of maritime accidents is crucial for enhancing safety management and minimizing operational risks. Traditional prediction models, however, often struggle with imbalanced datasets and the complexity of multidimensional factors. This study develops an integrated prediction framework incorporating six data balancing techniques to address class imbalance and improve the robustness of model predictions. Additionally, eight well-established machine learning models are utilized, with their performance optimized through hyperparameter tuning and cross-validation. To interpret the model results, SHapley Additive exPlanations (SHAP) are applied for global feature contribution analysis, while Local Interpretable Model-agnostic Explanations (LIME) provide local interpretations, enabling an in-depth understanding of feature-specific impacts on predictions. The results indicate that the combination of RandomOverSampler and CatBoost achieves the best performance across all metrics, with an accuracy of 86.45%, precision of 84.38%, recall of 89.70%, F1-score of 86.81%, and ROC AUC of 93.69%. The analysis identifies accident type, ship type, engine power and gross tonnage as the key features influencing accident severity prediction. Furthermore, the integrated explanatory framework combining SHAP and LIME elucidates both the individual contributions and the collective impact of these features, along with the direction and magnitude of their influence on individual predictions, ensuring model transparency and interpretability. This study advances the prediction of maritime accident severity and provides a robust scientific basis for decision-making in maritime safety, enabling policymakers and industry stakeholders to make accurate and reliable risk-informed decisions. The source code is publicly available at: https://github.com/AdvMarTech/BalancedMaritimeAccidentXAI.
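
As a rough illustration of the data balancing step named in the abstract, the sketch below shows the idea behind random oversampling and how accuracy, precision, recall and F1-score can be computed for a binary severity label. It is not the authors' implementation (see the linked repository for that): the function names `random_oversample` and `binary_metrics`, the toy feature rows and the 0/1 labels are all hypothetical, and the code uses only the Python standard library rather than RandomOverSampler or CatBoost.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Pool of original rows belonging to this class.
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up): 8 non-severe rows (label 0), 2 severe rows (label 1).
X_toy = [[i, i * 10] for i in range(10)]
y_toy = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X_toy, y_toy)
print(Counter(y_bal))  # both classes now have 8 rows
```

In practice, oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.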

Item Type: Article
Uncontrolled Keywords: 3505 Human Resources and Industrial Relations, 35 Commerce, Management, Tourism and Services, Networking and Information Technology R&D (NITRD), Machine Learning and Artificial Intelligence, Bioengineering
Divisions: Faculty of Humanities & Social Sciences
Faculty of Humanities & Social Sciences > School of Management
Depositing User: Symplectic Admin
Date Deposited: 15 Sep 2025 14:49
Last Modified: 22 Dec 2025 14:46
DOI: 10.1016/j.ress.2025.111648
Open Access URL: https://doi.org/10.1016/j.ress.2025.111648
URI: https://livrepository.liverpool.ac.uk/id/eprint/3194412