Journal of Systems Engineering and Electronics ›› 2024, Vol. 35 ›› Issue (2): 396-405. doi: 10.23919/JSEE.2024.000035

• SYSTEMS ENGINEERING •

Classification of aviation incident causes using LGBM with improved cross-validation

Xiaomei NI1,2, Huawei WANG1,*, Lingzi CHEN1, Ruiguan LIN1

  • 1 School of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    2 School of Aeronautical Engineering, Nanjing Vocational University of Industry Technology, Nanjing 210023, China
  • Received: 2021-04-28 Online: 2024-04-18 Published: 2024-04-18
  • Contact: Huawei WANG E-mail: 905271004@qq.com; wang_hw66@163.com; 13258035636@163.com; 478636604@qq.com
  • About author:
    NI Xiaomei was born in 1992. She received her Ph.D. degree in aviation engineering from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2022. She is a lecturer at the School of Aeronautical Engineering, Nanjing Vocational University of Industry Technology, Nanjing, China. Her research interests are civil aviation safety engineering and aircraft reliability. E-mail: 905271004@qq.com

    WANG Huawei was born in 1974. She received her Ph.D. degree from National University of Defense Technology, Changsha, China, in 2003. She is a professor at the School of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing, China. Her research interests include prognostics and health management, civil aviation safety engineering, and aircraft reliability engineering. E-mail: wang_hw66@163.com

    CHEN Lingzi was born in 1997. She received her M.S. degree from Nanjing University of Aeronautics and Astronautics in 2022. She is an engineer at the 41st Research Institute of China Electronics Technology Group Corporation. Her research interest is civil aviation safety engineering. E-mail: 13258035636@163.com

    LIN Ruiguan was born in 1993. He received his M.S. degree in aviation engineering from Shenyang Aerospace University, Shenyang, China, in 2019. He is currently pursuing his Ph.D. degree at the School of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing, China. His research interests are civil aviation safety engineering and aircraft reliability. E-mail: 478636604@qq.com
  • Supported by:
    This work was supported by the National Natural Science Foundation of China Civil Aviation Joint Fund (U1833110) and the project Research on the Dual Prevention Mechanism and Intelligent Management Technology for Civil Aviation Safety Risks (YK23-03-05).

Abstract:

Aviation accidents are currently one of the leading causes of significant injuries and deaths worldwide. This motivates researchers to investigate aircraft safety using data analysis approaches based on advanced machine learning algorithms. To assess aviation safety and identify the causes of incidents, a classification model based on the light gradient boosting machine (LGBM) is developed using data from the aviation safety reporting system (ASRS). The model is improved by k-fold cross-validation with a hybrid sampling model (HSCV), which is intended to boost classification performance while keeping the data balanced. The results show that the LGBM-HSCV model can significantly improve accuracy while alleviating data imbalance. The comparative analysis consists of a vertical comparison with other cross-validation (CV) methods and a lateral comparison across different numbers of folds. In addition, two further CV approaches derived from the improved method are discussed: one with a different order of sampling and folding, and the other with more CV. According to the assessment indices of the different methods, the proposed LGBM-HSCV model is effective at detecting incident causes. The improved model for imbalanced data classification may serve as a point of reference for similar data processing, and its accurate identification of civil aviation incident causes can help improve civil aviation safety.
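
The general idea of combining k-fold cross-validation with hybrid sampling can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the sampling scheme or the order of sampling and folding, so the sketch assumes stratified folds, resampling of the training portion only, and a hybrid rule that brings every class to the mean class size (random undersampling of larger classes, random oversampling of smaller ones). All function names are hypothetical, and the classifier step is left as a comment.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)  # deal each class round-robin
    return folds


def hybrid_resample(indices, labels, seed=0):
    """Assumed hybrid rule: resample every class to the mean class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = round(len(indices) / len(by_class))
    balanced = []
    for idxs in by_class.values():
        if len(idxs) >= target:
            # Undersample without replacement.
            balanced.extend(rng.sample(idxs, target))
        else:
            # Keep all originals and oversample with replacement.
            balanced.extend(idxs + rng.choices(idxs, k=target - len(idxs)))
    return balanced


def hybrid_sampling_cv(labels, k=5, seed=0):
    """Yield (balanced_train_indices, untouched_test_indices) per fold."""
    folds = stratified_folds(labels, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        # A classifier (for example LightGBM's LGBMClassifier) would be
        # fit on the balanced training indices and scored on test_idx here.
        yield hybrid_resample(train_idx, labels, seed + i), test_idx


# Toy imbalanced label vector: 90 majority samples, 10 minority samples.
labels = [0] * 90 + [1] * 10
splits = list(hybrid_sampling_cv(labels, k=5))
for train_idx, test_idx in splits:
    print(Counter(labels[i] for i in train_idx),
          Counter(labels[i] for i in test_idx))
```

Resampling only the training portion keeps each held-out fold at the original class ratio, so the evaluation reflects the imbalanced data. Reversing the order (sampling first, then folding) is the kind of variant the abstract mentions as a separate case.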

Key words: aviation safety, imbalanced data, light gradient boosting machine (LGBM), cross-validation (CV)