2017, Vol. 29, Issue (9): 59-71.

Research on Auto Loan Default Prediction Based on Large Sample Data Model

Shu Yang, Yang Qiuyi   

  1. School of Economics, Huazhong University of Science and Technology, Wuhan 430074
  • Received: 2015-09-09 Online: 2017-09-28 Published: 2017-10-09

Abstract:

Using December 2014 data on 47,138 customers of a well-known auto finance company in China, this paper first uses ROC curves to test the efficiency of Stepwise Regression, then applies a Binary Choice Model and a Count Model in turn to predict the default status of loan customers. We then apply a Genetic Algorithm to perform one-to-one matching on the unbalanced sample and obtain the final predictions. Based on this analysis, we argue that the current default evaluation system is ineffective, and that customers' basic information, geographical zone, loan information, car type, credit status, real estate, and impact events during the loan period all affect default status. The paper further concludes that the balanced sample obtained after matching still delivers superior prediction accuracy, that the Logistic Model is the most suitable when companies want to predict whether a customer will default, and that the Negative Binomial Model is more efficient when companies need to know how long a customer has gone without repaying.

Key words: auto loan, default prediction, Stepwise Regression, ROC curves, Binary Choice Model, Count Model, Genetic Algorithm Matching
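
Two of the steps named in the abstract, ROC-based evaluation and one-to-one matching on an unbalanced sample, can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `roc_auc` and `match_one_to_one` are hypothetical, the toy scores are invented for illustration, and a simple greedy nearest-neighbour rule stands in for the Genetic Algorithm matching used in the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the share of
    (defaulter, non-defaulter) pairs in which the defaulter receives
    the higher score; ties count as half a pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def match_one_to_one(minority, majority):
    """Greedy one-to-one matching on a single score (an assumption:
    a stand-in for the paper's Genetic Algorithm matching). Each
    minority case is paired with the closest majority case that has
    not been used yet, which yields a balanced sample."""
    if len(majority) < len(minority):
        raise ValueError("majority group must be at least as large")
    available = list(range(len(majority)))
    pairs = []
    for i, x in enumerate(minority):
        j = min(available, key=lambda k: abs(majority[k] - x))
        available.remove(j)
        pairs.append((i, j))
    return pairs


# Toy data (invented): 1 = default, 0 = no default.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))  # 0.75

# Two defaulters matched against four non-defaulters.
print(match_one_to_one([0.2, 0.8], [0.1, 0.25, 0.5, 0.9]))  # [(0, 1), (1, 3)]
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why ROC curves are a common yardstick for comparing default models fitted on unbalanced and on matched samples.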