2020, Vol. 32, Issue (3): 75-84.

• Economic and Financial Management •

Data Imbalance in Credit Score Model Based on Resampling Method

Xia Liyu1,2, He Xiaoqun2   

  1. State Grid Energy Research Institute, Beijing 102209;
  2. School of Statistics, Renmin University of China, Beijing 100872
  • Received: 2017-06-14 Online: 2020-03-28 Published: 2020-04-08

Abstract:

The number of "good credit" customers far exceeds that of "bad credit" customers, so credit data exhibit a severely imbalanced structure. However, common methods rarely consider both default losses and market share, both of which financial institutions value highly. To address default loss, we propose an Iterative Resampling Integration Model (IRIM), which uses resampling to increase the model's attention to "bad credit" customers and model integration to turn weak classifiers into a strong one. To account for market share, we propose an RS index, based on the F-value index, to evaluate classification performance. We carry out simulation studies under 6 data imbalance scenarios and empirical studies on the SSBF dataset and the bank of C dataset. The results demonstrate that our method can reduce financial institutions' default risk without excessive loss of market share, and that the RS index can appropriately balance market share against default risk.
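
The abstract does not define the RS index; it states only that the index builds on the F-value. As a point of reference, the following is a minimal sketch of the standard F-measure for the "bad credit" class, not the authors' RS index. The function name `f_beta` and the confusion counts are hypothetical and chosen only for illustration. Under the assumed reading that a false positive is a "good credit" customer who is rejected and a false negative is a "bad credit" customer who is accepted, a larger `beta` puts more weight on recall, that is, on catching defaulters.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F-beta score for the positive ("bad credit") class.

    tp: bad-credit customers correctly flagged
    fp: good-credit customers wrongly flagged (assumed proxy for lost market share)
    fn: bad-credit customers missed (assumed proxy for default loss)
    """
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not taken from the paper).
tp, fp, fn = 30, 60, 20
f1 = f_beta(tp, fp, fn)            # equal weight on precision and recall
f2 = f_beta(tp, fp, fn, beta=2.0)  # recall weighted more heavily
print(f"F1 = {f1:.4f}, F2 = {f2:.4f}")
```

With these counts recall (0.6) is higher than precision (about 0.33), so the recall-weighted F2 comes out above F1.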

Key words: credit score model, data imbalance, iterative resampling, evaluation index