2020, Vol. 32, Issue (3): 75-84.

• Economics and Financial Management •

基于重抽样法处理不平衡问题的信用评级模型

夏利宇1,2, 何晓群2   

  1. 国网能源研究院有限公司, 北京 102209;
  2. 中国人民大学统计学院, 北京 100872
  • 收稿日期:2017-06-14 出版日期:2020-03-28 发布日期:2020-04-08
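  • Corresponding author: He Xiaoqun, professor and doctoral supervisor, School of Statistics, Renmin University of China
  • About the author: Xia Liyu, intermediate researcher at State Grid Energy Research Institute Co., Ltd. and doctoral candidate at the School of Statistics, Renmin University of China
  • Funding:

    Major Project of the Key Research Institutes of Humanities and Social Sciences, Ministry of Education (15JJD910002).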

Data Imbalance in Credit Score Model Based on Resampling Method

Xia Liyu1,2, He Xiaoqun2   

  1. State Grid Energy Research Institute, Beijing 102209;
  2. School of Statistics, Renmin University of China, Beijing 100872
  • Received: 2017-06-14 Online: 2020-03-28 Published: 2020-04-08

摘要:

由于履约客户的数量远远大于违约客户,征信数据具备严重的不平衡特征,常用的处理方法较少同时考虑金融机构所关注的违约损失和市场份额因素。本文基于违约损失因素提出迭代重抽样集成模型(IRIM),利用迭代欠抽样方法提升模型对"坏"客户的关注,采用集成方法将弱分类模型转变为强分类模型;基于市场份额因素改进常用的F-value指标,引入评价分类效果的RS指标。在6类不平衡关系下进行模拟研究,并对SSBF数据和中国某银行征信数据进行实证研究。结果表明,与常用的方法和指标相比,迭代重抽样集成模型能够在确保市场份额不过度减少的情况下降低金融机构的违约风险,RS指标能够恰当地权衡市场份额和违约风险的关系。

关键词: 信用评级模型, 不平衡, 迭代重抽样, 评价指标

Abstract:

Because customers who honor their obligations far outnumber those who default, credit data are severely imbalanced. Common remedies, however, rarely account for both default loss and market share, two factors that financial institutions value highly. To address default loss, we propose an Iterative Resampling Integration Model (IRIM), which uses iterative undersampling to increase the model's focus on "bad credit" customers and an ensemble method to turn weak classifiers into a strong one. To address market share, we modify the commonly used F-value index and introduce the RS index for evaluating classification performance. We conduct simulation studies under six imbalance scenarios and empirical studies on the SSBF dataset and on credit data from a bank in China. The results show that, compared with common methods and indices, IRIM reduces financial institutions' default risk without excessively reducing market share, and that the RS index appropriately balances market share against default risk.
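The abstract does not give the details of IRIM or the definition of the RS index, so the following is only a minimal sketch of the general idea it builds on: repeatedly undersampling the "good" class, fitting a weak classifier on each balanced subset, and combining the classifiers by majority vote. Everything here is an assumption made for illustration, including the function names, the single numeric feature, the nearest-class-mean weak learner, and the toy data. It is not the authors' algorithm. The `f_beta` function computes the standard F-measure for the "bad" class, which is the index the RS index is said to modify.

```python
import random


def make_toy_data(n_good=950, n_bad=50, seed=0):
    """Hypothetical imbalanced data: (feature, label), label 1 = 'bad' customer."""
    rng = random.Random(seed)
    good = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_good)]
    bad = [(rng.gauss(3.0, 1.0), 1) for _ in range(n_bad)]
    return good + bad


def fit_centroids(rows):
    """Weak learner (assumed): the per-class mean of the single feature."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in rows if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_centroid(means, x):
    """Assign x to the class with the nearer mean; ties go to 'bad' (1)."""
    return 1 if abs(x - means[1]) <= abs(x - means[0]) else 0


def undersampling_ensemble(rows, n_rounds=11, seed=0):
    """Fit one weak learner per balanced subset of the data."""
    rng = random.Random(seed)
    good = [r for r in rows if r[1] == 0]
    bad = [r for r in rows if r[1] == 1]
    k = min(len(good), len(bad))  # size of each undersampled 'good' subset
    models = []
    for _ in range(n_rounds):
        subset = rng.sample(good, k) + bad  # keep every 'bad' customer
        models.append(fit_centroids(subset))
    return models


def predict_vote(models, x):
    """Majority vote over the weak learners."""
    votes = sum(predict_centroid(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


def f_beta(y_true, y_pred, beta=1.0):
    """Standard F-measure for the 'bad' class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

Calling `undersampling_ensemble(make_toy_data())` returns the list of fitted weak learners, and `predict_vote` then classifies a new feature value. Because each round trains on a balanced subset, every weak learner sees the "bad" class at equal weight, while voting across rounds reuses more of the discarded "good" customers than a single undersample would.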

Key words: credit score model, data imbalance, iterative resampling, evaluation index