2020, Vol. 32, Issue (3): 3-20.

• Economics and Financial Management •

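Special Treatment Warning Hybrid Model Dealing with Imbalanced Data of Chinese Listed Companies

Chi Guotai, Zhang Tong, Zhang Zhipeng

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024
  • Received: 2019-10-14 Online: 2020-03-28 Published: 2020-04-08
  • Corresponding author: Zhang Tong, PhD candidate, School of Economics and Management, Dalian University of Technology
  • About the authors: Chi Guotai, PhD, professor and doctoral supervisor, School of Economics and Management, Dalian University of Technology; Zhang Zhipeng, PhD candidate, School of Economics and Management, Dalian University of Technology
  • Funding:

    Key Program of the National Natural Science Foundation of China (71731003; 71431002); General Program of the National Natural Science Foundation of China (71873103; 71971051; 71971034); Young Scientists Fund of the National Natural Science Foundation of China (71901055; 71903019); Intelligent Risk Management and Control Models and Algorithms Project of 爱德力智能科技(厦门)有限公司 (2019-01).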

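Summary:

Accurately predicting the ST (special treatment) status of listed companies is extremely important both for the companies' own management and for investors' decisions. This paper uses Lasso least-squares regression to select the combination of indicators with the strongest ability to discriminate ST status, applies the SMOTE over-sampling technique to balance the listed-company data, and then uses a hybrid model of logistic regression and a BP neural network to predict the ST status of Chinese listed companies from data in different time windows. The innovations and characteristics of this paper are as follows. First, the ST probabilities obtained separately from the BP neural network and from logistic regression are fed, together with the indicator data, into a BP neural network model to predict ST status, which improves on the prediction accuracy of a single discriminant model. Second, taking the minimum error of the Lasso least-squares regression equation as the objective, the paper finds the set of indicators with the greatest ability to discriminate ST status. Third, SMOTE is used to balance the listed-company samples, resolving the problem of inaccurate model discrimination under imbalanced data. Fourth, data from 2, 3, 4 and 5 years in advance are each used to predict a company's future ST status, which identifies the optimal time window for ST warning.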
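The SMOTE balancing step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote`, the toy indicator values, and the choice of `k` are assumptions made only for illustration. The sketch shows the core idea of SMOTE, which is to create each synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: 3 ST firms (minority) against 8 non-ST firms.
st_firms = [[0.10, 0.90], [0.20, 0.80], [0.15, 0.70]]
n_non_st = 8
new_rows = smote(st_firms, n_new=n_non_st - len(st_firms), k=2)
balanced_st = st_firms + new_rows
print(len(balanced_st))  # 8
```

Because every synthetic row lies on a segment between two real ST firms, the new rows stay inside the region already occupied by the minority class. In practice a library implementation would normally be used instead of this hand-written version.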

Keywords: imbalanced samples, optimal indicator combination, ST warning, Chinese listed companies, hybrid model

Abstract:

Predicting the ST (special treatment) status of Chinese listed companies is of great importance to managers and investors. This study addresses three essential issues in ST warning. The first is how to select an optimal indicator set that can precisely predict the ST status of Chinese listed companies. The second is how to improve predictive accuracy when the data are imbalanced. The third is how to determine the optimal predictive period for ST warning.
In this study, we select the optimal indicator set using Lasso regression, generate minority-class samples with SMOTE to balance the data, and finally construct a hybrid model combining logistic regression (LR) with a back propagation neural network (BPNN) to predict the ST status of Chinese listed companies. The innovations and characteristics of this study are as follows. First, we combine the Chinese listed company dataset with the ST probabilities calculated separately by BPNN and LR to form the input of another BPNN, and use this hybrid model to improve predictive accuracy. Second, we obtain the optimal indicator set for ST warning by Lasso regression, which minimizes the prediction error. Third, we over-sample the minority class with SMOTE to deal with data imbalance. Fourth, we train the ST warning model on data from 2, 3, 4 and 5 years before the ST status, respectively, and find the optimal lag period for ST warning.
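The way the hybrid model's second-stage input is formed can be sketched as follows. This is a hedged illustration of the input construction only, not the authors' code: the helper name `build_hybrid_inputs` and all numeric values are hypothetical, and the two probability lists stand in for the outputs of an already fitted LR and first-stage BPNN.

```python
def build_hybrid_inputs(features, p_lr, p_bpnn):
    """Append the two first-stage ST probabilities to each firm's indicator row."""
    if not (len(features) == len(p_lr) == len(p_bpnn)):
        raise ValueError("all inputs must describe the same firms")
    return [list(row) + [a, b] for row, a, b in zip(features, p_lr, p_bpnn)]

# Hypothetical values: 2 firms, 3 selected indicators each.
features = [[0.4, 1.2, 0.1], [0.9, 0.3, 0.7]]
p_lr = [0.82, 0.05]    # assumed output of a fitted logistic regression
p_bpnn = [0.77, 0.10]  # assumed output of a fitted first-stage BPNN
stage2_inputs = build_hybrid_inputs(features, p_lr, p_bpnn)
print(stage2_inputs[0])  # [0.4, 1.2, 0.1, 0.82, 0.77]
```

Each row passed to the second BPNN therefore carries the original indicators plus two extra columns, one per first-stage model, so the final network can weigh both opinions against the raw data.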
The empirical study shows that the proposed hybrid model with SMOTE and the optimal indicator set outperforms the other benchmark models. Furthermore, we find that the optimal predictive period for ST warning is 3 years before the ST status. The proposed model can achieve a 96% TPR (true positive rate) and a 99.5% TNR (true negative rate).
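For readers unfamiliar with the two reported metrics, the short sketch below shows how TPR and TNR are computed from predictions. The labels are invented toy values, the function name `tpr_tnr` is hypothetical, and coding ST as 1 and non-ST as 0 is an assumption. On imbalanced data the two rates are reported separately because overall accuracy can look high even when most ST firms are missed.

```python
def tpr_tnr(y_true, y_pred):
    """Return (TPR, TNR); assumes label 1 = ST, 0 = non-ST, both classes present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ST firms caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ST firms missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-ST firms cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy labels: 2 ST firms and 4 non-ST firms.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
tpr, tnr = tpr_tnr(y_true, y_pred)
print(tpr, tnr)  # 0.5 0.75
```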

Key words: imbalanced data, optimal indicator set, ST warning, Chinese listed company, hybrid model