2020, Vol. 32, Issue (3): 3-20.

• Economic and Financial Management •

Special Treatment Warning Hybrid Model Dealing with Imbalanced Data of Chinese Listed Companies

Chi Guotai, Zhang Tong, Zhang Zhipeng   

  1. School of Economics and Management, Dalian University of Technology, Dalian 116024
  • Received: 2019-10-14 Online: 2020-03-28 Published: 2020-04-08

Abstract:

Predicting the ST (special treatment) status of Chinese listed companies is of great importance to managers and investors. This study addresses three essential issues in ST warning. The first is how to select an optimal indicator set that can precisely predict the ST status of Chinese listed companies. The second is how to improve predictive accuracy when the data are imbalanced. The third is how to determine the optimal predictive period for ST warning.
In this study, we select the optimal indicator set using Lasso regression, generate synthetic minority samples with SMOTE to balance the data, and then construct a hybrid model that combines Logistic Regression (LR) with a Back Propagation Neural Network (BPNN) to predict the ST status of Chinese listed companies. The innovations and characteristics of this study are summarized as follows. First, we combine the Chinese listed company dataset with the ST probabilities calculated separately by BPNN and LR to form the input of a second BPNN, and use this hybrid model to improve predictive accuracy. Second, we obtain the optimal indicator set for ST warning by Lasso regression, which minimizes the prediction error. Third, we over-sample the minority class with SMOTE to address the data imbalance. Fourth, we train the ST warning model separately on data from 2, 3, 4 and 5 years before the ST status, and identify the optimal lag period for ST warning.
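The over-sampling step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours. This is a plain-Python illustration, not the implementation used in the study; the function name `smote`, the two toy indicators, and the parameter values are assumptions made for the example.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy data (hypothetical): 3 ST firms described by two made-up financial ratios.
st_firms = [[0.10, 1.2], [0.15, 1.0], [0.12, 1.4]]
new_points = smote(st_firms, n_new=7, k=2)
balanced_minority = st_firms + new_points  # 10 minority samples in total
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by the minority class rather than duplicating existing rows.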
The empirical study shows that the proposed hybrid model, combined with SMOTE and the optimal indicator set, outperforms the other benchmark models. Furthermore, we find that the optimal predictive period for ST warning is 3 years before the ST status; the proposed model can achieve a 96% true positive rate (TPR) and a 99.5% true negative rate (TNR).
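For reference, TPR and TNR are computed per class, which is why they are more informative than overall accuracy on imbalanced data. The sketch below shows the calculation on made-up labels, assuming 1 denotes an ST company and 0 a non-ST company; the helper name `tpr_tnr` and the toy labels are illustrative and do not reproduce the study's data.

```python
def tpr_tnr(y_true, y_pred):
    """Return (TPR, TNR) for binary labels, where 1 = ST and 0 = non-ST."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn)  # share of ST companies correctly flagged
    tnr = tn / (tn + fp)  # share of non-ST companies correctly cleared
    return tpr, tnr

# Toy labels (hypothetical): 4 ST companies and 6 non-ST companies.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tpr, tnr = tpr_tnr(y_true, y_pred)
```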

Key words: imbalanced data, optimal indicator set, ST warning, Chinese listed company, hybrid model