Management Review ›› 2024, Vol. 36 ›› Issue (7): 113-127.

• Economics and Financial Management •

Research on Stock Index Prediction Based on Online News and Temporal Convolutional Long-short Term Memory Neural Network

Cui Xiaoning1,2, Su Danhua3, Shang Wei1,4,5

  1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190;
  2. School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190;
  3. School of Economics, Beijing Wuzi University, Beijing 101149;
  4. Center for Forecasting Science, Chinese Academy of Sciences, Beijing 100190;
  5. MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation at UCAS, Beijing 100190
  • Received: 2020-10-19 Published: 2024-08-03
  • About the authors: Cui Xiaoning, doctoral candidate, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; Su Danhua, Ph.D., lecturer, School of Economics, Beijing Wuzi University; Shang Wei (corresponding author), Ph.D., associate research fellow and doctoral supervisor, Academy of Mathematics and Systems Science, Chinese Academy of Sciences.
  • Funding:
    General Program of the National Natural Science Foundation of China (71571180; 72073008).

Abstract: This paper analyzes and models the opinions that Internet financial news expresses about rises and falls of the stock market. It builds a sentiment lexicon for the stock market domain and uses it to classify online financial news text as positive, negative, or neutral. The sentiment analysis accounts for negation adverbs and for words that reverse the meaning of the surrounding text. Sentiment features are then constructed and fed to a temporal convolutional long-short term memory neural network to predict the CSI 300 stock index. To build the lexicon, the paper trains Word2Vec on a large corpus of Internet financial news and constructs, in a semi-supervised way, a Chinese sentiment lexicon for the stock market domain in the context of online news. The lexicon can effectively identify the opinions and sentiment about stock market rises and falls contained in related news. To make full use of both text and time series features, the study proposes TCN-LSTM, a model that combines temporal convolution with a long-short term memory network. Empirical comparisons show that TCN-LSTM outperforms other deep learning models on direction prediction and short-term numerical prediction. The study thus contributes a method for constructing sentiment lexicons for specific public opinion topics, an online news sentiment lexicon for stock market prediction, and a new deep learning method for financial time series prediction. Integrating temporal convolution with the long-short term memory mechanism addresses the tradeoff between local and long-term feature extraction, which is of considerable significance for improving the effectiveness of deep learning in financial prediction.

Key words: sentiment analysis, sentiment lexicon, stock index prediction, deep learning
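
The lexicon-based sentiment step described in the abstract (positive, negative, or neutral labels, with negation taken into account) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy English lexicon, the negator list, the two-token negation window, and the function names `score_tokens` and `classify` are all hypothetical. The paper works on Chinese news text, which would first need word segmentation.

```python
# Illustrative sketch only: a toy lexicon-based sentiment scorer with
# simple negation handling. The lexicon entries, negators, and window
# size are hypothetical and are not taken from the paper.

LEXICON = {
    "rally": 1.0, "surge": 1.0, "gain": 1.0,      # assumed positive terms
    "plunge": -1.0, "slump": -1.0, "loss": -1.0,  # assumed negative terms
}
NEGATORS = {"not", "no", "never"}  # assumed negation words
NEGATION_WINDOW = 2                # tokens to look back for a negator


def score_tokens(tokens):
    """Sum the polarities of lexicon words, flipping the sign of a word
    when an odd number of negators appears in the preceding window."""
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok)
        if polarity is None:
            continue  # word carries no sentiment in this lexicon
        window = tokens[max(0, i - NEGATION_WINDOW):i]
        flips = sum(1 for w in window if w in NEGATORS)
        if flips % 2 == 1:
            polarity = -polarity
        score += polarity
    return score


def classify(tokens, threshold=0.0):
    """Map a sentiment score to a positive / negative / neutral label."""
    s = score_tokens(tokens)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ("stocks surge after earnings",
                 "index did not rally today",
                 "market opens today"):
        print(text, "->", classify(text.split()))
```

In a pipeline like the one the abstract outlines, per-article labels or scores of this kind would be aggregated by trading day into sentiment features for the prediction model. The aggregation scheme is not specified in the abstract, so it is left out here.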