Management Review, 2021, Vol. 33, Issue (5): 257-269.

• Wuli-Shili-Renli (WSR) System Approach •

Exploring Stock Price Trend Using Seq2Seq Based Automatic Text Summarization and Sentiment Mining

Qi Tianfang, Jiang Hongxun

  1. School of Information, Renmin University of China, Beijing 100872
  • Received: 2018-10-22 Online: 2021-05-28 Published: 2021-06-03
  • Corresponding author: Jiang Hongxun, Ph.D., Associate Professor, School of Information, Renmin University of China
  • About the authors: Qi Tianfang, master's student, School of Information, Renmin University of China.
  • Funding:
    National Natural Science Foundation of China (71571183).

Abstract: Text sentiment mining is widely used to forecast stock prices and analyze price trends in financial markets. Previous studies have mostly mined the sentiment of the full text, implicitly treating every part of an article as equally important. For loosely structured internet news this is an unrealistically strong assumption, since some passages are unrelated to the article's topic. This paper proposes a framework for predicting stock price trends that combines automatic text summarization with sentiment mining: Seq2Seq summarization is applied to news texts so that the mined sentiment is more accurate, and the sentiment values of the summaries are then used as input features of the stock prediction model. To validate the model, we run cross-comparison experiments between predictions based on summary sentiment and predictions based on full-text sentiment. The results show that (a) the two sentiment curves move in the same direction in most cases, but summary sentiment fluctuates within a narrower range and is more stably distributed; (b) in terms of the distribution of sentiment values, summary-based mining is more sensitive to negative sentiment; and (c) in predicting the rise and fall of 24 Shanghai A-share stocks over two and a half years, the summary-based sentiment model clearly outperforms the full-text approach.

Key words: stock price trend, automatic text summarization, Seq2Seq, sentiment analysis
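
The feature-construction step described in the abstract (score the sentiment of each news item, then feed a per-day value into a prediction model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny word lists, the scoring formula, the function names `sentiment_score` and `daily_sentiment`, and the two toy articles are hypothetical, and the hand-written summaries merely stand in for the output of a Seq2Seq summarizer. It shows only the mechanics of computing full-text versus summary sentiment features and does not reproduce the paper's data, model, or results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon (assumption); the paper's sentiment resource is not given here.
POSITIVE = {"surge", "profit", "growth", "beat"}
NEGATIVE = {"loss", "decline", "lawsuit", "default"}


def sentiment_score(text: str) -> float:
    """Return (pos - neg) / (pos + neg) in [-1, 1], or 0.0 if no lexicon word occurs."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def daily_sentiment(news):
    """Average article scores per date; `news` is an iterable of (date, text) pairs."""
    by_day = defaultdict(list)
    for day, text in news:
        by_day[day].append(sentiment_score(text))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


# Toy articles: (date, full text, summary). The summaries are hand-written
# placeholders for Seq2Seq output, not generated by any model.
articles = [
    ("2017-03-01",
     "Profit growth beat forecasts. Unrelated: a rival faces a lawsuit and a loss.",
     "Profit growth beat forecasts."),
    ("2017-03-02",
     "Shares decline after default. The weather was pleasant, with growth in tourism.",
     "Shares decline after default."),
]

# Two alternative daily feature series, one per strategy compared in the abstract.
full_feature = daily_sentiment((d, full) for d, full, _ in articles)
summary_feature = daily_sentiment((d, summ) for d, _, summ in articles)

print("full-text:", full_feature)
print("summary:  ", summary_feature)
```

In a setup like this, either series could be joined with price-based features by date and passed to a classifier of next-day rise or fall; which classifier the authors used is not stated in the abstract.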