The boom of social media in the past decade has transformed the way investors forecast the stock market. As social media provides large volumes of real-time user opinion, investors can potentially exploit that information by leveraging machine learning and natural language processing techniques. In recent years, popular news channels have repeatedly reported that predicting the market using “alternative data”, such as microblog messages, is becoming mainstream among institutional investors. How effective is this approach? Does an upbeat sentiment on microblogs signal a bullish market trend?

In a recent article published in MIS Quarterly, we tried to advance this stream of research by conducting a comprehensive analysis of 18 million StockTwits messages. Specifically, we investigated whether user sentiment reflected in microblog messages influences stock returns. If so, is positive or negative sentiment more influential? Is this effect long-lasting or short-lived? Does this effect apply to individual stocks or to the entire market? Despite a decade-long quest to disentangle the relationship between microblog user sentiment and stock market performance, many of these questions remain unanswered. Using sentiment analysis and econometric modelling techniques, we attempted to find answers from the massive volumes of messages posted on StockTwits.

StockTwits is the largest microblog platform for investors and investment professionals. It allows users to track specific financial assets using Cashtags (e.g., $AAPL, $BABA) in their messages. Based on these Cashtags, we identified the 44 most popular stocks in terms of message volume and market capitalization. For each of these stocks, we extracted user sentiment from related messages. Specifically, we used a program called SentiStrength to classify each message as positive, negative, or neutral. These three sentiment categories can effectively summarize microblog users’ opinions toward the stocks. We then aggregated the number of positive and negative messages by hour and by day. This sentiment analysis procedure generated time series of positive and negative sentiment scores for each stock.
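The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the `classify` function is a hypothetical keyword lookup standing in for SentiStrength, the Cashtag pattern is a simplifying assumption, and the sample messages are invented.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed Cashtag pattern: a dollar sign followed by 1-5 capital letters.
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")

# Hypothetical stand-in for SentiStrength: a toy keyword lookup.
POSITIVE_WORDS = {"great", "bullish", "buy"}
NEGATIVE_WORDS = {"terrible", "bearish", "sell"}


def classify(text):
    """Label a message as positive, negative, or neutral."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    pos = len(words & POSITIVE_WORDS)
    neg = len(words & NEGATIVE_WORDS)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def hourly_sentiment_counts(messages):
    """Count positive and negative messages per (ticker, hour, label)."""
    counts = Counter()
    for timestamp, text in messages:
        label = classify(text)
        if label == "neutral":
            continue  # only positive and negative messages are aggregated
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        for ticker in set(CASHTAG.findall(text)):
            counts[(ticker, hour, label)] += 1
    return counts


# Invented sample messages, for illustration only.
messages = [
    (datetime(2020, 1, 2, 9, 35), "$AAPL looks great, very bullish"),
    (datetime(2020, 1, 2, 9, 50), "Time to sell $AAPL, terrible guidance"),
    (datetime(2020, 1, 2, 9, 55), "$AAPL bearish and $BABA bearish too"),
    (datetime(2020, 1, 2, 10, 5), "$BABA earnings call at noon"),
]
counts = hourly_sentiment_counts(messages)
```

Truncating the timestamp to the date instead of the hour would give the daily series in the same way.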

Drawing on theories and prior findings in behavioural finance, we believe that while investor sentiment may predict future stock returns, it is also likely to be influenced by past stock returns. That is, a potential causality loop between microblog sentiment and stock returns may exist. Thus, to reveal the true effect of microblog sentiment, we must account for such a causality loop in our analysis. We chose vector autoregression for the data analysis. This multiple time series model treats sentiment and returns as mutually dependent and estimates the effects in both directions simultaneously. Moreover, because stock prices are driven by market events and exhibit seasonal patterns, we also considered news sentiment, news volume, earnings announcements, stock trading volume, volatility, date, and hour of the day in our analysis.
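To make the idea of a causality loop concrete, here is a minimal sketch of a two-variable vector autoregression with a single lag, fitted equation by equation with ordinary least squares. It is an illustration under strong assumptions, not the model from our article: the lag order, the toy series, the generating coefficients, and all function names are hypothetical, and the control variables listed above are omitted.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, targets):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_var1(sentiment, returns):
    """Regress each series at time t on a constant and both series at t-1.

    Each result is [constant, coefficient on lagged sentiment,
    coefficient on lagged returns].
    """
    lagged = [[1.0, s, r] for s, r in zip(sentiment[:-1], returns[:-1])]
    return {
        "sentiment": ols(lagged, sentiment[1:]),
        "returns": ols(lagged, returns[1:]),
    }


# Noise-free toy data generated from known (hypothetical) coefficients,
# so the fit should recover them up to floating-point error.
sentiment, returns = [1.0], [0.5]
for _ in range(12):
    s, r = sentiment[-1], returns[-1]
    sentiment.append(0.10 + 0.5 * s - 0.8 * r)
    returns.append(0.05 - 0.3 * s + 0.2 * r)

coefficients = fit_var1(sentiment, returns)
```

In this toy system, the coefficient on lagged sentiment in the returns equation captures sentiment predicting returns, while the coefficient on lagged returns in the sentiment equation captures the feedback in the other direction. Both are estimated from the same lagged data, which is what allows the two effects to be separated.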

The results show that user sentiment on StockTwits largely fluctuates with stock returns. The effect is stronger on negative sentiment than on positive sentiment. That is, negative sentiment becomes stronger after negative returns than positive sentiment does after positive returns. This effect can last from a few hours to several days. Meanwhile, negative sentiment, but not positive sentiment, predicts stock returns within an hour. A 1% increase in negative sentiment leads to a three-basis-point (0.03%) drop in stock returns. However, this effect is not observed at the day level. We also performed a similar analysis at the market level. Specifically, we analyzed the relationship between sentiment from all StockTwits messages and returns of the Dow Jones Industrial Average. We found similar, and stronger, patterns.

Consistent with most prior academic studies, we did not find that microblog sentiment predicts stock returns at the day level. But our results do show that negative sentiment predicts returns within a day. We believe that this is mainly because microblogs provide a high-velocity information channel. It suits the information needs of intraday traders, and effectively reflects their opinions toward the market. By conducting data analysis at the hourly frequency, we were able to reveal how such opinions intertwine with stock returns in very short cycles. Furthermore, the stronger effect of negative sentiment that we discovered is consistent with our understanding of the behavioural bias of noise traders, who are more sensitive to losses than to gains.

Our findings provide an important implication for intraday trading strategies using microblog data, namely that positive and negative sentiment have different effects. While it has been common to construct a single sentiment index by weighting positive and negative sentiment equally, it is negative sentiment that drives the market in the short term. An effective intraday trading strategy should be sensitive to changes in negative sentiment, but not too responsive to fluctuations in positive sentiment.
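As a rough illustration of this implication, the sketch below contrasts a conventional equal-weight sentiment index with one that weights negative messages more heavily. The weights, the normalization, and the hourly message counts are all hypothetical choices made for illustration; they are not estimates from our study and not a recommended trading rule.

```python
def equal_weight_index(positive, negative):
    """Conventional net sentiment: both message counts weighted equally."""
    total = positive + negative
    return 0.0 if total == 0 else (positive - negative) / total


def asymmetric_index(positive, negative, w_pos=0.25, w_neg=1.0):
    """Net sentiment that reacts more to negative than to positive messages.

    The default weights are hypothetical, chosen only for illustration.
    """
    total = w_pos * positive + w_neg * negative
    return 0.0 if total == 0 else (w_pos * positive - w_neg * negative) / total


# Hypothetical hourly counts: (positive messages, negative messages).
hours = [(40, 10), (40, 20), (80, 20)]
for pos, neg in hours:
    print(pos, neg, equal_weight_index(pos, neg), asymmetric_index(pos, neg))
```

With these invented counts, doubling the number of negative messages in the second hour moves the asymmetric index from neutral to clearly negative, whereas doubling the number of positive messages in the third hour only brings it back to neutral.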



Shuyuan (Lance) Deng is an assistant professor of management information systems at the Seidman College of Business, Grand Valley State University. He received his PhD in Information Systems from the University of Wisconsin-Milwaukee and his MBA from the University of Illinois at Chicago. His research interests include artificial intelligence, machine learning, natural language processing, the social and economic impact of information systems, and enterprise systems.

Zhijian (James) Huang is an assistant professor of finance at the Saunders College of Business, Rochester Institute of Technology. He received his PhD in Finance from Pennsylvania State University and his Master of Financial Engineering from Cornell University. His areas of expertise include real-time simulation of trading strategies, discovery and assessment of financial market anomalies, stock market predictability, and behavioural finance. He has published in the Journal of Financial Economics, Quarterly Journal of Finance, and Review of Quantitative Finance and Accounting.

Atish Sinha is a professor of information technology management and the Director of the Center for Technology Innovation at the Lubar School of Business, University of Wisconsin-Milwaukee. He earned his PhD in business, with a concentration in Artificial Intelligence, from the University of Pittsburgh. His current research interests are in the areas of business intelligence, big data analytics, machine learning, connected systems, and healthcare informatics.

Huimin Zhao is a professor of information technology management at the Lubar School of Business, University of Wisconsin-Milwaukee. He received his B.E. and M.E. degrees in Automation from Tsinghua University, China, and his PhD in Management Information Systems from the University of Arizona, USA. His current research interests include data mining and healthcare informatics.