Stock Prediction using Social Media Analysis

From REU@MU
==Progress==
'''Week 1'''
* Installed and used Python libraries for data manipulation
* Found APIs for mining social media and stock data
* Compiled social media and stock data into a single database
* Analyzed the sentiment of every post to find the mean daily sentiment for each stock
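
The last Week 1 step, averaging post sentiment per stock per day, can be sketched as follows. This is only an illustration: the page does not name the sentiment tool that was used, so the tiny word lexicon, the function names, and the sample posts are all hypothetical stand-ins.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon; the project's actual sentiment tool is not named.
POSITIVE = {"bullish", "buy", "up", "strong", "beat"}
NEGATIVE = {"bearish", "sell", "down", "weak", "miss"}


def post_sentiment(text):
    """Score one post in [-1, 1] as (positive hits - negative hits) / all hits."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def daily_mean_sentiment(posts):
    """Map (symbol, date) to the mean sentiment of that day's posts."""
    buckets = defaultdict(list)
    for symbol, date, text in posts:
        buckets[(symbol, date)].append(post_sentiment(text))
    return {key: mean(scores) for key, scores in buckets.items()}


# Made-up sample posts, for illustration only.
posts = [
    ("AAPL", "2017-06-01", "bullish on earnings, buy"),
    ("AAPL", "2017-06-01", "looks weak, sell"),
    ("AAPL", "2017-06-02", "strong quarter"),
]
daily = daily_mean_sentiment(posts)
```

Each key of `daily` is a (symbol, date) pair, so the result can be joined against daily closing prices for the same stock.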
 
'''Week 2'''
* Found no correlation between average daily sentiment and stock price
* Ran into issues with limited data
* Created a regression model that uses word counts and tf-idf features of daily posts to predict price change
* Reached out to StockTwits; partner-level access to their API will be granted
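
The tf-idf features mentioned above can be sketched as below, treating all of one day's posts as a single document. This is a minimal version under stated assumptions: it uses the plain, unsmoothed weighting tf × log(N / df), which may differ from whatever library or variant the project used, and the function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(daily_docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in daily_docs]
    n_docs = len(tokenized)

    # Document frequency: in how many daily documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up daily documents, for illustration only.
docs = ["buy buy calls", "sell puts", "buy puts"]
vectors = tfidf(docs)
```

Terms that occur on every day receive a weight of zero under this variant, so only words that distinguish one day from another remain as inputs for a regression.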
 
'''Week 3'''
* Created a library for the regression model so the code can be reused to analyze different stocks
* Updated the regression model to use past days and ranges of days to predict prices
* Created classification and clustering models using tf-idf
* Still waiting on complete access to the StockTwits API
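
One way to read "past days and ranges of days" is as lag features plus a rolling mean over a window. The sketch below shows that interpretation only; the lag choices, the window length, the function name, and the sample series are assumptions, not details taken from the project.

```python
def lagged_features(values, lags=(1, 2), window=3):
    """For each day t, collect values[t - lag] for every lag, plus the
    mean of the `window` days immediately before t."""
    start = max(max(lags), window)  # first day with enough history
    rows = []
    for t in range(start, len(values)):
        row = [values[t - lag] for lag in lags]
        row.append(sum(values[t - window:t]) / window)
        rows.append((t, row))
    return rows


# Made-up daily sentiment series, for illustration only.
sentiment = [0.1, 0.4, -0.2, 0.3, 0.5]
rows = lagged_features(sentiment)
```

Each row pairs a day index with its feature list, which could then be matched with that day's price change as the regression target.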

Latest revision as of 18:13, 3 August 2017

By Scott Coyne. Mentor: Dr. Praveen Madiraju

Goals and Milestones

1) Complete a literature survey of similar projects

2) Compile all social media and stock price information into a single data frame

3) Determine the sentiment of posts and classify them by value

4) Create multiple machine learning models to predict stock prices and evaluate each of them

5) Calculate weighted scores for users based on their influence and apply them to the model

6) Create a high-level architecture diagram of the system

7) Produce a final project report


Abstract

Stock market prices are becoming increasingly volatile, largely due to improvements in technology and increased trading volume. Speculation affects business owners, investors, and policymakers alike. While these seemingly unpredictable trends continue, investors and consumers take to social media to share thoughts and opinions. We use information shared over StockTwits, a social media platform for investors, to better understand and predict individual stock prices. We designed and implemented three machine learning models to forecast stock prices using a dataset collected from StockTwits. We also evaluated our models against conclusions drawn by previous researchers in this field. Our first model found no correlation between general StockTwits postings and stock price. However, our second and third models took a novel approach and successfully filtered the twits to find important posts. These important twits could predict stock price movements with greater accuracy (around 65% on average) based on sentiment analysis and smart user identification. We consider a user “smart” based on number of likes, follower count, and, more importantly, how often the user is right about a stock.
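
The "smart" user idea can be illustrated with a small scoring sketch. The abstract states only which signals are used (likes, follower count, and past accuracy, with accuracy mattering most); the weights, the log scaling, the cap of 10,000, and the function names below are hypothetical choices made for this example.

```python
import math


def smart_score(likes, followers, correct, total,
                w_acc=0.6, w_likes=0.2, w_followers=0.2, cap=10_000):
    """Combine accuracy, likes, and followers into a score in [0, 1].

    Accuracy gets the largest weight (an assumed value). Counts are
    log-scaled and capped so very popular accounts do not dominate.
    """
    accuracy = correct / total if total else 0.0
    likes_part = min(math.log1p(likes) / math.log1p(cap), 1.0)
    followers_part = min(math.log1p(followers) / math.log1p(cap), 1.0)
    return w_acc * accuracy + w_likes * likes_part + w_followers * followers_part


def predict_direction(weighted_posts):
    """Given (user_score, sentiment) pairs, return 'up' or 'down' from the
    score-weighted sum of sentiments (ties count as 'down')."""
    signal = sum(score * sentiment for score, sentiment in weighted_posts)
    return "up" if signal > 0 else "down"


# Made-up users: one accurate with a small following, one popular but often wrong.
accurate = smart_score(likes=10, followers=100, correct=9, total=10)
popular = smart_score(likes=5000, followers=9000, correct=2, total=10)
direction = predict_direction([(accurate, 1.0), (popular, -1.0)])
```

With these assumed weights the accurate user outranks the popular one, so the combined signal follows the accurate user's bullish post.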