To The Moon: Performing Stock Sentiment Analysis with Dropbase

Learn how a team of four Waterloo computer science students managed to predict the massive rallies of stocks like BlackBerry, GameStop and Bed Bath and Beyond, using sentiment analysis and Dropbase.

This article is part of an ongoing series showcasing some of the best hacks we encountered through the Hack the North competition. To view more of these projects, please check out the full list of hacks using the Dropbase API.

What do GameStop, BlackBerry, and Bed Bath and Beyond have in common?

For University of Waterloo/Wilfrid Laurier University Business and Computer Science double degree students Hong Yi Chen, Brayden Royston and Tailai Wang, alongside their UW Computer Science classmate Bill Cui, this question formed the inspiration for their Hack the North project, To The Moon. Having been friends and classmates before the Hackathon, they came into the weekend with a team already set to go.

How can we measure sentiment surrounding a stock?

To answer the question posed at the beginning of the article, these stocks have two things in common. First, each of them has more than doubled in price in the past month. Second, they were all top picks of the Reddit community WallStreetBets. For those unfamiliar with the forum, it's a group of traders who share risky investing plays, primarily focused on buying options. In recent months, the community has grown to over 2 million users, and its short squeeze of GameStop, which cost short sellers billions of dollars in a single day, has brought it into mainstream financial news.

With the subreddit's hottest stock positions posting such massive gains, the team set out to quantify this sentiment so that a user could quickly see whether WallStreetBets members as a whole were speaking about a stock in a highly positive manner (bullish) or in a negative manner (bearish).

Combining web scraping and sentiment analysis with WallStreetBets

To achieve this feat, the To The Moon team built a scraper that uses the Reddit API to gather the most recent posts on the forum and stores the results as a JSON file. The JSON file was then fed to Dropbase, which stripped out the columns and data not needed for sentiment analysis and stored the cleaned data in a Postgres database. With the data cleaned, they sent it through Google Cloud's Natural Language Processing to parse it for key words and symbols (🚀) associated with positive and negative sentiment.
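To make the idea of scoring posts by key words and symbols concrete, here is a minimal sketch in plain Python. It is a local stand-in, not the Google Cloud service the team used, and the word lists and the `score_post` function are hypothetical; the article does not list the actual terms.

```python
import re

# Hypothetical word lists, chosen for illustration only.
BULLISH = {"moon", "calls", "buy", "squeeze", "🚀"}
BEARISH = {"puts", "sell", "crash", "bagholder"}

# Match runs of lowercase letters, or a single rocket emoji.
TOKEN_RE = re.compile(r"[a-z]+|🚀")


def score_post(text):
    """Return a score in [-1, 1]; 0.0 when no known term appears."""
    tokens = TOKEN_RE.findall(text.lower())
    pos = sum(t in BULLISH for t in tokens)
    neg = sum(t in BEARISH for t in tokens)
    total = pos + neg
    if total == 0:
        return 0.0
    return (pos - neg) / total


if __name__ == "__main__":
    print(score_post("GME to the moon 🚀🚀"))
    print(score_post("loading up on puts"))
```

A simple lexicon like this misses sarcasm and context, which is one reason to hand the text to a dedicated NLP service instead.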

This NLP data was then aggregated and put through a weighted algorithm to give each stock an overall sentiment rating. The team created a React frontend that displays the 10 most positive stocks, giving users a quick overview of hot stocks they might want to invest in.
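The article does not spell out the team's weighting, so the following is only one plausible sketch of a weighted aggregation. It assumes each post already carries a ticker, a sentiment score in [-1, 1] and an upvote count, and it weights each post by `1 + upvotes`; the field names and the `rank_tickers` function are hypothetical.

```python
from collections import defaultdict


def rank_tickers(posts, top_n=10):
    """Return (ticker, rating) pairs, most positive first.

    The rating is an upvote-weighted mean of the post scores.
    The weighting scheme is an assumption, not the team's formula.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for post in posts:
        # Every post counts at least once; negative upvotes are clamped to 0.
        weight = 1 + max(post["upvotes"], 0)
        weighted_sum[post["ticker"]] += weight * post["score"]
        weight_total[post["ticker"]] += weight
    ratings = {t: weighted_sum[t] / weight_total[t] for t in weighted_sum}
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


if __name__ == "__main__":
    sample = [
        {"ticker": "GME", "score": 1.0, "upvotes": 9},
        {"ticker": "GME", "score": -1.0, "upvotes": 0},
        {"ticker": "BB", "score": 0.5, "upvotes": 3},
    ]
    for ticker, rating in rank_tickers(sample):
        print(ticker, round(rating, 3))
```

Weighting by upvotes lets a widely endorsed post count for more than a post nobody engaged with, while the `top_n` slice matches the top 10 list shown in the frontend.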

Cleaning and storing data for sentiment analysis using Dropbase

For the To The Moon team, the data they got from Reddit was in a JSON format that included a number of elements they did not want to pass through to the NLP sentiment analysis stage. So the team used Dropbase to import their JSON data and apply a number of custom functions that transformed it into a clean format, which could then be passed along to the Google Cloud NLP. In particular, the team was impressed by how easy it was to process and clean the data using Dropbase, and by the ability to add custom functions to better fit their data cleaning needs.
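For readers who want a feel for this kind of cleaning step, here is a rough equivalent in plain Python rather than Dropbase. It assumes the raw file follows the Reddit listing layout (`data` → `children` → `data`); the choice of fields to keep and the `clean_listing` function are hypothetical, since the article does not say which columns the team retained.

```python
import json

# Assumed subset of post fields that matter for sentiment analysis.
KEEP_FIELDS = ("id", "title", "selftext", "score", "created_utc")


def clean_listing(raw_json):
    """Keep only KEEP_FIELDS per post and drop removed or empty posts."""
    listing = json.loads(raw_json)
    rows = []
    for child in listing["data"]["children"]:
        post = child["data"]
        row = {key: post.get(key) for key in KEEP_FIELDS}
        # Posts taken down by moderators or authors carry placeholder text.
        if row["selftext"] in ("[removed]", "[deleted]"):
            continue
        text = f'{row["title"] or ""} {row["selftext"] or ""}'.strip()
        if not text:
            continue
        rows.append(row)
    return rows


if __name__ == "__main__":
    raw = json.dumps({"data": {"children": [
        {"kind": "t3", "data": {"id": "a1", "title": "GME 🚀", "selftext": "",
                                "score": 42, "created_utc": 1611000000,
                                "thumbnail": "self"}},
    ]}})
    print(clean_listing(raw))
```

The resulting rows are flat dictionaries, which map naturally onto columns in a database table such as the Postgres store mentioned earlier.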

Next Steps for To The Moon

There are a couple of features the team still wants to add. The first is to expand the amount of information a user can access about each stock in the website's frontend. The second is a weighted portfolio based on the overall positivity of sentiment for each of these companies, to track how the sentiment matches up with real stock performance.
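Since the weighted portfolio is still a planned feature, the team's design is unknown. One simple way it could work, sketched here as an assumption, is to allocate capital in proportion to each stock's positive sentiment rating and skip stocks with neutral or negative ratings. The `portfolio_weights` function is hypothetical.

```python
def portfolio_weights(ratings):
    """Map {ticker: rating} to {ticker: fraction of portfolio}.

    Only tickers with a positive rating receive an allocation,
    and the returned fractions sum to 1. An empty dict is returned
    when no ticker has a positive rating.
    """
    positive = {t: r for t, r in ratings.items() if r > 0}
    total = sum(positive.values())
    if total == 0:
        return {}
    return {t: r / total for t, r in positive.items()}


if __name__ == "__main__":
    print(portfolio_weights({"GME": 0.75, "BB": 0.25, "XYZ": -0.5}))
```

Tracking the value of such a portfolio over time against the actual share prices would show how closely sentiment matches real performance.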

In the week between talking with the team and writing this article, Bed Bath and Beyond's share price rose 44%, BlackBerry is up 50% and GameStop is up more than 300% (although the volatility of all these stocks is high). With results like these, To The Moon might just have the formula to get your finances to reach the moon 🚀

