Building A Bitcoin Sentiment Analyzer Based On Twitter In Java
Analyzing recent tweets can give a valuable indication of the general feeling around Bitcoin.
Created in 2009, Bitcoin is still a fairly young project, and its market price remains very volatile. Bitcoin is also highly sensitive to crowd effects such as FOMO (Fear Of Missing Out), which gripped the market at the end of 2017. Driven by general euphoria, the Bitcoin price climbed to nearly $20,000, its all-time high at that point.
Being able to measure this type of sentiment around Bitcoin can give a useful indication of where the price of Bitcoin may head in the coming hours. One practical approach is to analyze activity around Bitcoin on a social network such as Twitter. In this article, I will show you how to create a Java program that analyzes the general sentiment around Bitcoin from tweets retrieved on Twitter.
Bitcoin Sentiment Analyzer Specifications
The Bitcoin Sentiment Analyzer that you will learn to develop will perform the following actions during its execution:
- Retrieve a sample of tweets from Twitter containing the keyword bitcoin
- Analyze the retrieved tweets one by one to detect the general sentiment of each
- Display the percentage of each of the following 5 sentiments for Bitcoin on Twitter: very negative, negative, neutral, positive, very positive
The program exits after each run instead of performing this analysis continuously, because of the quotas Twitter imposes on its free developer API.
Creating The Java Project
The first step will be to create a Java project. We will use Maven as a dependency manager. In terms of dependencies, we will have the following code libraries:
- Twitter4J which is an unofficial Java client for the Twitter API
- Stanford CoreNLP which is an open source library for natural language processing
Twitter4J will allow us to retrieve a sample of tweets from Twitter in an easy way. The Stanford CoreNLP code library will allow us to detect the sentiment associated with each of the retrieved tweets.
This gives us the following POM for our project:
Creating A Twitter Application
Using the Twitter API requires the creation of a developer account. It is free of charge, but usage is limited by call quotas. For our project, which is for educational purposes, these limits are perfectly acceptable. You can create the developer account required to use the Twitter API here: https://developer.twitter.com
Once you have created this account, you will be taken to the next page:
You will need to create a new application. I have chosen to call my application “Bitcoin_Sentiment_Analyzer”. During the creation of this application, you will have to fill in a certain amount of information about it. Finally, you will arrive on the “Keys and tokens” screen where you will find the information that will allow you to authenticate yourself correctly when calling the Twitter API to retrieve tweets related to Bitcoin:
Retrieving The Tweets
Now that the Twitter application associated with the Bitcoin Sentiment Analyzer is created, we will be able to move on to retrieving tweets within our program. To do this, we will rely on the Twitter4J code library.
The entry points of Twitter4J are the TwitterFactory and Twitter classes.
The TwitterFactory class takes as input a Configuration object instance that contains the connection information to the Twitter API:
- Consumer API Key
- Consumer API Secret Key
- Access Token
- Access Token Secret
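Hard-coding these four values in source code makes them easy to leak. A minimal sketch, assuming they are supplied through environment variables, could look like the following. The class name TwitterCredentials, the method fromEnvironment and the variable names (TWITTER_CONSUMER_KEY and so on) are hypothetical and not part of Twitter4J; the resulting fields would then be handed to the Twitter4J configuration.

```java
import java.util.Map;

/** Holds the four values shown on the "Keys and tokens" screen. */
final class TwitterCredentials {
    final String consumerKey;
    final String consumerSecret;
    final String accessToken;
    final String accessTokenSecret;

    private TwitterCredentials(String consumerKey, String consumerSecret,
                               String accessToken, String accessTokenSecret) {
        this.consumerKey = consumerKey;
        this.consumerSecret = consumerSecret;
        this.accessToken = accessToken;
        this.accessTokenSecret = accessTokenSecret;
    }

    /**
     * Reads the credentials from a map such as System.getenv().
     * The variable names below are hypothetical; pick any names you like.
     */
    static TwitterCredentials fromEnvironment(Map<String, String> env) {
        return new TwitterCredentials(
                require(env, "TWITTER_CONSUMER_KEY"),
                require(env, "TWITTER_CONSUMER_SECRET"),
                require(env, "TWITTER_ACCESS_TOKEN"),
                require(env, "TWITTER_ACCESS_TOKEN_SECRET"));
    }

    /** Fails fast with a clear message when a value is missing or empty. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }
}
```

In the real program, the call would be TwitterCredentials.fromEnvironment(System.getenv()), so that no secret ever appears in the code base.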
I then obtain a Twitter instance from the TwitterFactory by calling its getInstance method. With this object, we will be able to launch queries on the Twitter API. We will use its search method to retrieve tweets matching certain criteria.
A query is modeled by the Query object, which takes as input a string corresponding to the query that you want to execute via the Twitter API. In the case of the Bitcoin Sentiment Analyzer, I want to retrieve tweets that contain the keyword bitcoin, are neither retweets nor replies, and contain no links or images.
This query is represented with the following string:
bitcoin -filter:retweets -filter:links -filter:replies -filter:images
The setCount method of the Query class allows you to define the number of results you want to retrieve. In the case of the free developer API, this number is limited to 100 results per request.
Finally, it remains to execute this query by passing it to the search method of the Twitter instance. A QueryResult object is returned, on which we call the getTweets method to retrieve a list of Status objects. Each Status object represents a tweet. Its textual content is then accessible via its getText method.
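The query string above can be assembled from a keyword and a list of excluded filters instead of being typed by hand. The following is a small sketch using only the Java standard library; the class SearchQueryBuilder and its two methods are hypothetical helpers, not part of Twitter4J. The string returned by build is what would be passed to the Query constructor, and the value returned by clampCount is what would be passed to setCount.

```java
import java.util.StringJoiner;

/** Hypothetical helper that prepares the inputs of a Twitter search query. */
final class SearchQueryBuilder {

    /** Maximum number of results per request, as stated in the article. */
    static final int MAX_RESULTS_PER_REQUEST = 100;

    /** Builds "keyword -filter:a -filter:b ..." from a keyword and excluded filters. */
    static String build(String keyword, String... excludedFilters) {
        StringJoiner query = new StringJoiner(" ");
        query.add(keyword);
        for (String filter : excludedFilters) {
            query.add("-filter:" + filter);
        }
        return query.toString();
    }

    /** Keeps the requested result count within the range 1 to 100. */
    static int clampCount(int requested) {
        return Math.max(1, Math.min(requested, MAX_RESULTS_PER_REQUEST));
    }
}
```

Calling SearchQueryBuilder.build("bitcoin", "retweets", "links", "replies", "images") yields exactly the query string shown above.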
All this gives the following searchTweets method:
Analyzing The General Sentiment Of A Tweet
The next step is to analyze the general sentiment of a tweet. Many solutions in the cloud exist to analyze the general sentiment of a text. Google, Amazon or Microsoft offer solutions for example. However, there are also very good free and open-source solutions such as the Stanford CoreNLP code library.
The Stanford CoreNLP code library perfectly meets our needs as part of our Bitcoin Sentiment Analyzer program.
The StanfordCoreNLP class is the entry point to the API. We instantiate this object by passing it a Properties instance in which we define the annotators to use during text analysis.
I then call the process method of the StanfordCoreNLP object to start the text analysis. In return, I get an Annotation object from which I retrieve the sentences of the text as CoreMap objects, and I iterate over them. For each of these objects, I retrieve a Tree object by calling the get method with the SentimentAnnotatedTree class (nested in SentimentCoreAnnotations) as input.
Finally, I call the static getPredictedClass method of the RNNCoreAnnotations class, passing it this Tree instance. The return value is the sentiment of that part of the analyzed text. The general sentiment of the whole text is taken to be the sentiment of its longest part.
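The "keep the sentiment of the longest part" rule can be isolated from the NLP library and sketched in plain Java. In this hedged example, the ScoredSentence class and the overallSentiment method are hypothetical names; in the real program, each text would come from a CoreMap and each predicted class from getPredictedClass.

```java
import java.util.List;

/** Hypothetical sketch of the "longest part wins" rule described above. */
final class LongestSentenceSentiment {

    /** One part of the analyzed text with the class predicted for it. */
    static final class ScoredSentence {
        final String text;
        final int predictedClass;

        ScoredSentence(String text, int predictedClass) {
            this.text = text;
            this.predictedClass = predictedClass;
        }
    }

    /**
     * Returns the predicted class of the longest sentence.
     * If the list is empty, defaultClass is returned instead.
     * On ties, the first of the longest sentences wins.
     */
    static int overallSentiment(List<ScoredSentence> sentences, int defaultClass) {
        int overall = defaultClass;
        int longestLength = 0;
        for (ScoredSentence sentence : sentences) {
            if (sentence.text.length() > longestLength) {
                longestLength = sentence.text.length();
                overall = sentence.predictedClass;
            }
        }
        return overall;
    }
}
```

This rule is a simplification: a short but strongly worded sentence is ignored whenever a longer sentence is present in the same tweet.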
The general sentiment calculated for the input text is expressed as an integer from 0 to 4 inclusive, where 0 is very negative, 1 negative, 2 neutral, 3 positive, and 4 very positive.
To make the general sentiment easier to manipulate later on, I define a TypeSentiment enum that maps each of these integer values to a named sentiment.
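A possible shape for such an enum is sketched below. Only the name TypeSentiment comes from the article; the constant names, the getIndex accessor and the fromIndex lookup are assumptions made for illustration.

```java
/** Maps the integer classes 0 to 4 to named sentiments. */
enum TypeSentiment {
    VERY_NEGATIVE(0),
    NEGATIVE(1),
    NEUTRAL(2),
    POSITIVE(3),
    VERY_POSITIVE(4);

    private final int index;

    TypeSentiment(int index) {
        this.index = index;
    }

    int getIndex() {
        return index;
    }

    /** Returns the sentiment for a predicted class, rejecting values outside 0 to 4. */
    static TypeSentiment fromIndex(int index) {
        for (TypeSentiment sentiment : values()) {
            if (sentiment.index == index) {
                return sentiment;
            }
        }
        throw new IllegalArgumentException("Unknown sentiment index: " + index);
    }
}
```

Rejecting unknown values with an exception, rather than silently returning a default, makes a change in the model's output range visible immediately.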
All this gives the following code:
Assembling The Different Parts Of The Program
We are now able to retrieve tweets corresponding to a given keyword. We are also able to analyze each of these tweets to get the general sentiment associated with it. All that remains is to assemble all this into the main method of the BitcoinSentimentAnalyzer class.
First of all, I define a HashMap that will store the number of times each sentiment has been found within the analyzed tweets. Then I call the searchTweets method with the keyword “bitcoin” as input.
The next step is to iterate over the Status objects contained in the list returned by the searchTweets method. For each tweet, I retrieve the associated text and call the analysisSentiment method to calculate the associated general sentiment in the form of a TypeSentiment instance.
Each time a sentiment is returned, we increment our counter within the HashMap. After analyzing all the retrieved tweets, we can display the percentage of each sentiment about Bitcoin to give the distribution of current sentiments on Twitter.
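The counting and percentage steps can be sketched independently of Twitter and the NLP library. The class SentimentDistribution and its methods are hypothetical names. This sketch uses an EnumMap where the article uses a HashMap, which is a deliberate substitution that keeps the sentiments in declaration order, and it repeats a minimal TypeSentiment so that the block compiles on its own.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch: counts sentiments and reports their distribution. */
final class SentimentDistribution {

    /** Minimal stand-in for the article's TypeSentiment enum. */
    enum TypeSentiment { VERY_NEGATIVE, NEGATIVE, NEUTRAL, POSITIVE, VERY_POSITIVE }

    private final Map<TypeSentiment, Integer> counts = new EnumMap<>(TypeSentiment.class);
    private int total;

    /** Increments the counter of the given sentiment. */
    void record(TypeSentiment sentiment) {
        counts.merge(sentiment, 1, Integer::sum);
        total++;
    }

    /** Returns the share of the given sentiment in percent, or 0 if nothing was recorded. */
    double percentage(TypeSentiment sentiment) {
        if (total == 0) {
            return 0.0;
        }
        return 100.0 * counts.getOrDefault(sentiment, 0) / total;
    }

    /** Prints one line per sentiment, including those that never occurred. */
    void print() {
        for (TypeSentiment sentiment : TypeSentiment.values()) {
            System.out.printf("%s: %.1f%%%n", sentiment, percentage(sentiment));
        }
    }
}
```

Guarding against a total of zero matters in practice, because a search can return no tweets at all when the quota is exhausted or the filters are too strict.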
This gives the following complete code:
Bitcoin Sentiment Analyzer In Action
Now comes the best part of this article: putting the Bitcoin Sentiment Analyzer we have just built into action. After executing the program and after a few seconds of analysis, I obtain the following result:
On the sample of tweets returned by the Twitter API, the general sentiment around Bitcoin is as follows:
- 2% of very negative tweets
- 72% of negative tweets
- 12% of neutral tweets
- 14% of positive tweets
Our Bitcoin Sentiment Analyzer clearly shows that the general sentiment is rather negative about Bitcoin on Twitter right now.
To Go Further
Our Bitcoin Sentiment Analyzer is fully functional and provides a solid basis for further analysis of sentiment around Bitcoin. For example, you could run this analysis continuously and correlate it with the price of Bitcoin, which you can retrieve through the Bitcoin Price Index API from CoinDesk.
You could then determine whether the general sentiment around Bitcoin on Twitter is directly related to the evolution of its price. This program may help you improve your predictions of the future price of Bitcoin. For this kind of continuous analysis, you will need to switch to Twitter's paid developer API to get tweets about Bitcoin in real time.
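One simple way to check whether two series move together is the Pearson correlation coefficient. The sketch below is only an illustration of that idea, not a method taken from this article: the class SentimentPriceCorrelation is a hypothetical name, and it assumes you have already collected one sentiment score and one price per time step.

```java
/** Hypothetical sketch: Pearson correlation between two equally long series. */
final class SentimentPriceCorrelation {

    /**
     * Returns a value between -1 and 1, or NaN when one series is constant.
     * x could hold sentiment scores and y the prices at the same time steps.
     */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("Series must have the same length of at least 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0.0;
        double varianceX = 0.0;
        double varianceY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        if (varianceX == 0.0 || varianceY == 0.0) {
            return Double.NaN; // correlation is undefined for a constant series
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }
}
```

A coefficient close to 1 or -1 only indicates that the two series moved together over the observed period. It says nothing about which one drives the other.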