Sentiment Analysis


Introduction

What is Sentiment Analysis?

Sentiment analysis is the field concerned with identifying human feelings, responses, and judgments expressed in textual sources such as blogs, business websites, and social media platforms.

In simple terms, sentiment analysis helps us recognize an author’s intent or attitude towards a certain topic.

Why is Sentiment Analysis important?

Well, who wants to spend every hour of the day in front of a computer analyzing customer sentiments on social media platforms or a business website in real time? Is it even possible?

That’s where algorithms take over these dauntingly difficult tasks from humans. The information they process can be turned into organizational insights that improve the business model, product research, customer management, and more.

With the increasing popularity of AI technologies and the rapid evolution of hardware resources, even smaller companies can adopt these techniques to improve their businesses.

What is Sentiment Analysis used for?

Sentiment analysis is widely used in areas such as data mining, chatbots on business websites, and social media analytics. Sentiments are an important signal of human behavior. Sentiments, views, opinions, and feedback can be positive, negative, or neutral, or they can be segregated into any categories of interest in the context of your business model.

Nowadays, social media is probably the biggest platform for customer engagement, facilitating millions of interactions with the products and services offered by countless providers. Social media platforms let customers connect with their service providers and vice versa. For any business, customer feedback is essential for constantly improving its products and services.

Beyond a direct business perspective, sentiment analysis is also used to gauge how people react to certain news topics, a political debate, or a sporting event, though the data generated from these events may still be used somewhere down the road to inform a business strategy.

Sentiment analysis can also support crisis prevention by collecting and analyzing customer sentiment in real time. If algorithms detect negative customer sentiment, organizations can reach out to the unhappy customers to prevent the underlying issues from escalating.

How is Sentiment Analysis implemented?

Sentiment analysis is not performed only to get an overview of how positive or negative a certain sentiment is. How far the analysis can go depends on the quality and the quantity of the data we have access to. Sentiments need to be classified into specified categories. So how does this classification take place?

Generally, the most popular approaches in the community are the machine learning approach and the lexicon-based approach. The machine learning approach can be further divided into supervised and unsupervised learning. Supervised learning uses data that has already been labeled with the category each sentiment belongs to. Unsupervised learning uses unlabeled data with techniques such as clustering, usually k-means clustering, to group the sentiments.

In any Data Science/Machine Learning project, the first and most important phase is data preparation. Both the quality and the quantity of the data matter, and the data needs to be well processed. It is better to take enough time for data gathering and preprocessing than to rush towards the subsequent stages. This practice can address some of the pitfalls you may otherwise encounter in the later stages of model tuning. Without quality data, it is possible to end up with a model that is biased or that does not perform as expected according to the metric we choose (e.g. F-score).

Supervised Sentiment Analysis

In the data preprocessing stage, once the data cleanup has been performed (removing punctuation, converting to lowercase, addressing missing data, removing stopwords), lemmatization or stemming can be used to convert each word in the cleaned corpus to its base form, which is called a lemma or a stem depending on which method you chose. When it comes to classification, multiple algorithms can do the job without resorting to deep learning techniques. The choice depends on what you are trying to achieve, data availability, requirements, and other organizational needs.
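To make the cleanup steps concrete, here is a minimal sketch using only the Python standard library. The stopword list and the suffix-stripping toy_stem function are hypothetical stand-ins for a real stopword list and a real stemmer or lemmatizer, so treat the output as illustrative only.

```python
import string

# A tiny, hypothetical stopword list; real projects usually use a much larger one.
STOPWORDS = {"i", "am", "in", "with", "my", "it", "is", "a", "an", "the", "and", "s"}

# Naive suffix stripping, standing in for a real stemmer (an assumption of this sketch).
SUFFIXES = ("ing", "ed", "ly", "s")


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, replace punctuation with spaces, drop stopwords, then stem."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = text.lower().translate(table).split()
    return [toy_stem(token) for token in tokens if token not in STOPWORDS]


print(preprocess("I am in love with my new laptop. It's really fast!"))
# ['love', 'new', 'laptop', 'real', 'fast']
```

Note how crude stemming turns 'really' into 'real'; a lemmatizer would make a dictionary-based decision instead.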

Naive Bayes is one of the popular techniques for text classification. It is an application of Bayes’ theorem that assumes the words in our corpus (the features) are independent of each other given the class. As an example, let’s pick this sentence: ‘I am in love with my new laptop. It’s really fast, reliable, and well-designed!’ The outlook of this sentence is obviously positive.

The Naive Bayes model assumes that words like ‘new’, ‘really’, ‘fast’, ‘love’, and ‘reliable’ each contribute independently to the positive class. In other words, the probability that the sentence is positive given that it uses the word ‘reliable’ is not affected by the other words. This assumption does not really hold, because some words appear together more often than others. Still, under the model, the count of each word contributes to the class independently.
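The independence assumption is easier to see in code. Below is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written from scratch; the training data is a hypothetical toy set, and the function names are my own, not from any library.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (tokens, label). Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in documents:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(tokens, model):
    """Return the label with the highest log posterior."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log P(class)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            # log P(word | class) with add-one smoothing. Each word adds its
            # own term, independently of the other words in the sentence.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, already tokenized.
training = [
    (["love", "new", "laptop"], "positive"),
    (["really", "fast", "reliable"], "positive"),
    (["slow", "unreliable", "laptop"], "negative"),
    (["hate", "slow", "screen"], "negative"),
]
model = train_naive_bayes(training)
print(predict(["fast", "reliable", "laptop"], model))  # positive
print(predict(["slow", "screen"], model))              # negative
```

Because the score is a plain sum of per-word log probabilities, reordering the words or removing one never changes the contribution of the others.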

Support Vector Machine (SVM) is another popular choice among practitioners of sentiment analysis. An SVM finds the hyperplane that separates the classes with the largest margin between them. With a large number of features or training samples, SVMs can take a considerable amount of time to train, but thanks to the kernel trick they can also handle classes that are not linearly separable, which makes them a pretty robust solution.
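As a rough illustration of the margin idea, here is a sketch of a linear SVM trained by subgradient descent on the regularized hinge loss. It deliberately leaves out the kernel trick, and the two features per document (a count of positive words and a count of negative words) are a hypothetical choice made only to keep the example small.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.1, reg=0.01):
    """Fit a hyperplane w.x + b = 0 with the hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 penalty keeps w small, which corresponds to a wide margin.
            w = [wi - learning_rate * reg * wi for wi in w]
            if margin < 1:
                # The point is misclassified or inside the margin: move the hyperplane.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def classify(x, w, b):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical features: (number of positive words, number of negative words).
points = [(3, 0), (2, 1), (4, 1), (0, 3), (1, 2), (1, 4)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print(classify((4, 0), w, b))  # a document with only positive words
print(classify((0, 4), w, b))  # a document with only negative words
```

Only the points that fall inside the margin trigger an update, which is why the final hyperplane is determined by the training examples closest to the boundary.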

Deep Learning with LSTMs in RNNs

If you are going for improved accuracy, this is the kind of technique worth considering. Understanding a language and its meaning requires an understanding of syntax, which comes down to the order of the words. A deep learning architecture that takes word order into account is the LSTM (Long Short-Term Memory) architecture. For LSTMs, we need to feed the text in as a sequence.

LSTM Architecture for Sentiment Analysis
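Feeding text as a sequence usually means turning each sentence into an ordered list of integer ids, padded to a fixed length. The sketch below shows only that encoding step, not the LSTM itself; the '<pad>' and '<unk>' tokens and the function names are assumptions of this example.

```python
def build_vocabulary(tokenized_texts):
    """Map each word to an integer id; 0 is reserved for padding, 1 for unknown words."""
    vocabulary = {"<pad>": 0, "<unk>": 1}
    for tokens in tokenized_texts:
        for token in tokens:
            if token not in vocabulary:
                vocabulary[token] = len(vocabulary)
    return vocabulary


def encode(tokens, vocabulary, max_length):
    """Turn tokens into a fixed-length id sequence, preserving word order."""
    ids = [vocabulary.get(token, vocabulary["<unk>"]) for token in tokens[:max_length]]
    return ids + [vocabulary["<pad>"]] * (max_length - len(ids))


texts = [["not", "good"], ["good", "not", "bad"]]
vocab = build_vocabulary(texts)
print(encode(["not", "good"], vocab, max_length=4))  # [2, 3, 0, 0]
print(encode(["good", "not"], vocab, max_length=4))  # [3, 2, 0, 0]
```

The two sentences contain the same words but produce different sequences, which is exactly the information a bag-of-words model throws away and a sequence model can use.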

Unsupervised Sentiment Analysis

Just like in the case of supervised learning, data preprocessing is unavoidable. Once we have cleaned up the dataset which contains our corpus of words, it’s time to introduce the Word2Vec model.

Word2Vec Model

What are word embeddings? Simply put, they are vector representations of words. Word2Vec is one of the most popular techniques for learning word embeddings, and it can be implemented with a shallow neural network rather than a very deep one.

Using one-hot encoding, a word in a vocabulary can be represented as follows.

In a 5-word vocabulary:

Awesome = [0,0,0,1,0]

Good = [0,0,1,0,0]

Poor = [0,1,0,0,0]

Day = [1,0,0,0,0]

Really = [0,0,0,0,1]

If we try to picture these encodings, we can think of a 5-dimensional space in which each word occupies its own dimension and has no projection along the others. According to the above example, ‘good’ and ‘awesome’ are as different as ‘really’ and ‘poor’, which is hardly true.
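A quick way to check the claim is to compute the cosine similarity between the one-hot vectors listed above. This is a small sketch; the helper names are my own.

```python
import math

# Index order matches the one-hot vectors in the text.
vocabulary = ["day", "poor", "good", "awesome", "really"]


def one_hot(word):
    """Return the one-hot vector for a word in the 5-word vocabulary."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(word)] = 1
    return vector


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity(one_hot("good"), one_hot("awesome")))  # 0.0
print(cosine_similarity(one_hot("really"), one_hot("poor")))   # 0.0
```

Every pair of distinct words scores 0.0, so one-hot vectors carry no notion of similarity at all.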

So the objective is to have words with similar meaning/context occupy close spatial positions.

Word embeddings can be learned using two methods:

  • Skip Gram
  • Continuous Bag of Words (CBOW)
CBOW Vs Skip Gram
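The difference between the two methods is in what the network is asked to predict. The sketch below only builds the training examples for each method, using a hypothetical window size of 1; it does not train a network.

```python
def skip_gram_pairs(tokens, window=1):
    """Skip-gram: predict each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_pairs(tokens, window=1):
    """CBOW: predict the center word from its surrounding context words."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(start, end) if j != i]
        pairs.append((context, center))
    return pairs


sentence = ["really", "good", "day"]
print(skip_gram_pairs(sentence))
# [('really', 'good'), ('good', 'really'), ('good', 'day'), ('day', 'good')]
print(cbow_pairs(sentence))
# [(['good'], 'really'), (['really', 'day'], 'good'), (['good'], 'day')]
```

In both cases, words that share contexts end up being pushed towards similar weights, which is where the embeddings come from.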

Once training on the corpus is complete, we use the trained weights of the neural network as the word embeddings. Words that appear in similar contexts now occupy nearby positions in the multidimensional vector space: ‘awesome’ and ‘good’ sit closer to each other than ‘really’ and ‘poor’ do.

Once the vector representations are learned, it’s time to run the K-means algorithm on them with a chosen number of clusters (for example two, for positive and negative) and work out what each resulting cluster represents, which takes some additional analysis. By finding the word vectors that are most similar, in terms of cosine similarity, to each cluster’s center (centroid), we can get an overview of the nature of each cluster, such as whether it is positive or negative.
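Here is a minimal sketch of that clustering step in plain Python. The 2-dimensional embeddings are made up for illustration (real Word2Vec vectors have many more dimensions), and the farthest-first initialization is an assumption chosen to keep the example deterministic.

```python
import math

# Hypothetical 2-D embeddings; in practice these come from a trained Word2Vec model.
embeddings = {
    "awesome": (0.9, 0.8), "good": (0.8, 0.9), "great": (1.0, 1.0),
    "poor": (-0.9, -0.8), "bad": (-0.8, -1.0), "awful": (-1.0, -0.9),
}
words = list(embeddings)
points = list(embeddings.values())


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def initial_centroids(points, k):
    """Farthest-first initialization: start at the first point, then add the farthest one."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids


def k_means(points, k, iterations=10):
    centroids = initial_centroids(points, k)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [min(range(k), key=lambda c: squared_distance(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nearest_word(centroid):
    """The word whose vector is most similar to the centroid hints at what the cluster means."""
    return max(words, key=lambda w: cosine_similarity(embeddings[w], centroid))


centroids, assignments = k_means(points, k=2)
for index, centroid in enumerate(centroids):
    print(index, nearest_word(centroid))
```

With this toy data, the words closest to the two centroids are 'great' and 'awful', which suggests labeling the clusters as positive and negative.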

Lexicon-Based Sentiment Analysis

This approach is not a machine learning approach to sentiment analysis. Instead, we calculate a sentiment score (a sum or an average) by matching the words of a message against a predefined dictionary in which each word is assigned a positive or negative sentiment value. Different approaches are available for creating these dictionaries. The calculated score reflects the overall sentiment of the message.
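The scoring can be sketched in a few lines. The mini-lexicon and its values below are hypothetical; real dictionaries cover thousands of words and are built with the approaches mentioned above.

```python
# Hypothetical mini-lexicon mapping words to sentiment values.
LEXICON = {
    "love": 2.0, "fast": 1.0, "reliable": 1.5,
    "slow": -1.0, "hate": -2.0, "broken": -1.5,
}


def sentiment_score(tokens, average=False):
    """Sum (or average) the lexicon values of the tokens found in the dictionary."""
    scores = [LEXICON[token] for token in tokens if token in LEXICON]
    if not scores:
        return 0.0
    total = sum(scores)
    return total / len(scores) if average else total


def label(score):
    """Map a score to a coarse sentiment category."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = ["love", "new", "laptop", "fast", "reliable"]
print(sentiment_score(review))                # 4.5
print(sentiment_score(review, average=True))  # 1.5
print(label(sentiment_score(["hate", "slow", "screen"])))  # negative
```

Words that are missing from the dictionary simply contribute nothing, so the quality of the result depends heavily on how well the lexicon matches the domain.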