Sentiment Analysis with VADER: Exploring Textual Emotion Analysis
In this blog post, we explore sentiment analysis with VADER (Valence Aware Dictionary and sEntiment Reasoner), a lexicon- and rule-based tool for scoring sentiment in text.
Introduction to VADER
VADER is a lexicon- and rule-based sentiment analysis tool designed for sentiment expressed in social media text. It handles the slang, emoticons, and other informal expressions common in online conversations. Its lexicon is a human-rated list of tokens with valence scores, and a set of rules adjusts those scores for cues such as punctuation, capitalization, negation, and degree modifiers, so no model training is required before use.
!pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import pandas as pd
Getting Started: Installation and Initialization
We start by installing the VADER library with pip, then import the SentimentIntensityAnalyzer class from the vaderSentiment module. After initializing the analyzer, we inspect its built-in lexicon, which maps words, emoticons, and other tokens to sentiment scores.
sa = SentimentIntensityAnalyzer()
len(sa.lexicon)
7506
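To get a feel for what the lexicon holds, it helps to look at its most extreme entries. The sketch below assumes the lexicon behaves like a plain dict mapping each token to a float score, which is how `sa.lexicon` appears to be structured. The helper `extreme_entries` and the `toy_lexicon` values are hypothetical and only there for illustration; they are not VADER's real scores.

```python
def extreme_entries(lexicon, n=3):
    """Return the n most negative and n most positive (token, score) pairs."""
    # Sort ascending by score so negatives come first and positives last.
    ranked = sorted(lexicon.items(), key=lambda item: item[1])
    most_negative = ranked[:n]
    most_positive = ranked[-n:][::-1]  # reverse so the strongest positive is first
    return most_negative, most_positive


# Hypothetical miniature lexicon; the scores are made up for this example.
toy_lexicon = {
    "love": 3.0,
    "good": 2.0,
    "ok": 1.0,
    "bad": -2.5,
    "horrible": -3.0,
    ":(": -2.0,
}

most_negative, most_positive = extreme_entries(toy_lexicon, n=2)
print(most_negative)
print(most_positive)
```

With the analyzer from this post, the same helper could be called as `extreme_entries(sa.lexicon)`.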
corpus = ["Absolutely perfect! Love it! :-) :-) :-)","Horrible! Completely useless. :(","It was OK. Some good and some bad things.","good one but i do not like it"]
for doc in corpus:
    scores = sa.polarity_scores(doc)
    print(scores)
    print('{:+}: {}'.format(scores['compound'], doc))
{'neg': 0.0, 'neu': 0.111, 'pos': 0.889, 'compound': 0.9428}
+0.9428: Absolutely perfect! Love it! :-) :-) :-)
{'neg': 0.91, 'neu': 0.09, 'pos': 0.0, 'compound': -0.8768}
-0.8768: Horrible! Completely useless. :(
{'neg': 0.261, 'neu': 0.522, 'pos': 0.216, 'compound': -0.1531}
-0.1531: It was OK. Some good and some bad things.
{'neg': 0.251, 'neu': 0.565, 'pos': 0.184, 'compound': -0.1815}
-0.1815: good one but i do not like it
We run VADER on a small set of sample texts. For each text, polarity_scores returns four values: neg, neu, and pos are the proportions of the text that fall into each category, and compound is an aggregate score ranging from -1 (extremely negative) to 1 (extremely positive).
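Why does the compound score stay between -1 and 1 even for very emotional text? The sketch below assumes the normalization x / sqrt(x² + α) with α = 15, which is what the vaderSentiment source appears to use on the summed valence of a text. The real library applies its rule-based adjustments before this step, and those are left out here, so treat this as an illustration of the squashing step only.

```python
import math


def normalize(raw_sum, alpha=15):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    # alpha controls how quickly the curve saturates; 15 is assumed here.
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)


# Larger raw sums approach the bounds but never reach them.
for raw in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print('{:+.1f} -> {:+.4f}'.format(raw, normalize(raw)))
```

The curve is roughly linear near zero and flattens out for large sums, so piling on more emotional words moves the score less and less.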
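# Load the tab-separated training data
data_frame = pd.read_table('/content/train.tsv')
phrase = data_frame['Phrase']

review_list = []
for doc in phrase:
    scores = sa.polarity_scores(doc)
    # Label a phrase "pos" only when compound exceeds 0.50;
    # everything else, including neutral text, is labelled "neg"
    if scores['compound'] > 0.50:
        review_list.append('pos')
    else:
        review_list.append('neg')
    print('{:+}: {}'.format(scores['compound'], doc))

# The second positional argument is the index, so the phrases become
# the row labels and the predicted labels fill column 0
final_data = pd.DataFrame(review_list, phrase)
final_data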
Phrase | 0 |
---|---|
A series of escapades demonstrating the adage that what is good for the goose is also good for the gander , some of which occasionally amuses but none of which amounts to much of a story . | pos |
A series of escapades demonstrating the adage that what is good for the goose | neg |
A series | neg |
A | neg |
series | neg |
... | ... |
Hearst 's | neg |
forced avuncular chortles | neg |
avuncular chortles | neg |
avuncular | neg |
chortles | neg |
156060 rows × 1 columns
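Notice that fragments such as "A series" end up labelled neg. That follows from the loop above, which treats every phrase whose compound score is not above 0.50 as negative. A common alternative is a three-way split using the thresholds suggested in the VADER documentation: compound >= 0.05 is positive, compound <= -0.05 is negative, and anything in between is neutral. The helper name `label_from_compound` is hypothetical, and this is a sketch of that convention rather than the method used in this post.

```python
def label_from_compound(compound, pos_threshold=0.05, neg_threshold=-0.05):
    """Map a compound score to 'pos', 'neg', or 'neu'."""
    if compound >= pos_threshold:
        return 'pos'
    if compound <= neg_threshold:
        return 'neg'
    return 'neu'


# The first three scores are copied from the sample output earlier in the post;
# 0.0 is a stand-in for a phrase with no sentiment-bearing tokens.
for compound in (0.9428, -0.8768, -0.1531, 0.0):
    print('{:+}: {}'.format(compound, label_from_compound(compound)))
```

In the labelling loop, this would replace the if/else with `review_list.append(label_from_compound(scores['compound']))`.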