Sentiment analysis determines whether a given text expresses negative, positive, or neutral sentiment. It is a form of text analytics that uses natural language processing (NLP) and machine learning, and it is also known as “opinion mining” or “emotion artificial intelligence”. Tweets are a useful source for this kind of analysis because they are available in large volumes and yield a vast amount of sentiment data. These data help in understanding people's opinions on a variety of topics.
To build our dataset, we opened a Twitter Developer account and extracted raw tweets matching our requirements. We then compiled the collected data into a CSV file, ready to be loaded into a dataframe.
In the data preprocessing phase, we removed stop words, comments, URLs, and other unnecessary items from our dataset, using popular built-in methods. We then calculated polarity scores to check that our dataset is balanced and not biased towards any polarity.
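The cleaning described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used in the project: the stop-word list `STOP_WORDS`, the function name `clean_tweet`, and the regular expressions are all assumptions. URLs are removed before punctuation so that the URL pattern still matches, and stop words are removed last so that words attached to punctuation are still caught.

```python
import re
import string

# Hypothetical, deliberately short stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for"}

URL_PATTERN = re.compile(r"(https?://\S+|www\.\S+)")
REPEAT_PATTERN = re.compile(r"(\w)\1{2,}")  # three or more of the same character
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return a lower-cased tweet without URLs, punctuation, digits or stop words."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)         # drop URLs / hyperlinks first
    text = text.translate(PUNCT_TABLE)        # drop punctuation and special symbols
    text = re.sub(r"\d+", " ", text)          # drop numerical values
    text = REPEAT_PATTERN.sub(r"\1\1", text)  # squeeze long character runs to two
    words = [word for word in text.split() if word not in STOP_WORDS]
    return " ".join(words)
```

With a pandas dataframe, the function could be applied to the tweet column with something like `df["Tweets"].apply(clean_tweet)`.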
For model building, we first split our dataset into training and testing parts. We then fit a MultinomialNB model and obtained some initial results. Because MultinomialNB works best for classification when the classes are sufficiently different from each other, we also used SGDClassifier to try to get a better result.
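A minimal sketch of this comparison is shown below, assuming scikit-learn is installed. The toy `tweets` and `labels` lists are hypothetical placeholders for the cleaned tweet text and its class column, and `build_pipeline` is an illustrative helper name; the hyperparameters are library defaults, not necessarily those used in the project.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the real tweet and label columns.
tweets = [
    "we stand with ukraine today", "ukraine deserves our help",
    "sending aid to ukraine now", "ukraine flag on every street",
    "proud of the people of ukraine", "russia has its reasons",
    "i agree with russia here", "russia is being treated unfairly",
    "support for russia is growing", "people of russia want respect",
]
labels = ["Ukraine"] * 5 + ["Russia"] * 5


def build_pipeline(classifier):
    """Word counts -> TF-IDF weights -> classifier."""
    return Pipeline([
        ("counts", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", classifier),
    ])


# 80% training / 20% testing split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.2, random_state=42, stratify=labels
)

models = {
    "MultinomialNB": MultinomialNB(),
    "SGDClassifier": SGDClassifier(random_state=42),
}
results = {}
for name, classifier in models.items():
    pipeline = build_pipeline(classifier)
    pipeline.fit(X_train, y_train)
    predicted = pipeline.predict(X_test)
    results[name] = accuracy_score(y_test, predicted)
    print(name, "accuracy:", results[name])
    print(classification_report(y_test, predicted, zero_division=0))
```

Wrapping the vectorizer, the TF-IDF transform, and the classifier in one pipeline means both models see identically prepared features, so any difference in accuracy comes from the classifier alone.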
To evaluate the results, we computed the accuracy of each model and printed a classification report of the metrics.
Finally, we observe that the overall accuracy of our model, or the accuracy for each class, can be improved by refining our dataset.
Step 1: Create a Twitter developer account to get access to the Twitter API
Step 2: Apply for API access through the developer account and get your Twitter API keys and tokens
Step 3: Fetch data from Twitter API in Python
Step 4: Install tweepy, which provides a way to invoke certain HTTP endpoints without dealing with low-level details.

Step 5: Authenticate with your credentials, which are available once you have registered for a developer account. This step is essential for getting the data.

Step 6: Set up the search query describing the content for which we want to collect data.

Step 7: Collect the tweets and append them to a list

Step 8: Create a dataset using a pandas dataframe

Step 9: Convert the dataset to a CSV file
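The collection steps above might look roughly like the sketch below. It assumes Tweepy 4.x, access to the standard v1.1 search endpoint, and pandas; the function names `fetch_tweets` and `tweets_to_csv`, the query, and the output path are illustrative, and the credentials are placeholders that must come from your own developer account. Tweepy is imported inside `fetch_tweets` so the dataframe helper can be used without it.

```python
import pandas as pd


def fetch_tweets(query, limit, consumer_key, consumer_secret,
                 access_token, access_token_secret):
    """Authenticate, run a search query and return the tweet texts as a list."""
    import tweepy  # assumed Tweepy 4.x

    auth = tweepy.OAuth1UserHandler(
        consumer_key, consumer_secret, access_token, access_token_secret
    )
    api = tweepy.API(auth, wait_on_rate_limit=True)
    cursor = tweepy.Cursor(
        api.search_tweets, q=query, lang="en", tweet_mode="extended"
    )
    texts = []
    for status in cursor.items(limit):
        texts.append(status.full_text)  # collect each tweet into the list
    return texts


def tweets_to_csv(texts, path):
    """Build a one-column dataframe named "Tweets" and write it to a CSV file."""
    df = pd.DataFrame({"Tweets": texts})
    df.to_csv(path, index=False)
    return df
```

A run could then be as short as `tweets_to_csv(fetch_tweets("some query", 1000, ...), "tweets.csv")`, with the four credential strings filled in.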
Step 1: START
Step 2: Import necessary libraries and packages
Step 3: Read the dataset and convert it into a pandas dataframe
Step 4: Convert the contents of the column named “Tweets” into lower case
Step 5: Define a list of stop words
Step 6: Remove the stop words using the above-mentioned list
Step 7: Remove punctuation and special symbols
Step 8: Remove repeating characters
Step 9: Remove URLs/hyperlinks
Step 10: Remove numerical values
Step 11: Import nltk and download 'vader_lexicon'
Step 12: From nltk.sentiment.vader, import SentimentIntensityAnalyzer
Step 13: Create a new column named “polarity scores” containing the polarities of individual tweets from our dataset using SentimentIntensityAnalyzer
Step 14: Create a new column named “polarity” containing the overall compound polarities of the tweets
Step 15: Print the results containing the number of tweets in favor of Russia/Ukraine and in favor of War/No War
Step 16: Divide the dataset into training (80%) and testing (20%) datasets
Step 17: Import CountVectorizer and TfidfTransformer from sklearn.feature_extraction.text, and MultinomialNB from sklearn.naive_bayes
Step 18: Import Pipeline from sklearn.pipeline
Step 19: Train the MultinomialNB model through the pipeline on the “Tweets” and “Support” columns
Step 20: Predict the result and compare it with “Support” and find the accuracy
Step 21: Import SGDClassifier from sklearn.linear_model
Step 22: Train the SGDClassifier model through the pipeline on the “Tweets” and “Support” columns
Step 23: Predict the result and compare it with “Support” and find the accuracy
Step 24: Repeat Step 19 to Step 23 for “Tweets” and “War”
Step 25: Import metrics from sklearn
Step 26: Print the classification report
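The polarity steps (Steps 11 to 14) can be sketched as below, assuming NLTK and pandas are installed and the VADER lexicon can be downloaded. The helper names, the extra “label” column, and the 0.05 threshold are assumptions added for illustration; the threshold is the cut-off conventionally suggested for VADER compound scores, not a value stated in this report.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def add_polarity_columns(df, text_column="Tweets"):
    """Return a copy of a pandas dataframe with VADER polarity columns added."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # needs network access on first run
    analyzer = SentimentIntensityAnalyzer()

    out = df.copy()
    # Dict with 'neg', 'neu', 'pos' and 'compound' entries for each tweet.
    out["polarity scores"] = out[text_column].apply(analyzer.polarity_scores)
    # Overall compound polarity of each tweet.
    out["polarity"] = out["polarity scores"].apply(lambda scores: scores["compound"])
    # Hypothetical extra column, useful for checking class balance.
    out["label"] = out["polarity"].apply(label_from_compound)
    return out
```

Calling `add_polarity_columns(df)["label"].value_counts()` then gives a quick view of whether the dataset leans towards one polarity.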
● For Support for Russia/Ukraine, SGDClassifier gives better accuracy
● For Support for War (Yes/No), MultinomialNB gives better accuracy
● No. of tweets which want WAR: 499
● No. of tweets which do not want WAR: 506
● No. of tweets which support Russia: 490
● No. of tweets which support Ukraine: 468











