Results of running SentimentAnalyzer() on Amazon Reviews Dataset. #187

@mdvsh

Background

As part of Google Code-in, I used the sentiment analysis model in TextAnalysis.jl to analyse the Amazon reviews dataset. I performed basic text pre-processing to improve the model's metrics. The steps were:

  • stemming words in each review
  • removing corrupted characters
  • removing definite and indefinite articles
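For readers who want to try similar cleaning outside Julia, here is a minimal, library-independent sketch of two of the steps above (dropping corrupted characters and articles). It is not the code used for this run and does not call TextAnalysis.jl. The function name `clean_review` and the definition of "corrupted" (the Unicode replacement character plus non-whitespace control characters) are assumptions made for illustration. Stemming is omitted because it needs a dedicated stemmer.

```python
import re

# Assumption: "corrupted characters" means the Unicode replacement character
# (U+FFFD) and ASCII control characters other than tab, newline and carriage return.
CORRUPT = re.compile(r"[\ufffd\x00-\x08\x0b\x0c\x0e-\x1f]")

# Definite and indefinite articles, matched as whole words in any letter case.
ARTICLES = re.compile(r"\b(?:a|an|the)\b", re.IGNORECASE)


def clean_review(text: str) -> str:
    """Hypothetical helper: strip corrupted characters and articles from one review."""
    text = CORRUPT.sub("", text)
    text = ARTICLES.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


cleaned = clean_review("The battery died after an hour\ufffd")
```

Each review would be passed through such a function before being handed to the model.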

I also found that remove_numbers!() (another pre-processing function mentioned in the docs) throws an error when called. On further inspection, I found that it is still not implemented in src/preprocessing.jl. This is an issue worth looking into.

A BoundsError also occurred partway through the run:

BoundsError: attempt to access 32×5000 Array{Float32,2} at index [Base.Slice(Base.OneTo(32)), 5001]

This didn't affect the rest of the run, and the results I obtained are presented below.

Result

I learnt how precision, recall and F1 score measure different aspects of how well the model performs, and it was a wonderful learning experience.

Precision : 0.583117838593833
Recall : 0.5144996465068449
F1Score : 0.5466638895622987
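As a quick sanity check on the numbers above, F1 is conventionally the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall. It assumes the standard definition is the one used here, and the function name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in this issue.
precision = 0.583117838593833
recall = 0.5144996465068449
reported_f1 = 0.5466638895622987

recomputed_f1 = f1_score(precision, recall)
# True when the reported F1 agrees with the harmonic mean to within 1e-6.
consistent = abs(recomputed_f1 - reported_f1) < 1e-6
```

Because the harmonic mean is pulled towards the smaller of its two inputs, the F1 value sits between recall and precision but closer to recall than a simple average would be.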

Related To:

  1. BoundsError in sentiment analysis #160
  2. Testing the efficiency of Sentiment Analysis models #185
  3. Better sentiment analysis model #84

cc @Ayushk4 @aviks
