Online Reviews, Sentiment Analysis, Polarity Estimation, Ngrams, Multi-Class Classifiers
Trinity College Dublin
In the present era, most of the data on the internet is in the form of raw text. These gold mines of data are invaluable because they contain a great deal of underlying information that can be extracted using natural language processing or text analytics techniques. The data from these text-based documents disclose users’ sentiments and opinions about a particular subject. In this paper, customer reviews from Amazon.com are pre-processed and analyzed using our proposed framework, and we study how well these textual reviews justify their star ratings. Features derived from the textual reviews are used to predict the corresponding star ratings. To accomplish this, the prediction problem is transformed into a multi-class classification task in which each review is assigned to one of five classes corresponding to its star rating. The performance of the various classifiers is evaluated and compared. Evaluation results on a ground-truth data set show that the Logistic Regression classifier outperformed the other models. Our study also reveals that, among the various factors, the polarity and the length of a review showed a significant effect on its rating.
Ankit Taparia, Tanmay Bagla
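The kinds of features named in the abstract (n-grams, review polarity, and review length) can be illustrated with a minimal sketch. This is not the authors' framework: the tiny sentiment lexicon, the tokenizer, and the function names `tokenize`, `ngrams`, and `extract_features` are all assumptions introduced for illustration, and the polarity score here is a simple word-count ratio rather than whatever estimator the paper uses.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, for illustration only; not taken from the paper.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(review):
    """Build unigram/bigram counts, token length, and a crude polarity score."""
    tokens = tokenize(review)
    counts = Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    # Polarity in [-1, 1]: net share of sentiment-bearing words.
    polarity = (pos - neg) / len(tokens) if tokens else 0.0
    return {"ngrams": counts, "length": len(tokens), "polarity": polarity}


features = extract_features("Great phone, great battery")
```

In a full pipeline, feature dictionaries like this one would be vectorized and passed to a multi-class classifier that predicts one of the five star ratings.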