TU Berlin

Service-centric Networking

Shekhawat, D. (2019). Sentiment Analysis for Product Reviews Using Machine Learning. Master Thesis, Technische Universität Berlin.

Master Thesis: Sentiment Analysis for Product Reviews Using Machine Learning

Title:

Sentiment Analysis for Product Reviews Using Machine Learning

Description: As the internet expands and its usage grows rapidly, large amounts of qualitative and quantitative text data have become easily available in the form of product reviews, blogs, articles, etc. Such text carries the semantics, emotions, sentiments, or opinions of its authors. Sentiment analysis allows a machine to extract these sentiments from text automatically. It is the contextual mining of text to extract opinions, sentiments, and emotions, and thereby to determine the writer's sentiment. As an automatic text mining technique, sentiment analysis helps to develop marketing strategies, improve customer service, understand customer demand and behavior, and reduce manual effort. Several methods have been proposed for sentiment classification, but they still lack successful results for cross-domain sentiment analysis.

The goal of the thesis is to explore product review texts from different domains, analyze their sentiment by applying supervised machine learning classifiers, and predict the polarity of each review as either positive or negative. The scope of the thesis is to gather text data from different domains and perform both in-domain and cross-domain sentiment analysis. The thesis comprises a background study, a study of related work, the development of a concept and design, the implementation of the concept, and an evaluation of the developed framework on datasets containing product review texts.

This thesis envisions frameworks that process the texts using natural language processing techniques. Multiple combinations of natural language processing techniques are generated to study the impact of negation and emoticons during text processing. After text preprocessing, text vectorization is performed to generate a machine-readable matrix from the text. Supervised machine learning classifiers, namely Naive Bayes, Logistic Regression, and Linear SVM, are applied to the matrix to perform both in-domain and cross-domain sentiment analysis. The envisioned frameworks achieve a prediction accuracy of 90.04% for in-domain sentiment analysis and 81.85% for cross-domain sentiment analysis.
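
The pipeline described above (preprocessing with negation and emoticon handling, vectorization into word counts, a supervised classifier, and cross-domain evaluation) can be illustrated with a minimal sketch. This is not the thesis implementation: the emoticon table, the negation word list, the NOT_ prefix convention, the hand-written multinomial Naive Bayes, and the toy reviews are all assumptions made for illustration. Cross-domain here means training on reviews from one domain (books) and testing on another (electronics).

```python
import math
import re
from collections import Counter

# Hypothetical lookup tables; the thesis does not list its actual rules.
EMOTICONS = {":)": "EMO_POS", ":-)": "EMO_POS", ":(": "EMO_NEG", ":-(": "EMO_NEG"}
NEGATIONS = {"not", "no", "never"}


def preprocess(text):
    """Lowercase, map emoticons to placeholder tokens, and mark negated words."""
    tokens = []
    negate = False
    for raw in text.lower().split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        # Trailing punctuation ends the scope of a preceding negation.
        ends_scope = raw[-1] in ".,!?;"
        word = re.sub(r"[^a-z']", "", raw)
        if word:
            if word in NEGATIONS or word.endswith("n't"):
                negate = True
                tokens.append(word)
            else:
                tokens.append("NOT_" + word if negate else word)
        if ends_scope:
            negate = False
    return tokens


def train_naive_bayes(docs, labels):
    """Collect per-class word counts (bag of words) and class frequencies."""
    counts = {label: Counter() for label in set(labels)}
    priors = Counter(labels)
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    vocab = {tok for c in counts.values() for tok in c}
    return counts, priors, vocab


def predict(model, tokens):
    """Return the class with the highest log-probability (add-one smoothing)."""
    counts, priors, vocab = model
    total_docs = sum(priors.values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        total = sum(counts[label].values())
        score = math.log(priors[label] / total_docs)
        for tok in tokens:
            if tok in vocab:  # words unseen during training are ignored
                score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted polarity matches the given label."""
    hits = sum(predict(model, preprocess(t)) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Toy data, invented for this sketch: train on books, test on electronics.
books = [("I loved this book :)", "pos"), ("A great and moving story", "pos"),
         ("This book is not good", "neg"), ("Boring and bad plot :(", "neg")]
electronics = [("Great phone, works well :)", "pos"),
               ("The battery is not good", "neg")]

model = train_naive_bayes([preprocess(t) for t, _ in books],
                          [y for _, y in books])
cross_domain_acc = accuracy(model, [t for t, _ in electronics],
                            [y for _, y in electronics])
print(f"cross-domain accuracy on toy data: {cross_domain_acc:.2f}")
```

Marking negated words with a prefix turns "not good" into a feature distinct from "good", which is one common way to let a bag-of-words model see negation. Words that occur only in the target domain are skipped at prediction time, which hints at why cross-domain accuracy tends to be lower than in-domain accuracy.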

Supervisors: Katerina Katsarou, Tanja Deutsch

Type: Master Thesis

Duration: 6 months
