TU Berlin

Master Thesis: Sentiment Analysis for Product Reviews Using Machine Learning


Description: As internet use grows rapidly, large amounts of qualitative and quantitative text data have become readily available in the form of product reviews, blogs, articles, etc. This text carries the semantics, emotions, sentiments, and opinions of its authors. Sentiment analysis allows a machine to extract these sentiments automatically: it is the contextual mining of text to identify opinions, sentiments, and emotions, and thereby to determine the writer's attitude. It is a useful text mining technique for developing a marketing strategy, improving customer service, understanding customer demand and behavior, and reducing manual effort. Several methods have been proposed for sentiment classification, but they still lack successful results for cross-domain sentiment analysis.

The goal of the thesis is to explore product review text from different domains, analyze its sentiment by applying supervised machine learning classifiers, and predict the polarity of each review as either positive or negative. The scope of the thesis is to gather text data from different domains and perform both in-domain and cross-domain sentiment analysis. The thesis comprises a background study, a review of related work, the development of a concept and design, the implementation of the concept, and an evaluation of the developed framework on datasets containing product review text.

This thesis envisioned frameworks that process text using natural language processing techniques. Multiple combinations of these techniques were generated to study the impact of negation and emoticons during text processing. After preprocessing, text vectorization is performed to generate a machine-readable matrix from the text. Supervised machine learning classifiers, namely Naive Bayes, Logistic Regression, and Linear SVM, are applied to the matrix to perform both in-domain and cross-domain sentiment analysis. The envisioned frameworks achieve a prediction accuracy of 90.04% for in-domain sentiment analysis and 81.85% for cross-domain sentiment analysis.
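
The steps described above (preprocessing with negation and emoticon handling, vectorization into word counts, classification, and in-domain versus cross-domain evaluation) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the emoticon lexicon, the negation rule, the function names, and the toy reviews are all assumptions made up for illustration, and only a hand-written multinomial Naive Bayes stands in for the three classifiers named in the text.

```python
import math
import re
from collections import Counter

# Hypothetical emoticon lexicon and negation list (not taken from the thesis).
EMOTICONS = {":)": "emo_pos", ":-)": "emo_pos", ":(": "emo_neg", ":-(": "emo_neg"}
NEGATIONS = {"not", "no", "never"}


def preprocess(text):
    """Lowercase, map emoticons to tokens, and prefix words that follow a negation."""
    tokens, negate = [], False
    for raw in text.lower().split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = re.sub(r"[^a-z']", "", raw)
        if word:
            if word in NEGATIONS or word.endswith("n't"):
                negate = True  # mark the following words as negated
            else:
                tokens.append("not_" + word if negate else word)
        if raw[-1] in ".,;!?":
            negate = False  # assumed rule: negation scope ends at punctuation
    return tokens


def train_nb(reviews):
    """Collect per-class word counts (a bag-of-words model) from (text, label) pairs."""
    labels = [label for _, label in reviews]
    word_counts = {label: Counter() for label in set(labels)}
    for text, label in reviews:
        word_counts[label].update(preprocess(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, Counter(labels), vocab


def predict_nb(model, text):
    """Return the label with the highest log-probability under Laplace smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(word_counts):
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in preprocess(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, reviews):
    """Fraction of reviews whose predicted label matches the given label."""
    return sum(predict_nb(model, text) == label for text, label in reviews) / len(reviews)


# Toy reviews invented for illustration only.
books_train = [
    ("A great story, I loved it :)", "pos"),
    ("Wonderful characters and a great plot", "pos"),
    ("Boring plot, I did not like it :(", "neg"),
    ("Terrible writing, not good", "neg"),
]
books_test = [("I loved the plot", "pos"), ("Boring and terrible", "neg")]
electronics_test = [
    ("Great battery, works well :)", "pos"),
    ("The screen is not good", "neg"),
    ("Battery died after a week", "neg"),
]

model = train_nb(books_train)
in_domain = accuracy(model, books_test)           # train and test on the same domain
cross_domain = accuracy(model, electronics_test)  # train on books, test on electronics
print("in-domain accuracy:", in_domain)
print("cross-domain accuracy:", cross_domain)
```

In this toy setup the last electronics review contains no sentiment word seen in the book reviews, which is the kind of vocabulary mismatch that tends to make cross-domain accuracy lower than in-domain accuracy. The numbers printed here say nothing about the results reported in the thesis.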

Supervisors: Katerina Katsarou, Tanja Deutsch

Type:  Master Thesis

Duration: 6 months

TU Berlin - Service-centric Networking - TEL 19
Ernst-Reuter-Platz 7
10587 Berlin, Germany
Phone: +49 30 8353 58811
Fax: +49 30 8353 58409