Master Thesis: Multi-Domain Sentiment Classification using an LSTM-based Framework with Attention Mechanism

Abstract:

Due to the widespread use of the internet, a large amount of content is created and shared online in the form of reviews, social media posts or blog entries. Access to the internet enables people to express their experiences and attitudes towards purchased products and services as well as current events and topics. The general polarity towards products and services is particularly meaningful for companies, which can use customer responses to adjust their marketing strategy. In addition, market leaders and influencers in various areas can be identified. Sentiment classification, which automatically classifies text as positive or negative, is extremely useful for this purpose.

This thesis focuses on multi-domain sentiment classification, which trains a classifier on multiple domains and then tests it on one of those domains. This setting is realistic because no single domain is assumed to have sufficient labelled data. Multi-task learning, which aims to learn multiple tasks in parallel, is often applied to multi-domain sentiment classification, with each task representing one domain. However, publications in multi-task learning generally learn an entire set of parameters per domain, which means that entire models are trained in parallel. Prevalent multi-task models are therefore extremely inefficient in time and space.

This research proposes an alternative to multi-task models. The framework of this thesis learns representations of the input texts using an autoencoder; these representations are then classified by a classification algorithm. In contrast to multi-task models, only one task is learned during this process. Furthermore, this work investigates the effect of active learning on the framework. Active learning, a technique that strategically queries data points, can improve performance while reducing the number of labelled instances, which makes it especially useful for multi-domain sentiment classification. Without active learning, the framework achieves an average accuracy of 86.74 % on the Amazon data set, which is comparable to recent publications in the field of multi-task learning. The active learning applied in this work could not increase this accuracy. Active learning in multi-domain sentiment classification is particularly challenging and should be investigated further.
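
To illustrate what "strategically querying data points" can mean, the following is a minimal sketch of uncertainty sampling, one common active learning strategy. The abstract does not state which query strategy the thesis uses, so the choice of entropy-based uncertainty sampling, the function names `binary_entropy` and `select_queries`, and the example probabilities are all assumptions made for illustration only.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli distribution with P(positive) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_queries(positive_probs, budget):
    """Return indices of the `budget` unlabelled texts with the highest
    predictive entropy, i.e. those the classifier is least sure about."""
    ranked = sorted(
        range(len(positive_probs)),
        key=lambda i: binary_entropy(positive_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical classifier outputs P(positive) for five unlabelled reviews.
probs = [0.97, 0.52, 0.10, 0.45, 0.80]
print(select_queries(probs, 2))  # prints [1, 3]
```

In such a loop, the selected texts would be sent to an annotator, added to the labelled set, and the classifier retrained before the next query round.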

Description:

Supervisor: Katerina Katsarou

Type: Master Thesis

Duration: 6 months

TU Berlin - Service-centric Networking - TEL 19
Ernst-Reuter-Platz 7
10587 Berlin, Germany
Phone: +49 30 8353 58811
Fax: +49 30 8353 58409