Master Thesis: Natural Language Processing in Digital Citizen Participation
Urban planning measures often have a direct impact on the living environments of the citizens concerned. In recent years, citizen participation has begun to play a role in planning processes, bringing citizens' requirements into them. Citizen participation can be any process that considers the input of citizens in decision-making, and it consists mainly of formal and informal participation. The former is legally required (mandatory), while the latter is conducted on a voluntary basis. Both participation types are often combined and closely linked, complementing one another. Citizen participation in urban planning aims to improve planning outcomes and strengthen the planning profession. Digitalization plays a very important role in citizen participation, since it enables people to express their opinions anytime and anywhere, enhancing transparency and increasing participation. The increasing importance of digital citizen participation has made it necessary to employ advanced data processing techniques, such as Natural Language Processing (NLP), which help to understand and analyze large numbers of textual citizen contributions. This thesis describes the contributions data set provided by DIPAS, the online participation tool in Hamburg. The DIPAS platform enables people in different urban planning processes to contribute to decision-making and express their opinions about the current situation in their part of the city. Each contribution has a title, a topic and an exact geographic location. Different means of facilitating the process of filling out the online participation form have been evaluated. To relieve users of having to decide which part of the content belongs in the contribution's title, two different algorithms were developed to automatically generate a title (headline) for an arbitrary contribution. The generated titles have to provide meaningful information rather than catch the eye.
In general, headline generation is a text summarization task that aims to describe an article or paragraph in a single short sentence. There have been many extractive and abstractive approaches to generating useful headlines. Some of these are frequency-driven, while others are based on encoder-decoder recurrent neural networks or long short-term memory (LSTM) units. To evaluate the performance of these algorithms, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) has been used. ROUGE determines the quality of a summary by counting the number of overlapping words between automatically generated and gold-standard (human-written) summaries. After two algorithms had been developed to generate a headline for each contribution text, a REST API was created for each, intended to be integrated directly into the existing DIPAS platform. The development of the headline generation algorithms and their evaluation take place after cleaning and pre-processing the data, and are followed by the development of the REST API. As a result, the algorithm based on extractive techniques performed better on the contributions data set used, since abstractive approaches require larger data sets to perform well.
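The frequency-driven extractive idea and the word-overlap evaluation described above can be illustrated with a minimal sketch. It is not the thesis implementation: the stop-word list, the function names (`extractive_headline`, `rouge1_recall`) and the sample contribution are assumptions made for illustration, and the metric shown is only the unigram recall variant of ROUGE (ROUGE-1).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to",
              "in", "on", "at", "for", "it", "this", "that", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_headline(text):
    """Return the sentence whose content words are, on average, most frequent in the text."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(t for t in tokenize(text) if t not in STOP_WORDS)

    def score(sentence):
        content = [t for t in tokenize(sentence) if t not in STOP_WORDS]
        return sum(freq[t] for t in content) / len(content) if content else 0.0

    return max(sentences, key=score)


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: share of reference unigrams that also occur in the candidate."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    # Clipped overlap: a reference word counts at most as often as it appears in the candidate.
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / sum(ref.values())


# Hypothetical contribution text and human-written title, for illustration only.
contribution = ("The bike lane on Main Street is too narrow. "
                "Cars park on the bike lane every day. "
                "I like the park nearby.")
headline = extractive_headline(contribution)
recall = rouge1_recall(headline, "Cars parked on bike lane")
```

Here the second sentence is selected because its content words ("park", "bike", "lane") recur most often across the contribution. An extractive method like this needs no training data, which is consistent with the observation that abstractive approaches require larger data sets.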
Supervisors: Bianca Lüders, Boris Lorbeer
Type: Master Thesis
Duration: 6 months