Towards better evaluation methods for Controllable Affective Natural Language Generation
Document Type
Master Thesis
License
CC-BY-NC-ND
Abstract
In this thesis we discuss Natural Language Generation (NLG) and the more specific domain of Affective Natural Language Generation. We introduce the concept of Controllable Affective Natural Language Generation (CA-NLG), a specialized area that focuses on controlling the emotions in generated language output. We also discuss various means of evaluating NLG systems and argue that current research focuses too heavily on intrinsic evaluation and that more extrinsic evaluations should be performed. We elaborate on specific means of measuring emotions in readers of NLG output texts, and on related issues. We discuss various measurement methods and design an experiment to compare a recent proposal, the Geneva Emotion Wheel (GEW), against a well-known, more established method, the Positive and Negative Affect Schedule (PANAS). We compare the two methods’ test-retest reliability and find similar results for the former and a slight bias in favor of PANAS for the latter. However, due to sample size issues, we cannot conclusively say which method objectively performs better.
Keywords
Natural Language Generation, NLG, Affective, Affective Computing, Computational Linguistics, Evaluation, Intrinsic Evaluation, Extrinsic Evaluation, Geneva Emotion Wheel, PANAS