Towards better evaluation methods for Controllable Affective Natural Language Generation

In this thesis we discuss Natural Language Generation (NLG) and the more specific domain of Affective Natural Language Generation. We discuss the concept of Controllable Affective Natural Language Generation (CA-NLG), a specialized area that focuses on controlling the emotions in generated language output. We also discuss various means of evaluating NLG systems and argue that current research focuses too heavily on intrinsic evaluation and that more extrinsic evaluations should be performed. We elaborate on specific means of measuring emotions in readers of NLG output texts, and on related issues. We discuss various measurement methods and design an experiment to compare a recent proposal, the Geneva Emotion Wheel (GEW), against a well-known, more established method, the Positive Affect Negative Affect Scale (PANAS). We compare the two methods’ test-retest reliability and find similar results for the former and a slight bias in favor of PANAS for the latter. However, due to sample size issues, we cannot conclusively say which method objectively performs better.

Keywords

Natural Language Generation, NLG, Affective, Affective Computing, Computational Linguistics, Evaluation, Intrinsic Evaluation, Extrinsic Evaluation, Geneva Emotion Wheel, PANAS

URI

https://studenttheses.uu.nl/handle/20.500.12932/41522
