Experiments with GloVe embeddings and Domain Adversarial Neural Networks on the Dutch medical domain

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

This thesis focuses on developing models and resources for the Dutch medical domain, which lacks annotated data and domain-specific models. In the first part of the thesis, GloVe embeddings (Pennington et al., 2014) are developed. However, evaluating the quality of these embeddings is a challenge, given the lack of annotated resources for medical Dutch. The second part of the thesis presents experiments using a novel domain-adaptation method, Domain Adversarial Neural Networks, which has been attracting attention for domain-adaptation problems in NLP. The network is trained on a Named Entity Recognition task and a Part-of-Speech tagging task, with and without (English) medical embeddings. Its performance and suitability for various domain-adaptation scenarios are evaluated.

Keywords

DANN; adversarial training; neural nets; neural network; GloVe; word embeddings; domain adaptation; medical NLP; low-resource domain; Dutch language; Dutch medical domain; part-of-speech tagging; POS tagging; named entity recognition; NER

Citation