Experiments with GloVe embeddings and Domain Adversarial Neural Networks on the Dutch medical domain
Publication date
Authors
DOI
Document Type
Master Thesis
License
CC-BY-NC-ND
Abstract
This thesis focuses on developing models and resources for the Dutch medical domain, which lacks annotated data and domain-specific models. In the first part of the thesis, GloVe embeddings (Pennington et al., 2014) are developed. Evaluating the quality of these embeddings is challenging, however, given the lack of annotated resources for medical Dutch. The second part of the thesis presents experiments with Domain Adversarial Neural Networks, a novel domain-adaptation method that has been attracting attention for domain-adaptation problems in NLP. The network is trained on a Named Entity Recognition task and a Part-of-Speech tagging task, with and without (English) medical embeddings, and its performance and suitability for various domain-adaptation scenarios are evaluated.
Keywords
DANN; adversarial training; neural nets; neural network; GloVe; word embeddings; domain adaptation; medical NLP; low-resource domain; Dutch language; Dutch medical domain; part-of-speech tagging; POS tagging; named entity recognition; NER