The dual-use of BERT for regulatory compliance

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

Dual-use goods are items that have both commercial and military or proliferation applications. This thesis investigates whether BERT, a contextual language model, can identify dual-use goods from a short description of the good, and whether its performance improves when it is augmented with relational dual-use knowledge. Two methods for augmenting BERT are explored: further pre-training BERT on relevant synthetic sentences from the KELM corpus, and augmenting it with knowledge graph embeddings (KGEs) created from Wikidata. The use of KGEs can improve the performance of a logistic regression model on the dual-use identification task. All implementations of BERT perform well on this task and outperform the logistic regression models. However, none of the BERT implementations augmented with relational dual-use knowledge outperformed the plain implementation of BERT.
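
As a rough illustration of how KGE features might help a logistic regression baseline, the sketch below concatenates a text feature vector with a KGE vector and trains a small classifier on the result. The abstract does not say how the thesis combines the two inputs, so concatenation is an assumption here. The function names (`concat_features`, `train_logreg`), the toy vectors, and the labels are all hypothetical and are not taken from the thesis or its data.

```python
import math


def concat_features(text_vec, kge_vec):
    """Join a text feature vector and a KGE vector into one input vector.

    Assumption: plain concatenation; the thesis may combine them differently.
    """
    return list(text_vec) + list(kge_vec)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that item x is a dual-use good under the current model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with per-sample gradient descent on log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            err = predict_proba(weights, bias, x) - label
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical toy data: items 0/1 and items 2/3 share the same text features,
# so only the (made-up) KGE features distinguish dual-use (1) from other (0).
text_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
kge_feats = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
labels = [1, 0, 1, 0]

X = [concat_features(t, k) for t, k in zip(text_feats, kge_feats)]
weights, bias = train_logreg(X, labels)
preds = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in X]
print(preds)
```

In this toy setup the text features alone cannot separate the two classes, so any gain comes from the appended KGE dimensions. That mirrors the kind of effect the abstract reports for logistic regression, but it says nothing about the size of the effect in the thesis.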

Keywords

dual-use goods; BERT; knowledge graph embeddings
