IDENTIFYING STRUCTURAL VARIATIONS USING OPTIMIZED SV PIPELINE

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

Structural variants (SVs) are genomic alterations spanning at least 50 base pairs. The PrediCT study uses two gene panels to investigate tumour development linked to germline mutations in cancer predisposition genes. The aim of this project is to optimize an SV pipeline and to determine whether PrediCT patients carry clinically significant SVs, focussing on deletions, in genes from these panels. SV callers identify genomic alterations in the DNA and report them in a VCF file. However, no single caller is currently good enough for accurate and comprehensive detection of SVs (Koboldt, 2020; Kuzniar et al., 2020; Kosugi et al., 2019). Hence, the SV callers Manta and Dysgu are combined. Both callers report a property in their VCF file that is useful for validating the accuracy of an event, and both can process CRAM files, which significantly reduces the pipeline's runtime. The pipeline is optimized by setting thresholds for Dysgu's Probability Score and Manta's Quality Score, based on event verification in IGV (Robinson et al., 2011); these thresholds serve as filters. Calls are additionally filtered on SV length, and high-confidence calls made by only a single caller are retained. Events of interest are annotated using AnnotSV, annotation software specialized for SVs, and exon-region filtering is performed. Many identified SVs lack clinical significance because they are common in the population. Combining Dysgu and Manta gave better results than using either caller on its own: with an equivalent number of correctly identified events, fewer false positives were called when combining the callers. In the analysis, the pipeline did not detect any novel clinically significant SVs beyond those that were already established.
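
The filtering logic described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the threshold values, the `Call` record, the `select_calls` function, and the use of reciprocal overlap to decide that both callers report the same event are all assumptions made for illustration. The abstract states only that score thresholds, SV length, and single-caller high-confidence calls are used.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Call:
    """One SV call, already parsed from a caller's VCF (hypothetical record)."""
    chrom: str
    start: int
    end: int
    svtype: str
    score: float  # Dysgu probability score or Manta quality score


# All thresholds below are placeholder assumptions, not values from the thesis.
MIN_SV_LEN = 50            # SVs are defined as at least 50 bp
DYSGU_MIN_PROB = 0.2       # minimum score for a Dysgu call to be considered
DYSGU_HIGH_CONF = 0.8      # score at which a Dysgu-only call is kept
MANTA_MIN_QUAL = 100.0     # minimum score for a Manta call to be considered
MANTA_HIGH_CONF = 500.0    # score at which a Manta-only call is kept
MIN_RECIPROCAL_OVERLAP = 0.5  # assumed rule for "same event"


def length(call: Call) -> int:
    return call.end - call.start


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Fraction of each call covered by the other (the smaller of the two)."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / length(a), shared / length(b))


def select_calls(dysgu: List[Call], manta: List[Call]) -> List[Call]:
    """Keep calls supported by both callers, plus high-confidence single-caller calls."""
    # Step 1: per-caller score threshold and minimum SV length.
    dysgu = [c for c in dysgu
             if length(c) >= MIN_SV_LEN and c.score >= DYSGU_MIN_PROB]
    manta = [c for c in manta
             if length(c) >= MIN_SV_LEN and c.score >= MANTA_MIN_QUAL]

    kept: List[Call] = []
    matched_manta = set()

    # Step 2: a Dysgu call is kept if Manta supports it or if it is high confidence.
    for d in dysgu:
        partners = [i for i, m in enumerate(manta)
                    if reciprocal_overlap(d, m) >= MIN_RECIPROCAL_OVERLAP]
        if partners:
            matched_manta.update(partners)
            kept.append(d)
        elif d.score >= DYSGU_HIGH_CONF:
            kept.append(d)

    # Step 3: Manta-only calls are kept only when they are high confidence.
    for i, m in enumerate(manta):
        if i not in matched_manta and m.score >= MANTA_HIGH_CONF:
            kept.append(m)

    return kept
```

In this sketch an event reported by both callers is emitted once, using the Dysgu record. Annotation and exon-region filtering would follow as separate steps and are not shown.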

Keywords

SV pipeline; structural variants; pipeline

Citation