Title: Identification of barcodes in nanopore sequencing data
Supervisor: doc. Mgr. Tomáš Vinař, PhD
Keywords: DNA sequencing, DNA barcoding, unsupervised learning, sequence alignment, classification, clustering
Abstract: Multiplexing in DNA sequencing is a technique that allows multiple genomes to be sequenced simultaneously. The method is economically advantageous, since one single-use flow cell is used to run several sequencing sessions. The challenge it poses is demultiplexing, i.e. classifying the reads by their barcode sequences. The original approaches, which align already basecalled sequences, typically render up to 15% of the reads unusable because of errors in the basecalling process. The recent approach of Wick et al. [53], implemented in the software Deepbinner, uses a convolutional neural network that operates on the raw sequencing signal; it established a new state of the art, with only ≈ 5.2% of reads left unclassified and an error rate of ≈ 1.6%. We present a novel approach that also operates on raw sequencing signals. Unlike previous attempts at this problem, however, our method is unsupervised, i.e. it requires no prior knowledge of the barcode sequences. Experimental results show that its performance is comparable to that of Deepbinner.

Thesis files:
The author has not consented to the publication of the thesis.

Defense presentation files:

