Detecting operons from RNA-seq data using a convolutional and recurrent neural network architecture

Keywords

operon, convolutional neural network, recurrent neural network, RNA-seq

Degree Level

masters

Degree Name

M. Sc.

Publisher

Memorial University of Newfoundland

Abstract

Operons are a characteristic feature of prokaryotic genomes in which adjacent genes are co-regulated. Identifying which genes belong to the same operon helps in understanding bacterial gene function and regulation, which can in turn support, for instance, drug development and efforts to counter antibiotic resistance. Numerous experimental and computational approaches for operon detection exist; however, many of the computational approaches were developed for a specific target genome or require information that is available for only a limited number of bacterial genomes. Here, we develop a novel general method that treats RNA-seq reads directly as a signal over the nucleotide bases of the genome, making full use of the information in the RNA-seq data. This representation enabled us to employ deep learning techniques without restriction to particular species. The final model (OpDetect) outperforms previous approaches in terms of recall, F1-score, and area under the receiver operating characteristic curve (AUROC). It is also species-agnostic, successfully detecting operons even in Caenorhabditis elegans (C. elegans), one of the few eukaryotic organisms known to have operons.
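
To make the idea of "RNA-seq reads as a signal over nucleotide bases" concrete, the following is a minimal sketch of how a per-base read-depth signal could be computed from aligned reads. It is an illustration only and not the OpDetect implementation: the function name `per_base_coverage` is hypothetical, and representing each aligned read as a 0-based, half-open `(start, end)` interval on a single sequence is an assumption made for this example.

```python
from itertools import accumulate


def per_base_coverage(genome_length, reads):
    """Return a list where entry i is the number of reads covering base i.

    Assumption for this sketch: each read is a (start, end) pair giving a
    0-based, half-open interval on one genome sequence.
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (genome_length + 1)
    for start, end in reads:
        # Clip reads that extend past either end of the sequence.
        start, end = max(start, 0), min(end, genome_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    # The running sum of the difference array is the per-base depth.
    return list(accumulate(diff[:-1]))


# Three hypothetical reads on a 10-base sequence.
coverage = per_base_coverage(10, [(0, 4), (2, 6), (5, 8)])
```

A signal of this kind, taken over a pair of adjacent genes and the region between them, is the sort of per-nucleotide input that convolutional and recurrent layers can consume; how OpDetect actually prepares and normalizes its inputs is described in the thesis itself.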