BacTermFinder: bacteria-agnostic comprehensive terminator finder using a CNN ensemble
Abstract
A terminator is a region of DNA that ends the transcription process. Knowing the locations of bacterial terminators leads to a better understanding of how bacterial transcription works, which might facilitate bioengineering and support bacterial genomic studies. Multiple tools are currently available for predicting bacterial terminators, but most are specialized for certain bacteria or terminator types. In this work, we developed BacTermFinder, a tool that uses deep learning models, specifically an ensemble of Convolutional Neural Networks (CNNs), with four different genomic representations, trained on 46,386 bacterial terminators identified using RNA-seq technologies. Based on our results, BacTermFinder’s average recall is significantly higher than that of the next best approach (0.56 ± 0.19 vs 0.45 ± 0.20) on our diverse test set of five different bacteria, while reducing the number of false positives. Moreover, BacTermFinder’s model identifies both types of terminators (intrinsic and factor-dependent) and even generalizes to archaea. BacTermFinder is publicly available at https://github.com/BioinformaticsLabAtMUN/BacTermFinder.
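To make the terms in the abstract concrete, the following is a minimal sketch, not BacTermFinder's actual code. It assumes one-hot encoding as one possible genomic representation and simple probability averaging as the way ensemble members are combined; the abstract specifies neither. The function names `one_hot`, `ensemble_probability`, and `recall` are hypothetical.

```python
from statistics import mean

# Assumption: one-hot encoding stands in for one of the four genomic
# representations; the abstract does not name them.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element vectors.

    Bases outside A/C/G/T (for example N) are encoded as all zeros.
    """
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in sequence.upper()]


def ensemble_probability(member_probabilities):
    """Combine per-model terminator probabilities by averaging.

    Assumption: soft voting (the mean); the real ensemble may combine
    its CNNs differently.
    """
    return mean(member_probabilities)


def recall(true_positives, false_negatives):
    """Fraction of known terminators that were recovered."""
    return true_positives / (true_positives + false_negatives)
```

Under these assumptions, each CNN would score a sequence window in its own representation, and the averaged probability would be thresholded to call a terminator. Recall would then be computed per bacterium and summarized as a mean and standard deviation, which is one plausible reading of the "0.56 ± 0.19" figure.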
