Analysis of n-Gram based promoter recognition methods and application to whole genome promoter prediction

dc.contributor.author Rani, T. Sobha
dc.contributor.author Bapi, Raju S.
dc.date.accessioned 2022-03-27T05:50:51Z
dc.date.available 2022-03-27T05:50:51Z
dc.date.issued 2009-07-08
dc.description.abstract Promoter prediction is an important and complex problem. Pattern recognition algorithms typically require features that can capture this complexity. Promoter sequences may show a bias towards certain combinations of base pairs. To determine these biases, n-grams are usually extracted and analyzed. An n-gram is a selection of n contiguous characters from a given character stream, in this case DNA sequence segments. Here, a systematic study is made of the efficacy of n-grams for n = 2, 3, 4, 5 in promoter prediction, using n-grams as features for a neural network classifier for E. coli and Drosophila promoters. For E. coli, n = 3 seems to give optimal prediction values, and for Drosophila, n = 4. Using the 3-gram features, promoter prediction is carried out on the genome sequence of E. coli. The results are encouraging in the positive identification of promoters in the genome compared to software packages such as BPROM, NNPP, and SAK. Whole-genome promoter prediction was also performed on the Drosophila genome, using 4-gram features. © 2009 IOS Press. All rights reserved.
dc.identifier.citation In Silico Biology. v.9(1-2)
dc.identifier.issn 13866338
dc.identifier.uri 10.3233/ISB-2009-0388
dc.identifier.uri https://www.medra.org/servlet/aliasResolver?alias=iospress&doi=10.3233/ISB-2009-0388
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8266
dc.subject Binary classification
dc.subject Biological data sets
dc.subject Cascaded classifiers
dc.subject In silico method for promoter prediction
dc.subject Machine learning method
dc.subject Neural networks
dc.title Analysis of n-Gram based promoter recognition methods and application to whole genome promoter prediction
dc.type Journal. Article
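
The abstract defines an n-gram as n contiguous characters of a DNA sequence and uses n-grams as classifier features. A minimal sketch of that kind of feature extraction follows. It is not the authors' implementation: the function name `ngram_features`, the A/C/G/T alphabet, the skipping of windows with ambiguous bases, and the normalization to relative frequencies are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed alphabet; ambiguous bases such as N are ignored


def ngram_features(sequence, n):
    """Return relative frequencies of all 4**n DNA n-grams.

    Hypothetical helper: the vector is ordered lexicographically
    (AA, AC, AG, AT, CA, ... for n = 2).
    """
    sequence = sequence.upper()
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    # Slide a window of width n over the sequence, one base at a time.
    for i in range(len(sequence) - n + 1):
        gram = sequence[i:i + n]
        if gram in counts:  # skip windows containing non-ACGT characters
            counts[gram] += 1
            total += 1
    if total == 0:
        return [0.0] * len(keys)
    return [counts[k] / total for k in keys]


# Usage: a 3-gram vector has 4**3 = 64 entries, a 4-gram vector 4**4 = 256.
features = ngram_features("TTGACAGCTATAATGC", 3)
```

Under these assumptions, the feature vector length grows as 4^n, so the values n = 2, 3, 4, 5 studied in the abstract correspond to 16, 64, 256, and 1024 features per sequence.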