Analysis of E.coli promoter recognition problem in dinucleotide feature space

dc.contributor.author Rani, T. Sobha
dc.contributor.author Bhavani, S. Durga
dc.contributor.author Bapi, Raju S.
dc.date.accessioned 2022-03-27T05:50:52Z
dc.date.available 2022-03-27T05:50:52Z
dc.date.issued 2007-03-01
dc.description.abstract Motivation: Patterns in the promoter sequences within a species are known to be conserved, but there exist many exceptions to this rule, which makes promoter recognition a complex problem. Although many complex feature extraction schemes coupled with several classifiers have been proposed for promoter recognition in the current literature, the problem is still open. Results: A dinucleotide global feature extraction method is proposed in this article for the recognition of sigma-70 promoters in Escherichia coli. The positive data set consists of sigma-70 promoters with known transcription start points, which are part of the RegulonDB and PromEC databases. Four different kinds of negative data sets are considered: two biological sets (Gordon et al., 2003) and two synthetic data sets. Our results reveal that a single-layer perceptron using dinucleotide features is able to achieve an accuracy of 80% against a background of biological non-promoters and 96% for random data sets. A scheme for locating the promoter regions in a given genome sequence is proposed. A deeper analysis of the data set shows that it bifurcates into two distinct classes, a majority class and a minority class. Our results indicate that the majority class, comprising the majority promoter and the majority non-promoter signal, is linearly separable. The minority class is also linearly separable. We further show that the feature extraction and classification methods proposed in the paper are generic enough to be applied to the more complex problem of eucaryotic promoter recognition. We present Drosophila promoter recognition as a case study. © 2007 Oxford University Press.
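The approach summarized in the abstract (dinucleotide features fed to a single-layer perceptron) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the features are the 16 normalized frequencies of overlapping dinucleotides over the whole sequence, and it uses the classic perceptron learning rule. The abstract specifies neither choice. The function names and the two toy sequences are hypothetical and are not taken from RegulonDB or PromEC.

```python
from itertools import product

# The 16 possible dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def dinucleotide_features(seq):
    """Return 16 normalized frequencies of overlapping dinucleotides.

    Assumption: 'global' features means counts over the entire sequence.
    Pairs containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = [0] * len(DINUCLEOTIDES)
    total = 0
    for i in range(len(seq) - 1):
        j = INDEX.get(seq[i:i + 2])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [c / total for c in counts]


def predict(w, b, x):
    """Threshold unit: 1 (promoter) if w.x + b > 0, else 0 (non-promoter)."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if activation > 0 else 0


def train_perceptron(X, y, epochs=100, lr=1.0):
    """Classic perceptron rule; stops early once an epoch has no errors."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, y):
            delta = lr * (target - predict(w, b, x))
            if delta != 0:
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
                errors += 1
        if errors == 0:
            break
    return w, b


# Hypothetical toy strings, chosen only so the two classes are separable.
toy_positive = "ATATATATAT"
toy_negative = "GCGCGCGCGC"
X = [dinucleotide_features(toy_positive), dinucleotide_features(toy_negative)]
y = [1, 0]
w, b = train_perceptron(X, y)
```

Because a single-layer perceptron can only learn a linear decision boundary, it converges to zero training error only when the classes are linearly separable in the feature space, which is the property the abstract reports for the majority and minority classes.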
dc.identifier.citation Bioinformatics. v.23(5)
dc.identifier.issn 13674803
dc.identifier.uri 10.1093/bioinformatics/btl670
dc.identifier.uri https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btl670
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8270
dc.title Analysis of E.coli promoter recognition problem in dinucleotide feature space
dc.type Journal. Article