1. Genomic Identification of Transmembrane β-Barrels (TMBBs)
- Authors
- Thomas C. Freeman and William C. Wimley
- Subjects
- Signal peptide, chemistry.chemical_classification, Biophysics, Bacterial genome size, computer.file_format, Biology, Protein Data Bank, biology.organism_classification, Genome, Transmembrane protein, Amino acid, Biochemistry, chemistry, False positive paradox, computer, Bacteria
- Abstract
Transmembrane beta-barrels (TMBBs) are a special structural class of proteins found predominantly in the outer membranes of Gram-negative bacteria, mitochondria, and chloroplasts. It is estimated that 2-3% of a bacterial genome encodes TMBBs, yet fewer than 40 non-redundant structures have been solved. It would be highly advantageous to have methods that rapidly identify TMBBs in the increasingly available genomic databases. A prediction algorithm proposed by Wimley in 2002 was based on the physicochemical properties of TMBBs of known structure. This method used relative amino acid abundances to predict the positions of beta-strands and beta-hairpins, which are the major structural subunits of TMBBs, and condensed the resulting topology prediction data into a single value called the beta-barrel score. To test the accuracy of this algorithm, we scored proteins from a non-redundant database of protein sequences from the Protein Data Bank (NRPDB). The results revealed that the algorithm's ability to discriminate true TMBBs from other proteins, while strong, could be significantly improved. First, we updated the relative amino acid abundances to include the latest structural information. Second, we altered the beta-strand prediction method to account for the fact that certain amino acids have a higher propensity to sit near the lipid/water interface than in the hydrophobic core of the bilayer. Third, we adjusted the calculation of the beta-barrel score to address the lower beta-hairpin density of larger TMBBs such as BtuB. We then reanalyzed the NRPDB, and the modifications resulted in a 5-fold decrease in the number of false positives, many of which are either non-bacterial proteins or proteins from Gram-positive organisms. We will use this method to analyze the available genomes of Gram-negative bacteria, and the results, along with the signal peptide predictions of SignalP (Bendtsen et al., 2004), will be deposited into a publicly available database.
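The general shape of an abundance-based scoring scheme like the one described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm or their parameters: the abundance tables, the 10-residue window, the threshold, the turn length, and every function name below are hypothetical placeholders chosen only to make the example run. The sketch assumes that residues in a transmembrane beta-strand alternate between facing the lipid and facing the barrel interior, scores each window by summed log relative abundances, counts closely spaced strand pairs as hairpins, and reports hairpins per 100 residues as a stand-in for a beta-barrel score.

```python
import math

# HYPOTHETICAL relative abundances (membrane strand vs. background).
# Placeholder numbers for illustration only, not published values.
OUT_FACING = {"L": 2.0, "V": 1.8, "F": 1.9, "Y": 1.7, "W": 1.8, "I": 1.6, "A": 1.1, "G": 1.0}
IN_FACING = {"G": 1.4, "S": 1.3, "T": 1.3, "N": 1.2, "Q": 1.2, "A": 1.1, "Y": 1.1}
DEFAULT_ABUNDANCE = 0.5  # assumed value for any residue not listed above


def strand_score(window):
    """Sum log abundances over a window, alternating lipid-facing and
    interior-facing tables, and return the better of the two registers."""
    def score(offset):
        total = 0.0
        for i, aa in enumerate(window):
            table = OUT_FACING if (i + offset) % 2 == 0 else IN_FACING
            total += math.log(table.get(aa, DEFAULT_ABUNDANCE))
        return total
    return max(score(0), score(1))


def predict_strands(seq, width=10, threshold=2.0):
    """Greedy left-to-right scan: accept a window as a strand when its score
    reaches the threshold, then jump past it so strands never overlap."""
    starts = []
    i = 0
    while i + width <= len(seq):
        if strand_score(seq[i:i + width]) >= threshold:
            starts.append(i)
            i += width
        else:
            i += 1
    return starts


def count_hairpins(starts, width=10, max_turn=5):
    """Count consecutive predicted strands separated by at most max_turn residues."""
    return sum(1 for a, b in zip(starts, starts[1:]) if b - (a + width) <= max_turn)


def barrel_score(seq, width=10, threshold=2.0, max_turn=5):
    """Predicted hairpins per 100 residues (a simplified stand-in score)."""
    if not seq:
        return 0.0
    starts = predict_strands(seq, width, threshold)
    return 100.0 * count_hairpins(starts, width, max_turn) / len(seq)


if __name__ == "__main__":
    barrel_like = ("LSVTFNYGLS" + "DG") * 4   # synthetic alternating pattern
    soluble_like = "DEKRDEKR" * 6             # synthetic charged repeat
    print(barrel_score(barrel_like), barrel_score(soluble_like))
```

Normalizing the hairpin count by sequence length is one simple way to compare proteins of different sizes, and it also shows why a large barrel with long loops or an extra domain, such as BtuB, could score lower under a density-style measure. How the authors actually corrected for this is not stated in the abstract.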
- Published
- 2009