
A method for calculating codon frequencies in DNA

Authors :
Narendra S. Goel
Lucy Lam King
Hans J. Bremermann
Martynas Yčas
Gita Subba Rao
Source :
Journal of Theoretical Biology. 35:399-457
Publication Year :
1972
Publisher :
Elsevier BV, 1972.

Abstract

Since the biological code is degenerate, the frequencies of occurrence of codons are not uniquely determined by the amino acid frequencies in proteins. A method of calculating these codon frequencies is presented here. It is assumed that the anti-sense strand of DNA consists of a sequence of codons randomly juxtapositioned, as is indicated by the statistically random sequence of amino acids in proteins. Using experimental data on average frequencies of amino acids in proteins, and on nearest-neighbor frequencies and frequencies of pyrimidine runs flanked by purines in DNA, it is possible to write equations expressing constraints on the codon frequencies. Using the existing data, the frequencies of 20 codons and 24 linear combinations of codons can be calculated. This is done using a newly developed optimization procedure. The type of data that would be needed to calculate frequencies of all 64 codons is described. The method has been tested on hypothetical DNA molecules. The solutions are rather insensitive to errors in the estimates of amino acid frequencies but are very sensitive to errors in the estimate of doublet frequencies and pyrimidine runs. Since the equations are of degree up to five, the solutions are not unique. It is found, however, that “acceptable” solutions (those not containing negative values for the codon frequencies) do not seem to be numerous. Since most of the degeneracy of the genetic code is at the third place, for a given set of amino acid frequencies certain percentages of A, C, G and T are required for the first two places. The initial guess used for the iterative procedure is obtained by assuming that the percentage of nucleotides left for the third place is distributed randomly. A random perturbation of the initial guess of up to 10% yields the same solution. The method has been used to calculate codon frequencies in the DNA of three vertebrates, one plant (yeast) and seven bacteria. It is found that, in general, all the codons for a given amino acid are not used equally, and the relative frequencies vary with the species.
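
The two kinds of constraint named in the abstract can be illustrated with a minimal sketch. It is not the authors' optimization procedure, and the abstract does not give their equations. It assumes the standard genetic code, a single strand read as randomly juxtaposed codons, and hypothetical function names (`amino_acid_frequencies`, `doublet_frequencies`). Under those assumptions, amino acid frequencies are linear sums over synonymous codons, while nearest-neighbor (doublet) frequencies pick up a quadratic term from the junction between adjacent codons, which is one reason the full system is nonlinear.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))


def amino_acid_frequencies(codon_freq):
    """Linear constraint: each amino acid frequency is the sum over its codons."""
    totals = {}
    for codon, f in codon_freq.items():
        aa = GENETIC_CODE[codon]
        totals[aa] = totals.get(aa, 0.0) + f
    return totals


def doublet_frequencies(codon_freq):
    """Doublet frequencies on one strand of randomly juxtaposed codons.

    Assumption of this sketch: a doublet falls with equal weight (1/3) on
    codon positions 1-2, positions 2-3, or the junction between the third
    base of one codon and the first base of the next. The junction term is
    a product of two marginal frequencies, hence quadratic in codon_freq.
    """
    first = dict.fromkeys(BASES, 0.0)   # marginal frequency of each first base
    third = dict.fromkeys(BASES, 0.0)   # marginal frequency of each third base
    doublets = {x + y: 0.0 for x in BASES for y in BASES}
    for codon, f in codon_freq.items():
        first[codon[0]] += f
        third[codon[2]] += f
        doublets[codon[:2]] += f / 3    # positions 1-2 within the codon
        doublets[codon[1:]] += f / 3    # positions 2-3 within the codon
    for x in BASES:
        for y in BASES:
            doublets[x + y] += third[x] * first[y] / 3  # junction between codons
    return doublets


# Hypothetical input: all 64 codons equally frequent (stop codons included).
uniform = {codon: 1 / 64 for codon in CODONS}
aa_freq = amino_acid_frequencies(uniform)
nn_freq = doublet_frequencies(uniform)
```

With real data the direction is reversed: the amino acid and doublet frequencies are measured, and the codon frequencies are the unknowns to be solved for. Measured nearest-neighbor data also come from double-stranded DNA, which this single-strand sketch ignores.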

Details

ISSN :
00225193
Volume :
35
Database :
OpenAIRE
Journal :
Journal of Theoretical Biology
Accession number :
edsair.doi.dedup.....f05fcca9f85386fce889f3f784c15cea