1. Complex high-resolution linkage disequilibrium and haplotype patterns of single-nucleotide polymorphisms in 2.5 Mb of sequence on human chromosome 21.
- Author
- Olivier M, Bustos VI, Levy MR, Smick GA, Moreno I, Bushard JM, Almendras AA, Sheppard K, Zierten DL, Aggarwal A, Carlson CS, Foster BD, Vo N, Kelly L, Liu X, and Cox DR
- Subjects
- Animals; Cricetinae; DNA chemistry; Genetic Variation; Genome, Human; Humans; Hybrid Cells; Microsatellite Repeats; Sequence Analysis, DNA; Sequence Tagged Sites; Chromosomes, Human, Pair 21 genetics; DNA genetics; Haplotypes genetics; Linkage Disequilibrium; Polymorphism, Single Nucleotide genetics
- Abstract
One approach to identify potentially important segments of the human genome is to search for DNA regions with nonrandom patterns of human sequence variation. Previous studies have investigated these patterns primarily in and around candidate gene regions. Here, we determined patterns of DNA sequence variation in 2.5 Mb of finished sequence from five regions on human chromosome 21. By sequencing 13 individual chromosomes, we identified 1460 single-nucleotide polymorphisms (SNPs) and obtained unambiguous haplotypes for all chromosomes. For all five chromosomal regions, we observed segments with high linkage disequilibrium (LD), extending from 1.7 to >81 kb (average 21.7 kb), disrupted by segments of similar or larger size with no significant LD between SNPs. At least 25% of the contig sequences consisted of segments with high LD between SNPs. Each of these segments was characterized by a restricted number of observed haplotypes, with the major haplotype found in over 60% of all chromosomes. In contrast, the interspersed segments with low LD showed significantly more haplotype patterns. The position and extent of the segments of high LD with restricted haplotype variability did not coincide with the location of coding sequences. Our results indicate that LD and haplotype patterns need to be investigated with closely spaced SNPs throughout the human genome, independent of the location of coding sequences, to reliably identify regions with significant LD useful for disease association studies.
- Published
- 2001