A measure of ambiguity in SNPs in presence of linkage disequilibrium. S.E. Hodge, J. Hoh. Columbia Univ, New York, NY.
The extent of haplotype ambiguity in a string of single nucleotide polymorphisms (SNPs) was quantified by Hodge et al. (Nat Genet 21: 360, 1999). In their measure, ambiguity increases with the number of loci and as loci become more polymorphic. That work assumed linkage equilibrium (LE). However, linkage disequilibrium (LD) provides additional information about the haplotypes at a site, thereby reducing the level of ambiguity. The ambiguity vanishes altogether when LD reaches its maximum value. Here, we extend the ambiguity measure (f) to allow for LD between each successive pair of SNPs. We derive the formula f = 4yz, where x, y, z, and w are the frequencies of the ++, +-, -+, and -- haplotypes, respectively, and w.l.o.g. xw > yz. Alternatively, f can be expressed in terms of the allele frequencies and the LD parameter d. We also extend the formula to triads of two parents plus one child. Genome-wide LD studies that map common disease genes rely on a dense map of SNPs to detect association between a marker and disease. Measuring ambiguity can therefore help investigators design a more efficient map, one that minimizes ambiguity and the resulting information loss. We calculate our measure for relevant SNPs in the published LPL dataset (Clark et al., AJHG 63: 595, 1998; Nickerson et al., Nat Genet 19: 233, 1998), obtaining values that range from 0 to 11% in that dataset.
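As an illustration of the pairwise formula f = 4yz, the following minimal Python sketch computes f either from the four haplotype frequencies or from allele frequencies and d. It rests on two assumptions that the abstract does not spell out. First, the "w.l.o.g. xw > yz" convention is read as f = 4 min(xw, yz), so that the labels need not be swapped by hand. Second, d is taken to be the usual pairwise disequilibrium coefficient, d = xw - yz, so that x = pq + d, y = p(1-q) - d, z = (1-p)q - d, and w = (1-p)(1-q) + d, where p and q are the frequencies of the + allele at the first and second SNP. The function names are hypothetical and are not taken from the authors' work.

```python
import math


def ambiguity_from_haplotypes(x, y, z, w):
    """Pairwise ambiguity f from haplotype frequencies.

    x, y, z, w are the frequencies of the ++, +-, -+ and -- haplotypes.
    """
    freqs = (x, y, z, w)
    # Allow for tiny floating-point error, but reject clearly invalid input.
    if any(v < -1e-12 for v in freqs):
        raise ValueError("haplotype frequencies must be non-negative")
    if not math.isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("haplotype frequencies must sum to 1")
    x, y, z, w = (max(v, 0.0) for v in freqs)
    # Assumption: "w.l.o.g. xw > yz" is read as taking the smaller product,
    # which equals 4yz whenever xw > yz.
    return 4.0 * min(x * w, y * z)


def ambiguity_from_alleles(p, q, d):
    """Pairwise ambiguity f from allele frequencies p, q and LD parameter d.

    Assumption: d is the standard disequilibrium coefficient, d = xw - yz.
    """
    x = p * q + d
    y = p * (1.0 - q) - d
    z = (1.0 - p) * q - d
    w = (1.0 - p) * (1.0 - q) + d
    return ambiguity_from_haplotypes(x, y, z, w)


if __name__ == "__main__":
    # Linkage equilibrium (d = 0): f reduces to 4 p (1-p) q (1-q).
    print(ambiguity_from_alleles(0.5, 0.5, 0.0))
    # Maximum LD for these allele frequencies: the ambiguity vanishes.
    print(ambiguity_from_alleles(0.5, 0.5, 0.25))
```

Under these assumptions, f at d = 0 equals the product of the two single-locus heterozygosities, and f falls to zero when d reaches its maximum, matching the behaviour described in the text.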