Analytic Solution of Detection Rates when Genotype Errors Introduced into Family at Single Nucleotide Polymorphism Locus. D. Gordon, J. Ott. Lab Statistical Genetics, Rockefeller Univ, New York, NY.
Errors in human linkage data, such as misreads by automated machines (allelic errors) or sample swaps (genotype errors), can increase type I error rates and reduce power in the transmission disequilibrium test (TDT). The goal of our analysis is to determine analytically the rate at which genotype errors are detected through Mendelian inconsistency in trios (parent 1, parent 2, child) that are genotyped at a single nucleotide polymorphism (SNP) locus in Hardy-Weinberg equilibrium (HWE).
In our error model, it is assumed that each of the three correct genotypes in a trio is replaced by a different genotype with constant probability e. Further, it is assumed that errors occur randomly and independently. Hence, the probability that exactly one, two, or three errors occur in a trio, conditional on at least one error occurring in the trio, is computed using the binomial distribution. It is also assumed that the trios arise from a population in HWE at the SNP locus. Using standard probability theory, the probability that errors are detected, or detection rate, for a randomly selected trio from the population may be calculated from the genotype error rate e and the frequency p of one of the two alleles at the SNP locus.
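As a worked form of the binomial step, the conditional probability of exactly k errors in a trio, and the way it may enter the detection rate, can be written as below. The symbols k, D(e, p), and D_k(p) are notation introduced here for illustration and do not appear in the original text; D_k(p) stands for the probability that a Mendelian inconsistency is observed given exactly k errors.

```latex
\[
P(k \mid k \ge 1) \;=\; \frac{\binom{3}{k}\, e^{k} (1-e)^{3-k}}{1-(1-e)^{3}},
\qquad k = 1, 2, 3,
\]
\[
D(e,p) \;=\; \sum_{k=1}^{3} P(k \mid k \ge 1)\, D_k(p).
\]
```

The second line is the law of total probability applied over the number of errors. Under the stated independence assumption, D_k(p) depends on the allele frequency p but not on e, so e affects the detection rate D(e, p) only through the weights P(k | k >= 1).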
We compute this detection rate for various values of e and p. It can be shown that the theoretical maximum detection rate is 75%, but this detection rate is never achieved in practice. The values of e considered are 0.001, 0.005, 0.01, 0.02, 0.05, 0.1, and 0.2. The values of p considered are 0.01, 0.05, 0.1, 0.25, and 0.5. The maximum detection rate for these values is 66%, when e is 0.001 and p is 0.01. The minimum detection rate is 42%, when e is 0.001 and p is 0.5. For any fixed setting of e, the minimum detection rate occurs when allele frequencies are equal, i.e., p = 0.5. Also, the detection rate appears to be much more sensitive to changes in the allele frequency p than to changes in the error rate e.
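The calculation described above can be sketched by direct enumeration of true and observed trios. This is a minimal illustration rather than the authors' derivation. It assumes that an erroneous genotype is equally likely to be either of the two other genotypes, which the abstract does not specify. The function names (genotype_freqs, child_probs, is_consistent, detection_rate) are hypothetical.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of copies of the allele with frequency p


def genotype_freqs(p):
    """Hardy-Weinberg genotype frequencies for allele frequency p."""
    q = 1.0 - p
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}


def child_probs(g1, g2):
    """Mendelian probabilities of the child genotype given both parents."""
    t1, t2 = g1 / 2.0, g2 / 2.0  # chance each parent transmits the allele
    return {
        0: (1.0 - t1) * (1.0 - t2),
        1: t1 * (1.0 - t2) + (1.0 - t1) * t2,
        2: t1 * t2,
    }


def is_consistent(g1, g2, child):
    """True if the trio is compatible with Mendelian inheritance."""
    return child_probs(g1, g2)[child] > 0.0


def detection_rate(e, p):
    """P(Mendelian inconsistency | at least one genotype error in the trio).

    Assumption (not from the abstract): a genotype in error is replaced by
    either of the two other genotypes with probability e / 2 each.
    """
    freqs = genotype_freqs(p)
    detected = 0.0
    for true in product(GENOTYPES, repeat=3):
        g1, g2, child = true
        p_true = freqs[g1] * freqs[g2] * child_probs(g1, g2)[child]
        if p_true == 0.0:
            continue  # true trios are always Mendelian-consistent
        for observed in product(GENOTYPES, repeat=3):
            if is_consistent(*observed):
                continue  # errors here, if any, go undetected
            p_obs = 1.0
            for t, o in zip(true, observed):
                p_obs *= (1.0 - e) if t == o else e / 2.0
            detected += p_true * p_obs
    # An inconsistent observed trio implies at least one error, so dividing
    # by P(at least one error) gives the conditional detection rate.
    return detected / (1.0 - (1.0 - e) ** 3)


if __name__ == "__main__":
    for e in (0.001, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2):
        for p in (0.01, 0.05, 0.1, 0.25, 0.5):
            print(f"e={e:<6} p={p:<5} rate={detection_rate(e, p):.3f}")
```

Because only 27 true and 27 observed trios exist at a biallelic locus, full enumeration is cheap and avoids handling the one-, two-, and three-error cases separately. Under the equal-replacement assumption, the sketch gives values close to the reported 66% and 42% at e = 0.001.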