Trouble with Tags

To reduce the number of markers that must be genotyped in an association study, considerable effort goes into identifying and discarding markers that are in strong linkage disequilibrium (LD) with other markers. The rationale is that two markers in strong LD should produce highly positively correlated association results. However, it is not clear that the strength of LD between two markers reliably predicts how closely their trait-association test results agree. We examined this and found that when two markers are in LD with each other and with an unobserved causal site, their trait-association test results are often negatively correlated. In these cases, one marker indicates an association with the trait and the other does not, and which marker gives which result is random. The graph below shows this for a large number of marker pairs, with LD between the markers on the horizontal axis and the correlation between their trait-association results on the vertical axis.

Each point represents simulation results for two markers that are in LD with each other and with a causal site. LD patterns are taken from real data. Markers can show LD (r²) as high as 0.8 and yet regularly disagree about being associated with the trait (correlation < -0.7). Which marker gives the positive result and which does not is random.
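
As a rough illustration of the kind of simulation summarized above, the Python sketch below computes r² between two markers from haplotype frequencies and then estimates the correlation between their association statistics across replicate case-control samples. It is not the procedure used in the referenced paper: the haplotype frequencies, the relative risk, the sample sizes, the use of an allelic chi-square test, and all function names are assumptions made for illustration.

```python
import random

# Hypothetical haplotype frequencies over (marker 1, marker 2, causal site),
# with alleles coded 0/1. Illustrative values only, not taken from the paper.
HAP_FREQS = {
    (1, 1, 1): 0.30,
    (1, 1, 0): 0.10,
    (1, 0, 0): 0.05,
    (0, 1, 1): 0.05,
    (0, 0, 0): 0.50,
}


def ld_r2(freqs, i, j):
    """r^2 between sites i and j: D^2 / (p_i (1 - p_i) p_j (1 - p_j))."""
    p_i = sum(f for h, f in freqs.items() if h[i] == 1)
    p_j = sum(f for h, f in freqs.items() if h[j] == 1)
    p_ij = sum(f for h, f in freqs.items() if h[i] == 1 and h[j] == 1)
    d = p_ij - p_i * p_j
    return d * d / (p_i * (1 - p_i) * p_j * (1 - p_j))


def case_freqs(freqs, relative_risk):
    """Assumed disease model: haplotypes carrying the causal allele are
    enriched in cases by a multiplicative relative risk."""
    weighted = {h: f * (relative_risk if h[2] == 1 else 1.0)
                for h, f in freqs.items()}
    total = sum(weighted.values())
    return {h: w / total for h, w in weighted.items()}


def sample_allele_counts(freqs, n_haps, rng):
    """Draw n_haps haplotypes and count the 1-alleles at each of the 3 sites."""
    haps = list(freqs)
    draws = rng.choices(haps, weights=[freqs[h] for h in haps], k=n_haps)
    return [sum(h[site] for h in draws) for site in range(3)]


def allelic_chi2(case_alt, ctrl_alt, n_haps):
    """Pearson chi-square for a 2x2 allele-count table (cases vs. controls)."""
    a, b = case_alt, n_haps - case_alt
    c, d = ctrl_alt, n_haps - ctrl_alt
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # marker is monomorphic in this sample
        return 0.0
    return (2 * n_haps) * (a * d - b * c) ** 2 / denom


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def simulate_correlation(freqs, relative_risk=1.5, n_haps=400,
                         n_reps=200, seed=1):
    """Correlation, across replicate studies, between the chi-square
    statistics of marker 1 and marker 2 (the causal site is not tested)."""
    rng = random.Random(seed)
    cases = case_freqs(freqs, relative_risk)
    stats1, stats2 = [], []
    for _ in range(n_reps):
        case_counts = sample_allele_counts(cases, n_haps, rng)
        # Controls are drawn from the population frequencies (an approximation).
        ctrl_counts = sample_allele_counts(freqs, n_haps, rng)
        stats1.append(allelic_chi2(case_counts[0], ctrl_counts[0], n_haps))
        stats2.append(allelic_chi2(case_counts[1], ctrl_counts[1], n_haps))
    return pearson(stats1, stats2)


if __name__ == "__main__":
    print("r^2 between markers:", round(ld_r2(HAP_FREQS, 0, 1), 3))
    print("correlation of test statistics:", simulate_correlation(HAP_FREQS))
```

The correlation this sketch produces depends heavily on the assumed haplotype frequencies and disease model, so it should not be read as reproducing the published results.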


Reference:

Nielsen DM, Suchindran S, Smith CP. (2008) Does strong linkage disequilibrium guarantee redundant association results? Genet Epidemiol. 32(6):546-52.