Outliers Analysis in Human Case-Control Association Mapping
Yuanyuan Shen, Zhe Liu, Jurg Ott -- March 2011
Abstract
Background/Aims: In human case-control association studies, population heterogeneity is
often present and can lead to increased false-positive results. Various
methods have been proposed and are in current use to remedy this
situation.
Methods: We assume that heterogeneity is due to a
relatively small number of individuals whose allele frequencies differ
from those of the remainder of the sample. For this situation, we
propose a new method of handling heterogeneity by removing outliers in
a controlled manner. In a coordinate system of the c largest principal
components in multidimensional scaling (MDS), we remove the most extreme
outlying individuals one at a time and, after each removal, recompute the
largest association test statistic. The smallest p value obtained within
M removals serves as our test statistic, whose significance level is
assessed in randomization samples.
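The procedure described under Methods can be sketched roughly as follows. This is a minimal illustration and not the MDSOutlier program itself. Several choices are assumptions that the abstract does not specify: genotypes are coded 0/1/2, MDS is classical MDS on Euclidean distances, the "most extreme" individual is the one farthest from the centroid of those remaining, the association test is a 1-df allelic chi-square, the zero-removal p value is included in the minimum, and the randomization samples are formed by permuting case-control labels. All function names are hypothetical.

```python
import math
import numpy as np

def classical_mds(G, c):
    """Top-c classical MDS coordinates (assumed: Euclidean distances)."""
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    sq = np.sum(G ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * G @ G.T   # squared distances
    J = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    B = -0.5 * J @ D2 @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:c]                 # c largest eigenvalues
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

def removal_order(coords, M):
    """Pick M individuals one at a time, each farthest from the
    centroid of those still remaining (assumed outlier criterion)."""
    keep = np.ones(len(coords), dtype=bool)
    order = []
    for _ in range(M):
        centre = coords[keep].mean(axis=0)
        dist = np.linalg.norm(coords - centre, axis=1)
        dist[~keep] = -np.inf                        # ignore removed ones
        worst = int(np.argmax(dist))
        keep[worst] = False
        order.append(worst)
    return order

def min_allelic_p(G, y):
    """Smallest p value over SNPs (largest 1-df allelic chi-square)."""
    G = np.asarray(G, dtype=float)
    case, ctrl = G[y == 1], G[y == 0]
    a = case.sum(axis=0); b = 2 * len(case) - a      # case allele counts
    c_ = ctrl.sum(axis=0); d = 2 * len(ctrl) - c_    # control allele counts
    n = a + b + c_ + d
    denom = (a + b) * (c_ + d) * (a + c_) * (b + d)
    with np.errstate(divide="ignore", invalid="ignore"):
        chi2 = np.where(denom > 0, n * (a * d - b * c_) ** 2 / denom, 0.0)
    return math.erfc(math.sqrt(chi2.max() / 2.0))    # chi-square(1) tail

def outlier_statistic(G, y, order):
    """Smallest p value seen over 0..M removals (0 included by assumption)."""
    keep = np.ones(len(y), dtype=bool)
    best = min_allelic_p(G, y)
    for i in order:
        keep[i] = False
        best = min(best, min_allelic_p(G[keep], y[keep]))
    return best

def mds_outlier_test(G, y, c=2, M=5, n_perm=200, seed=0):
    """Observed statistic and its label-permutation significance level."""
    rng = np.random.default_rng(seed)
    order = removal_order(classical_mds(G, c), M)    # depends on genotypes only
    observed = outlier_statistic(G, y, order)
    null = [outlier_statistic(G, rng.permutation(y), order)
            for _ in range(n_perm)]
    p = (1 + sum(s <= observed for s in null)) / (n_perm + 1)
    return observed, p
```

Because the MDS coordinates in this sketch depend only on genotypes, the removal order is computed once and reused for every permuted label vector. Calling `mds_outlier_test(G, y, c=2, M=5)` with an individuals-by-SNPs array `G` and a 0/1 NumPy array `y` returns the minimum p value and its permutation-based significance level.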
Results: In power simulations of our method and three methods in current use,
averaged over several different scenarios, the best method turned out
to be logistic regression analysis (based on all individuals) with MDS
components as covariates.
Conclusion: Our proposed method
ranked closely behind logistic regression analysis with MDS components
but ahead of other commonly used approaches. In analyses of real
datasets, our method performed best.
Copyright © 2010 S. Karger AG, Basel
Program
The MDSOutlier computer program is currently available for download only for Unix/Linux and Mac OS X systems. Only the latest version is required.
References
Shen Y, Liu Z, Ott J: Systematic removal of outliers to reduce
heterogeneity in case-control association studies. Hum Hered 2010;70:227-231 (available as a free download)