These R programs accompany:

    Shibata K, Diatchenko L, Zaykin DV. Haplotype associations with
    quantitative traits in the presence of complex multilocus and
    heterogeneous effects. Genet Epidemiol. 2009; 33:63-78.

Written by Dmitri Zaykin

Sample data file: dat.csv. The scripts require a comma-separated format.
SNPs need to be coded as 0,1,2.

There are separate scripts for additive (i.e. allelic), dominant,
recessive, and genotypic models.

Sample usage
------------

Source the scripts first:

    source("Add2.r")
    source("Genot.r")
    source("Dom.r")
    source("Rec.r")

Run the analysis under each of the four models:

    VarAdd("dat.csv", "/dev/stdout", "Response", "M2", 10000)
    VarGenot("dat.csv", "/dev/stdout", "Response", "M2", 10000)
    VarDom("dat.csv", "/dev/stdout", "Response", "M2", 10000)
    VarRec("dat.csv", "/dev/stdout", "Response", "M2", 10000)

Here, "/dev/stdout" means to write to the screen (standard output).
Alternatively, a file name can be given (in double quotes, as in
"MyOut.txt").

"Response" is the name of the column in dat.csv that contains the
response variable.

"M2" is the name of one of the SNP columns in dat.csv.

10000 is the number of permutations. Asymptotic p-values are also
computed.

Missing data
------------

If there are missing data, they need to be imputed first. The package
"mice" needs to be installed:

    install.packages("mice")

Imputed values are averages over multiple imputations. This method takes
into account LD between SNPs.

ImpGeno.r is a sample script that converts a file with missing data
("dat-missing.csv") to a complete data file ("dat-imputed.csv"). To use
this script with your files, edit these two file names.

The script assumes that there is a single response column, and that it
is the first one. Imputation ignores that column:

    md <- mice(d[,-1], m=mi)

and the script then puts it back before writing to the file:

    imputed.data <- cbind(d[,1], gni)
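
Checking an input file
----------------------

Before running the scripts, it may help to confirm that a data file follows the format described above: a response column, SNPs coded as 0,1,2, and no missing values unless imputation is planned. The sketch below is not part of the distributed scripts. The function name check_input, the toy data, and the assumption that every column other than the response is a SNP are all made up for illustration.

```r
# Hypothetical helper, not part of the distributed scripts.
# Assumes every column other than the response is a SNP column.
check_input <- function(file, response) {
  d <- read.csv(file)
  if (!response %in% names(d)) {
    stop("response column not found: ", response)
  }
  snps <- setdiff(names(d), response)

  # Number of missing values in each column (response included)
  n.missing <- colSums(is.na(d))

  # A SNP column passes if every non-missing value is 0, 1, or 2
  coding.ok <- vapply(d[snps],
                      function(x) all(x[!is.na(x)] %in% c(0, 1, 2)),
                      logical(1))

  list(n.missing = n.missing, coding.ok = coding.ok)
}

# Small made-up data set, written to a temporary file for illustration.
# M2 has one missing genotype; M3 has an invalid code (3).
toy <- data.frame(Response = c(1.2, 0.7, 2.1, 1.5),
                  M1 = c(0, 1, 2, 1),
                  M2 = c(2, NA, 0, 1),
                  M3 = c(0, 3, 1, 2))
toy.file <- tempfile(fileext = ".csv")
write.csv(toy, toy.file, row.names = FALSE)

res <- check_input(toy.file, "Response")
print(res$n.missing)   # missing-value count per column
print(res$coding.ok)   # TRUE where a SNP column uses only 0,1,2
```

A column with a non-zero missing count suggests imputing first, as described under "Missing data". A column flagged FALSE needs recoding before any of the model scripts are run.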