Simulating Missing Data

You can also use Prune to simulate missing data. Set the amount of missing marker data you would like to simulate with the -M option. This value is a percentage, and it should be specified before you invoke the bootstrap option (-b), which actually does the simulation. Use a value of 3 with -b to tell Prune to randomly set some of the markers to missing. Over the entire data set, approximately the percentage of markers specified with the -M option will be set to -10. The results will be in a file with the filename extension ``.crb''.

Similarly, some of the markers can be made dominant by using a value of 4 with the bootstrap option. The percentage of markers transformed is set with the -M option. The direction of dominance is random: half of the changed markers will have one allele made dominant, while the other half will have the other allele made dominant.
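The effect of the missing-data simulation can be illustrated with a short Python sketch. This is not Prune's code: it assumes each marker is masked independently with probability equal to the -M percentage, which is one simple way to get "approximately" that percentage missing over the whole data set. The function name simulate_missing and the list-of-lists layout are hypothetical; only the value -10 comes from the text above.

```python
import random

MISSING = -10  # value the text says is written for a marker set to missing


def simulate_missing(genotypes, percent, seed=None):
    """Return a copy of genotypes with roughly `percent` % of markers masked.

    Assumption (not taken from Prune): every marker score is masked
    independently with probability percent / 100.
    """
    rng = random.Random(seed)
    p = percent / 100.0
    return [[MISSING if rng.random() < p else g for g in row]
            for row in genotypes]


# Hypothetical data set: 200 individuals, 50 markers, every score equal to 1.
data = [[1] * 50 for _ in range(200)]
masked = simulate_missing(data, 10, seed=1)

total = sum(len(row) for row in masked)
frac = sum(g == MISSING for row in masked for g in row) / total
print(f"fraction of markers set to missing: {frac:.3f}")
```

Because the masking is random, the observed fraction only approximates the requested percentage, which matches the "approximately" in the description above.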

If you use a value of 5 with -b and specify a percentage for -M, you can investigate how selective genotyping compares to having typed all the individuals. As an example, suppose you have a data set typed for 500 individuals and use -M 20 and -b 5. The individuals are ordered with respect to the trait of interest, and the 20% to be retained is split evenly between the two tails: those whose trait values are in the lowest 10% (50 individuals) are retained along with those in the highest 10% (another 50). Those in the 10 to 90 percent range (the remaining 400) are deleted. A new data set with the ``.crb'' filename extension will contain the results.
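The selection rule in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Prune's implementation: the function name select_extremes is hypothetical, the retained percentage is assumed to be split evenly between the two tails, and the treatment of ties and rounding is a guess, since the text does not say how Prune handles them.

```python
def select_extremes(trait_values, percent):
    """Return indices of individuals kept under selective genotyping.

    Assumption: `percent` % of individuals are kept in total, half from
    the low tail of the trait distribution and half from the high tail.
    """
    n = len(trait_values)
    k = int(n * percent / 200)  # number kept in each tail (rounded down)
    order = sorted(range(n), key=lambda i: trait_values[i])
    return sorted(order[:k] + order[n - k:])


# Hypothetical trait values for 500 individuals: a permutation of 0..499.
traits = [(i * 37) % 500 for i in range(500)]
kept = select_extremes(traits, 20)
print(len(kept), "individuals retained out of", len(traits))
```

With 500 individuals and a percentage of 20, the sketch keeps 50 individuals from each tail and drops the 400 in between, as in the worked example above.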

Christopher Basten 2002-03-27