This program accompanies

Zaykin DV, Zhivotovsky LA. (2005) Ranks of genuine associations in
whole-genome scans. Genetics 171: 813--823.
(http://www.genetics.org/cgi/content/abstract/171/2/813)

The main C++ code file is zzranks.cpp. A compiled version for MS Windows
is zzranks.exe. For the correlated p-values case, the correlation
parameters are hard-coded. The resulting correlation decay is shown in
CorrPlot.jpg.

To compile the program using the GNU C++ compiler:

    g++ zzranks.cpp -o zzranks.exe chahyp.cpp dcdflib.cpp -O3 -s

To run under the bash shell (e.g. under Linux):

    ./zzranks.exe 50000 10.5 1 10000 5e-06 1 1 $RANDOM > ranks1.txt

$RANDOM, here and below, is expected to expand to an integer: replace
$RANDOM with an integer if your shell doesn't recognize it. E.g., under
the MS Windows "Command Prompt" shell the command may look like:

    zzranks.exe 50000 10.5 1 10000 5e-06 1 1 12345 > ranks1.txt
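
If the same command should work in shells with and without $RANDOM, one
option is a fallback value. The sketch below is only an illustration:
the variable name SEED and the fallback 12345 are arbitrary choices for
this example, and treating the last argument as a random seed is an
assumption (check the top of main() in zzranks.cpp). A fixed fallback
yields the same integer on every run.

```shell
# Use $RANDOM if the shell sets it (bash does); otherwise fall back to a
# fixed integer. SEED and 12345 are illustrative, not part of the program.
SEED="${RANDOM:-12345}"
echo "last argument: $SEED"

# Run only if the compiled program is present in the current directory.
if [ -x ./zzranks.exe ]; then
    ./zzranks.exe 50000 10.5 1 10000 5e-06 1 1 "$SEED" > ranks1.txt
fi
```

Printing the value before the run makes it possible to repeat the same
run later by passing that integer explicitly.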

The order and the meaning of the parameters in the command line above
are given at the top of main() in zzranks.cpp.

To examine the ranks, R must be installed on your system. To get the
number of most significant markers needed to find at least one of the m
specified TAs with 95% probability, enter the R commands:

    >  r <- array(scan("ranks1.txt"))
    >  quantile(r, 0.95)
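
As a rough cross-check of what the 95% quantile means, the same number
can be approximated with standard shell tools. The sketch below assumes
that ranks1.txt holds whitespace-separated numbers (as scan() expects)
and that tr, grep, sort, and awk are available; the function name q95 is
made up for this example. It uses the nearest-rank definition, so its
answer can differ slightly from R's quantile(), which by default
interpolates between neighboring values.

```shell
# q95: print the nearest-rank 95th percentile of the numbers on stdin.
# Hypothetical helper for illustration; not part of zzranks.
q95() {
    tr -s ' \t' '\n' | grep -v '^$' | sort -n | awk '
        { v[NR] = $1 }                        # store sorted values, 1-based
        END {
            if (NR == 0) exit 1               # no input: nothing to report
            idx = int((95 * NR + 99) / 100)   # ceil(0.95 * NR), integer math
            print v[idx]
        }'
}

# Small self-check on the integers 1..100: prints 95.
seq 1 100 | q95

# Apply to the simulated ranks if the file exists.
if [ -f ranks1.txt ]; then
    q95 < ranks1.txt
fi
```

With the nearest-rank rule, the printed value is the smallest observed
rank r such that at least 95% of the simulated ranks are less than or
equal to r.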


If you'd like to change the correlation parameters in zzranks.cpp, you
need to tweak d,m,s after the "NOTE" comment in that file. To help
choose these, I provide "corr.cpp", which is just a stripped-down
version of the program: it outputs p-values instead of the ranks. R can
then be used to look at the correlation decay as described below.

See "NOTE" in corr.cpp on how to adjust the parameters that control the
correlation decay to your liking. Then compile to corr.exe, run
corr.exe, and examine the correlation in R:

    g++ corr.cpp -o corr.exe chahyp.cpp dcdflib.cpp -O3 -s
    ./corr.exe 200000 $RANDOM > pv.txt
    R

R commands:

    >  p <- array(scan("pv.txt"))
    >  a <- acf(p, lag.max=50, plot=F); plot(a[1:length(a$lag)])


If you changed the d,m,s values in corr.cpp above, put the same values
into zzranks.cpp and recompile:

    g++ zzranks.cpp -o zzranks.exe chahyp.cpp dcdflib.cpp -O3 -s

Now run the program again:

    ./zzranks.exe 50000 10.5 1 10000 5e-06 1 1 $RANDOM > ranks1.txt

Use R to get the number of most significant markers needed to find at
least one of the m specified TAs with 95% probability:

    R
    >  r <- array(scan("ranks1.txt"))
    >  quantile(r, 0.95)