(c)1995
Jeff O'Connell
VITESSE is a software package that computes likelihoods with the functionality of the LINKMAP and MLINK programs from LINKAGE.
VITESSE uses the novel algorithms of set-recoding and fuzzy inheritance to reduce the number of genotypes needed for exact computation of the likelihood, which accelerates the calculation. It also represents multilocus genotypes locus-by-locus to reduce the memory requirements.
The algorithms in VITESSE were developed and coded by Jeff O'Connell at the University of Pittsburgh, in collaboration with Dan Weeks of the University of Pittsburgh and the Wellcome Trust Centre for Human Genetics at Oxford University.
Running VITESSE requires that you be familiar with running the LINKAGE/FASTLINK package. If you have not used LINKAGE/FASTLINK, you can find user manuals and other helpful information on Jurg Ott's Linkage Analysis Web Site at Columbia University.
    % cnvrt_sh
    Input file :pedin
    Output file: vpedin
    -> 79 lines processed
    %
The output file is deliberately different from the input file so that you can compare answers with LINKAGE/FASTLINK by running both shell files.
Version 1.0 will only handle simple pedigrees. Simple means there are no loops and there is only one set of parents who are founders. I'm actively working on general pedigrees without loops. Yes, I know this limitation is annoying!!
    % pedshell
    Input file :pedin
    Output file: ppedin
    -> 79 lines processed
    %

When you run 'ppedin', a message will be displayed after each pedigree that is not simple. Delete those pedigrees from the pedigree file (make a backup first) and then run VITESSE.
For example, if you generated 'pedin' using lcp, first convert 'pedin' to, say, 'ppedin' and run it. Delete any pedigrees that are reported as not simple. Then convert 'pedin' to 'vpedin' and run it.
NEVER RECODE ALLELES. VITESSE does its own allele recoding. Hand recoding may lead to errors and any 'allele lumping' will not affect VITESSE's running time. My experience is that LINKAGE/FASTLINK does not always handle more than 31 alleles at a locus correctly. VITESSE has no restrictions on the number of alleles at a locus.
The final output from VITESSE appears at the end of the run because the program is optimized for MLINK and LINKMAP runs. This means that when a trait locus moves between two markers, all thetas are done for one pedigree before moving to the next, instead of doing all pedigrees for one theta. VITESSE will print which pedigree it is processing.
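The loop order described above can be sketched as follows. This is only an illustration of why the totals for each theta are complete only at the end of the run; `ped_log10_likelihood` is a hypothetical stand-in that returns made-up numbers, not a linkage calculation, and none of the names come from VITESSE.

```python
def ped_log10_likelihood(pedigree, theta):
    """Hypothetical placeholder for a per-pedigree log10 likelihood."""
    return -pedigree["size"] * (1.0 + theta)


def run(pedigrees, thetas):
    """Accumulate one total per theta, looping pedigree-first."""
    totals = {theta: 0.0 for theta in thetas}
    for ped in pedigrees:                 # outer loop: one pedigree at a time
        print("processing pedigree", ped["id"])
        for theta in thetas:              # inner loop: every theta for that pedigree
            totals[theta] += ped_log10_likelihood(ped, theta)
    # No total is final until the last pedigree has been processed,
    # so the summary can only be reported here.
    return totals


if __name__ == "__main__":
    pedigrees = [{"id": 1, "size": 2}, {"id": 2, "size": 3}]  # made-up data
    print(run(pedigrees, [0.0, 0.5]))
```

With the loops swapped (theta outside, pedigree inside), each total would be finished as soon as its theta was done, which is why a program organized that way can report results as it goes.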
The output during the run is quite different from that of LINKAGE. VITESSE prints the state of the calculation to give the user an idea of how complex the problem is. As each nuclear family is peeled from the pedigree, VITESSE prints the id's of the parents and children, and the number of Parental Pairs. This is the number of valid multilocus genotype pairings for the parents. For each pair, a calculation involving the compatible child genotypes is done, so this number relates to the complexity of that nuclear family, and thus of the pedigree: the more pairs, the longer the calculation. As you run different pedigrees or add markers, you should get a feel for the time complexity of the problems.
When you reach the last nuclear family, the output looks slightly different because VITESSE uses a novel algorithm to exploit a special symmetry in this family, which can give up to a 2-fold speedup.
The time and space complexity of LINKAGE/FASTLINK is tied to the product of the numbers of alleles at the markers, called Maxhap. This constant is actually a false indicator of the complexity of the problem. The space and time complexity of VITESSE is a function of the number of markers and the number of parental genotypes in the pedigree. Note that Maxhap is irrelevant in VITESSE.
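To get a feel for how quickly Maxhap grows, here is a minimal sketch that computes the product described above. The allele counts are invented for illustration, and the function is not taken from LINKAGE/FASTLINK.

```python
from math import prod


def maxhap(alleles_per_marker):
    """Product of the allele counts over the markers (the quantity called Maxhap)."""
    return prod(alleles_per_marker)


if __name__ == "__main__":
    allele_counts = [2, 8, 10, 6]   # hypothetical allele counts for four markers
    print(maxhap(allele_counts))    # 960
    # Adding one more 10-allele marker multiplies the product by 10.
    print(maxhap(allele_counts + [10]))
```

The product depends only on the allele counts, not on which genotypes are actually possible in a given pedigree, which is the sense in which it says little about the real size of the problem.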
I'm developing a preprocessor program that will quickly give this complexity information without doing the entire likelihood calculation.
PEDIGREE/MENDELIAN INCONSISTENCIES
VITESSE assumes to some extent that the pedigree file is correct and does not have extensive diagnostic checking. Assuming the pedigree file is correct, VITESSE is guaranteed to find any Mendelian inheritance inconsistencies in your pedigree. If VITESSE finds an inconsistency it exits and gives information on the screen and in the file 'vitesse.dbg' to assist you in locating the problem.
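For readers who want to pre-screen data, the simplest kind of Mendelian check looks at one locus in one father-mother-child trio. The sketch below is not VITESSE's method and is far weaker than a full pedigree check, since it cannot see inconsistencies that only appear across several relatives. It assumes genotypes are pairs of allele numbers with 0 meaning untyped, as in LINKAGE pedigree files; the function name is hypothetical.

```python
def trio_is_consistent(father, mother, child):
    """Return True if the child can carry one paternal and one maternal allele.

    Each genotype is an (allele, allele) tuple at a single locus;
    allele 0 marks an untyped person.
    """
    if 0 in child:
        return True  # untyped child: nothing to test in this simple sketch

    def can_transmit(parent, allele):
        # An untyped parent could carry any allele.
        return 0 in parent or allele in parent

    a, b = child
    return (can_transmit(father, a) and can_transmit(mother, b)) or (
        can_transmit(father, b) and can_transmit(mother, a)
    )


if __name__ == "__main__":
    print(trio_is_consistent((1, 2), (3, 4), (1, 3)))  # True
    print(trio_is_consistent((1, 2), (3, 4), (1, 2)))  # False: no maternal allele fits
```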
The memory requirements are a function of the number of loci and the number of parental pairs. On most problems I've hit the time limit before the memory limit. I am testing other implementations which use less memory and have not yet decided whether, or how, to release them as separate versions.
For example, on my Sun Workstation I get inaccurate log10 answers from FASTLINK. To get better accuracy in FASTLINK change the log10_ value in commondefs.h to 2.302585093 or replace it with '(log(10.0))'.
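The effect of an imprecise ln(10) constant can be seen with a small sketch. The truncated constant and the log-likelihood value below are invented for illustration; the text above does not say what value commondefs.h ships with.

```python
import math

LN10 = math.log(10.0)  # natural log of 10, about 2.302585093


def ln_to_log10(ln_value, ln10=LN10):
    """Convert a natural-log value to log10 by dividing by ln(10)."""
    return ln_value / ln10


if __name__ == "__main__":
    ln_likelihood = -250.0   # hypothetical natural-log likelihood
    truncated = 2.3026       # hypothetical imprecise constant
    exact = ln_to_log10(ln_likelihood)
    approx = ln_to_log10(ln_likelihood, truncated)
    print("full-precision constant:", exact)
    print("truncated constant:     ", approx)
    print("difference:             ", abs(exact - approx))
```

Because the error scales with the size of the log likelihood, a small discrepancy in the constant can show up in the later digits when comparing two programs on large pedigrees.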
*THUS, TO COMPARE:
VITESSE is also available with an alternative interface as part of the GAS system -- Genetic Analysis System by Alan Young, Oxford University.
The GAS/VITESSE implementation offers some functionality not found in LINKMAP and MLINK: 2-locus optimization of theta, exclusion mapping, and automated multi-point mapping (up to 8 loci simultaneously) across any number of adjacent marker loci. It also produces a PostScript file for direct graphical output.
For more details use either:
VITESSE is also available at the above sites.
I'm currently working on various algorithmic additions to VITESSE, so that it will cover the full range of analyses available in LINKAGE/FASTLINK:
In addition,
I would like to thank the following Beta testers for excellent information and suggestions on improving VITESSE and platform compatibility.
"The VITESSE algorithm for rapid exact multilocus
linkage analysis via genotype set-recoding and fuzzy inheritance",
O'Connell JR, Weeks DE, Nature Genetics 11:402-408, December 1995
Please reference this if you use VITESSE in any published work.
I'm very interested in receiving feedback on your experiences. If you have any questions, problems, suggestions, or comments, please get in touch.
Thank you for your assistance,
Jeff O'Connell