Xiaoli Xie and Jurg Ott, Columbia University New York
15 December 1993
The VaryPhen program is a utility program, which may be used as a tool in linkage analysis. This brief user's guide consists of five sections: I. Introduction, II. Installation, III. Overview of Program Usage, IV. Example, and V. References. The reference for the use of this program is Xie and Ott (1990).
The VaryPhen program analyses the effect of a change in an individual's affection status ("VARY PHENotype") on the lod score. In linkage analyses involving a disease locus, the affection status is usually considered known with certainty. Uncertainties about affection status may be taken into account by analysis with different penetrance models (Ott 1990), but a change from unaffected to affected or vice versa will generally have some effect on the lod score. This program determines the change in (multipoint) lod score due to a change in affection status for each individual in a family pedigree. It identifies those individuals whose affection statuses are the most critical for the analysis and reveals those who are somewhat uninformative for linkage. It is thus intended as an aid to corroborate the results of a linkage analysis and to alert investigators to the effect of possible errors in classifying the disease phenotype of an individual on the lod score.
This program runs the LINKMAP program (Lathrop et al. 1984) to analyze the given pedigree and, for each individual in turn, temporarily changes his or her affection status and runs LINKMAP again. Output consists of a list showing whose phenotype was changed, how it was changed, and the resulting change in lod score, both at the user-chosen disease map position (POS) [or at the estimated map position obtained with the phenotypes as given, if this option is chosen] and at the map position at which the lod score is maximized when the i-th individual's phenotype is changed.
The VaryPhen program is currently available for DOS, OS/2 and DEC/VMS machines. A Sun SPARCstation version will soon be available. The VaryPhen program need not be in any particular directory, but we suggest putting it in a directory that is on your operating system's search path. For example, when you use the VaryPhen program under DOS, copy it to the C:\BIN directory and put C:\BIN in the path. To check whether C:\BIN is in the path, type PATH at the DOS prompt. You will then see something like this:
PATH=C:\DOS;C:\BIN; . . .
If C:\BIN is not in the path, you may modify the AUTOEXEC.BAT file to put it in the path.
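The check described above can also be scripted. The following is a minimal sketch in Python, which is not part of VaryPhen; the function name dir_on_path is hypothetical, and the sketch assumes a DOS-style path string with entries separated by semicolons.

```python
def dir_on_path(path_value, directory):
    """Return True if directory is one of the entries of a DOS-style PATH string."""
    # Accept the raw output of the PATH command, which starts with "PATH=".
    if path_value.upper().startswith("PATH="):
        path_value = path_value[5:]
    # DOS separates entries with semicolons and ignores letter case.
    wanted = directory.rstrip("\\").upper()
    entries = [entry.strip().rstrip("\\").upper() for entry in path_value.split(";")]
    return wanted in entries


print(dir_on_path(r"PATH=C:\DOS;C:\BIN", r"C:\BIN"))   # True
print(dir_on_path(r"PATH=C:\DOS", r"C:\BIN"))          # False
```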
The VaryPhen program uses standard LINKAGE data and pedigree files for the LINKMAP program and requires a batch file called PEDIN.BAT, which defines the desired analysis and can be made with the LCP program. The LINKAGE programs (LINKMAP, UNKNOWN, etc.) should also be accessible (i.e., in the path). It is a good idea to put all the data files in a separate directory.
The VaryPhen program requires three input files: a datafile, DATAIN.DAT; a pedigree file, PEDIN.DAT; and a batch file, PEDIN.BAT; it produces an output file, VPOUT.DAT. In the following list, names in brackets are default names. You may assign different names if you wish:
Once you invoke the VaryPhen program, you will see the following on the screen:
Pedigree File [PEDIN.DAT]:
Batch File [PEDIN.BAT]:
The Output File from LINKMAP [FINAL.OUT]:
The Output File from VP [VPOUT.DAT]:
In brackets are the default names for each file. You can change them by typing a new name. To accept the default name, just press the <ENTER> key. After you have given all the file names, the program checks whether the files exist. If any of the first three files does not exist, the program stops and gives a brief message. If VPOUT.DAT already exists, the program asks whether you wish to overwrite it; if you do not, the program stops.
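The default-name and file-existence logic described above might be sketched as follows. This is an illustration in Python under stated assumptions, not the code used by VaryPhen: the names DEFAULTS, resolve_name and missing_files are hypothetical, and only the default file names are taken from the prompts shown above.

```python
import os

# Default names as shown in the prompts; the dictionary keys are illustrative.
DEFAULTS = {
    "pedigree": "PEDIN.DAT",
    "batch": "PEDIN.BAT",
    "linkmap_output": "FINAL.OUT",
    "vp_output": "VPOUT.DAT",
}


def resolve_name(answer, default):
    """An empty answer (just <ENTER>) selects the default name."""
    answer = answer.strip()
    return answer if answer else default


def missing_files(names):
    """Return those file names that do not exist on disk."""
    return [name for name in names if not os.path.exists(name)]


pedigree = resolve_name("", DEFAULTS["pedigree"])      # user pressed <ENTER>
batch = resolve_name("MYRUN.BAT", DEFAULTS["batch"])   # user typed a new name
print(pedigree, batch)                                 # PEDIN.DAT MYRUN.BAT
print(missing_files([pedigree, batch]))                # lists whichever are absent
```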
For the operation of the program you can select one of the following possibilities by simply typing the corresponding number when prompted:
Meaning of these possibilities:
After giving your choice, the program displays the following two options for the disease map position at which lod scores should be calculated:
Under option 1, lod scores are always calculated at the fixed disease map position chosen by the user. Under option 2, lod scores are calculated at the disease position that has the maximum lod score in the original (unmodified) data. In addition, for each change in affection status the program calculates the maximum lod score, wherever it occurs.
You can choose the default option simply by pressing the <ENTER> key. The program then displays map positions on the screen, as follows (remember that locus no. 1 is the disease locus):
POS  ORDER   THETAS
 0   1 2 3   0.500 0.100
 1   1 2 3   0.400 0.100
 2   1 2 3   0.300 0.100
 3   1 2 3   0.200 0.100
 4   1 2 3   0.100 0.100
 5   1 2 3   0.000 0.100
.....
The program then prompts you as follows:
The map position (POS) you like to simulate==> [More]...
You can choose the locus order and the corresponding interlocus recombination fractions by typing the appropriate disease position number. If [More] ... appears on the screen, that means there are additional map positions on the next screen. If you choose one on the first screen, the remaining ones will not be displayed. If you want to choose a position other than the ones on this screen, simply hit the <ENTER> key and the program will display more positions from which you may choose.
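Each row of the position listing carries a position number, a locus order, and the interlocus recombination fractions. As a minimal sketch (in Python, not part of VaryPhen; the function name parse_position is hypothetical), such a row can be split into its three parts, assuming that a row with n loci always holds n - 1 thetas:

```python
def parse_position(line):
    """Split one row of the POS / ORDER / THETAS listing into its three parts."""
    tokens = line.split()
    # One position number, n locus numbers and n - 1 thetas give 2n tokens.
    n_loci = len(tokens) // 2
    pos = int(tokens[0])
    order = [int(t) for t in tokens[1:1 + n_loci]]
    thetas = [float(t) for t in tokens[1 + n_loci:]]
    return pos, order, thetas


print(parse_position("4 1 2 3 0.100 0.100"))
# (4, [1, 2, 3], [0.1, 0.1])
```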
Once these parameters are entered, the program checks the pedigree file. If it is incorrect, the program may stop running. If the pedigree file is correct, the program repeats the following for each pedigree member:
until it reaches the last individual.
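The per-individual loop can be sketched as follows. This is an illustration in Python, not the actual implementation: the argument lod_score is a hypothetical stand-in for a complete LINKMAP run, the coding 1 = unaffected and 2 = affected follows the example in this guide, and skipping individuals with status 0 (unknown) is an assumption of the sketch.

```python
def flip_status(status):
    """Swap 1 (unaffected) and 2 (affected); leave any other code unchanged."""
    return {1: 2, 2: 1}.get(status, status)


def vary_phenotypes(statuses, lod_score):
    """statuses maps individual -> affection status.

    lod_score is a callable standing in for a LINKMAP run: it takes such a
    mapping and returns a lod score. Returns (baseline, per-individual results).
    """
    baseline = lod_score(statuses)
    results = {}
    for person in sorted(statuses):
        new_status = flip_status(statuses[person])
        if new_status == statuses[person]:
            continue                      # assumed: unknown status is not varied
        changed = dict(statuses)          # temporary copy; the original stays intact
        changed[person] = new_status
        results[person] = lod_score(changed)
    return baseline, results


def count_affected(statuses):
    """Placeholder scoring function, used only to make the sketch runnable."""
    return sum(1 for s in statuses.values() if s == 2)


print(vary_phenotypes({1: 2, 2: 1, 3: 0}, count_affected))
# (1, {1: 0, 2: 2})
```

Working on a copy of the statuses mirrors the temporary nature of each change: every individual is varied against the original data, never against an earlier modification.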
Consider the following example pedigree:
          [1]---.---(2)
                |
      .-------------------.
      |                   |
  [3]-.-(4)           [8]-.-(9)
      |                   |
 .---------.           .-----.
 |    |    |           |     |
(5)  [6]  (7)         [10]  [11]
The corresponding pedigree file PEDIN.DAT is as follows (it is in standard LINKAGE format, as produced by MAKEPED):
1 1 0 0 3 0 0 1 1 2 0 0 0 0 0 0 0 0 0 0 Ped: 1 Per: 1
1 2 0 0 3 0 0 2 0 1 0 0 0 0 0 0 0 0 0 0 Ped: 1 Per: 2
1 3 1 2 5 8 8 1 0 2 1 2 1 2 2 2 2 2 1 2 Ped: 1 Per: 3
1 4 0 0 5 0 0 2 0 1 1 2 3 3 1 1 1 2 1 1 Ped: 1 Per: 4
1 5 3 4 0 6 6 2 0 2 1 2 2 3 1 2 1 2 1 1 Ped: 1 Per: 5
1 6 3 4 0 7 7 1 0 1 1 1 1 3 1 2 1 2 1 1 Ped: 1 Per: 6
1 7 3 4 0 0 0 2 0 1 0 0 1 3 1 2 0 0 1 2 Ped: 1 Per: 7
1 8 1 2 10 15 15 1 0 2 1 2 0 0 2 2 2 2 1 1 Ped: 1 Per: 8
1 9 0 0 10 0 0 2 0 1 1 1 1 3 1 1 1 2 2 2 Ped: 1 Per: 9
1 10 8 9 0 11 11 1 0 1 1 1 1 2 1 2 1 2 1 2 Ped: 1 Per: 10
1 11 8 9 13 0 0 1 0 1 0 0 0 0 1 2 1 2 1 2 Ped: 1 Per: 11
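To see where the affection status sits in such a record, the following sketch (in Python; the function name parse_pedigree_row is hypothetical) splits one row into named fields. It assumes the usual post-MAKEPED column order of pedigree, individual, father, mother, first offspring, next paternal sib, next maternal sib, sex, and proband, followed by the locus data, with the disease locus first and a single liability class as in this example.

```python
def parse_pedigree_row(line):
    """Split one post-MAKEPED pedigree record into named fields."""
    # Drop the trailing "Ped: ... Per: ..." annotation before splitting.
    fields = line.split("Ped:")[0].split()
    return {
        "pedigree": int(fields[0]),
        "id": int(fields[1]),
        "father": int(fields[2]),
        "mother": int(fields[3]),
        "sex": int(fields[7]),            # 1 = male, 2 = female
        "affection": int(fields[9]),      # disease locus: 1 = unaffected, 2 = affected
        "markers": [int(x) for x in fields[10:]],
    }


row = parse_pedigree_row("1 3 1 2 5 8 8 1 0 2 1 2 1 2 2 2 2 2 1 2 Ped: 1 Per: 3")
print(row["id"], row["sex"], row["affection"])   # 3 1 2
```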
The corresponding datafile DATAIN.DAT, in standard LINKAGE format for LINKMAP, is as follows:
6 0 0 5 << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
0 0.0 0.0 0 << MUT LOCUS, MUT RATE, HAPLOTYPE FREQUENCIES (IF 1)
1 2 3 4 5 6
1 2 << AFFECTION, NO. OF ALLELES
0.990000 0.010000 << GENE FREQUENCIES
1 << NO. OF LIABILITY CLASSES
0.0000 0.5000 0.5000 << PENETRANCES
3 2 << ALLELE NUMBERS, NO. OF ALLELES
0.500000 0.500000 << GENE FREQUENCIES
3 4 << ALLELE NUMBERS, NO. OF ALLELES
0.250000 0.250000 0.250000 0.250000 << GENE FREQUENCIES
3 2 << ALLELE NUMBERS, NO. OF ALLELES
0.500000 0.500000 << GENE FREQUENCIES
3 2 << ALLELE NUMBERS, NO. OF ALLELES
0.500000 0.500000 << GENE FREQUENCIES
3 2 << ALLELE NUMBERS, NO. OF ALLELES
0.500000 0.500000 << GENE FREQUENCIES
0 0 << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
0.05000 0.10000 0.10000 0.10000 0.10000 << RECOMBINATION VALUES
1 0.05000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE
The FINAL.OUT file, which is the output file from the LINKMAP program used by VaryPhen, looks as follows:
********************************************************************
LINKMAP
Pedigree File : lath.TPD
Parameter File : lathrop.TDT
Output Pedigree File : PEDFILE.DAT
Output Parameter File : DATAFILE.DAT
Log File : LSP.LOG
Stream File : STREAM.DAT
Date Run : 1-Oct-90 21:45:54
Sex Difference : 0
Test Locus : 1
Stop Value : 0.00000000
Number of Evaluations : 5
Locus Order : 1 2 3 4
Male Recomb. Fractions : 0.50000000 0.10000000 0.10000000
********************************************************************
Length of real variables = 8 bytes
LINKAGE (V5.03) WITH 4-POINT AUTOSOMAL DATA
LOCUS ORDER: 1 2 3 4
-----------------------------------
-----------------------------------
THETAS 0.500 0.100 0.100
-----------------------------------
PEDIGREE | LN LIKE | LOG 10 LIKE
-----------------------------------
1 -32.953612 -14.311541
-----------------------------------
TOTALS -32.953612 -14.311541
-2 LN(LIKE) = 65.907224
-----------------------------------
-----------------------------------
THETAS 0.400 0.100 0.100
-----------------------------------
PEDIGREE | LN LIKE | LOG 10 LIKE
-----------------------------------
1 -32.924425 -14.298866
-----------------------------------
TOTALS -32.924425 -14.298866
-2 LN(LIKE) = 65.848850
-----------------------------------
. . . . . .
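A lod score can be derived from two such blocks of output. The sketch below (in Python; the function name log10_likelihoods is hypothetical) assumes the usual definition, in which the lod score at a map position is the log10 likelihood at that position minus the log10 likelihood with the disease locus unlinked (first theta = 0.5). How VaryPhen itself reads FINAL.OUT is not described in this guide.

```python
def log10_likelihoods(text):
    """Collect (thetas, total log10 likelihood) pairs from LINKMAP-style output."""
    results = []
    thetas = None
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "THETAS":
            thetas = tuple(float(t) for t in tokens[1:])
        elif tokens and tokens[0] == "TOTALS" and thetas is not None:
            # The last number on the TOTALS line is the LOG 10 LIKE column.
            results.append((thetas, float(tokens[-1])))
    return results


# The two blocks shown above, reduced to the lines the sketch needs.
sample = """\
THETAS 0.500 0.100 0.100
TOTALS -32.953612 -14.311541
THETAS 0.400 0.100 0.100
TOTALS -32.924425 -14.298866
"""
(unlinked_thetas, unlinked), (thetas, linked) = log10_likelihoods(sample)
print(thetas, round(linked - unlinked, 6))   # (0.4, 0.1, 0.1) 0.012675
```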
The output file, VPOUT.DAT, produced by the VaryPhen program has two parts. The first part displays all possible disease map positions with the locus order and corresponding interlocus thetas. The second part is a table, which consists of a list of individuals and the difference in lod score when their affection status is changed. The output file for the given example is as follows (detailed explanations are given below):
POS ORDER THETAS
0 1 2 3 4 0.500 0.100 0.100
1 1 2 3 4 0.400 0.100 0.100
2 1 2 3 4 0.300 0.100 0.100
3 1 2 3 4 0.200 0.100 0.100
4 1 2 3 4 0.100 0.100 0.100
5 1 2 3 4 0.000 0.100 0.100
6 2 1 3 4 0.000 0.100 0.100
7 2 1 3 4 0.020 0.083 0.100
8 2 1 3 4 0.040 0.065 0.100
9 2 1 3 4 0.060 0.045 0.100
. . . . .
USER CHOSEN POS IS 11
             PHENOTYPE
              CHANGED
INDIVIDUAL   FROM   TO  |     Zuser|  pos |      Zmax|  pos |
=============================================================
PHENOTYPE AS GIVEN      |   0.36493|   11 |   0.36493|   11 |
-------------------------------------------------------------
     1         2     1  |   0.36630|   11 |   0.36630|   11 |
-------------------------------------------------------------
     2         1     2  |   0.17058|   11 |   0.19328|    5 |
-------------------------------------------------------------
     3         2     1  |   0.35894|   11 |   0.35894|   11 |
-------------------------------------------------------------
     4         1     2  |   0.18061|   11 |   0.21104|    5 |
-------------------------------------------------------------
     5         2     1  |   0.05634|   11 |   0.12071|    5 |
-------------------------------------------------------------
     6         1     2  |  -1.42465|   11 |   0.00000|    0 |
-------------------------------------------------------------
     7         1     2  |  -1.42465|   11 |   0.00000|    0 |
-------------------------------------------------------------
     8         2     1  |   0.02299|   11 |   0.06077|    5 |
-------------------------------------------------------------
     9         1     2  |   0.40062|   11 |   0.40062|   11 |
-------------------------------------------------------------
    10         1     2  |   0.63163|   11 |   0.63163|   11 |
-------------------------------------------------------------
    11         1     2  |   0.36528|   11 |   0.36528|   11 |
-------------------------------------------------------------
The meaning of each column is as follows:
The first line in the output table above presents results based on the given (unmodified) pedigree file, while the remaining lines show results when the i-th individual's affection status is changed.
Summary of results for example data: The overall multipoint lod score for disease (locus 1) versus map of loci (loci 2, 3, 4) is 0.36, obtained at map position 11.
If the phenotype of individual 1 is switched from 2 (affected) to 1 (unaffected), the max. lod score is 0.37, obtained at the same map position as before. Hence, individual 1 is not very informative for linkage in this family.
If individual 2 switches affection status from 1 to 2, this results in a drop of the max. lod score from 0.36 to 0.19, obtained at map position 5.
Comparing these changes in maximum lod score for all individuals will show who in the family is most crucial for the analysis.
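This comparison can be sketched as follows, using the Zmax values from the example table. The sketch is in Python and is not part of VaryPhen; the function name rank_by_impact is hypothetical, and ranking by the absolute size of the change is one possible criterion chosen here for illustration.

```python
# Zmax with phenotypes as given, and Zmax after changing each individual,
# copied from the example output table.
ZMAX_GIVEN = 0.36493
ZMAX_CHANGED = {
    1: 0.36630, 2: 0.19328, 3: 0.35894, 4: 0.21104, 5: 0.12071, 6: 0.00000,
    7: 0.00000, 8: 0.06077, 9: 0.40062, 10: 0.63163, 11: 0.36528,
}


def rank_by_impact(z_given, z_changed):
    """Sort individuals by the absolute change in maximum lod score, largest first."""
    deltas = {person: z - z_given for person, z in z_changed.items()}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)


for person, delta in rank_by_impact(ZMAX_GIVEN, ZMAX_CHANGED):
    print(f"{person:>3} {delta:+.5f}")
```

With these numbers, individuals 6 and 7 head the list (the maximum lod score falls to zero when either is reclassified as affected), while individuals 1 and 11 come last.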
Lathrop GM, Lalouel JM, Julier C, Ott J (1984) Strategies for multilocus analysis in humans. Proc Natl Acad Sci USA 81, 3443-3446
Ott J (1990) Genetic linkage analysis under uncertain disease definition. In Banbury Report 33: Genetics and Biology of Alcoholism, edited by C.R. Cloninger and H. Begleiter. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press, pp. 327-331
Xie X, Ott J (1990) Determining the effect of a change in affection status on the lod score. Am J Hum Genet 47, A205