VaryPhen Program


Xiaoli Xie and Jurg Ott, Columbia University New York

15 December 1993

The VaryPhen program is a utility that may be used as a tool in linkage analysis. This brief user's guide consists of five sections: I. Introduction, II. Installation, III. Overview of Program Usage, IV. Example, and V. References. The reference for the use of this program is Xie and Ott (1990).

I. Introduction

The VaryPhen program analyses the effect of a change in an individual's affection status ("VARY PHENotype") on the lod score. In linkage analyses involving a disease locus, the affection status is usually considered known with certainty. Uncertainties about affection status may be taken into account by analysis with different penetrance models (Ott 1990), but a change from unaffected to affected or vice versa will generally have some effect on the lod score. This program determines the change in (multipoint) lod score due to a change in affection status for each individual in a family pedigree. It identifies those individuals whose affection statuses are the most critical for the analysis and reveals those who are somewhat uninformative for linkage. It is thus intended as an aid to corroborate the results of a linkage analysis and to alert investigators to the effect of possible errors in classifying the disease phenotype of an individual on the lod score.

This program runs the LINKMAP program (Lathrop et al. 1984) to analyze the given pedigree and, for each individual in turn, temporarily changes his or her affection status and runs LINKMAP again. Output consists of a list showing whose phenotype was changed, how it was changed, and the resulting change in lod score, both at the user-chosen disease map position (POS) [or at the estimated map position for the given phenotypes, if this option is chosen] and at the map position at which the lod score is maximized when the i-th individual's phenotype is changed.

II. Installation

The VaryPhen program is currently available for DOS, OS/2 and DEC/VMS machines. A Sun SPARCstation version will soon be available. The VaryPhen program need not be in any particular directory, but we suggest putting it in a directory that can be accessed directly by your operating system. For example, when you use the VaryPhen program under DOS, copy the VaryPhen program to the C:\BIN directory and put the C:\BIN directory in the path. To make sure C:\BIN is in the path, you may type PATH at the DOS prompt. You will then see something like this:

PATH=C:\DOS;C:\BIN; . . .

If C:\BIN is not in the path, you may modify the AUTOEXEC.BAT file to put it in the path.

The VaryPhen program uses standard LINKAGE data and pedigree files for the LINKMAP program, and requires a batch file called PEDIN.BAT, which defines the desired analysis and can be made with the LCP program. The LINKAGE programs (LINKMAP, UNKNOWN, etc.) should also be accessible (i.e., in the path). It is a good idea to put all the data files in a separate directory.

III. Overview of Program Usage

The VaryPhen program requires three input files: a datafile, DATAIN.DAT; a pedigree file, PEDIN.DAT; and a batch file, PEDIN.BAT; it produces an output file, VPOUT.DAT. In the following list, names in brackets are default names. You may assign different names if you wish:

[PEDIN.DAT]
A file holding the pedigree data, in LINKAGE format (LINKAGE pedigree file, after processing by MAKEPED)
[DATAIN.DAT]
A file holding the description of the loci, locus order, etc., in LINKAGE format (LINKAGE datafile, preferably made using PREPLINK). The disease locus must be the first locus in the datafile and pedigree file. Presently only one liability class is allowed at the disease locus (an extension to any number of liability classes is planned).
[PEDIN.BAT]
A batch file defining the desired analysis, which may be created with the LCP program
[FINAL.OUT]
An output file produced by PEDIN.BAT, which contains results from the LINKMAP analysis
[VPOUT.DAT]
An output file containing a list of individuals and the differences of lod score at certain map positions (POS) of the disease locus.

Once you invoke the VaryPhen program, you will see the following on the screen:

     Pedigree File [PEDIN.DAT]:
     Batch File [PEDIN.BAT]:
     The Output File from LINKMAP [FINAL.OUT]:
     The Output File from VP [VPOUT.DAT]:
In brackets are the default names for each file. You can change them by typing a new name. If you accept the default name, just press the <ENTER> key. After you give all the file names, the program checks whether the files exist. If any of the first three files does not exist, the program will stop running and give a brief message. If VPOUT.DAT already exists, the program will ask you if you wish to overwrite it. If you do not wish to overwrite it, the program will stop running.

For the operation of the program you can select one of the following possibilities by simply typing the corresponding number when prompted:

  1. Switch Affection Status
  2. Change Affection Status to Unknown
  3. Do Both in Same Run

Meaning of these possibilities:

1. Switch Affection Status:
For each of the N individuals temporarily change the affection status (affected to unaffected or vice versa) one at a time and keep the remaining N-1 individuals unchanged. The program works on the affection status code and temporarily changes 2 to 1 or vice versa.
2. Change Affection Status to Unknown:
For each of the N individuals temporarily change the affection status to unknown (change 1 or 2 to 0) and keep the remaining N-1 individuals unchanged.
3. Do Both in Same Run:
For each of the N individuals switch the affection status first and then change the affection status to unknown.
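The three operations above can be sketched in a few lines of Python. This is only an illustration, not the VaryPhen source code. It assumes that the affection status is the tenth whitespace-separated field of a pedigree line (which matches the example pedigree file in section IV, where the disease locus comes first and has one liability class), and the function name change_status is hypothetical. Leaving an already unknown status (0) untouched under "switch" is also an assumption.

```python
# Assumption: in a post-MAKEPED pedigree line the first nine fields are
# pedigree, ID, father, mother, first offspring, next paternal sib,
# next maternal sib, sex and proband, so the affection status of the
# first (disease) locus is field 10, i.e. index 9.
AFFECTION_FIELD = 9


def change_status(line, mode):
    """Return a copy of one pedigree line with the affection status changed.

    mode "switch"  : 2 -> 1 or 1 -> 2 (0 is left as it is; an assumption)
    mode "unknown" : 1 or 2 -> 0
    """
    fields = line.split()
    status = fields[AFFECTION_FIELD]
    if mode == "switch":
        new_status = {"1": "2", "2": "1"}.get(status, status)
    elif mode == "unknown":
        new_status = "0"
    else:
        raise ValueError("mode must be 'switch' or 'unknown'")
    fields[AFFECTION_FIELD] = new_status
    # Column alignment is not preserved; fields are joined by single blanks.
    return " ".join(fields)


# First two individuals of the example pedigree in section IV.
person1 = "1  1  0  0  3  0  0 1 1  2  0 0  0 0  0 0  0 0  0 0  Ped: 1  Per: 1"
person2 = "1  2  0  0  3  0  0 2 0  1  0 0  0 0  0 0  0 0  0 0  Ped: 1  Per: 2"

print(change_status(person1, "switch"))
print(change_status(person2, "unknown"))
```

For option 3, such a function would simply be applied twice to each individual, once with each mode, with the other N-1 lines left unchanged.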

After giving your choice, the program displays the following two options for the disease map position at which lod scores should be calculated:

  1. The user chosen map positions [DEFAULT]
  2. The map position at the maximum lod score for the given phenotype.

Under option 1, lod scores will always be calculated at the fixed disease map position chosen by the user. Under option 2, lod scores will be calculated at the disease map position that has the maximum lod score in the original (unmodified) data. In either case, for each change in affection status the program also calculates the maximum lod score, wherever it occurs.
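The distinction between the lod score at a fixed position and the maximum lod score can be illustrated as follows. This is a minimal sketch, not the program's own code: the function name and the lod values are hypothetical, and the list index is taken to be the map position number (POS).

```python
def lod_at_position_and_maximum(lods, fixed_pos):
    """Return (lod at fixed_pos, fixed_pos, maximum lod, position of maximum).

    lods[p] is the lod score at map position p (hypothetical values below).
    """
    max_pos = max(range(len(lods)), key=lambda p: lods[p])
    return lods[fixed_pos], fixed_pos, lods[max_pos], max_pos


# Hypothetical lod scores at map positions 0, 1, 2, 3, 4.
lods = [0.0, 0.05, 0.12, 0.10, -0.30]

print(lod_at_position_and_maximum(lods, 3))
```

With these made-up values the lod score at the fixed position 3 is 0.10, while the maximum, 0.12, occurs at position 2.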

You can choose the default option simply by pressing the <ENTER> key. The program then displays map positions on the screen, as follows (remember that locus no. 1 is the disease locus):

  POS       ORDER       THETAS
   0        1 2 3     0.500 0.100
   1        1 2 3     0.400 0.100
   2        1 2 3     0.300 0.100
   3        1 2 3     0.200 0.100
   4        1 2 3     0.100 0.100
   5        1 2 3     0.000 0.100

.....

The program then prompts you as follows:

The map position (POS) you like to simulate==>  [More]...

You can choose the locus order and the corresponding interlocus recombination fractions by typing the appropriate disease position number. If [More] ... appears on the screen, that means there are additional map positions on the next screen. If you choose one on the first screen, the remaining ones will not be displayed. If you want to choose a position other than the ones on this screen, simply hit the <ENTER> key and the program will display more positions from which you may choose.

Once these parameters are entered, the program starts to check the pedigree file. If it is incorrect, the program may stop running. If the pedigree file is correct, the program repeats the analysis for each pedigree member in turn (temporarily changing that member's affection status and running LINKMAP again) until it reaches the last individual.

IV. Example

Consider the following example pedigree:

                 [1]---.---(2)
                       |
            .-------------------.
            |                   |
           [3]-.-(4)           [8]-.-(9)
               |                   |
          .---------.           .-----.
          |    |    |           |     |
         (5)  [6]  (7)         [10]  [11]

The corresponding pedigree file PEDIN.DAT is as follows (it is in standard LINKAGE format, as produced by MAKEPED):

1  1  0  0  3  0  0 1 1  2  0 0  0 0  0 0  0 0  0 0  Ped: 1  Per: 1
1  2  0  0  3  0  0 2 0  1  0 0  0 0  0 0  0 0  0 0  Ped: 1  Per: 2
1  3  1  2  5  8  8 1 0  2  1 2  1 2  2 2  2 2  1 2  Ped: 1  Per: 3
1  4  0  0  5  0  0 2 0  1  1 2  3 3  1 1  1 2  1 1  Ped: 1  Per: 4
1  5  3  4  0  6  6 2 0  2  1 2  2 3  1 2  1 2  1 1  Ped: 1  Per: 5
1  6  3  4  0  7  7 1 0  1  1 1  1 3  1 2  1 2  1 1  Ped: 1  Per: 6
1  7  3  4  0  0  0 2 0  1  0 0  1 3  1 2  0 0  1 2  Ped: 1  Per: 7
1  8  1  2 10 15 15 1 0  2  1 2  0 0  2 2  2 2  1 1  Ped: 1  Per: 8
1  9  0  0 10  0  0 2 0  1  1 1  1 3  1 1  1 2  2 2  Ped: 1  Per: 9
1 10  8  9  0 11 11 1 0  1  1 1  1 2  1 2  1 2  1 2  Ped: 1  Per: 10
1 11  8  9 13  0  0 1 0  1  0 0  0 0  1 2  1 2  1 2  Ped: 1  Per: 11

The corresponding datafile DATAIN.DAT, in standard LINKAGE format for LINKMAP, is as follows:

 6 0 0 5  << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
 0 0.0 0.0 0  << MUT LOCUS, MUT RATE, HAPLOTYPE FREQUENCIES (IF 1)
  1  2  3  4  5  6
1   2  << AFFECTION, NO. OF ALLELES
 0.990000 0.010000   << GENE FREQUENCIES
 1 << NO. OF LIABILITY CLASSES
 0.0000 0.5000 0.5000 << PENETRANCES
3   2  << ALLELE NUMBERS, NO. OF ALLELES
 0.500000 0.500000   << GENE FREQUENCIES
3   4  << ALLELE NUMBERS, NO. OF ALLELES
 0.250000 0.250000 0.250000 0.250000   << GENE FREQUENCIES
3   2  << ALLELE NUMBERS, NO. OF ALLELES
 0.500000 0.500000   << GENE FREQUENCIES
3   2  << ALLELE NUMBERS, NO. OF ALLELES
 0.500000 0.500000   << GENE FREQUENCIES
3   2  << ALLELE NUMBERS, NO. OF ALLELES
 0.500000 0.500000   << GENE FREQUENCIES
 0 0  << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
 0.05000 0.10000 0.10000 0.10000 0.10000 << RECOMBINATION VALUES
 1 0.05000 0.45000 << REC VARIED, INCREMENT, FINISHING VALUE

The FINAL.OUT file, which is the output file from the LINKMAP program used by VaryPhen, looks as follows:


********************************************************************
		LINKMAP


     Pedigree File              : lath.TPD
     Parameter File             : lathrop.TDT
     Output Pedigree File       : PEDFILE.DAT
     Output Parameter File      : DATAFILE.DAT
     Log File                   : LSP.LOG
     Stream File                : STREAM.DAT

     Date Run                   :  1-Oct-90  21:45:54

     Sex Difference             : 0
     Test Locus                 : 1
     Stop Value                 : 0.00000000
     Number of Evaluations      : 5

     Locus Order                : 1 2 3 4
     Male Recomb. Fractions     : 0.50000000 0.10000000 0.10000000
********************************************************************

Length of real variables = 8 bytes
LINKAGE (V5.03) WITH  4-POINT AUTOSOMAL DATA
LOCUS ORDER:  1 2 3 4
-----------------------------------
-----------------------------------
THETAS  0.500 0.100 0.100
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        1   -32.953612   -14.311541
-----------------------------------
TOTALS      -32.953612   -14.311541
-2 LN(LIKE) =   65.907224
-----------------------------------
-----------------------------------
THETAS  0.400 0.100 0.100
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        1   -32.924425   -14.298866
-----------------------------------
TOTALS      -32.924425   -14.298866
-2 LN(LIKE) =   65.848850
-----------------------------------
. . . . . .
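As an illustration of how a lod score can be obtained from such output, the following Python sketch reads the LOG 10 LIKE totals from the two blocks shown above and takes their difference. It assumes that the lod score at a map position is the log10 likelihood at that position minus the log10 likelihood in the first block, where the disease locus is unlinked (first theta = 0.500); this is the usual definition, but it is not stated in the output itself. The function name is hypothetical.

```python
# The two THETAS blocks from the FINAL.OUT excerpt above.
excerpt = """\
THETAS  0.500 0.100 0.100
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        1   -32.953612   -14.311541
-----------------------------------
TOTALS      -32.953612   -14.311541
-2 LN(LIKE) =   65.907224
-----------------------------------
-----------------------------------
THETAS  0.400 0.100 0.100
-----------------------------------
PEDIGREE |  LN LIKE  | LOG 10 LIKE
-----------------------------------
        1   -32.924425   -14.298866
-----------------------------------
TOTALS      -32.924425   -14.298866
-2 LN(LIKE) =   65.848850
"""


def log10_totals(text):
    """Return a list of (thetas, total log10 likelihood), one per block."""
    results = []
    thetas = None
    for line in text.splitlines():
        if line.startswith("THETAS"):
            thetas = tuple(float(x) for x in line.split()[1:])
        elif line.startswith("TOTALS"):
            # TOTALS line: label, ln likelihood, log10 likelihood
            results.append((thetas, float(line.split()[2])))
    return results


blocks = log10_totals(excerpt)
unlinked = blocks[0][1]  # first theta = 0.500 (assumed reference point)
for thetas, log10_like in blocks:
    print(thetas, round(log10_like - unlinked, 6))
```

Under this assumption the lod score at thetas 0.400 0.100 0.100 (map position 1) is -14.298866 - (-14.311541) = 0.012675.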

The output file, VPOUT.DAT, produced by the VaryPhen program has two parts. The first part displays all possible disease map positions with the locus order and corresponding interlocus thetas. The second part is a table, which consists of a list of individuals and the difference in lod score when their affection status is changed. The output file for the given example is as follows (detailed explanations are given below):

      POS        ORDER        THETAS
        0      1 2 3 4   0.500 0.100 0.100
        1      1 2 3 4   0.400 0.100 0.100
        2      1 2 3 4   0.300 0.100 0.100
        3      1 2 3 4   0.200 0.100 0.100
        4      1 2 3 4   0.100 0.100 0.100
        5      1 2 3 4   0.000 0.100 0.100
        6      2 1 3 4   0.000 0.100 0.100
        7      2 1 3 4   0.020 0.083 0.100
        8      2 1 3 4   0.040 0.065 0.100
        9      2 1 3 4   0.060 0.045 0.100
     . . . . .

USER CHOSEN POS IS 11

              PHENOTYPE
               CHANGED
  INDIVIDUAL  FROM  TO  |  Zuser   | pos  |   Zmax   | pos  |
=============================================================
 PHENOTYPE AS GIVEN     |   0.36493|  11  |   0.36493|  11  |
-------------------------------------------------------------
      1        2     1  |   0.36630|  11  |   0.36630|  11  |
-------------------------------------------------------------
      2        1     2  |   0.17058|  11  |   0.19328|   5  |
-------------------------------------------------------------
      3        2     1  |   0.35894|  11  |   0.35894|  11  |
-------------------------------------------------------------
      4        1     2  |   0.18061|  11  |   0.21104|   5  |
-------------------------------------------------------------
      5        2     1  |   0.05634|  11  |   0.12071|   5  |
-------------------------------------------------------------
      6        1     2  |  -1.42465|  11  |   0.00000|   0  |
-------------------------------------------------------------
      7        1     2  |  -1.42465|  11  |   0.00000|   0  |
-------------------------------------------------------------
      8        2     1  |   0.02299|  11  |   0.06077|   5  |
-------------------------------------------------------------
      9        1     2  |   0.40062|  11  |   0.40062|  11  |
-------------------------------------------------------------
      10       1     2  |   0.63163|  11  |   0.63163|  11  |
-------------------------------------------------------------
      11       1     2  |   0.36528|  11  |   0.36528|  11  |
-------------------------------------------------------------

The meaning of each column is as follows:

INDIVIDUAL
- individual number in the pedigree file [PEDIN.DAT]
PHENOTYPE CHANGED FROM TO
- gives change in affection status
Zuser
- indicates the lod score at the user chosen map position
Zmax
- is the maximum lod score attained
pos
- indicates the map position. There are two such columns. The first indicates the user-chosen map position, and the second indicates the map position at which the lod score is maximized.

The first line in the output table above presents results based on the given (unmodified) pedigree file, while the remaining lines show results when the i-th individual's affection status is changed.
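The thetas listed for positions 7 to 9 in the first part of the output file (order 2 1 3 4) appear consistent with the disease locus being moved through the interval between loci 2 and 3 while the recombination fraction between those two loci stays at 0.100, assuming no interference (the datafile above specifies interference 0). The sketch below checks this reading; it is an inference from the printed numbers, not a description of the program's code, and the function name is hypothetical.

```python
def right_theta(theta_flank, theta_left):
    """Recombination fraction between the disease locus and the right marker.

    Assumes no interference, so that
        1 - 2*theta_flank = (1 - 2*theta_left) * (1 - 2*theta_right),
    where theta_flank is the recombination fraction between the two
    flanking markers and theta_left that between the left marker and
    the disease locus.
    """
    return 0.5 * (1.0 - (1.0 - 2.0 * theta_flank) / (1.0 - 2.0 * theta_left))


# Positions 7, 8 and 9 of the example: theta(2,1) = 0.020, 0.040, 0.060
for theta_left in (0.020, 0.040, 0.060):
    print(theta_left, round(right_theta(0.100, theta_left), 3))
```

Rounded to three decimals, this gives 0.083, 0.065 and 0.045, the values shown in the THETAS column for those positions.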

Summary of results for example data: The overall multipoint lod score for disease (locus 1) versus map of loci (loci 2, 3, 4) is 0.36, obtained at map position 11.

If the phenotype of individual 1 is switched from 2 (affected) to 1 (unaffected), the maximum lod score is 0.37, obtained at the same map position as before. Hence, individual 1 is not very informative for linkage in this family.

If the phenotype of individual 2 is switched from 1 to 2, the maximum lod score drops from 0.36 to 0.19, obtained at map position 5.

Comparing these changes in maximum lod score for all individuals will show who in the family is most crucial for the analysis.
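One possible way to make this comparison is to rank the individuals by the absolute change in lod score at the user-chosen position. The Python sketch below does this with the Zuser values from the example table; the ranking criterion and the variable names are our illustration and are not produced by the VaryPhen program itself.

```python
# Zuser values copied from the example VPOUT.DAT table above.
z_given = 0.36493
z_user = {
    1: 0.36630, 2: 0.17058, 3: 0.35894, 4: 0.18061,
    5: 0.05634, 6: -1.42465, 7: -1.42465, 8: 0.02299,
    9: 0.40062, 10: 0.63163, 11: 0.36528,
}

# Change in lod score relative to the phenotypes as given.
changes = {ind: z - z_given for ind, z in z_user.items()}

# Largest absolute change first (most critical individuals at the top).
ranked = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)

for individual, delta in ranked:
    print(f"{individual:3d}  {delta:+.5f}")
```

By this criterion individuals 6 and 7 are the most critical (a change of -1.78958 each), while individuals 11 and 1 have almost no influence on the lod score at position 11.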

V. References

Lathrop GM, Lalouel JM, Julier C, Ott J (1984) Strategies for multilocus analysis in humans. Proc Natl Acad Sci USA 81, 3443-3446

Ott J (1990) Genetic linkage analysis under uncertain disease definition. In Banbury Report 33: Genetics and Biology of Alcoholism, edited by C.R. Cloninger and H. Begleiter. Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press, pp. 327-331

Xie X, Ott J (1990) Determining the effect of a change in affection status on the lod score. Am J Hum Genet 47, A205


converted to html by wentian li, august 8, 1996