This file supplements the information provided in the program documentation (USER file) and the documentation for the Linkage Support Programs.
A. INSTALLATION, 32 bit and 64 bit Windows
B. BRIEF OVERVIEW OF PROGRAM USAGE
E. The LINKLODS program
F. ANALYSIS HINTS
To work with the LINKAGE programs, it is best to reserve a specific directory on your hard disk, for example, C:\LINKAGE. You may want to put all program files into this directory and any data to be analyzed in another directory, for example, C:\LINKAGE\DATA. Transfer all files received to the LINKAGE directory. The program package for Windows users contains sample data from our Handbook (Terwilliger and Ott 1994). However, it is recommended that you use the FASTLINK version of the LINKAGE programs (under Unix or Linux) because it contains fewer errors than the LINKAGE programs and executes faster (see the readme file). You may want to run the programs the way they come. To use them with different program constants (e.g., a different maximum number of pedigrees), make the necessary changes (see section C.2). For details on compiling, see section C.3.
IMPORTANT: You cannot simply click on program names. The programs must be run in a command ("DOS") window: click on Start, then Run, and type cmd. This opens a command box. You may then modify its properties (font size, etc.). You need to know basic Windows commands to work with a command box. These commands are very similar to the ones in Unix. It is convenient to create a shortcut to the file cmd.exe on your desktop; the cmd.exe file resides in the c:\windows\system32 folder.
Most of the Linkage Support programs (LCP, LSP, makeped1) are 16 bit programs that do not run on 64 bit Windows PCs. To my knowledge, no 32 bit or 64 bit versions of these programs are available (although Dr. David Curtis has a 64 bit version of makeped1), with the exception of LCPwin. This program requires the presence of the cygwin1.dll file and uses slightly different commands than the standard LCP; watch the control commands indicated at the bottom of the screen. Here are some ways around this problem.
a) In 64 bit Windows 7 Professional, a Windows XP emulation program is freely available from Microsoft. Installing it is a little involved but once installed it will run 16 bit programs.
b) A program called dosbox is freely available. It is a 32 bit program (thus, runs on 64 bit machines) and emulates a rudimentary DOS box. The 16 bit Linkage Support programs run fine in dosbox under 64 bit Windows. However, dosbox does not seem to support all commands generated by LCP and implemented in the resulting pedin.bat file (it freezes when pedin.bat is invoked). So, the recommendation is to use LCP but, rather than running the resulting pedin.bat file as is, delete from this file all lines except the ones required for LSP, then run this rudimentary pedin.bat file. After that, run the Linkage programs. As an example, I had a small pedigree file called cousin.pre with an accompanying datafile, cousin.dat. The makeped program generated cousin.ped and LCP generated a pedin.bat file. This file was then trimmed to contain only the following lines.
COPY cousin.ped cousin.TPD
COPY cousin.DAT cousin.TDT
LSP mlink cousin.TPD cousin.TDT 2 1 2 0 0.001 1 0.1 0.45 0
After executing this file, the unknown and mlink programs were run in a regular Windows command window.
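The trimming step described above can also be done with a small script instead of by hand. The following is only a sketch in Python (not part of the LINKAGE package). It assumes that the lines required for LSP are exactly the COPY and LSP lines, as in the cousin example; the other lines in the sample input are placeholders and are not actual LCP output.

```python
def trim_pedin(lines):
    """Keep only the COPY and LSP command lines (case-insensitive)."""
    keep = ("COPY ", "LSP ")
    return [ln for ln in lines if ln.strip().upper().startswith(keep)]


# Sample input: the COPY and LSP lines are from the cousin example;
# the remaining lines are hypothetical placeholders.
example = [
    "REM placeholder line 1",
    "COPY cousin.ped cousin.TPD",
    "COPY cousin.DAT cousin.TDT",
    "LSP mlink cousin.TPD cousin.TDT 2 1 2 0 0.001 1 0.1 0.45 0",
    "REM placeholder line 2",
]

trimmed = trim_pedin(example)
print("\n".join(trimmed))
```

If your pedin.bat needs other lines for LSP, extend the keep tuple accordingly.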
For LCP and other programs to display correctly, the command window must recognize the ansi.sys driver. Depending on your Windows version, this may be accomplished in different ways as outlined below.
1) Install the Real DOS-Mode Patch for Windows ME from http://www.geocities.com/mfd4life_2000/
2) Use PC Magazine's "ANSI.COM" by Michael Mefford instead of ANSI.SYS, and simply load it from the command line before running your old program.
ftp://garbo.uwasa.fi/pc/pcmagutl/ansi132.zip
or
http://www.simtel.net/pub/simtelnet/msdos/pcmag/v8n02.zip
3) Right click on your DOS program's shortcut. Select properties, program, advanced. From here you can create a custom config.sys and autoexec.bat file for your DOS program to use when run. Be sure you direct the config.sys file to the correct path for ansi.sys. Unless you specify a separate config.sys or autoexec.bat file for your DOS program to use it will use the default config.nt and autoexec.nt.
Option 2 appears to be the simplest and has been reported to work fine.
The procedure is the same as described above for Windows XP. However, you need Administrator privileges to change the config.nt file. The simplest approach is as follows. If you have a shortcut to cmd.exe on your desktop, right-click on that shortcut and select Run as administrator. Otherwise click on Start, start typing comm in the Search box at the bottom left of the screen, and right-click on Command Prompt. Then select Run as administrator. The Administrator:cmd box will take you directly to the c:\windows\system32 folder (if not, you need to cd to that folder). Type notepad config.nt, scroll towards the bottom of the file, copy the line device=%SystemRoot%\system32\himem.sys, and in the copied line replace himem by ansi. The modified config.nt file should then contain the following two lines:
device=%SystemRoot%\system32\himem.sys
device=%SystemRoot%\system32\ansi.sys
Save the config.nt file.
Make sure that the LINKAGE directory is accessed by the system by setting the appropriate path (not needed if you are working in the LINKAGE directory). The easiest approach is as follows and works with all Windows versions. In the drive where you generally work, for example, D:, create a directory (folder), for example, D:\bin, and put all programs and batch files into this folder. Using Notepad, create a file containing the following lines:
set dircmd=/p/o
set path=D:\bin;%PATH%
Save it in the bin folder under the name setbin.bat but as "All Files", not as a "Text File". Assume that you want to put the LINKAGE programs into a folder C:\LI. Create that folder and also prepare a file in Notepad containing the following lines:
@echo off
echo *** Setting path to include LI directory
set PATH=c:\LI;%PATH%
Save this file in the bin folder under the name setli.bat ("All Files"). Save all your Linkage executables in the LI folder. Whenever you want to work with the Linkage programs, open a command box (CMD) and type setbin followed on a new line by setli. The bin and LI folders are now accessed by the system as long as you keep the CMD window open. To make permanent changes to the path proceed as outlined below.
In the Control Panel, click or double click on System / Advanced / Environment Variables. Find the Path variable and edit it so it contains c:\Linkage.
Currently the LINKAGE programs for PCs are furnished for general and 3-generation (CEPH) pedigrees. A third category, programs for experimental crosses in the mouse, is not currently supported by me. This documentation is generally oriented towards the general pedigree version; differences between the two versions are pointed out where necessary. A detailed user manual is available as a pdf file.

The LINKAGE programs require two input files, a "pedfile" holding the pedigree data, and a "datafile" holding the descriptions of the loci, locus order, etc. (pedfile and datafile are the names of the corresponding files in the program code). Preferably, the first step in the linkage analysis is to create the pedigree file. This must be done using a text editor (word processor) capable of producing ASCII files (see section F.7).
Write one line of input for each individual, where the following items must be given for each individual (more detailed information is found in the program manual):
Phenotype symbols depend on the locus type used. Each locus must be coded in one of four possible locus type formats (only Allele Numbers and Binary Factors locus types may be used in the programs for 3-generation pedigrees, and they must specify codominant inheritance). The locus types and corresponding phenotypes are as follows:
b) Allele numbers: two numbers, corresponding to the two alleles present, eg, 2 5 (alleles 2 and 5 present), or 1 2. Also, 0 0 denotes unknown. Homozygotes and hemizygotes (males in X-linked case) must be given two identical numbers. Usually used for RFLPs (co-dominant).
c) Binary factors: a sequence of 0’s and 1’s indicating absence or presence of the i-th factor. Used for dominant marker loci, eg, ABO locus.
d) Quantitative traits: quantitative measurement, eg, CPK level.
One pedigree may be entered after another, each pedigree with its own pedigree id. After the last line is entered, make sure that there are no trailing blank (empty) lines after you exit from the editor. The DOS and other editors append an empty line when you press the <Enter> key at the end of the last input line. So, either you do not press <Enter> at the end of the last line (the cursor then stays at the far right ON the last line), or you insert an end-of-file [EOF] character in column 1 after the last input line. To enter [EOF], press Alt-2-6 (press 2 and then 6 on the NUMERIC KEYPAD while holding down the Alt key); you should then see a small right arrow on the screen.
Save the file under a name with the extension PRE, eg, as SAMPLE.PRE. It is convenient to use the same file name for the input files of a given problem but distinguish datafile and pedfile by using different extensions.
The sample pedigree file so created, SAMPLE.PRE, must now be processed by the MAKEPED program to make it suitable for input to the analysis programs. Invoke the MAKEPED program (actually, the MAKEPED.BAT file) with the input and output file names on the command line, for example, enter
MAKEPED SAMPLE.PRE SAMPLE.PED N
(upper or lower case) where the last N tells the program that no loops are present and that probands should be selected automatically. If N is omitted, follow directions issued by program. Recommended further responses:
Loops present? → n (unless your pedigree contains loops)
Should probands be selected automatically? → y.
If a pedigree contains a marriage or consanguinity loop, answer Y to the corresponding question from the MAKEPED program and indicate one individual per pedigree at which the loop should be broken. If more than one loop is present in any one pedigree (the maximum number of loops is specified by the constant MAXLOOP), proceed as above and identify as many individuals in each pedigree as necessary at which loops should be broken. For example, if in pedigree 1, loops should be broken at individuals 5 and 9, your interaction with the MAKEPED program would look as follows:
Pedigree → 1
Person → 5
Pedigree → 1
Person → 9
Pedigree → 0
MAKEPED will then duplicate each of these individuals and will assign the same positive number (different for each pair) in the proband field (column) to the resulting two duplicated individuals. After exiting from MAKEPED, read the pedigree file into your text editor and verify that MAKEPED has made the appropriate duplications and entries in the proband field. If a duplicate individual is to be the proband, this individual must correspond to the first loop to be broken, and the proband field for the two duplicates has to contain a 1 and a 2 (this rule also applies to a single loop only).
Note that for a pedigree file to be suitable for use by the analysis programs, each individual within a pedigree must be numbered sequentially from 1 through n, except for duplicate individuals (loops broken) who can be out of order, where n is the total number of individuals (including duplicated individuals) in that pedigree. Pedigree id’s, too, must be numbers, but they need not be sequential and can be in any order. It is the MAKEPED program’s job to bring pedigrees into this form required by the LINKAGE programs.
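The numbering rule above can be verified with a short script before running the analysis programs. This is a minimal sketch in Python, not part of the LINKAGE package. It assumes that the first two whitespace-separated fields on each line are the pedigree id and the individual id; the sample lines are made up for illustration and contain only a few of the fields of a real pedigree file.

```python
from collections import defaultdict


def check_numbering(lines):
    """Return {pedigree_id: message} for pedigrees whose ids are not exactly 1..n."""
    ids = defaultdict(list)
    for ln in lines:
        fields = ln.split()
        if len(fields) < 2:
            continue  # skip empty or incomplete lines
        ids[fields[0]].append(int(fields[1]))
    problems = {}
    for ped, members in ids.items():
        n = len(members)
        # Order within the file does not matter, only the set of ids.
        if sorted(members) != list(range(1, n + 1)):
            problems[ped] = f"expected ids 1..{n}, found {sorted(members)}"
    return problems


# Hypothetical sample: pedigree 7 is fine, pedigree 3 skips id 2.
sample = [
    "7 1 0 0",
    "7 2 0 0",
    "7 3 1 2",
    "3 1 0 0",
    "3 3 0 0",
]
print(check_numbering(sample))
```

Pedigree ids are kept as text here, since they need not be sequential.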
Two example input files (already processed by the MAKEPED program) are provided. PEDIN.DAT contains three-generation pedigrees and one non-CEPH pedigree; PEDIN3.DAT contains only two-generation and three-generation pedigrees and is suitable for testing out the 3-generation programs.
As pointed out above, it is recommended to use the same file names for the same problem but distinguish the associated datafile and pedigree files with the extensions DAT, PRE, and PED, respectively, where PRE refers to the preliminary pedigree file and PED to the one processed by the MAKEPED program. For example, in a study of CF families, the three files would be named CF.DAT, CF.PRE, and CF.PED. For families without loops and automatic proband designation, a third parameter, n, may be given on the command line which tells MAKEPED that no loops are present and that all probands should be chosen automatically. Thus, you might enter: MAKEPED SAMPLE.PRE SAMPLE.PED N.
When loops exist in a pedigree and are not declared in MAKEPED, this error may or may not provoke error messages by the analysis programs. Thus, an undetected loop may lead to an apparently normal termination of the programs yet the resulting likelihoods can be completely wrong. To avoid such problems, a program called LOOP was developed by Xiaoli Xie. It detects marriage and consanguinity loops and is automatically invoked after each run of the MAKEPED program.
The datafile should reflect the loci given for each individual, where the loci are ordered corresponding to the order of the phenotypes in the pedigree file. The datafile is best created using the PREPLINK program. After PREPLINK is invoked, it will present various menus with default assumptions on number of loci, locus types, etc. Proceed in the following manner:
(2) Select locus types. It is important to do this first, before any more specific locus descriptions are given. Changing a locus type will set most other locus parameters back to their default value.
(3) For each locus, look at its parameters ("see or modify a locus") and adjust where needed. For example, for a disease locus, you may want to adjust gene frequencies to 0.99 and 0.01 so that the disease allele is allele number 2. Generally, choose allele 2 as the disease allele.
(4) If everything is correct, go to the main menu and save the file ("write datafile"), preferably with the extension DAT, for example, under the name of SAMPLE.DAT, corresponding to SAMPLE.PED. Exit from PREPLINK. Should you need to modify a previously created datafile, simply invoke PREPLINK and read in that datafile.
If parameters other than recombination fractions are to be estimated in the ILINK program, you will need to modify the datafile in your text editor after leaving the PREPLINK program. The last line of the datafile (for an ILINK run) contains a series of 1’s and 0’s indicating whether or not a particular parameter should be estimated, that parameter being defined by the order of appearance of the number 1 or 0 (see manual for full details). For example, with 2 loci, if male recombination and female-to-male map distance are to be estimated, there should be two 1’s on the last line of the datafile.
On the second-to-last line, the number given identifies the locus which may have iterated parameters such as gene frequencies. In this case (only recombination fractions estimated), the value of that number is irrelevant as no locus-specific parameters are estimated. Hint: If no locus-specific parameters are to be estimated, choose a "locus with iterated parameters" with only a small number of penetrance classes or none, since these penetrances may otherwise potentially be estimated, which calls for a large value of the constant MAXN.
Two sample datafiles are provided: DATAIN.DAT may be used in connection with PEDIN.DAT, and DATAIN3.DAT corresponds to PEDIN3.DAT (3-generation families).
The LCP program prepares the data for a series of production runs. You will be able to make various choices, eg, loci to be used, and to set parameter values such as recombination fractions. All these choices will be saved in a batch file (command file) that you can run by typing its name after exiting from the LCP program. The default name of that command file is PEDIN.BAT, so you will have to type PEDIN in your command box to run this file. To start up LCP, simply type LCP in your command box. Alternatively, you may type LCPWIN, which invokes a differently compiled version of this program. However, in LCPWIN you must use the program-specific keys for text manipulation (see explanations on the screen).
After you have invoked LCP, change the file names presented on the first screen as needed. Usually, you will only have to adjust the names of your pedigree file and datafile (parameter file). When you have chosen these file names, move back and forth among the screens with the PgDn and PgUp keys. However, watch for the screen identified by the title, COMMAND SCREEN, shown in reverse video. Pressing PgDn on such a command screen will save in the batch file the choices you just made; if you do not press PgDn on a command screen, these choices will not be saved. Leave the LCP program by pressing Ctrl-Z.
To execute the runs you selected in LCP, enter the name of the batch file (PEDIN by default). If nothing happens, you failed to press the PgDn key on the Command screen in which case you have to invoke LCP again and repeat the selections desired.
Note the following feature of LCP: When choosing ILINK as the analysis program, generally all recombination fractions between loci will be estimated. If you want to keep some of them fixed at their initial value, enter the recombination fraction with an equal sign in front of it.
You may inspect the PEDIN.BAT file with your text editor. It consists of a sequence of commands (DOS commands and calls to programs). Essentially, it extracts loci information from your input files and prepares new input files (called datafile.dat and pedfile.dat) for the Unknown program and then invokes the analysis program. After the runs are completed, all intermediate files are deleted. If you do not want intermediate files deleted, you have to invoke the command file with the command line parameter NODELETE, eg, by entering PEDIN NODELETE. One reason for doing that would be, for example, to retain the files (DATAFILE.DAT, PEDFILE.DAT, IPEDFILE.DAT, SPEEDFIL.DAT) containing the loci extracted from the original files and to modify DATAFILE.DAT so that parameters other than recombination fractions can be estimated by ILINK; currently, this cannot be done through LCP. After the runs invoked by PEDIN have completed you may see a message at the end saying the "speedfile.dat" file was not found. Just disregard this message.
LCP cannot exploit all the features of the analysis programs (MLINK, LINKMAP, ILINK). For example, a female/male distance ratio different from 1 is not allowed for MLINK although the MLINK program when used directly will accept any such ratio, and haplotype frequencies cannot presently be specified through LCP.
The programs may not carry out calculations when only a single locus is used. For such cases, expand the data by adding a dummy marker locus at which everybody is homozygous. Also, if a single individual should be part of your pedigree data, add two parents with unknown phenotypes and have these three individuals form one pedigree.
Most of the discussion below refers to Free Pascal. Differences to other Pascal versions are noted where necessary.
A number of constants may be set by the user prior to recompiling the programs. These constants define upper limits for number of loci, number of alleles per locus, etc. They reside in the CONST section of the main programs, for example, MLINK.PAS. Change the appropriate number; for example, change MAXLIAB = 20 to MAXLIAB = 30 if this is what you need. Then recompile the programs (see below).
The programs discussed here have been compiled with Free Pascal, which is compatible with Borland/Turbo Pascal 7.0 but is much less restricted than Borland Pascal. Large programs may be compiled with Free Pascal.
In Free Pascal, the ERRTRAP procedure reports errors in plain English rather than only providing error numbers (exception: stack overflow; see below). Some of the less than obvious error messages are explained below.
Range check error. One of the constants is too small for the problem to be analyzed. Check each of these constants. For example, the number of haplotypes, h, may have to be as large as the product of the number of alleles for all loci. This error message may occasionally be quite cryptic and it may be difficult to determine which of the constants must be increased. For example, in ILINK, having a large number of penetrance classes requires a high value of MAXN, the max. number of parameters that can be estimated in ILINK, since penetrances may potentially be estimated in ILINK (if at the end of the datafile, the locus with iterated parameters is the one for which penetrance classes are defined).
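To judge how large the haplotype constant may have to be, the product mentioned above is easily computed. The sketch below is in Python and is only an illustration; the allele counts are made up.

```python
import math


def haplotypes_needed(alleles_per_locus):
    """Upper bound on the number of haplotypes: product of allele counts."""
    return math.prod(alleles_per_locus)


# Hypothetical example: three loci with 2, 4, and 6 alleles.
print(haplotypes_needed([2, 4, 6]))  # 2 * 4 * 6 = 48
```

The product grows quickly with the number of loci, which is one reason why this constant is often the one that needs to be increased.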
Stack overflow (error number 202). The program ran out of stack space. This may occur when the stack segment is too small to hold all local variables in which case one must increase the stack size in the M compiler switch (the first of the three numbers in curly brackets) at the beginning of the main program. However, the stack segment is usually large enough and the most common reason for the occurrence of this error is the presence of an undeclared loop in a pedigree.
Heap overflow. There is not enough free (dynamically allocated) memory to hold all the data. This error should only occur when you compile for DOS real mode. A program running in DOS protected mode or under Windows can address up to 16MB of memory. To reduce memory requirements the following actions may be taken:
1) Reduce program constants to their smallest possible values.
2) Analyze only one pedigree at a time and set the max. number of pedigrees to 1.
3) Set the compiler switch to R-. Note that this may freeze the computer when an array bound is exceeded.
Data segment too large. The variables and arrays occupy too much memory. Reduce some of the program constants to make array sizes smaller, or go from double to single precision. It may happen that for the same programs this error occurs when compiling for Windows but not for DOS.
This batch file allows running any one of the Linkage programs without going through the LCP shell provided that all the loci in the data file are to be analyzed (no possibility of extracting loci). To initiate this batch file, execute the command
RUN DATNAME PEDNAME PROGNAME
where DATNAME is the name of the file holding the locus descriptions (the datafile, as processed by the PREPLINK program), PEDNAME refers to the file holding the pedigree data (as processed by the MAKEPED program), and PROGNAME is the name of the program to be used.
The major reason for using the RUN batch file is to be able to make use of some features not implemented in LCP (see end of section on LCP, above), in particular, haplotype frequencies which may be important in risk calculation.
Lathrop GM, Lalouel JM, Julier C, Ott J: Strategies for multilocus linkage analysis in humans. Proc Natl Acad Sci USA 81:3443-3446, 1984
Lathrop GM, Lalouel JM, Julier C, Ott J: Multilocus linkage analysis in humans: detection of linkage and estimation of recombination. Am J Hum Genet 37:482-498, 1985
Ott J: Analysis of Human Genetic Linkage, 3rd edition. Johns Hopkins University Press, Baltimore, 1999
Terwilliger JD, Ott J: Handbook of Human Genetic Linkage. Johns Hopkins University Press, Baltimore, 1994
The LINKLODS program reads output (FINAL.OUT file) from the LINKMAP or MLINK (LINKAGE) program and, for each family, converts log likelihoods to lod scores. Lod scores may also be obtained as an option from the LRP program but using the LINKLODS program may be more straightforward.
In the input file (FINAL.OUT) to LINKLODS, for a collection of families, an initial set of likelihoods with one of the theta values being equal to 0.5 must precede those sets of likelihoods for which lod scores should be calculated. Several such initial ‘baseline’ sets of likelihoods may occur throughout the input file.
Resulting lod scores will be written to the file FINAL.LOD, and an existing file by that name will be overwritten.
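The conversion itself is simple arithmetic: the lod score at a given theta is the log10 likelihood at that theta minus the log10 likelihood at theta = 0.5. The following Python sketch illustrates this for a single family. It is not the LINKLODS code and does not read FINAL.OUT; it assumes that the likelihoods are given as natural logarithms, and the numbers are made up.

```python
import math


def lod_scores(loglik_by_theta, baseline_theta=0.5):
    """Convert natural-log likelihoods to lod scores relative to the baseline theta."""
    base = loglik_by_theta[baseline_theta]
    # Dividing by ln(10) converts a natural-log difference to log10.
    return {t: (ll - base) / math.log(10) for t, ll in loglik_by_theta.items()}


# Hypothetical natural-log likelihoods for one family.
example = {0.5: -100.0, 0.1: -100.0 + 2 * math.log(10)}
lods = lod_scores(example)
print(lods)
```

If your likelihoods are reported on a different scale (for example, as -2 ln likelihood), convert them to ln likelihood first.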
Notice that the LINKLODS program makes certain rigid assumptions about the structure of the input file as produced by the LINKAGE programs. For example, the first likelihood must be on the fourth line after the line that lists the theta values. Therefore, if the input file has been manipulated, the LINKLODS program may no longer be able to process it properly and will issue error messages.
In the LIPED program, each phenotype is associated with an array of penetrances, that is, the conditional probabilities that the phenotype is observed given a genotype. In the Linkage programs, one may code phenotypes in several ways, depending on the type of locus considered (binary factors, affection status or quantitative phenotypes locus). With a binary factors locus, one may code for codominant or dominant phenotypes but not both types mixed. This sometimes poses a problem, for example, in the following situation. Assume a locus with two alleles, A and B, whose individual presence in a person can usually be detected (codominant situation). Sometimes, however, a test is used that detects A only (dominant situation). Using conditional probabilities (penetrances), this situation is represented in LIPED as follows:
[Table not recoverable from the source: LIPED penetrances, one column per phenotype.]
In the Linkage programs, it is not possible to allow for all these phenotypes at a binary factor locus. A simple general solution for using tables such as the one above in the Linkage programs is as follows. Define the locus in question as an affection status locus with as many liability classes as there are columns in the table above. In the pedigree file of the Linkage programs, each phenotype is then represented by two numbers, 2 i, where i is the column number in the table above, that is, each individual is defined to be affected, except that the unknown phenotype is coded as 0 1. Each column in that table represents a liability class whose penetrances (the entries in the column) must be furnished in the datafile.
This coding scheme may be wasteful in the number of liability classes needed. Depending on the particular situation, one may be able to apply a similar coding scheme requiring a smaller number of penetrance classes. In the given example, above, a possible solution is the following. Define an individual as affected when the A allele is detected, and distinguish 3 liability classes, depending on whether the A allele is seen in the dominant or codominant situation. The correspondence between phenotypes in Liped and in Linkage is then as follows:
[Table not recoverable from the source: phenotypes in LIPED and the corresponding codes in LINKAGE.]
In the datafile, the following penetrances must be given for each genotype and each liability class, 1-3:
[Table not recoverable from the source: penetrances by genotype for liability classes 1-3.]
Generally, in the LINKAGE programs, one would code phenotypes (CK levels for females, aff. or unaff. for males) as a quantitative trait locus. A special case in which simple coding as an affection status locus is possible is the following.
Males: affected or not affected
Females (not affected): CK+ or CK- (CK = creatine kinase level, high or low)
Alleles: D = disease allele, d = normal allele
Possible coding scheme for LIPED:
----------------------------------------------
                      Phenotypes
               ------------------------------
                   females          males
               ---------------   -----------
Genotype       AF    CK+   CK-   AFF    NA    <- phenotype codes
----------------------------------------------
D/D or D/y      1     0     0     1     0
D/d             0    .66   .34    *     *
d/d or d/y      0    .05   .95    0     1
----------------------------------------------
Unknown: special phenotype, eg, blank.
AF = affected female
* = value irrelevant (X-linked case)
In LINKAGE, this case may be treated by the general method outlined in section 1, above, leading to 3 penetrance classes. To code for such a situation with a single liability class, one may adopt the following coding scheme in the LINKAGE programs:
Define disease status as having an elevated CK value. This works fine when only unaffected females are observed (usual situation).
----------------------------------------
Phenotypes in
LIPED   MLINK (in pedfile)
----------------------------------------
CK+     2     \ unaffected
CK-     1     / females
AFF     2     affected male
NA      1     unaffected male
----------------------------------------
Unknown phenotype: 0
In the datafile, the penetrances (= probabilities of being affected) are given as follows:
      Females                  Males
Genotype  Penetrance     Allele  Penetrance
-------------------      -----------------
1 / 1     1              1       1
1 / 2     .66            2       0
2 / 2     .05
-------------------      -----------------
Note that "affected" (CK+) females potentially are homozygous for the disease allele (CK- females still cannot be homozygous for the disease allele). If this is undesired, or if truly affected females are present, it is better to use the scheme with 3 penetrance classes corresponding to the LIPED notation.
To identify an obligate heterozygote in LIPED, one might label such an individual with the phenotype NA2 and define the following penetrances:
            Phenotypes
Genotype   AFF   NA1   NA2   <- phenotype codes
---------------------------
D / D       1     0     0
D / d       0     1     1
d / d       0     1     0
---------------------------
Again, this case may be treated as outlined in section 1, above. Using only 2 rather than 3 liability classes, one may define these in the datafile as follows:
           Penetrance class
Genotype      1     2
-------------------------
D / D         1     1
D / d         0     0
d / d         0     1
-------------------------
In the pedfile, the following phenotype codes are used:
Phenotypes in
LIPED MLINK
-------------
AFF 2 1
NA1 1 1
NA2 1 2
-------------
In multipoint linkage analysis, for a given family pedigree, it sometimes happens that none of the individuals has been tested at one of the loci and thus all have phenotype ‘unknown’ at that locus. In the present implementation of the Linkage programs, the presence of many unknowns slows down execution speed. There is, however, a simple remedy. If everybody in that pedigree is given the same homozygous phenotype (uniquely identifying the homozygous genotype), this will not change the lod score but will considerably increase computing speed. This feature has now been implemented in the UNKNOWN program except when allele frequencies are to be estimated.
With new data and several marker loci, it is often useful to first find or confirm estimates of interlocus distances, that is, to run the ILINK program for the marker loci only. However, before doing that, it is a good idea to do one run with the MLINK program to verify that the likelihood is nonzero in all pedigrees. If the likelihood is zero in one or more pedigrees, for example, due to genotype inconsistencies, then the ILINK program will still try to maximize the likelihood and will, of course, fail but only after running for a possibly very long time.
With X-linked recessive deleterious traits, for a female founder individual (no parents in pedigree), the prior probability, q, of being a carrier of the disease gene is a multiple of the mutation rate, μ. For example, in Duchenne muscular dystrophy (DMD), q = 4μ (Murphy and Chase, "Principles of Genetic Counseling"). In the likelihood calculation of pedigree data, on the other hand, the prior probability of a founder’s genotype is always determined by the gene frequency, p. The prior probability that a founder woman is heterozygous is given by 2p(1 – p). To implement the prior probability, q, that she is heterozygous for an X-linked recessive deleterious gene, in the likelihood calculation, one must choose the gene frequency of the deleterious gene, p, such that q = 2p(1 – p) or, approximately, p = q/2. For example, in DMD, when the mutation rate is assumed to be equal to μ, the gene frequency of the disease allele must be taken to be equal to p = 2μ.
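The relation q = 2p(1 – p) can be solved exactly for p, which shows how close the approximation p = q/2 is. The Python sketch below takes the smaller root of the quadratic, p = (1 – sqrt(1 – 2q))/2. The mutation rate used is a made-up value chosen for illustration only.

```python
import math


def gene_frequency(q):
    """Smaller root of 2p(1 - p) = q; requires 0 <= q <= 0.5."""
    if not 0.0 <= q <= 0.5:
        raise ValueError("q must lie between 0 and 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * q)) / 2.0


mu = 1e-4            # hypothetical mutation rate
q = 4 * mu           # prior carrier probability, as in the DMD example
p = gene_frequency(q)
print(p, q / 2)      # exact value and the approximation p = q/2
```

For small q the two values differ by about q squared over 4, so the approximation p = 2μ is adequate in practice.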
Input files to the LINKAGE programs must be created in ASCII format (text files). Word processors such as WordPerfect or Word write files in their own format but are capable of producing text files when specifically instructed to do so. In Windows, a convenient text editor is NOTEPAD. Also, the freely available Crimson Editor is highly recommended, particularly because it displays line numbers. In Unix, the joe editor or pico is recommended unless you are familiar with one of the standard Unix editors.
In some of the input files to the LINKAGE programs, it is important that no empty (blank) lines follow the last input line. To avoid such trailing blank lines, press Ctrl-End, which will position the cursor at the end of the file. If this position is not in column 1 of the line immediately following the last input line, press the Backspace key repeatedly until the cursor is all the way to the left on the last input line.
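Trailing blank lines can also be removed with a few lines of code rather than in the editor. The following Python sketch works on the text of a file; the function name is my own, and the result deliberately ends without a final line break, which corresponds to leaving the cursor ON the last input line.

```python
def strip_trailing_blank_lines(text):
    """Remove empty or whitespace-only lines at the end of the text."""
    lines = text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    # No line break is added after the last input line.
    return "\n".join(lines)


# Hypothetical file content with two trailing blank lines.
cleaned = strip_trailing_blank_lines("1 1 0 0\n1 2 0 0\n\n   \n")
print(repr(cleaned))
```

Blank lines in the middle of the text are left untouched; only those after the last input line are removed.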