crimap documentation (version 2.4)
4.3 .dat file
This file is created by the program (using the prepare option), not by
the user. It contains two data structures, each consisting of the following
data items:
- {# of loci} {# of families} {family id1} {family id2} . . . (may be any
  character strings not containing blanks) {locus name1} {locus name2}
  . . . (each locus name consists of at most 15 characters, with no
  embedded blanks)
- For each family:
  - {# of chromosomes}
  - {phase chromosomes: each is a character string of length numloci
    with values 0, 1, or X, denoting the phase}
  - {# of switches}
  - For each switch:
    - {index of locus affected by the switch (in this file,
      locus indices start at 1, not 0)} {string of
      length numchroms, with entry 1 if the corresponding
      chromosome is affected by the switch, 0 otherwise}
The first such data structure holds the "phase known" data: there is
1 "family" containing all the chromosomes in the data set, and 0 switches.
All loci of unknown phase are given phase X, as are loci on the chromosomes
of children of identical heterozygotes (the latter restriction
is necessary to avoid bias in the estimation of recombination
fractions; cf. [Ott (1985)]). The second data structure
contains the full phase information for the data set, arranged by
families. (For definitions of switches and phase chromosomes,
see [Green (1988)].) The "phase known" data structure is used
only by the option build.
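
As an illustration of the layout described above, the following Python sketch reads the two data structures from the text of a .dat file. It is not part of crimap. It assumes that all data items are separated by whitespace (blanks or newlines), which the description above suggests but does not state, and that every family has at least one chromosome. The names read_structure, read_dat, Family, DatStructure and SAMPLE, as well as the sample data (including the family id "all"), are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class Family:
    family_id: str
    chromosomes: List[str]           # one string per chromosome, one 0/1/X per locus
    switches: List[Tuple[int, str]]  # (1-based locus index, 0/1 mask over chromosomes)


@dataclass
class DatStructure:
    locus_names: List[str]
    families: List[Family]


def read_structure(tokens: Iterator[str]) -> DatStructure:
    """Read one data structure from a stream of whitespace-separated items."""
    num_loci = int(next(tokens))
    num_families = int(next(tokens))
    family_ids = [next(tokens) for _ in range(num_families)]
    locus_names = [next(tokens) for _ in range(num_loci)]

    families = []
    for family_id in family_ids:
        num_chroms = int(next(tokens))
        chromosomes = [next(tokens) for _ in range(num_chroms)]
        for chrom in chromosomes:
            # Each phase chromosome has length numloci and uses only 0, 1, X.
            if len(chrom) != num_loci or set(chrom) - set("01X"):
                raise ValueError(f"bad phase chromosome: {chrom!r}")

        num_switches = int(next(tokens))
        switches = []
        for _ in range(num_switches):
            locus_index = int(next(tokens))  # starts at 1 in this file, not 0
            mask = next(tokens)              # one 0/1 entry per chromosome
            if not 1 <= locus_index <= num_loci or len(mask) != num_chroms:
                raise ValueError(f"bad switch: {locus_index} {mask!r}")
            switches.append((locus_index, mask))

        families.append(Family(family_id, chromosomes, switches))
    return DatStructure(locus_names, families)


def read_dat(text: str) -> Tuple[DatStructure, DatStructure]:
    """Return (phase-known structure, full-phase structure)."""
    tokens = iter(text.split())
    phase_known = read_structure(tokens)
    full_phase = read_structure(tokens)
    return phase_known, full_phase


# Invented sample with 2 loci: a "phase known" structure (1 family,
# 0 switches) followed by a full-phase structure with one switch.
SAMPLE = """
2 1 all M1 M2
4
0X 1X X0 X1
0
2 1 fam1 M1 M2
4
01 10 00 11
1
2 0110
"""

phase_known, full_phase = read_dat(SAMPLE)
print(len(phase_known.families), phase_known.families[0].switches)
print(full_phase.families[0].switches)
```

Reading from a single token stream keeps the sketch independent of how the items are broken across lines. A reader that depends on the exact line layout written by the prepare option would need to be checked against a real .dat file.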