crimap documentation (version 2.4)


4.3 .dat file

This file is created by the program (using the prepare option), not by the user. It contains two data structures, each consisting of the following data items:

{# of loci}
{# of families}
{family id1} {family id2} . . . (may be any character strings not containing blanks)
{locus name1} {locus name2} . . . (each locus name consists of at most 15 characters, with no embedded blanks)
For each family:
{# of chromosomes}
{phase chromosomes: each is a character string of length numloci with values 0, 1, or X, denoting the phase}
{# of switches}
For each switch:
{index of locus affected by the switch (in this file, locus indices start at 1, not 0)} {string of length numchroms, with entry 1 if the corresponding chromosome is affected by the switch, 0 otherwise}
The first such data structure holds the "phase known" data: there is 1 "family" containing all the chromosomes in the data set, and 0 switches. All loci of unknown phase are given phase X, as are loci on the chromosomes of children of identical heterozygotes (the latter restriction is necessary to avoid bias in the estimation of recombination fractions, cf. [Ott (1985)]). The second data structure contains the full phase information for the data set, arranged by families. (For definitions of switches and phase chromosomes, see [Green (1988)].) The "phase known" data structure is used only by the option build.
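
The layout described above can be read with a short script. The following is a minimal sketch in Python, not part of crimap itself. It assumes that all data items are separated by whitespace (blanks or line breaks) and that each phase chromosome and each switch string is written as one unbroken token; the documentation above does not state either point, so check them against a real .dat file. The names DatStructure, Family, Switch, parse_structure and parse_dat are hypothetical, and SAMPLE is made-up data used only to exercise the parser.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    locus_index: int  # 1-based, exactly as stored in the file
    affected: str     # one character per chromosome: "1" affected, "0" not


@dataclass
class Family:
    chromosomes: list  # phase strings over 0/1/X, each numloci long
    switches: list = field(default_factory=list)


@dataclass
class DatStructure:
    family_ids: list
    locus_names: list
    families: list


def parse_structure(tokens, pos=0):
    """Parse one data structure from tokens[pos:]; return (structure, next_pos)."""
    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    num_loci = int(take())
    num_families = int(take())
    family_ids = [take() for _ in range(num_families)]
    locus_names = [take() for _ in range(num_loci)]
    if any(len(name) > 15 for name in locus_names):
        raise ValueError("locus name longer than 15 characters")

    families = []
    for _ in range(num_families):
        num_chroms = int(take())
        chroms = [take() for _ in range(num_chroms)]
        for chrom in chroms:
            if len(chrom) != num_loci or set(chrom) - set("01X"):
                raise ValueError(f"bad phase chromosome: {chrom!r}")
        switches = []
        for _ in range(int(take())):
            locus_index = int(take())
            affected = take()
            if not 1 <= locus_index <= num_loci:
                raise ValueError(f"locus index out of range: {locus_index}")
            if len(affected) != num_chroms or set(affected) - set("01"):
                raise ValueError(f"bad switch string: {affected!r}")
            switches.append(Switch(locus_index, affected))
        families.append(Family(chroms, switches))
    return DatStructure(family_ids, locus_names, families), pos


def parse_dat(text):
    """Return (phase_known, full_phase), the two structures in file order."""
    tokens = text.split()
    phase_known, pos = parse_structure(tokens, 0)
    full_phase, pos = parse_structure(tokens, pos)
    return phase_known, full_phase


# Made-up example: 2 loci, 2 chromosomes, one switch in the second structure.
SAMPLE = """
2 1 all
L1 L2
2
0X
1X
0
2 1 fam1
L1 L2
2
01
10
1
2 01
"""

phase_known, full_phase = parse_dat(SAMPLE)
```

Locus indices are kept 1-based, as in the file; subtract 1 before using one to index into a phase chromosome string. For a well-formed file, the first returned structure should have exactly one family and no switches, which gives a cheap sanity check after parsing.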