crimap documentation (version 2.4)


4.2 .par file

This file may either be created using the prepare option, or directly using a text editor. The format has been changed effective with version 2.4, and differs somewhat from that of the other files. Each line contains a parameter or variable name and its value(s) (separated by spaces), and concludes with an asterisk *. The parameters may appear in any order within the file. If a parameter is omitted (either by omitting the corresponding line altogether, or by omitting the value between the parameter name and the asterisk), then CRI-MAP will automatically assign a default value to the parameter.

In the following descriptions, each line ending in an asterisk is displayed as it would appear in the .par file, and gives a parameter name together with its default value(s) (if any). Any number of hap_sys, hap_sys0, and fixed_dist lines may appear in the file. The other parameters should appear only once. (If any of them appears more than once, only the last specified value is used.)

dat_file chrx.dat *
gen_file chrx.gen *
ord_file chrx.ord *
These give the names of the .dat, .gen, and .ord files to be used in the analysis. When the name of the .par file is not of the form chrx.par, the default names for the above files are obtained by replacing the ".par" (which must always appear) in the name of the .par file with ".dat", ".gen", or ".ord", respectively. (This is sometimes useful if one wants to keep all files in a separate directory from the program itself, in which case the name of the .par file passed to crimap may include the full path name; the default names for the other files will then include the same path).
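The naming rule above can be illustrated with a short Python sketch. It is not part of CRI-MAP; the function name default_file_names and the example path are hypothetical, and the sketch assumes that ".par" is the final suffix of the file name.

```python
def default_file_names(par_path):
    """Derive the default .dat, .gen, and .ord names from a .par file name."""
    suffix = ".par"
    # Assumption: ".par" (which must always appear) is the trailing suffix.
    if not par_path.endswith(suffix):
        raise ValueError("the .par file name must end in '.par'")
    stem = par_path[: -len(suffix)]
    # Any directory prefix in the .par name is carried over unchanged.
    return {ext: stem + "." + ext for ext in ("dat", "gen", "ord")}


# Hypothetical path, used only for illustration.
names = default_file_names("/data/maps/chr7a.par")
print(names["dat"])
print(names["gen"])
print(names["ord"])
```

Because the replacement is applied to the whole name, a path prefix such as /data/maps/ is inherited by all three default names.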

nb_our_alloc 3000000 *
The initial memory allocation (in bytes). The default value will suffice for most runs, except with a very large number of loci, and in fact is much more than is usually needed for analyzing a small number of loci. If additional memory is needed during the run, it will be allocated automatically (up to the limits of your system); however it is advantageous to choose values for the initial allocation which will suffice for the entire run. See "Memory management" under TECHNICAL NOTES.

SEX_EQ 1 *
SEX_EQ is 1 if recombination rates are assumed equal in the two sexes, 0 if sex-specific rates are to be allowed in each interval.

TOL .01 *
The tolerance for determining convergence of the layered EM algorithm; when log10 likelihoods from successive "phase unknown" iterations increase by less than this amount, iteration terminates. The tolerance used to detect convergence in the "noninformative locus" part of layered EM, and in the option twopoint, is TOL/10. In extensive tests using our RFLP CEPH family data sets (several hundred maximum likelihood estimations) the default value .01 was always found to be adequate (it was occasionally not adequate when ordinary EM was used as the search method, instead of layered EM). If you are concerned about the possibility that this may not be stringent enough for your data set (for example, if the likelihood surface is relatively flat), try using .001 instead; the linear nature of EM convergence guarantees that, if the estimates had not converged with TOL = .01, then a substantial improvement in likelihood should be apparent with the more stringent tolerance. If such an improvement is seen, use successively smaller values for TOL until no further improvement in the likelihood results.
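The stopping rule can be sketched in Python as follows. This is only an illustration of the rule as stated, not CRI-MAP's code: the function name is hypothetical, and the sequence of log10 likelihoods is invented for the example rather than taken from a real run.

```python
def iteration_where_converged(log10_likes, tol=0.01):
    """Return the index of the first iteration whose gain is below tol.

    log10_likes holds the log10 likelihood after each successive iteration.
    Returns None if the gain never drops below tol.
    """
    for i in range(1, len(log10_likes)):
        if log10_likes[i] - log10_likes[i - 1] < tol:
            return i
    return None


# Invented trace of log10 likelihoods, for illustration only.
trace = [-120.0, -110.5, -108.2, -108.05, -108.045, -108.0449]

stop_default = iteration_where_converged(trace, tol=0.01)
stop_strict = iteration_where_converged(trace, tol=0.001)
print(stop_default, stop_strict)
```

In this invented trace the stricter tolerance runs one iteration longer and gains almost nothing, which is the pattern to expect when the default tolerance was already adequate.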

PUK_NUM_ORDS_TOL 6 *
Applies only to the option build; gives the maximum number of orders allowed in the current map, in the phase unknown part of the analysis.

PK_NUM_ORDS_TOL 8 *
Similar to PUK_NUM_ORDS_TOL, but applies instead to the phase known analysis in build. If 0, the phase known analysis will be skipped entirely during mapbuilding.

PUK_LIKE_TOL 3.0 *
The tolerance for discarding locus orders. If the log10 likelihood of an order is less than the log10 likelihood of some other order for the same loci by an amount exceeding PUK_LIKE_TOL, that order is discarded (or not printed). Used by mapbuilding options, all, and flipsn. With twopoint, LOD tables are displayed only for locus pairs whose LOD exceeds PUK_LIKE_TOL.
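A minimal Python sketch of the discarding rule follows. An order is less likely than some other order by more than the tolerance exactly when it is less likely than the best order by more than the tolerance, so the sketch compares each order against the best one. The function name, the locus orders, and the likelihood values are all hypothetical.

```python
def keep_orders(log10_like_by_order, like_tol=3.0):
    """Drop every order whose log10 likelihood trails the best by more than like_tol."""
    best = max(log10_like_by_order.values())
    return {
        order: like
        for order, like in log10_like_by_order.items()
        if best - like <= like_tol
    }


# Invented log10 likelihoods for three orders of the same four loci.
orders = {
    (0, 1, 2, 3): -85.2,
    (0, 2, 1, 3): -86.9,
    (1, 0, 2, 3): -89.0,
}
kept = keep_orders(orders, like_tol=3.0)
print(sorted(kept))
```

Here the third order trails the best by 3.8 log10 units and is discarded, while the second trails by only 1.7 and is kept.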

PK_LIKE_TOL 3.0 *
As above, but applies only to analysis of the phase known data in the option build (and is not used by the other options).

use_ord_file 0 *
This parameter applies only to the options all and flipsn. When it is 1, the orders generated by those options are prescreened against the .ord file to eliminate orders incompatible with the orders database, prior to computing likelihoods. When use_ord_file is 0, the information in the orders database is not used.

write_ord_file 1 *
Applies only to the option build. When it is 1, the results of the current build run are used to update the orders database. When it is 0, the orders database will not be updated (but will still be used to prescreen orders during the course of the run).

ordered_loci {index # of 1st ordered locus} {index # of 2d ordered locus} . . . *
inserted_loci {index # of 1st inserted locus} {index # of 2d inserted locus} . . . *
(Note: locus indices start at 0).
For all options except twopoint, the "ordered_loci" are assumed to be in their known, unique order; the remaining loci, called the "inserted_loci", are to be placed in the framework defined by the ordered loci. For the options chrompic, fixed, and flipsn, there are no inserted loci. For twopoint, if both ordered_loci and inserted_loci are specified, then LOD tables are only computed for pairs of loci for which one is in the ordered list and the other is in the inserted list. Otherwise the analysis uses all pairs of loci in the specified list (ordered or inserted).
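The pair selection described for twopoint can be sketched in Python as below. The function name twopoint_pairs is hypothetical and the sketch shows only which locus pairs would be considered; it computes no LOD scores.

```python
from itertools import combinations, product


def twopoint_pairs(ordered_loci, inserted_loci):
    """List the locus pairs for which LOD tables would be computed."""
    if ordered_loci and inserted_loci:
        # Both lists given: one locus from each list.
        return list(product(ordered_loci, inserted_loci))
    # Only one list given: all pairs within that list.
    loci = ordered_loci or inserted_loci
    return list(combinations(loci, 2))


both = twopoint_pairs([2, 8], [9, 10])
single = twopoint_pairs([2, 8, 9], [])
print(both)
print(single)
```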

hap_sys {index # of 1st locus in system} {index # of 2d locus} . . . *
hap_sys0 {index # of 1st locus in system} {index # of 2d locus} . . . *
Each haplotyped system is a list of loci (for example, different RFLPs detected by the same probe) which are to be grouped together in an analysis. The first locus in any system is called "primary", and the remaining loci are called "secondary". When the parameter use_haps (see below) is 1, the secondary loci in a system "tag along" with the primary locus whenever ordered sets of loci are constructed. The operations which construct new orders (for example, by inserting a new locus into the map, or by permuting a collection of loci) utilize only the primary loci; once the order is constructed, however, it is "filled out" by inserting secondary loci immediately following the corresponding primary locus, prior to calculating likelihoods and map distances. Secondary loci are automatically deleted from the input lists of ordered loci and inserted loci. Thus any system which is to be included in the analysis must be represented by (at least) its primary locus.

For systems specified using hap_sys, the loci within a system are treated as independent in all calculations; i.e. they are not forced to have 0 recombination fraction. In particular, intralocus recombinants between loci in the same system are permitted. For systems specified using hap_sys0, distances within the system are forced to 0 (the program will stop, displaying the message ERROR: 0 likelihood, if there are in fact intralocus recombinants).

Example: two haplotyped systems, the first having loci 3, 4, and 5 with distances not forced to 0.0, and the second having loci 9 and 11 with distances forced to 0.0, would be entered in the .par file as

        hap_sys   3   4   5   *
        hap_sys0   9   11   *

use_haps   1   *
When use_haps is set to 1, haplotyping is performed; when it is 0, any haplotyped systems specified in the .par file are ignored (i.e. the input lists of loci are taken as is, and no secondary loci are deleted or inserted).
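The handling of haplotyped systems can be illustrated with a small Python sketch. It is a paraphrase of the description above, not CRI-MAP's implementation; the function names and the locus lists are hypothetical, and each system is written as a list whose first element is the primary locus.

```python
def strip_secondaries(loci, systems, use_haps=True):
    """Remove secondary loci from an input list (no change when use_haps is off)."""
    if not use_haps:
        return list(loci)
    secondary = {locus for system in systems for locus in system[1:]}
    return [locus for locus in loci if locus not in secondary]


def fill_out(order, systems, use_haps=True):
    """Insert each system's secondary loci right after its primary locus."""
    if not use_haps:
        return list(order)
    secondaries_of = {system[0]: list(system[1:]) for system in systems}
    filled = []
    for locus in order:
        filled.append(locus)
        filled.extend(secondaries_of.get(locus, []))
    return filled


# The two systems from the example: hap_sys 3 4 5 and hap_sys0 9 11.
systems = [[3, 4, 5], [9, 11]]

primaries = strip_secondaries([1, 3, 4, 9, 11, 12], systems)
filled = fill_out([1, 9, 3, 12], systems)
print(primaries)
print(filled)
```

New orders are built from the stripped list only, so loci 4, 5, and 11 never move independently; they reappear next to loci 3 and 9 when the order is filled out.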

fixed_dist {rec. frac.} {index # of 1st locus} {index # of 2d locus} {sex (optional)} *
For the options fixed and chrompic (only), the recombination fraction between a pair of adjacent loci may be held fixed using fixed_dist. If the recombination fraction to be fixed is sex-specific, specify the sex as 0 for female, or 1 for male. If either locus is part of a haplotyped system, it must be the primary locus from its system (it is not possible to force a distance within a haplotyped system, except by using hap_sys0 as described above). Note: any recombination fractions held fixed, either using fixed_dist or hap_sys0, are flagged by an asterisk in the map displayed following the analysis.

The last line of the .par file must contain the single word END.
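As an illustration of the format rules in this section, the following Python sketch reads the text of a .par file into a dictionary. It is not CRI-MAP's own reader. The function name parse_par is hypothetical, values are kept as strings, and the sketch assumes that names, values, and asterisks are all separated by whitespace. A parameter written with no value is simply not stored, on the understanding that the default then applies.

```python
REPEATABLE = {"hap_sys", "hap_sys0", "fixed_dist"}


def parse_par(text):
    """Parse .par text into {name: values}; repeatable names map to a list of entries."""
    params = {}
    entry = []
    ended = False
    for token in text.split():
        if ended:
            raise ValueError("text found after END")
        if token == "END":
            if entry:
                raise ValueError("missing asterisk before END")
            ended = True
        elif token == "*":
            if not entry:
                raise ValueError("asterisk without a parameter name")
            name, values = entry[0], entry[1:]
            if name in REPEATABLE:
                # Any number of these lines may appear; keep them all.
                params.setdefault(name, []).append(values)
            elif values:
                # For other parameters the last specified value wins.
                params[name] = values
            entry = []
        else:
            entry.append(token)
    if not ended:
        raise ValueError("missing END")
    return params


sample = """
ordered_loci 2 8 *
inserted_loci 9 10 *
hap_sys 3 4 5 *
hap_sys0 9 11 *
TOL .01 *
TOL .001 *
END
"""
parsed = parse_par(sample)
print(parsed["ordered_loci"], parsed["TOL"], parsed["hap_sys"])
```

Reading the file token by token rather than line by line means the order of parameters does not matter, and the repeated TOL entry shows the "last value wins" rule.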

Example: if you wished to construct a .par file chr7a.par to perform the all analysis described in GETTING STARTED, above, the following three lines would suffice (since default values are to be used for all other parameters):

     ordered_loci   2   8   *
     inserted_loci   9   10   *
     END
