LINKAGE User's Guide (version 5.2)


2.3 Binary Factors

Binary factors (sometimes called "factor-union" notation) can also represent phenotypes for codominant marker data, but this coding is most useful with recessive alleles or with complex systems such as Rh, ABO, and Gm. Each allele is assigned a set of properties, called factors, in such a way that every phenotype can be specified as the union of the factor sets of two alleles.

For codominant loci, each allele can be associated with one factor. If n alleles are present, the ith allele is represented by a string of n binary digits that is 0 in every position except the ith, which is 1. For example, in a two-allele system the allelic codes are:

     1 0                 (allele 1)
     0 1                 (allele 2)
The three possible phenotypes are:
     1 0                 (union of alleles 1 and 1)
     1 1                 (union of alleles 1 and 2)
     0 1                 (union of alleles 2 and 2)
An unknown phenotype is coded as 0 0. The spaces between the factors are significant: they must be included when the phenotypes are entered in the pedigree file, as described below.
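The coding above can be sketched in a few lines of Python. This is only an illustration of the factor-union idea, not part of the LINKAGE programs; the helper names (allele_code, union, format_code, parse_code, is_unknown) are hypothetical.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of LINKAGE.

def allele_code(i, n):
    """Factor code for codominant allele i (1-based) in an n-allele system."""
    if not 1 <= i <= n:
        raise ValueError("allele index out of range")
    return tuple(1 if pos == i else 0 for pos in range(1, n + 1))


def union(code_a, code_b):
    """Phenotype code: a factor is present if either allele carries it."""
    return tuple(a | b for a, b in zip(code_a, code_b))


def format_code(code):
    """Render a code with a single space between factors."""
    return " ".join(str(factor) for factor in code)


def parse_code(text):
    """Read a space-separated code back into a tuple of 0/1 factors."""
    return tuple(int(token) for token in text.split())


def is_unknown(code):
    """An unknown phenotype has every factor set to 0."""
    return not any(code)


heterozygote = union(allele_code(1, 2), allele_code(2, 2))
print(format_code(heterozygote))      # 1 1
print(is_unknown(parse_code("0 0")))  # True
```

Keeping the factors as separate tokens mirrors the requirement that the spaces be preserved: "1 1" is two factors, not the number eleven.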

A locus with three codominant alleles is coded as:

     1 0 0               (allele 1)
     0 1 0               (allele 2)
     0 0 1               (allele 3)
The six possible phenotypes are:
     1 0 0               (union of alleles 1 and 1)
     1 1 0               (union of alleles 1 and 2)
     1 0 1               (union of alleles 1 and 3)
     0 1 0               (union of alleles 2 and 2)
     0 1 1               (union of alleles 2 and 3)
     0 0 1               (union of alleles 3 and 3)
An unknown phenotype is coded as 0 0 0.
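The same enumeration works for any number of codominant alleles: n alleles give n(n+1)/2 unordered pairs, and each pair has its own phenotype code. The following Python sketch (the function name codominant_phenotypes is hypothetical, chosen for illustration) reproduces the three-allele table.

```python
# Illustrative sketch only; codominant_phenotypes is a hypothetical helper.
from itertools import combinations_with_replacement


def codominant_phenotypes(n):
    """Map each unordered allele pair (a, b) to its phenotype code."""
    alleles = {
        i: tuple(int(pos == i) for pos in range(1, n + 1))
        for i in range(1, n + 1)
    }
    table = {}
    for a, b in combinations_with_replacement(range(1, n + 1), 2):
        # Union of the two allele codes, factor by factor.
        table[(a, b)] = tuple(x | y for x, y in zip(alleles[a], alleles[b]))
    return table


for (a, b), code in codominant_phenotypes(3).items():
    print(" ".join(map(str, code)), f"(union of alleles {a} and {b})")
```

Because every pair yields a different code, a codominant phenotype identifies its genotype uniquely; this is what changes in the recessive case.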

The advantage of the binary factor coding scheme is evident when a recessive disease gene is under study. To code such a system, we could indicate the normal allele by the presence of a single factor (1) and the disease allele by the absence of this factor (0). The phenotype 1 (unaffected) then corresponds to two possible genotypes: the union of allele 1 and allele 1 (noncarrier) or the union of allele 1 and allele 0 (carrier).

This simple coding is usually not sufficient, because both the homozygous recessive phenotype and an unknown phenotype are coded as 0. To distinguish them, we introduce a second factor, for which a 1 indicates that the phenotype is known and a 0 that it is unknown. The allelic codes are:

     1 1                 (allele 1)
     0 1                 (allele 2)
and the possible phenotypes are:
     1 1                 (union of alleles 1 and 1, or alleles 1 and 2)
     0 1                 (union of alleles 2 and 2)
     0 0                 (unknown)                                             
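To make the ambiguity of the dominant phenotype explicit, the Python sketch below lists the genotypes whose factor union matches each phenotype under the two-factor recessive coding. It assumes that an unknown phenotype (all factors 0) is treated as compatible with every genotype; the names ALLELE_CODES and compatible_genotypes are hypothetical and are not taken from LINKAGE.

```python
# Illustrative sketch only; names are hypothetical, not LINKAGE internals.
from itertools import combinations_with_replacement

# Two-factor recessive coding: the second factor marks "phenotype known".
ALLELE_CODES = {1: (1, 1), 2: (0, 1)}
UNKNOWN = (0, 0)


def compatible_genotypes(phenotype):
    """Return the allele pairs whose factor union equals the phenotype.

    Assumption: an unknown phenotype is compatible with every genotype.
    """
    genotypes = list(combinations_with_replacement(sorted(ALLELE_CODES), 2))
    if phenotype == UNKNOWN:
        return genotypes
    matches = []
    for a, b in genotypes:
        code = tuple(x | y for x, y in zip(ALLELE_CODES[a], ALLELE_CODES[b]))
        if code == phenotype:
            matches.append((a, b))
    return matches


print(compatible_genotypes((1, 1)))  # [(1, 1), (1, 2)]
print(compatible_genotypes((0, 1)))  # [(2, 2)]
print(compatible_genotypes((0, 0)))  # [(1, 1), (1, 2), (2, 2)]
```

With the second factor in place, the homozygous recessive phenotype 0 1 and the unknown phenotype 0 0 no longer share a code.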
 
