A Glossary of Statistics
(in particular, re-randomisation statistics)
by Norman Marsh ( jw34@liverpool.ac.uk ), Feb 28, 96
VERSION 1.0
The original source of this file is at
ftp://www.mailbase.ac.uk/pub/lists-a-e/exact-stats/files/glossary.txt
Note: "[ ]" are used at the beginning of some entries to enclose
etymological or biographic information.
See also, statistics glossary by Valerie J. Easton and John H. McColl.
See also, statistics glossary from Electronic Textbook STATSOFT.
See also, statistics glossary from STATLETS
See also, HyperStat (online introductory textbook) from Rice Virtual Lab in Statistics
A
ALGORITHM(1)
A formal statement, clear, complete and unambiguous, of how a certain
process needs to be undertaken. Also see : ALGORITHM(2).
ALGORITHM(2)
An ALGORITHM(1) expressed in a PROGRAMMING LANGUAGE for a COMPUTER.
ALPHA
ANSI
BERNOULLI PROCESS
BETA
BINOMIAL DISTRIBUTION
BINOMIAL TEST
BOOTSTRAP
BRANCH-AND-BOUND
'C'
CHI-SQUARED DISTRIBUTION
CHI-SQUARED STATISTIC
COMPILER
COMPUTER
COMPUTER PROGRAM
CONFIDENCE INTERVAL
CONTINUOUS DISTRIBUTION
DECISION RULE
DEGREES OF FREEDOM
DIFFERENCE OF MEANS
DISCRETE DISTRIBUTION
ERROR TYPES
EQUIVALENT TEST STATISTIC
EXACT BINOMIAL TEST
EXACT-STATS
EXACT TEST(1)
EXACT TEST(2)
EXHAUSTIVE RE-RANDOMISATION
EXPERIMENTAL DESIGN
EXTENDED PASCAL
FACTORIAL
FISHER TEST(1)
FISHER TEST(2)
FORTRAN
FREEMAN-HALTON TEST
GOLD STANDARD(1)
GOLD STANDARD(2)
INTERPRETER
INTERVAL SCALE
ISO
LOGISTIC REGRESSION
MANN-WHITNEY TEST
MEASUREMENT TYPE
MID-P
MINIMAL-CHANGE SEQUENCE
MONTE-CARLO TEST
MULTINOMIAL DISTRIBUTION
NOMINAL ALPHA CRITERION LEVEL
NOMINAL SCALE
NON-PARAMETRIC TEST
NORMAL DISTRIBUTION
NULL HYPOTHESIS
OBJECT CODE
ODDS RATIO
ORDINAL SCALE
OUTCOME VALUE
P-VALUE
PAS2C
PASCAL
PERMUTATION
PERMUTATION TEST
PITMAN PERMUTATION TEST(1)
PITMAN PERMUTATION TEST(2)
POISSON DISTRIBUTION
POISSON PROCESS
POPULATION
POWER
PROGRAM
PROGRAMMABLE
PROGRAMMING LANGUAGE
PSEUDO-RANDOM
RANDOM SAMPLE
RANDOMISATION(1)
RANDOMISATION(2)
RANDOMISATION(3)
RANDOMISATION DISTRIBUTION
RANDOMISATION SET
RANDOMISATION TEST
RANKED DATA
RATIO SCALE
RE-RANDOMISATION
RE-RANDOMISATION STATISTICS
RELATIVE POWER
REPEATED-MEASURES
REPLICATIONS
REPRESENTATIVE
RESAMPLING STATS
RNG
SACROWICZ & COHEN CRITERION
SAMPLE
SAMPLE SIZE
SCALE TYPE
SEED
SHIFT ALGORITHM
SIGNIFICANCE
SIZE
STANDARD PROGRAMMING LANGUAGE
STATISTIC
STATISTICAL SIGNIFICANCE
STEVENS' TYPOLOGY
STRATIFIED
TAIL
TAIL DEFINITION POLICY
TEST STATISTIC
TIED RANKS
TIED VALUES
TWO-WAY TABLE
TYPE-1 ERROR
WILCOXON RANK-SUM TEST
WILCOXON TEST(1)
WILCOXON TEST(2)
2-WAY TABLE
2-BY-2 TABLE
ALPHA
Also known as SIZE or TYPE-1 ERROR. This is the probability that,
according to some null hypothesis, a statistical test will generate
a false-positive error : affirming a non-null pattern by chance.
Conventional methodology for statistical testing is, in advance of
undertaking the test, to set a NOMINAL ALPHA CRITERION LEVEL (often
0.05). The outcome is classified as showing STATISTICAL SIGNIFICANCE
if the actual ALPHA (probability of the outcome under the null
hypothesis) is no greater than this NOMINAL ALPHA CRITERION LEVEL
(but see : TAIL DEFINITION POLICY). This reasoning is applicable
for all types of statistical testing, including RE-RANDOMISATION
STATISTICS which are the concern of this present glossary. Also see :
BETA, ERROR TYPES, P-VALUE.
ANSI
[Initials/acronym for the American National Standards Institute]
This body publishes specifications for a number of STANDARD
PROGRAMMING LANGUAGES. The specifications are generally arranged to
concur with those of ISO.
B
BERNOULLI PROCESS
[()] This is the simplest probability model - a single trial
between two possible outcomes such as a coin toss. The distribution
depends upon a single parameter, 'p', representing the probability
attributed to one defined outcome out of the two possible outcomes.
Also see : BINOMIAL DISTRIBUTION, POISSON PROCESS.
BETA
Also known as TYPE-2 ERROR, BETA is the complement to POWER :
BETA = (1-POWER). This is the probability that a statistical test will
generate a false-negative error : failing to assert a defined
pattern of deviation from a null pattern in circumstances where the
defined pattern exists. Conventional methodology for statistical
testing is to set in advance a NOMINAL ALPHA CRITERION LEVEL - the
corresponding level for BETA will depend upon the NOMINAL ALPHA
CRITERION LEVEL and upon further considerations including the
strength of the pattern in the data and the sample size. Interest
is generally in the RELATIVE POWER of different tests rather than in
an absolute value. It is questionable whether the concept of BETA
error is properly applicable without considering the concept of
sampling from a population, which is separate from the concerns of
this Glossary. Applicability of this reasoning is also closely
bound up with the choice of TEST STATISTIC. Also see : ERROR TYPES.
BINOMIAL DISTRIBUTION
This is a special case of the MULTINOMIAL DISTRIBUTION where the
number of possible outcomes is 2. It is the distribution of
outcomes expected if a certain number of independent trials are
undertaken of a single BERNOULLI PROCESS (e.g. multiple tosses of a
coin, or tosses of several coins with identical characteristics).
The distribution depends upon the single parameter, 'p', of the
corresponding BERNOULLI PROCESS and upon the number of trials, 'n'.
An alternative characterisation is as the outcome of two separate
POISSON PROCESSes with separate rate parameters.
BINOMIAL TEST
This is a statistical test referring to a repeated binary process
such as would be expected to generate outcomes with a BINOMIAL
DISTRIBUTION. A value for the parameter 'p' is hypothesised (null
hypothesis) and the difference of the actual value from this is
assessed as a value of ALPHA. Also see : EXACT BINOMIAL TEST.
BOOTSTRAP
[()] This is a form of RANDOMISATION TEST which is one of the
alternatives to EXHAUSTIVE RE-RANDOMISATION. The BOOTSTRAP scheme
involves generating subsets of the data on the basis of random
sampling with replacement as the data are sampled. Such resampling
provides that each datum is equally represented in the randomisation
scheme; however, the BOOTSTRAP procedure has features which
distinguish it from the procedure of a MONTE-CARLO TEST. The
distinguishing features of the BOOTSTRAP procedure are concerned
with over-sampling - there is no constraint upon the number of times
that a datum may be represented in generating a single resampling
subset; the size of the resampling subsets may be fixed arbitrarily
independently of the parameter values of the EXPERIMENTAL DESIGN and
may even exceed the total number of data. The positive motive for
BOOTSTRAP resampling is the general relative ease of devising an
appropriate resampling ALGORITHM(1) when the EXPERIMENTAL DESIGN is
novel or complex. A negative aspect of the BOOTSTRAP is that the
form of the resampling distribution with prolonged resampling
converges to a form which depends not only upon the data and the
TEST STATISTIC, but also upon the BOOTSTRAP resampling subset size -
thus the resampling distribution should not be expected to converge
to the GOLD STANDARD(1) form of the EXACT TEST as is the case for
MONTE-CARLO resampling. An effective necessity for the BOOTSTRAP
procedure is a source of random codes or an effective PSEUDO-RANDOM
generator.
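As an illustration only, the resampling step described above might be sketched in Python as follows. The function name bootstrap_means, the use of the mean as the STATISTIC, the example data and the fixed SEED are all assumptions made for this sketch; they are not taken from any published procedure.

```python
import random

def bootstrap_means(data, subset_size, n_resamples, seed=0):
    """Return the mean of each BOOTSTRAP resampling subset.

    Each subset is drawn with replacement, so a datum may appear any
    number of times and subset_size may exceed len(data).
    """
    rng = random.Random(seed)  # PSEUDO-RANDOM generator started from a fixed SEED
    means = []
    for _ in range(n_resamples):
        subset = [rng.choice(data) for _ in range(subset_size)]
        means.append(sum(subset) / subset_size)
    return means

# Hypothetical data; the subset size (8) deliberately exceeds the number of data (5).
data = [4.1, 5.3, 2.8, 6.0, 4.7]
resampled = bootstrap_means(data, subset_size=8, n_resamples=1000)
print(min(resampled), max(resampled))
```

Changing subset_size changes the spread of the resampled means, which is the dependence upon subset size noted in the entry.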
BRANCH-AND-BOUND
Exploration of a RANDOMISATION DISTRIBUTION in such a way as to
anticipate the effect of the next RANDOMISATION(3) relative to the
present RANDOMISATION(3). This allows selective search of
particular zones of a RANDOMISATION DISTRIBUTION; in the context of
a RANDOMISATION TEST such selective search may be concerned with the
TAIL of the RANDOMISATION DISTRIBUTION. Also see : RANDOMISATION
TEST(1).
C
'C'
[Named as one of a developmental sequence of theoretical programming
languages : 'A', 'B' (also the useful language BCPL)]. A
PROGRAMMING LANGUAGE of broad expressive power; thus suitable for
both numerical and general programming. 'C' is closely
associated with the construction of the ubiquitous computer
operating system 'unix'. COMPILERS for 'C' are supplied for
virtually all modern computers. 'C' is available as a STANDARD
PROGRAMMING LANGUAGE approved by ANSI and ISO.
CHI-SQUARED DISTRIBUTION
Where expected frequencies are sufficiently high, hypothesised
distributions of counts may be approximated by a NORMAL DISTRIBUTION
rather than an exact BINOMIAL DISTRIBUTION. The corresponding
distribution of the CHI-SQUARED STATISTIC can be derived
algebraically - this is the CHI-SQUARED DISTRIBUTION which has been
computed and published historically as extensive printed tables.
Use of the tables is notably simple, as the CHI-SQUARED DISTRIBUTION
depends upon only one parameter, the DEGREES OF FREEDOM, defined as
one less than the number of categories.
CHI-SQUARED STATISTIC
[Named by E.S. Pearson ()?]. This is a long-established TEST
STATISTIC for measuring the extent to which a set of categorical
outcomes depart from a hypothesised set of probabilities. It is
calculated as a sum of terms over the available categories, where
each term is of the form : ((O-E)^2)/E ; 'O' represents the observed
frequency for the category and 'E' represents the corresponding
expected frequency based upon multiplying the sample size by the
hypothesised probability for the category being considered
(therefore 'E' will generally not be an integer value). In
situations where the number of categories is 2 an alternative
procedure is to use an EXACT BINOMIAL TEST. Also see : CHI-SQUARED
DISTRIBUTION, MULTINOMIAL DISTRIBUTION, POISSON PROCESS.
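The sum of terms ((O-E)^2)/E described above can be sketched in Python as below. The function name chi_squared_statistic and the example counts and probabilities are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def chi_squared_statistic(observed, probabilities):
    """Sum ((O-E)^2)/E over the categories.

    observed      : observed frequency 'O' for each category
    probabilities : hypothesised probability for each category
    """
    sample_size = sum(observed)
    total = 0.0
    for o, p in zip(observed, probabilities):
        e = sample_size * p  # expected frequency 'E'; need not be an integer
        total += (o - e) ** 2 / e
    return total

# Hypothetical example: 40 cases in three categories.
# Expected frequencies are 10, 10 and 20, so the terms are 0.4, 0.4 and 0.
print(chi_squared_statistic([12, 8, 20], [0.25, 0.25, 0.5]))
```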
COMPILER
A PROGRAM supplied especially for a particular type of COMPUTER, to
enable the translation of code expressed in some PROGRAMMING
LANGUAGE into OBJECT CODE for that COMPUTER. A COMPILER undertakes
translation of the whole of the user's PROGRAM to produce an OBJECT
CODE version which is complete, undivided and potentially permanent;
this is in contrast to the action of an INTERPRETER.
COMPUTER
An automatic data-processing device which is PROGRAMMABLE. Also see
: COMPUTER PROGRAM, OBJECT CODE, PROGRAM.
COMPUTER PROGRAM
A specification of how to undertake a certain process, usually
expressed via a PROGRAMMING LANGUAGE, for some chosen COMPUTER.
Also see : PROGRAM.
CONFIDENCE INTERVAL
For a given RE-RANDOMISATION distribution, a family of related
distributions may be defined according to a range of hypothetical
values of the pattern which the TEST STATISTIC measures. For
instance, for the PITMAN PERMUTATION TEST(2) to test for a scale
shift between two groups, a related distribution may be formed by
shifting all the observations in one group by a common amount, where
this common shift is regarded as a continuous variable. With finite
numbers of data the number of related distributions will be finite,
and typically considerably smaller than the number of points of the
RANDOMISATION DISTRIBUTION. The likelihood of the OUTCOME VALUE may
be calculated for each distribution in the family, and these
likelihoods may be then used to define a contiguous set of values
which occupy a certain proportion of the total unit weight of the
likelihoods integrated over all values of the TEST STATISTIC. The
CONFIDENCE INTERVAL is defined by the minimum and maximum values of
the range of values so defined. The proportion of the total weight
within the range of values is regarded as an ALPHA probability that
the value of the TEST STATISTIC lies within this range. Generally
the definition of a CONFIDENCE INTERVAL cannot be unique without
imposing further constraints. Approaches to providing suitable
constraints, such that a CONFIDENCE INTERVAL will be unique,
include defining the CONFIDENCE INTERVAL : to include the whole of
one TAIL of the distribution; or to be centred in some sense upon
the OUTCOME VALUE; or to be centred between TAILS of equal weight.
In the case of RE-RANDOMISATION DISTRIBUTIONs, these are DISCRETE
DISTRIBUTIONS so there will generally be no range of values with
weight corresponding exactly to an arbitrary NOMINAL ALPHA CRITERION LEVEL,
and the problem of non-uniqueness is therefore not generally
solvable.
CONTINUOUS DISTRIBUTION
A probability distribution of a continuous STATISTIC, based upon an
algebraic formula, such that for any possible value of the
cumulative probability there is an exact corresponding value of the
STATISTIC in question. Also see : DISCRETE DISTRIBUTION.
D
DECISION RULE
A rule for comparing the OUTCOME VALUE of ALPHA with a NOMINAL ALPHA
CRITERION LEVEL (such as 0.05). An OUTCOME VALUE smaller (more
extreme) than the NOMINAL ALPHA CRITERION LEVEL leads to a decision
of STATISTICAL SIGNIFICANCE of the finding that the TEST STATISTIC
has a value other than its (null-) hypothesised value. Also see :
STATISTICAL SIGNIFICANCE, TAIL DEFINITION POLICY.
DEGREES OF FREEDOM
An integer value measuring the extent to which an EXPERIMENTAL
DESIGN imposes constraints upon the pattern of the mean values of
data from various meaningful subsets of data. This value is
frequently referred to in the organisation of tables of statistical
distributions used in undertaking SIGNIFICANCE TESTS. For simple
one-way classifications the value of DEGREES OF FREEDOM is defined
as one less than the number of subsets.
DIFFERENCE OF MEANS
A TEST STATISTIC of intuitive appeal for measuring difference in
location between two samples with INTERVAL-SCALE data. Employing
this TEST STATISTIC in an EXACT TEST defines the PITMAN PERMUTATION
TESTs(1 or 2).
DISCRETE DISTRIBUTION
A probability distribution of some STATISTIC, based upon an
algebraic formula or upon re-randomisation or upon actual data, in
which the cumulative probability increases in non-infinitesimal steps
corresponding to non-infinitesimal weight associated with possible
values of the STATISTIC in question. This situation is
characteristic of RANDOMISATION DISTRIBUTIONs, and also of TEST
STATISTICs which are essentially discrete. Also see : CONTINUOUS
DISTRIBUTION.
E
ERROR TYPES
See : ALPHA, BETA, TYPE-1 ERROR, TYPE-2 ERROR.
EQUIVALENT TEST STATISTIC
Within a RANDOMISATION SET, it is possible that two different
STATISTICs may be inter-related in a manner which is provably
monotonic irrespective of the data. In such a situation a
RANDOMISATION TEST performed on either of these TEST STATISTICs
will necessarily have the same outcome in terms of ALPHA. If one of
the STATISTICs is of good descriptive validity whereas the other is
simpler to compute, then a RANDOMISATION TEST upon the simpler
STATISTIC may be used in place of a test upon the descriptively more
valid one, with corresponding savings in amount of computation
required. An example of such EQUIVALENT TEST STATISTICs occurs for
the situation of comparison of levels of a single INTERVAL-SCALE
variable between two groups. In this situation, the descriptively
valid statistic, as defined for the PITMAN PERMUTATION TEST(1), is
the difference of means, but simpler EQUIVALENT TEST STATISTICS
include the mean for one designated group, or (most simply) the
total of scores in one designated group.
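The equivalence described above can be checked numerically. The following Python sketch, offered under stated assumptions, enumerates every re-assignment of scores to two groups and computes a one-tailed ALPHA for two statistics: the difference of means and the total of the first group. The function names, the one-tailed "at least as large" rule and the example scores are all hypothetical choices for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def one_tailed_alpha(scores, n_first, statistic):
    """Proportion of re-randomisations whose statistic is at least the observed value.

    The first n_first scores are taken as the observed first group.
    """
    indices = range(len(scores))
    observed = statistic(scores[:n_first], scores[n_first:])
    count = total = 0
    for chosen in combinations(indices, n_first):
        chosen_set = set(chosen)
        first = [scores[i] for i in chosen]
        second = [scores[i] for i in indices if i not in chosen_set]
        total += 1
        if statistic(first, second) >= observed:
            count += 1
    return count / total

def difference_of_means(first, second):
    # Exact rational arithmetic avoids spurious ties or non-ties from rounding.
    return Fraction(sum(first), len(first)) - Fraction(sum(second), len(second))

def total_of_first(first, second):
    return sum(first)

scores = [12, 15, 14, 9, 8, 10]  # hypothetical data: first three form group one
print(one_tailed_alpha(scores, 3, difference_of_means))
print(one_tailed_alpha(scores, 3, total_of_first))
```

Both calls return the same value because, with the group sizes and the overall total fixed, the difference of means is an increasing function of the first group's total.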
EXACT BINOMIAL TEST
A STATISTICAL TEST referring to the BINOMIAL DISTRIBUTION in its
exact algebraic form, rather than through continuous approximations
which are used especially where sample sizes are substantial. Also
see EXACT TEST(1).
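A minimal Python sketch of such a test, using the exact algebraic form of the BINOMIAL DISTRIBUTION, is given below. The function names and the one-tailed (upper TAIL) definition of ALPHA are assumptions made for illustration; other TAIL DEFINITION POLICIES would give other values.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Exact probability of k occurrences in n independent trials with parameter p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def exact_binomial_upper_tail(k_observed, n, p):
    """One-tailed ALPHA: probability of k_observed or more occurrences under the null value p."""
    return sum(binomial_pmf(k, n, p) for k in range(k_observed, n + 1))

# Hypothetical example: 9 heads in 10 tosses, null hypothesis p = 0.5.
# The tail contains 10 + 1 = 11 of the 1024 equally likely sequences.
alpha = exact_binomial_upper_tail(9, 10, 0.5)
print(alpha)
```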
EXACT-STATS
This is the name of the academic initiative which produced this
present glossary. EXACT-STATS is a closed e-mail based discussion
group for the development and promulgation of the ideas of
re-randomisation statistics. The contact address is :
exact-stats@mailbase.ac.uk .
EXACT TEST(1)
The characteristic of a RE-RANDOMISATION TEST based upon EXHAUSTIVE
RE-RANDOMISATION, that the value of ALPHA will be fixed irrespective
of any random sampling of RANDOMISATIONS or of any distributional
assumptions. Notable examples are the EXACT BINOMIAL TEST, FISHER
TEST(1), the PITMAN PERMUTATION TESTs(1 and 2), and various
NON-PARAMETRIC TESTs based upon RANKED DATA.
EXACT TEST(2)
A test which yields an ALPHA value which does not depend upon the
NOMINAL ALPHA CRITERION LEVEL which may have been set for ALPHA.
This is in contrast to the possible practice of producing only a
yes/no decision with regard to a NOMINAL ALPHA CRITERION LEVEL.
Note that this reference to exactness is not (sic) the concern of
the EXACT-STATS initiative.
EXHAUSTIVE RE-RANDOMISATION
A series of samples from a RANDOMISATION SET which is known to
generate every RANDOMISATION. In particular, sampling which
generates every RANDOMISATION exactly once.
EXPERIMENTAL DESIGN
This term overtly refers to the planning of a process of data
collection. The term is also used to refer to the information
necessary to describe the interrelationships within a set of data.
Such a description involves considerations such as number of cases,
sampling methods, identification of variables and their scale-types,
identification of repeated measures and replications. These
considerations are essential to guide the choice of TEST STATISTIC
and the process of RE-RANDOMISATION. Also see : DEGREES OF FREEDOM,
REPEATED MEASURES, REPLICATIONS, STRATIFIED, TWO-WAY TABLE.
EXTENDED PASCAL
See : PASCAL.
F
FACTORIAL
The FACTORIAL operator is applicable to a non-negative integer
quantity. It is notated as the postfixed symbol '!'. The resulting
value is the product of the increasing integer values from 1 up to
the value of the argument quantity. For instance : 3! is 1x2x3 = 6.
By convention 0! is taken as producing the value 1. FACTORIAL
values increase very rapidly with increase in the argument value;
this rapid growth is represented in the similarly rapid growth in
numbers of COMBINATIONS.
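The definition above, and the link to numbers of COMBINATIONS, can be sketched in Python as follows. The function names are hypothetical; the count of combinations uses the standard formula n!/(k!(n-k)!), which is not stated in this glossary and is assumed here.

```python
def factorial(n):
    """Product of the integers from 1 up to n; 0! is taken as 1."""
    if n < 0:
        raise ValueError("FACTORIAL is defined only for non-negative integers")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

def combinations_count(n, k):
    """Number of ways of choosing k items from n, via n!/(k!(n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(factorial(3))                # 6
print(factorial(0))                # 1
print(combinations_count(20, 10))  # 184756 ways to split 20 cases into two groups of 10
```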
FISHER TEST(1)
[Named after the statistician RA Fisher()]. This is an EXACT
TEST(1) to examine whether the pattern of counts in a 2x2 cross
classification departs from expectations based upon the marginal
totals for the rows and columns. Such a test is useful to examine
difference in rate between two binomial outcomes. The RANDOMISATION
SET consists of those reassignments of the units which produce
tables with the same row- and column- totals as the OUTCOME. The
RANDOMISATION SET will thus consist of a number of tables with
different respective patterns of counts; each such table will have a
number of possible RANDOMISATIONS which may be a very large number.
For this test there are several reasonable TEST STATISTICs,
including : the count in any one of the 4 cells, CHI-SQUARED(1), or
the number of RANDOMISATIONS for each 2x2 table with the given row-
and column- totals; these are EQUIVALENT TEST STATISTICS. The
calculation for the FISHER TEST(1) is relatively undemanding
computationally, making reference to the algebra of the
hypergeometric distribution, and the test was widely used before the
appearance of COMPUTERs. This test has historically been regarded
as superior to the use of CHI-SQUARED(2) where sample sizes are
small. Statistical tables have been published for the FISHER
TEST(1) for a number of small 2x2 tables defined in terms of row-
and column- totals. Also see FISHER TEST(2), TWO-WAY TABLE.
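A minimal Python sketch of the hypergeometric calculation follows. It takes the count in the top-left cell as the TEST STATISTIC and sums the upper TAIL only; the function name, the one-tailed policy and the example table are assumptions for illustration.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed ALPHA for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with row and column totals fixed, that the
    top-left count is at least as large as the observed count a.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    all_randomisations = comb(n, col1)  # ways of choosing the units in column 1
    largest = min(row1, col1)           # largest possible top-left count
    favourable = sum(
        comb(row1, x) * comb(row2, col1 - x)  # randomisations giving top-left count x
        for x in range(a, largest + 1)
    )
    return favourable / all_randomisations

# Hypothetical table with all margins equal to 4: tail is (16 + 1) / 70.
print(fisher_one_tailed(3, 1, 1, 3))
```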
FISHER TEST(2)
[()] This is also known as the FREEMAN-HALTON TEST. It is an
extension of the logic of the FISHER TEST(1), for a 2-way
classification of counts where the extent of the
cross-classification may be greater than 2x2. The RANDOMISATION SET
for an EXHAUSTIVE RANDOMISATION TEST (EXACT TEST(1)) can be defined
in the same way as for the FISHER TEST(1). However, the various
TEST STATISTICs applicable when considering the FISHER TEST(1) will
not all be definable and will not clearly be EQUIVALENT TEST
STATISTICs. The TEST STATISTIC which is used is the number of
RE-RANDOMISATIONS for each table with the given row- and column-
totals; this TEST STATISTIC has the drawback of lacking any
descriptive significance in terms of the EXPERIMENTAL DESIGN.
FORTRAN
[Name is an acronym : FORmula TRANslator]. A very long established
and widely implemented PROGRAMMING LANGUAGE, specialised
substantially for numerical applications. A number of STANDARD
PROGRAMMING LANGUAGE versions of FORTRAN have been established at
various dates (e.g. FORTRAN IV, FORTRAN 90), approved as standard by
ANSI and ISO.
FREEMAN-HALTON TEST
See FISHER TEST(2).
G
GOLD STANDARD(1)
The GOLD STANDARD is the form of test which is most faithful to the
RANDOMISATION DISTRIBUTION, for a given TEST STATISTIC and
EXPERIMENTAL DESIGN. This involves EXHAUSTIVE RANDOMISATION. Other
RANDOMISATION TESTs may reasonably be judged by comparison with this
form. Also see : BOOTSTRAP, GOLD STANDARD(2), MONTE-CARLO.
GOLD STANDARD(2)
The idea of a re-randomisation test as a standard of correctness by
which to judge other tests which are not based upon principles of
RE-RANDOMISATION.
I
INTERPRETER
A PROGRAM supplied especially for a particular type of COMPUTER, to
enable the translation of code expressed in some PROGRAMMING
LANGUAGE into OBJECT CODE for that type of COMPUTER. An INTERPRETER
undertakes translation of the user's PROGRAM in small functional
units (statements) to OBJECT CODE as the PROGRAM is used and allows
modification of the sequence of statements without need to generate
a full explicit OBJECT CODE version of the PROGRAM; this is in
contrast to the action of a COMPILER. Use of an INTERPRETER is
convenient and flexible for program development; however, running a
program produced in this way generally requires more computational
resource (particularly in terms of run time) than for the OBJECT CODE
produced using a COMPILER.
INTERVAL SCALE
A characteristic of data such that the difference between two values
measured on the scale has the same substantive meaning/significance
irrespective of the common level of the two values being compared.
This implies that scores may meaningfully be added or subtracted and
that the mean is a representative measure of central tendency. Such
data are common in the domain of physical sciences or engineering -
e.g. lengths or weights. Also see : MEASUREMENT TYPE, SCALE TYPES,
STEVENS' TYPOLOGY.
ISO
[Short name of the International Organization for Standardization,
based in Geneva, Switzerland] This body publishes specifications
for a number of STANDARD PROGRAMMING LANGUAGES. The specifications
are arranged generally to concur with those of ANSI.
L
LOGISTIC REGRESSION
This relates to an EXPERIMENTAL DESIGN for predicting a binary
categorical (yes/no) outcome on the basis of predictor variables
measured on INTERVAL SCALEs. For each of a set of values of the
predictor variables, the outcomes are regarded as representing a
BINOMIAL process, with the binomial parameter 'p' depending upon the
value of the predictor variable. The modelling accounts for the
logarithm of the ODDS RATIO as a linear function of the predictor
variable. Fitting is via a weighted least-squares regression
method. RANDOMISATION TESTS for this purpose have been developed by
Mehta & Patel.
M
MANN-WHITNEY TEST
[Devised by ()] This is a test of difference in location for an
EXPERIMENTAL DESIGN involving two samples with data measured on an
ORDINAL SCALE or better. The TEST STATISTIC is a measure of ordinal
precedence. For each possible pairing of an observation in one
group with an observation in the alternate group, the pair is
classified in one of three ways - according to whether the
difference is positive, zero or negative; the numbers in these three
categories are tallied over the RANDOMISATION SET. The
RANDOMISATION SET is the same as that for the PITMAN PERMUTATION
TEST(1). This test is generally recommended for comparisons
involving ORDINAL-SCALE data but is not confined to this SCALE-TYPE.
An equivalent formulation of the test, based upon ranking the data
and summing ranks within groups, is the WILCOXON TEST(2). Also see
: COMBINATIONS.
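The pairwise classification underlying the TEST STATISTIC can be sketched in Python as below, for a single arrangement of the data into two groups. The function name pair_tallies and the example scores are hypothetical.

```python
def pair_tallies(first, second):
    """Classify every (first, second) pairing by the sign of the difference.

    Returns the numbers of pairings in which the observation from the
    first group is greater than, equal to, or less than the observation
    from the second group.
    """
    positive = zero = negative = 0
    for x in first:
        for y in second:
            if x > y:
                positive += 1
            elif x == y:
                zero += 1
            else:
                negative += 1
    return positive, zero, negative

# Hypothetical ordinal scores: 3 x 3 = 9 pairings in all.
print(pair_tallies([5, 7, 9], [4, 7, 8]))  # (5, 1, 3)
```

Only the ordering of the values is used, which is why the test is suited to ORDINAL-SCALE data.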
MEASUREMENT TYPE
This is a distinction regarding the relationship between a
phenomenon being measured and the data as recorded. The main
distinctions are concerned with the meaningfulness of numerical
comparisons of data (NOMINAL SCALE versus ORDINAL SCALE versus
INTERVAL SCALE versus RATIO SCALE : this is known as STEVENS'
TYPOLOGY), whether the scale of the measurements (other than NOMINAL
SCALE measurements) should be regarded as essentially continuous or
discrete, and whether the scale is bounded or unbounded.
MID-P
[Proposed by H.O. Lancaster (), and further promoted by G.A. Barnard]
This is a TAIL DEFINITION POLICY that the ALPHA value should be
calculated as the sum of the proportion of the TAIL for data
strictly more extreme than the OUTCOME, plus one half of the
proportion of the DISTRIBUTION corresponding to the exact OUTCOME
value. This gives an unbiased estimate of ALPHA.
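As a sketch only, the policy can be expressed in Python for an upper TAIL, given the values of the TEST STATISTIC over a RANDOMISATION SET. The function names, the treatment of "more extreme" as "greater than", and the example values are assumptions for illustration.

```python
def mid_p(values, outcome):
    """MID-P ALPHA: weight strictly beyond the outcome plus half the weight at the outcome."""
    more_extreme = sum(1 for v in values if v > outcome)
    equal = sum(1 for v in values if v == outcome)
    return (more_extreme + 0.5 * equal) / len(values)

def conventional_p(values, outcome):
    """For comparison: the whole weight at the outcome is included in the TAIL."""
    return sum(1 for v in values if v >= outcome) / len(values)

# Hypothetical distribution of five equally weighted values, outcome 3.
values = [1, 2, 2, 3, 4]
print(mid_p(values, 3))           # (1 + 0.5) / 5
print(conventional_p(values, 3))  # (1 + 1) / 5
```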
MINIMAL-CHANGE SEQUENCE
Exploration of a RANDOMISATION DISTRIBUTION in such a sequence that
the successive RANDOMISATION(3)s differ in a simple way. In the
context of a RANDOMISATION TEST this can mean that the value of the
TEST STATISTIC for a particular RANDOMISATION(3) may be calculated
by a simple adjustment to the value for the preceding
RANDOMISATION(3). Also see : RANDOMISATION(1).
MONTE-CARLO TEST
[Named after the famous site of gambling casinos] A MONTE-CARLO
TEST involves generating a random subset of the RANDOMISATION SET,
sampled without replacement, and using the values of the TEST
STATISTIC to generate an estimate of the form of the full
RANDOMISATION DISTRIBUTION. This procedure is in contrast to the
BOOTSTRAP procedure in that the sampling of the RANDOMISATION SET is
without replacement. An advantage of the MONTE-CARLO TEST over the
BOOTSTRAP is that with successive resamplings it converges to the
GOLD STANDARD(1) form of the EXACT TEST(1). An effective necessity
for the MONTE-CARLO procedure is a source of random codes or an
effective PSEUDO-RANDOM generator.
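A minimal Python sketch of this procedure for a two-group design is given below. It draws distinct RANDOMISATIONs (sampling without replacement, by discarding repeats) and uses the total of the first group as the TEST STATISTIC. The function name, the one-tailed rule, the fixed SEED and the example scores are assumptions for illustration.

```python
import random
from math import comb

def monte_carlo_alpha(scores, n_first, n_samples, seed=0):
    """Estimate a one-tailed ALPHA from n_samples distinct re-randomisations."""
    set_size = comb(len(scores), n_first)  # size of the full RANDOMISATION SET
    if n_samples > set_size:
        raise ValueError("cannot draw more distinct randomisations than exist")
    rng = random.Random(seed)
    observed = sum(scores[:n_first])  # first n_first scores are the observed group
    seen = set()
    extreme = 0
    while len(seen) < n_samples:
        chosen = tuple(sorted(rng.sample(range(len(scores)), n_first)))
        if chosen in seen:
            continue  # already sampled: discard, so sampling is without replacement
        seen.add(chosen)
        if sum(scores[i] for i in chosen) >= observed:
            extreme += 1
    return extreme / n_samples

scores = [12, 15, 14, 9, 8, 10]  # hypothetical data
print(monte_carlo_alpha(scores, 3, 10))  # an estimate from half of the 20 randomisations
print(monte_carlo_alpha(scores, 3, 20))  # all 20 randomisations: the exact value
```

When n_samples equals the size of the RANDOMISATION SET the estimate coincides with the result of EXHAUSTIVE RE-RANDOMISATION, which illustrates the convergence noted in the entry.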
MULTINOMIAL DISTRIBUTION
This is the distribution of outcomes expected if a certain number of
independent trials are undertaken of several separate BERNOULLI
PROCESSes, to determine a number of alternative outcomes. A special
case, where the number of outcomes is 2, is the BINOMIAL
DISTRIBUTION. The distribution depends upon the collection of
parameter values of the corresponding BERNOULLI PROCESSes and upon
the number of trials, 'n'. An alternative characterisation is as
the outcome of a number of separate POISSON PROCESSes with separate
rate parameters. Also see : TWO-WAY TABLEs.
N
NOMINAL ALPHA CRITERION LEVEL
A publicly agreed value for TYPE-1 ERROR, such that the outcome of a
statistical test is classified in terms of whether the obtained
value of ALPHA is extreme as compared with this criterion level.
The fine detail of the comparison involves the TAIL DEFINITION
POLICY. The outcome is classified as showing STATISTICAL
SIGNIFICANCE ('significant') if the outcome has low ALPHA as
compared with the NOMINAL ALPHA CRITERION LEVEL, otherwise not
('non-significant'). The commonest conventional values for the
NOMINAL ALPHA CRITERION LEVEL are 0.05 and 0.01 .
NOMINAL SCALE
This is a type of MEASUREMENT SCALE with a limited number of
possible outcomes which cannot be placed in any order representing
the intrinsic properties of the measurements. Examples : Female
versus Male; the collection of languages in which an international
treaty is published.
NON-PARAMETRIC TEST
A number of statistical tests were devised, mostly over the period
1930-1960, with the specific objective of by-passing assumptions
about sampling from populations with data supposedly conforming to
theoretically modelled statistical distributions such as the NORMAL
DISTRIBUTION. Several of these tests were explicitly concerned with
ORDINAL-SCALE data for which modelling based upon continuous
functions is clearly inappropriate. These tests are implicitly
RE-RANDOMISATION TESTS. Also see : BINOMIAL TEST, MANN-WHITNEY
TEST, WILCOXON TEST(1 and 2).
NORMAL DISTRIBUTION
[] The NORMAL DISTRIBUTION is a theoretical distribution applicable
for continuous INTERVAL-SCALE data. It is related mathematically to
the BINOMIAL and CHI-SQUARED(2) distributions and to several named
sampling distributions (including Student's t, Fisher's F, Pearson's
r); these sampling distributions are the characteristic tools of
parametric statistical inference to which RE-RANDOMISATION STATISTICS
are an alternative.
NULL HYPOTHESIS
In order to test whether a supposed interesting pattern exists in a
set of data, it is usual to propose a NULL HYPOTHESIS that the
pattern does not exist. It is the unexpectedness of the degree of
departure of the observed data, relative to the pattern expected
under the NULL HYPOTHESIS, which is examined by the measure ALPHA.
Reference to a NULL HYPOTHESIS is common between RE-RANDOMISATION
STATISTICS and parametric statistics. Also see : BETA.
O
OBJECT CODE
This is the code which a COMPUTER recognises and acts upon as a
direct consequence of its electromechanical construction. Typically
such code is highly abstract and unsuitable for general use
by human programmers. The OBJECT CODE to specify a certain process
is usually generated through use of a COMPILER. Also see :
PROGRAMMING LANGUAGE.
ODDS RATIO
An alternative characterisation of the parameter 'p' for a BINOMIAL
PROCESS is the ratio of the incidences of the two alternatives :
p/(1-p) ; this quantity is termed the ODDS RATIO; the value may
range from zero to infinity. This relates to a possible view of a
BINOMIAL PROCESS as the combined activity of two POISSON PROCESSes
with a limit upon total count for the two processes combined. Also
see : LOGISTIC REGRESSION.
ORDINAL SCALE
A MEASUREMENT TYPE for which the relative values of data are defined
solely in terms of being lesser, equal-to or greater as compared with
other data on the ORDINAL SCALE. These characteristics may arise
from categorical rating scales, or from converting INTERVAL SCALE
data to become RANKED DATA.
OUTCOME VALUE
The value of the TEST STATISTIC for the data as initially observed,
before any RE-RANDOMISATION.
P
P-VALUE
The ALPHA value arising from a statistical test. Also see :
EXACT TEST(2).
PAS2C
One of a number of PROGRAMs for undertaking translations between
STANDARD PROGRAMMING LANGUAGES.
PASCAL
[Named after the mathematician Blaise Pascal ( - )]. A PROGRAMMING
LANGUAGE designed for clarity of expression when published in
human-legible form, and for the teaching of programming. PASCAL is
to some extent specialised for numerical work. A development is
EXTENDED PASCAL. COMPILERS for PASCAL are widespread. PASCAL and
EXTENDED PASCAL are each represented as STANDARD PROGRAMMING
LANGUAGEs approved by ANSI and ISO.
PERMUTATION
This term has a distinct mathematical definition, but is also
commonly used as a synonym for RE-RANDOMISATION.
PERMUTATION TEST
See : PERMUTATION, PITMAN PERMUTATION TEST(1), PITMAN PERMUTATION
TEST(2).
PITMAN PERMUTATION TEST(1)
[Named after the statistician E.J. Pitman who described this test,
and the PITMAN PERMUTATION TEST(2), in 1937; this is one of the
earliest instances of an EXACT TEST(1)] An EXACT RE-RANDOMISATION
TEST in which the TEST STATISTIC is the DIFFERENCE OF MEANS of two
samples of univariate INTERVAL-SCALE data. Also see : EQUIVALENT
TEST STATISTIC, PITMAN PERMUTATION TEST(2).
PITMAN PERMUTATION TEST(2)
An EXACT RE-RANDOMISATION TEST in which the TEST STATISTIC is the
MEAN DIFFERENCE of a single sample of univariate data measured under
two circumstances as REPEATED MEASURES. Also see : PITMAN
PERMUTATION TEST(1).
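A minimal Python sketch of a test of this kind is given below. It assumes, as is usual for repeated measures but not stated in this entry, that the RANDOMISATION SET consists of every assignment of signs to the within-case differences. It uses the sum of differences, which is an EQUIVALENT TEST STATISTIC to the mean difference since the number of cases is fixed. The function name, the one-tailed rule and the example data are hypothetical.

```python
from itertools import product

def paired_one_tailed_alpha(before, after):
    """One-tailed ALPHA from exhaustive sign-flipping of paired differences."""
    differences = [a - b for a, b in zip(after, before)]
    observed = sum(differences)  # equivalent to the mean difference
    count = total = 0
    # Each case's difference may keep or reverse its sign: 2**n randomisations.
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        if sum(s * d for s, d in zip(signs, differences)) >= observed:
            count += 1
    return count / total

# Hypothetical data: four cases, differences 2, 3, 1 and 4.
before = [10, 12, 9, 11]
after = [12, 15, 10, 15]
print(paired_one_tailed_alpha(before, after))  # only 1 of the 16 sign patterns reaches the observed sum
```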
POISSON DISTRIBUTION
The distribution of number of events in a given time, arising from a
POISSON PROCESS. This differs from the BINOMIAL DISTRIBUTION in
that there is no upper limit, corresponding to the parameter 'n' of
a BINOMIAL PROCESS, to the number of events which may occur. Also
see : ODDS RATIO.
POISSON PROCESS
A process whereby events occur independently in some continuum (in
many applications, time), such that the overall density (rate) is
statistically constant but that it is impossible to improve any
prediction of the position (time) of the next event by reference to
the detail of any number of preceding observations. The
corresponding distribution of intervals between events is an
exponential distribution. The conventional example of a POISSON
PROCESS is concerned with occurrence of radioactive emissions in a
substantial sample of radioactive material with a half-life very much
longer than the total observation period. Also see : POISSON
DISTRIBUTION.
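Such a process may be simulated, as a rough sketch, by accumulating exponentially distributed intervals. The function name, the rate, the duration and the fixed SEED below are hypothetical values chosen for illustration.

```python
import random

def simulate_event_times(rate, duration, seed=0):
    """Return the times of events of a POISSON PROCESS within [0, duration).

    Intervals between successive events are drawn independently from an
    exponential distribution with the given rate.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

times = simulate_event_times(rate=2.0, duration=100.0)
# The count of events in the window follows a POISSON DISTRIBUTION
# whose mean is rate * duration (here 200).
print(len(times))
```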
POPULATION
A definable set of individual units to which the findings from
statistical examination of a SAMPLE subset are intended to be
applied. The POPULATION will generally much outnumber the SAMPLE.
In RE-RANDOMISATION STATISTICs the process of applying inferences
based upon the SAMPLE to the POPULATION is essentially informal.
Also see : REPRESENTATIVE.
This is the probability that a statistical test will detect a
defined pattern in data and declare the extent of the pattern as
showing STATISTICAL SIGNIFICANCE. POWER is related to TYPE-2 ERROR
by the simple formula : POWER = (1-BETA) ; the motive for this
re-definition is so that an increase in value for POWER shall
represent improvement of performance of a STATISTICAL TEST. For more
detail, see : BETA.
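
As a worked illustration of POWER = (1-BETA), the sketch below computes
the power of a one-sided exact binomial test. The choice of test, the
function names and all parameter values are assumptions made for this
example.

```python
from math import comb

def upper_tail(k, n, p):
    # Probability of k or more successes in n trials.
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

def binomial_power(n, p_null, p_alt, alpha):
    # Smallest count whose upper-tail probability under the null
    # hypothesis does not exceed alpha (n + 1 means "never reject").
    critical = next(k for k in range(n + 2)
                    if upper_tail(k, n, p_null) <= alpha)
    size = upper_tail(critical, n, p_null)   # actual TYPE-1 error rate
    power = upper_tail(critical, n, p_alt)   # probability of detection
    return critical, size, power

critical, size, power = binomial_power(n=20, p_null=0.5, p_alt=0.8, alpha=0.05)
beta = 1 - power  # TYPE-2 error rate
```

For these values the test rejects at 15 or more successes out of 20,
and BETA is simply whatever probability is left over from POWER.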
A sequence of instructions expressed in some PROGRAMMING LANGUAGE.
Also see ALGORITHM(2).
The characteristic of a COMPUTER which enables it to be used to
undertake a variety of different processes on different occasions.
Also see : ALGORITHM(2), PROGRAM, PROGRAMMING LANGUAGE, STANDARD
PROGRAMMING LANGUAGE.
A formal code for expressing to a COMPUTER how a certain process
should be undertaken. The translation from the code of the
PROGRAMMING LANGUAGE to the OBJECT CODE of the appropriate COMPUTER
is itself undertaken by a PROGRAM for that COMPUTER; the translation
program may take the form of either a COMPILER or an INTERPRETER.
Also see : ALGORITHM(1), ALGORITHM(2), PROGRAM, STANDARD PROGRAMMING
LANGUAGE.
A source of data which is effectively unpredictable although
generated by a determinate process. Successive PSEUDO-RANDOM data
are produced by a fixed calculation process acting upon preceding
data from the PSEUDO-RANDOM sequence. To start the sequence it is
necessary to decide arbitrarily upon a first datum, which is termed
the SEED value. Also see : BOOTSTRAP, MONTE-CARLO TEST.
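
The idea of a fixed calculation acting on the preceding datum can be
sketched with a linear congruential generator, one simple and
well-known scheme. The constants below are illustrative assumptions,
and this sketch is not suitable for serious statistical work.

```python
def lcg(seed, count, a=1103515245, c=12345, m=2 ** 31):
    """Return `count` successive values from a linear congruential sequence."""
    values = []
    state = seed
    for _ in range(count):
        # Each new datum is computed only from the preceding one.
        state = (a * state + c) % m
        values.append(state)
    return values

first_run = lcg(seed=42, count=5)
second_run = lcg(seed=42, count=5)   # same SEED, same sequence
other_run = lcg(seed=43, count=5)    # different SEED, different sequence
```

Repeating the SEED reproduces the sequence exactly, which shows that
the process is determinate even though its output looks unpredictable.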
R
A SAMPLE drawn from a POPULATION in such a way that every individual
of the POPULATION has an equal chance of appearing in the SAMPLE.
This ensures that the SAMPLE is REPRESENTATIVE, and provides the
necessary basis for virtually all forms of inference from SAMPLE to
POPULATION, including the informal inference which is characteristic
of RE-RANDOMISATION statistics. PSEUDO-RANDOM procedures can be
useful in defining a RANDOM SAMPLE.
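
A minimal sketch of using a PSEUDO-RANDOM procedure to define a RANDOM
SAMPLE, assuming the POPULATION can be listed explicitly. The
population, sample size and seed are hypothetical.

```python
import random

def draw_random_sample(population, sample_size, seed):
    """Draw units without replacement, each unit being equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

population = list(range(1, 101))  # hypothetical units numbered 1 to 100
sample = draw_random_sample(population, 10, seed=2024)
```

Recording the seed allows the same SAMPLE to be re-derived later.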
Generation of whole or part of the RANDOMISATION SET. Also see :
RANDOMISATION(3), RE-RANDOMISATION.
The process of arranging for data-collection, in accordance with the
EXPERIMENTAL DESIGN, such that there should be no foreseeable
possibility of any systematic relationship between the data and any
measurable characteristic of the procedure by which the data was
sampled. This is usually arranged by assigning experimental units
to groups, and REPEATED MEASURES to experimental units, on a
strictly random basis.
One of the arrangements making up the RANDOMISATION SET. These
arrangements will be encountered in the act of RANDOMISATION(1).
Also see : BRANCH-AND-BOUND, MINIMAL-CHANGE SEQUENCE.
A collection of values of the TEST STATISTIC obtained by undertaking
a number of RE-RANDOMISATIONS of the actual data within the
RANDOMISATION SET. Also see : CONFIDENCE INTERVAL, RANDOMISATION
TEST.
The collection of possible RE-RANDOMISATIONs of data within the
constraints of the EXPERIMENTAL DESIGN. Also see : RANDOMISATION
DISTRIBUTION.
The rationale of a RANDOMISATION TEST involves exploring
RE-RANDOMISATIONs of the actual data to form the RANDOMISATION
DISTRIBUTION of values of the TEST STATISTIC. The OUTCOME VALUE
of the TEST STATISTIC is judged in terms of its relative
position within the RE-RANDOMISATION DISTRIBUTION. If the OUTCOME
VALUE is near to one extreme of the RE-RANDOMISATION DISTRIBUTION
then it may be judged that it is in the extreme TAIL of the
distribution, with reference to a NOMINAL ALPHA CRITERION VALUE, and
thus judged to show STATISTICAL SIGNIFICANCE. Also see : EXACT
TEST(1).
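
The rationale above can be sketched as a MONTE-CARLO TEST, in which the
RANDOMISATION DISTRIBUTION is sampled rather than enumerated. The test
statistic (absolute difference of means), the number of trials, the
seed and the data are assumptions made for illustration.

```python
import random

def monte_carlo_randomisation_test(group_a, group_b, trials=2000, seed=0):
    """Estimate the tail proportion for the absolute difference of means."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    def abs_diff_of_means(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    outcome = abs_diff_of_means(pooled)  # OUTCOME VALUE from the actual data
    in_tail = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # one re-randomisation of the actual data
        if abs_diff_of_means(pooled) >= outcome - 1e-12:
            in_tail += 1
    return in_tail / trials

p_estimate = monte_carlo_randomisation_test([12, 15, 14], [9, 8, 11])
```

The returned proportion is then compared with the nominal alpha
criterion to judge STATISTICAL SIGNIFICANCE; being based on sampling,
it is an estimate rather than an exact value.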
This refers to the practice of taking a set of N data, to be
regarded as ORDINAL-SCALE, and replacing each datum by its rank (1
.. N) within the set. Also see : WILCOXON RANK-SUM TEST.
This is a type of MEASUREMENT SCALE for which it is meaningful to
reason in terms of differences in scores (see INTERVAL SCALE) and
also in terms of ratios of scores. Such a scale will have a zero
point which is meaningful in the sense that it indicates complete
absence of the property which the scale measures. The RATIO SCALE
may be either unipolar (negative values not meaningful) or bipolar
(both positive and negative values meaningful), and either
continuous or discrete.
The process of generating alternative arrangements of given data
which would be consistent with the EXPERIMENTAL DESIGN. Also see :
BOOTSTRAP, EXACT TEST(2), EXHAUSTIVE RE-RANDOMISATION, MONTE-CARLO,
RE-RANDOMISATION STATISTICS.
Also known as PERMUTATION or RANDOMISATION(1) statistics. These are
the specific area of concern of this present glossary.
A comparison of two or more statistical tests, for the same
EXPERIMENTAL DESIGN, SAMPLE SIZE, and NOMINAL ALPHA CRITERION VALUE,
in terms of the respective values of POWER. Also see : BETA.
This is a feature of an EXPERIMENTAL DESIGN whereby several
observations measured on a common scale refer to the same sampling
unit. Identification of the relation of the individual observations
to the EXPERIMENTAL DESIGN is crucial to this definition. Examples
: the measurement of water level at a particular site on several
systematically-defined occasions; measurement of reaction-time of an
individual using right hand and left hand separately. Also see :
INDEPENDENT GROUPS, REPLICATIONS, STRATIFIED.
This is a feature of an EXPERIMENTAL DESIGN whereby observations on
an experimental unit are repeated under the same conditions.
Identification of the position of a particular observation within
the sequence of replications is irrelevant. Also see : REPEATED
MEASURES, STRATIFIED.
Patterns in a SAMPLE of units may reasonably be attributed to the
POPULATION from which the SAMPLE is drawn, only if the SAMPLE is
REPRESENTATIVE. In practical terms, to ensure that a SAMPLE is
REPRESENTATIVE almost always means ensuring that it is a RANDOM
SAMPLE.
This is the name of an educational initiative involving the use of a
PROGRAMMING LANGUAGE, in the form of an INTERPRETER, allowing the
user to specify MONTE-CARLO RESAMPLING of a set of data and
accumulation of the RANDOMISATION DISTRIBUTION of a defined TEST
STATISTIC.
Acronym for Random Number Generator. This is a process which uses an
arithmetic algorithm to generate sequences of PSEUDO-RANDOM numbers.
Also see : SEED.
S
[Sacrowicz & Cohen()] This is a
TAIL DEFINITION POLICY which
asserts that the ALPHA value should be
A set of individual units, drawn from some definable POPULATION of
units, and generally a small proportion of the POPULATION, to be
used for a statistical examination of which the findings are
intended to be applied to the POPULATION. It is essential for such
inference that the SAMPLE should be REPRESENTATIVE. In
RE-RANDOMISATION STATISTICS the process of applying inferences based
upon the SAMPLE to the POPULATION is essentially informal.
The number of experimental units on which observations are
considered. This may be less than the number of observations in a
data-set, due to the possible multiplying effects of multiple
variables and/or REPEATED MEASURES within the EXPERIMENTAL DESIGN.
See MEASUREMENT TYPE.
See PSEUDO-RANDOM.
[()]. ALGORITHMs employing BRANCH-AND-BOUND methods for the PITMAN
PERMUTATION TEST(1) and the PITMAN PERMUTATION TEST(2).
See : STATISTICAL SIGNIFICANCE.
See ALPHA.
A PROGRAMMING LANGUAGE which has a publicly agreed common form
across several different types of COMPUTER. Such standardisation
allows a PROGRAM to be transported conveniently between the
different types of COMPUTER and is thus suitable for communicating
general ideas about programming. Some STANDARD PROGRAMMING
LANGUAGES relevant to the present context are : FORTRAN, PASCAL,
'C'. There are a number of widely available programs for
translating SOURCE PROGRAMS from one STANDARD PROGRAMMING LANGUAGE
to another - e.g. the program PAS2C which translates source code
from PASCAL to 'C'. Also see : ALGORITHM(2), ANSI, ISO.
A number or code derived by a prior-defined consistent process of
calculation, from a set of data. Also see : ALGORITHM(1), TEST
STATISTIC.
See : ALPHA, NOMINAL ALPHA CRITERION LEVEL.
[()] This is a widely-observed scheme of distinctions between types of
MEASUREMENT SCALEs according to the meaningfulness of arithmetic
which may be performed upon data values. The types are : NOMINAL
SCALE versus ORDINAL SCALE versus INTERVAL SCALE versus RATIO SCALE.
This is a feature of an EXPERIMENTAL DESIGN whereby a scheme of
observations is repeated entirely using further sets (strata) of
experimental units, with each such further set distinguished by a
level of a categorical variable which is distinct from any
categorical variables used to define the EXPERIMENTAL DESIGN within a
single set (stratum). The data from the various strata are regarded
as distinct. This situation occurs when attempting to make
inferences based upon the results of several similar independent
experiments. Also see : REPEATED MEASURES, REPLICATIONS.
T
An area at the extreme of a RANDOMISATION DISTRIBUTION, where the
degree of extremity is sufficient to be notable judged against some
NOMINAL ALPHA CRITERION VALUE. Also see : BRANCH-AND-BOUND,
RE-RANDOMISATION TEST, TAIL DEFINITION POLICY.
This is a defined method for dividing a DISCRETE DISTRIBUTION into a
TAIL area and a body area. The scope for differing policies arises
due to the non-infinitesimal amount of probability measure which may
be associated with the ACTUAL OUTCOME value. The conventional
policy, based upon considerations of simplicity and of conservatism
in terms of ALPHA, is to include the whole of the weight of outcomes
equal to the ACTUAL OUTCOME as part of the TAIL. Also see MID-P,
SACROWICZ & COHEN.
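
The difference between policies can be sketched for an upper TAIL as
follows. The sketch assumes the usual MID-P convention of counting half
of the weight of outcomes equal to the actual outcome; the function
name and the example distribution are hypothetical.

```python
def tail_probabilities(distribution, outcome):
    """Return (conventional, mid_p) upper-tail proportions."""
    total = len(distribution)
    beyond = sum(1 for value in distribution if value > outcome)
    equal = sum(1 for value in distribution if value == outcome)
    # Conventional policy: all of the tied weight belongs to the TAIL.
    conventional = (beyond + equal) / total
    # MID-P policy (assumed form): half of the tied weight belongs to it.
    mid_p = (beyond + 0.5 * equal) / total
    return conventional, mid_p

# Hypothetical randomisation distribution of ten equally likely values.
conventional, mid_p = tail_probabilities([1, 2, 2, 3, 3, 3, 4, 4, 5, 5], 4)
```

Here two values exceed the outcome and two equal it, so the
conventional tail is 4/10 and the mid-p tail is 3/10.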
A STATISTIC measuring the strength of the pattern which a
statistical test undertakes to detect. In the context of
RE-RANDOMISATION TESTS one is concerned with the distribution of the
values of the TEST STATISTIC over the RANDOMISATION SET. An example
of a TEST STATISTIC is the DIFFERENCE OF MEANS as employed in the
PITMAN PERMUTATION TEST. Also see : EXACT TEST(1), OUTCOME VALUE.
In a NONPARAMETRIC TEST involving RANKED DATA, if two data have TIED
VALUES then they will deserve to receive the same rank value. It is
generally agreed that this should be the average of the ranks which
would have been assigned if the values had been discernibly unequal.
Thus, the ranks assigned to a set of 6 data, with ties present
might emerge as sets such as : 1,3,3,3,5,6 or 1,2,3.5,3.5,5,6. The
possibility of TIED RANKS leads to elaborations in the
otherwise-standard tasks of computing or tabulating RANDOMISATION
DISTRIBUTIONS where data are replaced by ranks.
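
The averaging rule can be sketched as below; the function name and the
data values are made up, chosen so that the results reproduce the two
rank sets quoted above.

```python
def average_ranks(values):
    """Rank data from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values tied with `start`.
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        # Mean of the 1-based positions start+1 .. end+1.
        shared = (start + end + 2) / 2
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

three_way_tie = average_ranks([10, 20, 20, 20, 30, 40])
two_way_tie = average_ranks([10, 20, 30, 30, 40, 50])
```

The first call yields ranks 1, 3, 3, 3, 5, 6 and the second yields
1, 2, 3.5, 3.5, 5, 6.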
Where data are represented by ranks, TIED VALUES lead to TIED RANKS.
Whether or not data are represented by ranks, for any TEST STATISTIC
the occurrence of TIED VALUES will increase the extent to which a
RANDOMISATION DISTRIBUTION will be a DISCRETE DISTRIBUTION rather
than a CONTINUOUS DISTRIBUTION.
A representation of suitable data in a table organised as rows and
columns, such that the rows represent one scheme of alternatives
covering the whole of the data represented, the columns
represent a further scheme of alternatives covering the whole of the
data represented, and the entries in the TWO-WAY TABLE are the
counts of numbers of observations conforming to the respective cells
of the two-way classification.
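
A small sketch of building such a table from raw observations, each
recorded as a (row level, column level) pair. The function name, the
level names and the data are hypothetical.

```python
from collections import Counter

def two_way_table(observations, row_levels, col_levels):
    """Count observations falling in each cell of the classification."""
    counts = Counter(observations)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]

observations = [
    ("treated", "improved"), ("treated", "improved"),
    ("treated", "unchanged"), ("control", "improved"),
    ("control", "unchanged"), ("control", "unchanged"),
]
table = two_way_table(observations,
                      row_levels=["treated", "control"],
                      col_levels=["improved", "unchanged"])
```

Every observation is counted in exactly one cell, so the entries sum
to the number of observations.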
See : ALPHA.
See : WILCOXON TEST(1), WILCOXON TEST(2).
[Named after the statistician F. Wilcoxon ()] This test applies to
an EXPERIMENTAL DESIGN involving two REPEATED MEASURE observations
on a common set of experimental units, which need be only
ORDINAL-SCALE. The purpose is to measure shift in scale location
between the two levels of the REPEATED MEASURE distinction. The
TEST STATISTIC is derived from the set of differences between the
two levels of the REPEATED MEASURE distinction - one difference
score for each observational unit. The procedure is somewhat
variable between authors, although the variants each correspond to
valid well-sized EXACT TEST(1)s. Wilcoxon's original procedure
commences by discarding entirely the observations from any
experimental units for which the data values are equal at each level
of the REPEATED MEASURE comparison. Thus or otherwise, the next
step is RANKING the differences, providing a rank for each retained
experimental unit; the ranks are according to the absolute values of
the differences. The ranks are summed separately into two or three
categories : negative differences; zero differences (if any);
positive differences. The TEST STATISTIC is the smaller of the
outer categories, plus an adjustment for the middle
(zero-difference) category. Also see : PITMAN PERMUTATION TEST(2).
[Named after the statistician F. Wilcoxon ()] This is a test for an
EXPERIMENTAL DESIGN involving two INDEPENDENT GROUPS of experimental
units, where data need be only ORDINAL-SCALE. The purpose is to
measure shift in scale location between the two groups. The TEST
STATISTIC is the sum, for a nominated group, of the ranks of the
data for the groups combined. This test has an EQUIVALENT TEST
STATISTIC to that for the MANN-WHITNEY TEST, so the two tests must
always agree. Also see : PITMAN PERMUTATION TEST(1).
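
The equivalence can be illustrated with a short sketch, assuming data
without ties. The rank sum W for a group of size n and the Mann-Whitney
count U for the same group differ only by the constant n(n+1)/2, so
they order the re-randomisations identically. Names and data are
hypothetical.

```python
def rank_sum_and_u(group_a, group_b):
    """Return (W, U derived from W, U counted directly); assumes no ties."""
    pooled = sorted(list(group_a) + list(group_b))
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}
    # Wilcoxon statistic: sum of the combined ranks of the nominated group.
    w = sum(rank_of[value] for value in group_a)
    n_a = len(group_a)
    u_from_w = w - n_a * (n_a + 1) // 2
    # Mann-Whitney statistic: number of (a, b) pairs in which a exceeds b.
    u_direct = sum(1 for a in group_a for b in group_b if a > b)
    return w, u_from_w, u_direct

w, u_from_w, u_direct = rank_sum_and_u([7, 3, 10], [5, 8, 1, 12])
```

For these data the nominated group has ranks 4, 2 and 6, so W is 12 and
both routes give U equal to 6.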
See : TWO-WAY TABLE.
This is a TWO-WAY TABLE where the numbers of levels of the row- and
column-classifications are each 2. If the row- and column-
classifications each divide the observational units into subsets,
then it is likely that it will be useful to analyse the data using
the FISHER TEST(1).
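
A minimal sketch of such an analysis, assuming that FISHER TEST(1)
refers to the usual Fisher exact test with both sets of marginal totals
held fixed. The function name, the one-sided tail and the example
counts are assumptions made for illustration.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(top-left count >= a) for the table [[a, b], [c, d]], margins fixed."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    # Number of equally likely ways of choosing the first column's units.
    arrangements = comb(n, col1)
    largest = min(row1, col1)
    in_tail = sum(comb(row1, x) * comb(row2, col1 - x)
                  for x in range(a, largest + 1))
    return in_tail / arrangements

p = fisher_one_sided(3, 1, 1, 3)
```

For the table with rows (3, 1) and (1, 3) there are 70 arrangements, of
which 17 have a top-left count of 3 or more.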
see also,
Internet Glossary of Statistical Terms