David A. Kenny
January 8, 1999

Glossary of Experimental and Quasi-Experimental Terms

alpha:  the probability of making a Type I error (rejecting the

          null hypothesis when it is true)

anti-compensatory program:  experimental participants outscore

          the control participants on pretreatment measures for

          which higher scores mean "better"

ARIMA model:  autoregressive, integrated, moving average model

          for the analysis of time-series data

assignment variable:  variable correlated with the outcome and

          used to assign persons to treatment groups; the source

          of non-randomized selection effects

autocorrelation:  in a time series, the degree of correlation

          between two time points separated by a fixed length of

          time or lag

autocorrelogram:  a graph of the strength of autocorrelation as a

          function of lag length
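
The autocorrelation and autocorrelogram entries can be illustrated with a short Python sketch. The function name, the sum-of-squares form of the estimator, and the example series are assumptions made for illustration, not part of the glossary.

```python
from statistics import fmean

def autocorrelation(series, lag):
    """Lag-k autocorrelation: cross-products of deviations `lag` steps
    apart, divided by the total sum of squared deviations."""
    n = len(series)
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / denom

# Hypothetical series that repeats every 4 observations (a cycle)
series = [1, 2, 3, 2] * 6

# The autocorrelogram is these values plotted against lag length
correlogram = {lag: round(autocorrelation(series, lag), 3) for lag in range(5)}
print(correlogram)
```

Because the series cycles every four observations, the autocorrelation is strongly negative at lag 2 and strongly positive at lag 4.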

autoregressive model:  for time-series and longitudinal data, the

          current value is assumed to be a function of the

          previous values plus a random component

autoregressive coefficient:  the degree to which the current

          value is influenced by the previous value

blocking:  matching using a categorical variable; term often used

          in randomized experiments

change score analysis:  the use of change as the outcome in

          over-time analyses

Cohen's d:  mean difference between two treatment groups divided

          by the pooled within-groups standard deviation; a

          measure of effect size
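
A minimal sketch of the Cohen's d computation follows. The function name and the scores are hypothetical; the pooled standard deviation uses the usual degrees-of-freedom weighting.

```python
from math import sqrt
from statistics import fmean, variance

def cohens_d(treated, control):
    """Mean difference divided by the pooled within-groups standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * variance(treated) +
                  (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (fmean(treated) - fmean(control)) / sqrt(pooled_var)

# Hypothetical scores for two small groups
treated = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
d = cohens_d(treated, control)
print(round(d, 3))
```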

compensatory program:  control participants outscore the

          experimental participants on pretreatment measures for

          which higher scores mean "better"

construct validity:  the measure actually taps the intended

          theoretical construct

control group:  persons who have not received the treatment to

          whom the treated group is compared

correlation:  the degree of linear association between two

          standardized variables; the slope divided by the

          perfect slope; ranges from minus one to plus one;

          measure of effect size
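
The "slope divided by the perfect slope" reading of the correlation can be checked numerically. In this sketch the perfect slope is taken to be the standard deviation of the criterion divided by that of the predictor; the data and function name are hypothetical.

```python
from statistics import fmean, stdev

def slope(x, y):
    """Least-squares slope for predicting y from x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

b = slope(x, y)                      # observed slope
perfect_slope = stdev(y) / stdev(x)  # slope if the correlation were perfect
r = b / perfect_slope                # correlation
print(round(r, 3))
```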

covariate:  a measure that is correlated with the outcome but not

          affected by the treatment or the outcome

cross-lagged panel correlation (CLPC):  a method for ruling out

          the plausible rival hypothesis of spuriousness using

          longitudinal data

cycle:  in a time series, observations separated by a constant

          interval tend to be similar to one another

effect size:  the magnitude of the standardized effect of a

          treatment variable on an outcome

error variance:  in psychometrics, the variance in a measure not

          due to true variance; it is estimated by the

          measure's variance times one minus the measure's

          reliability; alternatively in modeling, the unexplained

          variance in a variable
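
The psychometric estimate above, together with the complementary estimate of true variance (variance times reliability), amounts to splitting the observed variance in two. A small sketch with hypothetical numbers:

```python
def variance_components(observed_variance, reliability):
    """Split a measure's variance into true and error parts."""
    true_var = observed_variance * reliability
    error_var = observed_variance * (1 - reliability)
    return true_var, error_var

# Hypothetical measure: variance 25, reliability .80
true_var, error_var = variance_components(observed_variance=25.0, reliability=0.8)
print(round(true_var, 2), round(error_var, 2))
```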

external validity:  the generalizability of the results from a

          study; a threat is the interaction of treatment with

          another variable

fishing:  multiple significance tests of essentially the same

          hypothesis with no adjustment in alpha

Galton squeeze diagram:  a pair-link diagram in which the levels

          of one variable are connected to the means of the other

          variable

guesstimate:  through a visual examination, not mathematical

          computation, a statistic (e.g., a mean or slope) is

          estimated

history:  the plausible rival hypothesis that change is due to

          some intervening event and not the treatment

horizontal squeeze plot:  a scatter plot in which means or

          guesstimates are computed for each value on the

          variable on the vertical axis; the scatter plot is

          squeezed horizontally

instrumentation:  the plausible rival hypothesis that a change in

          an outcome is due to a change in the calibration of the

          measuring device

internal validity:  valid causal inference; estimating the effect

          due to a treatment; threatened by plausible rival

          hypotheses such as regression toward the mean

interrupted time-series design:  a time series in which the

          initial observations serve as control and after an

          intervention is introduced the remaining observations

          are experimental

lag:  the time interval between measurements

latent variable:  a theoretical construct that is imperfectly

          measured by one or more indicators

linearity:  the assumption that the relationship between two

          variables can be best fitted by a straight line   

Lord's paradox:  when treatment groups differ on a pretreatment

          measure, covarying out that measure and change score

          analysis yield different conclusions
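
Lord's paradox can be reproduced with a tiny constructed data set. The sketch below is an illustration under stated assumptions: the scores are hypothetical, and "covarying out" the pretest is implemented as adjusting the posttest difference by the pooled within-group slope.

```python
from statistics import fmean

# Hypothetical (pretest, posttest) pairs; the groups differ on the pretest
groups = {
    "control": [(30, 35), (40, 40), (50, 45)],
    "treated": [(50, 55), (60, 60), (70, 65)],
}

def change_score_effect(groups):
    """Difference between groups in mean (posttest - pretest) change."""
    change = {g: fmean([y - x for x, y in rows]) for g, rows in groups.items()}
    return change["treated"] - change["control"]

def covariance_adjusted_effect(groups):
    """Posttest difference adjusted by the pooled within-group slope."""
    sxy = sxx = 0.0
    means = {}
    for g, rows in groups.items():
        mx = fmean([x for x, _ in rows])
        my = fmean([y for _, y in rows])
        means[g] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in rows)
        sxx += sum((x - mx) ** 2 for x, _ in rows)
    b = sxy / sxx
    (mx_t, my_t), (mx_c, my_c) = means["treated"], means["control"]
    return (my_t - my_c) - b * (mx_t - mx_c)

change_effect = change_score_effect(groups)
adjusted_effect = covariance_adjusted_effect(groups)
print(change_effect, adjusted_effect)
```

Neither group changes on average, so change score analysis finds no effect, while the covariance adjustment removes only half of the 20-point pretest gap (the within-group slope is 0.5) and reports an effect of 10.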

matching:  measuring the treatment effect across equivalent

          scores on a third variable to reduce, but not likely

          eliminate, bias due to selection

maturation:  the plausible rival hypothesis that change is due to

          natural growth and not the intervention

measurement error:  the random, unsystematic component in a

          measure

mega-covariate:  a covariate that is formed by combining the

          values of two or more covariates

mortality:  the plausible rival hypothesis that persons who leave

          the experimental and control groups do so for different

          reasons; a type of selection effect

multilevel modeling:  a statistical method for the analysis of

          data at two or more levels, e.g., children and

          classrooms or persons and times

multiple regression:  a statistical technique for the

          simultaneous estimation of the effects of several

          predictors that add together

multitrait-multimethod matrix (MTMM):  a correlation matrix

          between a set of variables (i.e., traits) all measured

          by the same set of methods

nonequivalent control group design:  treatment and control groups

          are non-randomly formed and persons are pre- and

          posttested

null hypothesis:  the hypothesis that some population value

          (e.g., a mean difference, a correlation, a regression

          coefficient) equals some particular value (usually

          zero)

omitted variable:  variable not controlled in the statistical

          analysis that causes the outcome and the assignment

          variable; the source of selection effects

overadjustment:  the estimated treatment effect is biased in the

          direction of the difference on the covariate (the

          covariate being scaled to correlate positively with the

          outcome)

over-fitted regression line:  if X is used to predict Y, the

          predicted values of Y for each value of X, connected by

          a line

pair-link diagram:  graph of two-variable association; two

          vertical lines, one for each variable, and the scores

          represented by a line connecting these two vertical

          lines

parallel test:  a second measure of the same construct that has

          the same amount of true and error variance and

          sometimes is assumed to have the same mean

perfect-correlation:  the slope if there were a perfect

          correlation; a slope of the standard deviation of the

          criterion divided by the standard deviation of the

          predictor

plausible rival hypothesis:  a threat to internal validity; an

          alternative explanation of the treatment effect

power:  the probability of rejecting the null hypothesis when it

          is false; one minus the probability of making a Type II

          error

pre-post design:  a group of persons is measured before and after

          receiving a treatment

pretest:  a prior measure of the outcome

proximal autocorrelation:  the correlation between observations

          separated by shorter time lags is larger than the

          correlation between observations separated by longer lags

quasi-simplex:  a simplex correlational structure that is

          attenuated by measurement error

random assignment:  the assignment of persons into treatment

          groups by a random rule; persons have a fixed

          probability of being assigned to a treatment group

random selection:  the selection of persons into the study

          randomly from some specified population

randomized experiments:  studies in which units are randomly

          assigned to treatments

regression discontinuity design:  persons are assigned to

          treatment groups on the basis of a measured variable

regression line:  if X is used to predict Y, the line that

          minimizes the sum of squared errors of prediction

regression toward the mean:  because of a less than perfect

          correlation, the predicted score of a variable is not

          as extreme in terms of standard score units as the

          predictor variable in standard score units
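
In standard score units, the regression prediction is the correlation times the predictor's standard score. A short sketch with hypothetical values (both measures with mean 100 and standard deviation 15, correlation .50):

```python
def predicted_score(x, mean_x, sd_x, mean_y, sd_y, r):
    """Regression prediction of Y from X, with both standard scores."""
    z_x = (x - mean_x) / sd_x   # standard score of the predictor
    z_pred = r * z_x            # predicted standard score of Y
    return mean_y + z_pred * sd_y, z_x, z_pred

# A person scoring 130 on X (hypothetical values throughout)
pred, z_x, z_pred = predicted_score(130, 100, 15, 100, 15, 0.5)
print(pred, z_x, z_pred)
```

A person two standard deviations above the mean on X is predicted to be only one standard deviation above the mean on Y.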

reliability:  the proportion of variance in a measure that is

          true, commonly estimated by an internal consistency

          measure

scatter plot:  a graph in which the axes are two variables and

          the points represent the scores of individuals on the

          two variables

selection:  the plausible rival hypothesis that the treatment

          difference is due to a pre-existing difference on some

          unknown variable; that unknown variable is called the

          assignment variable

selection by maturation:  persons at the different levels of the

          assignment variable are changing at different rates

selection by regression:  persons at the different levels of the

          assignment variable are regressing to different means

shrinkage:  the variance of predicted scores using the standard

          regression prediction formula must be less than or

          equal to the variance of the observed scores; how much

          less depends on the correlation between the prediction

          and the score being predicted
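
Shrinkage can be verified directly: with least-squares prediction, the variance of the predicted scores equals the squared correlation times the variance of the observed scores. The data and function name below are hypothetical.

```python
from statistics import fmean, variance

def least_squares_predictions(x, y):
    """Predicted Y values from the least-squares regression of Y on X."""
    mx, my = fmean(x), fmean(y)
    b = (sum((a - mx) * (c - my) for a, c in zip(x, y)) /
         sum((a - mx) ** 2 for a in x))
    return [my + b * (a - mx) for a in x]

# Hypothetical paired scores; their correlation is .80
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

predicted = least_squares_predictions(x, y)
ratio = variance(predicted) / variance(y)  # equals the squared correlation
print(round(ratio, 2))
```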

simplex:  the correlational structure that results from a

          first-order autoregressive model; the resulting

          structure is proximally autocorrelated

spuriousness:  the covariation between two variables is not due

          to one causing the other, but rather due to the

          variables both being caused by a third variable

standardization:  the transformation of a variable so that its

          mean is zero and its variance is one; Z scoring

standardized change score analysis:  the components of a change

          score are made to have equal variance

          through standardization; the formula for this analysis

          is Y - (sY/sX)X
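
The formula Y - (sY/sX)X can be applied as in the following sketch, where X is the pretest, Y the posttest, and the scores are hypothetical.

```python
from statistics import stdev

def standardized_change_scores(pre, post):
    """Compute Y - (sY/sX)X for each person."""
    ratio = stdev(post) / stdev(pre)
    return [y - ratio * x for x, y in zip(pre, post)]

# Hypothetical scores in which the posttest is more variable than the pretest
pre = [10, 12, 14, 16, 18]
post = [20, 25, 30, 35, 40]

raw_change = [y - x for x, y in zip(pre, post)]
std_change = standardized_change_scores(pre, post)
print(raw_change)
print([round(s, 6) for s in std_change])
```

Raw change grows with the pretest here only because the posttest is more spread out; after rescaling, every person has the same standardized change.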

stationarity:  parameters do not change over time; e.g., the mean

          and the standard deviations of the pretest and the

          posttest are the same

statistical equating:  using multiple regression in an attempt to

          control presumed selection variables

structural equation modeling:  models with a causal structure

          between latent variables

synchronous correlation:  the correlation between two variables

          measured at the same time

testing:  the plausible rival hypothesis that the process of

          being measured affects subsequent measurements

time-reversed analysis:  the analysis of data switching the flow

          of time and determining if the results change

time series:  data from a single unit that are temporally ordered

trait-state-error model:  a model of change with three

          components: a trait or unchanging variable, a state or

          autoregressive component, and an error or random

          component

treatment:  an experimental intervention or program; the variable

          that contrasts the two groups in an evaluation

trend:  a constant change in a variable over time

true-score estimate:  given an observed score, the predicted true

          score is regressed or shrunk toward the mean using the

          formula:  MX + rX(X - MX) where rX is the measure's

          reliability

true variance:  the portion of variance in a measure that is not

          error; estimated by the measure's variance times its

          reliability

Type I error:  rejecting the null hypothesis when it is true;

          probability denoted as alpha

Type II error:  not rejecting the null hypothesis when it is

          false; its probability denoted as beta and power equals

          one minus beta
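
The relation among alpha, beta, and power can be made concrete for one simple case. The sketch assumes a one-sided, one-sample z test with a known standard deviation; that choice of test, the function name, and the numbers are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided z test for a standardized mean shift."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # rejection cutoff under the null
    return 1 - z.cdf(critical - effect_size * sqrt(n))

# Hypothetical study: standardized effect of .50 with 25 persons
power = z_test_power(effect_size=0.5, n=25)
beta = 1 - power  # probability of a Type II error
print(round(power, 2), round(beta, 2))
```

When the effect size is zero the null hypothesis is true, and the rejection probability returned by the function is simply alpha.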

underadjustment:  the estimated treatment effect is biased in the

          direction opposite from the difference on the covariate

          (the covariate scaled to correlate positively with the

          outcome)

vertical squeeze plot:  a scatter plot in which means or

          guesstimates are computed for each value of the

          variable on the horizontal axis; the scatter plot is

          squeezed vertically

zero-correlation line:  a flat line which intersects the mean of

          the variable being predicted
