Unit of Analysis
This page has been translated into Polish and Czech.
This page provides the practicing researcher with guidance
concerning the choice of the unit of analysis in statistical analysis. I thank Charles
Judd for helping me with many of the ideas on this page. Any feedback,
either technical or pedagogical, would be most appreciated.
Outline
Statement of the Problem
Independence of Units
The Measurement of Nonindependence
Unit of Generalization
The researcher should be aware of the ecological fallacy
(Robinson, 1950). The conclusions drawn from an analysis conducted at the
group level may not apply at the individual level. Conversely, conclusions
drawn at the individual level may not apply at the group level. In principle,
the analysis should be conducted at the level at which generalizations
are to be made. However, there are exceptions to this rule.
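The ecological fallacy can be illustrated with a small, made-up data set. The numbers and the names `groups` and `pearson` below are hypothetical and are not taken from Robinson (1950). In this sketch the correlation computed on group means is perfectly positive, while the correlation within every group is perfectly negative, so a conclusion drawn at one level would be wrong at the other.

```python
from statistics import mean

def pearson(xs, ys):
    """Ordinary Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: each group is a list of (x, y) pairs for its members.
groups = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Group-level analysis: correlate the group means.
mean_x = [mean(x for x, _ in members) for members in groups.values()]
mean_y = [mean(y for _, y in members) for members in groups.values()]
group_r = pearson(mean_x, mean_y)

# Within-group analysis: correlate deviations from each group's own mean.
dev_x, dev_y = [], []
for members, gx, gy in zip(groups.values(), mean_x, mean_y):
    dev_x += [x - gx for x, _ in members]
    dev_y += [y - gy for _, y in members]
within_r = pearson(dev_x, dev_y)

print(f"group-level r = {group_r:.2f}, within-group r = {within_r:.2f}")
# prints: group-level r = 1.00, within-group r = -1.00
```

Real data are rarely this extreme, but the two correlations need not agree in sign or size.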
Unit of Measurement
A related issue is that sometimes a researcher aggregates
across units (i.e., averages) and so changes the unit of measurement.
For example, to measure organizational climate, the mean of individual
measures might be used. That the mean is at the level of
the organization does not imply that the variable, in fact, operates at that level.
Unit of Assignment or Sampling
How Do I Conduct the Analysis?
Sometimes, rules about the unit of assignment and the
unit of generalization will be violated. For instance, classrooms
may be the unit of assignment, but if there is no evidence of nonindependence
due to classroom, person can be the unit of analysis. Alternatively,
if there is evidence that observations within classrooms are nonindependent, then person should
not be the unit of analysis, even if person is the unit of generalization.
Because all of the variation in treatment is between classrooms (recall
that classroom is the unit of assignment), the treatment's effect
will be seen in between-classroom variation, not within-classroom variation.
In statistical analysis, it is sometimes not clear what
the appropriate level of analysis is. For instance, persons are in groups
(e.g., children in classrooms), and either person or group could be the
unit of analysis. (The group becomes the unit of analysis when one computes
a mean across the persons who are members of the group.) Sometimes
the two units are crossed instead of nested; for example, 30 judges rate
20 targets. Target, rater, or even observation could be the
unit of analysis. Because nesting (e.g., children in classrooms)
is much more common than crossing, nesting is generally assumed in the
following discussion.
At the heart of statistical analysis is replication or the repeated observation of a phenomenon.
For a replication to be a true replication, there must be independence
of observations. (For example, duplicating your data is not replication!)
Independence of observations is presumed in standard measures of variability.
Independence requires that any two observations be no more likely to be
similar (or different) than any other two observations. There are
several factors that make units nonindependent (Kenny & Judd, 1986).
Observations can be nonindependent because of compositional effects, common
fate, and social interaction.
Using path analysis notation, a compositional effect is a
curved line between a pair of observations, common fate is spuriousness (the observations are caused by a common variable), and social
interaction is a direct effect. The nonindependence would be positive
if the nonindependent observations were more similar than independent observations;
the nonindependence would be negative if the nonindependent observations
were more different than independent observations. The degree of
nonindependence can be viewed as a correlation coefficient, though it is
not usually measured by an ordinary Pearson product-moment correlation.
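One way to see why nonindependence matters for standard measures of variability is to look at the variance of a group mean. The formula below is a standard textbook result rather than something stated on this page, and the symbols are assumptions introduced for illustration: n observations in a group, each with variance \sigma^2, and a common correlation \rho between every pair of observations.

```latex
\operatorname{Var}(\bar{Y}) \;=\; \frac{\sigma^{2}}{n}\,\bigl[\,1 + (n-1)\,\rho\,\bigr],
\qquad -\frac{1}{n-1} \;\le\; \rho \;\le\; 1 .
```

With independent observations (\rho = 0) this reduces to the usual \sigma^2 / n. Under these assumptions, positive nonindependence makes the mean more variable than the usual formula implies, and negative nonindependence makes it less variable.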
To determine the unit of analysis, an assessment of whether
observations are independent is often helpful. That is, observations that
are thought to be nonindependent may in fact be independent. The measurement
of nonindependence can be complicated, but in many cases an intraclass
correlation can be used to measure the degree of nonindependence. (Read
about this measure for dyads.) This measure is appropriate when
groups of observations are all linked to one another. Kenny and Judd
(1996) discuss a wide variety of measures of nonindependence.
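As a minimal sketch of how an intraclass correlation might be computed, the function below uses the common one-way analysis of variance estimator for equal-sized groups. The choice of this particular estimator, the function name `icc_oneway`, and the example scores are assumptions made for illustration; the page itself does not specify a formula.

```python
from statistics import mean

def icc_oneway(groups):
    """One-way ANOVA estimate of the intraclass correlation (equal group sizes)."""
    k = len(groups)      # number of groups
    n = len(groups[0])   # members per group, assumed equal across groups
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Mean square between groups, on k - 1 degrees of freedom.
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    # Mean square within groups, on k * (n - 1) degrees of freedom.
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Hypothetical scores for three dyads.
similar = [[1, 2], [3, 4], [5, 6]]      # members of a dyad resemble each other
dissimilar = [[1, 6], [2, 5], [3, 4]]   # members of a dyad differ from each other

print(round(icc_oneway(similar), 3))     # 0.882
print(round(icc_oneway(dissimilar), 3))  # -1.0
```

The first value reflects positive nonindependence and the second negative nonindependence. With so few groups the estimate is very imprecise, so this only shows the arithmetic.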
Another factor in deciding the unit of analysis is the
level of generalization that the researcher seeks to make. Consider a researcher who
measures 10 children in each of 10 classrooms in each of 10 different schools, or 1000 children in all.
There are three possible levels of generalization: the student, the classroom,
and the school. One simple rule is to conduct the analysis at the level
at which one wants to make generalizations. So if one wants to draw
conclusions about persons, person should be the unit of analysis.
However, as will be seen, this simple rule cannot always be followed.
Another consideration is the unit of measurement. Again
returning to the example of children, classroom, and school, some variables
may be measured on children (e.g., achievement), some on the classroom
(e.g., teacher's gender), and some on the school (e.g., school size).
Just because one measures a variable at a certain level does not imply
that the variable operates at that level. Consider the variable group
size. Presumably this variable operates at the group level.
However, if a researcher changed the unit of measurement of the variable
and asked persons how big the group was, the variable would still likely
operate at the group level, not at the individual level.
A final consideration in the decision about the unit
of analysis is design factors. It is necessary to consider the unit
by which observations are selected to enter the study or are assigned to
levels of the independent variable. A good idea is to perform the
statistical analysis at the level of the selection or assignment. So, for
instance, if floors in a dormitory are assigned to experimental conditions,
dormitory floor, not person, should be the unit of analysis. This is not
a "hard-and-fast rule," just a helpful guideline. For instance, individuals may be the unit of assignment, but if individuals interact with one another, then it may not be possible to use individual as the unit of analysis.
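The dormitory example can be sketched as follows: persons are first averaged within each floor, and the conditions are then compared using the floor means as the observations. The records, the variable names, and the use of a pooled-variance t statistic are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical records: (floor, condition assigned to that floor, person's score).
records = [
    ("floor1", "treatment", 7), ("floor1", "treatment", 8), ("floor1", "treatment", 9),
    ("floor2", "treatment", 5), ("floor2", "treatment", 6), ("floor2", "treatment", 7),
    ("floor3", "control", 4), ("floor3", "control", 5), ("floor3", "control", 6),
    ("floor4", "control", 2), ("floor4", "control", 3), ("floor4", "control", 4),
]

# Step 1: aggregate persons to one mean per floor (the unit of assignment).
scores_by_floor = defaultdict(list)
condition_of = {}
for floor, condition, score in records:
    scores_by_floor[floor].append(score)
    condition_of[floor] = condition
floor_means = {floor: mean(scores) for floor, scores in scores_by_floor.items()}

# Step 2: compare conditions using the floor means as the observations.
treated = [m for f, m in floor_means.items() if condition_of[f] == "treatment"]
control = [m for f, m in floor_means.items() if condition_of[f] == "control"]
df = len(treated) + len(control) - 2
pooled_var = ((len(treated) - 1) * variance(treated)
              + (len(control) - 1) * variance(control)) / df
std_error = (pooled_var * (1 / len(treated) + 1 / len(control))) ** 0.5
t = (mean(treated) - mean(control)) / std_error

print(f"t({df}) = {t:.2f}")  # t(2) = 2.12
```

Note that the test has 2 degrees of freedom (four floors), not the 10 that would result from treating the 12 persons as the unit of analysis.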
There are three major approaches to the unit of analysis
question when persons are nested within groups (or observations are nested
within persons); the discussion here is based on Kenny (1996).
There are then two key questions in determining the unit
of analysis. First, a determination must be made about the lowest
level of units that are independent. Often statistical analysis is
necessary to determine the extent to which units are independent (though
this can be tricky: see Kenny, Kashy, & Bolger's (1998) concept of
"consequential nonindependence"). Second, a determination must be
made about the degree of variation in the causal variable. If most
of its variation is between the nonindependent units, then aggregation
or averaging should be used. If not, then the within analysis should
be used.
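The second question, how much of the causal variable's variation lies between the nonindependent units, can be checked by partitioning its sum of squares. The sketch below is one simple way to do that; the function name `between_share` and the example codings are hypothetical.

```python
from statistics import mean

def between_share(groups):
    """Proportion of a variable's sum of squares that lies between groups."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    total = ss_between + ss_within
    if total == 0:
        raise ValueError("the variable does not vary at all")
    return ss_between / total

# Hypothetical treatment codes (1 = treated, 0 = control), listed by classroom.
assigned_by_classroom = [[1, 1, 1], [0, 0, 0]]   # whole classrooms assigned
assigned_by_person = [[1, 0], [1, 0]]            # assignment within classrooms

print(between_share(assigned_by_classroom))  # 1.0
print(between_share(assigned_by_person))     # 0.0
```

A share near 1 points toward aggregation or averaging, and a share near 0 points toward the within analysis. Intermediate values call for judgment, and this sketch does not settle them.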
References
Kenny, D. A. (1996). The design and analysis of social-interaction research. Annual Review of Psychology, 47, 59-86.
Kenny, D. A., & Judd, C. M. (1986). Consequences of violating the independence assumption in analysis of variance. Psychological Bulletin, 99, 422-431.
Kenny, D. A., & Judd, C. M. (1996). A general procedure for the estimation of interdependence. Psychological Bulletin, 119, 138-148.
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. Gilbert, S. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 1, pp. 233-265). Boston, MA: McGraw-Hill.
Robinson, W. S. (1950). Ecological correlations and the behavior of individuals. American Sociological Review, 15, 351-357.