Dyadic Analysis
The topics on this page are covered much more extensively in the book Dyadic Data Analysis by David A. Kenny, Deborah A. Kashy, and William Cook. To find out more about this book click here.
What this tutorial does not cover:
Related topics are covered in the Unit of Analysis page. As is always the case, any suggestions for changes would be appreciated.
The work on this page has been done in collaboration with many people. Particularly important are Deborah Kashy and William Cook (with whom I wrote a book called Dyadic Data Analysis, published by Guilford Press), and many others have been helpful in developing my thinking. I especially want to acknowledge the late Larry Kurdek (who provided specific feedback about the page), Larry La Voie, Eliot Smith, Mike Berbaum, Harry Reis, Dale Griffin, Rich Gonzalez, and Charles Judd. There are almost certainly others whom I have forgotten.
List of Topics
Topic 1. What is a standard dyadic design?
Topic 2. What is the level of measurement of the outcome measure?
Topic 3. Are the dyad members distinguishable or not?
Topic 4. Determine the types of variables in the analysis.
Topic 5. The assessment of nonindependence
Topic 6. Consequences of nonindependence on significance testing
Topic 7. How are the effects of a between-dyads predictor variable estimated?
Topic 8. How are the effects measured when the predictor variable is within dyads?
Topic 9. How is nonindependence controlled if there are several predictor variables, some of which are between and others of which are within dyads?
Topic 10. How can effects be estimated if the predictor variable is a mixed variable?
Topic 1. What is a standard dyadic design?
Are the data as below, where "<~>" means a tie or link between two persons (first subscript is person and the second dyad)?
X_{11} <~> X_{21}
X_{12} <~> X_{22}
X_{13} <~> X_{23}
X_{14} <~> X_{24}
X_{15} <~> X_{25}
That is, each person is linked to one and only one other person in the sample and both persons are measured on the same variables. I will also sometimes denote the two persons' scores as X and X' or Y and Y' and not use subscripts.
Examples of Standard Dyad Designs:
25 pairs of roommates
44 lesbian couples
38 supervisor-supervisee pairs
116 father-daughter pairs
54 pairs of twins
An Example of a Data Set that Could Become Standard
44 dating couples and 22 persons whose partner is not measured (the data from the 22 "singles" would have to be set aside and the remaining 44 would form a standard design)
Examples of Designs that Are Not Standard
a) therapy groups in which persons rate each other (persons are linked to everyone in the group; see the Social Relations Model)
b) classroom survey of dating habits (people are linked to people who are likely not in the survey)
c) cancer patients rate how much their spouse helps them (the spouse would need to rate the help of the patient)
d) people who have more than one partner (the one-with-many design): egocentric networks, persons rated by multiple informants, people who see different doctors in a clinic, people asked to recall how jealous they were in their last three relationships
Topic 2. What is the level of measurement of the outcome measure?
The methods discussed on this page presume that the outcomes are measured at the interval level of measurement. That is, there must be a numeric score for each person. However, the predictor variables need not be at the interval level of measurement. For instance, gender may be a predictor variable. If the outcome is categorical (e.g., together versus separated), methods derived for sociometric analysis (e.g., data in which persons state whether they like members in a group) are likely more appropriate. Consult the book by Wasserman and Faust (Social Network Analysis: Methods and Applications) on this topic. Also relevant is Loeys, T., Cook, W., De Smet, O., Wietzker, A., & Buysse, A. (2014). The actor-partner interdependence model for categorical dyadic data: A user-friendly guide to GEE. Personal Relationships, 21, 225-241.
Topic 3. Are the dyad members distinguishable or not?
Analysis procedures depend on whether the members of the dyad are indistinguishable (sometimes called exchangeable) or not. Dyad members can be distinguished if there is a variable that allows the researcher to differentiate members. So, for instance, members of heterosexual couples can be distinguished by their gender, whereas members of gay and lesbian couples cannot. As another example, close friends usually cannot be distinguished, whereas boss and employee can be. Very often in dyadic analysis, researchers distinguish dyad members in an arbitrary fashion. For instance, they call the first person whose data are entered "person one" and the second person "person two." Because such a designation is arbitrary, the results obtained from this analysis would vary if the data were ordered differently. It is inadvisable to pretend that the members of the dyad are distinguishable when, in fact, they are not. It is possible to test empirically whether dyad members are distinguishable. See the paper by Gonzalez and Griffin (1999).
Just because members can be distinguished does not mean that such a distinction necessarily should be made. So, if members can be distinguished by their gender, but gender does not affect the responses, it would be better not to make such a distinction and treat dyad members as if they were indistinguishable. Sometimes dyad members can be distinguished in more than one way. So, for instance, heterosexual couples can also be distinguished by who is older. Generally, it is advisable to choose the distinction that is more meaningful for the current research and variables under study. The second distinguishing variable can be handled in the analysis, as will be explained in Topic 9.
Topic 4. Determine the types of variables in the analysis.
In dyadic analysis, there are three major types of variables:
Between-dyads variable: All the variation in the variable is between dyads. So both members of the dyad have the same score on the variable.
Within-dyads variable: All the variation in the variable is within dyads. So the sum of the two persons' scores is the same for every dyad.
Mixed variable: There is variation both between and within dyads. Consider the variable of gender. If the study consisted of same-gender roommates, gender would be a between-dyads variable. If the study consisted of opposite-gender or platonic friendships, then gender would be a within-dyads variable. Finally, if some dyads were same gender and others were opposite gender, then gender would be a mixed variable.
More often than not, categorical variables are between or within, whereas continuous variables are very often mixed. There are exceptions. For example, number of years married is continuous but still a between-dyads variable.
With these distinctions, the meaning of "distinguishable" can be made clearer. If dyad members are said to be distinguishable, then there is a within-dyads variable that is dichotomous. The second major distinction concerns the ordering in the analysis. Some variables are usually considered predictors and others are outcomes. Almost always outcome variables are mixed variables, and so between- and within-dyads variables are almost always predictor variables. It is possible to have a multi-equation model such that a variable that is an outcome in one equation is a predictor variable in another. Consider the data on the following three variables:
          Variable 1      Variable 2      Variable 3
          Member          Member          Member
Dyad      1     2         1     2         1     2
1         1     1         2     3         1     7
2         3     3         4     1         3     3
3         5     5         5     0         2     2
4         3     3         3     2         5     2
Variable 1 is between-dyads (the scores of the two members are the same), variable 2 is within-dyads (both scores sum to 5), and variable 3 is mixed.
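The three definitions can be checked mechanically. The following is only a minimal Python sketch (Python is not used elsewhere on this page, and the function name classify_variable is made up for illustration). It takes a list of (member 1, member 2) scores, one pair per dyad, and applies the rules above to the three variables in the table.

```python
def classify_variable(pairs, tol=1e-9):
    """Classify a dyadic variable from (member 1, member 2) score pairs."""
    # Between-dyads: both members of every dyad have the same score.
    # (A variable that is constant for everyone also lands here.)
    if all(abs(a - b) <= tol for a, b in pairs):
        return "between"
    # Within-dyads: the two scores sum to the same value in every dyad.
    sums = [a + b for a, b in pairs]
    if all(abs(s - sums[0]) <= tol for s in sums):
        return "within"
    # Otherwise there is variation both between and within dyads.
    return "mixed"


# The four dyads from the table above.
variable_1 = [(1, 1), (3, 3), (5, 5), (3, 3)]
variable_2 = [(2, 3), (4, 1), (5, 0), (3, 2)]
variable_3 = [(1, 7), (3, 3), (2, 2), (5, 2)]

for name, pairs in [("Variable 1", variable_1),
                    ("Variable 2", variable_2),
                    ("Variable 3", variable_3)]:
    print(name, classify_variable(pairs))
# Variable 1 between
# Variable 2 within
# Variable 3 mixed
```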
Topic 5. The Assessment of Nonindependence
The nonindependence in a variable refers to the degree of similarity between the two members of the dyad on that variable. The degree of nonindependence in outcome variables should ordinarily be assessed as this information affects the choice of statistical analysis. If the scores on the outcome variables are independent, then person can be the unit of analysis. If there is nonindependence, then dyad needs to be explicitly considered in the analysis. The degree of correlation can be positive (the two members are similar to one another) or negative (the two members are different from each other). Some methods for handling nonindependence treat it as a variance and not a correlation. These methods presume then that the nonindependence must be positive because variances cannot be negative. The tests described are preliminary tests because the key question is whether there is nonindependence after the effects of the predictor variables are removed.
The measure of nonindependence for distinguishable dyads is the Pearson product-moment correlation (the ordinary correlation coefficient). For indistinguishable dyads the less familiar intraclass correlation is computed. See Chapter 2 of Kenny, Kashy, and Cook (2006) for much more detail.
Computation of the Intraclass Correlation
ANOVA Formula
(MS_{B} - MS_{W}) / (MS_{B} + MS_{W})
where MS_{B} is the variance of the dyad means times two and MS_{W} is the sum of the squared difference scores divided by two times the number of dyads. The two MS terms can be viewed as mean squares from an analysis of variance with dyad as the independent variable. Fisher invented analysis of variance as a generalization of the intraclass correlation.
Double Entry Method
To compute the correlation, treat the two scores as if they were two sets of scores. This can be easily seen by illustration. Imagine the simple data set of
        Person
Dyad    1    2
1       5    7
2       8    4
3       5    6
The correlation between two variables, denoted as X and Y, would become
Dyad    X    Y
1       5    7
1       7    5
2       8    4
2       4    8
3       5    6
3       6    5
Notice that each dyad is entered twice, hence the name double entry. There is a slight negative bias in the estimate of the correlation using this method, which is even older than the Fisher intraclass correlation. It has been revived by Dale Griffin and Rich Gonzalez.
Significance Test of the Intraclass Correlation
ANOVA Method
The test of the intraclass correlation is an F test of MS_{B}/MS_{W} if MS_{B} is larger or MS_{W}/MS_{B} if MS_{W} is larger. The degrees of freedom for MS_{B} are the number of dyads less one and for MS_{W} the number of dyads. As discussed below, consideration needs to be given to using a value of alpha greater than the conventional .05 value.
Double Entry Method
The test is simple. Simply multiply the correlation by the square root of the number of dyads and treat it as a standard normal or Z test. This test is somewhat conservative, especially if the correlation is large.
Illustration of the Computations
Consider the following data set:
        Person
Dyad    1    2
1       5    7
2       4    4
3       3    2
4       8    7
5       2    4
6       8    6
7       5    7
8       3    4
9       4    4
10      9    5
ANOVA Intraclass Correlation
The MS_{B} equals 6.828 and the MS_{W} equals 1.750 making the intraclass correlation equal to
(6.828 - 1.750) / (6.828 + 1.750) = .592
The F test equals F(9,10) = 6.828/1.750 = 3.902, p = .046 (two-tailed). Thus, the F test is statistically significant and it is concluded that scores are not independent.
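For readers who want to check the arithmetic, here is a minimal Python sketch of the ANOVA computation, using only the standard library. The function name anova_icc is illustrative and not from any package. It follows the verbal definitions given earlier: MS_{B} is twice the variance of the dyad means and MS_{W} is the sum of squared differences divided by twice the number of dyads. The p value is not computed because the F distribution is not in the Python standard library.

```python
from statistics import variance

# The ten dyads from the illustration (person 1, person 2).
dyads = [(5, 7), (4, 4), (3, 2), (8, 7), (2, 4),
         (8, 6), (5, 7), (3, 4), (4, 4), (9, 5)]


def anova_icc(dyads):
    """Return (MS_B, MS_W, intraclass correlation) for a list of dyads."""
    n = len(dyads)
    # MS_B: sample variance of the dyad means, times two.
    ms_b = 2 * variance([(a + b) / 2 for a, b in dyads])
    # MS_W: sum of squared difference scores over (2 * number of dyads).
    ms_w = sum((a - b) ** 2 for a, b in dyads) / (2 * n)
    return ms_b, ms_w, (ms_b - ms_w) / (ms_b + ms_w)


ms_b, ms_w, icc = anova_icc(dyads)
# The larger mean square goes in the numerator of F.
# (This ratio is undefined if the smaller mean square is exactly zero.)
f_ratio = max(ms_b, ms_w) / min(ms_b, ms_w)

print(f"MS_B = {ms_b:.3f}, MS_W = {ms_w:.3f}")   # MS_B = 6.828, MS_W = 1.750
print(f"ICC = {icc:.3f}, F = {f_ratio:.3f}")     # ICC = 0.592, F = 3.902
```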
Double Entry Intraclass Correlation
The double entry value equals .557 and its Z is 1.76 with a p of .078. Both the r and the Z are smaller than the ANOVA value.
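The double entry numbers can be reproduced the same way. This is a minimal standard-library Python sketch. The function name double_entry_r is illustrative, and the two-tailed p comes from the standard normal distribution, as the Z test described above assumes.

```python
from math import erfc, sqrt

# The ten dyads from the illustration (person 1, person 2).
dyads = [(5, 7), (4, 4), (3, 2), (8, 7), (2, 4),
         (8, 6), (5, 7), (3, 4), (4, 4), (9, 5)]


def double_entry_r(dyads):
    """Pearson correlation after entering every dyad in both orders."""
    x = [a for a, b in dyads] + [b for a, b in dyads]
    y = [b for a, b in dyads] + [a for a, b in dyads]
    # After double entry, X and Y have the same mean and the same variance.
    m = sum(x) / len(x)
    cross = sum((xi - m) * (yi - m) for xi, yi in zip(x, y))
    total = sum((xi - m) ** 2 for xi in x)
    return cross / total


r = double_entry_r(dyads)
z = r * sqrt(len(dyads))        # Z = r times the square root of the number of dyads
p = erfc(abs(z) / sqrt(2))      # two-tailed p from the standard normal

print(f"r = {r:.3f}, Z = {z:.2f}, p = {p:.3f}")   # r = 0.557, Z = 1.76, p = 0.078
```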
Computation of Partial Correlations
Generally, it is advisable to test for nonindependence controlling for the predictor variables. This is something that is often not done, but should be. The effects of the predictor variables may create a pseudo-nonindependence. If the predictor variable is between dyads, its effects produce positive nonindependence. Alternatively, if the predictor variable is within dyads, its effects produce a negative correlation. If the ordinary Pearson product-moment correlation is used to measure the nonindependence, then partial correlations are computed. If intraclass correlations are used, then one partials the variance associated with the predictors' effects out of both mean squares that are used to compute the intraclass correlation. If the double-entry method is used, standard partialling methods can be used. Normally, the computation of these partial correlations occurs within the estimation of a model (see Topic 10).
Topic 6. Consequences of Nonindependence on Significance Testing
Considered here is the bias when testing the effect of a predictor variable on an outcome whose scores may be nonindependent. Assume that the predictor variable is either within or between dyads.
The Effect on the Significance Test When Person Is the Unit of Analysis Given Nonindependence
Design           Positive r    Negative r
Between-dyads    too many      too few
Within-dyads     too few       too many
Definitions of Terms in the Table
As an example, gender would be between dyads if same-gender roommates were studied (some dyads are two males and others are two females). Given that the predictor variable is between dyads and there is a positive correlation in the outcome, the use of person as the unit of analysis leads to too many statistically significant results. The consequence of mixed predictor variables on p values is intermediate. If the intraclass correlation of the mixed variable is positive, its effects are like those of a between-dyads predictor variable, and if negative, like those of a within-dyads predictor variable.
Power of the Test of Nonindependence
The analysis of dyadic data often hinges on whether there is nonindependence. Thus, the power of this test is critical. Consider the test of the Pearson correlation, assuming that dyad members are distinguishable. (The power of the intraclass correlation is essentially the same.)
Number of Dyads Needed to Have 80% Power in Testing the Correlation between Dyad Members (Alpha of .05 and .20)
r      Alpha = .05    Alpha = .20
.1     782            450
.2     193            112
.3     84             49
.4     46             27
.5     28             17
.6     19             12
.7     13             8
Quite clearly, when the intraclass is small, the power of its test is very low even when alpha is set very high. I return to this issue in the section on effects of nonindependence on significance testing.
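Sample sizes of this kind can be approximated with the Fisher z transformation. The sketch below is my own approximation in Python, not necessarily the method used to build the table, so its answers can differ from the tabled values by a dyad or two. The function name dyads_needed and its arguments are illustrative. It assumes a two-tailed test of a Pearson correlation.

```python
from math import atanh, ceil
from statistics import NormalDist


def dyads_needed(r, alpha=0.05, power=0.80):
    """Approximate number of dyads needed to detect a correlation r.

    Uses the Fisher z approximation: atanh(r) has a standard error of
    about 1 / sqrt(n - 3), where n is the number of dyads.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


for r in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7):
    print(r, dyads_needed(r, alpha=0.05))
```

Raising alpha in the call (for example alpha=0.20) shows how much a more liberal test reduces the required number of dyads.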
Recommended Strategy
Kashy, Bolger, and I, in a Handbook of Social Psychology chapter (Kenny, Kashy, & Bolger, 1998), have defined the concept of consequential nonindependence. It is the level at which nonindependence results in a p value of .10 when it is presumed to be .05. The level of consequential nonindependence is about .45. We argue that there should be enough power, at least 80%, to test for consequential nonindependence. If there is, then the effective alpha is .06, which is not very troublesome. For dyads, to have sufficient power to test for consequential nonindependence, there must be at least 35 dyads. If there are fewer, then the power of the test of nonindependence may be too low. Thus, in studies with fewer than 35 dyads, a reasonable course of action is to presume that scores are nonindependent because there is not sufficient power to test for nonindependence.
Topic 7. How are the effects of a between-dyads predictor variable estimated?
First, given that there are at least about 35 dyads, test whether there is nonindependence. (If there are fewer than 35 dyads, one should treat the data as if they were nonindependent.) Ideally the test for nonindependence should control for the effects of any predictor variables. If the test indicates that there is independence, then person can be used as the unit of analysis. If the test indicates that there is nonindependence, then dyad should be used as the unit of analysis. The outcome measure is either the sum or the average of the two members' scores, whichever is more interpretable.
Topic 8. How are the effects measured when the predictor variable is within dyads?
It is assumed that preliminary tests have shown that there is nonindependence in the outcome. First, assume that the predictor variable is dichotomous, e.g., boss versus employee. The paired t test can be used. The two groups are the two levels of the within-dyads variable. This test in essence tests whether the mean of the difference scores equals zero, where the difference is between the scores of the two dyad members.
So if we wanted to test whether husbands are more or less satisfied than wives, we would compute a difference score, say wife score minus husband score, and test whether the mean of the differences is significantly different from zero.
Second, assume that the predictor variable is interval, e.g., percent of housework done. Compute difference scores in both X and Y. Regress differenced Y on differenced X without any intercept. The effect of differenced X on differenced Y in this equation measures the effect of X on Y. The intercept is not fitted because the direction of differencing is arbitrary. By not including an intercept, the solution will be the same regardless of how the differencing was done. (Try it out if you do not believe me!)
It is instructive to reproduce the paired t test results by fitting a regression equation with no intercept. First, compute a paired t test and note the t and its p value. Now compute difference scores and create a predictor "variable" all of whose scores are equal to two. Run the regression equation, not including an intercept. Note that you get the same t and p value. Now reverse the sign of the first dyad, both of the difference and of the predictor variable. Note again that you get the same coefficient and the same t and p value. You would not if the intercept were included.
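The exercise just described can be tried in a few lines. The following is a minimal Python sketch using only the standard library. The difference scores are made-up numbers, and the helper names paired_t and no_intercept_regression are illustrative. It computes the paired t, then the t from a regression through the origin on a predictor equal to two, and then repeats the regression after reversing the sign of the first dyad.

```python
from math import isclose, sqrt
from statistics import mean, stdev


def paired_t(diffs):
    """t statistic for testing that the mean difference score is zero."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def no_intercept_regression(x, y):
    """Return (slope, t) for y regressed on x with no intercept."""
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    sse = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    mse = sse / (len(y) - 1)          # one estimated coefficient
    return b, b / sqrt(mse / sxx)


# Hypothetical difference scores (e.g., wife minus husband) for six dyads.
diffs = [2.0, -1.0, 3.0, 0.0, 4.0, 1.0]
x = [2.0] * len(diffs)                # predictor equal to two for every dyad

t_paired = paired_t(diffs)
b, t_reg = no_intercept_regression(x, diffs)

# Reverse the direction of differencing for the first dyad only.
diffs_flipped = [-diffs[0]] + diffs[1:]
x_flipped = [-x[0]] + x[1:]
b_flipped, t_flipped = no_intercept_regression(x_flipped, diffs_flipped)

print(t_paired, t_reg, t_flipped)     # all three t values agree
print(b, b_flipped)                   # the coefficient is unchanged as well
```

With the predictor set to two, the coefficient is half the mean difference, which corresponds to coding the dichotomy as +1 and -1.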
Topic 9. How is nonindependence controlled if there are several predictor variables, some of which are between and others of which are within dyads?
One needs to perform two analyses. In the first, the sum or average of the two dyad members' scores is the outcome and the predictors in this equation are the set of between-dyads predictor variables. The second analysis is of the difference scores and includes the difference scores of the within-dyads variables as predictors and no intercept. Sometimes this second analysis can be accomplished by a repeated measures analysis of variance. There must be one within-dyads variable that is dichotomous, e.g., gender. It would be treated as the "repeated measure," and dyad, not person, would be the unit of analysis. If there are two dichotomous within-dyads factors, then the following must be done. One of the within-dyads variables becomes the repeated measure. The other is captured by creating a between-dyads factor that codes the level of the second within-dyads factor at the first level of the repeated measure. The interaction of this factor with the repeated measure captures the second within-dyads factor. Interactions of between and within factors are captured by including the between factors in the difference score regressions. Much more detail on this topic is contained in Chapter 3 of Kenny, Kashy, and Cook (2006).
Topic 10. How can effects be estimated if the predictor variable is a mixed variable?
Model
The model that contains predictor variables that are between-dyads, within-dyads, and mixed is called the Actor-Partner Interdependence Model or APIM. You can download a bibliography of APIM papers if you click here. Assume that there is a mixed predictor variable, denoted as X, and an outcome denoted as Y. Denote X_{i} and X_{i}' as the two scores of the predictor variable for dyad i and Y_{i} and Y_{i}' as the two scores on the outcome for that same dyad. The actor effect is defined as the effect of X_{i} on Y_{i} and X_{i}' on Y_{i}', and the partner effect as the effect of X_{i} on Y_{i}' and X_{i}' on Y_{i}. Basically the model is that X and X' cause Y, where the effect of X on Y is called an actor effect and the effect of X' on Y is called a partner effect. So if a researcher examined the effects of gender on intimacy and studied friendships, both same and opposite gender, there are at least two effects of gender on intimacy. Females could be more intimate than males, an actor effect. Alternatively, interactions with females could be more intimate, a partner effect. For more detail on the analysis of the APIM consult Chapter 7 of Kenny, Kashy, and Cook (2006).
Actor-Partner Interaction
Many dyadic processes can be viewed as actor-partner interactions. For instance, the effects of similarity between dyad members can be viewed as an actor-partner interaction. The usual way to measure the actor-partner interaction is to multiply actor times partner effects or X times X'. However, if the interaction is supposed to be similarity, then the absolute difference between X and X' would be measured. Note that the actor-partner interaction is always a between-dyads variable. For any actor-partner interaction that is estimated, one should control for the main effects of actor and partner. Thus, if similarity is to be tested, the main effects of actor and partner should be controlled.
Estimation
Practically, there are three ways to estimate this model when there is nonindependence: pooled regressions, structural equation modeling, and multilevel modeling. The pooled regression method has been essentially replaced by the other two methods, but it is still useful to describe.
POOLED REGRESSIONS
This is an old-fashioned way to do the analysis, but it is discussed because it can be helpful to think about. Two regression equations are estimated and their results are pooled. In the first, the criterion variable is the average of the two Y scores. The predictor variables in this equation include the average of each mixed variable and all the between-dyads variables. In the second regression equation, the differences between the Xs and Ys are computed. One computes the difference in the same way for all of the variables. The predictors in this equation are the difference in X and all the within-dyads variables (differenced). However, the intercept is not estimated in this equation. The standard errors need to be pooled and specialized degrees of freedom must be estimated that are described below.
In the between-dyads regression, the average of the two members' scores is used and the regression coefficient is denoted as b_{B}. In the second regression equation, the difference score is computed and no intercept is fitted. This coefficient is denoted as b_{W}. These two coefficients are used to determine estimates of actor and partner effects.
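To see how the two coefficients yield actor and partner effects, note that if Y = aX + pX' (plus error), then averaging the two members gives a slope of a + p and differencing gives a slope of a - p. So the actor effect is (b_{B} + b_{W})/2 and the partner effect is (b_{B} - b_{W})/2. This follows from the algebra of averaging and differencing rather than from a formula stated on this page. The Python sketch below illustrates it with made-up, error-free data. The helper name slope and the effect sizes .5 and .2 are assumptions for the demonstration, and the pooling of standard errors is not shown.

```python
from statistics import mean


def slope(x, y, intercept=True):
    """Least-squares slope of y on x, with or without an intercept."""
    if intercept:
        mx, my = mean(x), mean(y)
        x = [xi - mx for xi in x]
        y = [yi - my for yi in y]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical mixed predictor: (X, X') for six dyads.
x_pairs = [(1, 4), (3, 2), (5, 5), (2, 6), (4, 1), (6, 3)]
actor_true, partner_true = 0.5, 0.2   # assumed effects used to build Y

# Error-free outcomes: Y = actor * own X + partner * partner's X.
y_pairs = [(actor_true * x1 + partner_true * x2,
            actor_true * x2 + partner_true * x1) for x1, x2 in x_pairs]

# Between-dyads regression: average Y on average X (intercept included).
b_b = slope([(x1 + x2) / 2 for x1, x2 in x_pairs],
            [(y1 + y2) / 2 for y1, y2 in y_pairs])

# Within-dyads regression: differenced Y on differenced X (no intercept).
b_w = slope([x1 - x2 for x1, x2 in x_pairs],
            [y1 - y2 for y1, y2 in y_pairs], intercept=False)

actor = (b_b + b_w) / 2
partner = (b_b - b_w) / 2
print(round(actor, 3), round(partner, 3))   # 0.5 0.2
```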
STRUCTURAL EQUATION MODELING
Two equations are estimated in which Y and Y' are the outcomes. In each equation, X and X' are predictors. To test specialized predictions (e.g., pooling coefficients and setting coefficients equal), a structural equation modeling program is needed. Dyad is the unit of analysis. The method is most useful when dyad members are distinguishable. One advantage of this approach is that the entire model is estimated, including the correlation of the Xs and the residual correlation of the Ys.
MULTILEVEL MODELING
Multilevel modeling can be used to estimate the APIM. Any multilevel program can be used. In this analysis, each observation is a different person. However, there must be a variable that identifies each dyad and, for some computer programs, the data should be sorted by this variable.
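The person-level file that these programs expect can be built from a dyad-level file. Below is a minimal Python sketch of that restructuring, assuming the data start as one row per dyad. The input layout and the function name to_person_records are hypothetical. The output field names DYAD, MEMBER, OUTCOME, X, and XPAR follow the variable names used in the SAS and SPSS instructions on this page.

```python
def to_person_records(dyad_rows):
    """Turn one row per dyad into two rows per dyad, one for each person.

    Each person's record carries the person's own X and the partner's X
    (XPAR), so actor and partner effects can be estimated together.
    """
    records = []
    for row in dyad_rows:
        x1, x2 = row["x"]
        y1, y2 = row["y"]
        records.append({"DYAD": row["dyad"], "MEMBER": 1,
                        "OUTCOME": y1, "X": x1, "XPAR": x2})
        records.append({"DYAD": row["dyad"], "MEMBER": 2,
                        "OUTCOME": y2, "X": x2, "XPAR": x1})
    # Some programs want the file sorted by dyad.
    records.sort(key=lambda r: (r["DYAD"], r["MEMBER"]))
    return records


# Hypothetical dyad-level data: scores are (member 1, member 2).
dyad_rows = [
    {"dyad": 2, "x": (4, 1), "y": (6, 3)},
    {"dyad": 1, "x": (3, 5), "y": (7, 6)},
]

for record in to_person_records(dyad_rows):
    print(record)
```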
First described here is PROC MIXED within SAS. Later we describe SPSS.
The code is:
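PROC MIXED;
CLASS DYAD;
/* The options after the slash were lost from this page; SOLUTION, which */
/* prints the fixed-effect estimates, is an assumed stand-in. */
MODEL OUTCOME = X XPAR / SOLUTION;
REPEATED / TYPE = CS SUBJECT = DYAD;
RUN;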
To use SPSS (12.0 and higher), it is advisable that the data be sorted by dyad. We also need a variable that we will call MEMBER. For example, one person is 1 on MEMBER and the other is 2. (If there is a distinguishing variable in the data set, it can be used instead of MEMBER.)
Upper case terms refer to SPSS commands.
Step 0: Preparation
File: Individual as unit.
Create necessary variables; partner predictor on the record.
Center predictors if necessary.
Make sure a dyad id (DYAD) and a person number (MEMBER) are present.
Step 1: Start
ANALYSIS
MIXED MODELS
LINEAR
Type in dyad id in SUBJECTS
Type in the MEMBER variable name in REPEATED MEASURES if members are indistinguishable or the distinguishing identifier (e.g., GENDER) if members are distinguishable.
Pick COMPOUND SYMMETRY if members are indistinguishable and COMPOUND SYMMETRY HETEROGENEOUS if members are distinguishable on the repeated measures variable
CONTINUE
Step 2: Click LINEAR MIXED MODELS
Type in the name of the DEPENDENT VARIABLE
Type categorical variables in FACTOR(S)
Type continuous variables in COVARIATE(S); include the actor's (own) X and the partner's X as predictors.
The remaining steps go from left to right on the bottom of the screen.
Step 3: Click FIXED
Add in relevant terms. Include relevant actor and partner effects.
Pay close attention to the term in the box in the middle.
Ordinarily make sure "INCLUDE INTERCEPT" box is checked.
CONTINUE
Step 4: Click STATISTICS
Click PARAMETER ESTIMATES
Click TESTS FOR COVARIANCE PARAMETERS
Can ask for DESCRIPTIVE STATISTICS and CASE PROCESSING SUMMARY
CONTINUE
Step 5: Run the job
Click OK
If you save syntax, you can delete the following statements as they use defaults:
/CRITERIA = CIN(95) MXITER(100) MXSTEP(5) SCORING(1) SINGULAR(0.000000000001) HCONVERGE(0,ABSOLUTE)
LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
/METHOD = REML
Syntax might look like the following for indistinguishable members:
MIXED
dv WITH actorx partx
/FIXED = actorx partx actorx*partx
/PRINT = SOLUTION TESTCOV
/REPEATED = member | SUBJECT(dyadid) COVTYPE(CSR) .
Note that if members were distinguishable, that variable would be added and CSR would be changed to CSH.
References
Campbell, L. J., & Kashy, D. A. (2002). Estimating actor, partner, and interaction effects for dyadic data using PROC MIXED and HLM5: A brief guided tour. Personal Relationships, 9, 327-342.
Gonzalez, R., & Griffin, D. (1999). The correlational analysis of dyad-level data in the distinguishable case. Personal Relationships, 6, 449-469.
Griffin, D., & Gonzalez, R. (1995). Correlational analysis of dyad-level data in the exchangeable case. Psychological Bulletin, 118, 430-439.
Kenny, D. A. (1995). The effect of nonindependence on significance testing in dyadic research. Personal Relationships, 2, 67-75.
Kenny, D. A. (1996). Models of nonindependence in dyadic research. Journal of Social and Personal Relationships, 13, 279-294.
Kenny, D. A., & Cook, W. (1999). Partner effects in relationship research: Conceptual issues, analytic difficulties, and illustrations. Personal Relationships, 6, 433-448.
Kenny, D. A., & Judd, C. M. (1986). Consequences of violating the independence assumption in analysis of variance. Psychological Bulletin, 99, 422-431.
Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. Gilbert, S. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 1, pp. 233-265). Boston, MA: McGraw-Hill.
Kenny, D. A., Kashy, D. A., & Cook, W. (2006). Dyadic data analysis. New York: Guilford.