David A. Kenny
October 25, 2005
Path Analysis
This page discusses how to use multiple regression to estimate the
parameters of a structural model.
Key Assumption
For an endogenous variable, its disturbance must be uncorrelated with all of the
specified causal variables. So for a model, consider each endogenous variable and
determine that its disturbance is uncorrelated with each of its causes.
Violation of the Assumptions to Use Multiple Regression to Estimate Structural Coefficients
There are three conditions under which the disturbance is correlated with
a causal variable (thus eliminating multiple regression as an appropriate
tool to estimate causal paths):
a) Spuriousness: A variable causes both the endogenous variable and
one of its causal variables, and that variable is not included in
the model.
b) Reverse Causation: The endogenous variable causes,
either directly or indirectly, one of its causes.
c) Measurement Error: There is measurement error in a causal variable.
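The measurement error condition can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not part of the original page: the true path of 0.5, the error variance of 1, the sample size, and all variable names are assumptions chosen for illustration. Because the observed cause has a reliability of about .5 under these assumptions, the regression slope should come out at roughly half the true path.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
n = 20_000                       # hypothetical sample size

x_true = rng.normal(size=n)                 # cause measured without error
y = 0.5 * x_true + rng.normal(size=n)       # assumed true path of 0.5
x_observed = x_true + rng.normal(size=n)    # same cause plus measurement error

# np.polyfit with degree 1 returns (slope, intercept).
slope_true = np.polyfit(x_true, y, 1)[0]
slope_observed = np.polyfit(x_observed, y, 1)[0]

print(f"slope using the error-free cause:  {slope_true:.3f}")
print(f"slope using the error-laden cause: {slope_observed:.3f}")
```

The second slope is biased toward zero because the measurement error in the cause becomes part of the disturbance, which is then correlated with the observed cause.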
Estimation Using Multiple Regression
Standardized variables
Paths: beta weights from the regression equation
Disturbance path: square root of one minus the
multiple correlation squared
Curved lines between exogenous variables: correlations
Curved lines between disturbances: partial
correlations between the endogenous variables, with the
common causal variables of both endogenous variables
partialled out
Unstandardized variables
Paths: b weights from the regression equation
Disturbance variance: the variance of the endogenous
variable times one minus the multiple correlation
squared
Curved lines between exogenous variables: covariances
Curved lines between disturbances: partial covariances
between the endogenous variables, with the causal variables
of both endogenous variables partialled out
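The estimation rules above can be sketched for a single equation as follows. This is an illustrative example on synthetic data, not the author's own code: the helper name `standardized_paths`, the simulated paths of 0.5 and 0.3, and the sample size are all assumptions. It covers the paths, the disturbance path, the correlation between exogenous variables, and the unstandardized disturbance variance. It does not cover correlations between disturbances.

```python
import numpy as np

def standardized_paths(causes, outcome):
    """Return (betas, r_squared, disturbance_path) for one equation."""
    z_causes = (causes - causes.mean(axis=0)) / causes.std(axis=0, ddof=1)
    z_outcome = (outcome - outcome.mean()) / outcome.std(ddof=1)
    # No intercept is needed because every variable is centered.
    betas, *_ = np.linalg.lstsq(z_causes, z_outcome, rcond=None)
    residuals = z_outcome - z_causes @ betas
    r_squared = 1.0 - residuals.var(ddof=1)   # z_outcome has variance 1
    return betas, r_squared, np.sqrt(1.0 - r_squared)

# Synthetic data: two exogenous causes of one endogenous variable.
rng = np.random.default_rng(1)
n = 5_000                                     # hypothetical sample size
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 * x1 + 0.3 * x2 + rng.normal(scale=np.sqrt(0.66), size=n)

betas, r_squared, disturbance_path = standardized_paths(
    np.column_stack([x1, x2]), y
)
# Curved line between the exogenous variables: their correlation.
exogenous_correlation = np.corrcoef(x1, x2)[0, 1]
# Unstandardized disturbance variance: var(y) times (1 - R squared).
disturbance_variance = y.var(ddof=1) * (1.0 - r_squared)

print("paths (beta weights):", np.round(betas, 3))
print("disturbance path:", round(float(disturbance_path), 3))
```

With a model that has several endogenous variables, the same helper would be called once per equation, each time passing only the causes of that endogenous variable.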
Steps in Testing
STEP ONE: TEST OF DELETED PATHS
Respecify the model to make it just-identified by adding
the paths that the model specifies to be zero. Test
these added paths, keeping the specified paths in each
equation without testing them. These tests may be done
with a reduced alpha (e.g., .01).
STEP TWO: TEST OF SPECIFIED PATHS
Retaining the significant paths from the previous step,
test the paths that were specified to be present in the
model. Sometimes these tests are done hierarchically.
STEP THREE: TRIMMED MODEL
Re-estimate the model, (a) including the paths that were
specified to be zero but were significant in step one
and (b) dropping the paths that were specified but were not
significant in step two.
Types of Tests
test of the individual paths
standard t or F test of the coefficient
F test of all of the paths in a given equation
\[ F = \frac{(N - p - 1)(R_2^2 - R_1^2)}{k(1 - R_2^2)} \]
where
N overall sample size
p the number of deleted plus specified paths
k the number of deleted paths
$R_1^2$ multiple correlation squared (not adjusted)
from the equation with only the specified
paths
$R_2^2$ multiple correlation squared (not adjusted)
from the equation with the specified and
deleted paths
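As a worked illustration of this F test, the sketch below computes the statistic from the two squared multiple correlations. The function name `deleted_paths_f` and the input values are hypothetical. The degrees of freedom it returns, k and N - p - 1, are the usual ones for comparing nested regression equations.

```python
def deleted_paths_f(n, p, k, r2_specified, r2_full):
    """F test of the k deleted paths in one equation.

    n            overall sample size
    p            number of deleted plus specified paths
    k            number of deleted paths
    r2_specified R squared with only the specified paths
    r2_full      R squared with the specified and deleted paths
    """
    if not 0 < k <= p:
        raise ValueError("k must be positive and no larger than p")
    df_numerator = k
    df_denominator = n - p - 1
    f_value = (df_denominator * (r2_full - r2_specified)) / (k * (1.0 - r2_full))
    return f_value, df_numerator, df_denominator

# Hypothetical equation: 2 specified paths, 2 deleted paths, N = 100.
f_value, df1, df2 = deleted_paths_f(
    n=100, p=4, k=2, r2_specified=0.30, r2_full=0.35
)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The resulting F would then be compared with the critical value of the F distribution for those degrees of freedom at the chosen alpha.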
The combined test of all of the paths in the model is usually a chi square
goodness of fit test from a SEM program such as AMOS, EQS, or LISREL.
Determination of Deleted Paths
If all of the paths in the model can be estimated by multiple
regression, the number of deleted paths equals the number of knowns minus
the number of unknowns, that is, the degrees of freedom of the specified model.
To make the model just-identified, examine each pair of variables and
find the pairs that are not linked by a path or a
correlation (including a correlation between disturbances). Add a path or
a correlation between disturbances for each of these unlinked pairs.
The direction of the path is given by theory and the requirement that
feedback not be introduced.
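The search for unlinked pairs described above can be sketched as follows. The four-variable model, the variable names, and the helper `unlinked_pairs` are hypothetical and serve only as an illustration. The sketch assumes a model in which every path can be estimated by multiple regression, so that each pair of variables is linked at most once.

```python
from itertools import combinations

def unlinked_pairs(variables, links):
    """Return the pairs of variables not joined by a path or a correlation."""
    linked = {frozenset(link) for link in links}
    return [
        pair for pair in combinations(variables, 2)
        if frozenset(pair) not in linked
    ]

# Hypothetical model: X1 and X2 are correlated exogenous variables,
# both cause Y1, and Y1 causes Y2.
variables = ["X1", "X2", "Y1", "Y2"]
links = [
    ("X1", "X2"),   # correlation between exogenous variables
    ("X1", "Y1"),   # path
    ("X2", "Y1"),   # path
    ("Y1", "Y2"),   # path
]

deleted = unlinked_pairs(variables, links)

# Knowns are the correlations among the variables; unknowns are the links.
knowns = len(variables) * (len(variables) - 1) // 2
degrees_of_freedom = knowns - len(links)

print("unlinked pairs:", deleted)
print("degrees of freedom:", degrees_of_freedom)
```

In this example the two unlinked pairs match the two degrees of freedom. Making the model just-identified would mean adding a path from X1 to Y2 and a path from X2 to Y2, since paths into the exogenous variables would not be consistent with the model.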
Types of Links Between Two Variables
Direct effect: Either X causes Y, Y causes X, or both.
Indirect effect: The relationship between X and Y is said to be indirect if X
causes Z which in turn causes Y.
Spuriousness: The relationship between X and Y is said to be spurious
if Z causes X and Y.
Unexplained covariation: Both X and Y are exogenous and so the covariation
between them is not explained by the model.
Decomposition of a Correlation
Correlation between two endogenous variables:
Correlation = Direct Effect + Indirect Effects + Spuriousness
Correlation between an endogenous variable and an exogenous variable:
Correlation = Direct Effect + Indirect Effects + Unexplained Covariation
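A small numerical check may make the decomposition concrete. The sketch below assumes a hypothetical standardized model in which X causes M, M causes Y, and X also causes Y directly. The path values 0.5, 0.4, and 0.3 are invented for illustration. Because X is the only exogenous variable here, the unexplained covariation term is zero.

```python
import numpy as np

# Hypothetical standardized paths: X -> M, M -> Y, X -> Y.
a, b, c = 0.5, 0.4, 0.3

# X (exogenous) with M (endogenous): direct effect only.
r_xm = a
# X (exogenous) with Y (endogenous): direct effect plus indirect effect via M.
r_xy = c + a * b
# M with Y (both endogenous): direct effect plus spuriousness due to X.
r_my = b + a * c

# Check: regressing Y on its causes (M, X) using the implied correlations
# should recover the direct paths b and c as beta weights.
cause_correlations = np.array([[1.0, r_xm],
                               [r_xm, 1.0]])
cause_outcome_correlations = np.array([r_my, r_xy])
recovered = np.linalg.solve(cause_correlations, cause_outcome_correlations)

print("r_XY =", round(r_xy, 3), " r_MY =", round(r_my, 3))
print("recovered paths (M -> Y, X -> Y):", np.round(recovered, 3))
```

The beta weights recovered from the implied correlations equal the direct effects, which is why the remainder of each correlation can be attributed to indirect effects, spuriousness, or unexplained covariation.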