SEM: Identification (David A. Kenny)

David A. Kenny
September 11, 2011

Identification: Overview

This page provides an introduction to the topic of identification of SEMs. For a more formal treatment of the topic go to Identification: Formal Treatment. All the details or links about Conditions and Rules are contained on that page. It is advisable to review key concepts of identification on the Basics page.

Traditionally, establishing the identification of a model requires a formal mathematical analysis. Moreover, most structural equation modeling computer programs can determine whether a given model is identified. Even so, it is helpful to have rules of thumb because researchers need to know the identification status of their models before those models are estimated. Also, models that are in principle identified may not be identified in practice when they are actually estimated. The researcher therefore also needs to know whether the model is empirically identified.

The following is a set of rules that can be used to check whether a given model is identified. What follows should be taken as a guide and not as gospel. The rules, though by no means exhaustive, are helpful in determining identification.

If both the structural and the measurement models are identified, then the entire model is identified. For the entire model to be identified, the structural model must be identified. However, some underidentified measurement models can be identified when the structural model is overidentified (see Condition B3b).

Measurement Model Identification

These rules primarily concern models in which each measure loads on only one construct. Fortunately, most estimated models are of this type. To achieve identification, one of the factor loadings of each latent variable must be fixed to one. The variable with a fixed loading of one is called a marker variable (see Condition A).

The key consideration for the identification of the measurement model is that there are "enough" indicators of each latent variable. A simple rule that works most of the time is that there need to be at least two indicators per latent variable and that those indicators' errors be uncorrelated (see Condition B2a). The one major exception to this rule is that if the model contains only one latent variable and no other variables, there need to be at least three indicators of the latent variable whose errors are uncorrelated (see Condition B1). In terms of empirical underidentification, it is key that the indicators correlate with each other.
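The indicator-count rule of thumb can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each measure loads on one construct, all indicator errors are uncorrelated, and each latent variable is scaled with a marker variable. The function names and the dictionary layout are hypothetical and do not come from any SEM package. The second function counts knowns and free parameters for a single-factor model, which suggests why three indicators are needed when the latent variable stands alone.

```python
def enough_indicators(indicators, other_variables=0):
    """Apply the indicator-count rule of thumb to each latent variable.

    indicators: dict mapping a latent variable name to a list of its
        indicators (hypothetical layout, for illustration only).
    other_variables: number of additional, non-latent variables in the model.
    Assumes one construct per measure and uncorrelated errors.
    """
    # Exception: a lone latent variable with no other variables needs three.
    alone = len(indicators) == 1 and other_variables == 0
    minimum = 3 if alone else 2
    return {name: len(items) >= minimum for name, items in indicators.items()}


def single_factor_counts(k):
    """Count knowns and free parameters for one factor with k indicators.

    Knowns: k variances plus k(k - 1)/2 covariances.
    Free parameters: k - 1 loadings (one is the marker, fixed to one),
    1 factor variance, and k error variances.
    """
    knowns = k * (k + 1) // 2
    free_parameters = (k - 1) + 1 + k
    return knowns, free_parameters


# Example: one factor alone with two indicators fails the rule of thumb.
two_alone = enough_indicators({"F": ["x1", "x2"]})
# Example: two factors with two indicators each pass it.
two_factors = enough_indicators({"F": ["x1", "x2"], "G": ["y1", "y2"]})
# With two indicators there are 3 knowns but 4 free parameters;
# with three indicators there are 6 knowns and 6 free parameters.
counts_two = single_factor_counts(2)
counts_three = single_factor_counts(3)
```

Passing this check does not establish identification. It only screens for the most common shortfall, and it says nothing about empirical identification, which depends on the indicators actually correlating with each other.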

If the construct is formative, then one of the paths leading into the latent variable must be fixed to some non-zero value, usually one. Also, the disturbance of the formative construct should be fixed to zero.

Structural Model Identification

In the structural model, there is a set of structural equations. The causal variables are called exogenous variables and the effect variable is called the endogenous variable. Unexplained variation is referred to as disturbance.

All models that satisfy the following condition (see Rule B) appear to be identified: between any pair of constructs, X and Y, no more than one of the following is true:
      X directly causes Y
      Y directly causes X
      X and Y have a correlated disturbance
      X and Y are correlated exogenous variables
Models that can be estimated by multiple regression form an important special case of this rule. Although there is no known proof of this condition, there is no known exception. It seems likely that the rule holds. As with any identification rule, the model may still not be empirically identified.
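The pairwise condition above lends itself to a mechanical check. The following is a minimal sketch, assuming the structural model is written as a list of `(kind, a, b)` tuples. The function name `pairs_violating_rule` and the three `kind` labels are hypothetical choices made for this illustration. The function returns every pair of constructs for which more than one of the four relations holds.

```python
from collections import defaultdict


def pairs_violating_rule(relations):
    """Return pairs of constructs linked by more than one relation.

    relations: iterable of (kind, a, b) tuples, where kind is one of
        "causes" (a directly causes b),
        "correlated_disturbance",
        "correlated_exogenous".
    These labels are illustrative, not taken from any SEM program.
    """
    found = defaultdict(set)
    for kind, a, b in relations:
        pair = frozenset((a, b))
        # Direction matters for causal paths: X -> Y and Y -> X count
        # as two different relations between the same pair.
        label = (kind, a, b) if kind == "causes" else (kind,)
        found[pair].add(label)
    return sorted(
        tuple(sorted(pair)) for pair, labels in found.items() if len(labels) > 1
    )


# A model estimable by multiple regression: each pair has one relation.
recursive = [("causes", "X", "Y"), ("causes", "X", "Z"), ("causes", "Y", "Z")]
# A feedback loop: X causes Y and Y causes X.
feedback = [("causes", "X", "Y"), ("causes", "Y", "X")]
# A causal path together with a correlated disturbance.
path_and_disturbance = [
    ("causes", "X", "Y"),
    ("correlated_disturbance", "X", "Y"),
]

recursive_result = pairs_violating_rule(recursive)
feedback_result = pairs_violating_rule(feedback)
disturbance_result = pairs_violating_rule(path_and_disturbance)
```

An empty result means the model satisfies the condition. A non-empty result does not prove that the model is underidentified; it only means this particular rule cannot be used to establish identification.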
