David A. Kenny

September 7, 2011

Page being revised.

**Miscellaneous Variables**

**Formative Construct**

A formative construct, or composite, is an index formed as a weighted sum
of variables. In a formative construct, the indicators cause
the construct, whereas in a more conventional latent
variable, sometimes called a reflective construct, the indicators are
caused by the latent variable. Consider
a set of *k* exogenous variables (usually single-indicator variables) that
are combined to form an index. A latent variable C is created that has no
reflective indicators and no disturbance. One of the *k* paths leading into C is fixed to
one. Normally, all of the remaining paths are free, but they could be fixed.
The correlations or covariances among the *k* variables are free
parameters to be estimated.

An example of a formative construct is socio-economic status (SES), whose indicators might be Education, Income, and Occupational status. In the example model, SES in turn causes Stress and Mental Health.

Note that the formative variable in the model, SES, has no disturbance and the path from one of its indicators, Education, is fixed to one. A key feature of a formative construct is that it mediates the effect of its three indicators on the endogenous variables of Stress and Mental Health.
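The mediating role of SES can be illustrated with a small numerical sketch. All weights and path values below are hypothetical and chosen only for illustration. Because every effect of an indicator passes through SES, the effect of an indicator on an outcome is the product of its weight and the path from SES, so the ratio of effects on Stress and Mental Health is the same for every indicator.

```python
# Hypothetical unstandardized weights of the indicators on SES.
# The Education weight is fixed to one, as described in the text.
weights = {"Education": 1.0, "Income": 0.5, "Occupational status": 0.8}

# Hypothetical paths from SES to the two endogenous variables.
paths = {"Stress": -0.4, "Mental Health": 0.3}

# Effect of each indicator on each outcome: weight times path,
# because SES carries the whole effect.
effects = {
    (indicator, outcome): w * b
    for indicator, w in weights.items()
    for outcome, b in paths.items()
}

# The Stress / Mental Health ratio is identical for all indicators,
# which is the constraint that the formative construct imposes.
ratios = [
    effects[(indicator, "Stress")] / effects[(indicator, "Mental Health")]
    for indicator in weights
]

for indicator, ratio in zip(weights, ratios):
    print(indicator, round(ratio, 4))
```

If the indicators had separate direct effects on the outcomes, these ratios would be free to differ across indicators.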

Formative variables have many uses: First, if the researcher seeks to create a composite, a weighted sum of measured variables, a formative construct has optimal weights; second, if there is a nominal variable with many categories, the dummy variables for that variable can be the indicators of a formative variable; third, if a reflective latent variable is planned, but either the indicators do not correlate or they are not single-factored, a formative variable might be used.

When there is a single endogenous variable and no other exogenous variables, the standardized
paths leading into C are the beta weights divided by the multiple correlation, and the path from C to the endogenous variable is the multiple correlation. In
general, the standardized paths leading into C are proportional to canonical
coefficients.
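The single-outcome case can be checked by hand. The sketch below assumes two standardized predictors and uses made-up population correlations (none of these values come from the text). It computes the beta weights and the multiple correlation R, rescales the beta weights by R, and confirms that the resulting composite has unit variance and correlates R with the endogenous variable.

```python
import math

# Hypothetical population correlations (assumed values for illustration).
r12 = 0.30   # correlation between predictors X1 and X2
r1y = 0.50   # correlation of X1 with the endogenous variable Y
r2y = 0.40   # correlation of X2 with Y

# Standardized regression (beta) weights for two predictors.
denom = 1.0 - r12 ** 2
beta1 = (r1y - r12 * r2y) / denom
beta2 = (r2y - r12 * r1y) / denom

# Multiple correlation: R squared equals the sum of beta times validity.
R = math.sqrt(beta1 * r1y + beta2 * r2y)

# Standardized paths into the composite C: beta weights divided by R.
w1, w2 = beta1 / R, beta2 / R

# Variance of C = w1*X1 + w2*X2 for standardized predictors.
var_C = w1 ** 2 + w2 ** 2 + 2 * w1 * w2 * r12

# Correlation of C with Y, which is the standardized path from C to Y.
path_C_to_Y = w1 * r1y + w2 * r2y

print("var(C):", var_C)
print("path C -> Y:", path_C_to_Y, "R:", R)
```

Dividing by R is what gives C unit variance, so the path from C to the endogenous variable absorbs the whole multiple correlation.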

To evaluate how well C represents the paths from the *k* variables, drop
C from the model and have each variable directly cause the appropriate endogenous
variables. Compare the fit of this model to the one with C. If C causes all *p*
endogenous variables, the model without C has *kp* paths, whereas the model with C
has *k* - 1 free paths into C plus *p* paths from C. The chi-square
difference therefore has *kp* - (*k* - 1 + *p*) = (*k* - 1)(*p* - 1) degrees of freedom,
where *p* is the number of endogenous variables. If the model with C fits significantly worse, then the
composite should be formed differently for the different endogenous variables.

Ken Bollen discusses in several papers how a formative construct can be endogenous and can have a disturbance, but those topics are not discussed here. If interested, go to his website.

**Second-Order or Hierarchical Latent Variable**

A second-order latent variable is a latent variable whose indicators are
themselves latent variables. Such a latent variable would then have no measured
indicators. It would have a disturbance if it were caused.

The rules of identification for latent variables still hold. First, the scale of the second-order factor must be fixed, either by fixing one of its loadings to one, which is what is usually done, or by standardizing the variable if it is exogenous. Second, there must be a sufficient number of indicators; usually two are sufficient.

An example of a second-order factor might be Liberal-Conservative, whose indicators might be the latent variables of attitudes toward Social Issues, attitudes toward the Economy, and attitudes toward Defense. In the example model, the second-order latent variable causes the desire to vote for two candidates, A and B.

Note that the second-order construct “mediates” the effect of the three first-order factors. More technically, the second-order factor is a spurious cause: it is a common cause of both the first-order factors and the outcome variables.

The major uses of a second-order factor are as follows. First, if one has a construct but finds that it is multi-dimensional, one can preserve the construct by creating a second-order factor. Second, if a set of latent variables all cause the same construct, their collinearity may make it difficult to separate their effects, but by having the causality work through a single second-order factor, the collinearity is reduced. Third, by having just one latent variable instead of many, a second-order model is more parsimonious.

Note that first-order factors have disturbances, which should not be treated as measurement error. The disturbances reflect variance in the first-order factor that is not explained by the second-order factor. The disturbance variance should be non-trivial and statistically significant. If it is not, then the first-order factor essentially correlates perfectly with the second-order factor, and the two are no different. Heywood cases, such as negative disturbance variances, are conceptually problematic.
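As a rough illustration of the disturbance variance, the sketch below works in a fully standardized metric and assumes that the first-order factors are related only through the second-order factor. The loadings are hypothetical, and the helper name `is_heywood` is introduced here for illustration only. Under these assumptions the disturbance variance of a first-order factor is one minus its squared loading, and the implied correlation between two first-order factors is the product of their loadings.

```python
# Hypothetical standardized loadings of the first-order factors
# on the second-order factor (Liberal-Conservative).
loadings = {"Social Issues": 0.8, "Economy": 0.7, "Defense": 0.6}

# Standardized disturbance variance: the share of first-order factor
# variance not explained by the second-order factor.
disturbance_var = {name: 1.0 - lam ** 2 for name, lam in loadings.items()}

# Implied correlation between each pair of first-order factors,
# assuming they are related only through the second-order factor.
names = list(loadings)
implied_corr = {
    (a, b): loadings[a] * loadings[b]
    for i, a in enumerate(names)
    for b in names[i + 1:]
}


def is_heywood(loading):
    """Return True when a standardized loading implies a negative disturbance variance."""
    return 1.0 - loading ** 2 < 0.0


for name, variance in disturbance_var.items():
    print(name, round(variance, 4), "Heywood" if is_heywood(loadings[name]) else "ok")
```

A loading near one leaves almost no disturbance variance, which is the situation in which the first-order and second-order factors become indistinguishable.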