Introduction
The vast majority of the statistical methods used in the behavioral sciences, and many of those used in the natural sciences, can be seen as special cases of structural equation modeling (SEM). This applies to both univariate (a.k.a. bivariate) and multivariate statistical analysis methods (Hair et al., 1987). This can be demonstrated through a sequential logical inference process. In short, it can be shown that most of these methods are instances of multiple regression analysis, which is itself an instance of path analysis, which in turn is an instance of SEM.
Methods like ANOVA, ANCOVA, MANOVA and MANCOVA can be shown to be special cases of multiple regression analysis (Hair et al., 1987; Rencher, 1998). In multiple regression analysis, hypothesis testing is typically conducted through the calculation of coefficients of association between multiple independent variables and one main dependent variable. These coefficients of association normally take the form of standardized partial regression coefficients (Rencher, 1998; Rosenthal & Rosnow, 1991). The corresponding P values are the probabilities of obtaining coefficients at least as large in magnitude if no underlying relationship existed, so low P values are taken as evidence that the relationships reflected in the coefficients are “real”.
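The claim that ANOVA is a special case of regression can be checked numerically in the simplest setting. The sketch below, which uses made-up data and hypothetical function names, computes the F statistic of a two-group one-way ANOVA and the F statistic of a regression of the same outcome on a 0/1 group dummy variable. It is an illustration under those assumptions, not a general proof.

```python
# Sketch: a two-group one-way ANOVA reproduced as a regression on a 0/1 dummy.
# The data and all function names are illustrative assumptions.

def mean(values):
    return sum(values) / len(values)

def anova_f(group_a, group_b):
    """F statistic of a one-way ANOVA with two groups."""
    everything = group_a + group_b
    grand = mean(everything)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (group_a, group_b))
    ss_within = sum((v - mean(g)) ** 2 for g in (group_a, group_b) for v in g)
    df_between, df_within = 1, len(everything) - 2
    return (ss_between / df_between) / (ss_within / df_within)

def regression_f(group_a, group_b):
    """F statistic of the regression of the outcome on a 0/1 group dummy."""
    y = group_a + group_b
    x = [0.0] * len(group_a) + [1.0] * len(group_b)
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    r_squared = sxy ** 2 / (sxx * syy)
    # F = (R^2 / 1) / ((1 - R^2) / (n - 2)) for a single predictor
    return r_squared / (1.0 - r_squared) * (len(y) - 2)

control = [4.1, 5.0, 3.8, 4.6, 5.2]   # hypothetical scores, group 1
treated = [5.9, 6.4, 5.1, 6.8, 6.0]   # hypothetical scores, group 2

f_anova = anova_f(control, treated)
f_regression = regression_f(control, treated)
print(f_anova, f_regression)
```

The two printed values agree up to floating-point rounding, because the regression's fitted values are the two group means. With more than two groups, the same idea would require one dummy variable per group beyond the first.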
Path analysis is a method developed by Sewall Wright in the 1930s (Wolfle, 1999; Wright, 1934) and later “rediscovered” by statisticians and social scientists. Sewall Wright was an evolutionary biologist and animal breeder. He was also one of the founders of the field of population genetics, which unified Darwin’s theory of evolution with Mendel’s theory of genetics. Another co-founder of the field of population genetics was Ronald A. Fisher, who also made many contributions to the field of statistics (Hair et al., 1987; Kock, 2009).
Any path analysis model can be decomposed into one or more multiple regression models (Gefen et al., 2000; Kline, 1998). Each of the multiple regression models can then be solved separately, and the solutions combined into one main solution to the path analysis model. In this sense, multiple regression analysis can be seen as a special case of path analysis. Since SEM is essentially path analysis with latent variables (LVs), path analysis can be seen as a special case of SEM (Maruyama, 1998). As a corollary, all of the methods discussed above can also be seen as special cases of SEM.
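The decomposition described above can be sketched for a small, hypothetical path model in which X points to M, and both X and M point to Y. The model is solved as two separate regressions (M on X, then Y on X and M), and the results are combined. The data, variable names, and helper functions are assumptions made for illustration. The two-predictor formula used is the standard closed form for standardized partial regression coefficients.

```python
# Sketch: solving the path model X -> M -> Y (plus X -> Y) as two regressions.
# Data and names are illustrative assumptions.
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

def standardized_betas(r_y1, r_y2, r_12):
    """Standardized partial regression coefficients for two predictors."""
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
m = [1.2, 1.9, 3.4, 3.1, 5.5, 5.2, 7.4, 7.0]
y = [2.0, 1.5, 3.9, 4.2, 4.8, 6.9, 6.1, 8.3]

r_xm, r_xy, r_my = pearson(x, m), pearson(x, y), pearson(m, y)

# Regression 1: M on X. With a single predictor, the standardized
# coefficient equals the correlation.
path_x_to_m = r_xm

# Regression 2: Y on X and M.
path_x_to_y, path_m_to_y = standardized_betas(r_xy, r_my, r_xm)

# Combined solution: direct and indirect effects of X on Y.
direct = path_x_to_y
indirect = path_x_to_m * path_m_to_y
print(path_x_to_m, path_x_to_y, path_m_to_y, direct + indirect)
```

In this particular model the direct and indirect effects add up to the plain correlation between X and Y, which offers a quick consistency check on the separately solved regressions.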
In SEM, LV scores are calculated as weighted averages of their respective indicators. Usually there are two or more indicators for each LV, although that is not always the case. Once LV scores are calculated, the SEM solution problem is reduced to the solution of a path analysis model. That is achieved through the calculation of path coefficients and respective P values, as well as several other ancillary statistical coefficients. The path coefficients are standardized partial regression coefficients, which are mathematically identical to those obtained through multiple regression analyses.
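The two stages described above, LV scores first and path analysis second, can be sketched as follows. The indicator data, the LV names, and in particular the weights are hypothetical: in actual SEM software the weights are estimated by the algorithm, whereas here they are simply fixed by hand to keep the illustration short.

```python
# Sketch: LV scores as weighted averages of indicators, followed by a
# one-path model. Data, LV names, and weights are illustrative assumptions.
from math import sqrt

def standardize(values):
    """Rescale to mean 0 and sample standard deviation 1."""
    n = len(values)
    mu = sum(values) / n
    sd = sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    return [(v - mu) / sd for v in values]

def lv_scores(indicators, weights):
    """Weighted average of standardized indicators, then re-standardized."""
    z = [standardize(column) for column in indicators]
    total = sum(weights)
    raw = [sum(w * column[i] for w, column in zip(weights, z)) / total
           for i in range(len(z[0]))]
    return standardize(raw)

def path_coefficient(predictor, outcome):
    """Standardized regression coefficient for one standardized predictor."""
    n = len(predictor)
    return sum(p * o for p, o in zip(predictor, outcome)) / (n - 1)

# Six hypothetical respondents; each inner list is one indicator.
satisfaction_indicators = [[4, 5, 3, 2, 5, 1],
                           [5, 4, 3, 2, 4, 2],
                           [4, 4, 2, 3, 5, 1]]
loyalty_indicators = [[3, 5, 2, 2, 4, 1],
                      [4, 5, 3, 1, 5, 2]]

# Hand-picked weights (an assumption; normally estimated, not chosen).
satisfaction = lv_scores(satisfaction_indicators, [0.4, 0.3, 0.3])
loyalty = lv_scores(loyalty_indicators, [0.5, 0.5])

path = path_coefficient(satisfaction, loyalty)
print(path)
```

Once the LV scores exist, nothing in the second stage refers to the indicators anymore, which is why the remaining problem is an ordinary path analysis.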
The calculation of weights linking indicators to LVs is one of the key aspects that differentiate SEM approaches. Those approaches can be divided into two main types: variance- and covariance-based (Chin et al., 2003; Gefen et al., 2000; Haenlein & Kaplan, 2004; Kline, 1998). One of the main advantages of variance-based SEM is that it employs robust statistics to calculate P values, and thus can be seen as a nonparametric equivalent to covariance-based SEM. That is, unlike covariance-based SEM, variance-based SEM typically yields robust results even in the presence of small samples and multivariate deviations from normality (Chin et al., 2003; Gefen et al., 2000). Variance-based SEM is often referred to as PLS-based SEM, where “PLS” stands for “partial least squares” or “projection to latent structures”. The term “PLS-based SEM” is actually more commonly found in the literature than the term “variance-based SEM” (Chin et al., 2003; Haenlein & Kaplan, 2004).
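One common way of obtaining P values without assuming multivariate normality is resampling. The sketch below assumes bootstrapping as the resampling method and a normal approximation for turning the resulting ratio into a P value. Both choices, along with the data and function names, are assumptions made for illustration and are not taken from any particular software package.

```python
# Sketch: a bootstrap standard error and P value for one standardized
# coefficient. Bootstrapping and the normal approximation are assumptions.
import random
from math import sqrt, erfc

def pearson(a, b):
    """Pearson correlation, which equals the standardized coefficient
    when there is a single predictor."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

def bootstrap_p_value(x, y, resamples=500, seed=42):
    """Return (estimate, standard error, two-tailed P value)."""
    rng = random.Random(seed)          # seeded so results are repeatable
    n = len(x)
    estimate = pearson(x, y)
    draws = []
    while len(draws) < resamples:
        rows = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        xs = [x[i] for i in rows]
        ys = [y[i] for i in rows]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue                   # skip resamples with no variation
        draws.append(pearson(xs, ys))
    centre = sum(draws) / len(draws)
    std_error = sqrt(sum((d - centre) ** 2 for d in draws) / (len(draws) - 1))
    ratio = estimate / std_error
    p_two_tailed = erfc(abs(ratio) / sqrt(2))   # normal approximation
    return estimate, std_error, p_two_tailed

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.3, 1.1, 4.2, 3.0, 6.5, 4.9, 6.1, 9.0, 7.2, 9.8]

estimate, std_error, p_value = bootstrap_p_value(x, y)
print(estimate, std_error, p_value)
```

The standard error here comes from the spread of the coefficient across resamples rather than from a formula that presumes a particular distribution of the data.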