Introduction
Structural equation modeling (SEM) is extensively used in many areas of research, including various business disciplines, as well as the social and behavioral sciences (Kline, 2010; Kock, 2014; Schumacker & Lomax, 2004). The techniques underlying SEM are relevant for the incipient field of business data analytics (Abdelhafez, 2014; Cech et al., 2014; Lee et al., 2014; Liu & Shi, 2015; Wang & Zhou, 2014). SEM employs latent variables, which are measured indirectly, and with error, through sets of “observed” or “manifest” variables; the variables in the set associated with a given latent variable are normally called its “indicators”. Latent variables typically refer to perception-based constructs (e.g., satisfaction with one’s job). Indicators normally store numeric answers to sets of questions in questionnaires, with each set designed to refer to one latent variable and expected to measure it with a certain degree of imprecision.
Many SEM methods have been proposed over the years. Two main classes of methods have gained wider acceptance: covariance-based and PLS-based SEM (Hair et al., 2011; Kline, 2010; Kock, 2014; Kock & Lynn, 2012). Covariance-based SEM, often viewed as the classic form of SEM, builds on strong parametric assumptions (e.g., multivariate normality) and relies on the minimization of differences between the model-implied and the empirical indicator covariance matrices.
PLS-based SEM is generally nonparametric in design, building largely on techniques that make no distributional assumptions. It has a few advantages over covariance-based SEM. First, it virtually always converges to a solution, even with complex models, small sample sizes, and severely non-normal data (Hair et al., 2011; Tenenhaus et al., 2005). Second, PLS-based SEM generates latent variable scores, which can be used in further analyses, such as analyses that attempt to uncover and model nonlinear relationships among latent variables (Brewer et al., 2012; Guo et al., 2011; Kock, 2010). Finally, leading software tools for conducting PLS-based SEM (e.g., WarpPLS) tend to be viewed as fairly easy to use by a wide range of researchers.
However, PLS-based SEM builds latent variables as exact linear combinations of their indicators, without explicitly accounting for measurement error. Strictly speaking, these are not latent variables but “composites” (McDonald, 1996). Because of this, some argue that PLS-based SEM should not be referred to as an “SEM” technique, while others dismiss the distinction as merely a semantic issue (Hair et al., 2011). This is one of the reasons why PLS-based SEM is sometimes referred to as “PLS path modeling” (Tenenhaus et al., 2005).
Because PLS-based SEM does not explicitly account for measurement error, it often yields path coefficient estimates that, as sample sizes grow to infinity, converge to values of lower magnitude than the true values. Since path coefficients are proportional to correlations, the amount of underestimation for each path can be approximated through the correlation attenuation factor (Nunnally & Bernstein, 1994), expressed in (1). In this equation, $r(C_i, C_j)$ is the attenuated correlation between composites $C_i$ and $C_j$ that refer to two correlated latent variables $F_i$ and $F_j$; $r(F_i, F_j)$ is the correlation between the latent variables, and $\alpha_i$ and $\alpha_j$ are the true reliabilities associated with the latent variables. We use the symbols $F$ and $C$ throughout to refer to latent variables (or factors) and associated composites, respectively:

$$r(C_i, C_j) = r(F_i, F_j)\,\sqrt{\alpha_i \alpha_j} \qquad (1)$$
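The attenuation effect can be illustrated numerically. The following Python sketch simulates two correlated latent variables, builds a composite for each as a plain sum of its indicators, and compares the observed correlation between the composites with the value implied by multiplying the latent correlation by the square root of the product of the two reliabilities. Several elements are assumptions made only for this example: unit-weighted composites (PLS-based SEM estimates its own weights, so this is not the PLS algorithm itself), three indicators per latent variable, a common loading of 0.7, a latent correlation of 0.5, and all function and variable names.

```python
import math
import random


def attenuated_correlation(r_factors, alpha_i, alpha_j):
    """Correlation between two composites implied by the attenuation formula."""
    return r_factors * math.sqrt(alpha_i * alpha_j)


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, r_factors=0.5, loading=0.7, k=3, seed=1):
    """Simulate two correlated factors and their unit-weighted composites.

    Returns (correlation between composites, true reliability of a composite).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    error_sd = math.sqrt(1 - loading ** 2)  # keeps each indicator at unit variance
    c_i, c_j = [], []
    for _ in range(n):
        f_i = rng.gauss(0, 1)
        # f_j has unit variance and correlates with f_i at r_factors
        f_j = r_factors * f_i + math.sqrt(1 - r_factors ** 2) * rng.gauss(0, 1)
        # Composite = sum of k indicators, each = loading * factor + error
        c_i.append(sum(loading * f_i + error_sd * rng.gauss(0, 1) for _ in range(k)))
        c_j.append(sum(loading * f_j + error_sd * rng.gauss(0, 1) for _ in range(k)))
    # True reliability: share of composite variance that is due to the factor
    true_var = (k * loading) ** 2
    alpha = true_var / (true_var + k * error_sd ** 2)
    return pearson(c_i, c_j), alpha


observed, alpha = simulate()
predicted = attenuated_correlation(0.5, alpha, alpha)
print(f"observed={observed:.3f} predicted={predicted:.3f} alpha={alpha:.3f}")
```

Under these assumptions the composite correlation should fall close to the predicted attenuated value and clearly below the latent correlation of 0.5, which is the underestimation the text describes. Raising the loading or the number of indicators increases the reliability and shrinks the gap.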