2013 Discussion of IS Empirical Research Methods: DISCOVERING UNOBSERVED HETEROGENEITY

RESEARCH ESSAY
DISCOVERING UNOBSERVED HETEROGENEITY IN
STRUCTURAL EQUATION MODELS TO
AVERT VALIDITY THREATS¹
Jan-Michael Becker
Department of Marketing and Brand Management, University of Cologne,
Cologne 50923 GERMANY {jbecker@wiso.uni-koeln.de}
Arun Rai
Center for Process Innovation and Department of Computer Information Systems, Robinson College of Business,
Georgia State University, Atlanta, GA 30303 U.S.A. {arunrai@gsu.edu}
Christian M. Ringle
Institute for Human Resource Management and Organizations, Hamburg University of Technology (TUHH),
Hamburg 21073 GERMANY {ringle@tuhh.de} and
Faculty of Business and Law, University of Newcastle, Callaghan, NSW 2308 AUSTRALIA {christianringle@newcastle.edu.au}
Franziska Völckner
Department of Marketing and Brand Management, University of Cologne,
Cologne 50923 GERMANY {voelckner@wiso.uni-koeln.de}
A large proportion of information systems research is concerned with developing and testing models pertaining
to complex cognition, behaviors, and outcomes of individuals, teams, organizations, and other social systems
that are involved in the development, implementation, and utilization of information technology. Given the
complexity of these social and behavioral phenomena, heterogeneity is likely to exist in the samples used in IS
studies. While researchers now routinely address observed heterogeneity by introducing moderators, a priori
groupings, and contextual factors in their research models, they have not examined how unobserved
heterogeneity may affect their findings. We describe why unobserved heterogeneity threatens different types of
validity and use simulations to demonstrate that unobserved heterogeneity biases parameter estimates, thereby
leading to Type I and Type II errors. We also review different methods that can be used to uncover unobserved
heterogeneity in structural equation models. While methods to uncover unobserved heterogeneity in
covariance-based structural equation models (CB-SEM) are relatively advanced, the methods for partial least
squares (PLS) path models are limited and have relied on an extension of mixture regression, finite mixture
partial least squares (FIMIX-PLS), and on distance measure-based methods, which have mismatches with some
characteristics of PLS path modeling. We propose a new method, prediction-oriented segmentation
(PLS-POS), to overcome the limitations of FIMIX-PLS and other distance measure-based methods and conduct
extensive simulations to evaluate the ability of PLS-POS and FIMIX-PLS to discover unobserved heterogeneity
in both structural and measurement models. Our results show that both PLS-POS and FIMIX-PLS perform
well in discovering unobserved heterogeneity in structural paths when the measures are reflective and that
PLS-POS also performs well in discovering unobserved heterogeneity in formative measures. We propose an
unobserved heterogeneity discovery (UHD) process that researchers can apply to (1) avert validity threats by
uncovering unobserved heterogeneity and (2) elaborate on theory by turning unobserved heterogeneity into
observed heterogeneity, thereby expanding theory through the integration of new moderator or contextual
variables.
Keywords: Unobserved heterogeneity, validity, structural equation modeling, partial least squares, formative
measures, prediction-oriented segmentation
¹ Ron Thompson was the accepting senior editor for this paper. Ron Cenfetelli served as the associate editor.
The appendices for this paper are located in the Online Supplements section of the MIS Quarterly’s website (http://www.misq.org).
MIS Quarterly Vol. 37 No. 3, pp. 665-694/September 2013
Becker et al./Discovering Unobserved Heterogeneity in SEM
Introduction
Assuming that data in empirical studies are homogeneous and represent a single population is often
unrealistic in the social and behavioral sciences, such as information systems, management, and
marketing (Rust and Verhoef 2005; Wedel and Kamakura 2000). There may be significant heterogeneity in
the data across unobserved groups, and it can bias parameter estimates, lead to Type I and Type II
errors, and result in invalid conclusions (Jedidi et al. 1997). Consider the following technology
acceptance model (TAM) example. A researcher is interested in individuals’ intention to use an IT
system or service (Davis et al. 1989; Venkatesh 2000; Venkatesh and Davis 2000; Venkatesh et al. 2003).
Informed by existing theory, the researcher proposes a model in which perceived usefulness (PU) and
perceived ease of use (PEOU) of the IT system explain intention to use the system (IU) (Figure 1).
The empirical results reveal that PU and PEOU are equally important in explaining IU. However, the
theory and model overlook the two underlying groups: experienced IT users (Figure 1a, segment 1) and
inexperienced IT users (Figure 1a, segment 2). Experienced users show a strong positive relationship
between PU and IU and a weak or nonsignificant relationship between PEOU and IU. In contrast,
inexperienced users show a strong positive relationship between PEOU and IU and a weak or
nonsignificant relationship between PU and IU (Figure 1a). In this scenario, drawing inferences based
on results from the overall sample would lead to Type I errors, as we would be overgeneralizing the
significant findings from the overall sample to the underlying user groups: one with a nonsignificant
estimate for PEOU → IU and the other with a nonsignificant estimate for PU → IU. If the model is not
refined to accommodate this unobserved heterogeneity, a system that is unsuitable for either user
group (i.e., one with average usefulness and average ease of use) may be provided to all users.
In addition, a study may not find PEOU to be a significant predictor of IU because of unobserved
heterogeneity across two groups of users (i.e., experienced versus inexperienced). If experienced
users (Figure 1b, segment 1) perceive an easy-to-use system (i.e., high PEOU) as being too simple to
fulfill their needs, they may show a strong negative relationship between PEOU and IU. In contrast,
if inexperienced users (Figure 1b, segment 2) show a strong positive relationship between PEOU and
IU, as in the first example, a sign reversal occurs between the two groups with regard to the effect
of PEOU on IU, thereby leading to an overall nonsignificant effect of PEOU on IU and a Type II error.
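The two TAM examples can be reproduced with a small simulation. The sketch below is only an illustration under stated assumptions: it uses ordinary least squares on simulated composite scores rather than a structural equation model, and the coefficient values, sample size, noise level, and the helper names `fit_ols` and `simulate_tam` are hypothetical choices that do not come from the paper.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients without intercept (data are mean-zero by construction)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_tam(beta_experienced, beta_inexperienced, n=1000, noise_sd=0.5, seed=42):
    """Simulate PU/PEOU -> IU for two hidden segments; fit per-segment and pooled models."""
    rng = np.random.default_rng(seed)
    fits, X_parts, y_parts = {}, [], []
    segments = (("experienced", beta_experienced), ("inexperienced", beta_inexperienced))
    for name, beta in segments:
        X = rng.standard_normal((n, 2))  # columns: PU, PEOU (hypothetical scores)
        y = X @ np.asarray(beta) + noise_sd * rng.standard_normal(n)
        fits[name] = fit_ols(X, y)
        X_parts.append(X)
        y_parts.append(y)
    # The pooled fit ignores segment membership, as a researcher unaware of the groups would.
    fits["pooled"] = fit_ols(np.vstack(X_parts), np.concatenate(y_parts))
    return fits


# Example 1: each segment relies on a different predictor (assumed true effects).
example_1 = simulate_tam((0.6, 0.0), (0.0, 0.6))
# Example 2: the PEOU effect reverses sign across segments (assumed true effects).
example_2 = simulate_tam((0.3, -0.5), (0.3, 0.5))

for label, fits in (("Example 1", example_1), ("Example 2", example_2)):
    for name, coef in fits.items():
        print(label, name, "PU=%.2f PEOU=%.2f" % (coef[0], coef[1]))
```

In the first run the pooled estimates for PU and PEOU land near the average of the segment effects, so both predictors look moderately important even though each is irrelevant for one segment. In the second run the pooled PEOU estimate is close to zero although the effect is strong in both segments.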
Recent TAM models acknowledge existing heterogeneity by incorporating experience as a moderator of
PEOU’s effect on IU. However, before its inclusion in the theory, experienced versus inexperienced
users represented unobserved heterogeneity that could lead to biased findings on the effects of PU
and PEOU on IU. This illustration shows how not accounting for unobserved heterogeneity can lead to
misinterpretations and invalid conclusions in IS research, a point we emphasize later in the paper
based on a review of 12 meta-analysis studies on key IS phenomena (see Table A1 in Appendix A).
Figure 1. Examples for Unobserved Heterogeneity in TAM: (a) TAM Example 1, (b) TAM Example 2
Despite the threats to validity from unobserved heterogeneity, there are important gaps in the IS
literature about the specific threats to validity and how to safeguard against them:
(1) While IS studies now routinely address observed heterogeneity by introducing moderators, a priori
groupings, contextual factors, and control variables in their research models, they have not
considered unobserved heterogeneity in their data. In fact, none of the papers appearing in the
field’s two most widely recognized journals (MIS Quarterly and Information Systems Research)
over the last 20 years that have developed and tested structural equation models have examined
unobserved heterogeneity. Our first research objective is to introduce the concept of unobserved
heterogeneity in the IS literature and to show how IS researchers can safeguard against biases
and facilitate theory development.
(2) While research in some fields notes that unobserved heterogeneity threatens empirical results and
their interpretation, a systematic analysis of the threats to specific types of validity is
missing in the literature. Our second research objective is to evaluate the implications of
unobserved heterogeneity for four types of validity (i.e., instrument, internal, statistical
conclusion, and external validity; Cook and Campbell 1976, 1979; Straub 1989), thereby broadening
our understanding of the specific validity threats that arise from unobserved heterogeneity.
(3) In structural equation modeling (SEM), unobserved heterogeneity is a validity threat not only for
the structural model but also for the measurement model, regardless of whether the measures are
reflective or formative. While heterogeneity in reflective measures has been discussed in terms
of measurement equivalence or invariance (MEI) (e.g., Steenkamp and Baumgartner 1998; Vandenberg
and Lance 2000), the implications of unobserved heterogeneity for formative measures have not
been examined. Our third research objective is to evaluate the implications of unobserved
heterogeneity for formative measures.
(4) In contrast to covariance-based SEM (CB-SEM; e.g., Jöreskog 1978, 1982), research on partial least
squares (PLS) path modeling (e.g., Chin 1998; Lohmöller 1989; Wold 1982) has paid limited
attention to unobserved heterogeneity. Only recently has a method been proposed to detect
unobserved heterogeneity in PLS path models: finite mixture partial least squares (FIMIX-PLS;
Hahn et al. 2002; Sarstedt and Ringle 2010). However, FIMIX-PLS does not account for
heterogeneity in the measurement model and assumes multivariate normal distributions for latent
variables. Furthermore, there is limited evidence of this method’s performance in discovering
unobserved heterogeneity. Our fourth research objective is to propose and evaluate a new method,
PLS prediction-oriented segmentation (PLS-POS), which does not rely on distributional assumptions
and uncovers unobserved heterogeneity not only in the structural model but also in the
measurement model.
(5) Researchers facing the problem of unobserved heterogeneity in their empirical work lack guidelines
on how to apply methods systematically to uncover unobserved heterogeneity. Therefore, our fifth
research objective is to develop an unobserved heterogeneity discovery (UHD) process to guide
researchers in applying methods to ensure the validity of findings and to elaborate theory by
turning unobserved heterogeneity into observed heterogeneity.
By addressing the above research objectives, we make six contributions. First, we provide evidence
and reasoning for why unobserved heterogeneity is an important issue in IS research. Second, we
demonstrate that unobserved heterogeneity in SEM has implications not only for the structural model
but also for measurement models. Third, we identify the implications of unobserved heterogeneity for
different types of validity and surface the importance of uncovering unobserved heterogeneity to
avoid validity threats. Fourth, we introduce the new PLS-POS method for detecting unobserved
heterogeneity. This method is specifically developed to fit PLS path modeling, as it employs a
prediction-oriented and nonparametric approach and uncovers heterogeneity in both the structural
model and the (formative) measurement models, and thereby overcomes the limitations of FIMIX-PLS
and other distance measure-based methods. Fifth, we evaluate FIMIX-PLS and PLS-POS using an
extensive simulation study and generate important insights into the performance of the two methods
in uncovering unobserved heterogeneity in PLS path models. Sixth, we provide a UHD process to guide
researchers in discovering and addressing unobserved heterogeneity in structural equation models.
Concept of Heterogeneity and Its Treatment in IS Research
Researchers can obtain different parameter estimates when they consider differences among
observations relative to when they overlook them. However, heterogeneity among observations is not
necessarily captured by variables that are preconceived by the researcher and specified by existing
theory, as it can exist beyond these previously identified variables (Jedidi et al. 1997). As a
consequence, it is necessary to differentiate between the following two types of heterogeneity:
(1) observed heterogeneity, when subpopulations are defined a priori based on known variables, and
(2) unobserved heterogeneity, when the subpopulations in the data are unknown (Lubke and Muthén 2005).
Observed Heterogeneity
Observed heterogeneity occurs when differences in parameter estimates between groups are expected
a priori for the phenomenon, that is, when group differences are explained by existing theory that
incorporates moderators or contextual factors. Examples of such moderators or contextual factors
considered in IS research include individual cultural differences (e.g., individualism versus
collectivism; Srite and Karahanna 2006), individual demographic differences (e.g., gender, income
levels, and education; Hsieh et al. 2008; Venkatesh et al. 2003), and organizational demographic
differences (e.g., large versus small firms; Rai et al. 2006). In our TAM example from earlier,
existing theory expects gender-based heterogeneity in structural paths (i.e., men are expected to
have a stronger relationship between PU and IU, and women are expected to have a stronger
relationship between PEOU and IU) (e.g., Venkatesh and Morris 2000). Moreover, existing theory
expects contextual variables, such as voluntariness or task type (e.g., Venkatesh and Davis 2000),
or psychographic variables, such as personal innovativeness and computer attitude, to cause
heterogeneity in the relationships among the TAM constructs (e.g., Venkatesh and Bala 2008).
Unobserved Heterogeneity
When theory does not assume heterogeneity even though it exists, or when theory indicates
heterogeneity but the specified group variables do not sufficiently capture it in the population,
unobserved heterogeneity occurs. In such situations, researchers need to uncover unobserved
heterogeneity by segmenting data to form homogeneous groups. If the differences uncovered by
segmentation can be explained post hoc using contextual or demographic variables (e.g., culture,
gender, experience, etc.), making the groups accessible, theory can be expanded accordingly, and
unobserved heterogeneity is turned into observed heterogeneity for future studies. If the
differences cannot be explained by well-known contextual variables, the researcher has to consider
complementary theoretical explanations for the phenomenon.
Treatment of Heterogeneity in IS Research
Given the complexity of the social and behavioral phenomena tackled in IS research, heterogeneity is
likely to exist in samples that are used to develop, test, and refine models. If this heterogeneity
is not uncovered and controlled, the (unobserved) heterogeneity can bias results and conclusions
(e.g., Ansari et al. 2000; Johns 2006). Consequently, unobserved heterogeneity is receiving
increasing attention in related disciplines (e.g., marketing, where scholars study similar complex
phenomena pertaining to consumer choices and preferences, the alignment of firm-level marketing
strategies, interorganizational relationships, and the business value of tangible and intangible
resources) to safeguard against biases and probe the underlying reasons for unobserved heterogeneity
(e.g., Rigdon et al. 2010). This enhances the likelihood of obtaining valid results as well as of
generating greater theoretical contributions. Methodologists in marketing, econometrics, and
psychology have proposed advances to uncover unobserved heterogeneity in various approaches, for
instance, regression analysis (DeSarbo and Cron 1988; Späth 1979; Wedel and DeSarbo 1994), CB-SEM
(e.g., Ansari et al. 2000; Jedidi et al. 1997; Muthén 1989), panel data models (e.g., Allenby and
Rossi 1998; Popkowski Leszczyc and Bass 1998), and conjoint analysis (e.g., DeSarbo et al. 1995;
Gilbride et al. 2006; Lenk et al. 1996).
While IS studies now routinely address observed heterogeneity by introducing moderators, a priori
groupings, contextual factors, and control variables in their research models, they have not
examined threats to validity due to unobserved heterogeneity. Our review of 12 meta-analysis studies
that synthesize the findings of empirical research across various IS phenomena (e.g., technology
acceptance, IT investment payoff, IT innovation adoption, IS implementation success, and group
support systems) reveals that all of them identify inconsistent, conflicting, or mixed findings,
heterogeneity of effect sizes (Wang and Keil 2007, p. 9), wide variation in the predicted effects
(King and He 2006, p. 740), and correlations that vary across studies more than would be produced
by sampling error (Wu and Lederer 2009, p. A6) (see Table A1 in Appendix A). Most of these 12
meta-analysis studies note that these inconsistencies may be caused by the omission of key
contextual variables or moderators. However, investigating the known moderators or contextual
variables controls only for observed heterogeneity (Haenlein and Kaplan 2011); as long as the
relevant moderators and contextual variables are not specified in theory, population heterogeneity
will remain unobserved and threaten model validity. (In the next section, we discuss how unobserved
heterogeneity biases estimates and causes Type I and II errors.) Furthermore, uncovering unobserved
heterogeneity at the study level accelerates the theory-development cycle by generating insights
into relationships among constructs (Edmondson and McManus 2007). In a later section, we describe a
UHD process in which uncovering unobserved heterogeneity facilitates abduction (by raising the
possibilities of rival explanations not previously considered; Van de Ven 2007), directing
researchers to identify variables that account for unobserved heterogeneity and, through this
process, make segments accessible and turn unobserved heterogeneity into observed heterogeneity
(e.g., by discovering moderators and grouping variables). This introduction of constructs to
capture formerly unobserved heterogeneity revises models and theoretical explanations, making it
possible for the revised models to be tested in future research.
Effects of Heterogeneity on Structural Equation Models
Unobserved Heterogeneity in the Structural Model
In the context of SEM, heterogeneity can affect the structural model, the measurement model
(formative and reflective), or both (e.g., Ansari et al. 2000; Qureshi and Compeau 2009).
Unobserved heterogeneity can influence path coefficients in the structural model because the
parameter estimates are determined based on the overall sample, which pools observations across the
underlying (unobserved) groups. As a result, researchers may encounter the following biases:
(1) biased parameter estimates of structural paths, (2) nonsignificant estimates at the group level
becoming significant at the overall sample level that combines (unobserved) groups, (3) sign
differences in the parameter estimates across (unobserved) groups being masked as nonsignificant
results at the overall sample level that combines (unobserved) groups, and (4) decreased predictive
power of the model (R² of the endogenous variables). These biases can lead to Type I and Type II
errors and invalid inferences.
To substantiate that these biases occur due to unobserved heterogeneity, we conducted a simulation
of a PLS path model with the following three situations with two unobserved groups: (1) the
parameter estimates across the groups have the same sign but differ in absolute values, (2) the
parameter estimates across the groups have opposite signs, and (3) the parameter estimates are
nonsignificant for one group but significant for the other. Table 1 summarizes the findings (see
Appendix D for details).
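The three situations can be mimicked with a much simpler setup than the PLS simulation the authors report. The sketch below is an assumption-laden stand-in: it uses a single-predictor regression instead of a PLS path model, and the group coefficients, group size, noise level, and the function names `slope_and_t` and `run_situation` are hypothetical. It reports the slope and t-statistic for each group and for the pooled sample.

```python
import numpy as np


def slope_and_t(x, y):
    """OLS slope of y on x (with intercept) and the slope's t-statistic."""
    xc, yc = x - x.mean(), y - y.mean()
    slope = (xc @ yc) / (xc @ xc)
    resid = yc - slope * xc
    se = np.sqrt((resid @ resid) / (len(x) - 2) / (xc @ xc))
    return float(slope), float(slope / se)


def run_situation(beta_g1, beta_g2, n=300, noise_sd=1.0, seed=7):
    """Simulate two hidden groups with different true slopes and fit each and the pool."""
    rng = np.random.default_rng(seed)
    x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
    y1 = beta_g1 * x1 + noise_sd * rng.standard_normal(n)
    y2 = beta_g2 * x2 + noise_sd * rng.standard_normal(n)
    return {
        "group 1": slope_and_t(x1, y1),
        "group 2": slope_and_t(x2, y2),
        "pooled": slope_and_t(np.concatenate([x1, x2]), np.concatenate([y1, y2])),
    }


# Hypothetical true slopes for the three situations described in the text.
situations = {
    "1: same sign, different size": (0.7, 0.3),
    "2: opposite signs": (0.5, -0.5),
    "3: significant vs. zero": (0.6, 0.0),
}
results = {name: run_situation(*betas) for name, betas in situations.items()}
for name, res in results.items():
    print(name, {k: (round(s, 2), round(t, 1)) for k, (s, t) in res.items()})
```

Under these assumptions the pooled slope falls between the group slopes in situation 1, collapses toward zero in situation 2, and stays clearly nonzero in situation 3 even though the true effect in the second group is zero.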
The results show that unobserved heterogeneity biases the parameter estimates, decreases the R², and
increases the risk of Type I and Type II errors. Specifically, in all three simulated situations,
biases in the parameter estimates distort effect sizes and cause misinterpretation of the parameter
values, which is especially problematic for comparative hypotheses (e.g., path coefficient 1 > path
coefficient 2). When the group-specific parameters show inconsistent signs (i.e., situation 2, in
which signs are reversed across the groups) and when one of the groups involves nonsignificant
parameters while the other does not (i.e., situation 3), Type I and Type II errors are exacerbated
by the following: (1) If a researcher overlooks unobserved heterogeneity and obtains a significant
nonzero relationship between the constructs as the overall sample estimate, this researcher is
incorrectly overgeneralizing the significant relationship that exists in the first segment, thereby
leading to a Type I error with respect to the second segment.² (2) If a researcher overlooks
unobserved heterogeneity and obtains a nonsignificant relationship between the constructs as the
overall sample estimate, this researcher may overgeneralize the nonsignificant finding, which exists
only in the second segment, thereby leading to a Type II error with respect to the first segment.
In contrast, when all parameters are significant and show the same sign (situation 1), it is
unlikely that Type II errors will occur; in this situation, the occurrence of Type II errors depends
on the effect size and the degree to which the increase in standard errors due to unobserved
heterogeneity is compensated by the increased power of the larger sample size due to combining the
groups. The R² decreases in all situations, implying an inferior model fit to the overall sample;
the decrease in R² is greater when group-specific effect sizes are high; however, R² is almost
unaffected when the group-specific effects are low.

² This does not mean that there will be a Type I error in general (i.e., for both segments), but only
with respect to segment 2, where the true effect is zero. To be specific, the overall sample
estimate cannot show a significant nonzero relationship because of unobserved heterogeneity when all
segments have a true zero relationship.

Table 1. Conclusions from the Simulation Study on Heterogeneity Effects
Group 1 and Group 2 give the true group parameters (heterogeneity is uncovered). Biased, Type I
Error, Type II Error, and Lower R² describe the overall parameter estimates (heterogeneity is not
uncovered).
| Situation | Group 1 | Group 2 | Biased | Type I Error | Type II Error | Lower R² | Explanation for Type I and Type II Errors |
| 1. Significant in all groups with consistent signs | + | + | Yes | Not possible | Depends | Yes | Increase in standard errors vs. increased sample size |
| 1. (continued) | – | – | Yes | Not possible | Depends | Yes | Increase in standard errors vs. increased sample size |
| 2. Significant in all groups with inconsistent signs | – | + | Yes | Not possible | Likely | Yes | Effects cancel each other |
| 3. Significant in some groups but not in others | + or – | 0 | Yes | Likely | Likely | Yes | Depends on the effect size |
Notes: + significantly positive; – significantly negative; 0 nonsignificant.
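The link between effect size and the loss in R² can be made concrete with a short calculation. This is a sketch under strong assumptions that are not taken from the paper: a single standardized predictor with the same distribution in every group, equal group sizes, a common error variance, and simple regression instead of PLS. Under those assumptions the pooled slope equals the mean group slope, so the pooled R² is (mean β)² / (mean β² + σ²), while a model with group-specific slopes reaches mean β² / (mean β² + σ²). The function names are hypothetical.

```python
def pooled_r2(betas, noise_var):
    """Population R² when one slope is fitted across equal-sized groups."""
    mean_beta = sum(betas) / len(betas)
    mean_beta_sq = sum(b * b for b in betas) / len(betas)
    return mean_beta ** 2 / (mean_beta_sq + noise_var)


def segmented_r2(betas, noise_var):
    """Population R² when every group keeps its own slope."""
    mean_beta_sq = sum(b * b for b in betas) / len(betas)
    return mean_beta_sq / (mean_beta_sq + noise_var)


def r2_loss(betas, noise_var=0.5):
    """R² given up by ignoring the groups (assumed error variance of 0.5)."""
    return segmented_r2(betas, noise_var) - pooled_r2(betas, noise_var)


high = r2_loss((0.7, -0.7))   # strong opposite-signed effects
low = r2_loss((0.1, -0.1))    # weak opposite-signed effects
none = r2_loss((0.5, 0.5))    # homogeneous groups: nothing is lost

print(round(high, 3), round(low, 3), round(none, 3))
```

With these illustrative values the loss is large for strong group effects and close to zero for weak ones, and it vanishes when the groups share the same slope.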
Unobserved Heterogeneity in the Measurement Model
Measurement model specification requires the consideration of the nature of the relationship between
constructs and measures. There are two types of measurement models: reflective and formative
measures (Diamantopoulos and Winklhofer 2001; Jarvis et al. 2003). In reflective measures, changes
in the construct are reflected in changes in all of its indicators, and the direction of causality
is from the construct to the indicators. Reflective indicators are assessed in terms of their
loadings, which represent the simple correlation between the indicator and the construct. In
formative measures, the indicators do not reflect the underlying construct but are combined to form
it without any assumptions about the intercorrelation patterns among them. The direction of
causality is from the indicators to the construct, and the weights of formative indicators represent
the importance of each indicator in explaining the variance of the construct (Edwards and Lambert
2007; Petter et al. 2007; Wetzels et al. 2009).
Unobserved heterogeneity can lead to differences between measurement model weights and loadings
across groups. If the construct’s measures are reflective, unobserved heterogeneity may result in
different loadings when respondents across groups interpret and respond to measures differently or
when they provide information with different degrees of accuracy (Ansari et al. 2000). Thus, when
reflective measures are not equivalent across groups, MEI is not established (e.g., Steenkamp and
Baumgartner 1998; Vandenberg and Lance 2000). In this case, the construct does not capture the same
theoretical meaning across groups, implying that differences in the construct’s relationships with
other constructs cannot be compared across groups. That is, the group-specific parameters are only
interpretable at the group level, and the data should not be pooled across groups. For example,
when considering reflective measures of PU, users’ understanding of usefulness can differ
significantly across groups. If this is the case, one cannot combine the groups into an overall
sample because the construct measured does not capture the same meaning across groups. The
relationship between PU and other constructs would be biased as a result of the absence of invariant
measurement. However, the lack of MEI arising from heterogeneity provides valuable information that
structural parameters should not be compared between groups and that the data across the groups
should not be combined. As such, ignoring the heterogeneity and interpreting results based on the
overall sample would lead to invalid conclusions.
In contrast, when a construct’s measures are formative, unobserved heterogeneity can lead to
differences in the formative indicators’ weights across groups. While recent research has discussed
MEI in formative measures (Diamantopoulos and Papadopoulos 2010), it is important to uncover
formative indicator weight differences due to unobserved heterogeneity in order to avoid ambiguous
interpretations. Formative indicators cause variance in the construct and can be interpreted as
actionable attributes of a construct. The weights of formative indicators represent the relative
importance of the construct’s different facets. Therefore, the problems associated with unobserved
heterogeneity in formative measures are similar to those that occur in the structural model.
Consequently, ignoring differences in formative indicator weights due to unobserved heterogeneity
can bias parameter estimates and lead to Type I and Type II errors. Thus, when researchers find
formative indicator weights to be unstable and nonsignificant, in addition to exploring
multicollinearity (Cenfetelli and Bassellier 2009), they should also explore unobserved
heterogeneity.
As an example, assume that service quality (SERVQUAL) is measured using the following five formative
indicators: (1) tangibles, (2) reliability, (3) assurance, (4) empathy, and (5) responsiveness
(e.g., Cenfetelli and Bassellier 2009; Collier and Bienstock 2009; Parasuraman et al. 1988). Some
customers might favor the communication facets (e.g., empathy and responsiveness) when they evaluate
service quality, while others might favor the trust facets (e.g., assurance and reliability) in
their evaluation. These differences in customer perceptions result in different measurement weights
across the groups, although the underlying theoretical construct of service quality remains the
same. For example, two equally sized groups have measurement weights of w_g1 = [0.6, 0.6, 0.6, 0, 0]
for a certain formative construct in one group and w_g2 = [0.2, 0.2, 0.2, 0.6, 0.6] in the other
group. Combining these two groups in the overall sample results in roughly equal relative
importance (weights) for all indicators, with measurement weights of w = [0.4, 0.4, 0.4, 0.3, 0.3]
for the overall sample. As a consequence, the interpretation of the weights estimated using the
overall sample is misleading, and the formative measures based on the overall sample represent
neither the first group nor the second. Given this bias in the formative measures for service
quality, the relationship between service quality and other constructs (e.g., customer satisfaction)
is also likely to be biased.
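The arithmetic behind the weight example can be checked in a few lines. The sketch reads the group weights as decimals (0.6, 0.2, and so on) and assumes, as the overall weights quoted in the example suggest, that pooling two equally sized groups yields the simple mean of the group weights. An actual PLS estimate on pooled data need not equal this mean exactly, and the indicator labels are hypothetical.

```python
# Group-specific formative weights from the example, read as decimals.
w_g1 = [0.6, 0.6, 0.6, 0.0, 0.0]
w_g2 = [0.2, 0.2, 0.2, 0.6, 0.6]
indicators = ["x1", "x2", "x3", "x4", "x5"]  # hypothetical indicator labels

# Equal group sizes: the pooled weight is taken as the mean of the two group weights.
pooled = [round((a + b) / 2, 2) for a, b in zip(w_g1, w_g2)]

for name, g1, g2, p in zip(indicators, w_g1, w_g2, pooled):
    # Largest gap between the pooled weight and either group's true weight.
    worst_gap = max(abs(p - g1), abs(p - g2))
    print(f"{name}: group 1 = {g1}, group 2 = {g2}, pooled = {p}, worst gap = {worst_gap:.2f}")
```

Every pooled weight differs from both group weights, and the last two indicators look moderately important overall although they carry no weight at all in the first group.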
Implications of Unobserved Heterogeneity for Model Validity
If unobserved heterogeneity characterizes the data and results are based on the overall sample, the
estimated model lacks validity because it will not uncover the true effects of the underlying
groups. In a broad sense, validity is the extent to which a method (i.e., the design, the model, or
the construct) measures what it claims to measure. We elaborate on why unobserved heterogeneity
affects the major types of validity: (1) internal, (2) instrumental (including content, construct,
and criterion validity, and reliability), (3) statistical conclusion, and (4) external (e.g., Cook
and Campbell 1976, 1979; Heeler and Ray 1972; Straub 1989). See Table 2 for definitions of each type
of validity and explanations of how unobserved heterogeneity threatens it.
Unobserved heterogeneity is a threat to internal validity because contextual or group variables that
affect results are overlooked, thereby resulting in an incomplete model. The observations across
the 12 meta-analyses that we discussed earlier show that inconsistent findings arise when contextual
or group variables are omitted. Uncovering these variables and improving theory through the
discovery of unobserved heterogeneity safeguards against internal validity threats.
In addition, unobserved heterogeneity threatens statistical conclusion validity. Analyzing the
overall sample without accounting for heterogeneity increases standard errors and reduces (averages)
effect sizes, thereby biasing estimates and leading to Type I and Type II errors. (The simulations
in the previous section show how statistical conclusion validity is threatened by unobserved
heterogeneity.)
Our earlier discussion of unobserved heterogeneity shows that it can bias the measurement model
estimates of constructs, thereby adversely affecting instrument validity. There is a particular
threat to reliability (internal consistency) when measures show different correlation patterns or
error variances between groups. For example, experienced users might have a different understanding
of a system’s usefulness compared to inexperienced users, thereby leading to different correlation
patterns for the PU construct’s indicators. The respondents’ experience can also affect PU’s error
variance between groups, as inexperienced users might have higher variability in their responses
than experienced users, who have a clearer understanding of the system’s usefulness.
Unobserved heterogeneity can also threaten construct validity because differences in indicator
loadings and weights across groups will not be detected. As such, an evaluation of construct
validity based on the overall sample that overlooks unobserved heterogeneity will not reveal the
true group-specific measures of the constructs and therefore risks failing to detect that the
construct captures a different phenomenon for each group. Moreover, if the measures derived based
on the overall sample do not represent the true construct (e.g., PU), the biased construct can lead
to invalid inferences on relationships with other constructs, thereby threatening criterion
validity. Both threats are regularly addressed when testing for MEI in multigroup models (i.e.,
observed heterogeneity) (see Steenkamp and Baumgartner 1998; Vandenberg and Lance 2000) but are
usually overlooked in the context of unobserved heterogeneity.
In contrast, unobserved heterogeneity typically does not affect content validity because the
constructs’ measures are normally the same across groups and are grounded in theory. However, an
increase in the value of a formative measure’s error term due to unobserved heterogeneity can lead
to misinterpretations, as a high error term is typically associated with the construct measure’s
incompleteness (Diamantopoulos et al. 2008).
MIS Quarterly Vol. 37 No. 3/September 2013 671
Becker et al./Discovering Unobserved Heterogeneity in SEM
Table 2. Implications of Unobserved Heterogeneity for Model Validity
(Columns: Type of Validity; What Is It?; Threats Due to Unobserved Heterogeneity; Why Is It a Threat?)

Internal Validity
  What is it?
  • Is the effect due to unhypothesized variables?
  • Are there rival explanations for the findings or just one single explanation?
  Threats due to unobserved heterogeneity
  • There are other viable explanations for the findings, namely group differences that are not accounted for.
  Why is it a threat?
  • The observed effects are a result of unhypothesized and/or unmeasured variables (i.e., the groups and corresponding explanatory variables).
  • Example: The underlying theory does not include differences in the technology acceptance between experienced and inexperienced users.

Instrumental Validity

Content Validity
  What is it?
  • Do the indicators accurately reflect the theoretical domain?
  Threats due to unobserved heterogeneity
  Formative & Reflective
  • In general, heterogeneity does not affect content validity, as content validity is grounded in theory.
  Formative
  • The error term of the formative construct likely increases due to unobserved heterogeneity, which can be mistakenly interpreted as lack of content validity (Type II error).
  Why is it a threat?
  • The empirically relevant (i.e., significant) set of indicators may vary across groups.
  • Varying nonsignificant indicators across groups indicate problems with MEI, but this is a problem of construct validity in the sense of (not) capturing the right phenomenon.
  • Nonsignificant indicators should remain in the model if theoretically relevant.
  • Following Diamantopoulos et al. (2008), the error term in formative constructs represents those aspects of the construct domain not represented by the indicators. Understanding the error term in this way and assessing it without capturing unobserved heterogeneity may indicate insufficient content validity, although all important indicators are included in the formative construct.

Construct Validity
  What is it?
  • Are the chosen measures representing the true construct of the phenomenon?
  • Are the operationalizations of the constructs correct?
  Threats due to unobserved heterogeneity
  Formative & Reflective
  • Indicator weights/loadings estimated with the assumption that no underlying groups exist are biased if groups actually exist.
  Why is it a threat?
  • For formative measures, differences in the importance of indicators across groups lead to different measurement weights, although the phenomenon is still the same.
  • For reflective measures, when MEI is established across groups (i.e., there are no differences in the weights/loadings), there is no threat of unobserved heterogeneity to construct validity. Otherwise, the construct captures a different phenomenon for each group. Combining the measures at the overall sample level is not allowed.

Criterion Validity
  What is it?
  • Are inferences from the construct to a related behavioral criterion of interest accurate?
  Threats due to unobserved heterogeneity
  Formative & Reflective
  • Differences in construct perceptions across groups (i.e., different weights/loadings) lead to biased construct scores, which in turn influence (bias) the estimated relationship with other constructs.
  Why is it a threat?
  • The measures based on the overall sample do not represent the true group-specific measures of the constructs. This causes problems when interpreting the construct scores or their relationships with other constructs in the model.
  • For reflective measures, when there is no MEI established across groups, the apparently different phenomena across groups have varying and incomparable relationships with other constructs.

Reliability
  What is it?
  • Are the measures accurate?
  • Are the measures consistent?
  Threats due to unobserved heterogeneity
  Test-Retest Reliability (Formative & Reflective)
  • Not affected.
  Internal Consistency (Reflective)
  • Reliability (e.g., Cronbach’s alpha) at the overall sample level is negatively influenced by the lack of MEI across groups.
  Why is it a threat?
  • Repeating the measurement with the same observations under the same conditions should lead to the same results on the overall and group levels.
  • Different correlation patterns across groups for a reflective perceived usefulness construct can lead to an average correlation pattern on the overall sample level, which does not show appropriate internal consistency.

Statistical Conclusion Validity
  What is it?
  • Have adequate sampling procedures, appropriate statistical tests, and reliable measurements been used?
  Threats due to unobserved heterogeneity
  • Heterogeneous samples may lead to higher standard errors or lower effect sizes, thereby influencing the power of tests.
  • Biased estimates: Type I and Type II errors.
  Why is it a threat?
  • Path coefficients for relationships between constructs (e.g., ease of use and intention to use) might have higher standard errors on the overall sample than in their underlying groups, indicating a variety of different coefficients across user groups.
  • This also applies to formative measurement weights.

External Validity
  What is it?
  • Are findings generalizable to other populations and conditions?
  Threats due to unobserved heterogeneity
  • Interpretations of the overall sample may be ambiguous and misleading.
  • Results cannot be generalized easily, as they are valid for only a special condition of the model.
  Why is it a threat?
  • Analyzing population differences reveals more general conclusions about the model than those from the overall sample.
  • Example: Based on the overall sample level, usefulness has the same importance as ease of use. However, there are no users who value usefulness and ease of use equally; rather, there are two distinct groups of experienced and inexperienced users.

Finally, if unobserved heterogeneity is not uncovered, there is a threat to external validity (i.e., the ability to generalize findings beyond the current population and context) because the overall sample results are not representative of the underlying groups. As findings are averaged across groups, results obtained using the overall sample cannot be generalized to different groups. The observation of inconsistent, conflicting, or mixed findings in the 12 meta-analyses in Table A1 (Appendix A) also shows that the results of one study often cannot be generalized to other studies (indicating low external validity), with unobserved heterogeneity being one of the plausible reasons.
Because of these threats to the different types of validity, it is important to uncover heterogeneity in data that may otherwise lead to invalid conclusions. Next, we present an overview of methods to uncover unobserved heterogeneity in structural equation models that researchers can apply to overcome threats to validity due to unobserved heterogeneity.
Uncovering Heterogeneity in Structural Equation Models
In this section, we first synthesize and compare different methods in SEM (i.e., CB-SEM and PLS path modeling) to uncover observed and unobserved heterogeneity. Given the objectives of our paper, we focus primarily on methods in SEM to uncover unobserved heterogeneity.³ We also introduce a new method to address some of the limitations of existing methods to uncover unobserved heterogeneity in PLS path models.
Existing Methods to Uncover Observed Heterogeneity in SEM
SEM methods to address observed heterogeneity are now commonly applied in the social and behavioral sciences, including information systems. The first category of methods identifies homogeneous groups of observations (e.g., individuals) a priori based on grouping variables (e.g., psychographic or sociodemographic). A multigroup analysis reveals the heterogeneity between the groups by testing for differences across group-specific parameter estimates. Examples of these methods for PLS path modeling can be found in Chin and Dibbern (2010), Sarstedt et al. (2011b), and Qureshi and Compeau (2009), and for CB-SEM in Jöreskog (1971) and Sörbom (1974). The second category of methods aims at identifying moderating factors that explain heterogeneity in specific structural model relationships. Examples of these methods in PLS path modeling can be found in Chin et al. (2003), Goodhue et al. (2007), and Henseler and Chin (2010), and for CB-SEM in Jaccard and Wan (1995), Jöreskog and Yang (1996), and Klein and Moosbrugger (2000). Uncovering observed heterogeneity with both types of methods requires a priori knowledge about differences across groups. Consequently, these two types of methods do not account for unobserved heterogeneity, that is, differences across groups that are not informed by existing theory and are unknown a priori.
Existing Methods to Uncover Unobserved Heterogeneity in SEM
The next sections present methods in CB-SEM and PLS path modeling to uncover unobserved heterogeneity.
CB-SEM Methods to Uncover Unobserved Heterogeneity
In CB-SEM, the following two primary methods have been developed to uncover unobserved heterogeneity: (1) finite mixture models that extend multigroup CB-SEM (Arminger et al. 1999; Dolan and van der Maas 1998; Jedidi et al. 1997) and (2) hierarchical Bayesian models that extend multilevel CB-SEM (Ansari et al. 2000; Cai and Song 2010; Lee and Song 2003). Table 3 presents a summary of these CB-SEM methods.
³ There are several methods to uncover both observed and unobserved heterogeneity in other methodological contexts, for example, regression analysis (DeSarbo and Cron 1988; Späth 1979; Wedel and DeSarbo 1994), panel data models (Allenby and Rossi 1998; Popkowski Leszczyc and Bass 1998), and conjoint analysis (DeSarbo et al. 1995; Gilbride et al. 2006; Lenk et al. 1996). Given the objectives of our paper and for reasons of scope, we do not review these methods.

Table 3. Overview of CB-SEM Methods to Uncover Unobserved Heterogeneity in SEM
(Columns: Method; Description; Parameter Estimates; Limitations; Illustrative Applications)

Finite Mixture Models for CB-SEM (Jedidi et al. 1997)
  Description
  Generalizes the multigroup SEM for unobserved group-specific differences in the following:
  • Structural parameters (covariance)
  • Factor means
  Parameter estimates
  For a defined number of groups
  Limitations
  • Number of groups is unknown to the researcher
  • Does not account for heterogeneity in the covariance of the measures
  • Requires large number of observations (large sample sizes)
  Illustrative applications
  Bart et al. 2005; DeSarbo et al. 2006; Reinecke 2006; Tueller and Lubke 2010

Hierarchical Bayesian CB-SEM (Ansari et al. 2000)
  Description
  Generalizes the multilevel SEM for unobserved individual-specific differences in the following:
  • The covariance structure (i.e., structural parameters, measurement error variance, and factor covariance)
  • Factor means
  Parameter estimates
  Specific estimates for individuals
  Limitations
  • Needs continuous data with multiple observations per individual
  • Only works for recursive structural equation models
  • Not available in standard software packages
  Illustrative applications
  Luo et al. 2008

Finite mixture models for CB-SEM were developed by Jedidi et al. (1997), Arminger et al. (1999), and Dolan and van der Maas (1998). These models (1) assume that data originate from subpopulations (groups) in the overall population, which is a mixture of them, and (2) generalize multigroup CB-SEM (Jöreskog 1971; Sörbom 1974) to unobserved latent groups, assuming the structural parameters (covariance) and factor means to be mixtures of components. The method used for finite mixture models assigns the observations to a prespecified number of groups by means of fuzzy (probabilistic) clustering, thereby permitting the simultaneous estimation of group-specific parameters (Jedidi et al. 1997). Consequently, finite mixture models address unobserved heterogeneity in the data by grouping observations and estimating group-specific parameters simultaneously, thus avoiding well-known biases that occur when group-specific models are estimated separately (Fraley and Raftery 2002). Several applications and simulation studies (e.g., Arminger et al. 1999; Henson et al. 2007; Jedidi et al. 1997; Tueller and Lubke 2010) illustrate the usefulness of finite mixture models by showing how structural relationships among factors differ across unobserved groups.
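The fuzzy (probabilistic) assignment described above can be illustrated with a minimal sketch. It assumes a deliberately simplified setting that is not the full CB-SEM mixture model: a single observed variable and univariate normal group densities. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_memberships(x, mixing, means, sds):
    """Posterior probability that observation x belongs to each group.

    Hypothetical helper: applies Bayes' rule to the mixing proportions and
    the group-specific normal densities (the E-step of an EM algorithm).
    """
    weighted = [p * normal_pdf(x, m, s) for p, m, s in zip(mixing, means, sds)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Two assumed groups with equal mixing proportions and unit variance.
probs = posterior_memberships(x=0.5, mixing=[0.5, 0.5], means=[0.0, 2.0], sds=[1.0, 1.0])
print([round(p, 3) for p in probs])
```

Each observation receives a membership probability for every group rather than a hard label, which is what allows the group-specific parameters to be estimated simultaneously with the grouping.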
In contrast to finite mixture models, hierarchical Bayesian models for CB-SEM, which were developed by Ansari et al. (2000), do not assume heterogeneity among a defined number of groups of individuals but estimate unobserved heterogeneity at the individual⁴ level using a random coefficients model. Specifically, they uncover unobserved heterogeneity in the factor means and covariance structure (i.e., structural parameters, measurement error variance, and factor covariance), thereby generalizing multilevel SEM models (Muthén 1994; Rabe-Hesketh et al. 2004) that only account for heterogeneity in the mean structure. Hierarchical Bayesian CB-SEM provides individual-specific estimates for the factor scores, structural coefficients, and other model parameters (Ansari et al. 2000). However, this method requires continuous data with multiple observations per individual to estimate individual-level heterogeneity, and the method is limited to recursive structural equation models. There has been some work (e.g., Cai and Song 2010; Lee and Song 2003) to extend the method to dichotomous variables and missing data and to evaluate the performance of these methods.
While both the finite mixture and the hierarchical Bayesian CB-SEM models have been the subject of extensive methodological research, finite mixture models have been applied in empirical CB-SEM research to a greater extent. An increasing number of applications, especially in the marketing, econometrics, and sociology literatures, have utilized finite mixture models to uncover unobserved heterogeneity, thereby improving theoretical and practical implications (e.g., Bart et al. 2005; DeSarbo et al. 2006; Reinecke 2006; Tueller and Lubke 2010).
PLS Path Modeling Methods to Uncover Unobserved Heterogeneity
Although PLS path modeling research has paid limited attention to unobserved heterogeneity in comparison to CB-SEM research, multiple PLS segmentation methods have been proposed. We draw on Sarstedt’s (2008) review of these methods to identify the following key PLS segmentation methods:
1. The PATHMOX (path modeling segmentation tree) algorithm (Sánchez 2009; Sánchez and Aluja 2006).⁵ This algorithm requires the a priori specification of explanatory variables that are not used as indicators in the PLS path model to discover segments. While this feature can be advantageous for interpreting discovered segments, it limits the heterogeneity discovery process to the selected explanatory variables (and their specified order) that are provided as inputs to the PATHMOX algorithm (Sarstedt 2008).
2. Distance measure-based methods. These methods determine the distance of an observation to its current group and all other given groups in order to decide on this observation’s group membership. PLS typological path modeling (PLS-TPM; Squillacciotti 2005; Squillacciotti 2010) and its enhancement, response-based detection of respondent segments in PLS (REBUS-PLS; Esposito Vinzi et al. 2010; Esposito Vinzi et al. 2008), are the key methods in this class.⁶ Both PLS-TPM and REBUS-PLS⁷ can only uncover unobserved heterogeneity in PLS path models with reflective measures (i.e., they cannot be applied to path models that include formative measures) (Esposito Vinzi et al. 2010; Esposito Vinzi et al. 2008).
3. The finite mixture partial least squares method (FIMIX-PLS) (Hahn et al. 2002).⁸ This method assumes that each endogenous latent variable is distributed as a finite mixture of conditional multivariate normal densities. It captures heterogeneity by estimating the probabilities of segment memberships for each observation in order to optimize the likelihood function. Consequently, it implicitly maximizes the segment-specific explained variance (i.e., the R² value), which is part of the likelihood function. While FIMIX-PLS is generally applicable to PLS path models regardless of whether the latent variables are measured reflectively or formatively, it does not account for the heterogeneity in the measurement models. Moreover, the assumption that the endogenous latent variables have a multivariate normal distribution is inconsistent with nonparametric PLS path modeling, which does not impose distributional assumptions.
⁴ An individual can be a person, group, team, or company that is the object of investigation in a study and has provided several observations (e.g., over time or within a group).
⁵ PATHMOX is available in the pathmox package of the statistical software R (Sánchez and Aluja 2012).
We select FIMIX-PLS to benchmark the performance of the new PLS-POS method for two reasons. First, based on an assessment of the benefits and limitations of these methods, Sarstedt (2008, p. 152) concludes: “To sum up, FIMIX-PLS can presently be viewed as the most comprehensive and commonly used approach to capture heterogeneity in PLS path modeling.” Second, as our research objectives include developing/evaluating a method (i.e., PLS-POS) that detects unobserved heterogeneity in both the structural model and formative measures, we conduct simulations with both formative and reflective models. While PLS-TPM and REBUS-PLS are not applicable to PLS path models that include formative measures, FIMIX-PLS is applicable to PLS path models regardless of the use of reflective/formative measurement. We next elaborate briefly on FIMIX-PLS’ assumptions, procedure, and limitations.
FIMIX-PLS follows the assumption that heterogeneity is concentrated in the parameters of the estimated relationships among latent variables (i.e., the path coefficients in the structural model). Based on this concept, FIMIX-PLS assigns observations to a prespecified number of groups by means of probabilistic clustering to optimize the likelihood function (which implicitly maximizes the segment-specific explained variance as part of the likelihood function), thereby simultaneously estimating the model parameters for the groups and ascertaining the heterogeneity of the data for the PLS path model. It adapts a finite mixture regression model that, in contrast to conventional mixture regression models, can be comprised of a multitude of interrelated endogenous latent variables (Hahn et al. 2002).
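As a hedged illustration of the finite mixture regression idea, consider a simplified case with a single endogenous latent variable score η_i, a vector of predictor scores ξ_i, and K groups. The notation below (ρ_k for mixing proportions, γ_k for group-specific path coefficients, ψ_k for group-specific residual variances, P_ik for membership probabilities) is assumed for this sketch and is not taken from Hahn et al. (2002), whose formulation covers several interrelated endogenous latent variables.

```latex
% Simplified one-equation mixture regression (illustrative notation only)
\begin{align}
  f_k(\eta_i \mid \xi_i) &= \frac{1}{\sqrt{2\pi\psi_k}}
      \exp\!\left( -\frac{(\eta_i - \xi_i^{\top}\gamma_k)^2}{2\psi_k} \right), \\
  f(\eta_i \mid \xi_i) &= \sum_{k=1}^{K} \rho_k \, f_k(\eta_i \mid \xi_i),
      \qquad \rho_k > 0,\quad \sum_{k=1}^{K} \rho_k = 1, \\
  \ln L &= \sum_{i=1}^{N} \ln f(\eta_i \mid \xi_i), \\
  P_{ik} &= \frac{\rho_k \, f_k(\eta_i \mid \xi_i)}
                 {\sum_{j=1}^{K} \rho_j \, f_j(\eta_i \mid \xi_i)} .
\end{align}
```

In this simplified form, ψ_k is the residual variance in group k, so a higher likelihood tends to go together with smaller group-specific residual variance. This is, loosely, the sense in which the segment-specific explained variance is maximized implicitly.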
Compared to the finite mixture and hierarchical Bayesian CB-SEM, FIMIX-PLS does not account for group-specific mean differences of latent variables because it is based on the standardized results of an overall sample PLS path model. In addition, FIMIX-PLS builds on the latent variable scores of the PLS path model estimation using the full set of data and thus only focuses on the relationships among latent variables. Consequently, it is generally applicable to PLS path models (regardless of the latent variables being measured reflectively or formatively) but does not account for the heterogeneity in the measurement models (e.g., the factor covariance or the measurement error variance) (Hahn et al. 2002; Sarstedt and Ringle 2010).
⁶ Other distance-based methods, which are in earlier stages of development and currently not available as software packages, include fuzzy PLS path modeling for latent class detection (FPLS-LCD; Palumbo et al. 2008) and partial least squares genetic algorithm segmentation (PLS-GAS; Ringle et al. 2010b; Ringle et al. 2013).
⁷ The REBUS-PLS method is included in the XLSTAT software as well as in the plspm package (Sánchez and Trinchera 2013) of the statistical software R (R Core Team 2013).
⁸ The FIMIX-PLS method is included in the PLS path modeling software SmartPLS (Ringle et al. 2005).
FIMIX-PLS has been applied recently to uncover unobserved heterogeneity in PLS path models for success factors in industrial goods (Sarstedt et al. 2009), intention to adopt new movie distribution services on the Internet (Papies and Clement 2008), the American customer satisfaction index model (Ringle et al. 2010a), and unanticipated reactions to organizational strategy among stakeholder segments (Money et al. 2012). The advantage of applying the parametric finite mixture regression concept to PLS path models is that it offers segment retention criteria (e.g., AIC, BIC, and CAIC; Hahn et al. 2002; Sarstedt et al. 2011a) for model selection (i.e., to decide on an appropriate number of segments). However, FIMIX-PLS has some limitations in that it (1) assumes that the endogenous latent variables in the structural model have a multivariate normal distribution (which is inconsistent with PLS’ distribution-free assumption) and (2) uses latent variable scores in the structural model based on the measurement model for the overall sample and ignores plausible heterogeneity in the measurement model’s weights. Consequently, it not only ignores heterogeneity in the measurement model but may also fail to detect heterogeneity in the structural model that results from unobserved heterogeneity in the measurement model.
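To make the segment retention criteria mentioned above concrete, the following minimal sketch computes AIC, BIC, and CAIC in their standard textbook forms and picks the number of segments with the lowest BIC. The log-likelihood values, parameter counts, and sample size are hypothetical numbers invented for illustration; they do not come from any FIMIX-PLS run.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion."""
    return -2.0 * log_lik + math.log(n_obs) * n_params


def caic(log_lik, n_params, n_obs):
    """Consistent AIC."""
    return -2.0 * log_lik + (math.log(n_obs) + 1.0) * n_params


# Hypothetical results: number of segments -> (log-likelihood, free parameters)
candidates = {1: (-520.0, 5), 2: (-470.0, 11), 3: (-466.0, 17)}
n_obs = 400

for k, (ll, p) in candidates.items():
    print(k, round(aic(ll, p), 2), round(bic(ll, p, n_obs), 2), round(caic(ll, p, n_obs), 2))

# Lower values are better; here the choice is made on BIC only.
best_k = min(candidates, key=lambda k: bic(candidates[k][0], candidates[k][1], n_obs))
print("segments retained by BIC:", best_k)
```

All three criteria penalize the log-likelihood by the number of free parameters, so an additional segment is retained only if it improves the fit by more than its penalty.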
Partial Least Squares–Prediction-Oriented Segmentation (PLS-POS)
To overcome the identified methodological limitations of FIMIX-PLS and of existing distance measure-based PLS segmentation methods for uncovering unobserved heterogeneity, we introduce the PLS prediction-oriented segmentation (PLS-POS) method, which offers three novel and distinctive features: (1) it uses a PLS-specific objective criterion to form homogeneous groups that maximize the explained variance (R²) of all endogenous latent variables in the PLS path model and thereby takes the entire path model’s structure into account;⁹ (2) it includes a new distance measure that is appropriate for formative measures (and heterogeneity within them); and (3) it reassigns observations only if reassigning observations improves the objective criterion. The latter feature of PLS-POS ensures continuous improvement of the objective criterion throughout the iterations of the algorithm (hill-climbing approach) and provides the ability to uncover very small niche segments. However, like the expectation–maximization (EM) algorithm in FIMIX-PLS, PLS-POS can face the problem of ending in local optima due to its use of a hill-climbing approach. Thus, a repeated application of PLS-POS with different starting partitions is advisable.
PLS-POS follows a clustering approach with a deterministic assignment of observations to groups and uses a distance measure for the reassignment of observations; as such, it has no distributional assumptions. The segmentation objective in a PLS path model is to form homogeneous groups of observations with increased predictive power (R² of the endogenous latent variables) of the group-specific path model estimates (compared to the overall sample model). In accordance with Anderberg’s (1973, p. 195) notion of clustering for maximum prediction, a fitting objective criterion for PLS segmentation is to maximize the sum of the endogenous latent variables’ explained variance (R²) across all groups.
A key challenge of this approach is the indeterminacy of the data assignment task, as it is unknown how the group-specific PLS results will change when an observation is reassigned to a different group. For this purpose, the PLS-POS method uses a distance measure to identify appropriate observations for reassignment that serve as candidates to improve the PLS-POS objective criterion. Using a distance measure (i.e., calculating each observation’s distance from its current group and from each of the other groups) for segmentation builds on an idea of earlier work on distance measure-based segmentation in PLS path modeling (i.e., PLS-TPM and its later improvement, REBUS-PLS).
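The interplay of objective criterion, distance-based candidate selection, and improvement-only reassignment can be sketched in a few lines. This is a toy illustration under strong assumptions, not the PLS-POS algorithm itself: a simple least-squares regression of y on a single x stands in for the group-specific PLS path model, and a squared-residual distance stands in for the PLS-POS distance measure (the actual algorithm and distance measure are given in Appendix B). All function names and data are hypothetical.

```python
def fit_line(points):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(points):
    """Explained variance of the simple regression within one group."""
    a, b = fit_line(points)
    my = sum(y for _, y in points) / len(points)
    sst = sum((y - my) ** 2 for _, y in points)
    sse = sum((y - (a + b * x)) ** 2 for x, y in points)
    return 1.0 - sse / sst


def objective(data, labels, n_groups):
    """Sum of group-specific R² values (stand-in for the objective criterion)."""
    return sum(
        r_squared([p for p, g in zip(data, labels) if g == k]) for k in range(n_groups)
    )


def hill_climb(data, labels, n_groups=2, min_size=3):
    """Reassign an observation only if the move improves the objective."""
    labels = list(labels)
    best = objective(data, labels, n_groups)
    improved = True
    while improved:
        improved = False
        models = [
            fit_line([p for p, g in zip(data, labels) if g == k]) for k in range(n_groups)
        ]
        for i, (x, y) in enumerate(data):
            # Stand-in distance: squared residual under each group's model.
            dist = [(y - (a + b * x)) ** 2 for a, b in models]
            target = dist.index(min(dist))
            if target == labels[i] or labels.count(labels[i]) <= min_size:
                continue
            trial = labels[:i] + [target] + labels[i + 1:]
            value = objective(data, trial, n_groups)
            if value > best + 1e-12:
                labels, best, improved = trial, value, True
                break  # re-estimate the group models after each accepted move
    return labels, best


# Toy data: group 0 follows y = x, group 1 follows y = -x; one case starts misassigned.
xs = [-3, -2, -1, 1, 2, 3]
data = [(x, float(x)) for x in xs] + [(x, float(-x)) for x in xs]
start_labels = [0, 0, 0, 1, 0, 0] + [1] * 6
final_labels, final_value = hill_climb(data, start_labels)
print(final_labels, round(final_value, 3))
```

Because a move is accepted only when the objective increases, the objective never decreases across iterations. The same property means the search can stop in a local optimum, which is why rerunning from different starting partitions is advisable.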
Appendix B provides the details of PLS-POS’ algorithm, objective criterion, and distance measure. It also includes a detailed comparison of the technical differences between FIMIX-PLS, PLS-TPM, REBUS-PLS, and PLS-POS (Table B1). We implement the PLS-POS algorithm as an extension of the SmartPLS software (Ringle et al. 2005) to evaluate its performance in our simulation study. The extension will be made available with the next release of SmartPLS.
In summary, the PLS-POS method complies with the most important objectives in PLS path modeling. It (1) improves the objective criterion by nonparametric means, (2) accounts for heterogeneity in the structural model as well as in the formative measurement model, and (3) is applicable to all path models regardless of the type of measurement model, the distribution of the data, or the complexity of the structural model. Table 4 compares the key properties of PLS-POS and FIMIX-PLS, which we use as the benchmark method in this study as depicted in the previous section, in terms of five desired criteria for a PLS segmentation method.
⁹ While PLS-TPM only focuses on a single target construct, REBUS-PLS accounts for this limitation by replacing PLS-TPM’s distance measure with the goodness-of-fit criterion-based (GoF; Tenenhaus et al. 2005) closeness measure. The aim of REBUS-PLS is “to detect sources of heterogeneity in both the structural and the outer model for all exogenous and endogenous latent variables” (Esposito Vinzi et al. 2008, p. 444). As in PLS-TPM, REBUS-PLS requires reflective measurement models (Esposito Vinzi et al. 2008). In contrast, by focusing on the R² of all the endogenous latent variables as an explicit objective criterion, PLS-POS stresses the prediction-oriented character of PLS path modeling and allows the general application of this method to PLS path models with both reflective and formative measurement models.

Table 4. Conceptual Capabilities of FIMIX-PLS and PLS-POS
Desired criteria for a PLS segmentation method (columns 1 to 5):
  1. Ability to detect heterogeneity in reflective measures
  2. Ability to detect heterogeneity in formative measures
  3. Ability to detect heterogeneity in the structural model
  4. Maximizes group-specific R² of endogenous latent variables (prediction orientation)
  5. Ability to handle nonnormal data

Segmentation method             1     2     3     4     5
FIMIX-PLS (Hahn et al. 2002)    –     –     ✓     ✓     –
PLS-POS                         ✓*    ✓     ✓     ✓     ✓

*The method can detect heterogeneity in the reflective model if there is heterogeneity in the structural model (i.e., if heterogeneity in the reflective measurement model is the source of heterogeneity in the structural model).

In the next section, we detail the comprehensive simulation experiments we conducted to evaluate whether the differences in the capabilities of FIMIX-PLS and PLS-POS noted in Table 4 hold empirically. Specifically, we focused our simulations on the criteria in columns 2 through 5 because our goal is to discover heterogeneity in the structural model and in formative measures while assuming measurement invariance in the reflective measures.
Simulations of PLS-POS and FIMIX-PLS Performance
We conducted experiments with simulated data that define the true group-specific PLS parameters a priori. We assessed the performance of PLS-POS and FIMIX-PLS based on the differences between the true parameters and those estimated by each method. Subsequently, we compared the performance of PLS-POS and FIMIX-PLS in recovering the true parameter estimates.
Model Specification
Consistent with most simulation studies on PLS path models (e.g., Chin et al. 2003), we specified a direct effects path model that includes four exogenous latent variables and one endogenous variable. We specified two versions of the path model: model 1 uses reflective measures for the exogenous and endogenous latent variables (Figure 2a), while model 2 uses formative measures for the exogenous latent variables and reflective measures for the endogenous latent variables (Figure 2b). While we limit the results reported in this paper to those obtained from the simulations of a direct effects path model, we also evaluated more complex path models with multiple endogenous variables and mediation paths between the latent variables. Our results were generally stable for these more complex models as well.
We generated the simulated data so each of the two groups has one particularly strong relationship in the structural model, while all other path coefficients are at lower levels of magnitude. For example, for group 1, the structural path p1 has a high true parameter value, while the structural paths p2 to p4 have lower true parameter values. Conversely, for group 2, p4 has a high true parameter value, while the path coefficients p1 to p3 have lower true values. The mean differences in the coefficients for paths p1 to p4 between group 1 and group 2 reflect the heterogeneity in the model (i.e., the differences between the groups). The same principle applies to the measurement weights in the formative measures. We used four formative indicators per construct. For group 1, the measurement weights w1 and w3 have high true values, while weights w2 and w4 have low true values. Conversely, for group 2, w2 and w4 have high true values, and w1 and w3 have low true values. The mean differences between the weights for group 1 and group 2 reflect the amount of heterogeneity in the measurement model.
Figure 2. The Models: (a) Reflective Model; (b) Formative Model
Factor Design of the Simulations
Our selection of experimental factors and their levels was informed by criteria that were shown to influence PLS path modeling or segmentation results in prior simulation studies. Specifically, we manipulated the following factors:
(1) Explained variance (R²) of the endogenous latent variable per group (1.00, .95, .90, .85)¹⁰ (e.g., Reinartz et al. 2009)
(2) Structural model heterogeneity, that is, the group-specific differences in structural model path coefficients (.25, .50, .75, 1.00) (e.g., Andrews and Currim 2003b)
(3) Sample size per group (100, 200, 400) (e.g., Chin et al. 2003)
(4) Data distribution (normal, nonnormal¹¹) (e.g., Reinartz et al. 2009)
(5) Relative segment sizes (equal, unequal¹²) (e.g., Andrews and Currim 2003b)
In addition, we manipulated the following factors related to the measurement model:
(6) Reliability of reflective measures (perfect versus normal; loadings of 1.00 and ~.85) (e.g., Chin et al. 2003)
(7) Measurement model heterogeneity, that is, the group-specific differences in formative measurement weights (.25, .50, .75) (We note that, to the best of our knowledge, this particular factor has not been examined in prior simulation research on PLS path models.)
(8) Multicollinearity between formative indicators (none, level 1, level 2)¹³ (Mason and Perreault 1991)
¹⁰ This manipulation results in R² values of .425 to .5 in the overall sample that combines groups. For example, when the R² value in both groups is .85, the overall sample that combines the two groups has an R² value of .425 because of unobserved heterogeneity.
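A short worked derivation shows how unobserved heterogeneity can halve the explained variance in a pooled sample. It rests on simplifying assumptions that are ours and not the paper’s exact design: two equally sized groups, two standardized and uncorrelated predictors x_1 and x_2 with the same distribution in both groups, and a single nonzero path of size b in each group (on x_1 in group 1 and on x_2 in group 2).

```latex
% Illustrative special case: two equal groups, one nonzero path per group
\begin{align}
  \text{Group 1: } y &= b\,x_1 + e, \qquad
  \text{Group 2: } y = b\,x_2 + e, \qquad
  \operatorname{Var}(e) = 1 - b^2, \\
  R^2_{\text{group}} &= b^2, \\
  \operatorname{Cov}_{\text{pooled}}(y, x_j)
      &= \tfrac{1}{2}\, b + \tfrac{1}{2} \cdot 0 = \tfrac{b}{2},
      \qquad j = 1, 2, \\
  R^2_{\text{pooled}}
      &= \left(\tfrac{b}{2}\right)^2 + \left(\tfrac{b}{2}\right)^2
       = \tfrac{b^2}{2} .
\end{align}
```

Under these assumptions, a group-level R² of .85 corresponds to a pooled R² of .425, and a group-level R² of 1.00 corresponds to a pooled R² of .5, because each pooled coefficient is the average of a strong and a zero group-specific coefficient.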
The number of factors and the number of factor levels systematically increase the complexity of the PLS segmentation task. The full factorial design for the study results in 4² × 3 × 2³ = 384 different combinations for the reflective model (model 1) and 4² × 3³ × 2² = 1,728 different combinations for the formative model (model 2). To ensure stability of the results, all factor combinations include 30 data-generation and segmentation runs for each segmentation method, so in total (384 + 1,728) × 2 × 30 = 126,720 segmentation runs were performed.
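The cell counts above can be checked by enumerating the design. The sketch below lists the factor levels as they appear in the text (with decimal points restored, which is our reading of the extracted numbers); the dictionary keys are hypothetical labels introduced only for this illustration.

```python
from itertools import product

# Factors shared by both models (levels as listed in the text).
common = {
    "r_squared": [1.00, 0.95, 0.90, 0.85],
    "structural_heterogeneity": [0.25, 0.50, 0.75, 1.00],
    "sample_size_per_group": [100, 200, 400],
    "distribution": ["normal", "nonnormal"],
    "segment_sizes": ["equal", "unequal"],
}

# Model 1 (reflective) adds reliability; model 2 (formative) adds
# weight heterogeneity and multicollinearity.
reflective = dict(common, reliability=["perfect", "normal"])
formative = dict(
    common,
    weight_heterogeneity=[0.25, 0.50, 0.75],
    multicollinearity=["none", "level 1", "level 2"],
)


def n_cells(factors):
    """Number of cells in the full factorial design."""
    return sum(1 for _ in product(*factors.values()))


n_reflective = n_cells(reflective)
n_formative = n_cells(formative)
total_runs = (n_reflective + n_formative) * 2 * 30  # 2 methods, 30 runs each
print(n_reflective, n_formative, total_runs)  # 384 1728 126720
```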
11 For the nonnormal data, we use a log-transformation of the normal data to get a skewness of about 2 and a kurtosis of about 5 for the indicators.
12 The unequal condition has one segment with 80% and one with 20% of the total sample size.
13 For a detailed explanation of this factor, see Appendix C.
678 MIS Quarterly Vol. 37 No. 3/September 2013
Becker et al./Discovering Unobserved Heterogeneity in SEM
Data Generation
Simulation studies in PLS path modeling require that data generated for the indicators (manifest variables) match the true values of the model. Previous studies on PLS path modeling (e.g., Chin et al. 2003; Henseler and Chin 2010; Reinartz et al. 2009) first generated data by extracting latent variable scores to match the true relationships in the structural model and then generated data for the indicators by adding measurement errors to match the indicators’ true parameters in the measurement model. This procedure does not allow for generating data for formative indicators, as the direction of causality in formative measures is from the indicators to the construct (in contrast to reflective measures, where the construct causes the indicators). Data for the formative indicators must first be generated to compute the latent variable scores for formative constructs. We address this requirement by generating random variables for the formative indicators such that the generated formative indicators match a prespecified correlation matrix (for modeling multicollinearity in the simulation design), the true values of the formative measurement weights, as well as the true values for the structural model parameters.
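The order of operations described here (indicators first, then the construct score, then the dependent construct) can be illustrated with a minimal Python sketch. It is not the authors’ generator: it assumes only two standardized formative indicators with correlation `rho`, one formative exogenous construct, and one endogenous construct with path coefficient `beta`, so that the population R² equals `beta` squared. All function and variable names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def generate_formative_data(n, rho, weights, beta, seed=0):
    """Generate indicators, then the formative score, then the outcome."""
    rng = random.Random(seed)
    w1, w2 = weights
    # Rescale the weights so that the composite has unit variance.
    scale = math.sqrt(w1 ** 2 + w2 ** 2 + 2 * w1 * w2 * rho)
    w1, w2 = w1 / scale, w2 / scale
    x1s, x2s, xis, etas = [], [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Cholesky factor of the correlation matrix [[1, rho], [rho, 1]].
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        xi = w1 * x1 + w2 * x2  # formative construct score
        # Structural relation with error variance 1 - beta^2.
        eta = beta * xi + math.sqrt(1 - beta ** 2) * rng.gauss(0, 1)
        x1s.append(x1)
        x2s.append(x2)
        xis.append(xi)
        etas.append(eta)
    return x1s, x2s, xis, etas


x1, x2, xi, eta = generate_formative_data(n=20000, rho=0.5, weights=(0.6, 0.4), beta=0.7)
print(round(pearson(x1, x2), 2), round(pearson(xi, eta), 2))
```

With a large sample, the two printed correlations should be close to the prespecified indicator correlation and path coefficient. A group-specific data set would repeat this with different weights or path coefficients per segment.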
Performance Assessment
The objectives of our simulation experiments were to (1) assess PLS-POS and FIMIX-PLS in terms of their respective abilities to recover true group-specific parameters, (2) compare PLS-POS and FIMIX-PLS based on the assessment of their parameter recovery, and (3) identify the relative effects of the design factors on the parameter recovery of PLS-POS and FIMIX-PLS.
We knew the true parameters of each factorial combination (i.e., the R², path coefficients, outer weights, and loadings) a priori based on the parameter settings for the data generation. The smaller the differences between the true values and the segmentation method’s parameter estimates, the better the parameter recovery. As FIMIX-PLS cannot provide segmentation results for the measurement model (because parameters are fixed to those resulting from the overall sample), we assessed each segmentation method by comparing the structural model’s path coefficients from the two segmentation methods with the a priori known values. Consistent with prior studies (e.g., Henseler and Chin 2010; Reinartz et al. 2009), we evaluated parameter recovery using the mean absolute bias (MAB), which is the average of the simple absolute deviations between the true parameter and the parameter estimated by the segmentation method. MAB values close to zero indicate near perfect parameter recovery. To assess PLS-POS and FIMIX-PLS, we compared each method’s MAB with the MAB when the overall sample was analyzed without uncovering unobserved heterogeneity (i.e., without using a segmentation method). Finally, to understand the relative importance of the design factors, we evaluated parameter recovery (i.e., the path coefficient’s MAB) using a mixed-effects ANOVA model with the two segmentation methods (PLS-POS and FIMIX-PLS; within-subjects factor) and the eight design factors (between-subjects factors).
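The MAB definition can be written out in a few lines of Python. The numbers below are hypothetical and only illustrate the comparison described above: two segments whose true path coefficients differ by .50, a pooled estimate that lies between them, and segment-specific estimates that are close to the true values.

```python
def mean_absolute_bias(true_params, estimated_params):
    """Average absolute deviation between true and estimated parameters."""
    if len(true_params) != len(estimated_params) or not true_params:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = [abs(t - e) for t, e in zip(true_params, estimated_params)]
    return sum(deviations) / len(deviations)


# Hypothetical true path coefficients of two equally sized segments.
true_paths = [0.2, 0.7]
# Ignoring heterogeneity yields one pooled estimate for both segments.
pooled_estimates = [0.45, 0.45]
# A segmentation method yields one estimate per recovered segment.
segment_estimates = [0.22, 0.69]

mab_pooled = mean_absolute_bias(true_paths, pooled_estimates)
mab_segmented = mean_absolute_bias(true_paths, segment_estimates)
print(round(mab_pooled, 3), round(mab_segmented, 3))
# prints: 0.25 0.015
```

In this illustration the pooled analysis is biased by half the difference between the segment-specific coefficients, which is why a segmentation method with a small MAB improves on the overall-sample analysis.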
Results of the Simulation Experiments
We discuss the findings for both model 1 (reflective measures) and model 2 (formative measures) below, starting with the results for model 1.
Results for Model 1: Reflective Measures
Table 5 presents the results for the ANOVA with MAB as the dependent variable. Our extensive simulations enabled us to detect even very small effects, indicating high power. For the sake of space and simplicity, Table 5 shows only the direct effects, all two-way interactions with the method factor, and all other interactions having a significant and substantial effect (i.e., explaining more than 2% of the total variance in MAB, implying a partial η² of more than .02 (Reinartz et al. 2009)). The partial η² represents the contribution of each factor or interaction as if it is the only variable, so its effect is not masked by other variables. See Appendix E for the complete results.
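For readers who want to check the reported effect sizes, partial η² can be recovered from an F-value and its degrees of freedom with the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies it to two rows of Table 5, reading the F-values with their decimal points restored (1948.85 for R² and 70.77 for sample size, with 11,136 error degrees of freedom).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


DF_ERROR = 11136  # error degrees of freedom reported in Table 5

eta_r2 = partial_eta_squared(1948.85, 3, DF_ERROR)        # R² main effect
eta_sample = partial_eta_squared(70.77, 2, DF_ERROR)      # sample size main effect
print(round(eta_r2, 3), round(eta_sample, 3))
# prints: 0.344 0.013

# The "substantial" threshold used in the text is a partial eta squared above .02.
print(eta_r2 > 0.02, eta_sample > 0.02)
# prints: True False
```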
The ANOVA results for model 1 show that parameter recovery is unaffected by the measurement model’s reliability. The direct effect and all of the interaction effects of reliability are nonsignificant. As the reliability has neither a between-subjects nor a within-subjects effect, we find no evidence that the accuracy of either segmentation method is affected by the reliability of the measurement model.
The between-subjects effects identify the factors that influenced MAB for both segmentation methods. All of the other direct effects are significant, with two notable findings: (1) sample size (partial η² = .013) and relative segment size (partial η² = .002) have a partial eta-square below .02, so their influence on MAB is not substantial, and (2) R² has the strongest impact on parameter recovery, both as a direct effect and as an interaction effect with structural model heterogeneity. This result is not surprising, as an increasing error in the model distorts group differences. As PLS-POS capitalizes on the model’s predictive power (i.e., the explained variance), the method is better at uncovering heterogeneity when the predictive power is high.
The within-subjects effects identify the differential influence of the design factors on MAB across the segmentation methods. In general, the method has a significant and substantial impact on the parameter recovery for the reflective model. Furthermore, the method’s two interaction effects with structural model heterogeneity and R² are significant and substantial. All other interaction effects with the method are nonsignificant or are not substantial.
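Table 5. Model 1 (Reflective Measures): ANOVA Explaining MAB by Method (PLS-POS/FIMIX-PLS) and Design Factors
| Effects | Source of Variance in MAB | df | F-value | p-value | Partial η² |
|---|---|---|---|---|---|
| Between-Subjects | Intercept | 1 | 14658.62 | .000 | .568 |
| Between-Subjects | Structural Model Heterogeneity | 3 | 1121.71 | .000 | .232 |
| Between-Subjects | R² | 3 | 1948.85 | .000 | .344 |
| Between-Subjects | Sample Size | 2 | 70.77 | .000 | .013 |
| Between-Subjects | Reliability | 1 | 1.88 | .170 | .000 |
| Between-Subjects | Data Distribution | 1 | 497.52 | .000 | .043 |
| Between-Subjects | Relative Segment Size | 1 | 22.62 | .000 | .002 |
| Between-Subjects | Structural Model Heterogeneity × R² | 9 | 178.96 | .000 | .126 |
| Between-Subjects | Error | 11136 | | | |
| Within-Subjects | Method | 1 | 952.31 | .000 | .079 |
| Within-Subjects | Method × Structural Model Heterogeneity | 3 | 217.47 | .000 | .055 |
| Within-Subjects | Method × R² | 3 | 137.14 | .000 | .036 |
| Within-Subjects | Method × Sample Size | 2 | 4.66 | .009 | .001 |
| Within-Subjects | Method × Reliability | 1 | .01 | .974 | .000 |
| Within-Subjects | Method × Data Distribution | 1 | 87.97 | .000 | .008 |
| Within-Subjects | Method × Relative Segment Size | 1 | 104.01 | .000 | .009 |
| Within-Subjects | Error (Method) | 11136 | | | |
Note: df = degrees of freedom.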
Table 6 shows the MAB for each factor level when PLS-POS or FIMIX-PLS is applied to uncover heterogeneity, or when the overall sample was analyzed without the use of a segmentation method to uncover heterogeneity. A detailed examination of the significant interaction effects of the method with the structural model heterogeneity and the R² shows that the MAB for PLS-POS increases more than the MAB for FIMIX-PLS when the structural model heterogeneity or the R² is lower (Figures 3a and 3b). However, using PLS-POS results in a MAB that is still very low compared to the MAB when the overall sample was analyzed without the use of a segmentation method.
Overall, the results reveal that, for model 1 (reflective measures), both methods perform equally well in almost all conditions. FIMIX-PLS is slightly better than PLS-POS when the R² or the structural model heterogeneity is low, and the bias from using either of the two methods (FIMIX-PLS or PLS-POS) is much lower than the bias from analyzing the overall sample without uncovering heterogeneity.
Results for Model 2: Formative Measures
Table 7 presents the results for the ANOVA in model 2 (formative measures) with MAB as the dependent variable. Again, for the sake of space and simplicity, Table 7 presents the direct effects, all two-way interactions with the method, and all other interactions that have significant and substantial effects (partial η² of more than .02). See Appendix F for the complete results.
For the between-subjects effects, all of the direct effects on MAB are significant, but again, the effect of relative segment size (partial η² = .012) on MAB is not substantial. Interestingly, the relative segment size and sample size have a substantial interaction in this model (partial η² = .054). The MAB decreases for increased sample sizes in groups of equal size but stays constant for increased sample sizes in unequal groups.
The MAB for both segmentation methods is influenced by the heterogeneity in the structural model, the heterogeneity in the measurement model, the R² of the model, the sample size, the data distribution, and the multicollinearity. In contrast to the results for model 1 (reflective measures), it is not the R² (partial η² = .204) but the structural model heterogeneity that has the highest impact (partial η² = .313) on parameter recovery for model 2 (formative measures). Measurement model heterogeneity (this factor is only relevant for formative measures) is the third most important factor and explains about 10 percent of the MAB variance (partial η² = .104). Moreover, the interaction effects between the structural model and measurement model heterogeneity, as well as between measurement model heterogeneity and multicollinearity, are significant and substantial but have very little impact compared to the factors discussed earlier.
Figure 3. MAB of Both Segmentation Methods for Model 1 (Reflective Measures): (a) MAB of Both Segmentation Methods for Different Structural Model Heterogeneity; (b) MAB of Both Segmentation Methods for Different R² Values
Table 6. MAB in Model 1 (Reflective Measures) for Each Method
| Design Factor | Level | POS Mean Absolute Bias | FIMIX Mean Absolute Bias | No Segmentation Method Mean Absolute Bias |
|---|---|---|---|---|
| Structural Model Heterogeneity | .25 | .055 | .030 | .125 |
| Structural Model Heterogeneity | .50 | .033 | .016 | .250 |
| Structural Model Heterogeneity | .75 | .019 | .013 | .375 |
| Structural Model Heterogeneity | 1.00 | .012 | .013 | .500 |
| R² | .85 | .054 | .033 | .312 |
| R² | .90 | .038 | .023 | .312 |
| R² | .95 | .025 | .013 | .312 |
| R² | 1.00 | .002 | .003 | .312 |
| Sample Size | 100 | .032 | .021 | .312 |
| Sample Size | 200 | .031 | .018 | .312 |
| Sample Size | 400 | .026 | .015 | .312 |
| Reliability | Perfect | .030 | .018 | .312 |
| Reliability | Normal | .029 | .018 | .312 |
| Data Distribution | Normal | .024 | .015 | .312 |
| Data Distribution | Nonnormal | .036 | .021 | .312 |
| Relative Segment Size | Equal | .027 | .019 | .312 |
| Relative Segment Size | Unequal | .033 | .017 | .312 |
| Overall | | .030 | .018 | .312 |
Table 7. Model 2 (Formative Measures): ANOVA Explaining MAB by Method (PLS-POS/FIMIX-PLS) and Design Factors
| Effects | Source of Variance in MAB | df | F-value | p-value | Partial η² |
|---|---|---|---|---|---|
| Between-Subjects | Intercept | 1 | 142696.80 | .00 | .740 |
| Between-Subjects | Structural Model Heterogeneity | 3 | 7605.33 | .00 | .313 |
| Between-Subjects | Measurement Model Heterogeneity | 2 | 2912.99 | .00 | .104 |
| Between-Subjects | R² | 3 | 4286.31 | .00 | .204 |
| Between-Subjects | Sample Size | 2 | 864.77 | .00 | .033 |
| Between-Subjects | Relative Segment Size | 1 | 629.83 | .00 | .012 |
| Between-Subjects | Data Distribution | 1 | 1465.75 | .00 | .028 |
| Between-Subjects | Multicollinearity | 2 | 848.18 | .00 | .033 |
| Between-Subjects | Structural Model Heterogeneity × Measurement Model Heterogeneity | 6 | 298.09 | .00 | .034 |
| Between-Subjects | Sample Size × Relative Segment Size | 2 | 1426.86 | .00 | .054 |
| Between-Subjects | Measurement Model Heterogeneity × Multicollinearity | 4 | 287.84 | .00 | .022 |
| Between-Subjects | Error | 50112 | | | |
| Within-Subjects | Method | 1 | 3938.52 | .00 | .073 |
| Within-Subjects | Method × Structural Model Het. | 3 | 3987.98 | .00 | .193 |
| Within-Subjects | Method × Measurement Model Het. | 2 | 6771.05 | .00 | .213 |
| Within-Subjects | Method × R² | 3 | 826.32 | .00 | .047 |
| Within-Subjects | Method × Sample Size | 2 | 227.55 | .00 | .009 |
| Within-Subjects | Method × Relative Segment Size | 1 | 171.66 | .00 | .003 |
| Within-Subjects | Method × Data Distribution | 1 | 2.97 | .08 | .000 |
| Within-Subjects | Method × Multicollinearity | 2 | 1739.12 | .00 | .065 |
| Within-Subjects | Method × Structural Model Het. × Measurement Model Het. | 6 | 976.49 | .00 | .105 |
| Within-Subjects | Method × Structural Model Het. × Multicollinearity | 6 | 372.96 | .00 | .043 |
| Within-Subjects | Method × Measurement Model Het. × Multicollinearity | 4 | 257.24 | .00 | .020 |
| Within-Subjects | Error (Method) | 50112 | | | |
Note: df = degrees of freedom.
For the within-subjects effects, the method’s effect on MAB is significant and substantial. The method also significantly and substantially interacts with heterogeneity in both the structural model and the measurement model. Looking at these interaction effects in more detail reveals that PLS-POS performs consistently well across all of the factor levels, while the performance of FIMIX-PLS deteriorates with decreasing structural model heterogeneity or increasing measurement model heterogeneity. Interestingly, the three-way interaction of method with structural and measurement model heterogeneity is also significant and substantial (partial η² = .105) (Figures 4a and 4b). While the MAB for PLS-POS is always below .05, thereby indicating good parameter recovery, the MAB for FIMIX-PLS increases when measurement model heterogeneity becomes higher and structural model heterogeneity becomes lower.
Table 8 shows the MAB for each factor level in model 2 (formative measures) and reveals that the level of structural or measurement model heterogeneity only slightly affects parameter recovery for PLS-POS. In contrast, parameter recovery for FIMIX-PLS decreases with decreasing structural model heterogeneity or increasing measurement model heterogeneity. Thus, FIMIX-PLS is as good as PLS-POS in situations with very high structural model heterogeneity, regardless of the measurement model heterogeneity, and also in situations where the measurement model heterogeneity is low and the structural model heterogeneity is at moderate levels. Therefore, as the results in Figures 4a and 4b reveal, the parameter recovery ability of a segmentation method cannot be assessed independently for these two types of heterogeneity.
Figure 4. MAB of Both Methods for Different Structural and Measurement Model Heterogeneity: (a) PLS-POS; (b) FIMIX-PLS
Table 8. MAB in Model 2 (Formative Measures) for Each Method
| Design Factor | Level | POS Mean Absolute Bias | FIMIX Mean Absolute Bias | No Segmentation Method Mean Absolute Bias |
|---|---|---|---|---|
| Structural Model Heterogeneity | .25 | .038 | .089 | .132 |
| Structural Model Heterogeneity | .50 | .039 | .052 | .250 |
| Structural Model Heterogeneity | .75 | .032 | .031 | .375 |
| Structural Model Heterogeneity | 1.00 | .025 | .016 | .500 |
| Measurement Model Heterogeneity | .25 | .039 | .024 | .312 |
| Measurement Model Heterogeneity | .50 | .033 | .042 | .312 |
| Measurement Model Heterogeneity | .75 | .029 | .074 | .318 |
| R² | .85 | .057 | .056 | .314 |
| R² | .90 | .041 | .050 | .314 |
| R² | .95 | .025 | .043 | .314 |
| R² | 1.00 | .011 | .038 | .314 |
| Sample Size | 100 | .043 | .050 | .314 |
| Sample Size | 200 | .030 | .047 | .314 |
| Sample Size | 400 | .028 | .043 | .314 |
| Data Distribution | Normal | .030 | .043 | .314 |
| Data Distribution | Nonnormal | .037 | .051 | .314 |
| Relative Segment Size | Equal | .029 | .046 | .314 |
| Relative Segment Size | Unequal | .038 | .048 | .314 |
| Multicollinearity | None | .031 | .062 | .314 |
| Multicollinearity | Level 1 | .034 | .041 | .314 |
| Multicollinearity | Level 2 | .036 | .037 | .314 |
| Overall | | .034 | .047 | .314 |
Table 9. Empirical Evaluation Summary of FIMIX-PLS and PLS-POS
Desired Criteria for a PLS Segmentation Method
| Segmentation Method | Ability to detect heterogeneity in reflective measures | Ability to detect heterogeneity in formative measures | Ability to detect heterogeneity in the structural model | Maximizes group-specific R² of endogenous latent variables (prediction orientation) | Ability to handle nonnormal data |
|---|---|---|---|---|---|
| FIMIX-PLS (Hahn et al. 2002) | Not tested | – | T | T | T |
| PLS-POS | Not tested | T | T | T | T |
Note: T indicates support by the simulation experiments; – indicates that the criterion is not associated with the method.
It is worth noting that the interaction effect between method and data distribution is not substantial for either model 1 (reflective measures) or model 2 (formative measures). In addition, data distribution only has a small impact on parameter recovery in both model 1 and model 2 (direct effects of partial η² = .043 and partial η² = .028). Accordingly, we conclude that both methods perform equally well with both normal and nonnormal distributions. This finding is especially interesting, as FIMIX-PLS assumes multivariate normal distributions of the endogenous latent variables, which should theoretically result in unfavorable performance with nonnormal data compared to PLS-POS. However, with several indicators for each construct, the composite latent variable scores might become essentially normal even if the indicators are not. This might explain this initially surprising result.
Summary of Results
Overall, we can conclude that the use of either PLS-POS or FIMIX-PLS is better for reducing biases in parameter estimates and avoiding inferential errors than ignoring unobserved heterogeneity in PLS path models. A notable exception is when there is low structural model heterogeneity and high formative measurement model heterogeneity; in this condition, FIMIX-PLS produces results that are even more biased than those resulting from ignoring heterogeneity and estimating the model at the overall sample level. PLS-POS shows very good performance in uncovering heterogeneity for path models involving formative measures and is significantly better than FIMIX-PLS, which shows unfavorable performance when there is heterogeneity in formative measures.
However, FIMIX-PLS becomes more effective when there is high multicollinearity in the formative measures, while PLS-POS consistently performs well. There are two interrelated reasons for this result: (1) multicollinearity masks heterogeneity in the measurement model, making the measures more similar (i.e., homogenous) across groups, and (2) FIMIX-PLS ignores heterogeneity in the measurement model and, therefore, the multicollinearity problems in formative indicators. The strongly correlated formative measures become closer to a homogenous reflective measurement of the construct. Therefore, the performance of PLS-POS and FIMIX-PLS converges in situations with high multicollinearity. FIMIX-PLS performs marginally better in purely reflective models (model 1), regardless of the distribution being normal or nonnormal. However, the performance differences between FIMIX-PLS and PLS-POS are much smaller in the case of a reflective model than in the case of a formative model. Therefore, PLS-POS is more generally applicable than FIMIX-PLS to discover heterogeneity in PLS path models.
Thus, the simulation experiments provide an empirical assessment of the segmentation criteria associated with PLS-POS and FIMIX-PLS (Table 9). All criteria associated with each of these methods are supported by our findings, with the exception that FIMIX-PLS does not degrade in performance with nonnormal data.
A Process for Unobserved Heterogeneity Discovery
Given the availability of methods to uncover unobserved heterogeneity, as discussed in the two previous sections, researchers working with SEM face the following two major questions: when to investigate unobserved heterogeneity, and how to apply methods for uncovering unobserved heterogeneity and defining segments. We address these questions by proposing a UHD process (Figure 5) and also by identifying how this process can be applied given the research objective (i.e., purely testing a model or testing and elaborating a model; Colquitt and Zapata-Phelan 2007).
How to Apply the UHD Process
When selecting an appropriate UHD method, researchers have to determine whether they are interested in evaluating unobserved heterogeneity associated with latent segments or individual-level estimates (e.g., hierarchical Bayesian approach, fixed effects, and random effects). As our focus is on the discovery of latent segments, we propose a UHD process for defining the segments in this context. In contrast, if the objective is to examine unobserved heterogeneity for individual-level estimates, the described UHD process does not apply because the methods have different assumptions and objectives and require different data (i.e., several observations per individual). The UHD process for the discovery of latent segments consists of the following three stages:
1. Selecting an appropriate UHD method
2. Applying the segmentation method to define the segments
   a. Using heuristics to narrow the range of statistically well-fitting segments
   b. Separating relevant from irrelevant segments (Are the segments substantial?)
   c. Testing the significance of the differences between segments (Are the segments differentiable?)
   d. Characterizing segments using constructs in the model/theory (Are the segments plausible?)
   e. Turning unobserved heterogeneity into observed heterogeneity (Are the segments accessible?)
3. Validating the segmentation results
Selecting an Appropriate UHD Method (Stage 1 of the UHD Process)
As discussed earlier, the methodological options for analyzing unobserved heterogeneity involving CB-SEM cover two conceptually different approaches (i.e., latent segment analysis and individual-level estimate correction). For latent segment analysis, the appropriate UHD choice is the finite mixture model, as no model-based clustering alternative is available.
For analyses involving PLS path modeling, there are no methods available that address unobserved heterogeneity associated with individual-level estimates. Latent segments in PLS path modeling can be uncovered using one of the two methods we present in this paper (i.e., FIMIX-PLS and PLS-POS). Our simulation results show that FIMIX-PLS is restricted to uncovering unobserved heterogeneity in the structural model, while PLS-POS can uncover unobserved heterogeneity in both the measurement and structural models. Therefore, researchers should choose FIMIX-PLS if their models include only reflective measures and heterogeneity is expected to affect only the structural model and not the measurement model. In contrast, PLS-POS should be applied for discovering unobserved heterogeneity when PLS path models include formative measures and heterogeneity can affect both the structural and measurement models.
Applying the UHD Method to Define Segments (Stage 2 of the UHD Process)
After choosing the appropriate method for uncovering unobserved heterogeneity, the researcher has to apply the method to evaluate whether significant unobserved heterogeneity is present in the model and to define the number of segments to retain from the data. Determining the correct number of segments is important, as under- or oversegmentation leads to biased results and misinterpretations. The second stage of the UHD process focuses on (1) defining, with heuristics, a range of statistically well-fitting segments and (2) evaluating the segments based on theoretical considerations. The steps in this stage emphasize that researchers (1) evaluate the plausibility of segments by connecting the segmentation solution to theory and (2) avoid capitalizing on data idiosyncrasies to improve the explained variance or significance of parameters.
Stage 2, Step 1: Narrow the range of statistically well-fitting segments. To determine the best fitting number of segments, the researcher has to apply the selected segmentation method for a consecutive number of segments (e.g., 1 to 10) and assess the method-specific heuristics to generate information on the number of segments that result in good model fit. Researchers have to rely on heuristics to determine a well-fitting number of segments, as there is no exact statistical test to accomplish this task (McLachlan and Peel 2000). In mixture models, these heuristics include model-selection criteria that are well known from the model-selection literature (e.g., AIC, BIC, and CAIC) and can also be used to approximate the best fitting number of segments (Andrews and Currim 2003a; Sarstedt et al. 2011a).
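To make the use of these criteria concrete, the following Python sketch computes AIC, BIC, and CAIC from their standard definitions for a sequence of segment solutions and reports which number of segments each criterion favors. The log-likelihoods, parameter counts, and sample size are hypothetical values chosen for illustration; in practice they would come from the fitted mixture models.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """AIC, BIC, and CAIC from a model's log-likelihood (smaller is better)."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "CAIC": deviance + n_params * (math.log(n_obs) + 1),
    }


# Hypothetical results: number of segments -> (log-likelihood, free parameters).
fits = {1: (-1250.0, 5), 2: (-1180.0, 11), 3: (-1175.0, 17)}
N_OBS = 400  # hypothetical sample size

criteria = {k: information_criteria(ll, p, N_OBS) for k, (ll, p) in fits.items()}
best = {
    name: min(criteria, key=lambda k: criteria[k][name])
    for name in ("AIC", "BIC", "CAIC")
}
print(best)
# prints: {'AIC': 2, 'BIC': 2, 'CAIC': 2}
```

Agreement among the criteria narrows the candidate range but does not settle the number of segments; the retained solutions still have to pass the theoretical assessment in the later steps.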
In contrast, model-based clustering methods such as PLS-POS are not based on the mixture model concept and do not provide model-selection criteria. These methods require other model-specific heuristics to compare the results across different numbers of groups, for example, in terms of their average explained variance (R²) or the increase in predictive relevance (Q²). However, researchers should not rely purely on heuristics (e.g., model-selection criteria in finite-mixture modeling or the explained variance per segment in PLS-POS) to retain the best fitting number of segments, because past studies have shown heuristics to have a low probability of finding the true number of segments. There is some empirical evidence that the best information criteria in mixture models only have about a 60 percent chance of identifying the true number of segments (Andrews and Currim 2003a, 2003b; Sarstedt et al. 2011a). Consequently, relying on heuristics can lead to strongly data-driven outcomes if the researcher fits the number of segments to the data without considering the theoretical or practical meaning of the segments. Therefore, these heuristics should only be used to narrow the range of segments for further theoretical assessment.
Figure 5. Unobserved Heterogeneity Discovery (UHD) Process
Regardless of whether mixture models or model-based clustering is used, if multiple heuristics clearly point to a one-segment solution, the researcher might conclude that the threat to validity from unobserved heterogeneity is low and the overall sample represents a homogenous population. This will occur when (1) the average variance explained in PLS path models for the multisegment solution is substantially lower than the overall sample and (2) the model-selection criteria in the mixture models collectively indicate a one-segment solution as showing the best fit and a large deterioration in fit for the best multisegment solution.
Stage 2, Step 2: Are the segments substantial? The next step after defining a range of well-fitting segments is to separate relevant from irrelevant segments. Often, segmentation methods produce very small but well-fitting segments that are likely to represent data idiosyncrasies (e.g., outliers and bad respondents). The problem with these very small segments is that they may (1) be irrelevant for theory or practice (e.g., outliers), (2) represent statistical artifacts or data collection problems (e.g., bad respondents), (3) yield unreliable parameter estimates because of the small sample size, and (4) not be usable in the next step of the UHD process (i.e., multigroup difference testing). Therefore, each segment has to be large enough to represent a real segment; however, one also needs to be cautious when contrasting niche and irrelevant segments. Each segment should therefore be carefully assessed as to whether it represents a substantial segment. A guideline for this analysis might be to take the average expected segment size to evaluate a segment’s relevance (i.e., five segments would suggest an average expected segment size of 20%). If the segment size is considerably lower in proportion (e.g., a 2% segment size), it is a candidate for exclusion as an irrelevant segment. In addition, the total segment size should meet the minimum standards for reliable parameter estimates for the given SEM estimation method (i.e., CB-SEM and PLS path modeling). The researcher will need to determine if the segment may be a niche segment that is substantial and needs to be evaluated further in the next steps of the UHD process.
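The size guideline can be expressed as a simple screening rule. The Python sketch below flags segments whose share of the sample falls well below the average expected share of 1/K; the cutoff `ratio_threshold` (here one quarter of the expected share) is a hypothetical choice, since the text only says "considerably lower", and the segment sizes are made up for illustration.

```python
def flag_small_segments(segment_sizes, ratio_threshold=0.25):
    """Return indices of segments far below the average expected share.

    ratio_threshold is an assumed cutoff: a segment is flagged when its
    share is less than ratio_threshold times the expected share 1/K.
    """
    total = sum(segment_sizes)
    expected_share = 1.0 / len(segment_sizes)
    flagged = []
    for index, size in enumerate(segment_sizes):
        share = size / total
        if share < ratio_threshold * expected_share:
            flagged.append(index)
    return flagged


# Hypothetical five-segment solution for n = 400 (expected share 20%).
sizes = [120, 110, 95, 67, 8]
candidates = flag_small_segments(sizes)
print(candidates)
# prints: [4]
```

A flagged segment (here the one holding 2% of the sample) is only a candidate for exclusion; whether it is an irrelevant segment or a substantial niche segment remains a judgment for the researcher.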
Stage 2, Step 3: Are the segments differentiable? To determine whether heterogeneity significantly affects the results, the substantial segments from the previous step need to be tested to determine the significance of group differences, assessing if a given segment is differentiable from others. Therefore, researchers should perform multigroup structural equation modeling or multigroup PLS analysis and assess (1) the measurement invariance/equivalence and (2) the significance of differences in path coefficients between segments. If a segment is not significantly different from other segments, researchers should consider either combining the segment meaningfully with other segments that are not significantly different from it or reducing the number of segments in the segmentation method. A reason for nonsignificant segment differences might be that the prespecified number of segments for extraction in the segmentation method has caused overfitting of the data. If no significant differences are detected among any of the segments, researchers should conclude they have a homogenous population and low validity threats due to unobserved heterogeneity.
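The logic of testing whether a path coefficient differs between two segments can be illustrated with a permutation test. The Python sketch below is a deliberately simplified stand-in for a multigroup analysis: it uses a single predictor and an ordinary least squares slope in place of a full path model, and it simulates two hypothetical segments with slopes of .8 and .2. The function names and all data are assumptions made for the example.

```python
import random


def slope(points):
    """Ordinary least squares slope of y on x for (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx


def permutation_test(seg_a, seg_b, n_perm=1000, seed=1):
    """Permutation p-value for the absolute difference in slopes."""
    rng = random.Random(seed)
    observed = abs(slope(seg_a) - slope(seg_b))
    pooled = list(seg_a) + list(seg_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign observations to segments
        diff = abs(slope(pooled[:len(seg_a)]) - slope(pooled[len(seg_a):]))
        if diff >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def simulate_segment(n, beta, rng):
    """Hypothetical segment data: y = beta * x + noise."""
    points = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        points.append((x, beta * x + rng.gauss(0, 0.5)))
    return points


data_rng = random.Random(42)
segment_a = simulate_segment(100, 0.8, data_rng)
segment_b = simulate_segment(100, 0.2, data_rng)
observed_diff, p_value = permutation_test(segment_a, segment_b)
print(round(observed_diff, 2), p_value < 0.05)
```

A small p-value indicates that the two segments are differentiable on this path; a large one would suggest merging the segments or reducing the number of segments, as described above.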
Stage 2, Step 4: Are the segments plausible? Given a set of differentiable segments, the next step is to evaluate whether the segments are plausible. This plausibility assessment is to be conducted by characterizing the segments with the constructs in the model/theory. Each segment’s theoretical plausibility should be assessed by considering (1) the segment-specific characteristics based on constructs in the model/theory, (2) the conceptual differences between the segment and other segments, and (3) the segment’s theoretical or managerial relevance. If it is plausible within the specific research domain that segments can change the explanatory role of the constructs (e.g., certain types of IS users emphasize different IS characteristics, which changes the role of the constructs in predicting usage), researchers should include user type segments in their theoretical implications to avoid the premature invalidation or overgeneralization of theoretical claims based on results from the overall sample. If a segment is not theoretically plausible, it should also be considered a limitation of the theory. One possible reason for an implausible segment could be that it was mistaken as substantial when it actually represented outliers. The anomaly of differentiable segments that cannot be explained should be resolved in future research by (1) complementary theoretical elaboration and/or (2) empirical reevaluation. However, because unobserved heterogeneity can threaten the validity of conclusions based on the overall sample due to significant segment differences, differentiable segments that are not plausible should not be part of a combined sample used to test the model/hypotheses.
Stage 2, Step 5: Are the segments accessible? The last step in applying the segmentation methods is to turn unobserved heterogeneity into observed heterogeneity by making the segments accessible. Researchers can further elaborate on the theoretical meaning of the plausible segments by identifying additional variables (e.g., demographic, psychographic, contextual) beyond the original model that (1) help distinguish the segments by explaining the differences between retained segments and (2) determine to which segment responses belong. Statistical techniques to support this step include (1) discriminant analysis, (2) exhaustive CHAID, and (3) contingency tables, where potential variables are tested for their ability to explain segment differences. However, instead of applying an ad hoc approach, complementary theoretical considerations should guide the process of identifying external variables. It should not be a process in which the best discriminating leftover variable in the dataset (that is not part of the model) is used to explain segment differences. If it is not possible to identify theoretically reasonable variables within the given dataset/study that have sufficient explanatory power to differentiate between segments, suggestions for additional variables based on complementary theoretical perspectives should guide future research.
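As an illustration of the contingency-table technique, the Python sketch below computes the Pearson chi-square statistic for a cross-tabulation of segment membership against a candidate explanatory variable. The counts and the variable (a two-category user type) are hypothetical; the sketch shows only the mechanics, and the choice of which variable to test should still follow from theory.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows are segments, columns are two user types.
segment_by_user_type = [
    [40, 10],  # segment 1
    [15, 35],  # segment 2
]
chi2, dof = chi_square_statistic(segment_by_user_type)
print(round(chi2, 2), dof)
# prints: 25.25 1

# The 5% critical value of the chi-square distribution with 1 df is 3.841.
print(chi2 > 3.841)
# prints: True
```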
Validating the Segmentation Results (Stage 3 of the UHD Process)
In the final stage of the UHD process, researchers should validate the segmentation results, including the number of segments, with external data not used in the estimation process. Researchers may (1) apply holdout sample validation techniques using data that are already available (Andrews et al. 2010; Bapna et al. 2011), (2) use cross-validation random splits to compare the stability of segmentation results (Jedidi et al. 1997), or (3) collect additional data (e.g., in a follow-up study) to evaluate the results and find new explanatory variables that match segments better to explain heterogeneity (i.e., make them accessible). Furthermore, repeating the segmentation study on a different population (i.e., sample) and testing the proposed explanatory variables (i.e., moderators or grouping variables) in follow-up studies increases the generalizability of the results.
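One way to quantify the stability of segmentation results across repeated runs or random splits is to compare the segment assignments of the same observations. The Python sketch below uses the Rand index for this purpose; this measure is our illustrative choice and is not prescribed by the sources cited above. It is insensitive to label switching, which matters because segment numbers are arbitrary across runs. The assignment vectors are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of observation pairs on which two partitions agree."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two label sequences of equal length >= 2")
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agreements += same_a == same_b
        pairs += 1
    return agreements / pairs


# Hypothetical segment assignments of six observations from three runs.
run_1 = [0, 0, 1, 1, 0, 1]
run_2 = [1, 1, 0, 0, 1, 0]  # same partition, labels switched
run_3 = [0, 0, 1, 1, 1, 1]  # one observation assigned differently

print(rand_index(run_1, run_2))
# prints: 1.0
print(round(rand_index(run_1, run_3), 3))
# prints: 0.667
```

Values close to 1 across splits suggest a stable segmentation; markedly lower values suggest that the solution depends on the particular subsample.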
When to Apply Methods to Uncover Unobserved Heterogeneity
Given a model that is grounded in substantive theory, the complexity of the social and behavioral phenomena examined in IS research makes it plausible there will be heterogeneity in any sample that is used to test and refine the model. Accordingly, we recommend that all empirical IS research should consider the discovery of unobserved heterogeneity following the UHD process, just as we evaluate reliability and validity. However, researchers should (1) only use segmentation methods when substantive theory supports the model and (2) avoid using segmentation methods in models that are not well grounded in theory to merely improve the explained variance or the significance of parameters. As Jedidi et al. (1997, p. 57) observe, one practice that should be avoided is that of fitting a … model which is not well grounded in substantive theory and simply adding segments until a reasonable fit is found. This rule applies to both CB-SEM and PLS path modeling, regardless of the unobserved heterogeneity discovery method that is to be used.
For models grounded in substantive theory, the objectives for
discovering unobserved heterogeneity can differ depending on
the study’s research objectives. If the research objective is
theory testing (i.e., testers; Colquitt and Zapata-Phelan 2007),
uncovering unobserved heterogeneity serves as a validity
check to safeguard against biases and the false rejection or
false confirmation of theoretical claims. When the theory
tester uncovers unobserved heterogeneity in the sample (i.e.,
significant segment differences are detected and the segments
are determined to be theoretically plausible), he/she has
evidence of a theoretical breakdown given the segments. As
such, the discovery of unobserved heterogeneity safeguards
against (1) premature invalidation of theoretical claims (i.e.,
the results based on the overall sample suggest certain relationships
are nonsignificant, but the significance of these
relationships is actually masked by the heterogeneity) and
(2) premature overgeneralization of theoretical claims (i.e.,
the model/theory holds in some segments and not in others,
thus requiring qualifiers for support found for the theory in
different segments). Hence, theory testers apply the UHD
process to evaluate validity threats due to unobserved heterogeneity.
If significant differences across plausible segments
are detected, researchers should revise the boundary conditions
for the theory (i.e., specify within which plausible
segments the theory was supported and in which it was not).
If unobserved heterogeneity is not uncovered in the sample
(i.e., no significant differences across segments are detected;
segments are not differentiable), the researcher can continue
with the standard analysis on the overall sample, (in)validate
theoretical claims, and note that the validity of the findings is
not threatened by unobserved heterogeneity.
If the research objective is theory testing and elaboration (i.e.,
expanders; Colquitt and Zapata-Phelan 2007), uncovering
unobserved heterogeneity not only serves as a validity check
but can also guide researchers to identify variables explaining
the uncovered segments and to integrate these variables to
expand the model/theory. Hence, researchers should turn
unobserved heterogeneity into observed heterogeneity by
(1) advancing theoretical reasons to explain the differences
between segments, (2) identifying constructs beyond the
original model that explain these differences, thereby making
the segments accessible, and (3) expanding the model/theory
by integrating the constructs that make the segments accessible.
Accordingly, the accessibility stage in the UHD process
will be facilitated when researchers anticipate this task
during the research design, identify complementary theoretical
perspectives and corresponding constructs, and collect
additional data for these constructs that can be instrumental in
making the segments accessible. Of course, these considerations
require extra effort and data-collection costs and should
be accommodated in a study when the researcher expects
unobserved heterogeneity (e.g., based on inconsistent results
in past studies, meta-analysis, the nature of phenomena, etc.).
We note that the discovery of unobserved heterogeneity for
theoretical tests and elaboration is relevant even when existing
theory offers a priori knowledge about observed heterogeneity
(e.g., age, gender, or income). There can be additional
explainable and generalizable heterogeneity beyond the
known heterogeneity (e.g., experienced versus inexperienced
users) that threatens the theoretical validity of the test and,
when discovered, can be used to elaborate theory/models.
As an illustration, assume that the research objective is to test
the baseline technology acceptance model presented in the
introduction. Based on the analysis of the overall sample, the
researcher risks overgeneralizing by concluding that the effects
of PU and PEOU are always important for IU. To avert this risk, the
researcher applies the UHD process and discovers two
substantial and differentiable segments. One segment shows
a strong positive relationship between PU and IU and a weak
or nonsignificant relationship between PEOU and IU. In
contrast, the other segment shows a strong positive relationship
between PEOU and IU and a weak or nonsignificant
relationship between PU and IU (Figure 1a). The researcher
concludes that these two identified segments (i.e., users
emphasizing PU or PEOU) are theoretically plausible (i.e.,
within TAM, it is reasonable that there are different users who
emphasize different system characteristics) and conceptually
important for the theory. In contrast to the results derived
from the overall sample, only one of the posited TAM constructs
influences IU in each segment. As such, the researcher
(1) does not overgeneralize the theory by assuming that it will
always be applicable, (2) acknowledges there are user
segments that determine which construct is influential for IU,
and (3) specifies the need to make the segments accessible,
thereby expanding the TAM model.
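The masking effect in this illustration can be reproduced with a small simulation. The sketch below is a deliberate simplification under stated assumptions: it uses ordinary least squares on simulated observed scores rather than PLS path modeling on latent variables, and the segment sizes, path coefficients (0.8 and 0.0), and error variance are hypothetical values chosen only to mirror the two TAM segments described above.

```python
import random


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (intercept handled by centering)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simulate_segment(n, beta_pu, beta_peou, rng):
    """Simulate PU, PEOU, and IU scores for one segment (hypothetical values)."""
    pu = [rng.gauss(0, 1) for _ in range(n)]
    peou = [rng.gauss(0, 1) for _ in range(n)]
    iu = [beta_pu * a + beta_peou * b + rng.gauss(0, 0.5)
          for a, b in zip(pu, peou)]
    return pu, peou, iu


rng = random.Random(42)
seg_a = simulate_segment(1000, 0.8, 0.0, rng)  # users who emphasize PU
seg_b = simulate_segment(1000, 0.0, 0.8, rng)  # users who emphasize PEOU
pooled = tuple(a + b for a, b in zip(seg_a, seg_b))  # concatenate per variable

slopes_a = ols_two_predictors(*seg_a)
slopes_b = ols_two_predictors(*seg_b)
slopes_pooled = ols_two_predictors(*pooled)

for name, (b_pu, b_peou) in [("segment A", slopes_a),
                             ("segment B", slopes_b),
                             ("pooled", slopes_pooled)]:
    print(f"{name:9s}  PU->IU = {b_pu:5.2f}   PEOU->IU = {b_peou:5.2f}")
```

In this setup the pooled estimates settle near the average of the two segment-specific effects, so the overall sample suggests that both constructs matter moderately for everyone, a pattern that holds in neither segment.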
Given the study’s objective (i.e., theory testing) and the
limited availability of additional data (e.g., a lack of demographic
or psychographic variables such as experience),
researchers might end the UHD process after concluding the
segments are plausible (i.e., that it is plausible that the
segments change the explanatory role of the constructs) without
explaining which users belong to which segment (i.e.,
without making the segments accessible).
Instead, if the research objective is theory testing and elaboration,
researchers should continue to find complementary
theoretical explanations to make the segments accessible (i.e.,
to give additional theoretical meaning to the segments). A
complementary theory could explain that users’ experience
influences their appreciation of system characteristics (e.g.,
PEOU and PU). Experience, therefore, could be an external
variable/construct that, if available in the dataset, could be
tested for explaining the segment membership. Other plausible
theoretical considerations could suggest other variables/constructs
that might explain segment membership and should
be evaluated (e.g., age, income, computer anxiety, task type,
subjective norms, etc.). If researchers are able to identify a
variable/construct that explains the segment membership (i.e.,
makes segments accessible), the unobserved heterogeneity is
turned into observed heterogeneity, thereby expanding the
theory with new constructs accounting for the group differences
(e.g., a moderator). If researchers are unable to assess
the ability of variables/constructs to explain segment membership
because of lack of data in the study, they can only theoretically
identify reasonable variables/constructs for future
testing.
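As a rough illustration of testing whether an external variable explains segment membership, the sketch below cross-tabulates segment membership against a binary experience indicator and computes a Pearson chi-square statistic. The test choice, the counts, and the function name `chi_square_independence` are assumptions made for this example; other approaches (e.g., logistic regression or multigroup analysis) may suit a given study better.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows are the two uncovered segments,
# columns are inexperienced / experienced users.
counts = [
    [70, 30],  # segment emphasizing PEOU
    [25, 75],  # segment emphasizing PU
]

stat, df = chi_square_independence(counts)
CRITICAL_VALUE_05_DF1 = 3.841  # chi-square critical value, alpha = .05, df = 1
explains_membership = stat > CRITICAL_VALUE_05_DF1

print(f"chi-square = {stat:.2f}, df = {df}, related to membership: {explains_membership}")
```

A statistic above the critical value suggests that experience and segment membership are related in these made-up data, which would make experience a candidate moderator for the expanded model.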
Limitations and Future Research
In this study, we (1) discussed why unobserved heterogeneity
is an important issue in IS research, (2) identified threats to
validity due to unobserved heterogeneity, (3) synthesized
current work on unobserved heterogeneity in CB-SEM and
PLS path modeling, (4) introduced a new segmentation
method (PLS-POS) for PLS path modeling, (5) assessed its
performance and that of FIMIX-PLS, and (6) provided
guidelines for researchers on when and how to uncover unobserved
heterogeneity. While our study makes contributions,
it has its limitations and opens up avenues for future research.
First, the validity and generalizability of simulation studies
are limited by the choice of design factors and factor levels.
We focused on eight factors based on past studies on PLS
path modeling or segmentation. The analysis of all factor-level
combinations of the two PLS path models entailed
126,720 simulated segmentation runs for assessing the performance
of PLS-POS and FIMIX-PLS. The inclusion of
additional design factors (namely, those that are theoretically
less important for PLS segmentation) or additional factor
levels would have increased the complexity of the simulations
exponentially and is beyond the scope of a single study.
Therefore, researchers should also apply PLS-POS and
FIMIX-PLS in a broad range of empirical studies to find
additional evidence of the methods’ abilities to detect
unobserved heterogeneity.
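The growth of a full factorial simulation design can be made concrete with a short calculation. The sketch below uses a purely hypothetical design: the factor names echo data conditions mentioned in this essay, but the levels are invented for illustration and are not the design behind the 126,720 runs reported above.

```python
from itertools import product
from math import prod

# Hypothetical design factors and levels (NOT the design used in this study).
design = {
    "sample_size": [100, 400],
    "relative_segment_size": ["balanced", "unbalanced"],
    "data_distribution": ["normal", "non-normal"],
}


def count_cells(factors):
    """Number of factor-level combinations in a full factorial design."""
    return prod(len(levels) for levels in factors.values())


cells = list(product(*design.values()))  # every combination, one tuple per cell
print(len(cells), "cells with", len(design), "factors")

# Adding one more factor multiplies the number of cells by its number of levels.
extended = dict(design, multicollinearity=["low", "medium", "high"])
print(count_cells(extended), "cells with", len(extended), "factors")
```

Because every cell is also replicated many times and analyzed with each method, each added factor multiplies the total number of segmentation runs, which is why the set of design factors has to be restricted.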
Second, heterogeneity is a special type of endogeneity problem
(i.e., omitted group variables). Future studies may want
to evaluate the impact of other types of endogeneity problems
(e.g., reciprocal relationships) on PLS path modeling results.
As PLS path modeling cannot handle nonrecursive models,
these issues might also threaten the consistency of parameters.
In addition, researchers may want to assess the effect of
unobserved heterogeneity in models that do not comply with
the recursive nature of models imposed by PLS path models.
If heterogeneity affects nonrecursive (reciprocal) relationships,
it might have a strong impact on the ability of both PLS
segmentation methods (FIMIX-PLS and PLS-POS) to
uncover unobserved heterogeneity.
Third, this research does not focus on the parameter settings
of the methods or the time needed to arrive at the final segmentation
solution. Our simulations suggest that PLS-POS is
more time consuming than FIMIX-PLS.14 Determining
efficient parameter settings to reduce the computational effort
of PLS-POS represents another avenue for future research.
14 In absolute terms, PLS-POS works within acceptable timeframes. Applying
both methods to the ECSI mobile phone dataset from Tenenhaus et al. (2005)
with two segments, the FIMIX-PLS algorithm needs approximately 10
seconds, while PLS-POS requires about 3 minutes to arrive at a solution.
(We used a Windows 7 PC with an Intel Core 2 T7300 2GHz and 2GB
RAM.) We believe this should be acceptable to researchers in an advanced
stage of model investigation.
Conclusion
We differentiated between observed and unobserved heterogeneity
and showed why unobserved heterogeneity biases
structural equation model estimates, leads to Type I and
Type II errors, and is a threat to different types of validity
(i.e., internal, instrumental, statistical conclusion, and
external). We demonstrated that heterogeneity is present in
empirical IS research across various IS phenomena by
presenting evidence from 12 meta-analyses showing that
inconsistent findings are prevalent across IS studies, with
unobserved heterogeneity being a plausible cause for these
inconsistencies. We explained how researchers can avoid
threats to validity due to unobserved heterogeneity in structural
equation modeling by using different methods that have
been proposed in the literature to uncover unobserved
heterogeneity. The application of these methods not only
safeguards against biases and validity threats, but also
facilitates theory development by promoting abduction (Van
de Ven 2007). Specifically, uncovering unobserved heterogeneity
and explaining segments with new constructs beyond
those in the model allows researchers to develop additional
theoretical descriptions that make segments accessible.
Thereby, they can expand and further develop existing theory.
We introduced a new segmentation method for PLS path
modeling, PLS-POS, that overcomes some of the restrictive
assumptions associated with FIMIX-PLS and other distance
measure-based methods, and we evaluated the ability of the
FIMIX-PLS and PLS-POS methods to uncover unobserved
heterogeneity in PLS path models. Our findings show that
both FIMIX-PLS and PLS-POS alleviate threats to validity
from unobserved heterogeneity by providing considerably less
biased parameter estimates than those that are based on
invalid assumptions of homogenous data. However, FIMIX-PLS
is restricted to uncovering unobserved heterogeneity in
the structural model, while PLS-POS can uncover unobserved
heterogeneity in both the measurement and structural models.
Our results show that the parameter recovery of PLS-POS and
FIMIX-PLS is comparable for those PLS path models in
which all measures are reflective (with measurement invariance
across groups) and heterogeneity is limited to the
structural model. PLS-POS performs very well in uncovering
heterogeneity across all types of PLS path models with
different locations of heterogeneity in the model (structural
model, measurement model, or both) and different data
conditions (sample size, relative segment sizes, multicollinearity,
and data distribution).
Our findings also reveal that unobserved heterogeneity in
formative measures and in the structural model should be
evaluated collectively. As FIMIX-PLS does not uncover
heterogeneity in measurement models, PLS-POS should be
applied for discovering unobserved heterogeneity if PLS path
models include formative measures. This finding is particularly
important because formative measurement models are
often used in IS research. A comprehensive analysis of the
application of PLS path models in MIS Quarterly over the last
20 years indicates that about 42 percent of the models use
only reflective measures, about 32 percent of the models use
formative measures, and about a quarter of the studies/models
do not explicitly state which measurement model was used
(Ringle et al. 2012). In addition, the number of studies using
formative measures in IS research has increased over time.
While there is an ongoing discussion on the interpretation and
use of formative measures (Aguirre-Urreta and Marakas 2012;
Diamantopoulos 2011; Edwards 2010; Jarvis et al. 2012;
Petter et al. 2012), there is general consensus that the theoretical
meaning of a construct should correspond to its empirical
meaning and that some theoretical constructs fit formative
specifications better than reflective specifications (Bagozzi
2011; Diamantopoulos and Winklhofer 2001; Jarvis et al.
2012; Petter et al. 2007). As Bagozzi (2011) notes, there are
different ontologies underlying formative and reflective measures,
which have different accompanying approaches for
interpreting and assessing the construct and its relationships
with other constructs. If researchers have chosen a formative
ontology, the discovery of unobserved heterogeneity in
formative indicator weights can assist them in evaluating
plausible differences in the construct’s theoretical or empirical
meaning between groups, thereby safeguarding against
interpretational confounds.
It is important to note that we do not recommend using
segmentation methods (including FIMIX-PLS and PLS-POS)
for post hoc, data-driven improvement of results where
researchers engage in fishing expeditions with the objective
of improving the significance of an association or the predictive
power of the model, as described earlier in the section on
the UHD process. Instead, consistent with Jedidi et al. (1997)
and Van de Ven (2007), we take the position that theory
development in the social and behavioral sciences does not
need to be confined to deductive reasoning. Moreover, in
situations in which the researcher discovers anomalies that
must be resolved through theoretical elaboration, theory
development is significantly enhanced by abduction. Segmentation
provides a mechanism to facilitate abduction by
surfacing anomalies, which must then be confronted and
resolved theoretically. Using the presented methods in PLS
path modeling and CB-SEM within the UHD process is a
possible way to achieve this goal.
Acknowledgments
We thank the senior editor, Ron Thomson, the associate editor, and
the reviewers for their constructive comments and valuable
suggestions. We also appreciate the comments from Ed Rigdon and
Detmar Straub from Georgia State University on our motivation and
initial ideas for this study.
References
Aguirre-Urreta, M. I., and Marakas, G. M. 2012. "Revisiting Bias Due to Construct Misspecification: Different Results from Considering Coefficients in Standardized Form," MIS Quarterly (36:1), pp. 123-138.
Allenby, G. M., and Rossi, P. E. 1998. "Marketing Models of Consumer Heterogeneity," Journal of Econometrics (89:1/2), pp. 57-78.
Anderberg, M. R. 1973. Cluster Analysis for Applications, New York: Academic Press.
Andrews, R. L., Brusco, M. J., and Currim, I. S. 2010. "Amalgamation of Partitions from Multiple Segmentation Bases: A Comparison of Non-Model-Based and Model-Based Methods," European Journal of Operational Research (201:2), pp. 608-618.
Andrews, R. L., and Currim, I. S. 2003a. "A Comparison of Segment Retention Criteria for Finite Mixture Logit Models," Journal of Marketing Research (40:2), pp. 235-243.
Andrews, R. L., and Currim, I. S. 2003b. "Retention of Latent Segments in Regression-Based Marketing Models," International Journal of Research in Marketing (20:4), pp. 315-321.
Ansari, A., Jedidi, K., and Jagpal, S. 2000. "A Hierarchical Bayesian Methodology for Treating Heterogeneity in Structural Equation Models," Marketing Science (19:4), pp. 328-347.
Arminger, G., Stein, P., and Wittenberg, J. 1999. "Mixtures of Conditional Mean- and Covariance-Structure Models," Psychometrika (64:4), pp. 475-494.
Bagozzi, R. P. 2011. "Measurement and Meaning in Information Systems and Organizational Research: Methodological and Philosophical Foundations," MIS Quarterly (35:2), pp. 261-292.
Bapna, R., Goes, P., Kwok Kee, W., and Zhongju, Z. 2011. "A Finite Mixture Logit Model to Segment and Predict Electronic Payments System Adoption," Information Systems Research (22:1), pp. 118-133.
Bart, Y., Shankar, V., Sultan, F., and Urban, G. L. 2005. "Are the Drivers and Role of Online Trust the Same for All Web Sites and Consumers? A Large-Scale Exploratory Empirical Study," The Journal of Marketing (69:4), pp. 133-152.
Cai, J.-H., and Song, X.-Y. 2010. "Bayesian Analysis of Mixtures in Structural Equation Models with Non-Ignorable Missing Data," British Journal of Mathematical and Statistical Psychology (63:3), pp. 491-508.
Cenfetelli, R. T., and Bassellier, G. 2009. "Interpretation of Formative Measurement in Information Systems Research," MIS Quarterly (33:4), pp. 689-707.
Chin, W. W. 1998. "The Partial Least Squares Approach to Structural Equation Modeling," in Modern Methods for Business Research, G. A. Marcoulides (ed.), Mahwah, NJ: Erlbaum, pp. 295-358.
Chin, W. W., and Dibbern, J. 2010. "A Permutation Based Procedure for Multi-Group PLS Analysis: Results of Tests of Differences on Simulated Data and a Cross Cultural Analysis of the Sourcing of Information System Services between Germany and the USA," in Handbook of Partial Least Squares: Concepts, Methods and Applications, V. Esposito Vinzi, W. W. Chin, J. Henseler, and H. Wang (eds.), Berlin: Springer, pp. 171-193.
Chin, W. W., Marcolin, B. L., and Newsted, P. R. 2003. "A Partial Least Squares Latent Variable Modeling Approach for Measuring Interaction Effects: Results from a Monte Carlo Simulation Study and an Electronic-Mail Emotion/Adoption Study," Information Systems Research (14:2), pp. 189-217.
Collier, J. E., and Bienstock, C. C. 2009. "Model Misspecification: Contrasting Formative and Reflective Indicators for a Model of E-Service Quality," Journal of Marketing Theory & Practice (17:3), pp. 283-293.
Colquitt, J. A., and Zapata-Phelan, C. P. 2007. "Trends in Theory Building and Theory Testing: A Five-Decade Study of the Academy of Management Journal," Academy of Management Journal (50:6), pp. 1281-1303.
Cook, T. D., and Campbell, D. T. 1976. "The Design and Conduct of Quasi-Experiments and True Experiments in Field Settings," in Handbook of Industrial and Organizational Psychology, M. D. Dunnette (ed.), Chicago: Rand McNally, pp. 223-326.
Cook, T. D., and Campbell, D. T. 1979. Quasi-Experimentation: Design and Analysis Issues for Field Settings, Chicago: Rand McNally.
Davis, F. D., Bagozzi, R. P., and Warshaw, P. R. 1989. "User Acceptance of Computer Technology: A Comparison of Two Theoretical Models," Management Science (35:8), pp. 982-1003.
DeSarbo, W. S., and Cron, W. L. 1988. "A Maximum Likelihood Methodology for Clusterwise Linear Regression," Journal of Classification (5:2), pp. 249-282.
DeSarbo, W. S., Di Benedetto, C. A., Jedidi, K., and Song, M. 2006. "Identifying Sources of Heterogeneity for Empirically Deriving Strategic Types: A Constrained Finite-Mixture Structural-Equation Methodology," Management Science (52:6), pp. 909-924.
DeSarbo, W. S., Ramaswamy, V., and Cohen, S. H. 1995. "Market Segmentation with Choice-Based Conjoint Analysis," Marketing Letters (6:2), pp. 137-147.
Diamantopoulos, A. 2011. "Incorporating Formative Measures into Covariance-Based Structural Equation Models," MIS Quarterly (35:2), pp. 335-358.
Diamantopoulos, A., and Papadopoulos, N. 2010. "Assessing the Cross-National Invariance of Formative Measures: Guidelines for International Business Researchers," Journal of International Business Studies (41:2), pp. 360-370.
Diamantopoulos, A., Riefler, P., and Roth, K. P. 2008. "Advancing Formative Measurement Models," Journal of Business Research (61:12), pp. 1203-1218.
Diamantopoulos, A., and Winklhofer, H. M. 2001. "Index Construction with Formative Indicators: An Alternative to Scale Development," Journal of Marketing Research (38:2), pp. 269-277.
Dolan, C., and van der Maas, H. 1998. "Fitting Multivariate Normal Finite Mixtures Subject to Structural Equation Modeling," Psychometrika (63:3), pp. 227-253.
Edmondson, A. C., and McManus, S. E. 2007. "Methodological Fit in Management Field Research," Academy of Management Review (32:4), pp. 1155-1179.
Edwards, J. R. 2010. "The Fallacy of Formative Measurement," Organizational Research Methods (14:2), pp. 370-388.
Edwards, J. R., and Lambert, L. S. 2007. "Methods for Integrating Moderation and Mediation: A General Analytical Framework Using Moderated Path Analysis," Psychological Methods (12:1), pp. 1-22.
Esposito Vinzi, V., Trinchera, L., and Amato, S. 2010. "PLS Path Modeling: From Foundations to Recent Developments and Open Issues for Model Assessment and Improvement," in Handbook of Partial Least Squares: Concepts, Methods and Applications, V. Esposito Vinzi, W. W. Chin, J. Henseler, and H. Wang (eds.), Berlin: Springer, pp. 47-82.
Esposito Vinzi, V., Trinchera, L., Squillacciotti, S., and Tenenhaus, M. 2008. "REBUS-PLS: A Response-Based Procedure for Detecting Unit Segments in PLS Path Modelling," Applied Stochastic Models in Business & Industry (24:5), pp. 439-458.
Fraley, C., and Raftery, A. 2002. "Model-Based Clustering, Discriminant Analysis, and Density Estimation," Journal of the American Statistical Association (97:458), pp. 611-631.
Gilbride, T. J., Allenby, G. M., and Brazell, J. D. 2006. "Models for Heterogeneous Variable Selection," Journal of Marketing Research (43:3), pp. 420-430.
Goodhue, D., Lewis, W., and Thompson, R. 2007. "Statistical Power in Analyzing Interaction Effects: Questioning the Advantage of PLS with Product Indicators," Information Systems Research (18:2), pp. 211-227.
Haenlein, M., and Kaplan, A. M. 2011. "The Influence of Observed Heterogeneity on Path Coefficient Significance: Technology Acceptance Within the Marketing Discipline," The Journal of Marketing Theory and Practice (19:2), pp. 153-168.
Hahn, C., Johnson, M. D., Herrmann, A., and Huber, F. 2002. "Capturing Customer Heterogeneity Using a Finite Mixture PLS Approach," Schmalenbach Business Review (SBR) (54:3), pp. 243-269.
Heeler, R. M., and Ray, M. L. 1972. "Measure Validation in Marketing," Journal of Marketing Research (9:4), pp. 361-370.
Henseler, J., and Chin, W. W. 2010. "A Comparison of Approaches for the Analysis of Interaction Effects Between Latent Variables Using Partial Least Squares Path Modeling," Structural Equation Modeling: A Multidisciplinary Journal (17:1), pp. 82-109.
Henson, J. M., Reise, S. P., and Kim, K. H. 2007. "Detecting Mixtures from Structural Model Differences Using Latent Variable Mixture Modeling: A Comparison of Relative Model Fit Statistics," Structural Equation Modeling (14:2), pp. 202-226.
Hsieh, J. J. P.-A., Rai, A., and Keil, M. 2008. "Understanding Digital Inequality: Comparing Continued Use Behavioral Models of the Socio-Economically Advantaged and Disadvantaged," MIS Quarterly (32:1), pp. 97-126.
Jaccard, J., and Wan, C. K. 1995. "Measurement Error in the Analysis of Interaction Effects Between Continuous Predictors Using Multiple Regression: Multiple Indicator and Structural Equation Approaches," Psychological Bulletin (117:2), pp. 348-357.
Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. 2003. "A Critical Review of Construct Indicators and Measurement Model Misspecification in Marketing and Consumer Research," Journal of Consumer Research (30:2), pp. 199-218.
Jarvis, C. B., MacKenzie, S. B., and Podsakoff, P. M. 2012. "The Negative Consequences of Measurement Model Misspecification: A Response to Aguirre-Urreta and Marakas," MIS Quarterly (36:1), pp. 139-146.
Jedidi, K., Jagpal, H. S., and DeSarbo, W. S. 1997. "Finite-Mixture Structural Equation Models for Response-Based Segmentation and Unobserved Heterogeneity," Marketing Science (16:1), pp. 39-59.
Johns, G. 2006. "The Essential Impact of Context on Organizational Behavior," The Academy of Management Review (31:2), pp. 386-408.
Jöreskog, K. G. 1971. "Simultaneous Factor Analysis in Several Populations," Psychometrika (36:4), pp. 409-426.
Jöreskog, K. G. 1978. "Structural Analysis of Covariance and Correlation Matrices," Psychometrika (43:4), pp. 443-477.
Jöreskog, K. G. 1982. "The LISREL Approach to Causal Model-Building in the Social Sciences," in Systems Under Indirect Observation, Part I, H. Wold and K. G. Jöreskog (eds.), Amsterdam: North-Holland, pp. 81-100.
Jöreskog, K. G., and Yang, F. 1996. "Nonlinear Structural Equation Models: The Kenny-Judd Model with Interaction Effects," in Advanced Structural Equation Modeling: Issues and Techniques, G. A. Marcoulides and R. E. Schumacker (eds.), Mahwah, NJ: Lawrence Erlbaum Associates, pp. 57-87.
King, W. R., and He, J. 2006. "A Meta-Analysis of the Technology Acceptance Model," Information & Management (43:6), pp. 740-755.
Klein, A., and Moosbrugger, H. 2000. "Maximum Likelihood Estimation of Latent Interaction Effects with the LMS Method," Psychometrika (65:4), pp. 457-474.
Lee, S.-Y., and Song, X.-Y. 2003. "Bayesian Analysis of Structural Equation Models with Dichotomous Variables," Statistics in Medicine (22:19), pp. 3073-3088.
Lenk, P. J., DeSarbo, W. S., Green, P. E., and Young, M. R. 1996. "Hierarchical Bayes Conjoint Analysis: Recovery of Partworth Heterogeneity from Reduced Experimental Design," Marketing Science (15:2), pp. 173-191.
Lohmöller, J.-B. 1989. Latent Variable Path Modeling with Partial Least Squares, Heidelberg: Physica.
Lubke, G. H., and Muthén, B. 2005. "Investigating Population Heterogeneity With Factor Mixture Models," Psychological Methods (10:1), pp. 21-39.
Luo, L., Kannan, P. K., and Ratchford, B. T. 2008. "Incorporating Subjective Characteristics in Product Design and Evaluations," Journal of Marketing Research (45:2), pp. 182-194.
Mason, C. H., and Perreault, W. D. 1991. "Collinearity, Power, and Interpretation of Multiple Regression Analysis," Journal of Marketing Research (28:3), pp. 268-280.
McLachlan, G. J., and Peel, D. 2000. Finite Mixture Models, New York: Wiley.
Money, K. G., Hillenbrand, C., Henseler, J., and Da Camara, N. 2012. "Exploring Unanticipated Consequences of Strategy Amongst Stakeholder Segments: The Case of a European Revenue Service," Long Range Planning (45:5/6), pp. 395-423.
Muthén, B. O. 1989. "Latent Variable Modeling in Heterogeneous Populations," Psychometrika (54:4), pp. 557-585.
Muthén, B. O. 1994. "Multilevel Covariance Structure Analysis," Sociological Methods & Research (22:3), pp. 376-398.
Palumbo, F., Romano, R., and Esposito Vinzi, V. 2008. "Fuzzy PLS Path Modeling: A New Tool For Handling Sensory Data," in Data Analysis, Machine Learning and Applications: Proceedings of the 31st Annual Conference of the Gesellschaft für Klassifikation, C. Preisach, H. Burkhardt, L. Schmidt-Thieme, and R. Decker (eds.), Berlin: Springer, pp. 689-696.
Papies, D., and Clement, M. 2008. "Adoption of New Movie Distribution Services on the Internet," Journal of Media Economics (21:3), pp. 131-157.
Parasuraman, A., Zeithaml, V. A., and Berry, L. L. 1988. "SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality," Journal of Retailing (64:1), pp. 12-40.
Petter, S., Rai, A., and Straub, D. 2012. "The Critical Importance of Construct Measurement Specification: A Response to Aguirre-Urreta and Marakas," MIS Quarterly (36:1), pp. 147-156.
Petter, S., Straub, D., and Rai, A. 2007. "Specifying Formative Constructs in Information Systems Research," MIS Quarterly (31:4), pp. 623-656.
Popkowski Leszczyc, P. T., and Bass, F. M. 1998. "Determining the Effects of Observed and Unobserved Heterogeneity on Consumer Brand Choice," Applied Stochastic Models and Data Analysis (14:2), pp. 95-115.
Qureshi, I., and Compeau, D. 2009. "Assessing Between-Group Differences in Information Systems Research: A Comparison of Covariance- and Component-Based SEM," MIS Quarterly (33:1), pp. 197-214.
Rabe-Hesketh, S., Skrondal, A., and Pickles, A. 2004. "Generalized Multilevel Structural Equation Modeling," Psychometrika (69:2), pp. 167-190.
Rai, A., Patnayakuni, R., and Seth, N. 2006. "Firm Performance Impacts of Digitally Enabled Supply Chain Integration Capabilities," MIS Quarterly (30:2), pp. 225-246.
R Core Team. 2013. R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna.
Reinartz, W. J., Echambadi, R., and Chin, W. W. 2002. "Generating Non-Normal Data for Simulation of Structural Equation Models Using Mattson’s Method," Multivariate Behavioral Research (37:2), pp. 227-244.
Reinartz, W. J., Haenlein, M., and Henseler, J. 2009. "An Empirical Comparison of the Efficacy of Covariance-Based and Variance-Based SEM," International Journal of Research in Marketing (26:4), pp. 332-344.
Reinecke, J. 2006. "Special Issue: Mixture Structural Equation Modeling," Methodology: European Journal of Research Methods for the Behavioral and Social Sciences (2:3), pp. 83-85.
Rigdon, E. E., Ringle, C. M., and Sarstedt, M. 2010. "Structural Modeling of Heterogeneous Data with Partial Least Squares," in Review of Marketing Research, N. K. Malhotra (ed.), Armonk, NY: M. E. Sharpe, pp. 255-296.
Ringle, C. M., Sarstedt, M., and Mooi, E. A. 2010a. "Response-Based Segmentation Using Finite Mixture Partial Least Squares: Theoretical Foundations and an Application to American Customer Satisfaction Index Data," Annals of Information Systems (8), pp. 19-49.
Ringle, C. M., Sarstedt, M., and Schlittgen, R. 2010b. "Finite Mixture and Genetic Algorithm Segmentation in Partial Least Squares Path Modeling: Identification of Multiple Segments in a Complex Path Model," in Advances in Data Analysis, Data Handling and Business Intelligence, A. Fink, B. Lausen, W. Seidel, and A. Ultsch (eds.), Berlin: Springer, pp. 167-176.
Ringle, C. M., Sarstedt, M., Schlittgen, R., and Taylor, C. R. 2013. "PLS Path Modeling and Evolutionary Segmentation," Journal of Business Research, forthcoming.
Ringle, C. M., Sarstedt, M., and Straub, D. 2012. "A Critical Look at the Use of PLS-SEM in MIS Quarterly," MIS Quarterly (36:1), pp. iii-viii.
Ringle, C. M., Wende, S., and Will, A. 2005. "SmartPLS 2.0," www.smartpls.de.
Rust, R. T., and Verhoef, P. C. 2005. "Optimizing the Marketing Interventions Mix in Intermediate-Term CRM," Marketing Science (24:3), pp. 477-489.
Sánchez, G. 2009. PATHMOX Approach: Segmentation Trees in Partial Least Squares Path Modeling, unpublished doctoral dissertation, Universitat Politècnica de Catalunya.
Sánchez, G., and Aluja, T. 2006. "PATHMOX: A PLS-PM Segmentation Algorithm," in Proceedings of the IASC Symposium on Knowledge Extraction by Modelling, International Association for Statistical Computing, Island of Capri, Italy.
Sánchez, G., and Aluja, T. 2012. "R Package pathmox: Segmentation Trees in Partial Least Squares Path Modeling (Version 0.1-1)," http://cran.r-project.org/web/packages/pathmox/.
Sánchez, G., and Trinchera, L. 2013. "R Package PLSPM (version 0.3.5)," http://cran.r-project.org/web/packages/plspm/.
Sarstedt, M. 2008. "A Review of Recent Approaches for Capturing Heterogeneity in Partial Least Squares Path Modelling," Journal of Modelling in Management (3:2), pp. 140-161.
Sarstedt, M., Becker, J.-M., Ringle, C. M., and Schwaiger, M. 2011a. "Uncovering and Treating Unobserved Heterogeneity with FIMIX-PLS: Which Model Selection Criterion Provides an Appropriate Number of Segments?," Schmalenbach Business Review (63:1), pp. 34-62.
Sarstedt, M., Henseler, J., and Ringle, C. M. 2011b. "Multi-Group Analysis in Partial Least Squares (PLS) Path Modeling: Alternative Methods and Empirical Results," in Advances in International Marketing, Volume 22, M. Sarstedt, M. Schwaiger, and C. R. Taylor (eds.), Bingley, UK: Emerald Group Publishing Limited, pp. 195-218.
Sarstedt, M., and Ringle, C. M. 2010. "Treating Unobserved Heterogeneity in PLS Path Modelling: A Comparison of FIMIX-PLS with Different Data Analysis Strategies," Journal of Applied Statistics (37:8), pp. 1299-1318.
Sarstedt, M., Schwaiger, M., and Ringle, C. M. 2009. "Do We Fully Understand the Critical Success Factors of Customer Satisfaction with Industrial Goods? Extending Festge and Schwaiger’s Model to Account for Unobserved Heterogeneity," Journal of Business Market Management (3:3), pp. 185-206.
Sörbom, D. 1974. "A General Method for Studying Differences in Factor Means and Factor Structure between Groups," British Journal of Mathematical and Statistical Psychology (27:2), pp. 229-239.
Späth, H. 1979. "Algorithm 39: Clusterwise Linear Regression," Computing (22:4), pp. 367-373.
Squillacciotti S 2005 Prediction Oriented Classification in PLS
Path Modeling in PLS & Marketing Proceedings of the 4th
International Symposium on PLS and Related Methods T Aluja
J Casanovas V Esposito Vinzi and M Tenenhaus (eds) Paris
DECISIA pp 499506
Squillacciotti S 2010 PredictionOriented Classification in PLS
Path Modeling in Handbook of Partial Least Squares
Concepts Methods and Applications V Esposito Vinzi W W
Chin J Henseler and H Wang (eds) Berlin Springer pp
219233
Srite M and Karahanna E 2006 The Role of Espoused
National Cultural Values in Technology Acceptance MIS
Quarterly (303) pp 679704
Steenkamp JB E M and Baumgartner H 1998 Assessing
Measurement Invariance in CrossNational Consumer Research
Journal of Consumer Research (251) pp 7890
Straub D W 1989 Validating Instruments in MIS Research
MIS Quarterly (132) pp 147169
Tenenhaus M Esposito Vinzi V Chatelin YM and Lauro C
2005 PLS Path Modeling Computational Statistics & Data
Analysis (481) pp 159205
Tueller S and Lubke G 2010 Evaluation of Structural Equa
tion Mixture Models Parameter Estimates and Correct Class
Assignment Structural Equation Modeling A Multidisciplinary
Journal (172) pp 165192
Van de Ven A H 2007 Engaged Scholarship A Guide for
Organizational and Social Research New York Oxford
University Press
Vandenberg R J and Lance C E 2000 A Review and
Synthesis of the Measurement Invariance Literature Sugges
tions Practices and Recommendations for Organizational
Research Organizational Research Methods (31) pp 470
Venkatesh V 2000 Determinants of Perceived Ease of Use
Integrating Control Intrinsic Motivation and Emotion into the
Technology Acceptance Model Information Systems Research
(114) pp 342365
Venkatesh V and Bala H 2008 Technology Acceptance Model
3 and a Research Agenda on Interventions Decision Sciences
(392) pp 273315
Venkatesh V and Davis F D 2000 A Theoretical Extension of
the Technology Acceptance Model Four Longitudinal Field
Studies Management Science (462) pp 186204
Venkatesh V and Morris M G 2000 Why Don’t Men Ever
Stop to Ask for Directions Gender Social Influence and Their
Role in Technology Acceptance and Usage Behavior MIS
Quarterly (241) pp 115139
Venkatesh V Morris M G Davis G B and Davis F D 2003
User Acceptance of Information Technology Toward a Unified
View MIS Quarterly (273) pp 425478
Wang J and Keil M 2007 A MetaAnalysis Comparing the
Sunk Cost Effect for IT and NonIT Projects Information
Resources Management Journal (203) pp 118
Wedel M and DeSarbo W S 1994 A Review of Latent Class
Regression Models and their Applications in Advanced
Methods for Marketing Research R P Bagozzi (ed) Cam
bridge UK Blackwell Business pp 353388
Wedel M and Kamakura W 2000 Market Segmentation Con
ceptual and Methodological Foundations (2nd ed) New York
Kluwer Academic Publishers
Wetzels M OdekerkenSchröder G and van Oppen C 2009
Using PLS Path Modeling for Assessing Hierarchical Construct
Models Guidelines and Empirical Illustration MIS Quarterly
(331) pp 177195
Wold H 1982 Soft Modeling The Basic Design and Some
Extensions in Systems Under Indirect Observations Part I
K G Jöreskog and H Wold (eds) Amsterdam NorthHolland
pp 154
Wu J and Lederer A 2009 A MetaAnalysis of the Role of
EnvironmentBased Voluntariness in Information Technology
Acceptance MIS Quarterly (332) pp 419432
About the Authors
Jan-Michael Becker is a postdoctoral researcher at the University of Cologne. He received his doctoral degree in Marketing from the University of Cologne, Germany, and his diploma in Information Systems from the University of Hamburg, Germany. He has been a visiting scholar at Georgia State University several times. His research interests focus on structural equation modeling, PLS path modeling, unobserved heterogeneity, and mixture models, as well as brand management and bridging marketing and IS problems.
Arun Rai is Regents’ Professor and the Harkins Chair in the Center for Process Innovation and the Department of Computer Information Systems at the Robinson College of Business, Georgia State University. His research has examined how firms can leverage information technologies in their strategies, interfirm relationships, and processes, and how systems can be successfully developed and implemented. His articles have appeared in Management Science, MIS Quarterly, Information Systems Research, Journal of Management Information Systems, Journal of Operations Management, and other journals. He serves, or has served, as a senior editor at Information Systems Research, MIS Quarterly, and Journal of Strategic Information Systems, and as an associate editor at Information Systems Research, Management Science, Journal of MIS, and MIS Quarterly. He was named Fellow of the Association for Information Systems in 2010 in recognition for outstanding contributions to the Information Systems discipline.
Christian M. Ringle is Professor of Management at the Hamburg University of Technology (TUHH), Germany, and Visiting Professor at the University of Newcastle, Australia. His research concerns improvements of quantitative methods for business research applied to study management and marketing issues. His work has been published in outlets that include MIS Quarterly, International Journal of Research in Marketing, Journal of the Academy of Marketing Science, Journal of Service Research, Journal of Business Research, and Long Range Planning.
Franziska Völckner is Professor of Marketing at the University of Cologne, Germany. Her research interest focuses on building and managing market-based assets. This interest bridges the areas of branding, consumer behavior, marketing metrics, and marketing strategy. Her work has been published in several academic journals, including Journal of Marketing, Journal of Marketing Research, International Journal of Research in Marketing, Journal of the Academy of Marketing Science, Journal of Service Research, Journal of Business Research, and Marketing Letters, among others.
Appendix A
Meta-Analyses of Information Systems Studies
Table A1. Meta-Analyses of IS Studies: Inconsistent Results Across a Range of Phenomena
Each entry lists the reference and journal, the scope, the purpose of the meta-analysis, the moderators/contingency variables examined, and the nature of the inconsistent findings (quoted from the reference; emphasis added in the original table).

IS Phenomenon: Decision Support System (DSS) Implementation Success
  Reference, Journal: Alavi and Joachimsthaler 1992, MISQ
    Scope: 144 findings from 33 studies
    Meta-Analysis Purpose: Investigating the relationship between user-related factors and DSS implementation success
    Moderators/Contingency Variables Examined: Authors suggest that moderators could explain the large variance in effect sizes across studies
    Nature of Inconsistent Findings: "Reviews of information systems implementation research…have revealed that collectively implementation studies have yielded conflicting and somewhat confusing findings"

IS Phenomenon: Group Support Systems (GSS)
  Reference, Journal: Dennis et al. 2001, MISQ
    Scope: 61 articles
    Meta-Analysis Purpose: Developing a new model for interpreting GSS effects on firm performance
    Moderators/Contingency Variables Examined: Fit between the Task and the GSS Structures; Appropriation Support Received
    Nature of Inconsistent Findings: "Many previous papers have lamented the fact that the findings of past GSS research have been inconsistent. This paper develops a new model for interpreting GSS effects on performance…"

IS Phenomenon: IT Investment Payoff
  Reference, Journal: Kohli and Devaraj 2003, ISR
    Scope: 66 studies
    Meta-Analysis Purpose: Examining structural variables that explain why some IT payoff studies observe a positive effect and some do not
    Moderators/Contingency Variables Examined: Dependent Classification; Sample Size; Data Source; Type of IT Impact; Type of IT Assets; Industry
    Nature of Inconsistent Findings: "…some studies have shown mixed results in establishing a relationship between IT investment and firm performance"

IS Phenomenon: IT Innovation Adoption
  Reference, Journal: Lee and Xia 2006, I&M
    Scope: 54 correlations from 21 studies
    Meta-Analysis Purpose: Investigating the effects of organizational size on IT innovation adoption
    Moderators/Contingency Variables Examined: Type of Innovation; Type of Organization; Stage of Adoption; Scope of Size; Industry Sector
    Nature of Inconsistent Findings: "…empirical results on the relationship between them have been disturbingly mixed and inconsistent…explain and resolve these mixed results by… examining the effects of six moderators on the relationship"

IS Phenomenon: IT Project Escalation
  Reference, Journal: Wang and Keil 2007, IRMJ
    Scope: 12 articles with 20 separate experiments
    Meta-Analysis Purpose: Investigating the effect size of sunk cost on project escalation and determining whether there is a difference in effect sizes between IT and non-IT projects
    Moderators/Contingency Variables Examined: IT vs. Non-IT Projects
    Nature of Inconsistent Findings: "…because of the strong magnitude and heterogeneity of effect sizes for the sunk cost effect we need more primary studies that investigate potential moderators of sunk cost"

IS Phenomenon: Turnover of IT Professionals
  Reference, Journal: Joseph et al. 2007, MISQ
    Scope: 33 studies
    Meta-Analysis Purpose: Integrating the 43 antecedents of turnover intentions of IT professionals in a unified framework using meta-analytic structural equation modeling
    Moderators/Contingency Variables Examined: Age; Gender Ratio of Sample; Operationalization of Turnover Intention; Operationalization of Antecedents
    Nature of Inconsistent Findings: "…our narrative review finds several inconsistent (e.g., organization tenure and role conflict) and inconclusive (e.g., age and gender) findings"

IS Phenomenon: IS Implementation Success
  Reference, Journal: Sharma and Yetton 2003, MISQ
    Scope: 22 studies
    Meta-Analysis Purpose: Proposing a contingent model in which task interdependence moderates the effect of management support on implementation success
    Moderators/Contingency Variables Examined: Task Interdependence
    Nature of Inconsistent Findings: "A meta-analysis of the empirical literature provides strong support for the model and begins to explain the wide variance in empirical findings" "The theory developed and findings reported above help to explain the inconsistent findings in the literature"
  Reference, Journal: Sabherwal et al. 2006, Management Science
    Scope: 612 findings from 121 studies
    Meta-Analysis Purpose: Explaining the interrelationships among four constructs representing the success of a specific information system, and the relationships of these IS success constructs with four user-related constructs and two constructs representing the context
    Moderators/Contingency Variables Examined: Authors suggest that possible moderators include voluntariness of IS adoption and user characteristics such as age and gender
    Nature of Inconsistent Findings: "Despite considerable empirical research results on the relationships among constructs related to information system (IS) success as well as the determinants of IS success are often inconsistent"
  Reference, Journal: Sharma and Yetton 2007, MISQ
    Scope: 27 studies
    Meta-Analysis Purpose: Proposing a contingent model in which the effect of training on IS implementation success is a function of technical complexity and task interdependence
    Moderators/Contingency Variables Examined: Technical Complexity; Task Interdependence
    Nature of Inconsistent Findings: "Research has investigated the main effect of training on information systems implementation success. However empirical support for this model is inconsistent"

IS Phenomenon: Technology Acceptance
  Reference, Journal: King and He 2006, I&M
    Scope: 88 studies
    Meta-Analysis Purpose: Summarizing TAM research and investigating conditions under which TAM may have different effects
    Moderators/Contingency Variables Examined: Type of Users; Type of Usage
    Nature of Inconsistent Findings: "all TAM relationships are not borne out in all studies there is wide variation in the predicted effects in various studies…" "Since there are inconsistencies in TAM results a meta-analysis is more likely to appropriately integrate the positive and the negative"
  Reference, Journal: Schepers and Wetzels 2007, I&M
    Scope: 51 articles containing 63 studies
    Meta-Analysis Purpose: Analyzing the role of subjective norms and three inter-study moderating factors
    Moderators/Contingency Variables Examined: Type of Respondents; Type of Technology; Culture
    Nature of Inconsistent Findings: "First the subjective norm has had a mixed and inconclusive role…Some studies found considerable impacts of it on the dependent variables. However others did not find significant effects"
  Reference, Journal: Wu and Lederer 2009, MISQ
    Scope: 71 studies
    Meta-Analysis Purpose: Investigating the impact of environment-based voluntariness on the relationships among the four primary TAM constructs (i.e., ease of use, perceived usefulness, behavioral intention, and usage)
    Moderators/Contingency Variables Examined: Environment-Based Voluntariness
    Nature of Inconsistent Findings: "The Q statistic for each of the five correlations exceeded its cutoff and thus the analyses confirmed heterogeneity for each (p < 001). That is of all the correlations vary across studies more than would be produced by sampling error"
Appendix B
Prediction-Oriented Segmentation for PLS Path Modeling (PLS-POS)
Overview
As a distance-based segmentation method, the PLS prediction-oriented segmentation (PLS-POS) method builds on earlier work on distance-measure-based segmentation, that is, the PLS typological path modeling (PLS-TPM) approach (Squillacciotti 2005) and its enhancement, the response-based detection of respondent segments in PLS (REBUS-PLS) (Esposito Vinzi et al. 2008). PLS-TPM and REBUS-PLS are applicable only to PLS path models with reflective measures (Esposito Vinzi et al. 2008; Sarstedt 2008). To overcome this methodological limitation and extend the distance-measure-based PLS segmentation methods, the PLS-POS algorithm introduces three novel features: (1) it uses an explicit PLS-specific objective criterion to form homogeneous groups; (2) it includes a new distance measure that is appropriate for PLS path models with both reflective and formative measures and is able to uncover unobserved heterogeneity in formative measures; and (3) it ensures continuous improvement of the objective criterion throughout the iterations of the algorithm (hill-climbing approach). Table B1 shows the key technical differences of the new PLS-POS method in comparison with the main distance-based methods (i.e., PLS-TPM and REBUS-PLS) and the popular finite-mixture method for PLS (i.e., FIMIX-PLS).
The following sections explain the distinctive features of PLS-POS in greater detail. We first describe the objective criterion of PLS-POS. An explanation of the distance measure and its extension to formative measurement models follows. Finally, we detail the specific steps and procedures of the algorithm and how it ensures the continuous improvement of the objective criterion.
Table B1. Comparison of the Technical Differences of FIMIX-PLS, PLS-TPM, REBUS-PLS, and PLS-POS
Approaches compared: FIMIX-PLS (Hahn et al. 2002) is a finite-mixture segmentation approach; PLS-TPM (Squillacciotti 2005, 2010), REBUS-PLS (Esposito Vinzi et al. 2008, 2010), and PLS-POS are distance-based clustering approaches.

Algorithm Feature: Distributional assumptions
  FIMIX-PLS: Yes
  PLS-TPM: No
  REBUS-PLS: No
  PLS-POS: No
Algorithm Feature: Preclustering
  FIMIX-PLS: No preclustering; random split of observations
  PLS-TPM: Hierarchical classification based on redundancy residuals of the overall model
  REBUS-PLS: Hierarchical classification based on communality and structural residuals of the overall model
  PLS-POS: No preclustering; random split of observations and assignment to closest segment according to the distance measure
Algorithm Feature: Distance measure
  FIMIX-PLS: Has no distance measure†
  PLS-TPM: Based on redundancy residuals of a single reflective endogenous latent variable
  REBUS-PLS: Based on communality residuals of all latent variables and structural residuals of all endogenous latent variables
  PLS-POS: Based on structural residuals of all endogenous latent variables, with an extension that also accounts for heterogeneity in formative measures
Algorithm Feature: Accounts for sources of heterogeneity in reflective measures
  FIMIX-PLS: No
  PLS-TPM: No
  REBUS-PLS: Yes
  PLS-POS: No
Algorithm Feature: Accounts for sources of heterogeneity in formative measures
  FIMIX-PLS: No
  PLS-TPM: No‡
  REBUS-PLS: No‡
  PLS-POS: Yes
Algorithm Feature: Accounts for sources of heterogeneity in the structural model
  FIMIX-PLS: Yes
  PLS-TPM: Yes
  REBUS-PLS: Yes
  PLS-POS: Yes
Algorithm Feature: Assignment of observations to segments in each iteration
  FIMIX-PLS: Proportional assignment of all observations to all segments based on the conditional multivariate normal densities to optimize the likelihood function
  PLS-TPM: Assigns all observations to the closest segment
  REBUS-PLS: Assigns all observations to the closest segment
  PLS-POS: Assigns only one observation to the closest segment and assures improvement of an objective criterion (R² of all endogenous latent variables) before accepting the change
Algorithm Feature: Stop criterion
  FIMIX-PLS: Extremely small improvement in log-likelihood below critical value (or maximum number of iterations)
  PLS-TPM: Stability of the classes’ composition (no reassignment of observations) or maximum number of iterations
  REBUS-PLS: Stability of the classes’ composition (number of reassignments below a critical percentage value of observations) or maximum number of iterations
  PLS-POS: Infinitesimal improvement in objective criterion (or maximum number of iterations)

†FIMIX-PLS assumes that each endogenous latent variable is distributed as a finite mixture of conditional multivariate normal densities. It uses these densities to estimate probabilities of segment memberships for each observation (proportional assignment) to optimize the likelihood function (which implicitly maximizes the segment-specific explained variance as part of the likelihood function).
‡As in PLS-TPM, … [REBUS-PLS] ‘distance’ has so far only been implemented on models with reflective blocks. Although this is not to be considered a strict limitation for many applications, it must be pointed out that REBUS-PLS requires all blocks to be reflective (Esposito Vinzi et al. 2008, p. 444). This requirement for models with only reflective measures also holds for the REBUS-PLS implementation in the PLSPM package (Sánchez and Trinchera 2013) for the statistical software R (R Core Team 2013).

Objective Criterion of PLS-POS
The main segmentation objective in PLS is to form homogeneous groups of observations that show increased endogenous variables’ explained variance (R²) and thus provide an improved prediction (compared to the overall sample), which is in accordance with Anderberg’s (1973, p. 195) notion of clustering for maximum prediction. Consequently, possible PLS-specific, and thus prediction-oriented, objective criteria include the following: (1) the sum of the manifest variables’ redundancy residuals in the reflective measures; (2) the sum of endogenous latent variables’ R² values in the structural model; and (3) the goodness-of-fit criterion (GoF; Tenenhaus et al. 2005)¹ for assessing both the structural model and the reflective measures.
Including the residual terms of the manifest variables would only be appropriate for assessing the explained variance, and thus the predictive performance, in reflective measures. Because PLS path modeling allows for the use of both reflective and formative measures, objective criteria that draw on the manifest variables’ residual terms do not support the general applicability of PLS-POS to both types of measurement models (i.e., reflective and formative). Consequently, the redundancy and communality residuals in the reflective measures, which are also included in the PLS-GoF measure, are not a useful criterion for the purpose of the PLS segmentation method.
An appropriate PLS-specific objective criterion maximizes the sum of the endogenous latent variables’ R² values. In accordance with the PLS algorithm’s objective (Lohmöller 1989; Wold 1982), PLS-POS focuses on maximizing the predictivity of each group by minimizing the sum of the endogenous latent variables’ squared residuals in the PLS path model. Thus, the sum of each group’s sum of R² values represents the objective criterion, which is explicitly defined and calculated in the PLS-POS algorithm. Every reassignment of observations in PLS-POS ensures improvement of the objective criterion (hill-climbing approach; see the description of the algorithm below). This objective criterion is suitable for any PLS path model, regardless of whether the model includes reflective or formative measures.
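The objective criterion described above can be illustrated with a minimal Python sketch. It assumes that the observed and predicted scores of each endogenous latent variable are already available for every group; the function names and the `groups` data structure are hypothetical and are not part of any PLS-POS implementation.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot for one endogenous variable in one group."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


def objective_criterion(groups):
    """Sum the R^2 values over all groups and all endogenous variables.

    `groups` is a hypothetical structure: one dict per group that maps the
    name of an endogenous variable to a pair (observed, predicted) of lists.
    """
    return sum(
        r_squared(observed, predicted)
        for group in groups
        for observed, predicted in group.values()
    )


# Illustrative scores only: group 1 is predicted perfectly (R^2 = 1),
# group 2 is predicted by its mean (R^2 = 0).
groups = [
    {"Y": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])},
    {"Y": ([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])},
]
total = objective_criterion(groups)
```

A reassignment that raises `total` would be accepted under the hill-climbing logic; one that lowers it would be undone.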
Distance Measure
To reassign observations, PLS-POS builds on the idea of Squillacciotti (2005) and Esposito Vinzi et al. (2008) to use a distance measure. We propose a new distance measure that is applicable to both reflective and formative measures and accounts for heterogeneity in the structural model and the formative measurement model. This observation-to-group distance measure identifies appropriate observations to form homogeneous groups and thereby identifies suitable candidates for improving the objective criterion. Within a group, each observation’s capability to predict the endogenous latent variables in the PLS path model determines its distance to that group: the shorter the distance of observation i to group g, the higher the predictivity of observation i in group g.
It is important to understand the conceptual difference between observation i’s membership in its current group k (k = g; k, g ∈ G) and its distance to an alternative group g (k ≠ g; k, g ∈ G). For every endogenous latent variable b (b ∈ B), the latent variable scores $Y^{\text{exogenous}}_{a_b ik}$ of its direct predecessors and the corresponding structural model path coefficients $p_{a_b g}$ allow for the group-specific prediction of the endogenous latent variable scores $\hat{Y}_{big}$ via linear combinations. To calculate $\hat{Y}_{big} = \sum_{a_b=1}^{A_b} \left( Y^{\text{exogenous}}_{a_b ik} \times p_{a_b g} \right)$, we use the latent variable scores of an observation’s current group k and draw on the alternative group g’s PLS path coefficients $p_{a_b g}$. The difference between the predicted value $\hat{Y}_{big}$ and the current group’s latent variable scores $Y_{bik}$ from the PLS path model estimation is the residual of observation i in group g for the endogenous latent variable b (Equation 1).

$$e_{big}^{2} = \left( \hat{Y}_{big} - Y_{bik} \right)^{2} = \left( \sum_{a_b=1}^{A_b} Y^{\text{exogenous}}_{a_b ik} \times p_{a_b g} - Y^{\text{endogenous}}_{bik} \right)^{2} \tag{1}$$

The result of $e_{big}^{2}$ is an observation’s predictivity in its current group when k = g (k, g ∈ G). Furthermore, using the path coefficients $p_{a_b g}$ of alternative group-specific PLS estimations for k ≠ g (k, g ∈ G) provides a heuristic outcome for observation i’s predictivity in each of the G − 1 other possible group assignments. This establishes the new prediction-oriented PLS-POS distance measure as presented by Equation (2).

$$D_{kig} = \sum_{b=1}^{B} \sqrt{ \frac{e_{big}^{2}}{\sum_{i=1}^{I_k} e_{big}^{2}} } \tag{2}$$

The residuals of each observation i are divided by the sum of the residuals of all observations in i’s current group k ($I_k$ = sample size in group k). This ratio’s square root is the distance of an observation i to group g for an endogenous latent variable b (b ∈ B). The sum over all endogenous variables B in the PLS path model provides the total distance measure $D_{kig}$. The smaller the sum of the endogenous latent variables’ squared residual values, the higher the predictivity of observation i in group g of the underlying PLS path model.
¹ Against its naming, PLS-GoF does not represent a measure of fit for PLS path modeling; see Henseler and Sarstedt (2012) for a discussion.
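A small Python sketch may help to make Equations (1) and (2) concrete. It assumes that the latent variable scores of the observations in the current group k and the path coefficients of a group g have already been estimated; all function and variable names are hypothetical, and the numbers are illustrative only.

```python
import math


def squared_residuals(exog_scores, endog_scores, path_coefs):
    """Equation (1): squared residual of every observation in group k for one
    endogenous latent variable b, predicted with group g's path coefficients.

    exog_scores:  one list of predecessor scores per observation (group k)
    endog_scores: one endogenous score per observation (group k)
    path_coefs:   path coefficients of the predecessors, estimated in group g
    """
    residuals = []
    for x_row, y in zip(exog_scores, endog_scores):
        y_hat = sum(x * p for x, p in zip(x_row, path_coefs))
        residuals.append((y_hat - y) ** 2)
    return residuals


def distances(exog_by_b, endog_by_b, coefs_by_b):
    """Equation (2): D_kig for every observation i of the current group k.

    Each argument holds one entry per endogenous latent variable b.
    """
    n_obs = len(endog_by_b[0])
    total = [0.0] * n_obs
    for exog, endog, coefs in zip(exog_by_b, endog_by_b, coefs_by_b):
        e2 = squared_residuals(exog, endog, coefs)
        denom = sum(e2)
        if denom == 0.0:
            continue  # perfect prediction: this variable adds no distance
        for i, value in enumerate(e2):
            total[i] += math.sqrt(value / denom)
    return total


# One endogenous variable with two predecessors and three observations.
exog = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
endog = [0.5, 0.5, 2.0]
coefs_g = [0.5, 0.5]  # hypothetical path coefficients of group g
d = distances([exog], [endog], [coefs_g])
```

In this example the first two observations are predicted exactly by group g's coefficients and therefore have distance 0, whereas the third observation carries the whole residual sum and has distance 1.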
The distinction between formative and reflective measures requires particular attention in PLS path modeling (e.g., Diamantopoulos et al. 2001; Gudergan et al. 2008; Jarvis et al. 2003). Formative measures require (1) taking into account the indicators’ heterogeneity for each measurement model within each group and/or (2) uncovering the significant differences in weights between the groups. Therefore, calculating the group-specific residual term in models with formative measures requires an extension of the group-specific residual $e_{big}^{2}$ in the distance measure. The latent variable scores $Y^{\text{exogenous}}_{a_b ik}$ are replaced by linear combinations of the manifest variable scores $x_{a_b jik}$ and the corresponding measurement model’s formative weights $\pi_{a_b jg}$. Equation (3) shows the calculation of the residual term for formative measures in the PLS path model.

$$e_{big}^{2} = \left( \sum_{a_b=1}^{A_b} \sum_{j=1}^{J} \left( x_{a_b jik} \times \pi_{a_b jg} \right) \times p_{a_b g} - Y^{\text{endogenous}}_{bik} \right)^{2} \tag{3}$$

The formative latent variable scores become a group-wise re-estimated prediction of the associated manifest variables j when the squared residual is determined.
Algorithm
The segmentation process starts by randomly partitioning the overall sample into the prespecified number of G equally sized groups (Figure B1, Step 1). Calculating all group-specific PLS path model estimates reveals each observation’s distance to its own group and to all other G − 1 groups. A partitioning approach that assigns each observation to the group to which it has the shortest distance improves the initial segmentation. Subsequently, the PLS-POS algorithm computes the group-specific PLS path modeling results (Figure B1, Step 2), updates the objective function (Figure B1, Step 3), and computes the observations’ distances to all groups (Figure B1, Step 4.1). PLS-POS uses the distance measure to reassign observations based on the maximum value of the difference between an observation’s distance to its current group (i.e., the group to which the observation has been assigned) and its distance to an alternative group (Equation 4).

$$\Delta_{kig} = D_{kik} - D_{kig} \tag{4}$$

Here, $D_{kik}$ is the distance to the current group k and $D_{kig}$ is the distance to the alternative group g. Positive differences indicate that an observation has a shorter distance to the alternative group and thus potentially fits better in that group in terms of predictivity. This computation is conducted for all observations (Figure B1, Step 4.1). Each observation’s maximum positive difference becomes part of the list of candidates (Figure B1, Step 4.2). Negative values are not considered because reassigning these observations may decrease the objective criterion. Subsequently, the candidates are sorted in descending order in terms of their positive distance differences (Figure B1, Step 4.3).
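The construction of the candidate list can be sketched as follows in Python. The sketch assumes that a table of observation-to-group distances is already available; the names `distance_table` and `assignment` are hypothetical, and the distances are illustrative values rather than results of a PLS estimation.

```python
def candidate_list(distance_table, assignment):
    """Build the sorted list of reassignment candidates.

    distance_table[i][g]: distance of observation i to group g
    assignment[i]:        index of the current group k of observation i
    Returns tuples (difference, observation, alternative group), sorted in
    descending order of the difference.
    """
    candidates = []
    for i, row in enumerate(distance_table):
        k = assignment[i]
        best_diff, best_group = 0.0, None
        for g, dist in enumerate(row):
            if g == k:
                continue
            diff = row[k] - dist  # Equation (4)
            if diff > best_diff:  # keep only the maximum positive difference
                best_diff, best_group = diff, g
        if best_group is not None:
            candidates.append((best_diff, i, best_group))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Three observations, two groups, all currently assigned to group 0.
distance_table = [[0.2, 0.5], [0.9, 0.3], [0.6, 0.4]]
assignment = [0, 0, 0]
candidates = candidate_list(distance_table, assignment)
order = [(i, g) for _, i, g in candidates]
```

Observation 0 is closer to its current group and is therefore not a candidate; observation 1 has the largest positive difference and is tried first, followed by observation 2.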
After the STOP statement, PLS-POS provides the group-specific PLS path model estimates for the final segmentation solution (Figure B1, Step 7). The maximum number of iterations should be sufficiently high (e.g., twice the number of observations in the overall sample) to obtain a solution that is close to the global optimum. The maximum search depth equals the number of observations in the sorted list of candidate observations for reassignment and thus cannot exceed the number of observations in the overall sample. In early explorative research stages, one may use a reduced search depth for performance reasons. However, to determine the final segmentation result, the search depth should equal the maximum number of observations to ensure that the segmentation solution that maximizes the PLS-POS objective criterion (i.e., the sum of the endogenous latent variables’ R² values in the PLS path model) has been identified.
Finally, three important issues are worth noting. First, PLS-POS only reassigns observations that improve the objective criterion. As such, the algorithm ensures the continuous improvement of the objective criterion and potentially provides a solution that is at least close to the global optimum. Second, in each iteration step, the algorithm changes the assignment of only one observation and recalculates the group-specific PLS estimates and the new distance measures of all observations. Thus, in contrast to the alternative distance-based PLS segmentation approaches suggested in the literature to date (e.g., Esposito Vinzi et al. 2008; Squillacciotti 2005), PLS-POS avoids moving a sizeable set of (more or less) similar candidates from one group to another without improving the objective criterion. Third, owing to the implementation of a hill-climbing approach, PLS-POS could face the problem of ending in local optima. Wedel and Kamakura (2000) recommend running hill-climbing algorithms several times with alternative starting partitions and then selecting the best segmentation solution. The same procedure should be applied when using PLS-POS.
Step 1: Create an initial segmentation to start the algorithm
    Step 1.1: Randomly split the overall sample into G equally sized groups
    Step 1.2: Compute the group-specific PLS estimates for the path model
    Step 1.3: Establish each observation’s distance to each group
    Step 1.4: Assign each observation to the closest group
DO LOOP
    Step 2: Compute the group-specific PLS estimates for the path model
    Step 3: Determine the result of the objective criterion
    Step 4: Create a list of candidate observations for reassignment
        Step 4.1: Establish the G − 1 differences between each observation’s distance to its current group and its distance to each alternative group
        Step 4.2: IF an observation has one or more positive differences of distances THEN
                      Add the maximum difference and the observation’s corresponding alternative group assignment to a list of candidates
                  ELSE Do nothing
        Step 4.3: IF the list is empty THEN
                      GO TO STOP
                  ELSE Sort the list of candidate observations in descending order in terms of their positive distance differences
    Step 5: Improve the segmentation result
        Step 5.1: Select the first observation in the list of candidate observations for reassignment
        DO LOOP
            Step 5.2: Reassign the observation
            Step 5.3: Compute the group-specific PLS estimates for the path model
            Step 5.4: Determine the result of the objective criterion
            Step 5.5: IF the observation’s reassignment improves the objective criterion THEN
                          Save the current assignment and GO TO Step 6
                      ELSE Undo changes and continue with Step 5.6
            Step 5.6: IF the list contains a subsequent observation following the currently selected observation on the list of candidates AND the maximum search depth has not been reached THEN
                          Select the next observation
                      ELSE GO TO Step 6
        UNTIL the objective criterion is improved
    Step 6: IF the maximum number of iterations OR the maximum search depth has been reached THEN
                GO TO STOP
            ELSE GO TO Step 2
UNTIL STOP
Step 7: Compute the group-specific PLS path model estimates and provide the final segmentation results
Figure B1. The PLS-POS Algorithm
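The control flow of Figure B1 can be sketched in Python under strong simplifying assumptions. This is not the PLS-POS implementation: a single-predictor OLS regression stands in for the group-specific PLS estimation, the raw squared residual stands in for the distance measure of Equation (2), the initial reassignment of Steps 1.2 to 1.4 is skipped, and the search depth is not limited. All names are hypothetical.

```python
import random


def fit(points):
    """OLS fit of y on x; returns (slope, intercept, R^2). Stand-in for PLS."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0.0 or syy == 0.0:
        return 0.0, mean_y, 0.0
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy ** 2 / (sxx * syy)


def members(data, assignment, group):
    return [data[i] for i in range(len(data)) if assignment[i] == group]


def objective(data, assignment, n_groups):
    """Sum of the group-specific R^2 values; tiny groups are rejected."""
    total = 0.0
    for g in range(n_groups):
        pts = members(data, assignment, g)
        if len(pts) < 3:
            return float("-inf")
        total += fit(pts)[2]
    return total


def segment(data, n_groups, max_iterations, seed=0):
    assert len(data) >= 3 * n_groups
    rng = random.Random(seed)
    assignment = [i % n_groups for i in range(len(data))]
    rng.shuffle(assignment)  # Step 1.1: random split into equal groups
    start = best = objective(data, assignment, n_groups)
    for _ in range(max_iterations):
        # Step 2: group-specific estimates
        models = [fit(members(data, assignment, g)) for g in range(n_groups)]
        # Step 4: candidates with a positive distance difference
        candidates = []
        for i, (x, y) in enumerate(data):
            dist = [(slope * x + icpt - y) ** 2 for slope, icpt, _ in models]
            target = min(range(n_groups), key=dist.__getitem__)
            gain = dist[assignment[i]] - dist[target]
            if gain > 0.0:
                candidates.append((gain, i, target))
        candidates.sort(reverse=True)
        # Step 5: accept the first reassignment that improves the objective
        improved = False
        for _gain, i, target in candidates:
            previous = assignment[i]
            assignment[i] = target
            value = objective(data, assignment, n_groups)
            if value > best:
                best, improved = value, True
                break
            assignment[i] = previous  # undo the change
        if not improved:
            break  # empty list or no improving candidate: STOP
    return assignment, start, best


# Two hidden groups with opposite slopes (illustrative data only).
data = [(float(x), 0.8 * x) for x in range(1, 11)]
data += [(float(x), -0.8 * x) for x in range(1, 11)]
assignment, start, best = segment(data, n_groups=2, max_iterations=40)
```

Because a reassignment is kept only if it raises the objective, `best` can never fall below `start`. As with any hill-climbing procedure, the result depends on the random starting partition, so several seeds would normally be tried.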
Appendix C
Design of the Multicollinearity Factor for the Simulation Study
The design of the simulation study for the formative measurement model includes three levels of multicollinearity between the formative indicators in the model. To simulate different levels of multicollinearity, we draw on Mason and Perreault’s (1991) seminal study on multicollinearity (see also Grewal et al. 2004). We vary two levels of correlation patterns among the predictor variables, reflecting conditions typically encountered by researchers and practitioners. In addition, a situation in which the indicators are uncorrelated (orthogonal) serves as a baseline for comparison (i.e., a perfect formative measure) because this model is unaffected by multicollinearity.
Table C1 shows the two multicollinearity levels based on Mason and Perreault, including the trace of (X’X)⁻¹, det(X’X), and the condition number, as well as each variable’s variance inflation factor (VIF) associated with a given level of multicollinearity.

Table C1. Levels of Multicollinearity
                   Level 1                     Level 2
                     X1     X2     X3     X4     X1     X2     X3     X4
X1                 1.00                        1.00
X2                  .65   1.00                  .80   1.00
X3                  .40    .40   1.00           .60    .60   1.00
X4                  .00    .00    .00   1.00    .00    .00    .00   1.00
VIF                1.80   1.80   1.24   1.00   2.96   2.96   1.67   1.00
Trace (X’X)⁻¹      5.85                        8.59
Det(X’X)            .47                         .22
Condition no.      2.38                        3.42
Note: VIF = variance inflation factor.
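The VIF, trace, and determinant values in Table C1 can be reproduced from the correlation patterns with a short Python sketch. It assumes standardized indicators, so that X’X is proportional to the correlation matrix and the VIFs are the diagonal elements of its inverse; the helper names are hypothetical. The condition number is omitted because it requires an eigenvalue routine.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting.

    Returns the inverse and the determinant of a square matrix.
    """
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        scale = aug[col][col]
        det *= scale
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def correlation_matrix(r12, r13, r23):
    """Correlations among X1 to X3; X4 is orthogonal to all other indicators."""
    return [
        [1.0, r12, r13, 0.0],
        [r12, 1.0, r23, 0.0],
        [r13, r23, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


levels = {"Level 1": (0.65, 0.40, 0.40), "Level 2": (0.80, 0.60, 0.60)}
results = {}
for label, (r12, r13, r23) in levels.items():
    inverse, det = invert(correlation_matrix(r12, r13, r23))
    vifs = [inverse[j][j] for j in range(4)]  # VIF_j = j-th diagonal element
    results[label] = (vifs, sum(vifs), det)
    print(label, [round(v, 2) for v in vifs], round(sum(vifs), 2), round(det, 2))
```

The sum of the VIFs equals the trace of the inverse, which is why the trace row of Table C1 is the row sum of the VIF row.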
Appendix D
Simulation on the Effects of Unobserved Heterogeneity
The objective of this simulation study is to evaluate the implications of unobserved heterogeneity for structural model parameter estimates in
PLS path models. The results show that unobserved heterogeneity has a strong adverse effect on PLS estimation outcomes: (1) parameter
estimates are biased; (2) nonsignificant path coefficients at the group level become significant at the overall sample level that combines groups;
(3) sign differences in the parameter estimates between groups are manifested as nonsignificant results at the overall sample level; and
(4) explained variance of the model (R² of the endogenous variables) decreases. These erroneous estimates can lead to both Type I and Type II
errors and to invalid inferences.
The simulation study uses a path model with two exogenous variables having a direct effect on one endogenous variable (all variables measured
with five reflective indicators). We generate data for the true path coefficients of two groups by considering three situations of unobserved
heterogeneity:
• Situation 1, where the path coefficients between group 1 and group 2 differ but show the same sign. We consider scenarios where all
parameter estimates are positive (situation 1a) and negative (situation 1b), and where the magnitude in parameter differences between groups
is low (.1) and high (.5).
• Situation 2, where unobserved heterogeneity causes sign reversal in parameter estimates across the two groups (i.e., group 1 has a positive
path coefficient, while group 2 has a negative one).
• Situation 3, where one group has a nonsignificant parameter estimate and the other group has a significant parameter estimate. We distinguish
between two different levels of parameter differences, represented by the effect size of the significant parameter, namely .2 and .7.
We generated 100 sets of data for each condition and estimated the group-specific path coefficients, the overall sample path coefficients, and
the t-values of these coefficients by employing the bootstrapping procedure on 1,000 subsamples (Henseler et al. 2009).
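The pooling effect described in these situations can be illustrated with a small simulation. The sketch below is a deliberate simplification and not the study's procedure: it uses ordinary least squares on directly observed standardized variables instead of PLS with five reflective indicators per construct, it omits the bootstrap, and the function names (`simulate_group`, `ols`, `pooled_vs_group`) are hypothetical.

```python
import numpy as np

def simulate_group(beta, n, rng):
    """Draw n observations with two uncorrelated predictors and Var(y) = 1."""
    beta = np.asarray(beta, dtype=float)
    X = rng.standard_normal((n, 2))
    resid_var = 1.0 - beta @ beta  # population R² equals beta'beta
    y = X @ beta + rng.standard_normal(n) * np.sqrt(resid_var)
    return X, y

def ols(X, y):
    """Return the slope estimates and R² of an OLS regression with intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef[1:], 1.0 - resid.var() / y.var()

def pooled_vs_group(beta_g1, beta_g2, n=200, seed=0):
    """Estimate each group separately and then the pooled sample."""
    rng = np.random.default_rng(seed)
    X1, y1 = simulate_group(beta_g1, n, rng)
    X2, y2 = simulate_group(beta_g2, n, rng)
    return {
        "group1": ols(X1, y1),
        "group2": ols(X2, y2),
        "pooled": ols(np.vstack([X1, X2]), np.concatenate([y1, y2])),
    }

# Situation 2 (sign reversal): the pooled slopes and pooled R² collapse toward zero.
for label, (coef, r2) in pooled_vs_group([0.7, 0.2], [-0.7, -0.2]).items():
    print(label, np.round(coef, 2), round(r2, 2))
```

With equal group sizes, the pooled slopes land near the average of the two group-specific slopes, so opposite signs cancel while same-sign differences produce an estimate that matches neither group.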
Table D1 presents the results. The left side shows the group-specific mean estimates of the path coefficients and their average t-values.² The
columns on the right side show the mean path coefficients of the overall sample and the interpretation of the results in terms of bias, Type I
and II errors, and variance explained (R²).
² For a significance level of α = .05, the t-value has to exceed the threshold of 1.98 in these conditions.
The results show that, in all situations, biases in the parameter estimates distort effect sizes and cause misinterpretation of the path coefficients,
which is especially problematic for comparative hypotheses (e.g., path coefficient 1 > path coefficient 2). Type I and Type II errors are
exacerbated in situations where the group-specific parameters show inconsistent signs (i.e., situation 2, where signs are reversed across groups)
and when at least one of the groups involves nonsignificant parameters while the other group does not (i.e., situation 3). In contrast, when all
parameters are significant and show the same sign (situation 1), our results suggest that Type II errors are unlikely. In this
situation, the existence of Type II errors depends on the effect size and the degree to which the increased power of the combined sample size
compensates for the increase in standard errors due to unobserved heterogeneity. For all parameter constellations in our simulation study, the
increased sample size compensates for the increase in standard errors.
The R² decreases in almost all situations, implying an inferior model fit at the overall sample level. We find particularly strong decreases in
R² in situations in which the group-specific effect sizes are high; in contrast, R² is almost unaffected in situations showing low group-specific
effect sizes.
Table D1. Results of the Simulation Study

                      Group-Specific Parameter Estimates        Pooled Parameter Estimate
Situation  Estimate   Group 1 (n = 200)    Group 2 (n = 200)    Parameter (n = 400)   Biased   Type I Error   Type II Error   Lower R²
1a         Path 1     .7 (t = 18.57)       .2 (t = 3.84)        .45 (t = 11.36)       Yes      –              No              Yes
           Path 2     .2 (t = 3.94)        .7 (t = 19.64)       .45 (t = 11.54)
           R²         .53                  .53                  .41
1a         Path 1     .3 (t = 4.95)        .2 (t = 3.36)        .25 (t = 5.70)        Yes      –              No              (Yes)
           Path 2     .2 (t = 3.31)        .3 (t = 4.79)        .25 (t = 5.73)
           R²         .13                  .13                  .12
1b         Path 1     -.7 (t = -18.95)     -.2 (t = -4.01)      -.45 (t = -11.19)     Yes      –              No              Yes
           Path 2     -.2 (t = -3.70)      -.7 (t = -19.27)     -.45 (t = -11.44)
           R²         .53                  .53                  .24
1b         Path 1     -.3 (t = -5.03)      -.2 (t = -3.25)      -.25 (t = -5.61)      Yes      –              No              (Yes)
           Path 2     -.2 (t = -3.14)      -.3 (t = -5.09)      -.25 (t = -5.80)
           R²         .13                  .13                  .12
2          Path 1     .7 (t = 19.43)       -.7 (t = -19.09)     .00 (t = .01)         Yes      –              100%            Yes
           Path 2     .2 (t = 3.99)        -.2 (t = -3.78)      .00 (t = .00)                                 100%
           R²         .53                  .53                  .00
3          Path 1     .7 (t = 19.94)       .0 (t = .01)         .35 (t = 7.61)        Yes      100%           No              Yes
           Path 2     .0 (t = .01)         .7 (t = 19.89)       .35 (t = 7.38)                 100%
           R²         .49                  .49                  .24
3          Path 1     .2 (t = 3.38)        .0 (t = .01)         .10 (t = 1.88)        Yes      20%            80%             (Yes)
           Path 2     .0 (t = .00)         .2 (t = 3.17)        .10 (t = 1.90)                 40%            60%
           R²         .04                  .04                  .02
4          Path 1     .0 (t = .00)         .0 (t = .01)         .00 (t = .00)         –        No             –               –
           Path 2     .0 (t = .01)         .0 (t = .00)         .00 (t = .00)
           R²         .00                  .00                  .00
Appendix E
ANOVA Results—Model 1 (Reflective Measures)
Tables E1 to E4 present the ANOVA results for model 1 (reflective measures), explaining MAB by method (PLS-POS/FIMIX-PLS) and the
six design factors. All significant and substantial effects (i.e., all effects that explain more than 2 percent of the total variance in MAB, implying
a partial η² of more than .02) are highlighted in grey.
We find that the R², structural model heterogeneity, data distribution, and the interaction of structural model heterogeneity and R² have a
substantial and significant effect on the MAB of both methods. Furthermore, there is a significant and substantial difference in the parameter
recovery (MAB) of the two methods (PLS-POS and FIMIX-PLS) and for the interaction effects between the method and structural model
heterogeneity and between the method and R².
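The partial η² values reported in the ANOVA tables follow directly from each effect's F statistic and degrees of freedom, via the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The helper below is a minimal sketch of that identity; the function name is hypothetical, and the example inputs are the SMH and R² main effects of Table E1 read with their decimal points restored (F = 1121.71 and F = 1948.85, each with 3 degrees of freedom, error df = 11136).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    effect_term = f_value * df_effect  # equals SS_effect / MS_error
    return effect_term / (effect_term + df_error)

# Main effects of SMH and R² in Table E1 (values as read from the table).
print(round(partial_eta_squared(1121.71, 3, 11136), 3))  # 0.232
print(round(partial_eta_squared(1948.85, 3, 11136), 3))  # 0.344
```

With an error df of 11136, the .02 threshold used for "substantial" effects corresponds to F × df_effect of roughly 227, so an effect can be highly significant and still fall well below the threshold.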
Table E1. Between-Subjects Effects (Part I)
Source of Variance in MAB df F Sig. Partial η²
Intercept 1 14658.62 .000 .568
SMH 3 1121.71 .000 .232
R² 3 1948.85 .000 .344
Sample Size 2 70.77 .000 .013
Reliability 1 1.88 .170 .000
Data Distribution 1 497.52 .000 .043
RSS 1 22.62 .000 .002
SMH × R² 9 178.96 .000 .126
SMH × Sample Size 6 9.64 .000 .005
SMH × Reliability 3 1.33 .262 .000
SMH × Data Distribution 3 21.15 .000 .006
SMH × RSS 3 25.17 .000 .007
R² × Sample Size 6 11.44 .000 .006
R² × Reliability 3 .75 .524 .000
R² × Data Distribution 3 14.72 .000 .004
R² × RSS 3 29.76 .000 .008
Sample Size × Reliability 2 .48 .620 .000
Sample Size × Data Distribution 2 14.17 .000 .003
Sample Size × RSS 2 63.92 .000 .011
Reliability × Data Distribution 1 4.04 .044 .000
Reliability × RSS 1 .11 .735 .000
Data Distribution × RSS 1 267.72 .000 .023
SMH × R² × Sample Size 18 1.75 .026 .003
SMH × R² × Reliability 9 1.27 .249 .001
SMH × R² × Data Distribution 9 6.00 .000 .005
SMH × R² × RSS 9 2.32 .013 .002
SMH × Sample Size × Reliability 6 1.39 .216 .001
Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB, implying a partial η²
of more than .02) are highlighted in grey.
Table E2. Between-Subjects Effects (Part II)
Source of Variance in MAB df F Sig. Partial η²
SMH × Sample Size × Data Distribution 6 5.22 .000 .003
SMH × Sample Size × RSS 6 9.23 .000 .005
SMH × Reliability × Data Distribution 3 2.19 .087 .001
SMH × Reliability × RSS 3 3.50 .015 .001
SMH × Data Distribution × RSS 3 2.30 .075 .001
R² × Sample Size × Reliability 6 1.88 .080 .001
R² × Sample Size × Data Distribution 6 1.83 .089 .001
R² × Sample Size × RSS 6 13.00 .000 .007
R² × Reliability × Data Distribution 3 1.85 .135 .000
R² × Reliability × RSS 3 .42 .740 .000
R² × Data Distribution × RSS 3 7.83 .000 .002
Sample Size × Reliability × Data Distribution 2 1.65 .191 .000
Sample Size × Reliability × RSS 2 2.19 .112 .000
Sample Size × Data Distribution × RSS 2 17.14 .000 .003
Reliability × Data Distribution × RSS 1 1.08 .299 .000
SMH × R² × Sample Size × Reliability 18 .53 .948 .001
SMH × R² × Sample Size × Data Distribution 18 1.68 .036 .003
SMH × R² × Sample Size × RSS 18 2.11 .004 .003
SMH × R² × Reliability × Data Distribution 9 .68 .725 .001
SMH × R² × Reliability × RSS 9 .80 .614 .001
SMH × R² × Data Distribution × RSS 9 1.52 .135 .001
SMH × Sample Size × Reliability × Data Distribution 6 .60 .730 .000
SMH × Sample Size × Reliability × RSS 6 .79 .577 .000
SMH × Sample Size × Data Distribution × RSS 6 2.41 .025 .001
SMH × Reliability × Data Distribution × RSS 3 2.06 .104 .001
R² × Sample Size × Reliability × Data Distribution 6 1.52 .168 .001
R² × Sample Size × Reliability × RSS 6 1.04 .399 .001
R² × Sample Size × Data Distribution × RSS 6 4.75 .000 .003
R² × Reliability × Data Distribution × RSS 3 .26 .851 .000
Sample Size × Reliability × Data Distribution × RSS 2 .53 .588 .000
SMH × R² × Sample Size × Reliability × Data Distribution 18 .70 .817 .001
SMH × R² × Sample Size × Reliability × RSS 18 .70 .811 .001
SMH × R² × Sample Size × Data Distribution × RSS 18 .99 .473 .002
SMH × R² × Reliability × Data Distribution × RSS 9 .50 .874 .000
SMH × Sample Size × Reliability × Data Distribution × RSS 6 1.71 .115 .001
R² × Sample Size × Reliability × Data Distribution × RSS 6 1.41 .206 .001
SMH × R² × Sample Size × Reliability × Data Distribution × RSS 18 .96 .502 .002
Error 11136
Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
Table E3. Within-Subjects Effects (Part I)
Source of Variance in MAB df F Sig. Partial η²
Method 1 952.31 .000 .079
Method × SMH 3 217.47 .000 .055
Method × R² 3 137.14 .000 .036
Method × Sample Size 2 4.66 .009 .001
Method × Reliability 1 .00 .974 .000
Method × Data Distribution 1 87.97 .000 .008
Method × RSS 1 104.01 .000 .009
Method × SMH × R² 9 12.84 .000 .010
Method × SMH × Sample Size 6 2.79 .010 .002
Method × SMH × Reliability 3 .26 .854 .000
Method × SMH × Data Distribution 3 37.26 .000 .010
Method × SMH × RSS 3 .88 .450 .000
Method × R² × Sample Size 6 1.84 .087 .001
Method × R² × Reliability 3 .02 .995 .000
Method × R² × Data Distribution 3 19.48 .000 .005
Method × R² × RSS 3 3.98 .008 .001
Method × Sample Size × Reliability 2 .27 .765 .000
Method × Sample Size × Data Distribution 2 17.60 .000 .003
Method × Sample Size × RSS 2 16.60 .000 .003
Method × Reliability × Data Distribution 1 .02 .876 .000
Method × Reliability × RSS 1 1.49 .700 .000
Method × Data Distribution × RSS 1 14.37 .000 .001
Method × SMH × R² × Sample Size 18 .89 .589 .001
Method × SMH × R² × Reliability 9 1.33 .215 .001
Method × SMH × R² × Data Distribution 9 2.07 .029 .002
Method × SMH × R² × RSS 9 4.56 .000 .004
Method × SMH × Sample Size × Reliability 6 .73 .626 .000
Method × SMH × Sample Size × Data Distribution 6 3.94 .001 .002
Method × SMH × Sample Size × RSS 6 1.72 .112 .001
Method × SMH × Reliability × Data Distribution 3 .74 .527 .000
Method × SMH × Reliability × RSS 3 1.02 .381 .000
Method × SMH × Data Distribution × RSS 3 18.88 .000 .005
Method × R² × Sample Size × Reliability 6 .28 .945 .000
Method × R² × Sample Size × Data Distribution 6 2.09 .051 .001
Method × R² × Sample Size × RSS 6 3.57 .002 .002
Method × R² × Reliability × Data Distribution 3 .29 .835 .000
Method × R² × Reliability × RSS 3 1.28 .278 .000
Method × R² × Data Distribution × RSS 3 8.97 .000 .002
Method × Sample Size × Reliability × Data Distribution 2 .69 .501 .000
Method × Sample Size × Reliability × RSS 2 .13 .876 .000
Method × Sample Size × Data Distribution × RSS 2 8.98 .000 .002
Method × Reliability × Data Distribution × RSS 1 .00 .993 .000
Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB, implying a partial η²
of more than .02) are highlighted in grey.
Table E4. Within-Subjects Effects (Part II)
Source of Variance in MAB df F Sig. Partial η²
Method × SMH × R² × Sample Size × Reliability 18 .56 .930 .001
Method × SMH × R² × Sample Size × Data Distribution 18 1.95 .009 .003
Method × SMH × R² × Sample Size × RSS 18 1.47 .092 .002
Method × SMH × R² × Reliability × Data Distribution 9 .95 .484 .001
Method × SMH × R² × Reliability × RSS 9 1.07 .380 .001
Method × SMH × R² × Data Distribution × RSS 9 1.96 .040 .002
Method × SMH × Sample Size × Reliability × Data Distribution 6 .54 .775 .000
Method × SMH × Sample Size × Reliability × RSS 6 1.23 .286 .001
Method × SMH × Sample Size × Data Distribution × RSS 6 2.62 .015 .001
Method × SMH × Reliability × Data Distribution × RSS 3 .30 .828 .000
Method × R² × Sample Size × Reliability × Data Distribution 6 1.20 .305 .001
Method × R² × Sample Size × Reliability × RSS 6 .56 .766 .000
Method × R² × Sample Size × Data Distribution × RSS 6 2.59 .016 .001
Method × R² × Reliability × Data Distribution × RSS 3 .34 .798 .000
Method × Sample Size × Reliability × Data Distribution × RSS 2 .34 .711 .000
Method × SMH × R² × Sample Size × Reliability × Data Distribution 18 .49 .965 .001
Method × SMH × R² × Sample Size × Reliability × RSS 18 .44 .980 .001
Method × SMH × R² × Sample Size × Data Distribution × RSS 18 1.76 .024 .003
Method × SMH × R² × Reliability × Data Distribution × RSS 9 .47 .897 .000
Method × SMH × Sample Size × Reliability × Data Distribution × RSS 6 1.62 .138 .001
Method × R² × Sample Size × Reliability × Data Distribution × RSS 6 .32 .928 .000
Method × SMH × R² × Sample Size × Reliability × Data Distribution × RSS 18 .83 .667 .001
Error(Method) 11136
Note: df = degrees of freedom; MAB = mean absolute bias; RSS = relative segment size; SMH = structural model heterogeneity.
Appendix F
ANOVA Results—Model 2 (Formative Measures)
Tables F1 to F7 present the ANOVA results for model 2 (formative measures), explaining MAB by method (PLS-POS/FIMIX-PLS) and the
seven design factors. All significant and substantial effects (i.e., all effects that explain more than 2 percent of the total variance in MAB,
implying a partial η² of more than .02) are highlighted in grey.
We find that the R², structural and measurement model heterogeneity, sample size, multicollinearity, and data distribution; the interaction of
structural and measurement model heterogeneity; and the interaction of sample size and relative segment size have a substantial and significant
effect on the MAB of both methods. Furthermore, there is a significant and substantial difference in the parameter recovery (MAB) of the two
methods (PLS-POS and FIMIX-PLS) and for the two-way interaction effects between method and R², multicollinearity, and structural and
measurement model heterogeneity. Method even has a significant and substantial interaction effect with both structural and measurement model
heterogeneity (three-way interaction).
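The degrees of freedom in the ANOVA tables also reveal the size of the factorial design. A factor with k levels has k - 1 degrees of freedom, and in a balanced between-subjects design the error df equals the number of cells times the replications per cell minus one. The sketch below applies this arithmetic to the main-effect df of model 2 and its error df of 50112; the function name is hypothetical, and the balanced-design reading is an assumption rather than something stated in this appendix.

```python
from math import prod

def balanced_design_summary(main_effect_df, error_df):
    """Infer factor levels, cell count and replications per cell from ANOVA df."""
    levels = {factor: df + 1 for factor, df in main_effect_df.items()}
    cells = prod(levels.values())
    # In a balanced design: error_df = cells * (replications_per_cell - 1).
    per_cell_minus_one, remainder = divmod(error_df, cells)
    replications = per_cell_minus_one + 1 if remainder == 0 else None
    return levels, cells, replications

# Main-effect df of the seven design factors of model 2.
MODEL_2_DF = {"SMH": 3, "MMH": 2, "R2": 3, "Sample Size": 2,
              "RSS": 1, "Data Distribution": 1, "Multicollinearity": 2}

levels, cells, replications = balanced_design_summary(MODEL_2_DF, error_df=50112)
print(cells, replications)  # 1728 30
```

The same arithmetic applied to model 1 (six factors with df 3, 3, 2, 1, 1, 1 and an error df of 11136) gives 384 cells, which is likewise consistent with 30 replications per cell under the balanced-design assumption.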
Table F1. Between-Subjects Effects (Part I)
Source of Variance in MAB df F Sig. Partial η²
Intercept 1 142696.80 .00 .740
SMH 3 7605.33 .00 .313
MMH 2 2912.99 .00 .104
R² 3 4286.31 .00 .204
Sample Size 2 864.77 .00 .033
RSS 1 629.83 .00 .012
Data Distribution 1 1465.75 .00 .028
Multicollinearity 2 848.18 .00 .033
SMH × MMH 6 298.09 .00 .034
SMH × R² 9 44.28 .00 .008
MMH × R² 6 5.82 .00 .006
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity. All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB,
implying a partial η² of more than .02) are highlighted in grey.
Table F2. Between-Subjects Effects (Part II)
Source of Variance in MAB df F Sig. Partial η²
SMH × Sample Size 6 31.10 .00 .004
MMH × Sample Size 4 15.06 .00 .001
R² × Sample Size 6 46.43 .00 .006
SMH × RSS 3 78.68 .00 .005
MMH × RSS 2 .69 .50 .000
R² × RSS 3 87.86 .00 .005
Sample Size × RSS 2 1426.86 .00 .054
SMH × Data Distribution 3 12.04 .00 .001
MMH × Data Distribution 2 7.61 .00 .000
R² × Data Distribution 3 3.21 .02 .000
Sample Size × Data Distribution 2 28.39 .00 .001
RSS × Data Distribution 1 2.26 .13 .000
SMH × Multicollinearity 6 109.17 .00 .013
MMH × Multicollinearity 4 287.84 .00 .022
R² × Multicollinearity 6 5.39 .00 .001
Sample Size × Multicollinearity 4 28.36 .00 .002
RSS × Multicollinearity 2 15.71 .00 .001
Data Distribution × Multicollinearity 2 16.50 .00 .001
SMH × MMH × R² 18 25.86 .00 .009
SMH × MMH × Sample Size 12 5.18 .00 .001
SMH × R² × Sample Size 18 .78 .73 .000
MMH × R² × Sample Size 12 .48 .93 .000
SMH × MMH × RSS 6 5.48 .00 .001
SMH × R² × RSS 9 .60 .80 .000
MMH × R² × RSS 6 2.66 .01 .000
SMH × Sample Size × RSS 6 42.87 .00 .005
MMH × Sample Size × RSS 4 6.23 .00 .000
R² × Sample Size × RSS 6 59.73 .00 .007
SMH × MMH × Data Distribution 6 3.35 .00 .000
SMH × R² × Data Distribution 9 12.58 .00 .002
MMH × R² × Data Distribution 6 1.79 .10 .000
SMH × Sample Size × Data Distribution 6 9.02 .00 .001
MMH × Sample Size × Data Distribution 4 2.33 .05 .000
R² × Sample Size × Data Distribution 6 2.76 .01 .000
SMH × RSS × Data Distribution 3 13.81 .00 .001
MMH × RSS × Data Distribution 2 1.50 .22 .000
R² × RSS × Data Distribution 3 2.64 .05 .000
Sample Size × RSS × Data Distribution 2 21.48 .00 .001
SMH × MMH × Multicollinearity 12 18.31 .00 .004
SMH × R² × Multicollinearity 18 7.30 .00 .003
MMH × R² × Multicollinearity 12 1.16 .31 .000
SMH × Sample Size × Multicollinearity 12 11.15 .00 .003
MMH × Sample Size × Multicollinearity 8 3.17 .00 .001
R² × Sample Size × Multicollinearity 12 .88 .57 .000
SMH × RSS × Multicollinearity 6 12.44 .00 .001
MMH × RSS × Multicollinearity 4 8.08 .00 .001
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity. All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB,
implying a partial η² of more than .02) are highlighted in grey.
Table F3. Between-Subjects Effects (Part III)
Source of Variance in MAB df F Sig. Partial η²
R² × RSS × Multicollinearity 6 1.29 .26 .000
Sample Size × RSS × Multicollinearity 4 18.22 .00 .001
SMH × Data Distribution × Multicollinearity 6 .94 .46 .000
MMH × Data Distribution × Multicollinearity 4 3.81 .00 .000
R² × Data Distribution × Multicollinearity 6 .88 .51 .000
Sample Size × Data Distribution × Multicollinearity 4 11.09 .00 .001
RSS × Data Distribution × Multicollinearity 2 12.97 .00 .001
SMH × MMH × R² × Sample Size 36 .75 .86 .001
SMH × MMH × R² × RSS 18 .86 .63 .000
SMH × MMH × Sample Size × RSS 12 5.31 .00 .001
SMH × R² × Sample Size × RSS 18 1.92 .01 .001
MMH × R² × Sample Size × RSS 12 .36 .98 .000
SMH × MMH × R² × Data Distribution 18 1.65 .04 .001
SMH × MMH × Sample Size × Data Distribution 12 3.87 .00 .001
SMH × R² × Sample Size × Data Distribution 18 1.36 .14 .000
MMH × R² × Sample Size × Data Distribution 12 .68 .78 .000
SMH × MMH × RSS × Data Distribution 6 1.80 .09 .000
SMH × R² × RSS × Data Distribution 9 1.57 .12 .000
MMH × R² × RSS × Data Distribution 6 .54 .78 .000
SMH × Sample Size × RSS × Data Distribution 6 8.98 .00 .001
MMH × Sample Size × RSS × Data Distribution 4 3.19 .01 .000
R² × Sample Size × RSS × Data Distribution 6 1.04 .40 .000
SMH × MMH × R² × Multicollinearity 36 2.16 .00 .002
SMH × MMH × Sample Size × Multicollinearity 24 .79 .75 .000
SMH × R² × Sample Size × Multicollinearity 36 1.62 .01 .001
MMH × R² × Sample Size × Multicollinearity 24 1.04 .41 .000
SMH × MMH × RSS × Multicollinearity 12 2.41 .00 .001
SMH × R² × RSS × Multicollinearity 18 1.19 .26 .000
MMH × R² × RSS × Multicollinearity 12 1.38 .17 .000
SMH × Sample Size × RSS × Multicollinearity 12 9.08 .00 .002
MMH × Sample Size × RSS × Multicollinearity 8 1.95 .05 .000
R² × Sample Size × RSS × Multicollinearity 12 1.38 .17 .000
SMH × MMH × Data Distribution × Multicollinearity 12 6.34 .00 .002
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity.
Table F4. Between-Subjects Effects (Part IV)
Source of Variance in MAB df F Sig. Partial η²
SMH × R² × Data Distribution × Multicollinearity 18 1.72 .03 .001
MMH × R² × Data Distribution × Multicollinearity 12 1.12 .34 .000
SMH × Sample Size × Data Distribution × Multicollinearity 12 10.19 .00 .002
MMH × Sample Size × Data Distribution × Multicollinearity 8 .87 .54 .000
R² × Sample Size × Data Distribution × Multicollinearity 12 2.23 .01 .001
SMH × RSS × Data Distribution × Multicollinearity 6 9.02 .00 .001
MMH × RSS × Data Distribution × Multicollinearity 4 .49 .74 .000
R² × RSS × Data Distribution × Multicollinearity 6 1.10 .36 .000
Sample Size × RSS × Data Distribution × Multicollinearity 4 24.61 .00 .002
SMH × MMH × R² × Sample Size × RSS 36 .75 .86 .001
SMH × MMH × R² × Sample Size × Data Distribution 36 .74 .88 .001
SMH × MMH × R² × RSS × Data Distribution 18 1.20 .25 .000
SMH × MMH × Sample Size × RSS × Data Distribution 12 1.62 .08 .000
SMH × R² × Sample Size × RSS × Data Distribution 18 .69 .83 .000
MMH × R² × Sample Size × RSS × Data Distribution 12 1.20 .27 .000
SMH × MMH × R² × Sample Size × Multicollinearity 72 1.13 .21 .002
SMH × MMH × R² × RSS × Multicollinearity 36 1.66 .01 .001
SMH × MMH × Sample Size × RSS × Multicollinearity 24 1.66 .02 .001
SMH × R² × Sample Size × RSS × Multicollinearity 36 .52 .99 .000
MMH × R² × Sample Size × RSS × Multicollinearity 24 .75 .81 .000
SMH × MMH × R² × Data Distribution × Multicollinearity 36 .95 .55 .001
SMH × MMH × Sample Size × Data Distribution × Multicollinearity 24 1.52 .05 .001
SMH × R² × Sample Size × Data Distribution × Multicollinearity 36 1.33 .09 .001
MMH × R² × Sample Size × Data Distribution × Multicollinearity 24 .90 .60 .000
SMH × MMH × RSS × Data Distribution × Multicollinearity 12 1.52 .11 .000
SMH × R² × RSS × Data Distribution × Multicollinearity 18 1.90 .01 .001
MMH × R² × RSS × Data Distribution × Multicollinearity 12 1.45 .14 .000
SMH × Sample Size × RSS × Data Distribution × Multicollinearity 12 8.65 .00 .002
MMH × Sample Size × RSS × Data Distribution × Multicollinearity 8 1.13 .34 .000
R² × Sample Size × RSS × Data Distribution × Multicollinearity 12 .85 .60 .000
SMH × MMH × R² × Sample Size × RSS × Data Distribution 36 .98 .51 .001
SMH × MMH × R² × Sample Size × RSS × Multicollinearity 72 .84 .84 .001
SMH × MMH × R² × Sample Size × Data Distribution × Multicollinearity 72 1.07 .33 .002
SMH × MMH × R² × RSS × Data Distribution × Multicollinearity 36 1.24 .15 .001
SMH × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 24 1.12 .32 .001
SMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 36 1.09 .32 .001
MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 24 .87 .65 .000
SMH × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 72 1.05 .36 .002
Error 50112
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity.
Table F5. Within-Subjects Effects (Part I)
Source of Variance in MAB df F Sig. Partial η²
Method 1 3938.52 .00 .073
Method × SMH 3 3987.98 .00 .193
Method × MMH 2 6771.05 .00 .213
Method × R² 3 826.32 .00 .047
Method × Sample Size 2 227.55 .00 .009
Method × RSS 1 171.66 .00 .003
Method × Data Distribution 1 2.97 .08 .000
Method × Multicollinearity 2 1739.12 .00 .065
Method × SMH × MMH 6 976.49 .00 .105
Method × SMH × R² 9 83.50 .00 .015
Method × MMH × R² 6 6.13 .00 .001
Method × SMH × Sample Size 6 22.80 .00 .003
Method × MMH × Sample Size 4 3.13 .01 .000
Method × R² × Sample Size 6 3.95 .00 .000
Method × SMH × RSS 3 60.96 .00 .004
Method × MMH × RSS 2 12.78 .00 .001
Method × R² × RSS 3 15.69 .00 .001
Method × Sample Size × RSS 2 163.40 .00 .006
Method × SMH × Data Distribution 3 54.31 .00 .003
Method × MMH × Data Distribution 2 3.39 .03 .000
Method × R² × Data Distribution 3 5.19 .00 .000
Method × Sample Size × Data Distribution 2 12.45 .00 .000
Method × RSS × Data Distribution 1 56.16 .00 .001
Method × SMH × Multicollinearity 6 372.96 .00 .043
Method × MMH × Multicollinearity 4 257.24 .00 .020
Method × R² × Multicollinearity 6 9.69 .00 .001
Method × Sample Size × Multicollinearity 4 22.84 .00 .002
Method × RSS × Multicollinearity 2 5.85 .00 .000
Method × Data Distribution × Multicollinearity 2 11.81 .00 .000
Method × SMH × MMH × R² 18 11.49 .00 .004
Method × SMH × MMH × Sample Size 12 2.44 .00 .001
Method × SMH × R² × Sample Size 18 3.68 .00 .001
Method × MMH × R² × Sample Size 12 1.39 .16 .000
Method × SMH × MMH × RSS 6 14.80 .00 .002
Method × SMH × R² × RSS 9 12.50 .00 .002
Method × MMH × R² × RSS 6 2.61 .02 .000
Method × SMH × Sample Size × RSS 6 47.94 .00 .006
Method × MMH × Sample Size × RSS 4 13.37 .00 .001
Method × R² × Sample Size × RSS 6 19.62 .00 .002
Method × SMH × MMH × Data Distribution 6 1.74 .11 .000
Method × SMH × R² × Data Distribution 9 5.01 .00 .001
Method × MMH × R² × Data Distribution 6 3.04 .01 .000
Method × SMH × Sample Size × Data Distribution 6 7.68 .00 .001
Method × MMH × Sample Size × Data Distribution 4 .30 .88 .000
Method × R² × Sample Size × Data Distribution 6 3.34 .00 .000
Method × SMH × RSS × Data Distribution 3 3.68 .01 .000
Method × MMH × RSS × Data Distribution 2 .76 .47 .000
Method × R² × RSS × Data Distribution 3 .43 .73 .000
Method × Sample Size × RSS × Data Distribution 2 19.04 .00 .001
Method × SMH × MMH × Multicollinearity 12 28.62 .00 .007
Method × SMH × R² × Multicollinearity 18 5.04 .00 .002
Method × MMH × R² × Multicollinearity 12 .46 .94 .000
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity. All significant and substantial effects (i.e., all effects that explain more than 2% of the total variance in MAB,
implying a partial η² of more than .02) are highlighted in grey.
Table F6. Within-Subjects Effects (Part II)
Source of Variance in MAB df F Sig. Partial η²
Method × SMH × Sample Size × Multicollinearity 12 11.91 .00 .003
Method × MMH × Sample Size × Multicollinearity 8 1.40 .19 .000
Method × R² × Sample Size × Multicollinearity 12 .91 .53 .000
Method × SMH × RSS × Multicollinearity 6 16.91 .00 .002
Method × MMH × RSS × Multicollinearity 4 3.91 .00 .000
Method × R² × RSS × Multicollinearity 6 1.19 .31 .000
Method × Sample Size × RSS × Multicollinearity 4 20.68 .00 .002
Method × SMH × Data Distribution × Multicollinearity 6 6.57 .00 .001
Method × MMH × Data Distribution × Multicollinearity 4 3.63 .01 .000
Method × R² × Data Distribution × Multicollinearity 6 .99 .43 .000
Method × Sample Size × Data Distribution × Multicollinearity 4 24.39 .00 .002
Method × RSS × Data Distribution × Multicollinearity 2 28.84 .00 .001
Method × SMH × MMH × R² × Sample Size 36 1.35 .08 .001
Method × SMH × MMH × R² × RSS 18 1.48 .08 .001
Method × SMH × MMH × Sample Size × RSS 12 1.99 .02 .000
Method × SMH × R² × Sample Size × RSS 18 2.48 .00 .001
Method × MMH × R² × Sample Size × RSS 12 2.34 .01 .001
Method × SMH × MMH × R² × Data Distribution 18 .86 .63 .000
Method × SMH × MMH × Sample Size × Data Distribution 12 2.68 .00 .001
Method × SMH × R² × Sample Size × Data Distribution 18 1.28 .19 .000
Method × MMH × R² × Sample Size × Data Distribution 12 .37 .97 .000
Method × SMH × MMH × RSS × Data Distribution 6 1.18 .32 .000
Method × SMH × R² × RSS × Data Distribution 9 3.45 .00 .001
Method × MMH × R² × RSS × Data Distribution 6 .51 .80 .000
Method × SMH × Sample Size × RSS × Data Distribution 6 8.37 .00 .001
Method × MMH × Sample Size × RSS × Data Distribution 4 1.21 .31 .000
Method × R² × Sample Size × RSS × Data Distribution 6 1.13 .34 .000
Method × SMH × MMH × R² × Multicollinearity 36 1.29 .11 .001
Method × SMH × MMH × Sample Size × Multicollinearity 24 1.28 .16 .001
Method × SMH × R² × Sample Size × Multicollinearity 36 1.36 .08 .001
Method × MMH × R² × Sample Size × Multicollinearity 24 1.05 .40 .001
Method × SMH × MMH × RSS × Multicollinearity 12 3.27 .00 .001
Method × SMH × R² × RSS × Multicollinearity 18 1.02 .43 .000
Method × MMH × R² × RSS × Multicollinearity 12 1.40 .16 .000
Method × SMH × Sample Size × RSS × Multicollinearity 12 8.14 .00 .002
Method × MMH × Sample Size × RSS × Multicollinearity 8 2.47 .01 .000
Method × R² × Sample Size × RSS × Multicollinearity 12 1.36 .18 .000
Method × SMH × MMH × Data Distribution × Multicollinearity 12 2.63 .00 .001
Method × SMH × R² × Data Distribution × Multicollinearity 18 1.65 .04 .001
Method × MMH × R² × Data Distribution × Multicollinearity 12 .82 .63 .000
Method × SMH × Sample Size × Data Distribution × Multicollinearity 12 7.24 .00 .002
Method × MMH × Sample Size × Data Distribution × Multicollinearity 8 1.01 .42 .000
Method × R² × Sample Size × Data Distribution × Multicollinearity 12 1.42 .15 .000
Method × SMH × RSS × Data Distribution × Multicollinearity 6 6.94 .00 .001
Method × MMH × RSS × Data Distribution × Multicollinearity 4 1.40 .23 .000
Method × R² × RSS × Data Distribution × Multicollinearity 6 1.59 .15 .000
Method × Sample Size × RSS × Data Distribution × Multicollinearity 4 15.65 .00 .001
Method × SMH × MMH × R² × Sample Size × RSS 36 1.88 .00 .001
Method × SMH × MMH × R² × Sample Size × Data Distribution 36 .80 .80 .001
Method × SMH × MMH × R² × RSS × Data Distribution 18 1.00 .45 .000
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity.
Table F7. Within-Subjects Effects (Part III)
Source of Variance in MAB df F Sig. Partial η²
Method × SMH × MMH × Sample Size × RSS × Data Distribution 12 2.14 .01 .001
Method × SMH × R² × Sample Size × RSS × Data Distribution 18 1.53 .07 .001
Method × MMH × R² × Sample Size × RSS × Data Distribution 12 .77 .68 .000
Method × SMH × MMH × R² × Sample Size × Multicollinearity 72 .91 .70 .001
Method × SMH × MMH × R² × RSS × Multicollinearity 36 1.28 .12 .001
Method × SMH × MMH × Sample Size × RSS × Multicollinearity 24 1.95 .00 .001
Method × SMH × R² × Sample Size × RSS × Multicollinearity 36 1.37 .07 .001
Method × MMH × R² × Sample Size × RSS × Multicollinearity 24 .90 .60 .000
Method × SMH × MMH × R² × Data Distribution × Multicollinearity 36 .98 .50 .001
Method × SMH × MMH × Sample Size × Data Distribution × Multicollinearity 24 2.46 .00 .001
Method × SMH × R² × Sample Size × Data Distribution × Multicollinearity 36 1.49 .03 .001
Method × MMH × R² × Sample Size × Data Distribution × Multicollinearity 24 .70 .85 .000
Method × SMH × MMH × RSS × Data Distribution × Multicollinearity 12 1.75 .05 .000
Method × SMH × R² × RSS × Data Distribution × Multicollinearity 18 1.71 .03 .001
Method × MMH × R² × RSS × Data Distribution × Multicollinearity 12 1.37 .17 .000
Method × SMH × Sample Size × RSS × Data Distribution × Multicollinearity 12 8.67 .00 .002
Method × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 8 1.29 .24 .000
Method × R² × Sample Size × RSS × Data Distribution × Multicollinearity 12 .78 .68 .000
Method × SMH × MMH × R² × Sample Size × RSS × Data Distribution 36 .85 .73 .001
Method × SMH × MMH × R² × Sample Size × RSS × Multicollinearity 72 1.05 .36 .002
Method × SMH × MMH × R² × Sample Size × Data Distribution × Multicollinearity 72 1.20 .11 .002
Method × SMH × MMH × R² × RSS × Data Distribution × Multicollinearity 36 1.53 .02 .001
Method × SMH × MMH × Sample Size × RSS × Data Distribution × Multicollinearity 24 2.53 .00 .001
Method × SMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 36 1.33 .09 .001
Method × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 24 1.25 .18 .001
Method × SMH × MMH × R² × Sample Size × RSS × Data Distribution × Multicollinearity 72 .96 .58 .001
Error(Method) 50112
Note: df = degrees of freedom; MAB = mean absolute bias; MMH = measurement model heterogeneity; RSS = relative segment size;
SMH = structural model heterogeneity.
References
Alavi, M., and Joachimsthaler, E. A. 1992. "Revisiting DSS Implementation Research: A Meta-Analysis of the Literature and Suggestions
for Researchers," MIS Quarterly (16:1), pp. 95-116.
Anderberg, M. R. 1973. Cluster Analysis for Applications, New York: Academic Press.
Dennis, A. R., Wixom, B. H., and Vandenberg, R. J. 2001. "Understanding Fit and Appropriation Effects in Group Support Systems via
Meta-Analysis," MIS Quarterly (25:2), pp. 167-193.
Diamantopoulos, A., and Winklhofer, H. M. 2001. "Index Construction with Formative Indicators: An Alternative to Scale Development,"
Journal of Marketing Research (38:2), pp. 269-277.
Esposito Vinzi, V., Trinchera, L., and Amato, S. 2010. "PLS Path Modeling: From Foundations to Recent Developments and Open Issues
for Model Assessment and Improvement," in Handbook of Partial Least Squares: Concepts, Methods and Applications, V. Esposito Vinzi,
W. W. Chin, J. Henseler, and H. Wang (eds.), Berlin: Springer, pp. 47-82.
Esposito Vinzi, V., Trinchera, L., Squillacciotti, S., and Tenenhaus, M. 2008. "REBUS-PLS: A Response-Based Procedure for Detecting
Unit Segments in PLS Path Modelling," Applied Stochastic Models in Business & Industry (24:5), pp. 439-458.
A20 MIS Quarterly Vol 37 No 3—AppendicesSeptember 2013
Becker et alDiscovering Unobserved Heterogeneity in SEM
Grewal R Cote J A and Baumgartner H 2004 Multicollinearity and Measurement Error in Structural Equation Models Implications
for Theory Testing Marketing Science (234) pp 519529
Gudergan S P Ringle C M Wende S and Will A 2008 Confirmatory Tetrad Analysis in PLS Path Modeling Journal of Business
Research (6112) pp 12381249
Hahn C Johnson M D Herrmann A and Huber F 2002 Capturing Customer Heterogeneity Using a Finite Mixture PLS Approach
Schmalenbach Business Review (SBR) (543) pp 243269
Henseler J Ringle C M and Sinkovics R R 2009 The Use of Partial Least Squares Path Modeling in International Marketing in
Advances in International Marketing R R Sinkovics and P N Ghauri (eds) Bingley United Kingdom Emerald Group Publishing Limited
pp 277320
Henseler J and Sarstedt M 2012 GoodnessofFit Indices for Partial Least Squares Path Modeling Computational Statistics
(httplinkspringercomarticle1010072Fs0018001203171)
Jarvis C B MacKenzie S B and Podsakoff P M 2003 A Critical Review of Construct Indicators and Measurement Model
Misspecification in Marketing and Consumer Research Journal of Consumer Research (302) pp 199218
Joseph D KokYee N Koh C and Soon A 2007 Turnover of Information Technology Professionals A Narrative Review
MetaAnalytic Structural Equation Modeling and Model Development MIS Quarterly (313) pp 547577
King W R and He J 2006 A MetaAnalysis of the Technology Acceptance Model Information & Management (436) pp 740755
Kohli R and Devaraj S 2003 Measuring Information Technology Payoff A MetaAnalysis of Structural Variables in FirmLevel
Empirical Research Information Systems Research (142) pp 127145
Lee G and Xia W 2006 Organizational Size and IT Innovation Adoption A MetaAnalysis Information & Management (438) pp
975985
Lohmöller JB 1989 Latent Variable Path Modeling with Partial Least Squares Heidelberg Physica
Mason C H and Perreault W D 1991 Collinearity Power and Interpretation of Multiple Regression Analysis Journal of Marketing
Research (283) pp 268280
R Core Team 2013 R A Language and Environment for Statistical Computing R Foundation for Statistical Computing Vienna
Sabherwal R Jeyaraj A and Chowa C 2006 Information System Success Individual and Organizational Determinants Management
Science (5212) pp 18491864
Sánchez G and Trinchera L 2013 R Package PLSPM (version 035) httpcranrprojectorgwebpackagesplspm
Sarstedt M 2008 A Review of Recent Approaches for Capturing Heterogeneity in Partial Least Squares Path Modelling Journal of
Modelling in Management (32) pp 140161
Schepers J and Wetzels M 2007 A MetaAnalysis of the Technology Acceptance Model Investigating Subjective Norm and Moderation
Effects Information & Management (441) pp 90103
Sharma R and Yetton P 2003 The Contingent Effects of Management Support and Task Interdependence on Successful Information
Systems Implementation MIS Quarterly (274) pp 533555
Sharma R and Yetton P 2007 The Contingent Effects of Training Technical Complexity and Task Interdependence on Successful
Information Systems Implementation MIS Quarterly (312) pp 219238
Squillacciotti S 2005 Prediction Oriented Classification in PLS Path Modeling in PLS & Marketing Proceedings of the 4th International
Symposium on PLS and Related Methods T Aluja J Casanovas V Esposito Vinzi and M Tenenhaus (eds) Paris DECISIA pp 499506
Squillacciotti S 2010 PredictionOriented Classification in PLS Path Modeling in Handbook of Partial Least Squares Concepts Methods
and Applications V Esposito Vinzi W W Chin J Henseler and H Wang (eds) Berlin Springer pp 219233
Tenenhaus M Esposito Vinzi V Chatelin YM and Lauro C 2005 PLS Path Modeling Computational Statistics & Data Analysis
(481) pp 159205
Wang J and Keil M 2007 A MetaAnalysis Comparing the Sunk Cost Effect for IT and NonIT Projects Information Resources
Management Journal (203) pp 118
Wedel M and Kamakura W 2000 Market Segmentation Conceptual and Methodological Foundations (2nd ed) New York Kluwer
Academic Publishers
Wold H 1982 Soft Modeling The Basic Design and Some Extensions in Systems Under Indirect Observations Part I K G Jöreskog
and H Wold (eds) Amsterdam NorthHolland pp 154
Wu J and Lederer A 2009 A MetaAnalysis of the Role of EnvironmentBased Voluntariness in Information Technology Acceptance
MIS Quarterly (332) pp 419432