Department of Industrial Psychology, Stellenbosch University
Published
January 21, 2025
Introduction
In psychometrics the development of a scale that measures a single latent variable is a very desirable outcome, because it allows for unambiguous interpretation of a total score across the items of the scale. Such a scale is referred to as a unidimensional or homogeneous scale.
In this tutorial I demonstrate how we can use factor analysis to examine whether a scale measures a single latent variable. As an example I examine the fit of a single factor model to the five-item Neuroticism Scale of the Big Five Inventory. The five items are supposedly indicators of a single latent variable. From a factor analysis perspective, the five items are correlated, because they share the same underlying latent variable. Indeed, a factor analyst might argue that the five items are correlated “because” they are influenced by the same latent variable (and only that latent variable). This line of reasoning implies that if the influence of the latent variable (or factor) is removed, the items will no longer be correlated.
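This reasoning can be illustrated with a small base-R simulation (the loadings 0.8 and 0.7 are made up for illustration): two items that correlate only because they share one latent variable become uncorrelated once the influence of that variable is partialled out.

```r
# Hypothetical illustration: two items driven by a single shared factor
set.seed(1)
n <- 10000
factor_scores <- rnorm(n)                 # the shared latent variable
item1 <- 0.8 * factor_scores + rnorm(n)   # loading 0.8 plus unique variance
item2 <- 0.7 * factor_scores + rnorm(n)   # loading 0.7 plus unique variance

cor(item1, item2)                         # clearly non-zero

# Remove the influence of the factor by regression; the residuals
# of the two items are (close to) uncorrelated
res1 <- residuals(lm(item1 ~ factor_scores))
res2 <- residuals(lm(item2 ~ factor_scores))
cor(res1, res2)                           # close to zero
```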
The items
The psychTools package of Bill Revelle includes the bfi data set, which contains the responses of 2800 people to the items of the Big Five Inventory. Here we focus on the five Neuroticism items, which are located in columns 16 to 20 of the bfi data frame. People respond to the items on a six-point scale.
N1: Get angry easily.
N2: Get irritated easily.
N3: Have frequent mood swings.
N4: Often feel blue.
N5: Panic easily.
Do these items measure a single latent variable (which we might label Neuroticism)? Does a single latent variable or factor account for the covariances/correlations of these items?
Observed correlations
We start by examining the correlations of the five items. If they all share the same underlying latent variable, they should all be correlated. We observe that the item correlations range from 0.35 (items N2 and N5) to 0.71 (items N1 and N2). These correlations suggest that the five items have something in common.
data(bfi)
mycor <- cor(bfi[16:20], use = "complete.obs")
mycor
corPlot(bfi[16:20], upper = FALSE, scale = TRUE, cex = .8,
        main = "Correlation plot of the Neuroticism items")
Is the correlation matrix an identity matrix?
Bartlett (1951) introduced a test of the null hypothesis that a matrix is an identity matrix. An identity matrix is one where the diagonal contains 1’s, and the off-diagonal elements are zeroes. When applied to correlation matrices, Bartlett’s test is used to test the hypothesis that all the correlations are equal to zero (within error).
Bartlett’s test statistic follows a chi-square distribution under the null hypothesis that the population correlation matrix is an identity matrix. Our results show that \(\chi^2\)(10) = 4904.381, p &lt; .001, which means that we can reject the null hypothesis that the observed correlation matrix is an identity matrix.
Note that Bartlett’s test is most often used in exploratory factor analysis contexts. Provided that the researcher has good substantive reasons to believe that the manifest variables share one or more latent variables, the test will fail to reject the null hypothesis only in very rare situations. Such a failure almost certainly indicates that the data are of very poor quality or that the sample size is very small.
cortest.bartlett(mycor, n = 2800)
$chisq
[1] 4904.381
$p.value
[1] 0
$df
[1] 10
Is the correlation matrix factorable?
Kaiser (1970) introduced a measure of sampling adequacy (MSA) as an index of whether a correlation matrix is factorable. The index is referred to in the literature as the Kaiser-Meyer-Olkin (KMO) index. Values closer to 1 indicate better factorability. The KMO index for the correlations of the Neuroticism items was 0.80.
The KMO index is mostly used in exploratory factor analysis contexts. However, it does no harm to examine it, and it can alert us to potentially poor quality data.
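The index can be obtained with the KMO() function of the psych package, applied to the correlation matrix computed earlier:

```r
# Kaiser-Meyer-Olkin measure of sampling adequacy for the Neuroticism items;
# assumes mycor was computed from bfi[16:20] as above
KMO(mycor)
```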
We specify the single factor model using lavaan syntax. We start by giving the model a name. The model specification itself is enclosed in single quotation marks. We give the latent variable or factor a name (here we chose the label “Neuroticism”). We use the “=~” operator to specify that the items that follow are the indicators of the factor. Note that the names of these items must correspond to their names in the data frame. By default, lavaan fixes the correlations of the unique variances of the manifest variables to zero.
model1 <- 'Neuroticism =~ N1 + N2 + N3 + N4 + N5'
Fitting the single factor model to the data
We use the cfa() function of the lavaan package to fit our confirmatory factor analysis model to the observed data. To identify the model and to set the metric of the latent variable, we have to introduce a constraint in our model. We can choose from two very common constraints: (a) fix one of the unstandardized factor loadings to unity, or (b) fix the variance of the latent variable to unity. By default, lavaan will fix the first unstandardized factor loading to unity (in our example this will be the factor loading of item N1).
If we would rather fix the variance of the latent variable to unity, we have to add the argument std.lv = TRUE, as I have done here.
The parameters of the model are estimated with maximum likelihood, which assumes multivariate normality of the data. This assumption is almost always violated when we work with psychological test items. The consequences are that the standard errors of the parameters are underestimated, which leads to p-values that are too low, and that the model is made to look as if it fits more poorly than it really does. Against this background we employ the robust maximum likelihood estimator (MLR), which yields robust standard errors and a scaled test statistic.
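A sketch of the call (the object name fit.model1 matches the code used later in this tutorial; passing the five item columns directly is one way to supply the data):

```r
# Fit the single factor model: fix the factor variance to 1 (std.lv = TRUE)
# and request the robust maximum likelihood estimator (MLR)
fit.model1 <- cfa(model1, data = bfi[16:20], std.lv = TRUE, estimator = "MLR")

# Inspect fit statistics and standardized parameter estimates
summary(fit.model1, fit.measures = TRUE, standardized = TRUE)
```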
We start by inspecting the fit statistics of the model. First, our results show that the null hypothesis of perfect fit has to be rejected: \(\chi^2\)(5) = 360.932, p &lt; .001. The TLI suggests unsatisfactory fit (TLI = 0.849), whereas the CFI suggests marginally acceptable fit (CFI = 0.925). The RMSEA suggests poor fit (RMSEA = 0.163). Finally, the SRMR suggests marginally acceptable fit (SRMR = 0.056).
The parameters of the model
The estimated factor loadings reflect the strength of the relations between the latent variable and the manifest variables. These factor loadings are regression coefficients: they show by how many units the scores on the items change if the latent variable were to increase by one unit. Note that there are three sets of factor loadings: (a) the unstandardized factor loadings, (b) factor loadings where the latent variable is standardized but the items are unstandardized, and (c) factor loadings where both the latent variable and the items are standardized. If the model is identified by fixing the variance of the latent variable to unity, (a) and (b) will be the same. The fully standardized factor loadings, which can be found in the column labeled “Std.all”, are typically interpreted.
lavaan 0.6-19 ended normally after 19 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 10
Used Total
Number of observations 2694 2800
Model Test User Model:
Standard Scaled
Test Statistic 360.932 313.521
Degrees of freedom 5 5
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.151
Yuan-Bentler correction (Mplus variant)
Model Test Baseline Model:
Test statistic 4724.621 3492.154
Degrees of freedom 10 10
P-value 0.000 0.000
Scaling correction factor 1.353
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.925 0.911
Tucker-Lewis Index (TLI) 0.849 0.823
Robust Comparative Fit Index (CFI) 0.925
Robust Tucker-Lewis Index (TLI) 0.849
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -23078.504 -23078.504
Scaling correction factor 1.007
for the MLR correction
Loglikelihood unrestricted model (H1) -22898.038 -22898.038
Scaling correction factor 1.055
for the MLR correction
Akaike (AIC) 46177.007 46177.007
Bayesian (BIC) 46235.995 46235.995
Sample-size adjusted Bayesian (SABIC) 46204.222 46204.222
Root Mean Square Error of Approximation:
RMSEA 0.163 0.151
90 Percent confidence interval - lower 0.149 0.138
90 Percent confidence interval - upper 0.177 0.165
P-value H_0: RMSEA <= 0.050 0.000 0.000
P-value H_0: RMSEA >= 0.080 1.000 1.000
Robust RMSEA 0.162
90 Percent confidence interval - lower 0.147
90 Percent confidence interval - upper 0.178
P-value H_0: Robust RMSEA <= 0.050 0.000
P-value H_0: Robust RMSEA >= 0.080 1.000
Standardized Root Mean Square Residual:
SRMR 0.056 0.056
Parameter Estimates:
Standard errors Sandwich
Information bread Observed
Observed information based on Hessian
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Neuroticism =~
N1 1.286 0.025 50.920 0.000 1.286 0.818
N2 1.225 0.025 49.597 0.000 1.225 0.803
N3 1.147 0.028 41.182 0.000 1.147 0.717
N4 0.872 0.033 26.438 0.000 0.872 0.554
N5 0.813 0.034 24.170 0.000 0.813 0.502
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.N1 0.819 0.048 16.954 0.000 0.819 0.331
.N2 0.828 0.046 18.124 0.000 0.828 0.356
.N3 1.245 0.052 23.807 0.000 1.245 0.486
.N4 1.714 0.055 31.118 0.000 1.714 0.693
.N5 1.968 0.057 34.381 0.000 1.968 0.748
Neuroticism 1.000 1.000 1.000
Correlation residuals
Recall that a single factor model specifies that a single latent variable accounts for the correlations/covariances of the items, which implies that if the influence of the factor is removed the items will be uncorrelated.
The obtained factor loadings are used to find the model implied or reproduced correlations of the manifest variables. This is done via matrix algebra by multiplying the matrix of factor loadings by its transpose. The differences between the observed correlations and the model implied correlations (i.e., the correlations produced by the factor) are known as the residuals. If the factor indeed accounts for the observed correlations of the manifest variables, these residuals should be close to zero. Our results show that there are some residuals that clearly deviate from zero.
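As an illustration, the implied correlations can be reproduced in base R from the fully standardized loadings reported above (values copied by hand, so small rounding differences from the residuals() output are to be expected):

```r
# Fully standardized loadings from the single factor model output
lambda <- c(N1 = 0.818, N2 = 0.803, N3 = 0.717, N4 = 0.554, N5 = 0.502)

# Model-implied correlation matrix: lambda %*% t(lambda), with unit diagonal
implied <- outer(lambda, lambda)
diag(implied) <- 1
round(implied, 3)

# Residual for N1 and N2: observed correlation 0.71 (reported earlier)
# minus the implied correlation 0.818 * 0.803
0.71 - implied["N1", "N2"]
```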
myresiduals <- residuals(fit.model1, type = "cor")
myresiduals
Modification indices
Modification indices are used to identify model constraints that contribute to model misfit. In the case of a single factor model the most basic constraint is that the unique variances of the manifest items are uncorrelated. The unique variance of a manifest variable reflects the part of its variance that is not explained by the common factor. The unique variances should be uncorrelated, because the hypothesis of a single factor model is that a single latent variable accounts for all the covariances/correlations of the manifest items. Correlated unique variances are a sign that there is some shared variance (or covariance) that is not explained by the single factor model.
In our example the modification indices indicate that the constraint of uncorrelated unique variances for items N1 and N2 contributes heavily to the misfit of the model. The modification index (mi = 389.209) is an estimate of how much the chi-square statistic will decrease if the unique variances of these two items are allowed to correlate.
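The modification indices can be requested from the fitted model with lavaan's modindices() function, for example:

```r
# List the modification indices, largest first
modindices(fit.model1, sort. = TRUE)
```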
Our results show that the null hypothesis of perfect fit still has to be rejected: \(\chi^2\)(4) = 31.505, p = 0.0000024. The TLI = 0.985 suggests excellent fit, as does the CFI = 0.994. Both the RMSEA = 0.051 and the SRMR = 0.017 suggest very good fit.
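These fit statistics come from refitting the model with the correlated unique variances added. A sketch of the specification (the object name fit.model2 matches the code used later; note that the output below reflects the default ML estimator rather than MLR):

```r
# Modified model: allow the unique variances of N1 and N2 to correlate
model2 <- 'Neuroticism =~ N1 + N2 + N3 + N4 + N5
           N1 ~~ N2'
fit.model2 <- cfa(model2, data = bfi[16:20], std.lv = TRUE)
summary(fit.model2, fit.measures = TRUE, standardized = TRUE)
```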
The parameters of the modified model
The fully standardized factor loadings of items N1 and N2 in the modified model are noticeably lower than their corresponding loadings in the single factor model. This is because, in the single factor model, the variance that these two items share above and beyond the factor of interest was ‘shunted’ into their factor loadings (after all, the shared variance has to go somewhere, and that was the only place it could go). In the modified model this shared variance is captured by the correlated unique variances instead.
lavaan 0.6-19 ended normally after 23 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 11
Used Total
Number of observations 2694 2800
Model Test User Model:
Test statistic 31.505
Degrees of freedom 4
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 4724.621
Degrees of freedom 10
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.994
Tucker-Lewis Index (TLI) 0.985
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -22913.790
Loglikelihood unrestricted model (H1) -22898.038
Akaike (AIC) 45849.580
Bayesian (BIC) 45914.467
Sample-size adjusted Bayesian (SABIC) 45879.517
Root Mean Square Error of Approximation:
RMSEA 0.051
90 Percent confidence interval - lower 0.035
90 Percent confidence interval - upper 0.068
P-value H_0: RMSEA <= 0.050 0.443
P-value H_0: RMSEA >= 0.080 0.002
Standardized Root Mean Square Residual:
SRMR 0.017
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
Neuroticism =~
N1 1.054 0.031 34.507 0.000 1.054 0.670
N2 0.996 0.030 33.360 0.000 0.996 0.653
N3 1.307 0.030 43.741 0.000 1.307 0.817
N4 0.997 0.030 32.767 0.000 0.997 0.634
N5 0.893 0.032 27.781 0.000 0.893 0.551
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.N1 ~~
.N2 0.644 0.039 16.308 0.000 0.644 0.477
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.N1 1.363 0.049 28.063 0.000 1.363 0.551
.N2 1.337 0.047 28.728 0.000 1.337 0.574
.N3 0.853 0.048 17.744 0.000 0.853 0.333
.N4 1.480 0.049 30.292 0.000 1.480 0.598
.N5 1.832 0.056 32.674 0.000 1.832 0.697
Neuroticism 1.000 1.000 1.000
Correlation residuals
The matrix of correlation residuals shows that the modified single factor model gives a very good account of the observed covariances/correlations of the five items.
myresiduals2 <- residuals(fit.model2, type = "cor")
myresiduals2
In general it is not a good idea to allow the unique variances of manifest variables to correlate. It can easily become a mindless escape from a poorly fitting model. If, however, there are good substantive grounds for allowing such correlations, then they might help to obtain better estimates of the parameters of interest. In our example, the factor loadings of items N1 and N2 in the modified model reflect their relation with the latent variable after we have controlled for the unwanted variance that these two items share.
Inspection of the content of items N1 and N2 shows that both reflect aspects of being “short-tempered”. Although our results show that responses to these items are driven by the Neuroticism factor, it is also conceivable that being “short-tempered” is driven by an “Aggression” factor that is independent of the Neuroticism factor. We don’t have enough items to properly serve as indicators of such an Aggression factor, and our only recourse is to allow the unique variances of these items to correlate. Another recourse would be to delete one of the two offending items. This might be the best course of action in the test development phase, or if a test consists of many items. In our example, which has only five items, deleting an item would be a drastic step.
Allowing the unique variances of items N1 and N2 to correlate gave us insight into the functioning of the 5-item Neuroticism scale and has given us clues about how it might be improved in future.
References
Bartlett, M. S. (1951). The effect of standardization on a chi-square approximation in factor analysis. Biometrika, 38, 337-344.
Kaiser, H. F. (1970). A second generation little jiffy. Psychometrika, 35(4), 401-415.
Revelle, W. (2024a). psych: Procedures for psychological, psychometric, and personality research. Northwestern University, Evanston, Illinois. R package version 2.4.6, https://CRAN.R-project.org/package=psych.
Revelle, W. (2024b). psychTools: Tools to accompany the ‘psych’ package for psychological research. Northwestern University, Evanston, Illinois. R package version 2.4.3, https://CRAN.R-project.org/package=psychTools.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1-36. https://doi.org/10.18637/jss.v048.i02