Below we describe in detail how we handled the label-switching issue. In LCA solutions, there are c! equivalent ways to order the labels of a c-class solution. For applied users of LCA, label switching does not indicate any problem with the estimation per se, because class labeling is arbitrary and changes neither the model fit nor the interpretation of the classes. Label switching is primarily problematic when aggregating results in a simulation study. Although label switching can often be corrected by inspecting the solution, this method can be unreliable and subjective, especially when the estimated parameters deviate greatly from the generating parameters.

In our simulation, we used the true population parameters as starting values for each replication to minimize the occurrence of label switching. In addition, we used an algorithm developed by Tueller et al. to detect label switching and flag replications in which the estimated classes could not be matched to the population classes; such replications are termed incorrigible. Conceptually, incorrigibility indicates that parameter recovery is so poor (for example, due to high sampling error in small samples) that the population classes cannot be properly identified.
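The source does not reprint the relabeling algorithm itself; the sketch below illustrates only the underlying idea, matching estimated classes to population classes by the label permutation that minimizes distance to the true conditional response probabilities (CRPs). All function names and values are illustrative assumptions, not the authors' implementation.

```python
from itertools import permutations

def relabel_classes(est_crps, true_crps):
    """Match estimated classes to population classes by choosing the
    label permutation that minimizes total absolute distance between
    estimated and true CRPs.

    est_crps, true_crps: one list of item probabilities per class.
    Returns the reordered estimates and the best permutation.
    """
    n_classes = len(true_crps)
    best_perm, best_dist = None, float("inf")
    for perm in permutations(range(n_classes)):
        dist = sum(
            abs(est_crps[perm[k]][j] - true_crps[k][j])
            for k in range(n_classes)
            for j in range(len(true_crps[k]))
        )
        if dist < best_dist:
            best_perm, best_dist = perm, dist
    return [est_crps[k] for k in best_perm], best_perm

# Hypothetical 2-class solution whose labels came back switched.
true_crps = [[0.9, 0.9, 0.9], [0.1, 0.1, 0.1]]
est_crps = [[0.12, 0.08, 0.11], [0.88, 0.92, 0.90]]
reordered, perm = relabel_classes(est_crps, true_crps)
```

Under this scheme, a replication could be flagged as incorrigible when even the best permutation leaves a large distance to the population CRPs, i.e., when no relabeling recovers the generating classes.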

Low class assignment accuracy and extremely poor parameter recovery are undesirable qualities in LCA models and may suggest that the solution itself is untrustworthy. Thus, we felt it was acceptable to exclude these replications, given that they would be undesirable solutions for substantive researchers. All incorrigible or incorrectly labeled models were counted and excluded from the final analysis.

A trivial proportion of replications were incorrectly labeled. A more relevant issue was the proportion of incorrigible replications, which occurred both in the original set and among the additional replications generated to refill the design. Preliminary analyses revealed several conditions with a high proportion of replications meeting one or more of the four above-mentioned exclusion criteria.

All other excluded 2-class conditions are shown in Figure 1. In line with our hypotheses, inclusion was positively related to (1) the number of indicators (the more indicators, the more included conditions) and (2) the covariate effect size (the stronger the covariate effect, the more included conditions).

Figure 1.

For the full set of conditions, before exclusion criteria were applied, we examined the number of incorrigible and non-converged replications. For the remaining included cells of the design, we examined the prevalence of boundary parameter estimates and relative parameter estimate bias for class proportions, CRPs, and the covariate regression slope coefficient.


Relative bias was calculated by subtracting the true value of the parameter from the simulated parameter estimate and dividing the difference by the parameter's true value. In our calculations, we used the absolute value of this bias measure. The absolute value of bias was averaged across all replications within each cell. Class proportion estimate bias was calculated separately for each class, given that the classes differed in size.
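The bias computation described above can be sketched as follows; the estimates and true value are illustrative, not results from the study.

```python
def mean_absolute_relative_bias(estimates, true_value):
    """Average |(estimate - true) / true| across replications in a cell."""
    return sum(abs((est - true_value) / true_value) for est in estimates) / len(estimates)

# Hypothetical estimates of a class proportion whose true value is 0.25.
estimates = [0.22, 0.27, 0.30, 0.21]
bias = mean_absolute_relative_bias(estimates, 0.25)
```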

We examined the impact of sample size, covariate effect size, number of indicators, and quality of indicators, as well as all possible interactions among these factors, using analysis of variance (ANOVA) for continuous outcomes and logistic regression for binary outcomes, conducted in SPSS. Effects were classified as small, moderate, or large using conventional effect-size cutoffs.

We also restricted ourselves to interpreting only 2- and 3-way interactions.
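As an illustration of the kind of ANOVA effect size used to classify factor impacts: the original analyses were run in SPSS, and the eta-squared computation below is the conventional one, assumed rather than quoted from the source; the outcome values are hypothetical.

```python
def eta_squared(groups):
    """Eta-squared for a one-way ANOVA: SS_between / SS_total.

    groups: list of lists, one list of outcome values per factor level.
    """
    all_vals = [v for g in groups for v in g]
    grand_mean = sum(all_vals) / len(all_vals)
    ss_total = sum((v - grand_mean) ** 2 for v in all_vals)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Hypothetical relative-bias outcomes at two sample-size levels.
small_n = [0.30, 0.25, 0.35]
large_n = [0.10, 0.15, 0.05]
eta2 = eta_squared([small_n, large_n])
```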


For binary outcomes, we used the odds ratio (OR) as the effect size measure. Non-convergence and incorrigibility were examined for all conditions prior to excluding any conditions or refilling replications. Boundary parameter estimates and relative parameter bias were examined only after we applied the exclusion criteria.

Figure 2. Mean proportion of incorrigible replications per condition by indicator quality and covariate effect size.

Non-convergence was rare in the original set of replications, and Figure 3 shows that it was especially rare in high-quality indicator conditions.

Figure 3. Mean proportion of non-converged replications by sample size, number of indicators, and covariate effect size for high and low quality indicators.

The frequency of boundary parameter estimates was assessed by calculating the proportion of boundary parameter estimates per total number of independent CRP parameters in each condition.

Figure 4. Mean percent of boundary parameter estimates by sample size, number of indicators, indicator quality, and covariate effect size.
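The boundary-estimate tally described above can be sketched as follows; the 0/1 boundary check, the tolerance, and the example values are assumptions for illustration, not details from the source.

```python
def boundary_rate(crp_estimates, tol=1e-3):
    """Proportion of CRP estimates at (or within tol of) the 0/1 boundary."""
    at_boundary = sum(1 for p in crp_estimates if p <= tol or p >= 1 - tol)
    return at_boundary / len(crp_estimates)

# Hypothetical CRP estimates from one replication: two of five
# parameters sit on the boundary.
crps = [0.02, 1.0, 0.85, 0.0, 0.43]
rate = boundary_rate(crps)
```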


These 3-way interactions were such that increasing one factor reduced the impact of the other 2-way interactions. These effects did not appear in the 3-class conditions. With few indicators (4 or 5), the high quality conditions had more boundary parameter estimates than the low quality conditions. However, except for the 4- and 5-indicator conditions, the prevalence of boundary estimates was very similar across low and high quality conditions.

Similar effect sizes were found for Class 2 and Class 3 bias.

Figure 5. Mean class proportion relative bias by indicator quality, number of indicators, and sample size.

CRP bias was calculated including boundary parameter estimates because, in practice, researchers would typically interpret boundary estimates in an otherwise proper solution.

It was important to know whether including boundary parameters led to bias. In addition, in some replications with few indicators, all low-probability CRPs were estimated at the boundary value of 0. Note that relative bias was calculated as the absolute value of the difference between the mean parameter estimate and the population value of the parameter, divided by the population value. Even if the raw difference is the same for two parameters, the relative bias is higher for the parameter with the smaller population value.
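A quick numeric check of this property, with hypothetical values:

```python
# The same raw estimation error of 0.05 yields very different relative
# bias for a low CRP (population value 0.10) versus a high CRP (0.80).
raw_error = 0.05
rel_bias_low = abs(raw_error / 0.10)   # 50% relative bias
rel_bias_high = abs(raw_error / 0.80)  # 6.25% relative bias
```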

The higher relative bias for low CRPs and for smaller class proportions was due at least in part to this mathematical property, which makes the results in Figure 7 look more extreme.

Figure 6. Mean high conditional response probability relative bias by number of indicators, indicator quality, covariate effect size, and sample size.

Figure 7. Mean low conditional response probability relative bias by number of indicators, indicator quality, covariate effect size, and sample size.

Covariate parameter estimate bias was large when the covariate effect, indicator quality, and sample size were small.

Figure 8. Mean covariate relative bias by indicator quality, covariate effect size, and sample size.


With LCA becoming increasingly popular across diverse fields within the social sciences, it is important for researchers to know which factors influence the quality of estimation when using this method. To our knowledge, this study is the first to systematically examine these factors and their interplay in LCA under a large set of conditions. Below, we summarize our main findings and explain which factors improve LCA performance. Many applied researchers face limitations in the size of the samples they can gather, so it is important to understand which factors can be beneficial when a sample size as large as that recommended by Finch and Bronk may simply not be available.

We found support for the hypothesis that using more and higher quality indicators, or a covariate strongly related to class membership, can alleviate some of the problems frequently found with small sample sizes. Nonetheless, there was a relatively clear lower limit for the minimum sample size. Sample size itself had a small impact on decreasing non-convergence and a moderate impact on decreasing boundary parameter estimates, as well as on class proportion, low CRP, and covariate effect bias. Sample size interacted with indicator quality such that, as sample size increased, the negative impact of CRPs close to the boundary diminished.

Sample size also interacted with the number of indicators in reducing boundary parameter estimates, such that using many indicators could compensate for a small sample size. Sample size further interacted with indicator quality in reducing low CRP and Class 3 proportion bias, such that high indicator quality could compensate for a small sample size. These findings highlight which factors can compensate for a smaller sample size (a higher number and quality of indicators, adding a strong covariate) and which conditions require larger sample sizes (a lower number and quality of indicators).

One of the key factors examined here was the influence of the number of indicators, and whether adding more indicators to an LCA is beneficial rather than detrimental. In line with the results of Marsh et al., adding indicators proved beneficial. Increasing the number of indicators had a large effect on decreasing the occurrence of solutions with low class assignment accuracy. This makes sense, given that more indicators contribute to greater certainty in defining classes.

Using more indicators also improved convergence rates and led to reduced class proportion and low probability CRP bias. Also, the number of indicators interacted with indicator quality such that using more indicators negated the negative impact of using indicators with CRPs close to zero or one on boundary parameter estimates.

Furthermore, the number of indicators interacted with covariate effect size in reducing low CRP bias, such that using more indicators could partly compensate for a small covariate effect size. Our results demonstrate that, at least under conditions similar to those studied here, researchers have no reason to avoid adding more indicators out of concern about data sparseness. In fact, we found that adding more indicators decreased the likelihood of boundary parameter estimates, which often arise from data sparseness.


Note that the conditions with the lowest number of indicators (i.e., 4 or 5) were also the most problematic. Many of the low quality 4- and 5-indicator conditions were ultimately excluded from the analysis because of high levels of non-convergence and incorrigibility. This could be because the particular population class profiles chosen for these models resulted in a larger number of empirically underidentified solutions.

Replications may have passed the Mplus criterion for identification while in fact being close to empirical underidentification. Further research should examine whether 4- and 5-indicator models are generally problematic, or whether they perform better with different class profiles. In addition, further studies should examine whether there is a point at which adding more indicators causes problems.

Based on our findings, we recommend avoiding designs with fewer than 5 indicators. We also examined whether higher quality indicators are always better, and whether indicator quality can compensate for a small sample size. The answer here is, for the most part, yes: increasing indicator quality almost always improved outcomes, even beyond parameter recovery (Collins and Wugalter). Higher indicator quality had a small to moderate effect on decreasing incorrigibility and a large effect on improving convergence rates. Improving indicator quality also had a small effect on decreasing class proportion, covariate, and high CRP bias.

Furthermore, indicator quality interacted with the effect of adding a covariate on decreasing incorrigibility, such that higher quality indicators could partly compensate for a low covariate effect size. Indicator quality also interacted with the number of indicators, such that higher quality indicators could compensate for a low number of indicators in reducing low CRP bias. The only outcome for which high quality indicators performed poorly was the prevalence of boundary parameter estimates, most likely because high-quality CRPs lie close to the 0/1 boundary.

However, boundary parameter estimates did occur in low and moderate indicator quality conditions as well, suggesting that high quality indicators are not the only factor that causes boundary estimates. As discussed previously, this may indicate that using too few indicators of any quality may result in unstable estimation and frequent boundary parameter estimates. This discussion, of course, is predicated on the idea that boundary parameter estimates are inherently problematic. Although they still may present interpretational difficulties, there were many conditions with high boundary parameter prevalence that did not show any other negative outcomes.

Moreover, we included boundary parameter estimates in our calculation of CRP bias and found that in many conditions, the bias was still acceptably low. This suggests that boundary parameter estimates may not be problematic in general, although further research on this matter is clearly needed. Taken together, our results suggest that higher quality indicators should be used whenever possible, with the understanding that boundary estimates may be more likely to occur if the number of indicators is small to modest.