Basis variables for latent class cluster analysis segmentation

I'm interested in getting a point of view regarding basis variables used in latent class cluster analysis for market segmentation.

K-means cluster analysis is sometimes run on orthogonal principal component scores instead of individual attributes (for example, in marketing research when dealing with a lengthy attitudinal battery). Many analysts advise against this practice for various reasons (e.g., the factors might mask key discriminating attributes, the lack of correlation between factors artificially spreads people out in space, etc.).
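
To make the "spreads people out in space" concern concrete, here is a minimal sketch using only the Python standard library. The data are simulated and the item names are hypothetical; it assumes two standardized items driven by one shared attitude, for which the principal components are simply the scaled sum and difference of the items. It is one way to read the criticism, not a definitive account of it.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def standardize(xs):
    """Rescale to mean 0 and sample variance 1."""
    m, s = mean(xs), math.sqrt(variance(xs))
    return [(x - m) / s for x in xs]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
n = 2000

# Hypothetical data: two items that both reflect one shared attitude plus noise.
shared = [random.gauss(0, 1) for _ in range(n)]
item_a = standardize([s + 0.5 * random.gauss(0, 1) for s in shared])
item_b = standardize([s + 0.5 * random.gauss(0, 1) for s in shared])
r_items = pearson(item_a, item_b)

# For two standardized, positively correlated variables, the principal
# components are the scaled sum and the scaled difference of the items.
pc1 = [(a + b) / math.sqrt(2) for a, b in zip(item_a, item_b)]
pc2 = [(a - b) / math.sqrt(2) for a, b in zip(item_a, item_b)]
r_scores = pearson(pc1, pc2)
var_pc1, var_pc2 = variance(pc1), variance(pc2)

# If each score is rescaled to unit variance before clustering, the second
# component (mostly item-specific noise here) is stretched by this factor.
stretch_pc2 = 1 / math.sqrt(var_pc2)

print("correlation between items:    ", round(r_items, 3))
print("correlation between PC scores:", round(r_scores, 3))
print("variance of PC1, PC2:         ", round(var_pc1, 3), round(var_pc2, 3))
print("stretch applied to PC2:       ", round(stretch_pc2, 3))
```

In this sketch the raw component variances are 1 + r and 1 - r, where r is the correlation between the items. If the scores are standardized to unit variance, a Euclidean method such as K-means gives the minor, noise-dominated dimension the same weight as the major one.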

If latent class cluster analysis is used to find groups with attitudinal battery items as inputs, do the same criticisms apply to using orthogonal principal component scores as basis variables, or is the practice more justifiable for latent class cluster analysis than for K-means cluster analysis? I'm looking for a point of view or recommendation on using orthogonal principal component scores versus individual attitudinal attributes as inputs for latent class cluster analysis. To what extent are issues such as collinearity between attributes, lack of correlation between orthogonal principal component scores, etc. damaging (or not damaging) for latent class cluster analysis?

One Quick Observation

One of the reasons why principal components models, or latent class models, are used with lengthy batteries of attitudinal items is that the excessive correlation among the specific attitudinal indicators will usually violate the assumptions of the model. This is the problem with using the specific indicators as opposed to the factors or latent constructs.
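
A small simulation can illustrate the point about correlated indicators. It assumes that the assumption in question is local independence (items unrelated to each other within a class), which the reply does not name explicitly. The class probabilities, item names, and the 10% flip rate are all hypothetical, and the sketch uses the true simulated class labels rather than fitting a latent class model.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def corr(rows, i, j):
    """Correlation between columns i and j of a list of tuples."""
    return pearson([r[i] for r in rows], [r[j] for r in rows])


random.seed(1)
n = 4000

# Hypothetical probability of agreeing with an item in each of two classes.
P_AGREE = {0: 0.8, 1: 0.2}

rows = []
for _ in range(n):
    k = random.randrange(2)  # true class membership
    p = P_AGREE[k]
    # item_a and item_b depend only on the class: locally independent.
    item_a = int(random.random() < p)
    item_b = int(random.random() < p)
    # item_c is a near-duplicate of item_a (answer flipped 10% of the time),
    # so it stays correlated with item_a even within a class.
    item_c = item_a if random.random() < 0.9 else 1 - item_a
    rows.append((k, item_a, item_b, item_c))

class0 = [r for r in rows if r[0] == 0]

marginal_ab = corr(rows, 1, 2)   # correlation the classes are meant to explain
within_ab = corr(class0, 1, 2)   # close to zero: assumption holds
within_ac = corr(class0, 1, 3)   # stays large: assumption violated

print("a vs b, whole sample:  ", round(marginal_ab, 3))
print("a vs b, within class 0:", round(within_ab, 3))
print("a vs c, within class 0:", round(within_ac, 3))
```

Under these assumptions, correlation across the whole sample is not a problem in itself, since the classes account for it. The trouble comes from correlation that remains within a class, as with the near-duplicate item, which is common in long batteries with overlapping wording.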


As for factor analysis masking key discriminating attributes, to be perfectly honest, I have never seen that. But then, as a general rule, I do favor letting the more abstract concepts drive the analysis rather than the idiosyncratic chance variation associated with a specific item.