Not 100% sure this is the best forum to post the question, but it does relate to a Clementine analysis project.
I don't have a great recollection of the basic theory behind Factor Analysis/PCA, but I've always regarded it as a data reduction technique (which I haven't had the opportunity/need to use before), useful for identifying a smaller number of underlying factors in a dataset with many variables.
Anyway - in advance of an upcoming segmentation project, a colleague has been pushing the idea of using a Factor Analysis model to reduce the number of inputs that feed into the K-Means clustering model. However, I'm unconvinced of the merits of this additional step and would normally go straight to the clustering stage, particularly as the data available relates to different customer activities and as such represents fairly distinct sets of behaviours.
From my own perspective, I feel that careful selection of variables for the clustering dataset would be sufficient, and that the FA stage is not necessary. Does anyone have an opinion on why adding the FA stage would be a good/bad idea?
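To make the two options concrete, here is a rough sketch of the comparison I have in mind. It is written in Python with scikit-learn rather than Clementine, it uses PCA as a stand-in for the Factor Analysis step, and the data is synthetic, so the function name, the number of clusters and the number of components are all just assumptions for illustration and not the real project setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def compare_clusterings(X, n_clusters=3, n_components=2, seed=0):
    """Cluster X directly and after PCA reduction; return the agreement (ARI)."""
    # Standardise so that no single variable dominates the distance measure.
    X_std = StandardScaler().fit_transform(X)

    # Option 1: K-Means straight on the (standardised) selected variables.
    direct = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(X_std)

    # Option 2: reduce to a few components first, then K-Means on the scores.
    scores = PCA(n_components=n_components,
                 random_state=seed).fit_transform(X_std)
    reduced = KMeans(n_clusters=n_clusters, n_init=10,
                     random_state=seed).fit_predict(scores)

    # Adjusted Rand index: 1.0 means both options give identical segments.
    return adjusted_rand_score(direct, reduced)


# Synthetic stand-in data: 3 groups of 100 "customers", 6 activity variables.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 6))
X = np.vstack([c + rng.normal(size=(100, 6)) for c in centres])

ari = compare_clusterings(X)
print(f"Agreement between the two approaches (ARI): {ari:.2f}")
```

If the agreement is close to 1, the reduction step is not changing the segments much; if it is low, the two routes are producing genuinely different solutions and it would be worth understanding why.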
Regards,
R