Simple components, correlated components and an application of statistical shape analysis to consumer and other multivariate data



Arnold, David
Simple components, correlated components and an application of statistical shape analysis to consumer and other multivariate data. Doctor of Philosophy thesis, University of Liverpool.

PDF
Thesis_for_printing_-_David_Arnold_07032011.pdf - Submitted version
Access to this file is embargoed until Unspecified.
Available under License Creative Commons Attribution No Derivatives.

PDF
ArnoldDav_Sep2010_3555.pdf - Author Accepted Manuscript
Available under License Creative Commons Attribution No Derivatives.


Abstract

The interpretation of a principal component analysis can be complicated because the components are linear combinations of possibly many observed variables. A rotation of the principal components can improve the interpretation; however, there are usually still many small non-informative loadings, which taken together account for a significant proportion of the observed variation. A new computationally efficient method is presented to find simple components using criteria similar to those used for principal components. Simple components are defined to have restricted weights that are proportional to elements of the set of integers {0, -1, 1}. This choice ensures that no subjective decision is required as to whether a weight is important, and an individual weight is interpreted in a similar way to a correlation of one, minus one or zero with the component. The algorithm can find solutions for large problems in tractable time and can easily accommodate alternative criteria. An application is proposed that provides a simple component summary of a large data set.

When data are related to an orthogonal basis, these axes represent the maximum separation of information between axes. An approach is developed that finds orthogonal rotations of the principal components so that the sum of the covariances, or the sum of the squared covariances, between a set of components is maximized. This approach can find a group of correlated components that explain a latent trait and, in addition, explain different aspects of that trait. Another application is developed where an arbitrary configuration of points from a multidimensional scaling or similar method can be displayed on a parallel coordinate plot so that the number of crossovers between the axes is minimized. This aids the identification of clusters and outliers.

In consumer research a respondent's perception is often driven by tacit knowledge, for example when making product comparisons. However, the traditional variable analogue scale may not capture this. A two-dimensional response is proposed for a multiple product comparison. Principal shape analysis is developed to extract latent shape responses from the questions answered by the respondents. The analysis framework is coordinate free, and uses a scaled Euclidean distance matrix to represent a configuration of products, which can be considered a shape. A Euclidean distance matrix representation does not suffer from the problems associated with the use of shape coordinate systems.
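
As a rough illustration of why a scaled Euclidean distance matrix is coordinate free, the sketch below checks that the matrix is unchanged when a configuration is rotated, rescaled and shifted. It is not taken from the thesis: the function names, the example points, and the choice of scaling (dividing by the Frobenius norm of the distance matrix) are all assumptions made for this example, and the thesis may scale differently.

```python
import math


def scaled_distance_matrix(points):
    """Pairwise Euclidean distances divided by their Frobenius norm.

    The Frobenius-norm scaling is an assumption made for this sketch.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    total = math.sqrt(sum(d * d for row in dist for d in row))
    if total == 0:
        raise ValueError("all points coincide, so the shape is undefined")
    return [[d / total for d in row] for row in dist]


def similarity_transform(points, angle, scale, shift):
    """Rotate by `angle` (radians), rescale, then translate 2D points."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
        for x, y in points
    ]


def same_matrix(a, b, tol=1e-12):
    """Entry-wise comparison of two equally sized matrices."""
    return all(
        math.isclose(x, y, abs_tol=tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# Hypothetical placement of four products on a two-dimensional response sheet.
original = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 1.0)]
moved = similarity_transform(original, angle=0.7, scale=2.5, shift=(10.0, -4.0))

# True when the two configurations share a scaled distance matrix.
print(same_matrix(scaled_distance_matrix(original), scaled_distance_matrix(moved)))
```

Because only inter-point distances enter the representation, no choice of origin, orientation or baseline is needed before two configurations are compared.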

Item Type: Thesis (Doctor of Philosophy)
Additional Information: Date: 2010-09 (completed)
Uncontrolled Keywords: multivariate statistics, principal component analysis, factor analysis, latent variables, shape analysis, parallel coordinate plot, napping, simple components, multidimensional scaling
Subjects: QA
Divisions: Faculty of Science and Engineering > School of Physical Sciences > Mathematical Sciences
Depositing User: Symplectic Admin
Date Deposited: 01 Dec 2011 16:17
Last Modified: 16 Dec 2022 04:36
DOI: 10.17638/00003555
Supervisors:
  • Cox, TF
  • Clancy, D
URI: https://livrepository.liverpool.ac.uk/id/eprint/3555