Search Algorithms: Build Pure Clusters
Introduction
Build Pure Clusters is one of the three algorithms in Tetrad designed to build pure measurement/structural models (the others are the MIM Build algorithm and the Purify algorithm).
The goal of Build Pure Clusters is to build a pure measurement model using observed variables from a data set. Observed variables are clustered into disjoint groups, and each group will represent indicators of a hidden variable. Variables in one group are not indicators of the hidden variables associated with the other groups, as in a pure measurement model. Some variables given as input will not be used because they do not fit into a pure measurement model along with the chosen ones.
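The defining property described above (disjoint groups, with some input variables left out) can be checked mechanically on any proposed clustering. The following is a minimal Python sketch, not Tetrad code: the function name `check_pure_clustering`, the variable names, and the example clusters are all hypothetical and chosen only for illustration.

```python
def check_pure_clustering(clusters, input_variables):
    """Check that clusters are pairwise disjoint and list unused inputs.

    clusters: list of lists of variable names, one list per hidden variable.
    input_variables: all observed variable names given to the search.
    Returns (is_disjoint, unused_variables).
    """
    seen = set()
    is_disjoint = True
    for cluster in clusters:
        for name in cluster:
            if name in seen:
                # A repeated variable would indicate more than one latent,
                # which a pure measurement model does not allow.
                is_disjoint = False
            seen.add(name)
    # Inputs that appear in no cluster were left out of the model.
    unused = [v for v in input_variables if v not in seen]
    return is_disjoint, unused


# Hypothetical input variables and clustering, for illustration only.
inputs = ["X1", "X2", "X3", "X4", "X5", "X6", "X7"]
clusters = [["X1", "X2", "X3"], ["X4", "X5", "X6"]]
is_disjoint, unused = check_pure_clustering(clusters, inputs)
print(is_disjoint, unused)
```

In this made-up case the groups are disjoint and `X7` is reported as unused, mirroring the situation in which a variable does not fit into a pure measurement model along with the chosen ones.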
The Build Pure Clusters algorithm assumes that the population can be described as a measurement/structural model where observed variables are linear indicators of the unknown latents. Notice that linearity among latents is not necessary (although it will be necessary for the MIM Build algorithm) and latents do not need to be continuous. It is also assumed that the unknown population graph contains a pure subgraph where each latent has at least three indicators. This assumption is not testable and should be evaluated by the plausibility of the final model.
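To make these assumptions concrete, the sketch below simulates data from a small model that satisfies them. It is an illustration under stated assumptions, not part of Tetrad: the two latents, the loadings, the noise levels, and the `tanh` link between the latents are all hypothetical choices. Each indicator is linear in exactly one latent, each latent has three indicators, and the relation between the latents is deliberately nonlinear.

```python
import math
import random


def simulate_pure_model(n_samples, seed=0):
    """Draw samples from a small pure measurement model (illustrative).

    Two latents with three indicators each. Every indicator is a linear
    function of exactly one latent plus independent Gaussian noise.
    The second latent depends on the first nonlinearly.
    """
    rng = random.Random(seed)
    loadings = [1.0, 0.8, 1.2]  # hypothetical loadings
    rows = []
    for _ in range(n_samples):
        latent1 = rng.gauss(0.0, 1.0)
        latent2 = math.tanh(latent1) + rng.gauss(0.0, 1.0)
        row = [b * latent1 + rng.gauss(0.0, 0.5) for b in loadings]
        row += [b * latent2 + rng.gauss(0.0, 0.5) for b in loadings]
        rows.append(row)  # columns: X1..X3 (latent 1), X4..X6 (latent 2)
    return rows


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = simulate_pure_model(2000)
columns = list(zip(*rows))
within = pearson(columns[0], columns[1])  # X1 and X2 share a latent
across = pearson(columns[0], columns[3])  # X1 and X4 do not
print(round(within, 2), round(across, 2))
```

Only the latents are hidden here; a search procedure would see just the six observed columns. Indicators of the same latent are more strongly correlated than indicators of different latents in this setup, although correlation strength alone is not what the algorithm relies on.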
The current implementation of the algorithm accepts only continuous data sets as input. For general information about model building algorithms, consult the Search Algorithms page.
Entering Build Pure Clusters parameters
Here is an example:
When the Build Pure Clusters algorithm is chosen from the Search Object combo box, the following window appears:
The parameters that are used by Build Pure Clusters can be specified in this window. The parameters are as follows:
Execute the search.
Build Pure Clusters returns a pure measurement model. Because of internal randomization, outputs may vary from run to run, but large differences should not be expected (and the degree of variation can actually be used to evaluate whether the assumptions are reasonable for a given set of input variables). In our example, the outcome should be as follows if the sample is representative of the population:
Edges with circles at the endpoints are added only to distinguish latent variables from the indicators. Build Pure Clusters does not make any claims about the causal relationships among latent variables (this is the role of the MIM Build algorithm). The labels given to the latent variables are arbitrary. As part of the analysis, a domain expert should evaluate whether such latents indeed have a physical or abstract meaning, or whether they should be discarded as meaningless. Such reification is domain dependent.
Warning: The output may be scattered across the window. Use the Move
command to rearrange it more perspicuously.
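Since outputs may vary from run to run and the latent labels are arbitrary, one way to compare two runs is to ignore the labels and look only at which pairs of variables end up in the same cluster. The sketch below is one possible approach, not a feature of Tetrad: the function names and the two example outputs are hypothetical, and the Jaccard overlap of co-clustered pairs is a measure chosen here for illustration.

```python
from itertools import combinations


def co_clustered_pairs(clusters):
    """Return the set of unordered variable pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pair_agreement(clusters_a, clusters_b):
    """Jaccard overlap of co-clustered pairs from two runs.

    A value of 1.0 means both runs group the variables identically,
    regardless of cluster order or latent labels.
    """
    pairs_a = co_clustered_pairs(clusters_a)
    pairs_b = co_clustered_pairs(clusters_b)
    union = pairs_a | pairs_b
    if not union:
        return 1.0  # neither run grouped any two variables together
    return len(pairs_a & pairs_b) / len(union)


# Hypothetical outputs of two runs on the same data, for illustration only.
run1 = [["X1", "X2", "X3"], ["X4", "X5", "X6"]]
run2 = [["X4", "X5", "X6"], ["X1", "X2", "X3", "X7"]]
print(pair_agreement(run1, run2))
```

Agreement values close to 1.0 across repeated runs would be consistent with the expectation that runs do not differ much; consistently low values may be a reason to question whether the assumptions hold for the chosen input variables.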