In Butner, Dyer, Malloy, and Kranz (In Press), we illustrate how the cusp catastrophe model can be tested in a multilevel modeling context. Multilevel modeling (also known as mixed models, hierarchical linear modeling, and random coefficient modeling) is a maximum likelihood-based procedure designed for nested data structures. In this case, the nesting comes from having multiple time series rather than just one. Multilevel modeling can be thought of as an expansion of ANOVA and regression, with repeated measures ANOVA as a special case.
Bryk and Raudenbush pointed out that multilevel models are analogous to running a regression within individuals (one for each time series), saving out the regression coefficients, and then comparing the coefficients across individuals. They call the equation within cases the Level 1 equation and the equations comparing coefficients between cases the Level 2 equations. The actual equation tested under maximum likelihood is the nested (or mixed) equation, where the coefficients in Level 1 are replaced with the equations in Level 2. Also, maximum likelihood estimation is an asymptotic estimation procedure, so estimates are always population-based calculations. Thus, error terms are represented by error variances (and a model might have a couple of them, known as random effects).
So, let's say that the level 1 equation is the cusp catastrophe regression equation from earlier. The only change I will make is to switch to Greek symbols because ML is an asymptotic procedure (Greek for population, the old statistical norm). And let's say that the level 2 equations have only intercepts (no predictors, no residuals). This is like saying that everyone has the identical level 1 equation. The nested equation (the equation actually tested) just combines the level 1 and level 2 equations; a sketch of all three pieces follows.
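To make this concrete, here is a rough sketch in my own notation (assuming the earlier cusp regression predicts change in y from A, B·y, and y³; the symbols and labels are mine, not the paper's):

Level 1 (within time series i, at each time point):
  Δy = π_0i + π_1i·A + π_2i·(B·y) + π_3i·y³ + e

Level 2 (intercepts only, no predictors, no residuals):
  π_0i = γ_00,  π_1i = γ_10,  π_2i = γ_20,  π_3i = γ_30

Nested (the equation actually tested):
  Δy = γ_00 + γ_10·A + γ_20·(B·y) + γ_30·y³ + e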
Scaling and Expanding the Equation Form
Now the equation above is imperfect. It is based on the topological formulation of the cusp catastrophe, and it is mathematically correct under a scale-free assumption. However, multilevel modeling (and regression too, so what I am about to say also applies to the earlier parts) generates partial derivative relationships. The moment we add interactions and polynomials, many of the terms lose their scale-free nature. To see the issue, let's go to a much simpler regression circumstance:
Y = b0 + b1X + b2Z + b3XZ + e
Here, X and Z interact to predict Y. This interaction is represented by a product term. The key question relevant to our discussion: how do we interpret the main effects b1 and b2? b1 is conditional on the scaling of Z and b2 is conditional on the scaling of X. That is, each is interpreted at the point where the other variable in the interaction is zero. Change the scaling of Z and b1 changes - this is the loss of the scale-free nature. So, there is some value of Z where b1 would be zero, and at that value it would be correct to drop b1 from the equation entirely. At any other value of Z, b1 would be non-zero, and by leaving it out we bias the other estimates. There is an exception to this having to do with nested data structures, and these nestings often exist in multilevel models, but that is not what is going on here.
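To see this concretely, here is a quick worked substitution (my own illustration, not from the original): recenter Z by a constant c, so that Z' = Z − c and Z = Z' + c, and substitute back in:

  Y = b0 + b1X + b2(Z' + c) + b3X(Z' + c) + e
    = (b0 + b2c) + (b1 + b3c)X + b2Z' + b3XZ' + e

The coefficient on X is now b1 + b3c: shifting the zero point of Z shifts the "main effect" of X along with it, and at c = -b1/b3 it is exactly zero.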
Returning to our regression based cusp catastrophe, now including all the implied lower order terms, we have the expanded form sketched below.
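A sketch of that expanded form (same caveat on notation; the set of terms is inferred from the fixed effects in the SPSS syntax further below):

  Δy = γ_00 + γ_10·A + γ_20·(B·y) + γ_30·y³ + γ_40·y + γ_50·y² + γ_60·B + e

Here γ_40·y, γ_50·y², and γ_60·B are the lower order terms implied by the B·y interaction and the cubic.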
By leaving out these additional terms, we are assuming they are zero, and to the extent that they are not, we will get incorrect solutions.
This leads to two directions to resolve the scaling issue:
- The solution suggested in the literature is to put all the variables (Y, A, & B) into some form of standardized metric. This logic is based on the notion that centering will place the zero values where they provide the proper estimates. Personally, I believe this is flawed. It should only work to the extent that the zero values land at the set point, and the cusp catastrophe has multiple set points that can even change location. So, it is unlikely to work across the board, especially when you are dealing with multiple time points.
- Expand the multilevel model to include all the lower order terms. I am a big fan of this, because it does not put the onus of scaling on the user and will always give proper estimates for the key terms. Some of these lower order terms have topological meaning while others do not. This is a problem because it means we are mixing alternate topological interpretations with scaling issues. So, I highly recommend graphing results to see what is going on, much like the graphs I made above. Note that fully expanding the equations does not mean that one can ignore scaling. In fact, the exact opposite is true. The polynomials can all be highly correlated, and maximum likelihood can fail to get proper estimates. However, the solution is just to follow the common rules for scaling in multilevel models (grand-mean centering and group-mean centering); a rough SPSS sketch of this kind of centering follows this list.
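As a rough sketch of what that centering step can look like in SPSS (the variable names y_grandmean, y_gmc, y_personmean, and y_pmc are my own, not from the paper; A and B could be handled the same way):

* Grand-mean centering: subtract the overall mean of y.
aggregate outfile=* mode=addvariables
 /y_grandmean=mean(y).
compute y_gmc = y - y_grandmean.
* Group-mean centering: subtract each time series' own mean of y.
aggregate outfile=* mode=addvariables
 /break=ID
 /y_personmean=mean(y).
compute y_pmc = y - y_personmean.
execute.

Whichever centered version of y you settle on would then be the variable used to build the y, y*y, and y*y*y terms in the MIXED syntax below.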
Random Effects in the Cusp
Butner et al. (In Press) estimated the cusp catastrophe model with implied, but unknown, control parameters. The way we did this was by careful use of random effects. To understand how this works, we need to talk about the nature of a random effect, also known as a variance component. Let's assume, for the moment, that A and B vary across individuals, but not within individuals. Recall the common assumption that A & B vary independently of the rest of the variables changing through time. So, we can argue that there is a distribution of values of A and B across individuals.
Now imagine that we never measured A and B. Below is the same equation without A and B, but with question marks in place of A & B. These question marks are varying contributions. Note that the main effects of A & B collapse because we would just see variation there.
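A rough sketch of that equation (my notation again), starting from the expanded form above:

  Δy = (γ_00 + ?) + (γ_40 + ?)·y + γ_50·y² + γ_30·y³ + e

The first ? is where A (and the main effect of B) used to contribute; the second ? is where B, through the B·y product, used to contribute.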
Notice how we would observe variation on the main effect of Y and variation on the intercept. We just replace these with random effects, and under the assumptions of random effects, this becomes a stand-in for the control parameters; a sketch follows.
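Written as a two-level model with random effects (again a sketch, relabeling the coefficients for this reduced model):

Level 1:  Δy = π_0i + π_1i·y + π_2i·y² + π_3i·y³ + e
Level 2:  π_0i = γ_00 + ω_0i
          π_1i = γ_10 + ω_1i
          π_2i = γ_20
          π_3i = γ_30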
Here the omegas are the random effects. Random effects in multilevel models are assumed to be normally distributed - which here amounts to assuming that A and B are normally distributed. Finding variation in omega1 takes the place of B. Finding variation in omega0 takes the place of A (and B to some extent, due to the scaling issues mentioned above).
As mentioned before, the random effects approach assumes that A & B vary across time series but not within time series. If A & B vary within time series too, then those variations come out as error when we are only using random effects to account for A & B. So, once we observe random effects, it makes sense to add candidate control parameters at level 1 or level 2, and a successful control parameter should force the relevant level 2 random effect to zero.
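For example (a sketch, and here I am assuming B is measured at level 2, i.e., one value per time series), the level 2 equation for the slope on y might become:

  π_1i = γ_10 + γ_11·B_i + ω_1i

If B is doing its job as a control parameter, the variance of ω_1i should drop toward zero once B_i is included.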
Code in SPSS
I am going to work in SPSS MIXED, though we could use any multilevel modeling program (HLM, Proc Mixed, MLwiN, R, etc.). I am assuming that the file is constructed as a long file (HLM would call this a level 1 file; each line is one instance from an individual). First things first, we need an estimate of the derivative. I have always displayed these as derivatives, but the logic of catastrophe models does not require true derivatives; we can use discrete change instead. So, one option is to create an estimate of the derivative using techniques like the Local Linear Approximation or the GOLD method. Alternatively, we can use discrete change, and that is what I will do today. This choice has a lot to do with the philosophy of time and its relation to the phenomena of interest. More on this another time (a paper unto itself).
Let's make the change in the outcome. For this example, my outcome variable is Y, and I have multiple time series separated by an ID variable called ID. We first need to split the file so that the lead variable does not cross different time series.
split file by ID.
create Y_lead = LEAD(y,1).
split file off.
compute y_change = Y_lead - y.
execute.
Now, I am going to run the analysis using the MIXED command. In this first case, I will assume A and B are unknown and use random effects to account for them.
Mixed y_change with y
/fixed=y y*y y*y*y |SSTYPE(3)
/method=reml
/print=solution testcov
/random = intercept y |subject(id) covtype(vc).
Now I add in A and B, but still keep the random effects to see if anything remains.
Mixed y_change with y A B
/fixed=y y*y y*y*y A B y*B |SSTYPE(3)
/method=reml
/print=solution testcov
/random = intercept y |subject(id) covtype(vc).