Using microsimulation modeling to capture heterogeneity in marijuana usea
Susan M. Paddock1
Rosalie Pacula1
Jeanne Ringel1
Jeremy Arkes2
Tanya Bentley1,5
Jonathan Caulkins3
Christine Eibner4
Beau Kilmer1
Marika Suttorp1
February 23, 2009
1RAND Drug Policy Research Center, RAND Corporation, Santa Monica, CA
2Naval Postgraduate School, Monterey, CA, U.S.A.
3Carnegie Mellon University, Pittsburgh, PA, U.S.A.
4RAND Drug Policy Research Center, RAND Corporation, Arlington, VA
5UCLA Center for Health Policy Research, Los Angeles, CA
aThis paper describes work that is done with support from the National Institute on Drug Abuse (1R01 DA019993). We gratefully acknowledge significant guidance from members of our expert panel, including Michael Dennis, Susan Ettner, Michael French, Teh-wei Hu, Martin Iguchi, Emmett Keeler, Andrew Morral, Peter Reuter and Jeffrey Wasserman.
I. Introduction
Marijuana is the most widely used illicit substance in the United States, with use rates highest among youth and young adults. Average thirty-day use rates among high school seniors in the United States have stayed above 20% since 1993, and in 2007, 5.1% of high school seniors reported daily use of marijuana (Johnston et al., 2007a). Use rates among young adults are also high, with over half of all individuals between the ages of 19 and 28 reporting having used marijuana at some point, 15.7% reporting use in the past thirty days, and 5.0% reporting use of marijuana on a daily basis (Johnston et al., 2007b).
Although a large number of studies have examined risk and vulnerability factors associated with the general use and onset of marijuana use, only recently have scientists started examining factors associated with marijuana use careers, the duration and severity of marijuana use, and quit behavior. Current work suggests that age of onset, frequency of use at an early age, drug-using peers, and stressful life events are important factors influencing both duration and the probability of quitting (Perkonigg et al., 2007; Van Ours and Williams, 2007; van Ours, 2006; Pudney, 2004). Other work examining specific trajectories for marijuana use as youth transition into adulthood suggests that there may be as many as 4 to 6 different trajectories that are common in the U.S. population because of heterogeneity in experiences and other correlates that are influential for predicting sustained use (Schulenberg et al., 2005; Ellickson et al., 2004; Windle and Wiesner, 2004; Kandel and Chen, 2000). The vast majority of these studies have focused only on the periods of late adolescence and young adulthood during which marijuana use peaks (Johnston, Bachman et al., 2007a; Hall and Pacula, 2003; Kandel and Logan, 1984). None of these analyses have examined the longer-term effects of marijuana use, such as how marijuana use trajectories both affect and are affected by drug treatment utilization patterns. Knowledge of these interactions is essential if we hope to understand the implications of alternative policy options.
No single data source captures the full spectrum of drug use and its consequences over the life course. As a result, no single source gives policy makers all of the information needed to assess and compare a diverse set of policy proposals within a single framework, for example the cost-effectiveness of drug prevention (Caulkins et al., 2002), law enforcement (Caulkins et al., 1997), and marijuana treatment interventions (Dennis et al., 2004; Olmstead et al., 2007; French et al., 2003). Therefore, one must synthesize information provided from multiple data sources that reflect different parts of the life course and different drug use and treatment states.
Markov or population-cohort models have been used in the past to combine different data sets in order to track entire drug use careers for a population (Rydell, Caulkins & Everingham, 1996; Everingham & Rydell, 1994; Rydell & Everingham, 1994; Childress, 1993). Although Markov models can be useful for understanding questions related to total population consumption or average lifetime consumption, they provide limited insight into the variation in consumption within a population and how or whether individual-level factors or experiences, such as prior disease history or drug treatment history, influence transitions into and out of drug use. The variation in consumption within a population and the factors that contribute to it can be very important, as not all use is dangerous nor does all use impose social harms. Moreover, to the extent that a large fraction of the total consumption within a market is limited to a small set of heavy users, it is useful to understand the factors that are highly correlated with falling into the heavy use set. Furthermore, when using discrete-state Markov modeling to evaluate risk of illicit drug use over time, it is often necessary to exclude or restrict the impact that prior disease history may have on future disease risk; to the extent that the predictions of such health and economic outcomes depend on the degree to which event history is captured, such a limitation may bias model outcomes. This bias may differentially affect policy strategies evaluated in a model. For example, while one may know that relapse risk for illicit drug use increases with prior history, in a Markov model one may not be able to adequately model the dependence on such history.
Microsimulation modeling offers an alternative way of modeling drug use trajectories over the life course. More so than Markov models, simulation models are able to capture the uncertainty and heterogeneity in factors influencing drug use and its persistence over time by examining how a change in behavior today affects the natural course of outcomes over an individual’s lifetime. Simulation models can explicitly capture both the uncertainty of events and the heterogeneity in predictors and outcomes by allowing transition probabilities across drug use states to be functions of multiple individual-level characteristics as well as a random stochastic component (Mitton et al., 2000).
In this paper we compare a microsimulation and a Markov state-transition model of the life course of marijuana use for persons in the United States aged 12 and older. We aim to show that sufficient data sources exist to inform the construction of such models of lifetime marijuana use trajectories and to demonstrate the greater ability of microsimulation models over Markov cohort models to capture uncertainty and heterogeneity in lifetime trajectories of illicit drug use. Recognizing this heterogeneity and uncertainty is important not just because it could enable a better understanding of use trajectories; it can also better inform policy by more precisely identifying common risk factors associated with the escalation and persistence of drug use and by producing a better sense of the range of consumption across different types of users.
The rest of the paper is organized as follows. In Section 2 we present a microsimulation model of the life course of marijuana use for persons in the United States that follows a cohort from age 12 to age 85 and simulates their individual drug use trajectories over this time period. We also present a Markov model that is constructed under the same basic assumptions as the microsimulation but uses population-average transitions, whereas transitions in the microsimulation model are allowed to vary by gender, race, and educational status. In Section 3 we provide an overview of the data sources we use as inputs to our two models, highlighting what the data sources do and do not contain, and explain how that informs our model building. In Section 4, we examine the external validity of both models by comparing key results from the models to findings from existing public use data. We also compare and contrast results obtained from the microsimulation model and the Markov model. Finally, in Section 5, we discuss policy implications and draw conclusions.
II. Model Construction
II.A. Microsimulation model
The microsimulation model is used to simulate marijuana use histories for individuals who belong to a cohort of 12-year-olds in the U.S. in 2003 and who are followed in the model throughout their lives. These drug use histories are determined by individual-level characteristics that are fixed, such as gender and race/ethnicity, as well as those that vary over time, such as drug use. The microsimulation model consists of states, transition probabilities and a time cycle during which these transitions take place. States represent the physical location and drug use proclivity of the individual at a given point in time t. As shown in Figure 1, there are four physical locations in our model: (1) In the community - Not in treatment; (2) In the community - In outpatient treatment; (3) In inpatient treatment; and (4) Dead. In addition to these physical locations, an individual’s state is determined by one of four drug use proclivities: non-user, occasional user, regular user, and dependent user.
Drug use proclivity represents the individual’s underlying desire to use a drug rather than their current level of use. When an individual is unconstrained and living in the community, his/her observed drug use behavior will perfectly reflect his/her current drug use proclivity (underlying desire). If, however, the individual is in a constrained environment, then his/her underlying drug use proclivity will not necessarily match his/her observed drug use in time t. For example, a dependent user entering treatment in period t-1 may be observed not using drugs in period t, but that does not mean his/her natural proclivity is to be a non-user. In actuality, s/he is still a dependent user who is being externally constrained not to use. Upon exit from treatment s/he may remain abstinent for some time afterward. However, because his/her natural proclivity is that of a dependent user, s/he will have a higher probability of relapsing into use until s/he has sustained abstinence for some specified length of time. Hence, one’s natural proclivity for drugs can and will change over the lifetime. Transitions to higher and lower proclivities will occur based on one’s drug using experience and length of time consuming marijuana at a given level.
Transition probabilities are used to stochastically move an individual from one state (physical location and/or proclivity) to another state between time t and time t+1. These transition probabilities will vary according to the state in which an individual resided in period t but may also vary by individual characteristics (e.g., age, gender, race/ethnicity) and past history of the individual (e.g., age of first drug use). For example, changing from dependent drug use proclivity in the community to a regular drug use proclivity in the community may be a function of the length of time since the individual last used marijuana and their age of initiation. Transitions are generated as random draws from stochastic distributions, the form and parameters of which are determined from regression analyses or cross-tabulations of relevant data from the target populations. Conditional upon a transition in drug use taking place at a given age, the individual’s life history will be altered from that point onward and subsequent probabilities for state transitions will be conditional upon current drug use proclivity.
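The stochastic transition step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it tracks drug use proclivity only (treatment and death are omitted), and the function names and every numeric probability are hypothetical placeholders rather than estimates from the paper's data.

```python
import random

PROCLIVITIES = ["nonuser", "occasional", "regular", "dependent"]

def transition_probs(current, age, age_first_use):
    """Next-state probabilities for one cycle (placeholder values, not estimates)."""
    i = PROCLIVITIES.index(current)
    # Assumption for illustration: early initiation raises the chance of escalating,
    # and escalation becomes less likely from age 25 onward.
    up = 0.10 if (age_first_use is not None and age_first_use < 15) else 0.05
    if age >= 25:
        up /= 2
    down = 0.10
    probs = {current: 1.0}
    if i + 1 < len(PROCLIVITIES):      # can move up one level
        probs[PROCLIVITIES[i + 1]] = up
        probs[current] -= up
    if i > 0:                          # can move down one level
        probs[PROCLIVITIES[i - 1]] = down
        probs[current] -= down
    return probs

def draw_next(current, age, age_first_use, rng):
    """Draw the state at t+1 given the individual's state and history at t."""
    probs = transition_probs(current, age, age_first_use)
    states = list(probs)
    weights = [probs[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

def simulate(n_cycles, rng):
    """Simulate one individual's quarterly proclivity history starting at age 12."""
    state, age_first_use, history = "nonuser", None, []
    for cycle in range(n_cycles):
        age = 12 + cycle / 4           # one cycle is one quarter
        state = draw_next(state, age, age_first_use, rng)
        if state != "nonuser" and age_first_use is None:
            age_first_use = age        # history that conditions later transitions
        history.append(state)
    return history
```

Calling `simulate(40, random.Random(0))` would produce one simulated individual's first ten years. Because the probabilities depend on `age_first_use`, two individuals in the same current state can face different transition risks, which is the feature that distinguishes this approach from a cohort model.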
Time cycle is the length of time designated for each period t. In our model, the length of time represented is one quarter (3 months). One quarter has an important meaning in treatment settings, as longer lengths of stay (typically greater than 3 months) have been associated with improved post-treatment outcomes (Hubbard et al., 1989; Simpson, 1993).
Figure 1 puts it all together. The rows represent all 13 possible combined location-proclivity states at any given time t. The state transitions that are possible when moving to time t+1 are highlighted in Figure 1. For example, non-users in the community at time t have nonzero probabilities of transitioning into occasional use at time t+1, but the model restricts non-users from entering treatment. In contrast, a regular user who is in the community at time t can remain in the community, either as a regular user or with a different proclivity (decreasing to occasional use or increasing to dependent use); alternatively, this regular user may die or enter either outpatient or inpatient treatment. The ‘NAs’ in Figure 1 indicate transitions that are not possible in our model. Note that an individual’s proclivity to use drugs will not change substantially simply because of a change in physical location. For example, when a dependent user is living in the community and then transitions into inpatient treatment, his/her proclivity will remain at the dependent user level for at least one time period even if the observed marijuana use behavior in the next period drops to zero.
II.B. Markov Model
The Markov model follows the same overall framework (see Figure 1) as the microsimulation model to predict marijuana use, with a few distinguishing characteristics. Like the microsimulation, the Markov model starts with a cohort that represents U.S. 12-year-olds and uses a parallel structure of states, time cycle, and transition probabilities. However, the Markov model does not model transitions at the individual level, but instead incorporates population-level and age-specific probabilities for the various transitions. Transitions between and within states and events are thus assumed to be independent of prior individual history, such that probabilities are not updated based on prior states. All transitions are quarterly except for drug use transitions within the community, which occur annually (once every four cycles), due to both limited quarterly-level data and the inherent challenges in converting multinomial probabilities accurately while maintaining the mutually exclusive and collectively exhaustive properties necessary to the Markov framework.
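As a rough illustration of the cohort approach, the sketch below propagates a population distribution through a quarterly transition matrix and applies a separate drug use transition matrix once every four cycles. The three-state layout, the matrices `QUARTERLY` and `ANNUAL`, and all numbers are hypothetical and much simpler than the model in Figure 1; in particular, the sketch uses constant matrices, whereas the paper's probabilities are age-specific.

```python
STATES = ["nonuser", "user", "dead"]

# Placeholder quarterly matrix: mortality only (rows are "from", columns are "to").
QUARTERLY = [
    [0.999, 0.0,   0.001],
    [0.0,   0.999, 0.001],
    [0.0,   0.0,   1.0],
]

# Placeholder annual matrix: drug use transitions within the community.
ANNUAL = [
    [0.95, 0.05, 0.0],
    [0.10, 0.90, 0.0],
    [0.0,  0.0,  1.0],
]

def step(dist, matrix):
    """Multiply the cohort distribution (row vector) by a transition matrix."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def run_cohort(dist, quarterly, annual, n_cycles):
    """Advance the cohort; drug use transitions occur once every four cycles."""
    for cycle in range(1, n_cycles + 1):
        dist = step(dist, quarterly)
        if cycle % 4 == 0:
            dist = step(dist, annual)
    return dist

# Everyone starts as a non-user; run two years (eight quarterly cycles).
final = run_cohort([1.0, 0.0, 0.0], QUARTERLY, ANNUAL, 8)
```

Each row of both matrices sums to one, so the cohort shares continue to sum to one after every cycle. The model carries only these shares forward, with no record of how any individual arrived in a state.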
III. Data sources and model parameters
Table 1 lists the parameters of the microsimulation and Markov models, summarizing how their values were determined as well as their data sources. In order to fully exploit the strength of microsimulation modeling for tracking individuals, several key model parameters vary with respect to individual characteristics, as elaborated upon below. In the case of the Markov model, these parameters are evaluated for the population as a whole rather than based on individual characteristics, as discussed later. Very few parameters have assumed values – only changes to drug use proclivities following treatment are assumed.
III.A. Microsimulation model data
Drug use proclivity
Since data sources on drug use proclivity states were only available at the annual level, we modeled drug use transitions at the annual level and converted annual transitions to quarterly transitions in the microsimulation model by randomly selecting a plausible set of quarterly transitions associated with each annual transition, under the assumptions that drug use proclivity could only change at most one level per quarter and change in a monotonic fashion (either non-increasing or non-decreasing) over the course of the entire year. In the case of the Markov model, we assumed that all transitions in drug use proclivity occur at the end of a year given the annual availability of drug use proclivity data.[1]
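The annual-to-quarterly conversion can be sketched as follows. The paper states only that a plausible set of quarterly transitions is selected at random, subject to the one-level-per-quarter and monotonicity constraints; choosing uniformly among the admissible paths, as done here, is our assumption, and the function name and integer coding of levels (0 = non-user through 3 = dependent) are hypothetical.

```python
import random

def quarterly_path(start, end, rng, quarters=4):
    """Return proclivity levels at the end of each quarter for one annual transition.

    The path is monotone and changes by at most one level per quarter.
    """
    steps_needed = abs(end - start)
    if steps_needed > quarters:
        raise ValueError("annual change cannot be reached in the available quarters")
    direction = 1 if end > start else -1
    # Pick, uniformly at random, the quarters in which a one-level change occurs.
    change_quarters = set(rng.sample(range(quarters), steps_needed))
    path, level = [], start
    for q in range(quarters):
        if q in change_quarters:
            level += direction
        path.append(level)
    return path
```

For example, an annual transition from level 0 to level 2 requires two one-level increases, which this sketch places in two randomly chosen quarters of the year; the final element of the returned path always equals the annual end state.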
The National Longitudinal Survey of Youth (NLSY-97) is a nationally representative data set of approximately 9,000 youths who were 12 to 16 years old as of December 31, 1996, and who are followed up annually. For each age, we fit an ordinal logistic regression model to predict individual-level drug use proclivity states in the current year: no use in past year; occasional use (≤ 2 times in past month); regular use (3-19 times in past month); and heavy use (>20 times in past month). Predictors included in each model are marijuana use in the previous year, gender, race/ethnicity, educational level (less than high school, high school diploma, some college, and college degree) and age of first use of marijuana. We then incorporated the resulting ordinal logistic regression prediction equation into the microsimulation model in order to simulate drug use proclivity transitions for 15-24 year olds. Because very few observations are available for youth aged 12-14, we obtained transitions for 12-14 year olds using cross-sectional probabilities of these four drug use proclivity states from the 2004 National Survey on Drug Use & Health (NSDUH) and applied them equally to all individuals, regardless of individual-level characteristics and past history.
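To show how an ordinal logistic prediction equation yields probabilities for the four proclivity states, the sketch below assumes the common proportional-odds form, in which P(Y ≤ k) = logistic(c_k − xβ) for ordered cutpoints c_k. The source does not state the parameterization used, and the cutpoints and linear predictor values shown are hypothetical, not fitted estimates.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_probs(linear_predictor, cutpoints):
    """Category probabilities under a proportional-odds model.

    cutpoints must be increasing; with three cutpoints there are four categories.
    """
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for cum in cumulative:
        probs.append(cum - previous)   # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = cum
    return probs

# Hypothetical cutpoints separating no use / occasional / regular / heavy use.
CUTPOINTS = [1.0, 2.5, 4.0]

# Hypothetical linear predictors (x * beta) for two individuals.
low_risk = ordinal_probs(0.0, CUTPOINTS)
high_risk = ordinal_probs(3.0, CUTPOINTS)
```

In a simulation, the resulting four probabilities would serve as the weights for a random draw of next year's proclivity state. A larger linear predictor, for example from heavier use in the previous year, shifts probability toward the higher use categories.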
We used a different strategy to obtain drug use proclivity transition probabilities for adults aged 25-85, since simply using cross-sectional probabilities from NSDUH as we did for those aged 12-14 resulted in too many older persons initiating marijuana use. We used data on marijuana initiation from the 1994b-1998 National Household Survey on Drug Abuse (NHSDA) to fit an age-period-cohort (APC) model to predict marijuana initiation at each age, controlling for period and cohort effects (Johnston & Gerstein, 1998, 2000; Kerr et al., 2007)[2]. We fit Poisson regression models to the age-, period- and cohort-specific marijuana initiation counts and included an offset term for the number of persons eligible to initiate in each age, period and cohort group. We included age in the model as a set of dummy variables covering 3-year age categories, starting with ages 12-14 and ending with ages 48-50[3], with ages 21-23 as the reference (holdout) category. For those aged 25 and older, we transformed the age-specific Poisson regression coefficient estimates of marijuana initiation to risk ratios in order to compute the probability of initiation relative to the reference group (e.g., the probability of initiation among 40 year olds equals the probability of initiation among 21-23 year olds multiplied by the relative risk of initiating marijuana use at age 40). We then used these probabilities to predict marijuana initiation for those aged 25 and older.
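The risk-ratio step can be written out as a short calculation. In a Poisson model with a log link and an offset for the number of persons eligible to initiate, exponentiating an age-category coefficient gives the initiation rate in that category relative to the reference category. The sketch below applies that ratio to a reference probability; the reference probability, the coefficient values, and the cap at 1.0 are hypothetical illustrations, not values from the fitted APC model.

```python
import math

def initiation_prob(p_ref, beta_age):
    """Probability of initiation for an age category relative to the reference group.

    p_ref    : initiation probability in the reference category (ages 21-23)
    beta_age : Poisson regression coefficient for the age category (log rate ratio)
    """
    risk_ratio = math.exp(beta_age)
    # Cap at 1.0 as a safeguard (our addition) so the result is a valid probability.
    return min(1.0, p_ref * risk_ratio)

# Hypothetical inputs for illustration only.
P_REF = 0.02                                   # placeholder reference probability
BETAS = {"21-23": 0.0, "39-41": math.log(0.1)} # placeholder coefficients

p_age_40 = initiation_prob(P_REF, BETAS["39-41"])
```

With these placeholder inputs, the risk ratio for the 39-41 category is 0.1, so the initiation probability at age 40 is one tenth of the reference probability. The reference category itself has a coefficient of zero and therefore a risk ratio of one.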