7th International Forum on Tourism Statistics
Stockholm, Sweden, 9-11 June 2004
Micro-Simulation Modelling of Domestic Tourism Travel Patterns in Sweden
Anders Lundgren
Spatial Modelling Centre, Kiruna
Department of Social and Economic Geography, Umeå University,
Box 839, 981 28 Kiruna, URL:
Tel.:+46 980 676 27, fax +46 980 67626, e-mail:
Abstract
From a geographical point of view, tourism is basically about flows in a spatial system that links a place of origin to a destination, and about the impacts that tourism induces at these destinations. Forecasting tourism flows requires reliable data. In the Swedish context the available data source, the Swedish Tourist Database (TDB, Åre Marknadsfakta AB), contains individual attributes such as age and income as well as individual choices of tourist activities. The database therefore makes it possible to analyse socio-economic patterns in relation to recreational activities at the individual level.
Here it is demonstrated how the TDB-data can be used as empirical input for a tourism module integrated into SVERIGE, a geographical micro-simulation model of the entire Swedish population. It is argued that this modelling on the micro-level accounts for changes in population structure and geography to a far greater extent than conventional models because of its focus on individual behaviour in relation to individual socio-economic characteristics. Thus, population change is mirrored directly in the resulting travel patterns.
This paper describes equations and calculations for SVERIGE’s tourism module and presents examples of model runs.
Background
In tourism literature, it is almost a rule to mention the increasing importance of tourism and tourism as a developing industry in the world (Ioannides & Debbage, 1998; Jansson, 1994; Page & Getz, 1997; Page, 1999; Roberts & Hall, 2001; Sharpley & Sharpley, 1997; Sharpley & Telfer, 2002; Shaw & Williams, 1994). It is also widely argued that measuring tourism demand is obstructed by a lack of suitable data (Hall & Page, 2002). This lack of data on leisure activities indicates that leisure and recreation, to which tourism activities belong, are not regarded as particularly important outside the community of tourism and leisure stakeholders. There is only fragmented information on what tourists actually do. The picture of what the tourist industry is remains diffuse, and it is sometimes questioned whether there is such a thing as a tourist industry (Smith, 2003). This makes planning, management and impact assessment difficult. It also makes it harder for the commercial enterprises that claim to belong to the tourism industry to argue their case.
In addition, the lack of data makes it difficult to forecast tourism from a scientific point of view. Forecasting depends on time series of data. Forecasting with structural models aims to explain how changes in society and the surrounding environment affect tourism (Smith, 1995). Structural models cannot be applied if the tourism data only contain information on impacts and activities, without socio-economic data on the tourists.
The aim of this paper is to describe a method that combines tourism statistics, structural models and microsimulation to simulate tourism flows. The focus is on tourism demand and the resulting spatial travel patterns, the socio-economic attributes of individuals and their choice of activity.
Swedish Tourism Statistics
Many organisations are involved in collecting and analysing data on tourism in Sweden. The most important are Statistics Sweden (SCB), the Swedish Tourist Authority, Åre Marknadsfakta, the Swedish Ski Resort Association (SLAO), Swedish Campsite Managers (SCR) and the Swedish Institute for Transport and Communications Analysis (SIKA). There are also projects with a limited lifespan aimed at collecting and analysing data for special purposes.
The Swedish Tourist Database (TDB) contains results from interviews with randomly selected individuals living in Sweden. The company Åre Marknadsfakta, which owns the database, interviews two thousand persons every month. Data have been collected since 1989. The database contains socio-economic data such as age, education, income, number of children and place of residence, and thematic data for the trip such as purpose of trip, money spent on the trip, number of nights, destination and other variables. There are different trip types: domestic, abroad, staying overnight, daytrip, work or leisure. Aggregated analyses based on these data are made, and statistics are presented yearly in various forms by the Swedish Tourist Authority, among others.
Activities vs purposes
TDB contains a variable called “travel purpose”, which gives the respondent 35 choices. Up to three purposes can be declared for each trip. However, these purposes are actually a mixture of purposes and activities. It is, for example, possible to choose between “Visiting Second Home” and “Peace and Quiet”. Visiting your own or someone else’s second home is an activity; experiencing peace and quiet is a purpose you can achieve by performing that activity. Recreation data from different countries show that activities are strictly defined as something a person does, not why it is done (Cushman, 1996). Activities are easier to connect to a place since they depend on certain prerequisites. A purpose can be fulfilled in many ways in many different places and is more difficult to attach to a place.
To be able to assign tourism activities to places, the purposes in the data have to be assigned to activities. In this case, all the purposes reported as the primary reason for travel were checked against the secondary purpose for the trip. If, for example, a majority of the respondents who stated “Experiencing Peace and Quiet” as the primary reason stated “Visiting second home” as the secondary purpose, “Experiencing Peace and Quiet” was grouped with the activity “Visiting second homes”.
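The matching rule described above can be sketched in Python. The trip records below are invented for illustration and are not TDB data. `majority_secondary` is a hypothetical helper name, and reading "majority" as more than half of the responses is an assumption.

```python
from collections import Counter, defaultdict

# Invented (primary purpose, secondary purpose) pairs, one per trip.
trips = [
    ("Peace and quiet", "Visit second home"),
    ("Peace and quiet", "Visit second home"),
    ("Peace and quiet", "Outdoor life"),
    ("Get some action", "Pleasure/Entertainment"),
    ("Get some action", "Pleasure/Entertainment"),
    ("Get some action", "Skiing"),
    ("See the country", "Visit attraction"),
    ("See the country", "Visit parks"),
]

def majority_secondary(pairs):
    """Map each primary purpose to the secondary purpose stated by a majority.

    Returns None for a primary purpose when no secondary purpose is
    reported in more than half of its trips.
    """
    counts = defaultdict(Counter)
    for primary, secondary in pairs:
        counts[primary][secondary] += 1
    mapping = {}
    for primary, counter in counts.items():
        secondary, n = counter.most_common(1)[0]
        mapping[primary] = secondary if n > sum(counter.values()) / 2 else None
    return mapping

mapping = majority_secondary(trips)
```

Purposes that come back as `None` have no clear majority and would need a manual decision, which matches the remark that some purposes could not be replaced by activities.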
In table 1 the classification of activities and purposes is shown. It was, however, not possible to avoid using some purposes as activities. The purposes considered as activities are still to some extent possible to assign to a physical environment. The purposes are “Experiencing natural environment” and “Experiencing pleasure and entertainment”. These purposes are, however, more likely to have a location outside cities and in urban areas respectively. Thus, it is possible to regard them as activities.
Table 1. Purposes and activities put together into 10 activities.
No. / Activity / Freq. / Valid percent
1 / Visit friends and relatives / 43380 / 43,9
1 / Great Events / 1875 / 1,9
1 / Shopping / 898 / 0,9
1 / Private matters/look for job / 778 / 0,8
1 / Community with others / 2586 / 2,6
1 / Course and meeting – as leisure assignment / 1493 / 1,5
1 / Visit attraction / 776 / 0,8
1 / Visit parks / 593 / 0,6
1 / Education/studies / 467 / 0,5
1 / Big City Environment / 441 / 0,4
1 / Health service / 236 / 0,2
1 / See the country / 875 / 0,9
1 / Cultural activity / 411 / 0,4
1 / Total Social bonds activities (SBA) / 54809 / 55,5
2 / Visit second home / 13608 / 13,8
2 / Peace and quiet / 7381 / 7,5
2 / Total Visit second home (VSH) / 21214 / 21,3
3 / Sun & bath / 2613 / 2,6
4 / Skiing / 2762 / 2,8
5 / Fishing / 1056 / 1,1
5 / Hunting / 323 / 0,3
5 / Total Fish/Hunt / 1379 / 1,4
6 / Outdoor life / 1993 / 2,0
7 / Natural environment / 1307 / 1,3
7 / Cultural environment / 277 / 0,3
7 / Adventure/excitement / 188 / 0,2
7 / Stimulation / 616 / 0,6
7 / Total nature/culture / 1772 / 2,4
8 / Pleasure/Entertainment / 5989 / 6,1
8 / Get some action / 307 / 0,3
8 / School trip / 253 / 0,3
8 / Total Pleasure/entertainment / 6549 / 6,7
9 / Sports / 2433 / 2,5
9 / Golf / 168 / 0,2
9 / Total Sports/Golf / 2601 / 2,7
10 / Others / 1523 / 1,5
10 / Other activity / 1097 / 1,1
10 / Total Others / 2620 / 2,6
Sample size
When 10 years of TDB data, from 1989 to 1999, are used, the number of cases is approximately 100 000 for domestic overnight trips. As table 1 shows, the number of cases is nevertheless small for some activities. Sparsely populated regions get few observations because respondents are sampled randomly. Splitting the material into small regions such as municipalities is therefore not advisable for regression analysis, and data from a single year cannot be used because the number of cases would be too low. A remedy is to use larger functional regions for people’s recreation activities and to use variables that are not tied to specific municipalities.
Micro Simulation Models
Microsimulation models (MSM) were used quite early (Orcutt, 1957 in Holm et al, 2002). This simulation methodology implies that all analysis departs from single individuals and not as in many quantitative studies from spatial aggregates. These individuals respond to changes in stimuli that can be changes in the environment or the behaviour of other individuals within the model. One basic argument for a micro simulation, time-geographic approach to social phenomena is that aggregation prior to analysis and modelling distorts not only individual but also aggregate outcomes. MSM can contain both deterministic and stochastic relations. In a stochastic or probabilistic model all individuals of the same type do not respond exactly the same way to the same stimuli and this can be represented by a stochastic function.
MSM has, however, not been used extensively to model tourism. One example is the use of microsimulation to model tourist expenditure (Brouwer, 1997). Microsimulation is mostly used to explore tax and benefit systems. In Sweden SESIM has been used (Ministry of Finance, 2001), and in the US CORSIM (Caldwell, 1996) is used in basic research as well as for policy analysis.
In Sweden there is systematic information about social behaviour in longitudinal databases produced by Statistics Sweden. These data are used for an MSM of the Swedish population called SVERIGE at the Spatial Modelling Centre in Kiruna (SMC). The data are used to estimate the probabilities of certain events occurring, stratified by individuals with certain sets of characteristics. These probabilities are then employed to simulate the lives of all individuals living in Sweden. The individuals in the model are born, enter school, move from home, obtain work, build families with other individuals in the model, migrate and so on. This is useful for predicting the population in single municipalities or regions while taking the population changes in the rest of the country into account (Holm et al, 2002). One idea behind SVERIGE is to create an artificial laboratory that enables systematic evaluation of, for example, changed structural conditions or various policy options before they are implemented in reality.
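The general idea of simulating life events from stratified probabilities can be illustrated with a very small Python sketch. This is not SVERIGE's implementation: the single event (leaving the parental home), the age strata and every probability value are hypothetical placeholders chosen only to show the mechanism.

```python
import random

# Hypothetical yearly probability of leaving the parental home, by age group.
# Placeholder values for illustration only, not estimates from SVERIGE.
P_LEAVE_HOME = {"15-19": 0.10, "20-24": 0.35, "25-29": 0.50}

def age_group(age):
    """Return the stratum label for an age, or None outside 15-29."""
    if 15 <= age <= 19:
        return "15-19"
    if 20 <= age <= 24:
        return "20-24"
    if 25 <= age <= 29:
        return "25-29"
    return None

def simulate_year(person, rng):
    """Age one person by a year and draw the event stochastically."""
    person = dict(person)
    person["age"] += 1
    p = P_LEAVE_HOME.get(age_group(person["age"]), 0.0)
    if person["at_home"] and rng.random() < p:
        person["at_home"] = False
    return person

rng = random.Random(42)  # fixed seed so the run is reproducible
population = [{"age": 17, "at_home": True} for _ in range(1000)]
for _ in range(10):
    population = [simulate_year(p, rng) for p in population]

share_left = sum(not p["at_home"] for p in population) / len(population)
```

Individuals in the same stratum share a probability but not an outcome, which is the stochastic behaviour described for MSM above.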
Calculating number of trips, choice of activities and choice of destination
The strategy is first to calculate the probability for the number of trips for each individual. After that, the probabilities for the choice of activity for each trip are calculated. Finally, the individuals are distributed across destinations according to the chosen activity and place of residence.
Variables
The two key factors that make tourism possible are access to money and leisure time (Graham, 2001). There are also gender and social constraints that vary over an individual’s life trajectory (Hall & Page, 2002). An individual’s current life-cycle stage can be either a barrier or a springboard for participating in tourism (Shaw & Williams, 1994). A family with children and only one person employed is likely to have less time and less money for travel than a single person with a high income. The variables in the data that refer to the individual life cycle are age, income, number of children, marital status and gender.
It is argued that the motivation to travel is to find something that differs from ordinary life (Hall & Page, 2002) or to have more of something, for example better skiing or a warmer climate (Jansson, 1996). A person who lives near a ski resort will probably not make overnight trips to ski unless the destination offers much higher quality, perhaps abroad. Someone living in a cold climate is attracted to spending his or her vacation in a warmer climate. This indicates that geography and place of origin matter.
The resolution of the data does not allow for regional models in which individuals from each municipality would have their own model for each activity. To take this into account, the main region (Riks-region) is used as a regional variable. Sweden has 7 Riks-regions, which divide the population into 7 geographical regions. The database does not originally contain local labour market regions (LA-regions), so municipalities and destinations have been converted into LA-regions to match SVERIGE.
The variables used in the regression analyses are chosen for their significance for an individual’s access to money and leisure time. The variables used for calculating how many trips each individual is likely to make are:
Age group – divided into 5 groups
Income – household income
Gender
Education – university degree or not
Whether the individual has children at home or not
Whether the individual is single or not
Size of place of residence
Statistical method
A Poisson regression analysis was made to estimate a model for the number of trips that each individual makes during a month. Poisson regression is used because the distribution of trips approximately follows the Poisson distribution, and it has been widely used to model the number of recreational trips (Ozuna and Gomaz, 1995; Lundevaller, 2002). Most travellers make 1 trip per month, and 98 % of all cases are covered by a maximum of 5 trips per month. With trips capped at 5 per month, an individual can make at most 5 × 12 = 60 activity choices per year. Some individuals will of course not travel. The parameter estimates for the regression are shown in table 2.
Table 2. Parameter estimates from Poisson regression for number of trips
Coefficients / Estimate / Signif.
(Intercept) / -1.494 / 0.00000
Agegroup 45-59 / 0.1801 / 0.00000
Agegroup 30-44 / 0.1078 / 0.00000
Agegroup 15-29 / 0.3303 / 0.00000
Agegroup 0-14 / 0.5052 / 0.00000
Income – low / 0.00001183 / 0.99910
Income – medium / 0.09618 / 0.00000
Gender 1 = man / 0.05654 / 0.00000
University degree / 0.3562 / 0.00000
Have children / -0.1426 / 0.00000
Single / 0.02437 / 0.00864
Big city (Stockholm, Gothenburg, Malmö) / 0.3702 / 0.00000
City > 50 000 / 0.4148 / 0.00000
Town 5 – 50 000 / 0.2508 / 0.00000
Village 500 – 5 000 / 0.1256 / 0.00001
Since the individual chooses among ten activities, multinomial logistic regression is an appropriate technique for modelling these probabilities (Lee et al., 2002). It allows the simulated individuals to choose between the 10 activities with all alternatives taken into account when the choice is made. The variables used for the analysis of activity choice are:
Age group – divided into 5 groups
Income – household income
Gender
Education – university degree or not
Whether the individual has children at home or not
Whether the individual is single or not
Main residential region
Table 3 shows the parameter estimates from the calculation for 4 of the 10 activities.
Table 3. Parameter estimates from the multinomial logit regression. Results from 4 out of 10 dependent variables.
SBA / B / Sig. / Exp(B) / VSH / B / Sig. / Exp(B)
Intercept / 1,6170 / 0,0409 / / Intercept / -0,0850 / 0,9201 /
Agegroup 30-44 / 0,0455 / 0,5926 / 1,0466 / Agegroup 30-44 / 0,6947 / 0,0000 / 2,0031
Agegroup 45-59 / -0,1311 / 0,0994 / 0,8771 / Agegroup 45-59 / 0,9025 / 0,0000 / 2,4656
Agegroup 60-74 / 0,0230 / 0,7900 / 1,0233 / Agegroup 60-74 / 0,7835 / 0,0000 / 2,1891
University degree / 0,3348 / 0,0000 / 1,3976 / University degree / 0,3115 / 0,0000 / 1,3654
Income - medium / 0,0320 / 0,6573 / 1,0325 / Income - medium / 0,2513 / 0,0010 / 1,2857
Income - high / -0,4472 / 0,0000 / 0,6394 / Income - high / -0,1078 / 0,3088 / 0,8978
Single / -0,0916 / 0,2038 / 0,9125 / Single / 0,4989 / 0,0000 / 1,6470
Children / -0,3270 / 0,0000 / 0,7211 / Children / -0,2790 / 0,0007 / 0,7566
Gender 1=male / -0,3572 / 0,0000 / 0,6996 / Gender 1=male / -0,1775 / 0,0027 / 0,8374
City > 50 000 / -0,0096 / 0,8307 / 0,9905 / City > 50 000 / -0,0318 / 0,4990 / 0,9687
Town 5 – 50 000 / -0,1129 / 0,1741 / 0,8933 / Town 5 – 50 000 / -0,2975 / 0,0007 / 0,7427
Village 500 – 5 000 / -0,2107 / 0,0487 / 0,8100 / Village 500 – 5 000 / -0,4079 / 0,0003 / 0,6650
Rural / -0,2165 / 0,1539 / 0,8053 / Rural / -0,9046 / 0,0000 / 0,4047
Riksreg Sthlm / 1,4814 / 0,0602 / 4,3992 / Riksreg Sthlm / 1,1101 / 0,1882 / 3,0346
East mid-Sweden / 1,7408 / 0,0271 / 5,7020 / East mid-Sweden / 0,9376 / 0,2662 / 2,5538
South-east and islands / 1,6473 / 0,0368 / 5,1929 / South-east and islands / 0,9065 / 0,2832 / 2,4757
Southern Sweden / 1,6389 / 0,0377 / 5,1493 / Southern Sweden / 0,8651 / 0,3055 / 2,3752
West Sweden / 1,6283 / 0,0387 / 5,0953 / West Sweden / 0,9209 / 0,2746 / 2,5116
Northern Mid-Sweden / 1,6297 / 0,0406 / 5,1025 / Northern Mid-Sweden / 0,7551 / 0,3755 / 2,1279
Table 3 (cont.)
SUN/BATH / B / Sig. / Exp(B) / Skiing / B / Sig. / Exp(B)
Intercept / -15,2641 / 0,9893 / / Intercept / -1,1750 / 0,3433 /
Agegroup 30-44 / 0,1603 / 0,1728 / 1,1739 / Agegroup 30-44 / 0,2420 / 0,0317 / 1,2738
Agegroup 45-59 / -0,2866 / 0,0139 / 0,7508 / Agegroup 45-59 / -0,2686 / 0,0162 / 0,7645
Agegroup 60-74 / -0,9162 / 0,0000 / 0,4001 / Agegroup 60-74 / -1,3129 / 0,0000 / 0,2690
University degree / 0,1729 / 0,0437 / 1,1887 / University degree / 0,3704 / 0,0000 / 1,4483
Income - medium / 0,1892 / 0,0719 / 1,2082 / Income - medium / 0,0957 / 0,3514 / 1,1004
Income - high / -0,6210 / 0,0001 / 0,5374 / Income - high / 0,0134 / 0,9235 / 1,0135
Single / 0,2724 / 0,0130 / 1,3131 / Single / 0,2815 / 0,0069 / 1,3252
Children / 0,3881 / 0,0003 / 1,4741 / Children / -0,0685 / 0,5141 / 0,9338
Gender 1=male / -0,2190 / 0,0077 / 0,8033 / Gender 1=male / -0,0298 / 0,7063 / 0,9706
City > 50 000 / 0,1140 / 0,0984 / 1,1208 / City > 50 000 / 0,0127 / 0,8382 / 1,0128
Town 5 – 50 000 / 0,3707 / 0,0035 / 1,4487 / Town 5 – 50 000 / -0,0367 / 0,7517 / 0,9639
Village 500 – 5 000 / 0,4114 / 0,0088 / 1,5089 / Village 500 – 5 000 / 0,1349 / 0,3748 / 1,1444
Rural / 0,1943 / 0,3886 / 1,2144 / Rural / 0,0137 / 0,9515 / 1,0138
Riksreg Sthlm / 14,5727 / 0,9897 / 2132170,95 / Riksreg Sthlm / 1,2831 / 0,2990 / 3,6078
East mid-Sweden / 14,7267 / 0,9896 / 2487198,96 / East mid-Sweden / 1,0247 / 0,4068 / 2,7864
South-east and islands / 14,9233 / 0,9895 / 3027537,84 / South-east and islands / 0,1510 / 0,9031 / 1,1630
Southern Sweden / 14,5061 / 0,9898 / 1994961,90 / Southern Sweden / 0,2156 / 0,8618 / 1,2406
West Sweden / 14,6686 / 0,9897 / 2346984,33 / West Sweden / 0,7411 / 0,5486 / 2,0983
Northern Mid-Sweden / 14,4361 / 0,9898 / 1860086,97 / Northern Mid-Sweden / 0,6975 / 0,5760 / 2,0088
Calculation
To generate trips and activity choices, the independent variables of each individual were used to calculate individual probabilities for the number of trips and the choice of activity. The number of trips and the choice of activity were then drawn randomly according to the estimated probabilities. The result was then compared with TDB.
Probabilities based on the Poisson regression model for number of trips were calculated using this equation:
$$p(y) = \frac{e^{-f} f^{y}}{y!} \qquad (1)$$
p(y) = probability of y trips
f = an individual’s value based on his/her attributes and the parameter estimates, f = exp(intercept + b*variable)
y = number of trips
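As a worked illustration of equation (1), the following Python sketch computes f and the trip probabilities for one example profile, assuming the standard Poisson form p(y) = e^(-f) f^y / y!. The coefficients are copied from table 2. The profile, the function names and the treatment of the last class as "5 or more trips" are assumptions made for this example.

```python
import math
import random

# Parameter estimates copied from table 2 (only those needed for the profile).
COEF = {
    "intercept": -1.494,
    "age_30_44": 0.1078,
    "income_medium": 0.09618,
    "man": 0.05654,
    "university": 0.3562,
    "children": -0.1426,
    "big_city": 0.3702,
}

def poisson_rate(active):
    """f = exp(intercept + sum of coefficients of the dummies equal to 1)."""
    return math.exp(COEF["intercept"] + sum(COEF[name] for name in active))

def poisson_pmf(y, f):
    """Probability of exactly y trips for rate f."""
    return math.exp(-f) * f ** y / math.factorial(y)

def trip_probabilities(f, max_trips=5):
    """Probabilities for 0..max_trips trips; the last class is 'max_trips or more'."""
    probs = [poisson_pmf(y, f) for y in range(max_trips)]
    probs.append(1.0 - sum(probs))
    return probs

def draw_trips(f, rng, max_trips=5):
    """Draw a monthly number of trips according to the probabilities."""
    probs = trip_probabilities(f, max_trips)
    return rng.choices(range(max_trips + 1), weights=probs, k=1)[0]

# Hypothetical individual: man aged 30-44, medium income, university degree,
# children at home, not single, living in a big city.
profile = ["age_30_44", "income_medium", "man", "university", "children", "big_city"]
f = poisson_rate(profile)
probs = trip_probabilities(f)
trips = draw_trips(f, random.Random(1))
```

For this profile the linear predictor sums to -0.64968, so f is roughly 0.52 expected trips per month and the probability of no trip in a given month is the largest single class.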
The equation for each choice of the 10 activities was put into SPSS to calculate the probabilities and the outcomes for each individual by using this equation:
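$$p_k = \frac{\exp(f_k)}{\sum_{j=1}^{10} \exp(f_j)} \qquad (2)$$
p_k = probability of choosing activity k
f_k = linear predictor (intercept + b*variable) obtained from the parameter estimates of the multinomial logit regression for activity k; for the reference activity, f_k = 0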
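A minimal Python sketch of the choice step follows, assuming the usual multinomial logit form p_k = exp(f_k) / Σ_j exp(f_j) with f_k = 0 for the reference activity. The paper's calculation was done in SPSS; the function name, the shortened activity list and the f_k values below are hypothetical placeholders, not values derived from table 3.

```python
import math
import random

def choice_probabilities(f):
    """Return p_k = exp(f_k) / sum_j exp(f_j) for a dict activity -> f_k."""
    m = max(f.values())  # subtract the maximum for numerical stability
    expf = {k: math.exp(v - m) for k, v in f.items()}
    total = sum(expf.values())
    return {k: e / total for k, e in expf.items()}

# Hypothetical linear predictors for one individual (placeholder values).
# The reference activity has f = 0 by construction.
f_example = {"SBA": 2.8, "VSH": 1.8, "Skiing": 0.3, "Reference": 0.0}

p = choice_probabilities(f_example)

# Draw one activity at random according to the probabilities.
rng = random.Random(7)
activity = rng.choices(list(p), weights=list(p.values()), k=1)[0]
```

Subtracting the maximum before exponentiating does not change the probabilities, but it keeps the calculation stable when estimates are large in absolute value, as some of the Sun/Bath estimates in table 3 are.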
The individuals represent a 3% sample of the Swedish population. To make the manual calculation easier, all individuals who travelled were allowed to make 12 choices, that is, one per month. For individuals who got 2 trips per month, the result from equation 2 was doubled; for individuals with 3 trips it was tripled, and so on up to 5 trips. Individuals with more than 5 trips were included with those with 5 trips because they represented less than 2% of the sample. In this way the number of calculations was reduced compared with repeating the calculations for every individual who got more than one trip. After this, the activity choices made by the individuals were aggregated per LA-region and summarised per activity. In table 6 each activity is summarised for 10 LA-regions.
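The weighting shortcut and the aggregation can be sketched as follows. The individuals, the region labels "LA-1" and "LA-2", the shortened activity list and the probabilities are invented for illustration; only the rules (12 monthly choices, weight equal to trips per month, cap at 5) come from the text.

```python
import random
from collections import Counter

ACTIVITIES = ["SBA", "VSH", "Skiing"]  # shortened list for illustration

# Invented individuals: LA-region, trips per month, activity probabilities.
individuals = [
    {"la_region": "LA-1", "trips": 1, "p": [0.6, 0.3, 0.1]},
    {"la_region": "LA-1", "trips": 3, "p": [0.2, 0.2, 0.6]},
    {"la_region": "LA-2", "trips": 0, "p": [0.5, 0.4, 0.1]},
    {"la_region": "LA-2", "trips": 7, "p": [0.7, 0.2, 0.1]},
]

def aggregate(individuals, rng, months=12, cap=5):
    """Sum weighted activity choices per (LA-region, activity)."""
    totals = Counter()
    for person in individuals:
        weight = min(person["trips"], cap)  # more than 5 trips counts as 5
        if weight == 0:
            continue  # non-travellers make no choices
        # One choice per month, each counted 'weight' times.
        choices = rng.choices(ACTIVITIES, weights=person["p"], k=months)
        for activity in choices:
            totals[(person["la_region"], activity)] += weight
    return totals

totals = aggregate(individuals, random.Random(3))
```

Counting one draw several times gives the same expected totals as drawing once per trip, but with more random variation for frequent travellers; that is the price of the reduced number of calculations.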
Place of origin and choice of destination
When people move or choose a place for a vacation, the distance to and the attraction of the destination matter. A spatial interaction model, or gravity model, can describe this flow. The gravity model is a well-known structural forecasting model in which population is often used as a mass term or attraction (Smith, 2000). Interaction models can be used to model flows between different locations (Wilson, 2000). The number of trips to a destination will normally increase the nearer it is to the place of origin and the higher its attraction or gravity. In the present case, the relative number of nights spent in a region by individuals performing a specific activity represents the attraction (A) at the destination (j). This can be obtained from TDB. In a production-constrained spatial interaction model the sum of the predicted outflows from any origin equals the known total outflow from that origin (Fotheringham, 2000). The proportion of travel from an origin i to a destination j for individuals who have chosen the activity (k) can thus be calculated like this: