
HETUS Post-fieldwork Pilot Project (HETUS-PPP)

Final Report

1 February 2003

Jonathan Gershuny and Kimberly Fisher

on behalf of the HETUS Pilot Team

Executive Summary

The key points of this report are:

  • The successful completion of the pilot project, with work undertaken on all work packages; the project group having…
  • …arrived at agreement on the specification of basic time use tables, reached after extensive and extended debate among different national groups; twelve countries have now delivered preliminary tables according to this format;
  • …produced a draft design for a harmonised data file which has met with broad agreement across the various participating countries;
  • …drafted a specification for, and produced a prototype version of, a software tool allowing remote-access production of ad hoc tables from the harmonised data set;
  • …and achieved broad agreement on a set of proposals for the next (main) stage of the HETUS post-fieldwork harmonisation work.

Outline of this report.

This report summarises the development of the HETUS Post-fieldwork Pilot Project. It contains two sections:

1 General preparatory considerations

  • Conceptualisations of comparative files
  • Systematic comparisons of contributing surveys

2 Deliverables under the HETUS pilot

(1) establishing participation in the study

(2) specification and production of summary tables for report

(3) specification of front end and (4) trial implementation

(5) specification of harmonised file and (6) trial implementation

(6) additional deliverables: variable concordance, and recommendations.

Detailed proposals for the main stage of the HETUS programme, in effect the third section of this report, are contained in a separate document.

1. General Preparatory Considerations: foundations for the HETUS programme

Conceptual Approaches to Constructing Cross-national Comparative Files

We consider that the criteria for considering variables from different survey sources as “identical” for the purposes of harmonised database construction should be:

  • The semantic content of the questions should be equivalent, operationalised as follows: questions from two different national studies are equivalent if the question in one national language could be considered a reasonable translation of the question in the other national language; and
  • The classification of the answers should similarly be equivalent.

There may be differences in respondent “routing” through questionnaires, which might mean that somewhat different parts of the sample are excluded from the question in different surveys. Qualifying variables should nevertheless still be treated as equivalent in this sense, but the difference in sample coverage should be both (i) noted in a comment field in the metadata, and (ii) indicated by a special “missing data” code within the dataset itself.
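The distinction between a respondent who was never asked a question and one who was asked but gave no answer can be illustrated with a minimal sketch. The code values, the function name and the example variable (weekly working hours, asked only of the employed) are assumptions made for illustration; the report does not specify them.

```python
# Hypothetical codes: the report calls for a special "missing data" code
# but does not say which values should be used.
NOT_ASKED = -7   # respondent was routed past the question in this survey
NO_ANSWER = -9   # respondent was asked but gave no answer


def harmonise_answer(raw, was_routed_to_question):
    """Return the harmonised value for one respondent and one variable."""
    if not was_routed_to_question:
        return NOT_ASKED
    if raw is None:
        return NO_ANSWER
    return raw


# Illustrative survey in which only employed respondents are asked their hours.
respondents = [
    {"id": 1, "employed": True, "hours": 38},
    {"id": 2, "employed": False, "hours": None},  # routed past the question
    {"id": 3, "employed": True, "hours": None},   # asked, no answer given
]

harmonised = [harmonise_answer(r["hours"], r["employed"]) for r in respondents]
print(harmonised)
```

Keeping the two codes separate lets an analyst see that a gap in one country reflects questionnaire design rather than non-response.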

For the purposes of the current exercise there are then three distinct potential types of non-identity or “difference”:

1 A set of substantially different questions, which can nevertheless be used to construct a derived variable which is identical in the above sense to a question in the HETUS instrument.

2 A set of substantially different questions, among a group of national surveys, which may be used to construct a variable which is not included in this form in the HETUS instrument.

3 Questions with no other national comparators.

The outputs from this pilot study, and from the HETUS data harmonisation project as a whole, all, in one way or another, depend on the construction of a single “rectangular” data table where the rows are individual respondents, and where the columns are variables for each country. There are two different ways of producing this sort of table, a minimal, and a maximal.

A Compact Single Harmonised Rectangular (CSHR) File

This file would include all the identical variables and the derived variables related to the “type 1” differences specified above, plus the “Highest Common Subset of categories included in all surveys” (HCS) of those “type 2”-difference variables which include most or all of the contributing countries. This file would be the source for the standardised tables, constitute the dataset on which the NESSTAR or similar table generation software would operate, and form the basis for the partial general access micro-datafile. This file would have relatively few missing values at the national level.
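One reading of the “Highest Common Subset” is the set of answer categories that every contributing survey can supply once national codes have been mapped to common labels. The sketch below follows that reading; the survey names, the category labels and the residual "other" category are all hypothetical.

```python
# Hypothetical answer categories per survey, already mapped to common labels.
national_categories = {
    "survey_a": {"married", "cohabiting", "single", "widowed"},
    "survey_b": {"married", "single", "widowed"},
    "survey_c": {"married", "cohabiting", "single"},
}

# Categories present in every survey.
hcs = set.intersection(*national_categories.values())


def recode_to_hcs(category, common):
    """Keep a category if all surveys share it, else use a residual label."""
    return category if category in common else "other"


print(sorted(hcs))
print(recode_to_hcs("cohabiting", hcs))
```

Collapsing the unshared categories into a residual label is only one possible treatment; merging them into a broader shared category would be another.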

A Larger Rectangular (LR) File with some Incompletely Harmonised Variables

This sort of file would include all of the variables in the CSHR file, and in addition some extra variables which are available for only a subset of the countries (ie derived from “type 2”-difference variables covering smaller numbers of countries, thus sacrificing some degree of international comparison, but allowing for more detailed analysis on specific topics). It would have more missing values than a CSHR file, and indeed might at the limit include some non-harmonised national variables (ie variables set to missing for all-but-one country).
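The difference between the two file types comes down to how many countries must supply a variable before it is included. A minimal sketch follows, assuming a hypothetical availability table and an arbitrary 75% threshold standing in for “most or all” of the contributing countries.

```python
# Hypothetical availability table: variable -> countries able to supply it.
availability = {
    "sex": {"BE", "DK", "FI", "UK"},
    "age_group": {"BE", "DK", "FI", "UK"},
    "commute_mode": {"DK", "UK"},
    "national_only_item": {"FI"},
}
countries = {"BE", "DK", "FI", "UK"}
MIN_SHARE = 0.75  # assumed threshold, not taken from the report


def split_variables(availability, countries, min_share):
    """Return (variables for the CSHR file, extra variables for the LR file)."""
    cshr, lr_only = [], []
    for var, have in sorted(availability.items()):
        share = len(have & countries) / len(countries)
        (cshr if share >= min_share else lr_only).append(var)
    return cshr, lr_only


cshr_vars, lr_extra_vars = split_variables(availability, countries, MIN_SHARE)
print(cshr_vars)
print(lr_extra_vars)
```

The LR file would then hold the CSHR variables plus the extras, with the extras set to missing for the countries that cannot supply them.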

The latter category may play some part in the HETUS programme in the longer term. But EUROSTAT has limited resources available for the collection, maintenance and support of the harmonised data files, and in consequence, we propose for the moment:

  • the construction of a single harmonised rectangular CSHR file consisting mainly of identical and “type 1” non-identical variables; and
  • the compilation of a variable concordance which allows the expansion of the harmonised dataset to include further “type 2” harmonised non-identical variables in an LR file if resources become available at a future point.

The maintenance of the concordance is of particular importance in relation to future applications of the time-use data that are currently not foreseen. Thus—to invent an example—should a new policy interest emerge in the duration of individuals’ exposure to particular environmental circumstances (such as potential sunlight exposure, or air pollution), the concordance could be used to specify additional relevant type-2 questionnaire variables for addition to the harmonised dataset.

A variable concordance table based on the questionnaires implemented in the HETUS studies

Integral to each of three stages in the Pilot Project is a consideration of the degree of commonality across the variables in the various national studies, leading to the production of concordance tables. Indeed, the establishment of a systematic and comprehensive comparison among the contributing surveys lies at the heart of the harmonisation process as a whole. We had originally proposed to complete a concordance for 3 or 4 countries as part of the pilot project. On further consideration, we concluded that the essential task of identifying the list of common variables for each stage of the Pilot Project involved the provision of as complete a common variable list as is possible: so this deliverable currently includes comprehensive coverage of fifteen countries.

The concordance includes two distinct elements:

(i) spreadsheets for each participating country, setting out the complete national set of variables in a standardised format ready for input as metadata for the national studies, and

(ii) a single very large spreadsheet, the “concordance file”, bringing together variable lists for all the national studies into a single spreadsheet, grouping them in relation to the variables in the recommended HETUS instrument and providing means for sorting them by topic. This file also notes the degree of compatibility between the variables from each country, and provides an assessment of whether or not harmonisation is possible with recoding.

The concordance file (constructed as an Excel spreadsheet) shows which HETUS variables and which non-HETUS variables were collected in each participating country. The tables allow the comparison of the composition of variables, thus highlighting where functional equivalents can be created when countries did not collect the HETUS recommended variables (as well as where variables not included in the HETUS recommendations can also be constructed for a number of countries).
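The kind of query the concordance supports can be sketched as follows. The row layout, the variable names and the status labels are hypothetical and do not reproduce the actual spreadsheet columns.

```python
from collections import defaultdict

# Hypothetical concordance rows:
# (HETUS variable, country, national variable name, compatibility status)
concordance = [
    ("hh_size", "Finland", "HHSIZE", "identical"),
    ("hh_size", "France", "NBPERS", "recode"),
    ("hh_size", "Norway", None, "not collected"),
    ("educ_level", "Finland", "EDUC", "recode"),
    ("educ_level", "France", "DIPLOME", "not harmonisable"),
]

# Statuses treated as usable in a harmonised file (an assumption).
USABLE = {"identical", "recode"}


def coverage(rows):
    """Map each HETUS variable to the countries that can supply it."""
    by_var = defaultdict(list)
    for hetus_var, country, _national_var, status in rows:
        if status in USABLE:
            by_var[hetus_var].append(country)
    return dict(by_var)


usable_by_variable = coverage(concordance)
print(usable_by_variable)
```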

This process has revealed considerable differences in the collection of background information. Also, while most studies generally followed the HETUS recommendations for time use activity coding at the 1 digit and 2 digit levels, variations emerge at the three-digit level. While the scale of differences is relatively small comparing most combinations of two studies, the differences are considerable taken at the level of all participating studies.

Questionnaires from the HETUS studies conducted in Belgium, Bulgaria, Denmark, Estonia, Finland, France, Germany, Hungary, Norway, Poland, Portugal, Slovenia, Spain, Sweden, and the United Kingdom are included in this concordance file. Questionnaires from the other countries cannot be added within the time frame of this project. The concordance table as it presently exists nevertheless provides a basis for the expansion of the prototype data files if future resources permit further development of the project.

Variations in Diary Format

One further question cannot be answered by the concordance file. This is the extent to which different diary formats will affect the entries made in the diaries (see Appendix 1 for the detail of the variations in diary format). There are a number of different departures from the recommended form:

  • While most countries adhered to the 10 minute time slot recommendation, at least two (Hungary, where people recorded starting and stopping times, and The Netherlands, where ¼ hour time slots were used) did not.
  • While the column for simultaneous activities is situated next to and following the primary activity column in most diaries, in at least 6 countries (Denmark, Germany, Hungary, Norway, Poland, and Sweden), the secondary activity column is elsewhere.
  • Some diaries infer location of activities and mode of transport from entries made by diarists in the primary and secondary activity columns, but in one case (The Netherlands), a separate column was included for location, in 4 cases, a separate column was included for mode of transport (Germany, Norway, Portugal, Sweden), and in 5 cases, a column for both location and mode of transport was included (Denmark, France, Hungary, Italy, the United Kingdom).
  • Some separate columns ask diarists to identify the location of activities in their own words, while others use tick-boxes for pre-coded options. More differences emerge in the “who else is present” column.
  • The studies use different cut-off ages for the “with young children” column. Some studies asked diarists to write in the names of people present, and different numbers of columns appear in different diary formats.
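Diaries recorded as starting and stopping times would need to be placed on the recommended 10-minute grid before they could be compared with slot-based diaries. The sketch below shows one possible rule, in which each slot takes the activity that occupies most of its minutes. The rule, the function name and the activity labels are assumptions, not a procedure described in this report.

```python
SLOT = 10  # minutes per diary slot, following the HETUS recommendation


def episodes_to_slots(episodes, day_start=0, day_end=1440):
    """Convert (start_min, end_min, activity) episodes to 10-minute slots.

    Each slot is assigned the activity covering most of its minutes.
    Ties go to the episode listed first; uncovered slots become None.
    """
    slots = []
    for slot_start in range(day_start, day_end, SLOT):
        slot_end = slot_start + SLOT
        minutes = {}
        for start, end, activity in episodes:
            overlap = min(end, slot_end) - max(start, slot_start)
            if overlap > 0:
                minutes[activity] = minutes.get(activity, 0) + overlap
        slots.append(max(minutes, key=minutes.get) if minutes else None)
    return slots


# One illustrative hour of a start/stop diary, in minutes from its start.
episodes = [(0, 27, "sleep"), (27, 40, "eating"), (40, 60, "travel")]
hour_slots = episodes_to_slots(episodes, day_start=0, day_end=60)
print(hour_slots)
```

Short episodes can disappear under a majority rule like this one, which is one reason format differences may affect the comparability of results.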

It may well be that some of these format differences will have influenced how diarists completed their diaries, and the HETUS data offer an ideal opportunity to test for potential effects. These questions form the basis for possible future analysis which EUROSTAT may wish to fund before future waves of HETUS data collection are organised.

2. The Six Work Packages
Work Package 1 – Table of HETUS participation

Additional information obtained at HETUS-PPP and IATUR meetings and in subsequent e-mail exchanges has been used to update the table of participation in the HETUS programme.

Participation in the Harmonised European Time Use Survey Project

Updated 19 December 2002

Conducted a Pilot Survey – 20 countries

Albania, Bulgaria, Estonia, Finland, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Macedonia, Poland, Portugal, Romania, Slovenia, Spain, Sweden, Turkey, United Kingdom

Participation in the Main Stage HETUS Survey – 22 countries confirmed

Completed Field Work – 15 countries
Belgium, Denmark, Estonia, Finland, France*, Germany, Hungary, Netherlands*, Norway*, Portugal, Romania, Slovenia, Sweden, United Kingdom, Bulgaria

In the Field – 2 countries
Italy, Spain

Fieldwork to Transpire at a Future Date – 7 countries
Macedonia, Poland, Slovak Republic, Switzerland, Turkey, Latvia, Lithuania

Not Participating – 5 countries
Albania, Austria, Ireland, Greece, Luxembourg

* Did not generally follow the guidelines but are cloning data to the HETUS format

This deliverable was completed and delivered to EUROSTAT, first in June 2001, and subsequently updates have been delivered to EUROSTAT as they have become available. Information about changes in the personnel working on the time use data collection in various countries has also been regularly passed on to EUROSTAT.

Work Package 2 - Specifications of harmonised tabular reports

Development of the basic tables has proved something of a challenge. An initial set of proposals was sent to participants in the HETUS project in December 2001, and extensive feedback was received. A revised proposal was developed and distributed for the meeting in London at the UK Office for National Statistics on 5-6 March 2002. Following this meeting, a further revised version of the proposals was sent out to members of the HETUS consortium on 7 March 2002. Successive rounds of feedback and interactive comment took place during the following weeks.[1] Once agreement was reached, Karen Winquist at EUROSTAT completely revised the layout and the instructions for the basic tables.

A number of factors must be considered for the design of the basic tables.

1) Data for these tables will be provided to EUROSTAT directly by the data collection teams. This phase is thus a decentralised form of data collection, though the resulting tables will be displayed centrally on the EUROSTAT web site. This situation raises the need for simple tables which can be produced with minimal effort.

  1. As HETUS, unlike some other projects such as EU SILC, does not have the advantage of legal mandates requiring member states to provide the statistics, the tables must be sufficiently basic so that member states will be willing to complete the tables;
  2. Official statistical offices are not always involved in the national time use studies.[2] Private and academic research teams which collect national time use data have different agendas from national statistical agencies, which means that harmonisation issues must be considered. Private research and academic research teams also often have more severe funding and staffing limitations than national statistical offices, meaning that cumbersome tables would be more of a burden for such agencies;
  3. HETUS has succeeded in obtaining the participation of states seeking future membership of the EU, many of which have limited budgets and staffs, and which also have no legal obligation to provide data. Again, the tables should not prove too taxing to the data participants.

2) The tables should allow a maximum number of countries to contribute data. As more countries contribute data, the utility of the tables for various interests expands. Moreover, as more countries which participated in HETUS contribute data to the table, pressure increases for the remaining countries to also contribute data.

3) Basic tables also leave open the possibility that countries outside the European Union which did not participate in the HETUS project can still contribute comparable data with minimal effort, which expands the uses for the tables.

4) The tables need to be of a manageable size or they will not often be used. Some users will want very basic information and will not want to start by confronting huge tables, or tables whose contents they can manipulate. These tables accompany an intelligent front-end on-line data facility, so more detailed information will be available to users after a few key strokes. These printed tables do not need to replicate possible uses of the on-line tables. They only need to give general publicity for the project.

5) Simple basic static tables allow for timely release of some data and a rapid dissemination of basic results once data cleaning is completed.

Against these needs for simplicity, there are also particular needs for basic information on some special topics. Participants at the London meeting recognised three topics as holding particular salience at this time: 1) physical activity; 2) child and adult care; and 3) social time. Dealing with these topics requires more sophisticated calculations. Physically active time and social time cut across many of the basic categories in the HETUS coding scheme. Calculating childcare time requires combinations of transportation purpose calculations, the “with whom” variable, and secondary activity information.

The work required to create the basic tables is straightforward and not time consuming. Consequently, the basic tables can be produced by the agencies which collect the time use data and sent to EUROSTAT in the period following data cleaning. The basic tables allow for rapid dissemination of basic results on the EUROSTAT web site. These tables are sufficiently straightforward that it should not be difficult to persuade data collectors to produce them. The recommended tables and the instructions for creating them appear in Appendix 2.
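To give a sense of the scale of the work involved, a basic table of mean minutes per day by sex and main activity group might be computed along the following lines. The data layout is hypothetical and the calculation is unweighted; the actual instructions are those in Appendix 2, and a production table would normally apply survey weights.

```python
from collections import defaultdict

SLOT_MINUTES = 10


def mean_minutes_by_activity(diaries):
    """Return {(sex, activity_code): mean minutes per diary day}, unweighted."""
    totals = defaultdict(float)
    diary_counts = defaultdict(int)
    for diary in diaries:
        diary_counts[diary["sex"]] += 1
        for code in diary["slots"]:
            totals[(diary["sex"], code)] += SLOT_MINUTES
    return {key: totals[key] / diary_counts[key[0]] for key in totals}


# Three tiny illustrative diaries of three slots each, with 1-digit codes.
diaries = [
    {"sex": "F", "slots": ["0", "0", "3"]},
    {"sex": "F", "slots": ["0", "3", "3"]},
    {"sex": "M", "slots": ["0", "1", "1"]},
]

table = mean_minutes_by_activity(diaries)
print(table)
```

Dividing by the number of diaries for each sex, rather than by the number of diaries that report the activity, gives a mean over all respondents, including those who did not do the activity on the diary day.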

The special topics tables require considerably more work which national statistical agencies are less likely to have time and resources to complete. As the same basic procedures are needed to produce these tables from each of the data sets, it may well be both more efficient and cost effective for these tables to be produced in a central location from the original data files. The currently proposed recommendations for these files appear in Appendix 3.