URGENT Data Management and Quality Assurance Scoping Study


Urban Regeneration and the Environment Programme

Data Management and Quality Assurance

Scoping Study

Summary Report

Timothy J. Moffat

Environmental Information Centre,

Institute of Terrestrial Ecology

Richard P. Shaw

British Geological Survey

C. Isabella Tindall

Institute of Hydrology

Simon R. Williams

British Atmospheric Data Centre,

Rutherford Appleton Laboratory

Robert C. Jones

British Geological Survey

September 1998

URGENT Data Management and Quality Assurance Scoping Study - Summary

Table of Contents

1. Introduction

1.1 Background

1.2 Data Management and Quality Assurance Committee

1.3 Scoping Study

2. End User Requirements

2.1 Categories

2.1.1 Academia

2.1.2 Government

2.1.3 Local Authorities

2.1.4 Industry

2.2 Information Needs

2.3 Information Availability

2.4 Data Handling

3. Data Management

3.1 Issues

3.1.1 Overview

3.1.2 Quality Standards

3.1.3 Data Model and Architecture

3.1.4 Documentation and Metadata

3.2 Data Access

3.3 Data Dissemination

3.3.1 Awareness and Publicity

3.3.2 Catalogues

3.3.3 WWW and CD-ROM Technology

3.4 Intellectual Property Rights and Copyright

3.4.1 Intellectual Property Rights

3.4.2 Copyright

3.4.3 IPR and the Internet

3.5 Liability

3.6 Commercial Confidentiality

4. References

1.  Introduction

1.1  Background

The Natural Environment Research Council (NERC) set up the URGENT (Urban Regeneration and the Environment) Programme in 1996 to help to meet the challenge of cleaning up the legacy of past contamination, to establish a sustainable regime which will avoid repetition of past mistakes, and to permit perennial reshaping of the structure and use of the urban environment.

1.2  Data Management and Quality Assurance Committee

It is NERC policy that all new research programmes should make central arrangements for the provision, management and dissemination of data. In recognition of the importance of these data issues, the URGENT Steering Group set up a working group on data and quality assurance, the Data Management and Quality Assurance Committee (DMQA Committee).

1.3  Scoping Study

At the preliminary meeting of the DMQA Committee, it was agreed that a scoping study should be undertaken to establish the data requirements of the project participants and the user community of the Programme, partly as preparation for the development of a work programme for the provision, management and dissemination of data in support of the programme. This document is a summary of the findings of the scoping study.

2.  End User Requirements

To assess the needs of data users, both within the URGENT Programme and among potential end users of data and results from the programme, a questionnaire was circulated to over 300 individuals and organisations and to the round-one URGENT Programme PIs. About 38% of recipients responded; their replies are summarised below.

2.1  Categories

The potential end users of data and knowledge derived from the URGENT Programme have been divided into four broad categories. A brief definition of these categories is given below with an outline of the types of organisation that fall into each and their functions.

2.1.1  Academia

Academic research is carried out in universities and in some statutory bodies. Very little research into urban regeneration is being carried out by commercial organisations, although they may commission an academic body to carry out research on their behalf. In general, university academics require access to a broad range of data to inform their research. They tend to gather their own data, but also use data from systematic archives such as those of the Meteorological Office and other statutory bodies.

2.1.2  Government

Government is taken to include national regulatory organisations and development agencies as well as central government.

Regulatory bodies, such as the Environment Agency, combine the roles of researcher (through direct or sponsored research), adviser on legislation and monitor of its observance. Their requirement for information is very broad. They require up-to-date information on scientific thinking and some systematic data from other disciplines in order to interpret the data that they either collect or receive in discharge of their statutory duties.

Development agencies have been set up in many areas to increase economic activity by regenerating derelict urban areas. Frequently, external consultants conduct detailed assessments of development-agency properties.

It is clear that most Government activity related to urban regeneration is at the level of policy generation. Central Government, in Whitehall and Edinburgh, relies on the regulatory agencies to provide it with interpreted data and it is not generally interested in the raw data. An exception is the annual Vacant and Derelict Land Survey in Scotland, which is conducted for the Scottish Office. Local authorities collect information on all sites in their areas, and the Scottish Office carries out collation and interpretation of their returns.

2.1.3  Local Authorities

Many of the responsibilities associated with the management of urban regeneration are borne by local authorities. There is considerable diversity amongst these authorities. The department with responsibility for urban renewal may be planning in one authority and environmental health in another, depending on the problems associated with the regeneration process in any given area. The technical expertise available varies considerably from council to council. Consultants are usually retained to perform any scientific work that is necessary. Recent reorganisations of the local authority structure have caused significant disruption in some authorities.

2.1.4  Industry

The broad term “industry” is used to describe commercial developers and financiers as well as the commercial consultancies, which are often employed to assess derelict sites.

Commercial developers and financiers are primarily interested in the cost of remediation work and are seldom involved directly with the practicalities of assessment of sites to be regenerated. Consultants are almost always employed. Some indication of the state of land is helpful to developers in valuing land.

Consultants perform two main functions. Firstly, they provide expertise in various areas for those organisations which lack it, for example in the assessment and regeneration of contaminated land. Secondly, they act as a link between the providers of raw data and the end users, interpreting a large body of data as it applies to a specific case, and putting that interpretation into simplified terms which the non-technical client can understand. Consultants perform a role that the academic community is not keen to take on, partly because of the legal implications of selling an interpretation of data. Since consultancies are profit-making organisations, they will always be inclined to minimise their expenditure, including their expenditure on data.

2.2  Information Needs

Specific information needs will be closely related to the nature of end users’ own research, scientific, operational and regulatory interests (i.e. data have to be “fit for purpose”). However, some general conclusions can be drawn from the analysis of the questionnaires:

A broad and diverse range of data is required from sources outside the end user organisation to support work on urban regeneration (Figure 1). The largest proportion of all end users requires land-use and historic land-use information.

All categories of end user require both local/case studies and large/systematic data collections (Figure 2). Local authorities and industry tend to require more site-based data, whereas government and academia require the full range of data types.

Government and local authorities require mainly (or only) interpreted third-party data to support decision-making (Figure 3). Industry and academia require both raw and interpreted data, but, significantly, government does not require raw data.

Academia prefers to gain access to sources of information via electronic media, such as the WWW, CD-ROM and floppy disk (Figure 4). By contrast, end users in government and local authorities still have a strong requirement for hard-copy sources.

The majority of all end users are able to obtain the sorts of data that they need (Figure 5), although there are some data needs that are not currently being met (Figure 6). However, a larger proportion of end users (except for those in the government category) indicated that they did not have any data needs that were not being met.

There is an overwhelming requirement for a directory of Urban Regeneration information and data sources (Figure 7).

With respect to data supplied to a “Data Centre” for use by projects or resulting from an URGENT research project, a large proportion of end users (especially within the government category) will expect unrestricted access to those data (Figure 8). However, for some data sets, restricted access (either commercial-in-confidence or approved access) will be necessary.

Most end users may be willing to pay for the provision of data for/from the URGENT Programme (Figure 9). The largest proportion of “non-payers” falls within academia.

2.3  Information Availability

End users will have their own specific information, some of which may be available to third parties and/or the URGENT Programme. Some general conclusions concerning this information can be drawn from the analysis of the questionnaires:

The majority of end users in every category except industry, and particularly within government, have relevant information that has been published or that they are willing to make available to others, freely or confidentially, through an agreement with or without a licence fee (Figure 10).

As with the information needs (Figure 1), a broad and diverse range of subject areas is available, but it is spread across a minority of end users (Figure 11). Significantly, topography is the one subject area where data are not available from any category of end user.

All types of information ranging from local case studies to maps are available (Figure 12).

For most end users intellectual property rights (IPRs) are retained by the organisation, although some end users, particularly within local authorities, have no policy (Figure 13).

Similarly, for most end users copyright is retained by the organisation, most strictly within government (Figure 14).

2.4  Data Handling

Information and data sets originating from the URGENT Programme will be made as widely available as possible. Some knowledge of end users’ data handling systems and practices is required to help to achieve this. The results of the questionnaire can be summarised as follows:

The majority of end users use personal computers (Figure 15). There is limited usage of UNIX workstations within academia and industry, and of Apple Macintosh computers within industry.

A large proportion of end users in academia, government and local authorities do not use Geographical Information Systems (GIS) for work on urban regeneration (Figure 16). The majority of end users in local authorities use GIS, including desktop mapping tools like ArcView and MapInfo.

Some end users, particularly within academia and industry, are using or considering 3-dimensional computer modelling programs in preference to a GIS for appropriate applications (Figure 17). There is no requirement for, or usage of, such programs within government.

Most end users, particularly within government and local authorities, prefer data in PC-based databases or spreadsheets like Microsoft Access and Excel (Figure 18).

A majority of end users within government, industry and particularly academia routinely use electronic mail (e-mail) for communication (Figure 19). Some end users within government and local authorities never use e-mail.

Most end users within government, industry and particularly academia use the World Wide Web (WWW) for information gathering (Figure 20). More end users in local authorities do not use the WWW than use it.

An overwhelming majority of end users in academia subscribe to an on-line bibliographic database system (Figure 21). In contrast, very few local authorities use such systems. A significant minority of end users in government and industry use them, with a preference for off-line systems.

3.  Data Management

3.1  Issues

3.1.1  Overview

NERC recognises the value of environmental data sets which are unique, often extremely expensive to collect and irreplaceable. To ensure that maximum benefit is derived from data once they are acquired, NERC has developed a data policy, which is set out in the NERC Data Policy Handbook (NERC, 1998). The Data Policy sets out a framework for the handling of data produced both by individual projects and by research programmes. It is therefore central to the management of data produced under the auspices of URGENT.

The Data Policy is formulated to be consistent both with legal frameworks such as the Environmental Information Regulations and with contractual arrangements with other bodies, whereby, for example, NERC holds data on a confidential basis without owning the intellectual property rights.

Key aspects of the policy include:

·  NERC aims to realise the value of its data resource by using it to further scientific understanding, create wealth and improve the quality of life. This can be done in a number of ways including: using data sets within NERC’s own institutes; giving, exchanging or licensing/selling them to other scientific researchers; or licensing/selling them to commercial organisations which will themselves create wealth. Data are thus a potentially tradeable asset.

·  Teams and individuals that have collected data sets are allowed a reasonable period of exclusive use during which they may analyse the data and publish results.

·  Charges for data are dependent on the uses to which the data will be put, with any consequent restrictions being specified in formal licence agreements. Revenue from commercial applications of NERC’s data contributes towards the substantial costs of collection and long-term data management. By contrast, NERC wishes to ensure inexpensive access to its holdings by bona fide academic researchers; i.e. those whose research is solely to advance knowledge and not for commercial gain, and whose results will be published in the open literature. According to circumstances, data are supplied to such users either entirely free of charge, or for a nominal handling fee or at a discounted rate.

The formal policy statement from the handbook is divided into four parts dealing with data acquisition, data management, data use and charging for data.

3.1.2  Quality Standards

Quality standards have been developed for the Environmental Diagnostics Programme and published as a Quality Assurance Guidance Manual (Environmental Diagnostics, 1996). The Manual gives guidance and assists project consortia in installing and operating appropriate quality systems, with the aim of meeting the needs of the Programme and enhancing the general knowledge and skill in QA procedures in the participating institutions.

The Environmental Diagnostics community has accepted the standards. Therefore, the publication of an equivalent manual followed up by workshops and/or visits to project consortia would be sensible measures for maintaining quality standards within the URGENT research community.