EUROPEAN COMMISSION
Eurostat

Sixth Meeting of the Steering Group
on EU statistical co-operation
with the New Independent States and Mongolia

Handbook on Information Technologies

for a National Statistical Office

Document SG/2000/12

Kiev, 18-22 September 2000

Contents

1. Introduction

2. About the Content of the Handbook

3. The organisation of the National Statistical Office (NSO)

3.1. Background

3.2. Centralised organisation – only a CSO

3.3. Several central statistical institutes

3.4. A regional distributed organisation

4. The Role of the Information Technology (IT)

4.1. Background

4.2. Main tasks of IT in a NSO

4.3. Organisation of the IT work

5. The Main IT Components

5.1. Overall overview

5.2. Hardware

5.2.1. Servers

5.2.2. Workstations

5.2.3. Network

5.2.4. Other equipment

5.3. Software

5.3.1. Overview

5.3.2. Operating systems and server software

5.3.2.1. Servers

5.3.2.2. Workstations

5.3.3. Standard packages

5.3.4. In-house software development

5.4. Selection criteria

5.4.1. Criteria for the selection of network parameters

5.4.2. Selection criteria for network software

5.5. Acquisition

5.5.1. Introduction

5.5.2. General principles

5.5.2.1. Value for money

5.5.2.2. Competition

5.5.2.3. Propriety

5.5.3. Procedures and protocols

5.5.3.1. The procurement cycle

5.5.3.2. Assessing value for money

5.5.3.2.1. Equipment offered

5.5.3.2.2. Immediate cost of acquisition

5.5.3.2.3. Status of companies involved

5.5.3.2.4. Delivery

5.5.3.2.5. Operating costs

5.5.3.2.6. Product support

5.5.3.2.7. Replacement arrangements

5.5.3.2.8. Strategic and structural

5.5.3.3. Encouraging competitiveness of suppliers

5.5.3.3.1. Dialogue

5.5.3.3.2. Innovation

5.5.3.3.3. Criteria

5.5.3.3.4. Standards

5.5.3.3.5. Quality control

5.5.3.3.6. Orderly purchasing

5.5.3.3.7. Feedback

5.5.3.4. Standards

6. Human resources

6.1. General aspects

6.2. Strategies for IT training

6.2.1. Background

6.2.2. Personnel categories and professional roles

6.2.3. Subjects

6.3. Different ways for competence raising

7. The Statistics Production Process (SPP)

7.1. Background

7.2. The Main Components of the SPP (The survey-oriented production system)

7.2.1. Data collection

7.2.2. Data editing

7.2.3. Data storing

7.2.4. Data aggregation

7.2.5. Tabulation

7.2.6. Data analysis

7.2.7. Presentation and dissemination of statistics

7.2.8. Data Archiving

7.2.9. Some conclusions

8. Databases

8.1. Background

8.2. Different types of databases

8.2.1. Content of databases

8.2.2. Position in the production process

8.3. Design of databases

8.4. Implementation and maintaining of databases

8.4.1. Implementation in a client/server environment

8.4.2. A client/server architecture for database applications

8.4.3. Maintaining databases

8.5. The database-oriented production system

8.5.1. Databases

8.5.2. The metadata base

9. Data Warehouse Technologies

9.1. Background

9.2. Architecture of a data warehouse

9.3. Implementation of a data warehouse

10. Presentation and Dissemination

10.1. Basic technologies

10.2. Methods

10.2.1. Tables

10.2.2. Graphics

10.2.3. Geographical information systems (GIS)

10.2.4. Data files, replication, etc.

10.3. The publication process

10.3.1. Technical organisation

10.3.2. Administrative organisation

11. Internet

11.1. Internet – the world-wide network

11.2. Internet and the statistical information system

11.2.1. General architecture

11.2.2. The E-mail server

11.2.3. The web server

11.2.4. Development, implementation, and maintenance of a web site

11.3. Intranet

12. Data security and protection

12.1. Storage reliability for applications and data

12.2. Protection of applications and data from unauthorised access

12.3. Providing security and confidentiality of information in distributed networks of the state statistical system in the TACIS countries

12.3.1. Potential threats to information security in computer networks

12.3.2. Current methods for information protection in computer networks

12.3.3. Access limitation

12.3.4. Access control to hardware

12.3.5. Delimitation and access control to information

12.3.6. Cryptographic transformation of information

12.3.7. Methods and means for protecting information from accidental impacts

12.3.8. Providing reliable protection of information from unauthorised access

12.3.9. Protection from deliberate unauthorised access

13. Some budget questions

Annex 1

Observation register documentation schema

Production documentation schema

Annex 2

The situation of software usage in EU and CIS countries 1998

1. Introduction

This handbook was written as an outcome of the High Level Seminar in Statistics held in Almaty in October 1998. It continues the work done by the Tacis Task Force on Standardisation and Training in Information Technology. The participants at the Task Force meeting in Stockholm in September 1999 discussed the draft of the Handbook. The following NSOs were represented at the Task Force:

Russian Statistical Agency

National Statistical Committee of the Kyrgyz Republic

National Institute of Statistics and Forecasting of the Republic of Turkmenistan

Statistical Office of the Republic of Slovenia

Ministry of Agriculture, Fisheries and Food of the United Kingdom

Statistics Sweden.

The handbook is intended as a guideline for developing implementation strategies for information technology in Tacis countries. It is not a directive for programmers or system analysts. It is addressed to decision-makers at different levels who are developing IT implementation strategies in NSOs, and it aims to give them advice and a source of reference. It should also contribute to harmonising the implementation strategies in the Tacis countries. A more homogeneous IT strategy will help to organise co-operation where that is wanted.

The starting point of the handbook is the state of the art as observed at the time of publication. Because the IT field develops very fast, the latest developments must always be taken into consideration when a concrete IT strategy is developed.

It is intended that this Handbook be updated from time to time. Development in the IT area is so fast that a document of this kind can hardly be used for many years without adaptation to new technical developments.

2. About the Content of the Handbook

The handbook is divided into a number of chapters and related topics. It does not aim to discuss statistical subject-matter aspects except where they are closely related to IT questions. Nevertheless, it is necessary to construct a schema for a statistics production system and to discuss the IT problems in relation to it.

The handbook first considers different organisational schemas of a national statistical office as far as they are relevant for IT, in particular for the communication facilities (Chapter 3). The role and the main components of IT are discussed in Chapters 4 and 5, and human resources in Chapter 6. A description follows of a typical statistics production system and of its main components or phases, which are reflected in the use of IT (Chapter 7). Chapters 8 to 12 are dedicated to some especially important issues: databases, data warehousing, presentation and dissemination, the Internet, and data security. The handbook closes with some budgetary issues in Chapter 13.

In several chapters of the Handbook concrete software packages are mentioned as examples. That does not mean that these packages are the only ones that can be used. The software market is very large and it is impossible to mention all software alternatives. This handbook should not be considered a source of software recommendations.

3. The organisation of the National Statistical Office (NSO)

3.1. Background

In a number of countries statistical offices have a long tradition going back to the 18th century. It was, and still is, society's need to express the social and economic situation in figures that motivates the existence of statistical offices. From the beginning it was not only the static (snapshot) situation that was of interest but also the dynamic changes over time (time series). It was quite obvious that statistics could contribute important information to various kinds of decision-makers in society. In particular, the public sector required statistics for its activities.

It is striking how little the basic representation of statistics has changed in more than two centuries. A table is still the most common way to present statistical information.

In the context of this document, the position that a national statistical office has in society may be important. In particular, legislation may have a more or less strong impact on the organisation of statistics. This will often indirectly influence the implementation and use of information technologies.

Intergovernmental dependencies and division of work in the area of statistics have to be taken into consideration when designing a statistics production system. It is not the task of this handbook to discuss these topics in detail. It should only be mentioned that it is important to consider the legal framework of statistical activities when implementing a statistics production system.

Another important question is the organisation of the national statistics in the country. We can distinguish between some typical organisational schemas:

  • A central national statistical office (CSO) as the only body responsible for official statistics;
  • A central statistical office sharing the statistics production with other governmental institutions, such as ministries;
  • A central statistical office with a number of regional offices on one or more regional levels of the country.

The organisational structure of the statistical office, as regards its regional and governmental distribution, has a strong impact on the data flow and the processing of data.

3.2. Centralised organisation – only a CSO

In this case only a CSO is established, and it is responsible for all statistical processing. From the organisational point of view this is probably the simplest structure. All statistical data is collected by and delivered to the CSO. The CSO processes all data and produces the statistical output.

In a modern society this type of organisation requires well-established communications covering the whole country. In large countries it can be difficult for a centralised organisation to satisfy all potential customers of statistics. One reason can be that the needs for statistics differ between parts of the country. Regional bodies are often better aware of the wishes of regional customers and can handle them better. It may also be necessary to collect data that is of interest only for a particular area of the country. Such data may be of less importance for statistics covering the whole country.

3.3. Several central statistical institutes

In a number of countries the production of statistics is shared by several “central statistical offices” or by statistical departments in different ministries. In such situations the CSO often has the role of co-ordinating the statistics production. Depending on the state organisation, the CSO either also collects data itself or is responsible only for co-ordination tasks. This kind of organisation requires good communication between all offices responsible for official statistics.

3.4. A regional distributed organisation

In many countries we find a regional structure in the organisation of statistics production. It is important to define the role of the regional statistical offices. Typical roles are:

  • The regional offices only collect data and supply them to the CSO. All statistics are produced and disseminated centrally.
  • The regional offices collect and edit the data. Cleaned data is sent to the CSO. At the CSO level additional editing may be needed, but the direct contacts with the respondents are organised by the regional offices.
  • The regional offices also aggregate data. Only aggregated data is delivered to the CSO.
  • The regional offices are responsible – at least partly – for the production of regional statistics.
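
The first three roles differ mainly in what is handed over to the CSO: raw records, edited records, or aggregates only. This can be illustrated with a minimal Python sketch. It is not taken from the handbook: the record layout, the field names, the edit rule (drop missing or negative values) and the function names are all assumptions chosen for illustration. In a real production system, editing would normally involve follow-up with respondents rather than simply discarding records.

```python
from collections import defaultdict

# Hypothetical micro-data collected by regional offices.
# Field names are illustrative only, not a prescribed record layout.
RAW_RECORDS = [
    {"region": "North", "respondent_id": 1, "turnover": 120.0},
    {"region": "North", "respondent_id": 2, "turnover": -5.0},  # fails the edit rule
    {"region": "North", "respondent_id": 3, "turnover": 80.0},
    {"region": "South", "respondent_id": 4, "turnover": 200.0},
    {"region": "South", "respondent_id": 5, "turnover": None},  # missing value
]


def edit(records):
    """Keep only records passing a simple validity check (assumed rule)."""
    return [r for r in records
            if r["turnover"] is not None and r["turnover"] >= 0]


def aggregate(records):
    """Count respondents and sum turnover per region."""
    totals = defaultdict(lambda: {"respondents": 0, "turnover": 0.0})
    for r in records:
        totals[r["region"]]["respondents"] += 1
        totals[r["region"]]["turnover"] += r["turnover"]
    return dict(totals)


def deliver_to_cso(records, role):
    """Return what a regional office would send to the CSO under each role."""
    if role == "collect":
        return records                    # raw data; the CSO edits and aggregates
    if role == "collect_and_edit":
        return edit(records)              # cleaned micro-data
    if role == "aggregate":
        return aggregate(edit(records))   # aggregated data only
    raise ValueError(f"unknown role: {role}")


if __name__ == "__main__":
    for role in ("collect", "collect_and_edit", "aggregate"):
        print(role, deliver_to_cso(RAW_RECORDS, role))
```

The sketch also shows why the choice of role matters for IT planning: the volume and sensitivity of the data crossing the network falls with each step, while the processing capacity and competence needed in the regional offices rises.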

From the IT point of view it is an advantage to have a technically similar environment in the whole statistical organisation. A harmonised solution covering all regional offices will make it much easier to develop an efficient data flow across the whole organisation. It is not only the technical issues that have to be considered but also the management of data collection, the development of applications, the implementation of databases, etc. Good co-ordination of all such activities would improve the overall efficiency and support the further development of the statistical system. If this co-ordination is not established, there is a big risk that the organisation will diverge and end up in a situation that is very difficult to manage.

In particular it is important to co-ordinate the development resources, which in many cases are scarce. That means ensuring that the education and employment of IT staff follow commonly agreed rules.

A regionally distributed statistical organisation depends to a high degree on the technical infrastructure for communication available in the country. You have to consider very carefully the existing possibilities and the communication facilities expected in the future. You have to match your level of ambition to the existing reality. It is better to start with a lower ambition than to try to implement a solution that does not work. On the other hand, you should really try to use all the possibilities offered by the communication infrastructure in your country.

4. The Role of the Information Technology (IT)

4.1. Background

Statistical offices were among the first to use information technologies for their work. For many years national statistical offices were among the most advanced users of IT. Today this situation has changed. A NSO can still be counted as a large user of IT, but meanwhile many new users of IT have appeared, using larger data sets and more advanced features.

Nevertheless, for a NSO information technology is one of the key tools for producing statistics. But it is important to see IT as a support for the production of statistics, and not the reverse, with IT governing statistics production. Therefore it is relevant to describe the basic roles of IT in a NSO before turning to specific technology issues.

4.2. Main tasks of IT in a NSO

The general task, as already mentioned, is to support statistics production. The main types of IT tasks can be grouped as follows:

  • Basic system provision and maintenance
      • Acquisition of hardware and software
      • Implementation of computers and basic software
      • Networks
      • Internet, Intranet
  • Database management
      • Database design
      • Implementation
      • Administration
  • Application development, statistical analysis software
  • Dissemination and publication of statistics
      • Ensure the accessibility to relevant statistics for external users
  • Archiving, backup, security
  • Development of special statistical software
  • Follow up the international development of the IT area
  • Education of users in using IT

4.3. Organisation of the IT work

In statistical offices you will find different ways of organising the IT work. It is difficult to recommend one particular method. The organisational form may depend on the past experience of the office, its administrative rules, etc. Some typical ways of organising IT are:

  • Centralised IT unit responsible for all kinds of IT work. No IT staff is allocated to the subject matter departments. The IT department takes development orders from the subject matter departments.
  • Centralised IT department only for common office-wide IT tasks. Normal application development is allocated to the subject matter departments.
  • A mixture of both.

What are the consequences of the different ways of organising the IT work? This document is not the place for a deep analysis of the problems. Here are some impacts of the organisation types mentioned above.

Centralised organisation

  • The IT competence is concentrated in one department.
  • It is easier to keep the competence up to date.
  • Introduction of standards for the IT work is easier.
  • IT resources can be allocated more flexibly to priority tasks.
  • More administration and less flexibility for unforeseen tasks of subject-matter departments.
  • It is more difficult to build up subject matter knowledge. Good subject matter knowledge must be considered a positive factor for application development.

Decentralised organisation

  • The subject matter departments are responsible for the whole work from the design of a statistical task to the IT implementation.
  • The central IT department is only responsible for basic support, such as running the network, standard software packages, general IT strategies, etc.
  • It is easier for the subject matter departments to plan their work and to adapt to changes.
  • Better subject matter knowledge of the IT experts.
  • Different application development styles in the office.

In some NSOs parts of the IT work, e.g. system maintenance, have been outsourced to private companies. A number of other NSOs share computer resources with other governmental institutes.

Independent of the organisational structure of the IT work, it is necessary to ensure that all IT work follows some general office-wide rules. Such rules are:

  • Strive for a harmonised hardware and software system for the whole office, i.e. avoid different types of hardware, different basic software, etc. Heterogeneous systems tend to generate more problems with the interfaces between them. They are often more expensive to maintain, and they require different kinds of competence in the office. These drawbacks outweigh the expected advantages of using the best software/hardware for every task.
  • Use standards for application development. There should be decisions about which application development tools are to be used in the office. Even within the frame of one software tool it is possible to define rules on how to use it. You gain a higher degree of competence for just that tool and more flexibility in exchanging staff.
  • Try to use high-level development tools and avoid low-level tools such as the programming languages C++, Pascal, etc. Use CASE[1] and design tools for the development of database structures and applications.
  • Use cost-efficient solutions.

5. The Main IT Components

5.1. Overall overview

Information Technology is a general term that covers many different aspects. This chapter briefly discusses the main components that are of significant importance for a successful statistics production system. The statistics production system itself will be discussed later in the handbook. Of course, there is a strong interaction between the production system and the information technology issues. Therefore this chapter about IT components and the later one about the statistics production system should be considered as closely interrelated. There is no doubt that the production system is the primary issue.

Hardware and software are the most natural main IT components that have to be considered. It can be observed that software is playing an increasingly important role, and in many respects it will be more important than hardware. Human IT resources also play an important role, but they will be discussed in a separate chapter.

Because development is so fast in the area of hardware and software in particular, it is difficult or practically impossible to provide technical parameters for equipment or to recommend specific software versions.

5.2. Hardware

Hardware is a substantial part of the IT environment. The choice of hardware will influence all other parts of the IT area. But that does not necessarily mean that you should start with the hardware decision. When defining your IT basis for the statistics production system it is necessary to take a number of elements into consideration. As regards the choice of hardware, several main approaches are in use today, often in combination with each other: