Urban Audit, phase III - Historic Data

Final Report

  1. The project

This project extends the previous two phases with a special focus on historical data, in particular for the year of the previous population census (1992) and for 1996, which lies between the two censuses. According to the Terms of Reference of the project, the following tasks were to be fulfilled:

1.1 Examine scrupulously the list of variables, using the classification for 2001 as a starting point.

1.2 Contact the cities concerned (and other statistical agencies) in order to find out about data availability

1.3 Classify the complete list of variables into three categories (for three spatial units)

1.4 Send the conclusions to Eurostat

2.1 Contact the cities concerned and other statistical agencies in the country

2.2 Collect the variables for the three spatial units used already for the collection of 2001 data: core city (commune), larger urban zone and sub-city districts.

2.3 Check the received data

2.4 Transmit the data to Eurostat in the agreed format

2.5 Transmit the meta information to Eurostat in form of flags (as defined for the collection of 2001 data)

2.6 Preparation of intermediate report. Sending the intermediate report to Eurostat.

3.2 Collect "similar" (related) quantitative data, both concerning variable definition and concerning spatial units

3.3 Estimate the required variables with the aid of "similar" data

3.4 Check the quality of the estimated data

3.5 Transmit the data to Eurostat in the predefined format

3.6 Transmit the meta-information (estimation algorithms) to Eurostat in an agreed format

4.1 Verify issues as they are raised (correct variables or confirm accuracy)

4.2 Collect data for the variables now included in the sub-city district

4.3 Collect missing data for 2001 data where possible

4.4 Check data

4.5 Transmit the data to Eurostat in the agreed format

4.6 Transmit the meta information to Eurostat in form of flags (as defined for the collection of 2001 data)

4.7 Preparation of final report. Sending the final report to Eurostat.

The tasks can be summarized in three groups: organization (contacts and instructions); data collection; validation.

  2. Organization

The analysis of the planned activities showed that data for some specific indicators had to be collected from, and with the support of, the local authorities of the cities concerned, and that non-available data had to be estimated. In Phase II of this project this was done successfully with the help of experts contacted through NAMRB. The same approach was used for the current project.

Most of the raw data and meta-information for past periods are stored in the NSI regional offices, especially for the Population and Housing Census 1992. Accordingly, the participants in the project fall into three groups: experts from the NSI headquarters; experts from the NSI regional offices; and (through the NAMRB) experts from the local authorities.

Their tasks were generally distinct but overlapped in some activities:

- Meta information about the boundaries of the territorial units was collected from the NSI regional offices in collaboration with the local authorities and experts of NAMRB;

- Data processing of Census’1992 for the three types of units was done by the experts in NSI headquarters;

- Data for 1996 was provided by the NSI headquarters experts with the support of the local authorities;

- Validation was done by the NSI headquarters experts with the support of the NAMRB experts.

The excellent collaboration between the three parties may be considered one of the achievements of the project. It has shown that there is enough potential for such joint projects.

Communication was good throughout the period, between the NSI and NAMRB on the one hand, and between central and local partners on the other. The work slowed down in the middle of the year as a consequence of the start of repair works on the NSI building. The delay was made up in September and the data was delivered on time.

Budgeting

The budget was divided between the two partner organizations, the NSI and the NAMRB. It was spent according to the tasks done by their experts and may be assessed as sufficient. Time sheets for every expert were filled in on time according to their activities.

  3. “Field” work
  • The work started with the classification of variables. They were classified into the three required groups according to the availability of raw data.
  • City authorities were asked to support the data collection within their area of responsibility. In addition, they were asked to provide more metadata concerning the previous population census (1992).
  • Territorial units boundaries

The focus was on the data for the core cities and sub-city districts for 1992. Most of the indicators at sub-city level are available only on the basis of Census’1992. They were later aggregated to the upper levels, Larger Urban Zones (LUZ) and Core City.

This is where the main difficulties arose, namely finding the information about the boundaries of the Census’1992 tracts and making it comparable to the Census’2001 tracts.

During the period between the two censuses significant administrative changes were made in the territorial organization, mainly in the big cities (Sofia, Plovdiv, Varna, Pleven). Since the breakdown of the core cities into SCDs was based on the administrative city regions (where they existed) and the tracts were numbered within these regions, the codes used in the two censuses were not comparable.

In fact, a different methodology was used for Census’2001. It kept the tracts closer to the electoral sections, which makes them hard to compare with the 1992 tracts. We needed to compile correspondence lists using the description lists for the enumerators (which contain street addresses) and to draw the borders of the census tracts that existed at the time. It turned out that in the late 1990s all these lists (which existed on paper only) had been scrapped.

In 1992 no GIS was used and there were no maps on which to draw the boundaries of the tracts. The information existed only as written descriptions on paper, listing street names and addresses. Moreover, the paper records were not stored properly and in most cities they were lost. Both the regional offices and the local authorities spent considerable effort investigating whether this information could be found. We asked our colleagues at the partner organization and in the regional offices to check thoroughly whether the local administrations had kept a copy of those description lists. The check showed that no information about Census’1992 had been kept or archived.

In the end, we were forced to “invent” a method to estimate the coverage of the SCDs already created for 2001 on the basis of the census tracts in 1992. The method implemented was as follows:

The databases from both censuses were compared and records were matched wherever possible. The results of the matching were analyzed to establish the correspondence between 2001 and 1992 census tracts. Each decision was based on the prevailing number of matched records. This approach gave us a correspondence between the units of the two censuses. Afterwards, the individual data was processed and population numbers were compared and analyzed for each city. When a difference was found that could not be explained, the procedure was repeated.
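The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually run by the NSI: the function names, the tract and unit codes, and the 25% review tolerance are all hypothetical, and the input is assumed to be one (1992 tract, 2001 tract) pair per individual record matched between the two census databases.

```python
from collections import Counter, defaultdict


def tract_correspondence(matched_pairs):
    """Map each 1992 tract to the 2001 tract holding most of its matched records.

    matched_pairs: iterable of (tract_1992, tract_2001) tuples, one per
    individual record matched between the two census databases.
    Returns {tract_1992: (tract_2001, share_of_matched_records)}.
    """
    counts = defaultdict(Counter)
    for tract_1992, tract_2001 in matched_pairs:
        counts[tract_1992][tract_2001] += 1

    correspondence = {}
    for tract_1992, counter in counts.items():
        # "Prevailing number of matched records": take the most frequent target.
        tract_2001, hits = counter.most_common(1)[0]
        correspondence[tract_1992] = (tract_2001, hits / sum(counter.values()))
    return correspondence


def flag_for_review(pop_1992, pop_2001, tolerance=0.25):
    """Return units whose relative population change exceeds the tolerance.

    The 25% default is an arbitrary illustrative threshold, not a value
    taken from the report.
    """
    return sorted(
        unit
        for unit in pop_1992
        if unit in pop_2001
        and abs(pop_2001[unit] - pop_1992[unit]) / pop_1992[unit] > tolerance
    )


# Hypothetical tract codes, for illustration only.
pairs = [("92-A", "01-X"), ("92-A", "01-X"), ("92-A", "01-Y"), ("92-B", "01-Y")]
mapping = tract_correspondence(pairs)
```

In this sketch the share of matched records is kept next to each assignment, so that weakly supported correspondences can be inspected by hand before the population totals are compared.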

  • Data processing

Data for 1992 came mainly from the Population and Housing Census. Processing the individual records made it possible to produce data for most of the variables in the list at three levels: SCD, Core city and LUZ.

It is worth mentioning that the administrative changes between the censuses were very significant and, together with the problems mentioned above, required more work than expected. For example, in 1992 Sofia had 11 administrative regions and in 2001 it had 24; Varna had 10 mayoralties in 1992 and 5 administrative regions in 2001, etc.

Data for 1996 was taken from existing sources (published or in working tables) at LAU1 or LAU2 level. Since there was no census in 1996, data on most of the indicators that come from the Population and Housing Census could not be provided. This concerns mainly data at sub-city district (SCD) level but also some indicators that are specific to the census. As a result, we could provide data only at Core city level (mainly population data) and at LUZ level (mainly social data). The latter were obtained by aggregating LAU2 data. Economic data were available only at NUTS3 level and in some cases (income data) only at NUTS2 level.
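The aggregation of LAU2 data to LUZ level mentioned above amounts to summing the values of the municipalities that belong to each zone. A minimal sketch follows, assuming a lookup table from LAU2 code to LUZ code is available; the function name, the LAU2 codes and the values are hypothetical, and only additive variables (counts) can be aggregated this way.

```python
def aggregate_to_luz(lau2_values, lau2_to_luz):
    """Sum an additive variable from LAU2 units up to their LUZ.

    lau2_values: {lau2_code: value}
    lau2_to_luz: {lau2_code: luz_code}; LAU2 units outside any LUZ are absent.
    Returns {luz_code: total}.
    """
    totals = {}
    for lau2_code, value in lau2_values.items():
        luz_code = lau2_to_luz.get(lau2_code)
        if luz_code is None:
            continue  # this LAU2 unit does not belong to any LUZ
        totals[luz_code] = totals.get(luz_code, 0) + value
    return totals


# Hypothetical LAU2 codes and values, for illustration only.
lau2_values = {"M01": 1200, "M02": 300, "M03": 450, "M04": 90}
lau2_to_luz = {"M01": "BG001L", "M02": "BG001L", "M03": "BG002L"}
luz_totals = aggregate_to_luz(lau2_values, lau2_to_luz)
```

Ratios and averages cannot be summed directly; they would have to be recomputed from aggregated numerators and denominators.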

This problem was discussed at the meeting in Brussels in May 2005, when the Historic Data project was being prepared. Most of the countries represented at the meeting stated that data provision would be much more limited than for years in which a census took place. In Bulgaria the use of administrative registers is still limited. Although the number of registers has risen in recent years, it is impossible to find retrospective data at sub-city level. Even the existing registers (such as Civil Registration) could not be used, because they record only the settlement of a demographic event, not its address. Thus, data provision was limited in two respects: the spatial level and the list of indicators.

Accordingly, the efforts of the team were focused on finding data for 1996 for the upper levels: larger urban zones (LUZ) and city level. The difficulties here come from the distance in time and the different methodologies used at that time. Fortunately, most statistical domains have since revised their methodologies, and their data accordingly, in compliance with Eurostat methodology. Even so, some data, such as Households’ budgets data, are still hard to calculate.

The problem with the territorial breakdown is the same as for 1992.

  • Validation

Validation was done by comparison with the 2001 data. At different stages it revealed discrepancies in both the coverage and the definitions used in 1992 and 2001. These were resolved as they arose, to the extent possible. We can assume that the territorial coverage of the SCDs is above 95% and close to 100%, with the main differences found in Sofia.

Population Total (DE1001V)
LUZ code / Value 1992 / Value 2001 / LUZ name (Bulgarian) / LUZ name (English)
BG001L / 1 183 083 / 1 263 807 / София / Sofia
BG002L / 432 143 / 439 061 / Пловдив / Plovdiv
BG003L / 353 735 / 360 396 / Варна / Varna
BG004L / 240 086 / 236 147 / Бургас / Burgas
BG005L / 206 936 / 190 154 / Плевен / Pleven
BG006L / 201 410 / 189 471 / Русе / Ruse
BG007L / 85 974 / 77 480 / Видин / Vidin

The comparison showed that the sizes of the SCDs changed between 1992 and 2001, but the average stayed close to the 2001 level. There are no SCDs with fewer than 5 000 inhabitants, and only one in Sofia and one in Burgas had more than 40 000 in 1992.

Map of SCDs for Sofia, 1992

A first look at the results leads to the conclusion that, despite the overall decrease in population between the censuses, the population of the biggest cities increased. Another conclusion is that the centres, mainly of the bigger cities, are “emptying”: the number of inhabitants is decreasing in favor of the outskirts, where housing and living have become cheaper. The data collected deserve further, deeper analysis.
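The first conclusion can be checked at LUZ level against the Population Total (DE1001V) figures reported in the validation table. The short Python sketch below copies those figures and computes the relative change between 1992 and 2001; the variable and function names are illustrative only.

```python
# LUZ total population (DE1001V) from the validation table: (1992, 2001).
population = {
    "Sofia": (1_183_083, 1_263_807),
    "Plovdiv": (432_143, 439_061),
    "Varna": (353_735, 360_396),
    "Burgas": (240_086, 236_147),
    "Pleven": (206_936, 190_154),
    "Ruse": (201_410, 189_471),
    "Vidin": (85_974, 77_480),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


changes = {name: percent_change(*values) for name, values in population.items()}
growing = [name for name, change in changes.items() if change > 0]
shrinking = [name for name, change in changes.items() if change < 0]

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

With these figures the three largest zones (Sofia, Plovdiv and Varna) grew between the censuses, while Burgas, Pleven, Ruse and Vidin lost population.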

  4. Problems and conclusions

The main problems we met during the implementation of the project are described above. The problems with the changes in the regional breakdown and the methodological differences were expected, while the loss of information about boundaries surprised us and took more time than expected from all the experts involved.

Another obstacle to obtaining correct regional data for the early 1990s is the territorial breakdown at that time. The 28 “oblasti” known today (corresponding to NUTS3) did not exist officially. Instead, there were 9 units with the same name but completely different territorial coverage. Fortunately, NSI kept its organizational structure based on the 28 okrazi, which are very similar to the present 28 oblasti. This allowed us to obtain most of the data at that level, but for sample surveys such as Households’ budgets the data require careful estimation.

Studying the data from Census’1992 in more depth, we found some details in the definitions used at the time which deviate from the definitions used in 2001. Most of the differences are understandable because they stem from the old system of observation (e.g. unemployment).

Sergey Tsvetarsky

Head of the project