European Environment Agency
European Topic Centre on Biological Diversity
Work Programme 2005
Work Package 1.4.2: Maintaining and quality assuring EEA priority data flows
Common Data Base on Designated Areas
At National Level
Report on the analysis process for National Data Base of France
Lauri Klein and Tiina Dislis
EEIC
(Estonia)
July 2005
1. Main aims and tasks
Since 2001, annual validation procedures have been performed on the updated national CDDA datasets, as listed in the table below:
Procedure / Reason
Checking completeness of data in all tables / Many gaps affect data analysis and annual indicator production
Checking for any change in the field properties of all tables / Changed field properties may affect data merging, and may also indicate that the prepared dataset was not used for the update
Search for changes in designation type or designation name in designations table / A change of type should also be marked in the "to be deleted" field; otherwise it may simply be an error
Search for new designations added to designations table / To keep a record for statistical purposes
Checking for differences in the total area of sites per designation between designations and sites tables / There should be no difference; any difference is an error
Search for designations marked to be deleted in designations table / To keep a record for statistical purposes and to delete them from the master base
Search for designations deleted without being marked to be deleted in designations table / Designations should not be deleted before being marked to be deleted; doing so is an error
Search for designations with duplicated designation code in designations table / Designation code should be unique; any duplication is an error
Search for records with deleted site code that were not marked for deletion in sites table / Records should not be deleted before being marked to be deleted, as they have a unique site code
Search for sites with duplicated site code in sites table / Site code should be unique; any duplication is an error
Search for newly entered sites without site code in sites table / These records need a site code to be added by CDDA administrators
Search for sites with missing national site code in sites table / As site code and national site code are key fields of the sites table, a national site code must exist for every record
Search for sites with duplicated national site code in sites table / National site code, as a key field, should be unique; any duplication is an error
Search for any change in national site code in sites table / Any change in a key field may affect merging into the master base, so all changes should be recorded
Search for records with missing ISO code in sites table / A question of completeness of both the ISO3 and Parent ISO fields
Search for records with incorrect ISO code in sites table / Quality assurance for further analysis
Search for changes in site designation in sites table / If the site designation has changed, the site is recorded in the database as a new site and the establishment year should also be changed; the country should also recheck whether the change is real and not an error
Search for sites with missing designation in sites table / All sites should have a designation; a missing designation is an error
Search for changes in site name in sites table / Recorded for statistics; in addition, if the national site code is missing, the site name is the next possible unique field for setting relations for analysis
Search for sites with duplicated name in sites table / See the previous row; also feedback to the country for rechecking whether duplicate names are real or a mistake
Search for sites with missing name in sites table / If the national site code is missing, the site name is the next possible unique field for setting relations for analysis; as a site usually has a name, the country should recheck whether there is an error
Search for changes in site size in sites table / A change in site size indicates a change in boundaries, which should be checked accordingly
Search for sites with missing size in sites table / Any existing site should have a size; a missing size is an error (completeness of data)
Search for sites with missing NUTS code in sites table / Administrative NUTS code is important for EUROSTAT
Search for changes in IUCN category in sites table / A change in IUCN category may indicate a change in designation; if that is not the case, it may be an error
Search for sites with missing IUCN category in sites table / IUCN category is needed for statistical purposes (completeness of data)
Search for changes in site year in sites table / The only reasons for changing the establishment year are a change in designation or a mistake in the previous database version
Search for sites with missing year in sites table / Checking the reason for a missing establishment year, which is needed for statistical purposes (completeness of data)
Search for sites with missing altitude in sites table / Completeness of data and need for indicating habitat-related information
Search for changes in coordinates in sites table / A change in coordinates may indicate a change in site boundary and/or size, to be checked accordingly
Search for sites with duplicated coordinates in sites table / Duplication, if not an error, indicates only overlapping or coinciding sites, to be checked accordingly
Search for sites with missing coordinates in sites table / When boundary data are not available, coordinates are the only means of locating the site
Search for sites marked to be deleted in sites table / To keep a record for statistical purposes and to delete them from the master base
Search for records in site relations table whose SITECODE has no matching SITECODE in sites table / Any result of this search indicates that a site was deleted from the sites table but left in the site relations table, which is an error
Search for neighboring sites in site relations table / To be checked in accordance with boundary data
Search for sites containing other sites in site relations table / To be checked in accordance with boundary data
Search for sites overlapped by other sites in site relations table / To be checked in accordance with boundary data
Search for records in site habitats table with no reference in sites table / Any result of this search indicates deletion of a site from the sites table, or simply an error
Search for records in sites table with no reference in site habitats table / To find gaps in habitat data
Search for records in site habitats table with no reference in EUNIS habitats table / All habitat codes should come from the EUNIS habitats table; any other code is an error
Search for sites in site habitats table with habitat coverage not equal to 100% / Habitat coverage should always be 100%; otherwise it is an error or a gap
Count of habitats indicated per site in site habitats table, with total area of those sites / Statistics showing habitat diversity of sites in relation to size
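Several of the checks in the table above reduce to simple queries. The following is a minimal sketch of four of them (duplicated site code, missing national site code, site relations without a matching site, and habitat coverage not summing to 100%), written in Python with an in-memory SQLite database. It is an illustration only and not the procedure actually used for the validation; the table names, the column names (site_code, site_code_nat, related_code, habitat_code, cover_pct) and the sample rows are all assumptions made for the example.

```python
import sqlite3


def run_checks(conn):
    """Run four illustrative checks and return the offending site codes."""
    queries = {
        # Site code should be unique; any duplication is an error.
        "duplicated_site_code": """
            SELECT site_code FROM sites
            WHERE site_code IS NOT NULL
            GROUP BY site_code HAVING COUNT(*) > 1
            ORDER BY site_code""",
        # National site code is a key field and must be present.
        "missing_national_code": """
            SELECT site_code FROM sites
            WHERE site_code_nat IS NULL OR TRIM(site_code_nat) = ''
            ORDER BY site_code""",
        # Site relations that point at a site absent from the sites table.
        "orphan_site_relations": """
            SELECT r.site_code FROM siterelations AS r
            LEFT JOIN sites AS s ON s.site_code = r.site_code
            WHERE s.site_code IS NULL
            ORDER BY r.site_code""",
        # Habitat coverage per site should sum to 100%.
        "habitat_cover_not_100": """
            SELECT site_code FROM sitehabitats
            GROUP BY site_code HAVING ABS(SUM(cover_pct) - 100) > 0.01
            ORDER BY site_code""",
    }
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in queries.items()}


# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (site_code INTEGER, site_code_nat TEXT);
    CREATE TABLE siterelations (site_code INTEGER, related_code INTEGER);
    CREATE TABLE sitehabitats (site_code INTEGER, habitat_code TEXT, cover_pct REAL);
    INSERT INTO sites VALUES (1, 'A1'), (2, NULL), (3, 'A3'), (3, 'A4');
    INSERT INTO siterelations VALUES (1, 3), (9, 1);
    INSERT INTO sitehabitats VALUES (1, 'G1', 60), (1, 'E2', 40), (3, 'G1', 80);
""")
results = run_checks(conn)
for name, codes in results.items():
    print(name, codes)
```

Each query returns the site codes that fail the corresponding check, so an empty list means the check passed. The remaining checks in the table (changes between the old and new set, for instance) would additionally need the previous version of each table to compare against.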
· The aim of checking the digital boundaries database was to determine whether the files delivered by different countries are comparable and whether the data can be merged into a consistent database (to create seamless pan-European boundary files). For that purpose, the data were checked for map projections, because merging different spatial databases into one requires knowing the coordinate system in which the data are stored. The data were also checked for main geometrical errors (e.g. whether polygons are closed).
· Secondly, boundary files were compared with the CDDA database to find out whether all designated sites in the CDDA database have reference features in the boundary files. To connect the digital boundary files with the CDDA database, it was checked whether the CDDA SITE_CODE, or a relevant field for CDDA, is present in the boundary file attribute table.
· A preliminary comparison was made of the main statistics (number and total area of sites) for designated areas present in the CDDA database and those present in the digital boundary files delivered by countries. For that purpose, areas were calculated for the boundary files using ArcInfo software. Simple MS-Access and ArcInfo queries were used to create the necessary statistics. The unique CDDA site code or, where it was missing in the boundary files, the national site code was used for the relation between the CDDA and boundary file tables. Only where both unique codes were missing was the name used.
· Delivered GIS data (boundary files) were also checked on the basis of EEA’s Guide to geographical data and maps to find out whether and to what extent files are in accordance with this guide.
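The order in which keys were tried when relating boundary files to the CDDA database (CDDA site code first, then national site code, and the name only when both codes are missing) can be sketched as follows. This is a hedged illustration in Python, not the MS-Access or ArcInfo queries actually used; the field names site_code, site_code_nat and name, the helper functions and the sample records are all hypothetical.

```python
KEY_ORDER = ("site_code", "site_code_nat", "name")  # assumed field names


def usable(value):
    """True when a field holds something other than None or blank text."""
    return value is not None and str(value).strip() != ""


def link_key(record):
    """Return (field, value) for the first usable key in KEY_ORDER, or None."""
    for field in KEY_ORDER:
        if usable(record.get(field)):
            return field, str(record[field]).strip()
    return None


def link_boundaries(boundary_records, cdda_records):
    """Pair each boundary record with the CDDA record sharing its key."""
    # Index CDDA records under every key they carry.
    index = {}
    for rec in cdda_records:
        for field in KEY_ORDER:
            if usable(rec.get(field)):
                index[(field, str(rec[field]).strip())] = rec
    linked, unlinked = [], []
    for rec in boundary_records:
        key = link_key(rec)
        match = index.get(key) if key else None
        if match is not None:
            linked.append((rec, match, key[0]))  # remember which key was used
        else:
            unlinked.append(rec)
    return linked, unlinked


# Hypothetical sample records.
cdda = [
    {"site_code": 101, "site_code_nat": "FR-1", "name": "Alpha"},
    {"site_code": 102, "site_code_nat": "FR-2", "name": "Beta"},
    {"site_code": 103, "site_code_nat": "FR-3", "name": "Gamma"},
]
boundaries = [
    {"site_code_nat": "FR-1", "name": "Alpha"},  # linked by national site code
    {"name": "Beta"},                            # linked by name only
    {"name": "Gama"},                            # typing error: stays unlinked
]
linked, unlinked = link_boundaries(boundaries, cdda)
print([used for _, _, used in linked], len(unlinked))
```

Recording which key produced each link makes it easy to see afterwards how many relations rest on the name alone, which is the least reliable of the three keys.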
2. Conclusions and recommendations
· The database has improved a lot, both in content and technically. Some crucial gaps remain: the siterelations table is empty, all references to legislation are missing and all altitude data are missing. Despite these gaps, the database is proposed for a high rank from a technical point of view.
Some recommendations for next year's update:
· Please consider filling in the fields “Law”, “Lawreference” and “Agency” in table “designations-fr” properly.
· There are still some gaps in the database (altitude, year, size, NUTS and coordinates). Please consider filling these gaps.
· Please also consider filling in tables “Siterelations” and “Sitehabitats”.
· If you have filled in “EIONET Changing type”, please also consider filling in “EIONET Changing date”, “EIONET Edited by” and “EIONET Institute” in the “sites-fr” table, as a reference to the time of the change and the person who made it.
Results for digital boundary analysis
First of all, thank you for the deliveries! All delivered files were generally correct.
There were problems with linking boundary files to the CDDA database using the name (please see main gaps). For that reason, boundary files were joined to the CDDA database using the national site code.
Unfortunately, only 4 boundary files have a national site code, so it was possible to link only 4 of the 10 boundary files to the CDDA database. For this reason, and because of possible coding errors in boundary file sic0111, only a few analyses of boundary files have been carried out.
Main gaps for digital boundaries
· CDDA site code is not present in the boundary files, and only some boundary files have a national site code. Linking boundary files to the CDDA database by name could cause problems, because fields containing long text strings may have typing errors, some characters may change during the conversion process, etc.
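The character-conversion problem can be illustrated with a short Python sketch. The two name strings below are invented for the example and are not taken from the delivered files, and the normalisation helper is an assumption about how such differences could be reduced, not a step that was part of this analysis.

```python
import unicodedata


def normalise_name(name):
    """Lower-case a name, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


# Hypothetical names: the same site as it might appear in the two sources.
cdda_name = "Forêt  de l'Étang"
boundary_name = "FORET DE L'ETANG"

exact_match = cdda_name == boundary_name
normalised_match = normalise_name(cdda_name) == normalise_name(boundary_name)
print(exact_match, normalised_match)
```

Normalisation of this kind only removes differences in case, accents and spacing. It does not repair typing errors, which is one more reason why a unique code is a safer key than the name.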
Mistakes or possible mistakes for digital boundaries
· Possible coding errors have been found: some sites whose national site codes match have different names in the CDDA database and in boundary file sic011.
· More than one boundary file may contain sites with the same designation type. For example, sites with designation type FR06 are present in boundary files rb and sic0111.
· Reference objects in boundary files were found for only 348 sites in the CDDA database.
Suggestions for digital boundaries
· It might be easier to analyse and work with designated areas if different designations were stored as different layers; at the least, it should be checked that a relevant id field for separating different designations into different layers is present.
· To avoid errors in matching boundaries and CDDA data, a unique CDDA site code field should be present in both data tables.
· To make data from different countries comparable and analysable (e.g. to give a result that would serve one of the EEA headline indicators, total area designated per country), the EEA Guide to geographical data and maps should be followed.
3. Main statistics
Table / Number of records in old set / Number of records in new set
Designations-fr / 25 / 25
Sites-fr / 1538 / 1639
Siterelations-fr / 0 / 0
Sitehabitats-fr / 0 / 0
3.1 Field completeness
· Designations-fr
Column / Records filled in old / Records filled in new / Completeness %
ISO3 / 25 / 25 / 100
DESIG_ABBR / 25 / 25 / 100
Category / 23 / 23 / 100
ODESIGNATE / 25 / 25 / 100
DESIGNATE / 23 / 23 / 93
Title – French / 25 / 25 / 100
CDDA sites / 25 / 25 / 100
Law / 0 / 0 / 0
Lawreference / 0 / 0 / 0
Agency / 0 / 0 / 0
Number / 15 / 15 / 60
Total Area / 14 / 14 / 56
Number reference / 8 / 8 / 32
Total Area reference / 7 / 7 / 28
Data Source / 7 / 7 / 28
Reference date / 8 / 8 / 32
Remark / 6 / 6 / 24
Remark source / 0 / 0 / 0
To be deleted / 25 / 25 / 100
· Sites-fr
Column / Records filled in old / Records filled in new / Completeness %
SITE_CODE / 1538 / 1538 / 94
SITE_CODE_NAT / 1531 / 1638 / 100
PARENT_ISO / 1538 / 1639 / 100
ISO3 / 1538 / 1639 / 100
DESIG_ABBR / 1538 / 1639 / 100
AREANAME / 1538 / 1639 / 100
SIZE / 1458 / 1554 / 95
IUCNCAT / 1538 / 1639 / 100
NUTS / 1479 / 1574 / 96
YEAR / 1537 / 1548 / 94
ALTITUDE_MIN / 0 / 0 / 0
ALTITUDE_MAX / 0 / 0 / 0
LAT_NS / 1523 / 1624 / 99
LATDEG / 1521 / 1622 / 99
LATMIN / 1521 / 1622 / 99
LATSEC / 1521 / 1622 / 99
LON_EW / 1523 / 1624 / 99
LONDEG / 1521 / 1622 / 99
LONMIN / 1521 / 1623 / 99
LONSEC / 1521 / 1623 / 99
LAT / 1521 / 1622 / 99
LON / 1521 / 1622 / 99
EIONET CHNG_DATE / 0 / 111 / 7
EIONET CHNG_TYPE / 282 / 392 / 24
EIONET EDITED_BY / 0 / 111 / 7
EIONET INSTITUTE / 0 / 111 / 7
NOTES / 0 / 3 / 0
To be deleted / 1538 / 1639 / 100
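The report does not state how the completeness percentage is calculated. The Sites-fr figures above are consistent with dividing the number of records filled in the new set by the total number of records in the new set (1639) and rounding to a whole percent. The sketch below, in Python, rests on that assumption; the function name is hypothetical.

```python
def completeness_pct(filled, total):
    """Share of filled records as a whole-number percentage (assumed formula)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * filled / total)


TOTAL_NEW = 1639  # records in the new Sites-fr set

# "Records filled in new" values copied from the Sites-fr table.
filled_new = {"SIZE": 1554, "NUTS": 1574, "YEAR": 1548, "EIONET CHNG_TYPE": 392}

percentages = {
    column: completeness_pct(filled, TOTAL_NEW)
    for column, filled in filled_new.items()
}
for column, pct in percentages.items():
    print(column, pct)
```

For these four columns the sketch gives 95, 96, 94 and 24, matching the values in the table.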
· Siterelations-fr