Naming Conventions

GISCO REFERENCE

GEODATABASE

NAMING CONVENTIONS

Title / GISCO Reference Geodatabase Naming Conventions
Creator / César de Diego Díez, Albrecht Wirthmann
Date last revision / 2006-10-09
Subject / Naming conventions for objects in the reference GISCO geodatabase
Status / Ongoing
Publisher / EUROSTAT
Type / Text
Description / Naming conventions for GISCO reference geodatabase
Contributors / -
Format / MS Word 95/2000 (doc)
Source / Not applicable
Rights / Public domain.
Identifier / Naming Conventions v2.0
Language / En
Relation / Not applicable
Coverage / GISCO reference database lifecycle duration

RECORD OF MODIFICATIONS

Version / Date / Author / Modified sections
1 / 13/2/2005 / CDD/AW / Creation
1.1 / 2/6/2005 / CDD / -
2.0 / 27/1/2006 / CDD / Revision of all chapters
2.1 / 2006-10-24 / CDD / Update of annex 3 "keywords" and annex 5 "class identifiers". New convention for coded-value attributes (domains). Revision of relationship names. Revision of attribute name "NAME".
2.2 / 2008-04-28 / CDD / Introduction of geometric entity "RL" for attribute tables (i.e. tables with no geometry or "objects" according to ESRI nomenclature) representing relational many-to-many relationships.
2.3 / 2009-08-31 / CDD / Text revision. "New" architecture is not "New" after 5 years. New examples.

Content

GISCO database naming conventions

Background

Purpose of the document

General considerations:

Naming conventions

General rules

Feature datasets and data themes

Feature/object class & subtypes names:

Syntax rules

Class identification

Geometric entity

Additional information

Classes that implement one-to-one relationships

Classes that implement many-to-many relationships

Examples:

Attribute names

Primary key

Relationship names and role names in ArcGIS (Forward/Backward path label)

Domain names

Subtype names

GISCO database naming conventions

Background

The migration of the GISCO database from a file-based coverage structure to the new system architecture was an opportunity to define a new database naming convention. In the Arc workstation file system, the coverage name was limited to a maximum of 13 characters. GISCO naming conventions were clearly defined, but they resulted in rather cryptic names due to the strict length limitations. Advanced users could navigate the database easily, whereas beginners and occasional users could not find their way without constantly consulting the database manual.

After the migration to the new architecture based on ORACLE, it is possible to extend the object name length and thus make these names more user-friendly. Moreover, relational databases follow a different logical structure than coverages. Finally, new features such as relationships or domains have been introduced. This document addresses all these new opportunities and needs regarding naming conventions.

Purpose of the document

To define the guidelines that will govern the naming of classes, attributes and other objects in the GISCO Geodatabase.

General considerations:

The aim of the naming conventions is:

to reflect the content and purpose of the database objects in a standardised, concise and user-friendly way;
to reflect the logical location of the objects within the database;
to assure uniqueness of the object name within the database;
to facilitate the multiplatform use of the data.

A sequence of abbreviations is therefore used to describe the contents of a database feature.

The codes are grouped into code lists according to their meaning. Syntax rules define the sequence and the reading of the codes. The names of features, tables and attributes are composed according to the following categories:

Topic (Data themes, feature datasets, feature classes, object classes and subtypes are named according to their topic category)
Geometric entity (The topological type of a feature or object, e.g. region, boundary, point)
Scale, Accuracy, Precision
Time stamp or Version
Source

The naming conventions describe naming rules for the following database features:

Feature datasets and data themes
Feature classes, object classes and subtypes
Relationships
Domains
Attributes

Long names are self-explanatory, but become cumbersome in programs, scripts, table headings, etc. Sensible, well-defined abbreviations in object names improve the readability of documentation and programming code.

The names of the features and objects in the geodatabase are not meant to be a subset of the metadata. These names must contain the minimum information required to uniquely identify the entity (class, object, relationship, etc.) they represent.

Naming conventions

General rules

The GISCO geodatabase architecture objects are hosted on different platforms:

- SDE tables and attributes: Oracle Spatial database

- Feature datasets, subtypes, domains and relationships: ArcGIS

- Data themes: abstract

General restrictions: Each platform has different naming rules. These are not conventions, but restrictions imposed by the technological platforms. To keep the database accessible from as many platforms as possible, we will follow the most restrictive rules for all objects, i.e. the Oracle naming rules:

Names cannot be longer than 30 characters.
Names will contain only uppercase characters (A-Z) or _ (underscore).
Oracle reserved words cannot be used as object names (see annex 2).

General conventions: These rules apply to all named objects in the geodatabase.

The name will be structured in segments.
Each segment will have a maximum of four characters. A segment will be an abbreviation or an acronym of the words that best represent the object to be named.
When a concept to be abbreviated already exists in the database, the same abbreviation or acronym should be used for the same concept. See annex 3 for a list of keywords and their respective abbreviations. This annex is intended to be updated with every new database development.
Overly generic words such as “type” or “class” should not be used in any segment of the name (see annex 4).
Segments will be separated by a “_” (underscore).

On top of these rules, particular conventions are to be applied to the different objects in the geodatabase.
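The general restrictions and conventions above can be checked mechanically. The following is a minimal sketch, not part of the GISCO tooling: the function name `check_name` is hypothetical, the reserved word set is only a small illustrative subset of annex 2, and digits are assumed to be acceptable because names such as NUTS_2003 use them.

```python
import re

MAX_NAME_LENGTH = 30      # Oracle limit quoted in the general restrictions
MAX_SEGMENT_LENGTH = 4    # general convention for each segment

# Illustrative subset only; annex 2 holds the full list of Oracle reserved words.
RESERVED_WORDS = {"SELECT", "TABLE", "DATE", "LEVEL", "SIZE", "USER"}

# Assumption: digits are accepted alongside A-Z and "_", as in NUTS_2003.
ALLOWED = re.compile(r"[A-Z0-9_]+")


def check_name(name):
    """Return a list of rule violations for a candidate object name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("longer than 30 characters")
    if not ALLOWED.fullmatch(name):
        problems.append("characters outside A-Z, 0-9 and underscore")
    if name in RESERVED_WORDS:
        problems.append("Oracle reserved word")
    for segment in name.split("_"):
        if not segment:
            problems.append("empty segment")
        elif len(segment) > MAX_SEGMENT_LENGTH:
            problems.append(f"segment {segment} longer than 4 characters")
    return problems


print(check_name("URAU_CITY_RG"))  # []
print(check_name("Airports_PT"))   # two violations: lower case and a long segment
```

Returning a list of violations rather than a single boolean makes it easier to report every problem of a proposed name at once.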

Feature datasets and data themes

Alphanumeric data (object classes) and geometry (feature classes) will be conceptually grouped in data themes. This concept is not part of the geodatabase structure, neither in the Oracle DBMS nor in ArcCatalog.

A feature dataset is a collection of related feature classes, all of which share a common spatial reference system. These collections are recognised only by ArcGIS software.

Most features in the GISCO database will share the same reference system. Consequently, they could share the same feature dataset. Nevertheless, this approach makes it very difficult to find a feature class when browsing the ArcGIS catalog. For readability reasons, each conceptual data theme will be implemented as at least one separate feature dataset.

The first step in defining the names of the different objects in a data theme is to identify the real world concept modelled in the data theme. The data theme name can be as long as desired. It represents an abstract concept and is not bound by the name length limits of databases.

Every data theme will have an associated short name. The short name will have a maximum of 4 characters. Generic words such as “area”, “zones”, “location”, “patterns”, etc. will be disregarded when choosing the short name. Feature datasets (FDS) will be named after the short name of the data theme that they implement.

Again for the sake of readability, different versions of a given data theme can be implemented in different feature datasets. In this case a version or date segment will be included in the feature dataset name.

Examples:

Data theme (long name) / FDS name
Territorial Units for Statistics (NUTS + Statistical Regions) / NUTS_2003, NUTS_2006
Communes / COMM_2004, COMM_2006
Transports / TRAN_V2, TRAN_V3
Urban Audit Areas / URAU_2004
Airports / AIRP
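As an illustration of how the feature dataset names in this table are composed, here is a small sketch. The function `feature_dataset_name` is a hypothetical helper, not an existing tool; it only joins the data theme short name with an optional version or date segment.

```python
def feature_dataset_name(short_name, version=None):
    """Build a feature dataset name from a data theme short name.

    `version` is an optional version or date segment such as 2006 or "V2".
    """
    if not 1 <= len(short_name) <= 4:
        raise ValueError("data theme short name must have 1 to 4 characters")
    segments = [short_name.upper()]
    if version is not None:
        segments.append(str(version).upper())
    return "_".join(segments)


print(feature_dataset_name("NUTS", 2006))  # NUTS_2006
print(feature_dataset_name("TRAN", "V3"))  # TRAN_V3
print(feature_dataset_name("AIRP"))        # AIRP
```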

A data theme will comprise at least one feature class or object class.

Feature/object class & subtypes names:

Syntax rules

The class name must contain exclusively the information needed to uniquely identify the class. Depending on the data model, this may or may not require a segment identifying the version, scale, time stamp or source.

The class name (either geometry or just tabular data) will be built up by adding the following strings (segments), in this order:

- Data theme short name (compulsory)
- Class identification short name (if needed)
- Geometric entity (compulsory)
- Scale or precision: 100K, 1M, 200M (if needed). K stands for “thousand”; M stands for “million” or “metres” (no lower case allowed)
- Version: Vxx (if needed)
- Time stamp (if applicable and needed)
- Source (if needed)

No spaces are allowed. The different segments will be joined by a “_” (underscore) and the total length of the name cannot exceed 30 characters.

Leading zeroes should be considered in the scale, version and time stamp segments in order to obtain a stable and logical sort order of feature and object class names:

NUTS_RG_03M_2006

NUTS_RG_10M_2006
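The segment order and the effect of leading zeroes on sorting can be sketched as follows. The helpers `class_name` and `padded_scale` are hypothetical names introduced for illustration, and the two-digit padding width is an assumption taken from the examples above.

```python
def class_name(theme, geometry, class_id=None, scale=None,
               version=None, time_stamp=None, source=None):
    """Join the segments of a class name in the order defined by the syntax rules."""
    ordered = [theme, class_id, geometry, scale, version, time_stamp, source]
    name = "_".join(str(s) for s in ordered if s is not None)
    if len(name) > 30:
        raise ValueError(f"{name} exceeds 30 characters")
    return name


def padded_scale(value, unit, width=2):
    """Zero-pad the numeric part of a scale segment, e.g. (3, "M") -> "03M"."""
    return f"{value:0{width}d}{unit}"


padded = [class_name("NUTS", "RG", scale=padded_scale(s, "M"), time_stamp=2006)
          for s in (10, 3)]
unpadded = [class_name("NUTS", "RG", scale=f"{s}M", time_stamp=2006)
            for s in (10, 3)]

print(sorted(padded))    # ['NUTS_RG_03M_2006', 'NUTS_RG_10M_2006']
print(sorted(unpadded))  # ['NUTS_RG_10M_2006', 'NUTS_RG_3M_2006']
```

Without padding, an alphabetical sort places the 1:10M class before the 1:3M class, which is the unstable ordering that the leading zero avoids.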

Class identification

In order to identify feature classes within a data theme, the class name may contain a class identification segment. The class identification will appear only if the data theme name and the geometric entity name (defined in the next section) do not uniquely identify a feature or object class (see the example below).

Every class identifier will have an associated short name. This short name will have a maximum of 4 characters.

The class identification must be chosen based on concepts essential to the class. Scale or time stamps are not essential to any class. Examples:

- City (Short name: CITY)

- Condominium (Short name: COND)

- Custom (Short name: CUST)

The class identification will always be a singular noun.

Example:

The data theme “Urban Audit” models 5 different entities, hierarchically structured: Subcity Districts (levels 1 and 2), Cities, Kernels and Large Urban Zones. Therefore the class identifications are needed:

- URAU_LUZ_RG (Urban Audit Large Urban Zone – Multipart polygon topology)

- URAU_KERN_RG (Urban Audit Kernel – Multipart polygon topology)

- URAU_CITY_RG (Urban Audit City – Multipart polygon topology)

- URAU_CITY_PT (Urban Audit City – Point topology)

- URAU_SCD1_RG (Urban Audit Sub-City District level 1 – Multipart polygon topology)

- URAU_SCD2_RG (Urban Audit Sub-City District level 2 – Multipart polygon topology)

Geometric entity

The geometric entity describes the type of geometric representation chosen to model a certain feature. The description of the geometric entity is mandatory. The geometric entity is abbreviated with 2 characters.

The following table lists the keywords that should be used for describing the type of geometric entity.

Short Name / Long Name / Description / Example
PL / Polygon / A closed, two-dimensional figure with at least three sides that represents an area. It is used to describe spatial elements with a discrete area, such as parcels or political districts. / Lake polygon
RG / Region / Area feature that can represent a single area feature as more than one polygon (multipart polygons). / NUTS regions
BN / Boundary / Line feature separating polygon or region features / NUTS boundary
LI / Line / Line feature representing a geographical entity / Rivers
NW / Network / An interconnected set of lines representing possible paths from one location to another (routing aspect) / Maritime routes
LB / Label / Point feature, used as reference of a polygon / Centroid of NUTS region
PT / Point / Feature modelled as point / Settlement
ND / Node / End point of a line or network feature / Road junctions
AN / Annotation / Text feature for annotating a map / Ocean names
RT / Route / Linear feature specifying a path through a network
GR / Grid / A data format for storing raster data that defines geographic space as an array of equally sized square cells arranged in rows and columns. Each cell stores a numeric value that represents a geographic attribute (such as elevation) for that unit of space. / Digital elevation model
IM / Image / A raster-based representation or description of a scene, typically produced by an optical or electronic device, such as a camera or a scanning radiometer. / Satellite image
AT / Attributes / Alphanumeric tabular information (without geometry) relevant to some geographical feature in the GeoDatabase. / NUTS region attributes.
RL / Relationships / Tabular definition (without geometry) of a many-to-many relationship. / Custom - country.
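Because the geometric entity segment is mandatory and drawn from a closed list, a class name can be split around it. The sketch below assumes that the first segment after the data theme that matches one of the codes in the table is the geometric entity; `split_class_name` is a hypothetical helper.

```python
GEOMETRIC_ENTITIES = {
    "PL": "Polygon", "RG": "Region", "BN": "Boundary", "LI": "Line",
    "NW": "Network", "LB": "Label", "PT": "Point", "ND": "Node",
    "AN": "Annotation", "RT": "Route", "GR": "Grid", "IM": "Image",
    "AT": "Attributes", "RL": "Relationships",
}


def split_class_name(name):
    """Split a class name into (leading segments, geometry code, trailing segments)."""
    segments = name.split("_")
    # The data theme always comes first, so the search starts at the second segment.
    for i in range(1, len(segments)):
        if segments[i] in GEOMETRIC_ENTITIES:
            return segments[:i], segments[i], segments[i + 1:]
    raise ValueError(f"no geometric entity segment in {name}")


head, geom, tail = split_class_name("URAU_CITY_PT")
print(head, GEOMETRIC_ENTITIES[geom], tail)  # ['URAU', 'CITY'] Point []
```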

Additional information

Additional segments are often needed in order to uniquely name a table. These additional strings might refer to the scale, precision, version or time stamp, and the source.

Classes that implement one-to-one relationships

The attributes of an object class having a 1-to-1 relationship to a feature class will often be integrated in the feature class attribute table, i.e. the object class will not exist. In the following real example, the Large Urban Zone geometry and all its attributes are integrated in one single feature class:

This does not apply when the object class has a 1-to-1 relationship to at least two feature classes (for instance, different generalisations of the same real world entity, or real world entities modelled as both points and polygons). In these cases redundancy should be avoided, and the attributes will be either integrated in one of the feature classes or, preferably, separated into an object class. In the latter case the name of the object class will be built from all the concepts common to the feature classes it describes. For instance:

Model 1a: Bad design with high level of redundancy

Model 1b: Attribute redundancy suppressed

The information about the real world entity "Settlement" in model 1b is split without redundancy across four different classes: STTL_RG_01M_V3 (multi-part polygons, 1:1M scale), STTL_RG_250K_V3 (multi-part polygons, 1:250 000 scale), STTL_PT_V3 (point topology) and STTL_AT_V3 (Attributes). All the information about one settlement can be brought together by means of its unique identifier (STTL_ID), which is the same for a given settlement (real world entity) in all four classes.
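The way the shared identifier ties the four classes of model 1b together can be sketched with plain dictionaries keyed by STTL_ID. All identifiers, the attribute name POPL and every value below are invented for illustration; the real classes are database tables, not Python objects.

```python
# Hypothetical rows: identifiers, attribute names and values are invented.
sttl_at_v3 = {"S001": {"POPL": 120000}, "S002": {"POPL": 8500}}
sttl_pt_v3 = {"S001": (4.35, 50.85), "S002": (6.13, 49.61)}
sttl_rg_01m_v3 = {"S001": "polygon A (1:1M)", "S002": "polygon B (1:1M)"}
sttl_rg_250k_v3 = {"S001": "polygon A (1:250 000)", "S002": "polygon B (1:250 000)"}


def settlement(sttl_id):
    """Collect everything known about one settlement from the four classes."""
    return {
        "STTL_ID": sttl_id,
        "attributes": sttl_at_v3.get(sttl_id),
        "point": sttl_pt_v3.get(sttl_id),
        "region_01m": sttl_rg_01m_v3.get(sttl_id),
        "region_250k": sttl_rg_250k_v3.get(sttl_id),
    }


print(settlement("S002"))
```

The attributes are stored once, in the object class, and every geometry class reaches them through the same key.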

Classes that implement many-to-many relationships

The many-to-many relationships are physically stored as an object class. The name of the object class that represents a many-to-many relationship will be built from the following segments:

- First end class/subtype short name

- Second end class/subtype short name

- A noun (when possible a “-ship” noun) that gives a general description of the relationship

When both ends belong to the same data theme, the common segments will appear only once:

- The data theme short name should be omitted.

- If both have the same time stamp and/or version, these should appear only once, at the second end.

The noun for describing the relationship has a maximum length of 8 characters. If the resulting table name exceeds 30 characters, the noun will be shortened to reach the maximum number of characters allowed.

It is not necessary to create an artificial ID for these tables.
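The noun shortening rule can be sketched as follows. This is a simplified illustration under stated assumptions: `relationship_table_name` is a hypothetical helper, the end names used in the calls (including CNTR for a country class) are invented, and the omission of segments shared by both ends is left to the caller.

```python
MAX_NAME_LENGTH = 30
MAX_NOUN_LENGTH = 8


def relationship_table_name(first_end, second_end, noun):
    """Join the two end short names and a relationship noun of at most 8 characters."""
    noun = noun.upper()[:MAX_NOUN_LENGTH]
    name = f"{first_end}_{second_end}_{noun}"
    excess = len(name) - MAX_NAME_LENGTH
    if excess > 0:
        if excess >= len(noun):
            raise ValueError("end names leave no room for the noun")
        noun = noun[:-excess]  # shorten only the noun, as the rule prescribes
        name = f"{first_end}_{second_end}_{noun}"
    return name


print(relationship_table_name("CUST", "CNTR", "membership"))
# CUST_CNTR_MEMBERSH
print(relationship_table_name("URAU_CITY_2004", "NUTS_2006", "membership"))
# URAU_CITY_2004_NUTS_2006_MEMBE
```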

Examples:

Class identifications not needed:

The data theme “NUTS” models only NUTS regions. The class identification can be dropped:

NUTS_NUTS_RG (Wrong)

NUTS_RG (Correct)

Class identification required:

The data theme “Urban Audit” models 4 different entities: Sub-city Districts, Cities, Kernels and Large Urban Zones. Therefore the class identifications are needed:

- URAU_CITY_RG (Urban Audit – City – Multipart polygon topology)

- URAU_LUZ_RG (Urban Audit – Large Urban Zones – Multipart polygon topology)

- URAU_CITY_PT (Urban Audit – City – Point topology)

Generalisations

- COAS_LI_01M (coastline, line topology, scale 1:1 000 000)

- COAS_LI_100K (coastline, line topology, scale 1:100 000)

Time or version stamps

STFD_PL_2000_2006 (Data theme – Geometric entity – Time stamp): Structural Funds. Since only one scale version is available, “01M” is not strictly needed; this information can be found in the metadata. Nevertheless, the scale segment can be included when generalisations of the dataset are likely to be developed and hosted in the geodatabase in the near future.

In general, version and time stamp will only exceptionally appear at the same time. Since it is more user-friendly, time stamps are preferable to versions.

The source name will rarely be needed.

A list of “Class identifications” and “class identification short names” is available in annex 5. It must be updated after every new geodataset development. Before defining a new “class identification”, it should be verified that none of the existing ones is suitable for the new class. Geometric entity names should not be used as class identifiers, e.g. label, regions, etc.

Attribute names

The common object-oriented nomenclature concatenates class and attribute to uniquely identify an attribute (example: “AIRP_PT.ALT”). This way, an attribute name can be repeated in different feature or object classes. The attribute name will omit all references to the feature class identification, scale or time stamp that are already included in the feature or object class name. Example:

Feature class name: AIRP_PT

Good attribute name: ALT

Bad attribute name: AIRP_ALT
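A simple way to spot attribute names that repeat information from the class name is to compare their segments. The helper `redundant_segments` below is a hypothetical illustration of that check, not an existing tool.

```python
def redundant_segments(class_name, attribute_name):
    """Return the attribute segments that repeat a segment of the class name."""
    class_segments = set(class_name.split("_"))
    return [s for s in attribute_name.split("_") if s in class_segments]


print(redundant_segments("AIRP_PT", "ALT"))       # []
print(redundant_segments("AIRP_PT", "AIRP_ALT"))  # ['AIRP']
```

Primary keys are a deliberate exception to this check, since they are named after the data theme short name.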

Primary key

The first step in naming attributes is to identify the conceptual “primary key”. The primary key will be named after the data theme short name + “ID”. When this is not sufficient, the name will be “Data theme short name – Feature/object class identification – ID”.

Examples:

- NUTS_ID

- URAU_CITY_ID

Several classes might have the same name for the primary key. This will be correct when the classes are models of the same entity in the real world. For instance, cities in the Urban Audit are modelled as points (URAU_CITY_PT), as regions (URAU_CITY_RG) and as tabular information (URAU_CITY_AT). All these classes will have the same primary key: URAU_CITY_ID.

All classes, except relationship classes, will have an explicit ID attribute. Primary keys in relationship classes are by definition built out of two or more attributes. In these cases there is no added value in an explicit ad-hoc unique identifier attribute.
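The primary key rule can be sketched by taking the segments that precede the geometric entity code and appending ID. This is a simplification: the hypothetical helper `primary_key_name` always keeps the class identification when the class name has one, whereas the convention only requires it when the data theme alone is not sufficient. It should not be applied to relationship (RL) classes, which have no explicit ID.

```python
GEOMETRY_CODES = {"PL", "RG", "BN", "LI", "NW", "LB", "PT", "ND",
                  "AN", "RT", "GR", "IM", "AT", "RL"}


def primary_key_name(class_name):
    """Derive the ID attribute name from the segments before the geometry code."""
    segments = class_name.split("_")
    for i in range(1, len(segments)):
        if segments[i] in GEOMETRY_CODES:
            return "_".join(segments[:i] + ["ID"])
    raise ValueError(f"no geometric entity segment in {class_name}")


for name in ("NUTS_RG", "URAU_CITY_PT", "URAU_CITY_RG", "URAU_CITY_AT"):
    print(name, "->", primary_key_name(name))
# NUTS_RG -> NUTS_ID
# URAU_CITY_PT -> URAU_CITY_ID
# URAU_CITY_RG -> URAU_CITY_ID
# URAU_CITY_AT -> URAU_CITY_ID
```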