Country / NSI / Statistics Netherlands (CBS)
Contact person / Harry Goossens
Status / Operational; the next development phase started in 2012
General Description
PRACTICE:
CBS has implemented the so-called Data Service Centre (DSC), which is a
  • Fundamental cornerstone/element of the CBS Business Architecture
  • Central ‘vault’ with Steady States, linking:
-statistical data (facts & figures)
-conceptual metadata (description)
-technical metadata (user’s guide)
-documentation
  • Implementation of the Dutch metadata model
  • Generic services for data exchange between statistical processes.
The Data Service Centre is implemented for SBS statistics and for archiving, and has been in use since 2010 (?)
WHICH PROBLEMS DID YOU WANT TO RESOLVE WITH THE PRACTICE?
  • Improve data consistency:
-Every following process must use the same dataset, no local versions
-metadata = mandatory, no more datasets without metadata
  • Improve flexibility
-Enabling independent, generic process design
-Make data linkable in a standardised way
  • Secure the statistical process:
-Each steady state is a guaranteed fall back point
  • Maximise re-use of datasets:
FUNDAMENTALS OF THE PRACTICE
Basic concepts:
  • Storage of DATA (steady states) after each processing step AND METADATA (no data without metadata!)
  • Based upon dedicated CBS metadata model
  • Strict distinction between the data that are actually processed and the metadata that describe the definitions, the quality and the process activities
  • Steady states are explicitly designed for re-use.
  • The metadata (of steady states) are generally accessible and are standardised as much as possible
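The "no data without metadata" rule and the idea of saving a steady state after each processing step can be illustrated with a minimal sketch. It is not the DSC implementation (which is built on Documentum); the class names `Metadata` and `SteadyStateStore`, their fields, and the in-memory storage are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metadata:
    # Hypothetical fields; the real CBS metadata model is richer.
    description: str   # conceptual metadata
    process_step: str  # processing step that produced the data


@dataclass
class SteadyStateStore:
    """Toy in-memory store holding one frozen dataset per name."""
    entries: dict = field(default_factory=dict)

    def save(self, name, rows, metadata):
        # Rule from the text: no data without metadata.
        if metadata is None or not metadata.description.strip():
            raise ValueError(f"dataset {name!r} rejected: metadata is mandatory")
        # Keep an immutable copy so later steps cannot alter the saved state.
        self.entries[name] = (tuple(tuple(r) for r in rows), metadata)

    def load(self, name):
        # Every following process reads the same stored version.
        return self.entries[name]


store = SteadyStateStore()
store.save("sbs_edited", [(1, 250)], Metadata("Edited SBS turnover", "editing"))
```

Calling `store.save("x", rows, None)` raises a `ValueError`, which mirrors the rule that a dataset without metadata is not accepted.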
Steady States:
  • ‘frozen’ final status of a dataset, defined in the process design phase;
  • data set together with information for its correct interpretation:
    represented by 2 metadata objects:
-dataset design (like a template of a table, only borders and heading)
-dataset (the actual statistical data)
  • 1 Dataset design, n Datasets
  • Rectangular
-Rows represent units (micro) or classes of units (macro)
-Columns represent variables
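The relation between a dataset design and its datasets (1 design, n datasets, rectangular layout) can be sketched as follows. The class and field names (`DatasetDesign`, `Dataset`, `variables`, `level`, `period`) and the example values are hypothetical and are not taken from the CBS metadata model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetDesign:
    """Template of a table: only the headings, no data."""
    name: str
    variables: tuple  # column names, one per variable
    level: str        # "micro" (rows are units) or "macro" (rows are classes of units)


@dataclass(frozen=True)
class Dataset:
    """Actual statistical data conforming to one design."""
    design: DatasetDesign
    period: str
    rows: tuple  # each row holds one value per variable

    def __post_init__(self):
        # Rectangular: every row must have exactly one value per variable.
        for row in self.rows:
            if len(row) != len(self.design.variables):
                raise ValueError("row does not match the dataset design")


# One design, several datasets (for example one per reference period).
design = DatasetDesign("turnover", ("unit_id", "turnover"), "micro")
d2010 = Dataset(design, "2010", ((1, 250), (2, 90)))
d2011 = Dataset(design, "2011", ((1, 260), (2, 95)))
```

Freezing both classes reflects the "frozen" character of a steady state: once created, neither the design nor the data can be modified in place.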
It offers generic services:
  • Metadata Catalogue: searching & finding
  • Metadata coordination
  • Centralised data distribution
  • Authorisation management
  • Automatic process interfacing (in development)
  • Archiving of statistical datasets
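Two of the services listed above, the metadata catalogue (searching and finding) and authorisation management, can be illustrated with a small sketch. It assumes that catalogue metadata is searchable by everyone while access to the data itself depends on a role; the names `CatalogueEntry`, `search`, `can_read` and the example roles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    dataset_name: str
    description: str          # conceptual metadata, searchable by everyone
    allowed_roles: frozenset  # roles authorised to receive the data


def search(catalogue, keyword):
    """Return the names of all datasets whose description mentions the keyword."""
    keyword = keyword.lower()
    return [e.dataset_name for e in catalogue if keyword in e.description.lower()]


def can_read(entry, role):
    """Authorisation check applied before data is distributed."""
    return role in entry.allowed_roles


catalogue = [
    CatalogueEntry("sbs_2010_edited", "Edited SBS turnover, 2010",
                   frozenset({"sbs", "national_accounts"})),
    CatalogueEntry("sbs_2010_raw", "Raw SBS turnover responses, 2010",
                   frozenset({"sbs"})),
]

found = search(catalogue, "turnover")
readable = [e.dataset_name for e in catalogue if can_read(e, "national_accounts")]
```

Here `found` lists both datasets, while `readable` contains only `sbs_2010_edited`, because the hypothetical `national_accounts` role is not authorised for the raw responses.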
Organisational status:
  • Pilot division, 4 people
  • 2012: regular division, 5 people
  • Growing interest, users begin to see benefits
  • Push from architecture, redesign projects
The system was developed with the standard tool Documentum (also used outside Statistics Netherlands), which is completely object-oriented and therefore makes it possible to implement the CBS metadata model (objects with attributes).
Characteristics / Mix of perspectives/concepts, based upon Business architecture, which strictly splits:
a) DSC concept
-Passive
-Data-model oriented – steady states concept
b) Generic process design
-Active
-Process-model oriented
Problems
Which problems did you encounter so far?
  • Development time considerably higher than initially foreseen, caused by overly ambitious aims at the start.
  • Metadata quality essential for success
-high priority issue
-using guidelines/rules (ISO 11179)
-Poor / no metadata available
  • Metadata coordination very difficult
  • Processes first need to be (re)designed according to CBS business architecture
  • Shortage of capacity
  • Concept of steady states not clear, multiple interpretations
  • Acceptance of DSC by business
    -new way of working
    -no direct benefit for producers
  • Additional workload
    -metadata should be created during design
    -migration of already existing datasets leads
    to extra work for production (with less capacity)

Desired solutions
Most important solutions desired by NSI (CBS):
  • Generic way of processing, using steady states
    (based upon CBS business architecture)
  • Provide 1 central point for data exchange between processes
  • Maximal re-use of statistical data
  • 1 central metadata catalogue
  • metadata coordination

Role of metadata
What kind of metadata used
  • conceptual, describing metadata,
    stored and maintained in a central meta catalogue
  • technical metadata in separate, standard XML files
  • process metadata as separate documentation, no standard yet
Is there a metadata model
  • Yes, generic model, dedicated to:
-Describe datasets, including process and Q-metadata
-Search & find datasets (catalogue)
  • Treats micro data and macro data differently
  • Inspired by Swedish model and Neuchatel (among others)

Role / Position BR
  • Outside S-DWH
  • Data made linkable
  • Monthly snapshot as population frame
