Contact person / Harry Goossens
Status / Operational; the next development phase started in 2012
General Description
PRACTICE:
CBS has implemented the so-called Data Service Centre (DSC), which is a
- Fundamental cornerstone/element of the CBS Business Architecture
- Central ‘vault’ with Steady States, linking:
-conceptual metadata (description)
-technical metadata (user’s guide)
-documentation
- Implementation of the Dutch metadata model
- Generic services for data exchange between statistical processes.
WHICH PROBLEMS DID YOU WANT TO RESOLVE WITH THE PRACTICE?
- Improve data consistency:
-metadata = mandatory, no more datasets without metadata
- Improve flexibility
-Make data linkable in a standardised way
- Secure the statistical process
- Maximise re-use of datasets
Basic concepts:
- Storage of DATA (steady states) after each processing step AND METADATA (no data without metadata!)
- Based upon a dedicated CBS metadata model
- Strict distinction between the data that are actually processed and the metadata that describe the definitions, the quality and the process activities
- Steady states are explicitly designed for re-use.
- The metadata (of steady states) are generally accessible and are standardised as much as possible
- ‘frozen’ final status of a dataset, defined in the process design phase;
- data set together with information for its correct interpretation:
represented by 2 metadata objects:
-dataset (the actual statistical data)
- 1 Dataset design, n Datasets
- Rectangular
-Columns represent variables
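The steady-state structure described above (1 dataset design, n datasets, rectangular, columns represent variables, no data without metadata) can be sketched roughly as follows. This is only an illustration, not the DSC implementation: the class names `DatasetDesign`, `Dataset` and `SteadyStateStore`, their fields, and the example variables are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DatasetDesign:
    """Metadata object describing a steady state (hypothetical fields)."""
    name: str
    variables: Tuple[str, ...]  # one entry per column


@dataclass(frozen=True)
class Dataset:
    """The actual statistical data, always tied to exactly one design."""
    design: DatasetDesign
    period: str
    rows: Tuple[tuple, ...]

    def __post_init__(self):
        # Rectangular: every row holds exactly one value per variable.
        width = len(self.design.variables)
        for i, row in enumerate(self.rows):
            if len(row) != width:
                raise ValueError(
                    f"row {i} has {len(row)} values, expected {width}"
                )

    def column(self, variable: str) -> list:
        # Columns represent variables, so a variable name selects a column.
        idx = self.design.variables.index(variable)
        return [row[idx] for row in self.rows]


class SteadyStateStore:
    """In-memory stand-in for the central 'vault' (assumption)."""

    def __init__(self):
        self._designs: Dict[str, DatasetDesign] = {}
        self._datasets: Dict[str, List[Dataset]] = {}

    def register_design(self, design: DatasetDesign) -> None:
        self._designs[design.name] = design
        self._datasets.setdefault(design.name, [])

    def add_dataset(self, dataset: Dataset) -> None:
        # No data without metadata: the design must be registered first.
        if dataset.design.name not in self._designs:
            raise LookupError(
                f"no registered design for '{dataset.design.name}'"
            )
        self._datasets[dataset.design.name].append(dataset)

    def datasets_for(self, design_name: str) -> List[Dataset]:
        # 1 dataset design, n datasets.
        return list(self._datasets[design_name])
```

In this sketch a dataset whose design has not been registered is rejected, and a dataset whose rows do not match the design's variables cannot be constructed at all.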
It offers generic services:
- Metadata Catalogue: searching & finding
- Metadata coordination
- Centralised data distribution
- Authorisation management
- Automatic process interfacing (in development)
- Archiving of statistical datasets
- pilot division, 4 people
- 2012 regular division, 5 people
- Growing interest, users begin to see benefits
- Push from architecture, redesign projects
Characteristics / Mix of perspectives/concepts, based upon Business architecture, which strictly splits:
a) DSC concept
-Passive
-Data-model oriented – steady states concept
b) Generic process design
-Active
-Process-model oriented
Problems
Which problems have you encountered so far?
- Development time considerably higher than initially foreseen, caused by overly ambitious aims at the start.
- Metadata quality essential for success
-using guidelines/rules (ISO 11179)
-Poor / no metadata available
- Metadata coordination very difficult
- Processes first need to be (re)designed according to CBS business architecture
- Shortage of capacity
- Concept of steady states not clear, multiple interpretations
- Acceptance of DSC by business
-new way of working
-no direct benefit for producers
- Additional workload
-metadata should be created during design
-migration of already existing datasets leads to extra work for production (with less capacity)
Desired solutions
Most important solutions desired by NSI (CBS):
- Generic way of processing, using steady states (based upon CBS business architecture)
- Provide 1 central point for data exchange between processes
- Maximal re-use of statistical data
- 1 central metadata catalogue
- Metadata coordination
Role of metadata
What kind of metadata is used?
- Conceptual, descriptive metadata, stored and maintained in a central metadata catalogue
- Technical metadata in separate, standard XML files
- Process metadata as separate documentation, no standard yet
- Yes, generic model, dedicated to:
- Describe datasets, including process and Q-metadata
- Search & find datasets (catalogue)
- Treats micro data and macro data differently
- Inspired by Swedish model and Neuchatel (among others)
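As an illustration of the "search & find" role of the catalogue, a minimal sketch follows. It assumes a catalogue entry carries a name, a free-text description, the variable names and a micro/macro level; `CatalogueEntry`, `search` and all example values are hypothetical and do not reflect the actual CBS metadata model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CatalogueEntry:
    """Conceptual metadata for one dataset design (hypothetical fields)."""
    name: str
    description: str
    variables: Tuple[str, ...]
    level: str  # "micro" or "macro" (assumed labels)


def search(
    entries: List[CatalogueEntry],
    term: str,
    level: Optional[str] = None,
) -> List[CatalogueEntry]:
    """Return entries whose name, description or variables mention `term`.

    Matching is a case-insensitive substring test; `level` optionally
    restricts the result to micro or macro data.
    """
    needle = term.lower()
    hits = []
    for entry in entries:
        if level is not None and entry.level != level:
            continue
        haystack = " ".join(
            (entry.name, entry.description) + entry.variables
        ).lower()
        if needle in haystack:
            hits.append(entry)
    return hits
```

A call such as `search(entries, "turnover", level="macro")` would then return only the macro-level entries that mention turnover in their name, description or variables.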
Role / Position BR
- Outside S-DWH
- Data made linkable
- Monthly snapshot as population frame