Dr. Bjarne Berg DRAFT
Bedrock Requirements
Ralph Kimball has introduced six bedrock requirements for a data warehouse. The first is that the data warehouse must provide access to information. This requirement extends beyond mere availability: users must be able to find information easily, access it directly (not through a third party), retrieve it quickly through acceptable query performance, and navigate it simply for analytical purposes.
The second requirement is that the data in the data warehouse is consistent. This means not only that the data is accurate, but also that it can be reconciled across systems. The data must carry the same meaning everywhere, so that a field such as “customer” is populated with the same type of data as in every other system containing that attribute (data definition standards). The data must also use the same calculations for key figures: if “net sales” is calculated one way in the data warehouse, that calculation must be consistent across all data stores, regardless of business unit or organization. Finally, Kimball’s definition of consistency also requires that users be notified about any data latency issues.
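One common way to enforce this kind of calculation consistency is to define each key figure exactly once and have every data store and report reuse that single definition. The Python sketch below is a minimal illustration of that idea; the formula and field names (gross_sales, returns, discounts) are assumptions chosen for the example, not definitions from Kimball.

def net_sales(row: dict) -> float:
    """The one agreed-upon definition of 'net sales'.

    Every report and data store calls this function instead of
    re-implementing the calculation, so the key figure reconciles
    across business units by construction.
    """
    # Illustrative formula; the real definition would come from
    # the organization's data definition standards.
    return row["gross_sales"] - row["returns"] - row["discounts"]

# Two different "systems" consuming the same shared definition:
warehouse_row = {"gross_sales": 1000.0, "returns": 50.0, "discounts": 25.0}
mart_row      = {"gross_sales": 1000.0, "returns": 50.0, "discounts": 25.0}

assert net_sales(warehouse_row) == net_sales(mart_row)  # reconciles by design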
The third bedrock requirement is that the data can be accessed and pivoted through slicing and dicing. This is a process whereby the fields of a dimension are used to place constraints on the data, e.g., “show me the sales for this region,” followed by “show me the cost of sales in the stores in the region I just selected.” This example demonstrates both drill-down (from region to store) and slicing of the data by reviewing cost of sales instead of the sales amount.
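The example can be made concrete with a few lines of code. The sketch below operates on a small in-memory fact table; the dimension and measure names (region, store, sales, cost_of_sales) are hypothetical, chosen only to mirror the example above.

# Hypothetical in-memory fact table: each row is one sale.
# Dimension fields: region, store. Measures: sales, cost_of_sales.
facts = [
    {"region": "East", "store": "E1", "sales": 100.0, "cost_of_sales": 60.0},
    {"region": "East", "store": "E2", "sales": 200.0, "cost_of_sales": 120.0},
    {"region": "West", "store": "W1", "sales": 150.0, "cost_of_sales": 90.0},
]

def total(rows, measure):
    return sum(r[measure] for r in rows)

# "Show me the sales for this region" -- constrain on the region dimension.
east = [r for r in facts if r["region"] == "East"]
print(total(east, "sales"))  # 300.0

# Drill down from region to store, and pivot to a different measure:
# "show me the cost of sales in the stores in the region I just selected."
for store in sorted({r["store"] for r in east}):
    rows = [r for r in east if r["store"] == store]
    print(store, total(rows, "cost_of_sales"))  # E1 60.0, E2 120.0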
The fourth requirement is that the data warehouse does not consist of data alone. It is a total architecture of tools and structures, and should include query tools and presentation tools that transform the data into information that end users can analyze.
The fifth requirement is that the data warehouse is used as a publishing hub for “old” data. Under this publishing idea, the data providers are responsible for making available only data that others can use without having to worry about data quality, missing elements, or missing records. Kimball introduces the concept of a “data quality manager” in the DRM, a role responsible for “editing” the data and releasing only high-quality data to the organization.
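In practice, this publishing metaphor often takes the shape of a quality gate: records are released to consumers only after passing explicit checks, and the rest are quarantined for the data quality manager to review. The sketch below shows one possible shape of such a gate, not Kimball’s specification; the required fields and the validity check are assumptions.

# Sketch of a publishing quality gate: only rows that pass every
# check are released downstream. The required fields and checks
# are illustrative assumptions, not Kimball's specification.

REQUIRED_FIELDS = ("customer", "order_date", "net_sales")

def passes_quality_gate(row: dict) -> bool:
    # Reject rows with missing elements...
    if any(row.get(f) is None for f in REQUIRED_FIELDS):
        return False
    # ...or obviously invalid values.
    return row["net_sales"] >= 0

def publish(rows):
    """Release only rows that pass the gate; quarantine the rest."""
    released = [r for r in rows if passes_quality_gate(r)]
    quarantined = [r for r in rows if not passes_quality_gate(r)]
    return released, quarantined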
The last bedrock requirement is that the data is actionable. This means that managers can use the data to change the way they do business, based on better insights. The ‘buzz word’ for this is business process reengineering. The implication is that the organization must capture complete sets of data in order to act on it later. With only a partial view of the activities in an organization, managers can be led astray or be unable to drill into the causation of events and market changes. Kimball therefore advocates that the organization collect the data it has, take a look at it, and then determine the impact of the missing data elements. This, he argues, will lead organizations to change their business processes and to start capturing more valuable information about those processes. However, if the data is not made actionable, due to data inconsistencies or missing data points, the organization cannot act and the value of the data warehouse is diminished.
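Kimball’s advice to look at the data on hand and determine the impact of the missing elements can be approximated with a simple completeness profile, sketched below. The column names are hypothetical; the output shows managers which fields have gaps that would later prevent drilling into causation.

# Sketch of a completeness profile: for each field, report what
# fraction of rows actually carry a value. Column names are
# hypothetical assumptions for the example.

def completeness_profile(rows):
    fields = {f for row in rows for f in row}
    total = len(rows)
    return {
        f: sum(1 for r in rows if r.get(f) is not None) / total
        for f in sorted(fields)
    }

orders = [
    {"customer": "A", "region": "East", "net_sales": 100.0},
    {"customer": "B", "region": None,   "net_sales": 80.0},
    {"customer": "C", "region": "West", "net_sales": None},
]
print(completeness_profile(orders))
# e.g. {'customer': 1.0, 'net_sales': 0.666..., 'region': 0.666...}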
While there are more bedrock requirements for a data warehouse (e.g., fault tolerance of the hardware, change management, communication, training, and documentation), Kimball argues that these are embedded in the six items listed in his book. However, a strong case can be made for addressing each of these additional requirements as “bedrock” requirements as well...
PS! It is worth noting that in an August 1996 article in DBMS online, Kimball wrote “The ODS must be the bedrock of the data warehouse itself,” but did not include this observation in his book...