Data Modeling
1) What is a Data Model
2) The History of Data Modeling
3) Data Modeling Explained
4) Data Warehouse Glossary
5) Entity Relationship Model
6) Physical Data Models
7) Tips for Mastering Data Modeling
8) Role of Data Modeling within Enterprise Management
9) Data Modeling Overview
10) Data Modeling Introduction
11) Connection between Data Model and Data Warehouse
What is a Data Model?
Quite simply, data models are abstract models whose purpose is to describe how data can be used and represented effectively. The term “data model” is, however, used in two different ways. The first is in talking about data model theory – that is, formal descriptions of how data can be used and structured.
The second is in talking about an instance of a data model – in other words, how a particular data model theory is applied in order to make a proper data model instance for a specific application.
Data modeling refers to the process whereby data is structured and organized. It is a key component of the field of computer science. Once data is structured, it is usually implemented in what is called a database management system. The main idea behind these systems is to manage vast amounts of both structured and unstructured data. Unstructured data includes word-processing documents, e-mail messages, pictures, and digital video and audio files. Structured data – what is needed to make a data model (via a data model theory) – is found in management systems like relational databases. A data model theory is the formal description of a data model.
In software development, a project may begin by focusing on the design of a conceptual or logical data model. Once the project is well under way, the model is usually refined into a physical data model. These two descriptions – logical and physical – represent two ways of describing data models. The logical description focuses on the basic features of the model, independent of any particular implementation. The physical description, on the other hand, focuses on the implementation in the particular database hosting the model’s features.
Now let’s take a look at the structure of data. A data model describes the structure of data within the confines of a particular domain and, by implication, the underlying structure of that domain itself. What this means is that a data model actually specifies a special “grammar” for the domain’s own private artificial language.
Data models represent the different classes of entities a company wants to hold information about, the attributes of that information, and the relationships among those entities and attributes. The data may be represented on the actual computer system in a different fashion from the way it is described in the data model.
The entities, or types of things, represented in a data model might be tangible, but models whose entity classes are that concrete tend to change over time. Robust data models instead identify abstractions. A data model might have an entity class marked “Persons,” meant to represent all the people who interact with a company. Such an abstract entity is more appropriate than classes called “Salesman” or “Boss,” which would pin down one special role played by certain people.
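To make the idea concrete, here is a minimal sketch in Python (the class, names, and roles are invented for the example): one abstract Person entity carries a role attribute, rather than separate Salesman and Boss classes.

    from dataclasses import dataclass

    # One abstract entity class instead of role-specific classes such as
    # "Salesman" or "Boss"; the role becomes an attribute. Illustrative only.
    @dataclass
    class Person:
        name: str
        role: str  # e.g. "salesman", "boss", "customer"

    people = [Person("Alice", "boss"), Person("Bob", "salesman")]

    # The same entity class covers every role a person might play, so the
    # model survives reorganizations that rename or merge roles.
    for p in people:
        print(f"{p.name} acts as {p.role}")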
A conceptual data model describes the semantics of a particular subject area. It is basically a collection of assertions about the kinds of information being used by a company. Entity classes are named in natural language rather than technical jargon, and giving entities and relationships proper names is what makes concrete assertions about the subject area possible.
Another way of organizing data involves the use of a database management system, with relational tables, columns, classes, and attributes. Models at this level are sometimes called “physical data models,” but in the ANSI three-schema architecture they are referred to as “logical.” In that architecture, it is the physical model that describes the storage media – cylinders, tracks, tablespaces, and so on. The physical model should be derived from the more conceptual one, though there may be slight differences, for example to account for usage patterns and processing capacity.
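As a rough illustration of these levels – a sketch only, using SQLite from Python’s standard library, with the table and index names invented – the logical model defines tables and columns, while physical choices such as indexes are derived from it to suit usage patterns:

    import sqlite3

    conn = sqlite3.connect(":memory:")

    # Logical level: entity classes become tables, attributes become columns.
    conn.execute("""
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            role      TEXT
        )
    """)

    # Physical level: storage choices derived from the logical model but
    # adjusted for usage patterns (here, frequent lookups by name).
    conn.execute("CREATE INDEX idx_person_name ON person (name)")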
Data analysis is a term that has become synonymous with data modeling, although in truth the activity has more in common with synthesis than analysis. Synthesis, after all, refers to the process whereby general concepts are inferred from particular instances; in analysis, the opposite happens – particular concepts are identified within more general ones. I guess the professionals call themselves systems analysts because no one can pronounce “systems synthesists”! All joking aside, data modeling is an important method whereby the various data structures of interest are brought together into one cohesive whole, relating the structures to one another and eliminating redundancies – making everyone’s lives a lot easier!
The History of Data Modeling
The programming of computers is an abstract realm of thought. In the ‘70s, it was thought that people would benefit from an increased use of graphic representations. On the process side, flow charts led to data flow diagrams. Then, in the mid-‘70s, entity relationship modeling was created as a means of graphically representing data structures.
Entity relationship models are used during the first stage of information system design, the requirements analysis phase, to identify the types of information that need to be stored in the database. The technique can be used to describe any ontology for a specific area of interest. If the information system being designed is based on a database, the conceptual data model will later be mapped onto a logical data model, which in turn will be mapped onto a physical model during the physical design process. (Sometimes both of these later phases are referred to as “physical design.”)
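For instance – a hypothetical sketch, with all table and column names invented – a conceptual ER model stating that a customer places orders might be mapped onto a relational schema like this, in Python with SQLite:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );

        -- The one-to-many "places" relationship from the ER diagram
        -- becomes a foreign key on the many side.
        CREATE TABLE "order" (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
            total       REAL
        );
    """)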
Object-oriented programming has been used in writing programs since the 1960s. In the very beginning, programs were organized around what they did, with data attached only as necessary. Object-oriented programmers instead organized their work around the objects the data described. For real-time systems, this was a major breakthrough. In the 1980s, the approach broke into the mainstream data processing scene, when graphical user interfaces introduced object-oriented programming to commercial applications. Practitioners realized that the problem of defining requirements would benefit enormously from insight into the realm of objects, and the idea of object models was introduced – without acknowledgment of the fact that systems analysts had already discovered similar models.
In the ‘80s, a significantly new approach to data modeling was engineered by G.M. Nijssen. Dubbed NIAM, short for “Nijssen’s Information Analysis Methodology,” it has since been renamed ORM, or “object role modeling.” Its purpose is to represent relationships directly rather than showing entity types as analogs of relational tables. With its focus on using language to make data modeling accessible to a wider audience, ORM has a much higher potential for describing business rules and constraints.
In recent times, agile methodologies have risen to the forefront of data modeling. Central to these methodologies is the concept of evolutionary design. From this standpoint, you acknowledge that you cannot fix all of a system’s requirements up front. It is not practical to have a detailed design phase at the very beginning of a project. Instead, the system’s design must evolve through the software’s numerous iterations.
People tend to learn things by experimenting – trying new things out. Evolutionary design recognizes this key component of human nature. Under this concept, developers are expected to experiment with ways of implementing a particular feature. It may take them several tries before they arrive at a settled method. It is the same for database design.
Some experts claim that it is virtually impossible to manage multiple databases. Others claim that, given the right tools, a database manager can look after a hundred database instances with relative ease.
While experimentation is important, it is also important to bring the different approaches one has tried back together into an integrated whole every once in a while. For this, you need a shared master database that all work flows out of. When beginning a task, copy the master into your own workspace, manipulate it there, and then enter the changes back into the master copy. Be sure to integrate at least once a day.
Integrating constantly is key. It is a lot easier to perform small integrations frequently than to perform large integrations occasionally, because the difficulty of integration tends to increase exponentially with its size. In other words, making many small changes is a lot easier in practice, even if at first it might not seem to make sense. The Software Configuration Management community has noted the same pattern when dealing with source code.
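One way to picture this – a minimal sketch, assuming a SQLite database and invented table names – is to keep schema changes as small, ordered migration steps that everyone applies to their own copy of the master:

    import sqlite3

    # Each entry is one small schema change; integrating often means this
    # list grows by small steps rather than rare, sweeping rewrites.
    MIGRATIONS = [
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)",
        "ALTER TABLE customer ADD COLUMN email TEXT",
        "ALTER TABLE customer ADD COLUMN region TEXT",
    ]

    def migrate(conn):
        """Apply any migration steps this database copy has not seen yet."""
        version = conn.execute("PRAGMA user_version").fetchone()[0]
        for step in MIGRATIONS[version:]:
            conn.execute(step)
            version += 1
        conn.execute(f"PRAGMA user_version = {version}")

    conn = sqlite3.connect(":memory:")
    migrate(conn)  # small, frequent integrations stay cheap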
Data Modeling Explained
Data modeling is a computer science term that describes the process of generating a data model. A data model is generated by applying a formal theory known as a data model theory; the result is an entity known as a data model instance.
When you go through the process of data modeling, you are essentially organizing data, as well as creating a structure for it. Once the data has been organized, it is placed in a DBMS, or database management system. As the data is organized, the modeling process imposes constraints on its structure. One of the primary functions of information systems is to manage large amounts of data, both structured and unstructured.
Data models typically deal with the structured data used within relational databases. They are rarely used for unstructured data – pictures, video, or documents created in word processing programs, for example. When a software product is in the early stages of development, great importance is placed on the structure of a conceptual data model. This design can then be transformed into a logical data model, and at later stages of the development process into a physical data model. A data model is thus commonly described in two ways: physically or logically.
The physical picture of a data model deals with the implementation of the specific database that will host the model. The logical picture deals with the generic aspects of the model and is not concerned with any particular implementation. The structure of the data also plays an important role: the data model provides a description of the structure of the data as well as the underlying structure of the domain. As you can see, data models play an important role in the structures of both domains and data. A data model can also be described as an entity that symbolizes classes of various objects.
These classes are closely related to information – the information a company stores, the characteristics inherent in that information, and the relationships among those characteristics. How the data is physically represented on the computer system may differ from the way the model describes it; the data model places its emphasis on how the data is organized. While the objects a data model represents may be tangible, models that deal with such concrete classes tend to change over time. A highly robust data model instead finds abstractions for these objects.
A conceptual data model can be used to showcase the semantics of various topics. It can be presented as a collection of assertions about the function of information, information that may be used by various companies or organizations. Many of its classes are named with common words rather than the technical terms of the data modeling field. This is important, because giving proper names to entities and relationships allows strong assertions to be made about the subject area.
Another concept that you will want to study is generic data modeling. Different modelers will produce different models for the same domain, and this can make it hard to bring together models from distinct people or organizations.
It should be noted, though, that the differences are mostly a matter of the varying levels of abstraction in the models. If the modelers can agree on certain concrete elements, the differences between their models become less pronounced, and the models can be rendered at a higher level of detail. Data models have played important roles in many database management systems, and they have only become more important as we move further into the information age. Companies that understand how to properly use data models will benefit greatly, and there are a large number of fields where data modeling techniques are very useful.
Data Warehouse Glossary
Because of the complexity surrounding data warehouses, there are a number of terms that you will want to become familiar with. While there are too many terms to present in this article, I will go over the fundamental terms that you should know. Understanding the terminology surrounding a data warehouse will make it easier for you to learn how to use it effectively, and it will make communication with your peers easier.
Access - Access can be defined as the process of obtaining data from the databases that exist within the data warehouse. It is a fundamental term, and it is one that everyone who works with a data warehouse should know.
Ad hoc query - This is a request for data that cannot be prepared for in advance. An ad hoc query generally consists of an SQL statement built by a skilled user, often with the help of a data access tool.
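As a small illustration (a sketch with invented table and column names, using SQLite from Python), the user simply writes the statement on the spot in response to a new question:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 100.0), ("west", 250.0), ("east", 75.0)])

    # Not prepared in advance; written directly to answer one question.
    adhoc = "SELECT region, amount FROM sales WHERE amount > 90"
    for row in conn.execute(adhoc):
        print(row)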
Aggregation - The procedure in which data values are grouped so that they can be managed as a single unit. One good example would be fields about one customer, drawn from numerous places, being combined into a single record.
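For example (a sketch with invented names, continuing in Python with SQLite), SQL’s GROUP BY collapses many detail rows into one aggregated unit per group:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("east", 100.0), ("west", 250.0), ("east", 75.0)])

    # Many rows per region are grouped and managed as a single value each.
    for region, total in conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"):
        print(region, total)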
Analysis - Analysis occurs when a user takes data from the warehouse and studies it. This is a fundamental concept, since studying the data is what allows the user to make important business decisions.
Anomaly - An anomaly is a situation where a user gets a result that is unexpected or strange. It may also be known as a data anomaly. One of the most common scenarios in which an anomaly occurs is when a data unit is defined for one specific purpose but used for another. One example of an anomaly is a number that has a negative value, or a value that is too high, for the entity it represents.
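A tiny sketch of such a check in Python (the field and thresholds are invented for the example):

    def find_anomalies(ages):
        """Flag values that cannot plausibly represent a person's age."""
        return [a for a in ages if a < 0 or a > 120]

    print(find_anomalies([34, -2, 56, 430]))  # -> [-2, 430]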
Architecture - The architecture is the underlying structure for the data warehouse. It represents the planning of the warehouse, as well as the implementation of the data and the resources that are used to deal with it. The architecture of a data warehouse can be broken down into technologies, data, and processes. The architecture is a blueprint, and it will provide a description for the data warehouse and its environment.
Atomic data - As the name implies, atomic data is data that has been broken down into its simplest form, much like matter is broken down into atoms and subatomic particles.
Attributes - This is a term closely related to both data modeling and data warehouses. It refers to the characteristics that a piece of data has. Each attribute has its own values, and when a logical model is transformed into a physical model, entities are transformed into tables while the attributes themselves become columns.
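A hypothetical sketch of that transformation in Python (the entity, attribute names, and type mapping are all invented): the attributes listed for a logical entity become the columns of a physical table.

    import sqlite3

    entity = "employee"
    attributes = {"employee_id": "INTEGER PRIMARY KEY",
                  "name": "TEXT",
                  "hire_date": "TEXT"}

    # Each attribute of the logical entity becomes one column.
    columns = ", ".join(f"{n} {t}" for n, t in attributes.items())
    ddl = f"CREATE TABLE {entity} ({columns})"

    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    print(ddl)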
Back-end - The back-end refers to the process of filling the data warehouse with data that comes from operational systems.
Best of Breed - This term refers to the most powerful products within their various categories. When an organization chooses its tools, it will find that some are better than others. By choosing the best products from the best vendors, the efficiency of the data warehouse is greatly increased.
Best practices - The best practices are the processes that maximize a company's use of the data warehouse.
Business analyst - A business analyst is a person responsible for studying the data and the operations used to maintain it.
Business Intelligence - Business intelligence is an important concept that deals with the evaluation of business data. It deals with both databases and various applications. Business intelligence is a broad term that deals with a large number of topics. Some of them include data mining and alternative forms of storage.