Chapter 12 Designing Databases: Logical Data Modeling

Chapter 12

Designing Databases

Chapter Objectives

Chapter 12 introduces students to database design, discussing both logical and physical database design. During logical design, logical data models are created for each known user interface, the logical models for each interface are integrated into a consolidated logical database model, the application’s conceptual E-R data model is translated into normalized data requirements, and the logical database model is then integrated with the translated E-R model. During physical design, decisions about data types, data structures, file organizations, and media are made.

This chapter introduces the relational data model, the most common notation used for representing detailed data requirements necessary for database design. Concepts of the relational data model, normalization principles for creating relational models with desirable properties, a process for combining different relational data models into a consolidated relational data model, and how to translate an entity-relationship data model into a relational data model are presented.

Chapter 12 reviews several choices systems builders have for the design of physical data storage structures. Chapter 12 emphasizes those decisions for which a systems analyst is most likely to have input, as opposed to the very technical data structure decisions made by database administrators and analysts. You should emphasize to your students throughout this section that physical design issues are addressed as part of a team of system analysts and technology specialists. Systems analysts bring an in-depth understanding of the application to the table, whereas technology specialists intimately understand the relative efficiency, security, integrity, and reliability of different technologies in different settings. To communicate with these specialists, systems analysts must have a sound knowledge of physical design issues, which is accomplished from studying the chapters in this section. Some students are attracted to the more technical topics (possibly because of prior exposure to computer science topics), and others are interested in the more methodological and managerial issues. Thus, another challenge of teaching Chapter 12, as well as Chapters 15 and 16, is keeping a balance of business and technical issues so that you hold the attention of all students.

Chapter 12 provides a transition from typical systems analysis topics to data analysis topics; these topics are often discussed in database management courses. You should coordinate the teaching of this chapter with database management course faculty so that purposeful redundancy occurs and important topics are not missed across the courses in your curriculum.

Instructional Objectives

Specific student learning objectives are included at the beginning of the chapter. From an instructor’s point of view, the objectives of this chapter are to:

1. Show the relationship between systems analysis and design and database design. The philosophy of this textbook is that database design is a topic of joint interest between systems analysts and database specialists. In most cases, considerable interaction between conceptual and logical database design exists, combining the top-down approach of conceptual data modeling with a bottom-up approach using logical data modeling tools. Application area knowledge and enterprise database knowledge (often found in a data administration group) are necessary to converge to a viable logical design for an application’s database. Remember, an application’s logical database design does not imply a separate physical database for that application, but rather only a separate view of data, which may be part of a more comprehensive physical database.

2. Show the relationship between logical database design and physical database design. An important point under this objective is that it is the job of the analyst to capture during prior systems development phases--especially analysis and logical design--all of the parameters needed to make physical system design decisions. Although some information is not necessary for the techniques of prior stages, this information (such as field length, data integrity requirements, and an estimated number of entity instances) is essential for physical system design.

3. Present the relational data model as a logical data model that captures the structure of data in a very fundamental, stable form and suggests ways to organize data during physical database design, resulting in desirable data maintenance properties (which avoid certain data anomalies).

4. Show students how a conceptual data model can obscure some details about data requirements that must be better understood in order to perform physical database design.

5. Show students, using an example from Hoosier Burger, how to translate a conceptual data model into a logical data model and how to incorporate the data requirements of specific system outputs into the process of forming a logical data model.

6. Improve the ability of students who will take systems analyst jobs to communicate with technical specialists on systems development teams.

7. Emphasize the distinction between logical and physical system design by coverage of such topics as denormalization; such topics clarify the different objectives of logical and physical design (which is basically semantic richness of describing information requirements versus efficient and secure data processing).

8. Discuss database design for Internet-based electronic commerce applications.

Classroom Ideas

1. This chapter, like Chapter 10, covers a topic addressed in most database management courses. Depending on your curriculum, this chapter may review previously covered material or introduce material covered (in more depth) in a subsequent course. However, logical database design is not strictly a database topic, but is essential for thorough systems analysis, thus it is an activity that should not be assigned to only specialists (database analysts). Although you are strongly encouraged to cover this chapter in your systems analysis and design course, you should coordinate how you address this topic with those who teach database courses. Chapter 12 is carefully written for the systems analysis and design student. This chapter is an excellent refresher for students who have studied the relational data model and normalization and provides a solid introduction to these topics for those students who will address this topic later in a database management course.

2. Emphasize to your students that logical data modeling is still technology independent. A logical data model is not necessarily mapped on a one-to-one basis into a physical database design. The purpose of logical database modeling is to prepare the description of stored data requirements into a format that makes it easier for physical database design decisions to be made. Students may want to go directly from conceptual data modeling to physical database design, so spend some time discussing the purpose of logical database modeling.

3. Review the key steps of logical database modeling. It is important that novice data modelers understand that all four steps are necessary to produce a thorough logical data model as input to physical database design.

4. Understanding the relational data model is critical for doing logical database modeling. The relational model is fairly simple, and since most students have experience with a PC database management system, this model is intuitive for them. Emphasize the five properties of relations and the concept of anomalies.

5. You can introduce normalization to your students from two perspectives. One approach is to introduce first, second, and third normal forms and teach your students to transform unnormalized data into third normal form by stepping through each normal form in sequence. Another approach is to emphasize functional dependencies and determinants (see Problem and Exercise 6). Use whichever approach is most comfortable for you; this chapter supports either approach.

6. Consider spending a significant portion of your class periods (allocated to Chapter 12) working problems that show how to translate between E-R and relational data models. Table 12-1 is a compact summary of how to map E-R constructs into relational constructs. Your students should become competent with translating in either direction. Problems and Exercises 2, 3, 4, and 6 are suitable for in-class exercises, but you should create other examples. Work a few examples for your students, and then have your students work (either individually or in small teams) on several problems in class and then present their answers. Practice is the best teacher of both normalization and translating between relational and E-R models.

7. Emphasize to your students that most E-R models developed during analysis are incomplete since system inputs and outputs are not designed in detail until logical design. Use this explanation to motivate the need for view integration. Be sure to discuss the potential pitfalls (view integration problems) that make view integration more than a mechanical process. Again, use many examples; Problem and Exercise 4 is a fairly simple one that students can work on inside or outside of class. Ask your students why view integration problems arise; for example, sample problems include independent analysts or project teams of different subsystems with slightly different data semantics, the integration of multiple independently-developed applications coming together to create an enterprise data model, and imprecision or lack of naming standards by analysts.

8. You should review in class the Hoosier Burger example found in the “Logical Database Design For Hoosier Burger” section. This example illustrates how to deal with both translating an E-R model into 3NF relations as well as integrating specific system requirements into the data model during logical database design.

9. You should emphasize to your students that physical database design is technology dependent, as well as logical requirements dependent. To make physical data storage decisions, one must know what constraints and opportunities available technologies have. What file organizations do the available database management systems have? What data types do the operating system and other system software support? What are the physical characteristics of secondary memory devices and what overhead space does data management software require? These are questions that technical specialists can answer; systems analysts, with application knowledge, along with technical specialists armed with answers to these questions can together perform physical file and database design.

10. Discuss with your students the role of the CASE repository as a central source of information necessary to make physical design decisions and as a place where these decisions are stored. You can also point out the interactive nature of some logical and physical design issues, such as choosing field data types, and that a CASE repository helps to synchronize logical and physical design activities. For example, when prototyping computer displays or reports with many design tools, analysts may have to select the data type and length of fields before these fields are placed on the prototype. If this display or report generator is a module of the CASE tool, then it draws on the CASE repository for necessary information in building the user interface.

11. Emphasize the important role analysts play in designing data integrity controls. Analysts are essential because designers need an in-depth knowledge of the business area and application to choose default values, picture formats, and range constraints; to determine whether null values should be allowed; and if a null value is present, how to process the data.

12. Review the types of file organizations. Discuss the seven factors that analysts consider when selecting a file organization. Have students evaluate each file organization with respect to each of the seven factors. Finally, review Table 12-3 as a way summarize the discussion.

13. Often when students prepare system development projects for the systems analysis and design course, they do not include adequate file and database controls into the system. Call your students’ attention to the section “Designing Controls for Files” and emphasize that controls are essential elements of a system and must be designed into the system, not added as an afterthought.

Answers to Key Terms

Suggested answers are provided below. These answers are presented top-down, left to right.

21. Relation / 3. Default value
28. Well-structured relation / 14. Null value
13. Normalization / 16. Physical table
8. Functional dependency / 4. Denormalization
23. Second normal form (2NF) / 15. Physical file
27. Third normal form (3NF) / 6. File organization
7. Foreign key / 17. Pointer
20. Referential integrity / 25. Sequential file organization
19. Recursive foreign key / 12. Indexed file organization
26. Synonyms / 11. Index
10. Homonym / 24. Secondary key
5. Field / 9. Sequential file organization
2. Data type / 18. Primary key
1. Calculated (or computed or derived) field / 22. Relational database model

Answers to Review Questions

1. The purpose of normalization is to rid relations of anomalies. The goal is to form well-structured relations that are simple and stable when data values change or data are added or deleted.

2. The five properties of relations are entries in columns are simple, entries in columns are from the same set of values, each row is unique, the sequence of columns is insignificant, and the sequence of rows is insignificant.

3. Synonyms, homonyms, transitive dependencies, and class/subclass relationships can arise during view integration. Synonyms occur when two or more different names are used for the same attribute from different user views. Homonyms occur when two or more attributes from different user views have the same name. Transitive dependencies are functional dependencies between non-key attributes that arise when functionally dependent non-keys come from different user views. Class/subclass relationships refer to relations representing the same entity from different user views actually represent different subsets of the same entity type.

4. Relationships between entities are represented in several ways in the relational data model. A binary 1:M relationship is represented by placing a foreign key (the primary key of the entity on the one side of the relationship) in the relation for the entity on the many side of the relationship. In a binary 1:1 relationship, a foreign key is placed in the relation on either side of the relationship or on both sides. For a binary and higher degree M:N relationship, a relation is created with a primary key, which is the concatenation of the primary keys from the related entities. In a unary relationship, a recursive foreign key is added to the relation.

5. The fundamental rule of normalization is that each non-key attribute must be fully functionally dependent on the whole primary key attribute (a non-key is dependent on the whole key and nothing but the key). Thus, there can be no functional dependencies between non-keys.