The Institute of Computer Science

The Polish Academy of Sciences

Andrzej Jodłowski

Dynamic Object Roles

in Conceptual Modeling and Databases

Ph. D. Thesis

Advisor:

Doc. Dr. Hab. Kazimierz Subieta

Warsaw, November 2002

Table of Contents

1. Introduction

2. Choosing Notions in Conceptual Modeling

2.1. Conceptual Modeling

2.2. Notions of Object-Oriented Conceptual Modeling

3. Inconveniences of Multiple Inheritance

3.1. The Concepts of Class and Inheritance

3.2. The Concept of Multiple Inheritance

3.3. Multiple Classification

3.4. Types and Substitutability

3.4.1. The Concept of Type

3.4.2. Subtypes and Substitutability

3.5. Well-Known Programming Languages and Multiple Inheritance

3.5.1. Smalltalk

3.5.2. Modula-3

3.5.3. Eiffel

3.5.4. C++

3.5.5. O2

3.5.6. Java

3.5.7. General Observations

3.6. Multiple Inheritance in the Object-Oriented Methodologies and Notations

3.7. Multiple-Inheritance Problems

3.7.1. Semantical Context and Multiple Inheritance

3.7.2. Human Factor

3.7.3. The Name Conflict

3.7.4. Typological Problems

3.7.5. The Multiple Inheritance and Object Migration

4. Well-Known Approaches to Dynamic Object Roles

4.1. The Concept of Roles by Bachman and Daya

4.2. Aspects

4.3. Fibonacci Approach

4.4. The Concept of Roles by Kristensen

4.5. Prototypes

4.6. Subtables in SQL99

4.7. Roles Realized by Design Patterns

5. Dynamic Object Roles in Stack-Based Approach

5.1. The Concept of Dynamic Object Role

5.2. Object Store Model with Dynamic Roles

5.2.1. Links among Objects and Roles

5.2.2. Dynamic Roles - a Formal Model of an Object Store

5.3. Dynamic Roles vs. Classical Object-Oriented Models

5.3.1. Multiple Inheritance

5.3.2. Repeating Inheritance

5.3.3. Multiple-Aspect Inheritance

5.3.4. Temporal and Historical Properties

5.3.5. Variants (Unions)

5.3.6. Object Migration

5.3.7. Referential Consistency

5.3.8. Overriding

5.3.9. Binding

5.3.10. Typing

5.3.11. Subtyping

5.3.12. Substitutability

5.3.13. Dynamic Inheritance

5.3.14. Aspects of Objects and Heterogeneous Collections

5.3.15. Aspect-Oriented Programming and Separation of Concerns

5.3.16. Meta-Data Support

5.4. Specification of Dynamic Roles in Database Schemata

5.4.1. Concepts for Building Database Schemata

5.4.2. Naming Issues

5.4.3. A Sample Construction of an Object Schema with Dynamic Roles

5.4.4. Declarations of Data Structures

5.4.5. Metadata Management

5.5. Query Language for the Object Model with Dynamic Roles

5.5.1. The Environment Stack

5.5.2. Opening a New Scope on the Environment Stack

5.5.3. Thin and Thick Sections

5.5.4. Private, Protected, and Public Properties

5.5.5. Binding

5.5.6. Polymorphism and Overriding

5.5.7. Creating and Deleting Roles

5.5.7.1. Create Operator

5.5.7.2. Delete Operator

5.5.8. Role-Specific Operators

5.5.8.1. Casting Operator

5.5.8.2. Hasrole Operator

5.5.8.3. Roles Operator

5.5.9. Query Optimization

6. Extending UML with Dynamic Object Roles

6.1. Dynamic Classification vs. Dynamic Object Roles

6.2. Composition vs. Dynamic Object Roles

6.3. RoleOf Relationship

6.4. Combining Class and Role Hierarchies

6.5. Multiple-Aspect Inheritance

6.6. Classical Inheritance vs. RoleOf Relationship

6.7. Notation for RoleOf Instances

7. Implementation

8. Conclusions

Appendix A. The Prototype - General Description

A.1. The Physical Level

A.2. The Logical Level

A.2.1. The Definition of an Object

A.2.2. The Definition of Atomic Value

A.2.3. Creating Objects

A.2.4. Information Retrieval Functions

A.2.5. Updating Functions

A.3. The Conceptual Level

A.3.1. Class

A.3.1.1. Attribute

A.3.1.2. Method

A.3.1.3. Relationship

A.3.2. Role Class

A.3.3. Class Instance

A.3.4. Role Class Instance

A.3.5. Other Functions

A.4. Metabase, ODL, SBQL with Dynamic Roles

A.4.1. The Grammar of ODL

A.4.2. ODL Parser

A.4.3. SBQL Parser

A.4.4. The Grammar of SBQL

A.5. Query Evaluation Module

A.5.1. The Organization of Environment Stack

A.5.2. The Organization of Query Result Stack

A.5.3. Query Evaluation

A.6. Types and Reserved Names

Appendix B. The Stack-Based Approach

B.1. Introduction

B.2. Objects, Classes and Abstract Object-Oriented Store Model

B.3. Example Database

B.4. Stacks

B.5. Binding

B.6. Query Language

B.6.1. SBQL Syntax

B.6.2. Results of SBQL Queries

B.6.3. SBQL Semantics

B.6.3.1. Algebraic Operators

B.6.3.2. Non-Algebraic Operators

B.7. An Example of Query Evaluation

Appendix C. List of Figures

Bibliography

1. Introduction

For several years, dynamic object roles have had the reputation of a notion on the brink of acceptance. There are many articles advocating the concept [ABGO93, BD77, BG95, Fowler97, GSR96, Kristensen95, KØ96, KRR00, LL98, LW99, Papazoglou91, Pernici90, RS91, WJ89, WL95, Wong99], but many researchers do not consider its applications sufficiently broad to justify the extra complexity of conceptual modeling facilities. Furthermore, the concept is neglected on the implementation side. As far as we know, no popular object-oriented programming language or database system supports it explicitly. Some authors assume a tradeoff, where the role concept is the subject of special design patterns [BRSW00, Fowler97, GHJV95, RG98], applied both on the conceptual modeling and the implementation sides. The notion has already been adopted by the SQL:1999 standard [ANSI99], although under a different name, with specific semantics, and with some limitations.

Fig. 1.1 Roles played by a person

The idea of dynamic object roles assumes that a real or abstract entity can acquire and get rid of some roles during its lifetime without changing its own identity. The roles appear during the life of a given object; they can exist simultaneously or disappear at any moment. For instance, a person can be a reviewer, an author, a speaker, or a participant of a conference at the same time, as depicted in Fig. 1.1. Similarly, a building can be an office, a house, a magazine headquarters, and so on.
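The idea can be illustrated with a minimal sketch in Python. It is not the object model developed in this dissertation; the names Entity, Role, acquire, drop and has_role, as well as the sample data, are hypothetical and serve only to show an entity gaining and losing roles while its identity stays the same.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare role instances by identity, not by content
class Role:
    kind: str                                       # hypothetical role name
    properties: dict = field(default_factory=dict)  # role-specific data


@dataclass(eq=False)  # the entity keeps one identity for its whole lifetime
class Entity:
    name: str
    roles: list = field(default_factory=list)

    def acquire(self, kind, **properties):
        """Attach a new role to the entity and return the role instance."""
        role = Role(kind, properties)
        self.roles.append(role)
        return role

    def drop(self, role):
        """Detach a role; the entity itself is left untouched."""
        self.roles.remove(role)

    def has_role(self, kind):
        return any(r.kind == kind for r in self.roles)


person = Entity("Smith")
identity_before = id(person)

# Several roles are held simultaneously.
reviewer = person.acquire("Reviewer", conference="Conference A")
person.acquire("Author", paper="Paper 1")
person.acquire("Speaker")

# A role disappears at some moment; the others remain.
person.drop(reviewer)
identity_after = id(person)
```

In this sketch the roles are separate objects attached to the entity, so acquiring or dropping one never requires creating a new Entity or changing its class.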

When developing a class model in UML and then implementing it in a particular programming language, one can face three main difficulties:

(i) Some objects have many specializations at the same time, which leads to multiple or multiple-aspect inheritance;

(ii) Some objects have many specializations of the same type (repeating inheritance). For instance, a person can be a member of many clubs at the same time;

(iii) Some objects have specializations that depend on time. For instance, a person was a student a year ago, but he or she can be an employee of a company currently. Furthermore, a person can be an employee several times, at different times and in different companies.

Similar problems, mostly related to recording historical information, occur with other entities such as institutions, companies and documents. One should conclude that the classical inheritance concept, as presented in UML, for instance, is not fully adequate for data environments dependent on historical data.
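Difficulties (ii) and (iii) can be made concrete with a short Python sketch in which specializations are kept as separate, time-stamped role instances. This is an illustrative assumption rather than the model proposed in this dissertation; TimedRole, Person, roles_on and all sample data are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass(eq=False)
class TimedRole:
    kind: str                   # e.g. "Employee", "ClubMember"
    context: str                # which company or club (hypothetical field)
    start: date
    end: Optional[date] = None  # None means the role is still held

    def active_on(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day < self.end)


@dataclass(eq=False)
class Person:
    name: str
    roles: list = field(default_factory=list)

    def roles_on(self, day, kind=None):
        """Roles held on a given day, optionally filtered by kind."""
        return [r for r in self.roles
                if r.active_on(day) and (kind is None or r.kind == kind)]


p = Person("Smith")
p.roles += [
    TimedRole("Student", "University X", date(1995, 10, 1), date(2000, 7, 1)),
    # The same kind of role held several times, at different companies.
    TimedRole("Employee", "Company A", date(2000, 9, 1), date(2001, 12, 31)),
    TimedRole("Employee", "Company B", date(2002, 1, 15)),
    # Two roles of the same kind held at the same time.
    TimedRole("ClubMember", "Chess Club", date(1998, 1, 1)),
    TimedRole("ClubMember", "Sailing Club", date(1999, 6, 1)),
]

clubs_now = p.roles_on(date(2002, 6, 1), "ClubMember")
jobs_now = p.roles_on(date(2002, 6, 1), "Employee")
all_jobs = [r for r in p.roles if r.kind == "Employee"]
```

With a single inheritance hierarchy, a Person object could be an Employee at most once; here the history is simply a list of role instances, and a query over dates recovers the state at any moment.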

Another disadvantage of the design is complexity, chiefly after mapping it to a relational DBMS. Very complex relational structures imply very complex SQL queries. We have concluded that such cases are poorly dealt with in UML and cause difficulties during implementation. The only radical cure is to introduce dynamic object roles both at the level of UML class diagrams and at the level of data structures implemented in object-oriented or object-relational DBMS.

In this dissertation, we show that dynamic object roles are useful both for conceptual modeling and for implementation. We argue that the concept could considerably enhance modeling tools such as UML [UML01] and could be an important paradigm for object databases built in the spirit of ODMG [ODMG00].

Our idea to deal with dynamic roles in a query language [JHPS02] is based on the stack-based approach (SBA) [PK00, Płodzień00, SKL95]. A version [SMSRW93] was implemented in the prototype system Loqis [Subieta91]. The SBA includes a data definition language (DDL) and a query language, SBQL (Stack-Based Query Language), which are similar to ODMG ODL and OQL respectively and have clean and precise semantics. SBA constitutes a uniform conceptual platform for integrated query and programming languages. One of its central concepts is the naming, scoping and binding principle, which enables us to deal with naming issues effectively.

In our opinion, SBA is a universal, simple and powerful object data model, and is the only formalism able to accommodate the concept of dynamic object roles naturally. We introduce an extended SBA object model that defines roles as composite objects with a special structure, semantics and generic operations. We describe the structure formally and present a sketch of a data definition language and a query/programming language that support generic operations to define and process such structures.

The rest of the dissertation is organized as follows. Chapter 2 presents a short discussion of notions in conceptual modeling. Chapter 3 analyzes the inconveniences of multiple inheritance. Chapter 4 presents the state of the art concerning dynamic object roles. Chapters 5 and 6 are the main parts of the thesis. Chapter 5 introduces our object model with roles and discusses how it differs from the traditional object models introduced in programming languages and database management systems. Chapter 6 describes changes to the UML notation that allow dynamic roles to be defined in UML class diagrams. Chapter 7 briefly presents the implementation of the prototype. Chapter 8 presents our conclusions and future research plans.

The objectives of this dissertation are:

  • The formal introduction of an extended SBA object model with roles.
  • The development of a prototype of a data definition language that makes it possible to define a class schema with roles, and the development of a prototype of a query/programming language supporting generic operations that act on objects with dynamic roles.
  • The introduction of an extension of the UML class diagram that encompasses the modeling of dynamic roles.

2. Choosing Notions in Conceptual Modeling

Because of different viewpoints on modeling real-world phenomena and on the ways computer systems carry out this process, it is necessary to introduce and use various notions and languages. The object-oriented paradigm strives to unify and limit this variety. However, complete unification cannot be achieved, given the different natures of the descriptions needed and the various requirements imposed by the recipients who use a computer system. People involved in constructing information systems wrestle with many problems, especially: the choice of the level of generality of the world to be described and of its model, decisions about the meaning and place of the conceptual schema within the global data schema, and difficulties related to the great complexity of the conceptual modeling process. These problems are briefly presented and discussed in this chapter.

2.1. Conceptual Modeling

Attempts at representing fragments of the real world have revealed the necessity of introducing several basic notions. These concepts most often characterize a descriptive kind of knowledge, so they are related to certain sets of states of the real world as well as to state changes, which describe dynamic phenomena. Conceptual modeling makes it possible to build mental images with the help of given notions in order to ensure a good comprehension of the essential properties of the characterized beings.

The image of programmers who arduously code given algorithms in a certain programming language is a rather common stereotype of software work. However, this stereotype illustrates the essence of the creative process rather poorly. It does not encompass the whole complexity of the images and mental processes that occur in the mind of an analyst, designer or programmer before and during programming. Before starting to code, a designer or programmer should understand the problem and the method of its solution very precisely. According to common opinion, it is impossible to develop software correctly if the plan of “what to do and how to do it” is incomplete or imprecise. It follows that the fundamental processes of software development occur in the human mind and need not be connected with any programming language. The notions of conceptual modeling and conceptual model relate to all the informal mental processes that accompany work on software.

Conceptual modeling takes place at different phases of the system life cycle. It is supported by appropriate semiformal means that reinforce human memory and imagination. As a rule, those means are based on graphical representations of mental images related to the reality characterized by the data, or related to the data structures and processes needed for developing an information system. Tools such as class diagrams, functional diagrams, and use case diagrams make communication within and among project teams possible. The same means also allow communication between a designer or analyst and a customer. They also make it possible to document the results of the analysis, design and implementation phases.

In any information project, it is possible to distinguish three general forms of a conceptual model [Subieta98]:

  • A mental model of the real world, which defines the subject of the information system. This model, representing domain knowledge, is seldom formalized. It should include all the knowledge about a given institution, organization, or company, and about the scope and aspects of its activities, that is necessary to comprehend the functions fulfilled by the information system;
  • An abstract conceptual model, defined by means of a proper formal or semiformal notation (e.g. class diagrams). It usually represents only some part of the domain knowledge;
  • A conceptual model of the data structures forming the basis of the information system (e.g. a relational schema written in SQL).

Each of these models plays a certain role in software development. A mental model of the real world is necessary to understand what the data are for and what significance their processing has. The aspects of domain knowledge that are important for the information system under development are described by an abstract conceptual model. Finally, during programming, a schema of the data structures is necessary in order to understand correctly their structure, organization, manner of processing, etc.

The following consistencies relate to the conceptual modeling of a given software project:

  • Consistency between a mental model of the real world and an abstract conceptual model of this world described with the aid of a given formal language;
  • Consistency between a conceptual model and the schema of data structures, expressed by the type (or class) system of a programming language and/or by the schema description language (data description language) supported by a given database management system.

The consistencies mentioned above mean that the differences between thinking about the real world, comprehension of the abstract conceptual model, and understanding of the stored data structures should be as small as possible. Minimization of these differences is crucial for many phases of the software development life cycle and is conducive to high quality and modifiability of the software.

The aspiration to achieve these consistencies is the driving force behind the development of semantic data models and the introduction of some elements of these models (including object-oriented ones) as programming language and DBMS features. Since neither a mental model of the real world nor its abstract conceptual model includes elements related to the computer environment, the aspiration to obtain these consistencies leads to a higher level of abstraction and to greater independence between data and programs. As a result, designers, programmers, and users are gradually relieved of the need to take some elements of the system environment into consideration.

On the other hand, one can consider the tendency mentioned above as a factor slowing down the development and application of conceptual modeling tools. Given the constraints on data structures and other features of a particular programming language and DBMS, the many differences between conceptual models and data structure schemata cause more difficulties in understanding data semantics as well as the goals and manner of data processing. Thus, one may question the point of introducing into a conceptual schema those tools that cannot be directly mapped onto properties of data structures in a given implementation platform. The tendency to enrich conceptual modeling tools with ever more concepts encounters a barely perceptible barrier related to the excessive complexity of representation and to the differences between the conceptual schema and the implemented data structures.

It is worth noting that the relational model is not capable of representing some features of the conceptual model in a direct way (e.g. inheritance hierarchies, multi-valued or compound attributes, and many-to-many relationships). For many projects, this can in effect destroy the direct connection between the abstract conceptual model and the data structures. The resulting greater freedom and arbitrariness of the choices made by designers and programmers during system implementation, together with the greater difficulty of understanding the developed data structures, harm software quality.
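The indirect mapping can be seen in a small, self-contained Python sketch using the standard sqlite3 module. The schema and data are hypothetical and chosen only for illustration: a many-to-many relationship between persons and clubs needs an extra junction table, and a multi-valued attribute (phone numbers) needs a separate table, so even a simple conceptual question becomes a multi-way join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE club   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- A many-to-many relationship has no direct counterpart:
    -- it must be encoded as a junction table.
    CREATE TABLE membership (
        person_id INTEGER NOT NULL REFERENCES person(id),
        club_id   INTEGER NOT NULL REFERENCES club(id),
        PRIMARY KEY (person_id, club_id)
    );

    -- A multi-valued attribute must be moved to its own table.
    CREATE TABLE person_phone (
        person_id INTEGER NOT NULL REFERENCES person(id),
        phone     TEXT NOT NULL
    );

    INSERT INTO person VALUES (1, 'Smith');
    INSERT INTO club VALUES (1, 'Chess Club'), (2, 'Sailing Club');
    INSERT INTO membership VALUES (1, 1), (1, 2);
    INSERT INTO person_phone VALUES (1, '555-0100'), (1, '555-0101');
""")

# "Which clubs does Smith belong to?" requires joining three tables.
rows = conn.execute("""
    SELECT c.name
    FROM person p
    JOIN membership m ON m.person_id = p.id
    JOIN club c       ON c.id = m.club_id
    WHERE p.name = ?
    ORDER BY c.name
""", ("Smith",)).fetchall()
clubs = [row[0] for row in rows]

phones = [row[0] for row in conn.execute(
    "SELECT phone FROM person_phone WHERE person_id = 1 ORDER BY phone")]
```

Neither the membership table nor the person_phone table corresponds to an entity of the conceptual model; both exist only because of the mapping.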

Increasing consistency between the abstract model of system domain knowledge and its concrete realization in the form of implemented data structures and programs is a fundamental goal of object-orientedness. It has important consequences for many aspects of developing information systems, including the speed and cost of their construction as well as quality, reliability, openness, modifiability, modularity, portability and reusability.

2.2. Notions of Object-Oriented Conceptual Modeling

The growing complexity of information system applications has resulted in new demands on the conceptual modeling of business domains, databases and application programs. Conceptual modeling is supported by various formal or semiformal notations such as class, functional, use case, dynamic and other diagrams. Mapping a piece of reality into a conceptual model requires notions that follow natural ways of human thinking and understanding and that, on the implementation level, can be easily mapped into data and programming abstractions.

Object-oriented conceptual modeling notions are supported in programming languages by proper data structures and behavioral properties defined on the algorithmic level of semantic precision. A long-term tendency in the development of programming languages and database management systems is that (usually semi-formal) conceptual modeling notions after some time are becoming data and programming abstractions. Object databases are an illustration of the thesis. Considering the entity-relationship model as a tool for conceptual modeling of relational databases, we observe that many of its notions (such as entities/classes, relationships, inheritance, and others) are further materialized as constructs of database structures, query languages and programming languages.

Despite a large collection of various conceptual modeling facilities, it is still difficult to model directly and precisely some typical situations in the business reality. An example is the concept of multiple inheritance, which supports conceptual modeling but leads to various semantic anomalies. Inaccurate modeling causes communication difficulties between project members, increases the probability of errors, causes additional consumption of resources during system construction, and has a negative impact on the code length, documentation, transparency, maintainability and reliability of software. Thus conceptual modeling facilities should contain all the notions necessary to allow the analyst and designer to express their design vision as precisely as possible.
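One such anomaly, the name conflict, can be reproduced in a few lines of Python. The classes below are hypothetical; the sketch only shows how a language with multiple inheritance may resolve two same-named properties silently, by the order of the base classes.

```python
class Student:
    def id_number(self):
        return "student-id"


class Employee:
    def id_number(self):
        return "employee-id"


# Both superclasses define id_number; Python's method resolution order
# picks the one from the first listed base class without any warning.
class WorkingStudent(Student, Employee):
    pass


ws = WorkingStudent()
chosen = ws.id_number()  # the Employee variant is shadowed
resolution_order = [cls.__name__ for cls in WorkingStudent.__mro__]
```

Swapping the base classes in the declaration changes which identifier is returned, although the conceptual model says nothing about such an ordering.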

On the other hand, excessive extension of conceptual modeling notions may cause difficulties concerning their learning and proper use by project members. Thus there is an opposite tendency to minimize the number of notions and express new notions in terms of known notions. For instance, some methodologies do not deal with aggregation considering it a special case of association; some others do not involve inheritance assuming it can be expressed otherwise, etc. Another disadvantage of a large number of conceptual modeling notions is their inherent semi-formal semantics (they enhance humans’ thinking rather than computer operation), which could cause difficulties in recognizing proper usage of the notions and semantics of their particular combinations.