Computing and Information Systems © University of Paisley 2005

Beyond Semantics: Verifying Information Content Containment of Conceptual Data Schemata by Using Channel Theory

Yang Wang and Junkang Feng


There is an ever-growing need to consider semantics in various research fields. Although numerous methods and solutions have been proposed and used along this line, some fundamental issues still remain to be addressed. The one that we are particularly interested in is how something that has semantics could have an impact on its receiver. It would seem that the impact results from the capability of yielding knowledge. And by Dretske’s notion of ‘to know’, an essential component of this capability is information provision. Semantics has a role to play only if it contributes to the provision of information.

Our approach to addressing this issue is therefore to use a number of theories that address the problem of information flow. In this paper we focus on the problem of the ‘information content’ of information-bearing objects or events, such as a piece of data that is constructed in some particular way, which we call a ‘data construct’.

To verify systematically whether the information content of a given data construct contains a given piece of information is not easy, particularly when the information goes beyond the literal meaning of the data. In this paper, we describe how a sophisticated theory about information flow, namely the Channel Theory, can be used for such a task. The main point we make in the paper is that this task can be accomplished through identifying nested information channels that cover both some intra-properties and inter-properties of a schema.

Our work so far seems to show that this is a promising avenue. Such an approach introduces mathematical rigor into the study, and results in a sound means for the verification. This in turn provides the practice of conceptual modeling and validation with desirable and reliable guidance.

1. INTRODUCTION

Semantics has become an increasingly significant factor in various information system (IS) research fields such as system integration, knowledge representation and management, and semantic web services. In order to develop sound semantics-based approaches and methods, the need for solid theoretical bases cannot be overlooked. Our interest resides in how the semantics of something could have an impact on its receiver. It would seem that the impact results from the capability of yielding knowledge. And by Dretske’s notion of ‘to know’ (Dretske 1981, p.92), an essential component of this capability is information provision.

Therefore we have been exploring the relevance of semantic information and information flow theories, such as Dretske’s work (1981), Devlin’s work (1991) and the Information Flow theory (also called Information Channel Theory, CT for short) put forward by Barwise and Seligman (1997), to information systems. In this paper, we present our thinking on what we call ‘information content containment’, i.e., how an information-bearing object, such as a data construct, conveys information that contains a given piece of information. Such a concept is put forward as an essential part of the notion of ‘Information Bearing Capability’ (IBC) by Feng and his colleagues (Feng 1999; Xu and Feng, 2002; Hu and Feng, 2002). This notion and many associated ideas are a result of observations through years of experience. Four conditions have been identified that enable a data construct to represent a particular piece of information (together called ‘the IBC Principle’ for convenience, and given shortly), one of which is called the ‘information content containment’ condition. To verify whether these conditions hold for a particular system is not straightforward, as it is not simply a matter of checking the literal meaning of the data. To this end, we have been making considerable effort and have achieved some preliminary yet encouraging results. Among what has been achieved, it was found that employing a sophisticated theory of information flow within a notional distributed system, namely the Channel Theory (CT), introduces much needed rigor into this work and can serve as a sound means for verifying whether the ‘Information Content Containment’ condition is satisfied over any given pair of elements, one of which is a data construct and the other a piece of information; that is, whether the ‘information content’ of the data contains the information.
In this paper we describe our approach by using examples in conceptual data modeling, and avoid any lengthy description of the CT per se.

The rest of the paper is organized as follows. In the next section, the IBC Principle is introduced. After that, the reasons why we choose the Channel Theory as our intellectual tool for the job in hand are described. In Section 5, we present details on how we verify the ‘Information Content Containment’ condition by using the method of CT summarised in Section 4. In Section 6, some related works that are concerned with ‘information content preserving’ in schema transformation and with contributions of CT to semantic interoperability and ontology mapping are briefly summarised. Finally, we give conclusions and indicate directions for further work.

2. THE ‘IBC’ PRINCIPLE AND ITS RELEVANCE

In practice, due to the lack of understanding of the difference and link between information and data, problems can occur. For example, it is difficult to identify redundant or conflicting information requirements; the transformation from human level models to machine level design and implementation may not be information content preserving; a query to a system may receive unsound and/or incomplete answers. An underlying reason for this could be that the system’s capability of bearing information is over-estimated or misinterpreted. Some such problems were recognized as ‘connection traps’ by Codd (1970) and Howe (1989), and discussed in detail in Feng and Crowe’s work (1999). We envisage that the ideas of ‘Information Bearing Capability’ should help.

The notion of IBC and the work around it were developed over a number of years, as shown in a series of works (Feng 1999; Xu and Feng, 2002; Hu and Feng, 2002). The version that we present here takes into account the three major parties that are involved in information flow, namely the information source (S), the information bearer (B), and the information receiver (R). We call this the SBR framework for simplicity and convenience. The IBC Principle (Feng, 2005) is now specified as follows:

This Principle is concerned with token level (in comparison with the ‘type level’) data (or media) constructs’ representing individual real world objects and individual relationships between some real world objects.

  • For a token level data construct (or a ‘media construct’ in general), say t, to be capable of representing an individual real world object or an individual relationship between some real world objects (or a ‘referent construct’ in general), say s, which is neither necessarily true nor necessarily false[1],

The information content of t when it is considered in isolation must include s, the simplest case of which is that the literal or conventional meaning of t is part of its information content, and the literal or conventional meaning of t is s;

And t must be distinguishable (identifiable) from the rest of the data constructs in a system, say Y, that manages data including t (or from the rest of the media constructs in a system Y that manages media constructs including t, in general).

The above two conditions were formulated from the viewpoint of the relationship between S and B under the SBR framework.

  • For a data construct (or a ‘media construct’ in general) t that is capable of representing an individual real world object or an individual relationship between some real world objects (or a ‘referent construct’ in general) s to actually provide information about s,

t must be accessible by the only means available to system Y;

In the case that t has neither literal nor conventional meaning and in the case that neither the literal nor the conventional meaning of t is s, the information receiver, i.e., the R in the SBR framework, must be provided with means by Y to infer s from t.

The above two conditions were formulated from the viewpoint of R’s obtaining information about S via B under the SBR framework.

This Principle is scalable to more (i.e., t could be an instance of an entire system among many related systems, for example) or less (e.g., an instance of a simple attribute of an entity in an ER schema) complex cases, and hence flexible in terms of applicability.

We call the first condition of the above four the ‘Information Content Containment’ condition. And the concept of ‘information content’ of a sign/message can be defined as follows (Dretske 1981, p.45):

‘A state of affairs contains information about X to just that extent to which a suitably placed observer could learn something about X by consulting it.’

In this paper, we focus on how we could verify whether a state of affairs contains information about X. And our approach is to make use of theories on information flow.

3. SOME THEORIES ON INFORMATION FLOW

3.1 Dretske’s Semantic Theory of Information

Dretske put forward a theory (Dretske, 1981) which not only captures, following Shannon (Shannon and Weaver, 1949), the quantitative aspect of communication of information, but also addresses the information content of an individual message. These form an account of how information can flow from its source to a cognitive agent, i.e., the receiver. Although his theory has been widely cited, we also note objections to it. Firstly, as Dretske’s theory is based upon probability, certain conditions must be maintained for a probability distribution to occur. In the ‘real world’, this might be a stringent requirement. Secondly, Dretske includes the ‘internal’ contribution of the prior knowledge k and the ‘external’ contribution of objective probabilities in the conceptualisation of information flow. However, Dretske has to accept that different ways of determining relevant possibilities and precisions give different probability measures and therefore different information flow and knowledge (Barwise and Seligman, 1997). Finally, Dretske goes beyond Shannon’s work (1949), which concerns solely the quantitative aspect of communication with many messages (Devlin, 2001), and tackles explicitly the content of information that an individual message bears. And yet it is difficult to identify the information content of an individual message.

3.2 Devlin’s ‘Infon’ and Situation Theory

To model information flow, the mechanism used by Devlin (1991) is made up of situation types and constraints, which connect situation types. This theory seems to emphasise the ‘soft’ aspect of information flow, i.e., what is going on in people’s minds. Moreover, it does not seem to address the issue of how a constraint gets established, which does not rule out the possibility of constraints being arrived at subjectively.

3.3 The Information Channel Theory

Taking into consideration the shortcomings of the above two theories on modelling information flow, we choose the Channel Theory (Barwise and Seligman, 1997) as a tool to verify whether a given data construct satisfies the ‘information content containment’ condition of the aforementioned IBC Principle. The reasons why we think that this is appropriate are summarised below.

In general, saying that ‘B bears information about S’ is the same as saying ‘there is information flow from S to B’. And the latter is in the language of CT, which is a systematic approach to mathematically modelling and analysing information flow.

With CT, information flow is possible only because what is involved in information flow can be seen as components of a distributed system. That is to say, the notion of ‘distributed system’ provides us with a way to model and formulate, if possible, an ‘information channel’ within which information flows.

Information flow requires the existence of certain connections (which may be abstract or concrete) among different parts that are involved in information flow. Following Dretske (1981, p.65), such a ‘connection’ is primarily made possible by the notion of ‘conditional probability’ where the condition is the Bearer. However, this is perhaps only a particular way for ‘connections’ to be established. It would be desirable to see in general how a connection becomes possible. The notion of ‘information channel’ in CT helps here. In CT terms, information flow is captured by ‘local logics’ (which are roughly conditionals between ‘types’) within a distributed system. Most frequently, information connections lie with ‘partial alignments’ (Kalfoglou and Schorlemmer, 2003) between system components. CT is capable of formulating such alignments by using the concepts of ‘infomorphisms’, ‘state space’, and ‘event classification’. The property of ‘state space’ enables every instance and state relevant to the problem to be captured. Consequently, the informational relationship between components connected through the whole system can be modelled as the inverse image of projections between their corresponding state spaces. Such inverse images are also known as ‘infomorphisms’ between event classifications. This is exactly how the ‘infomorphisms’ for the ‘core’ (which is a classification for the whole distributed system) of an information channel can be found. We ask readers who are not familiar with CT to bear with us here, as we will use an example to show the basics of CT shortly.
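For readers new to CT, the two notions that carry most of the weight here, ‘classification’ and ‘infomorphism’, can be made concrete in a few lines. What follows is a minimal Python sketch and not part of the method itself. It assumes the standard definition from Barwise and Seligman (1997): an infomorphism from a component classification A to a core C is a pair of maps, one on types (A to C) and one on tokens (C to A), such that a core token c is of type f(α) exactly when its image token in A is of type α. The class name, function name and the sample tokens and types are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """A classification: tokens, types, and a 'token is of type' relation."""
    tokens: frozenset
    types: frozenset
    holds: frozenset  # set of (token, type) pairs

    def satisfies(self, token, typ):
        return (token, typ) in self.holds


def is_infomorphism(a, c, type_map, token_map):
    """Check the infomorphism condition for f: A <-> C.

    type_map sends types of A to types of C; token_map sends tokens of C
    to tokens of A. The pair is an infomorphism when, for every token of C
    and every type of A, the two classification relations agree.
    """
    return all(
        a.satisfies(token_map[tok], typ) == c.satisfies(tok, type_map[typ])
        for tok in c.tokens
        for typ in a.types
    )


# Hypothetical component (a tiny schema) and hypothetical core.
schema = Classification(
    tokens=frozenset({"row1", "row2"}),
    types=frozenset({"has_qualification"}),
    holds=frozenset({("row1", "has_qualification")}),
)
core = Classification(
    tokens=frozenset({"link1", "link2"}),
    types=frozenset({"core_has_qualification"}),
    holds=frozenset({("link1", "core_has_qualification")}),
)
type_map = {"has_qualification": "core_has_qualification"}
good_token_map = {"link1": "row1", "link2": "row2"}
bad_token_map = {"link1": "row2", "link2": "row1"}

print(is_infomorphism(schema, core, type_map, good_token_map))
print(is_infomorphism(schema, core, type_map, bad_token_map))
```

The second token map swaps the two rows, so the relation in the component no longer mirrors the relation in the core, and the check fails. This is the sense in which an infomorphism ‘aligns’ a component with the whole system.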

4. A METHOD OF USING CT TO VERIFY INFORMATION CONTAINMENT

We have developed a method for the task at hand, which consists of a series of steps below. The reader is referred to the appendix for those numbered terms:

  • Identify component classifications [1] relevant to the task at hand and then use them to construct a distributed system;
  • Validate the existence of infomorphisms [2] between the classifications and construct an information channel [3] relevant to the task at hand;
  • Construct the core of the channel by identifying those parts of the component classifications (including normal tokens [4]) that contribute to the desired information flow;
  • Find the local logic [5] on the core, i.e., the entailment relationships between types of the classification for the ‘core’ that are directly relevant to the information flow, i.e., the information content containment at hand;
  • Arrive at the desired system level theory [6] by applying the f-Elim rule [7] on the local logic on the core.

We describe this method in detail by means of examples in the next Section.
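The last two steps of the method can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation. It assumes the usual reading of a constraint Γ ⊢ Δ (every token that is of all types in Γ is of at least one type in Δ) and of the f-Elim rule (a constraint over component types is obtained when its image under the type map is a constraint of the local logic on the core). The function names, the token names and the type names are hypothetical.

```python
def sequent_holds(holds, tokens, gamma, delta):
    """Gamma |- Delta holds on `tokens` if every token that is of all types
    in gamma is of at least one type in delta."""
    return all(
        any((tok, d) in holds for d in delta)
        for tok in tokens
        if all((tok, g) in holds for g in gamma)
    )


def f_elim(type_map, core_holds, normal_tokens, gamma, delta):
    """Sketch of f-Elim: accept Gamma |- Delta over component types when the
    translated sequent holds on the normal tokens of the core."""
    image_gamma = {type_map[g] for g in gamma}
    image_delta = {type_map[d] for d in delta}
    return sequent_holds(core_holds, normal_tokens, image_gamma, image_delta)


# Hypothetical core: three connections classified by two core types.
core_holds = {
    ("c1", "core_t"), ("c1", "core_s"),  # schema path present, relation holds
    ("c2", "core_t"),                    # schema path present, relation absent
}
all_tokens = {"c1", "c2", "c3"}
normal_tokens = {"c1", "c3"}  # c2 violates the designer's rule

# Hypothetical type map from component types (schema side t, real world side s).
type_map = {"t_path": "core_t", "s_awarded": "core_s"}

print(f_elim(type_map, core_holds, normal_tokens, {"t_path"}, {"s_awarded"}))
print(f_elim(type_map, core_holds, all_tokens, {"t_path"}, {"s_awarded"}))
```

On the normal tokens the constraint from the schema type to the real world type is obtained; once the abnormal connection c2 is included it is not. The derived constraint is therefore only as reliable as the choice of normal tokens in the third step.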

5. VERIFYING INFORMATION CONTAINMENT

We believe that verifying whether the information content of a schema contains something about a specific ‘real world’ application domain takes two steps, which can be roughly seen as corresponding to the syntactic level and the semantic level of (Organisational) Semiotics (Stamper, 1997; Liu, 2000; Anderson, 1990). We use Figure 1 to show our points.

Figure 1 Topological connection t

5.1 Semantic Level

On this level, the problem being addressed is ‘meaning, propositions, validity, signification, denotations,…’ (Stamper, 1997). Accordingly, what should be looked at is how a topological connection, which is a connection between entities made possible by an Entity-Relationship (ER) schema, is able to represent a real world relationship. That is to say, this level is concerned with the relationship between a conceptual data schema and the ‘real world’ that the schema models. Specifically, we want to see whether and how information flows from the ‘real world’ to the schema; in other words, whether the information content of the schema includes something about a specific ‘real world’ application domain.

In the world of CT, information flow is possible only within a distributed system, i.e., a system made up of distinct components, the behaviour of which is governed by certain regularities. Within such a system, ‘what’ flows is captured by the notion of ‘local logics [5]’ and ‘why’ information can flow is explained by the concept of ‘information channel [3]’. Therefore, it is essential to construct relevant and appropriate ‘distributed systems’. Considering the condition of ‘information content containment’ of the IBC Principle under the SBR framework, the information source is the ‘individual real world object or individual relationship between some real world objects (or a ‘referent construct’ in general) s’, and the information bearer is the ‘topological relation t’.

On this level then, to make sure that a conceptual data schema is capable of representing some particular real world individuals involves justifiably constructing a distributed system that supports information flow between these two different types of things. The ‘connections’ between them, which are instances of the distributed system, are crucial.

We view the process of constructing a conceptual schema for representing some particular real world as a ‘notional system’.[2] It is interesting to note that the notion of ‘distributed system’ in CT is similar to the concept of ‘notional system’ in Soft Systems Methodology (SSM) (Checkland, 1981). In addition, we sometimes use the term ‘semantic relations’ to refer to something in the ‘real world’, in contrast to ‘topological connections’, which are elements within a data schema and can be seen as something on the syntactic level. Semantic relations are not unlike the notions of ‘objects in the “reality”’ and ‘social conventions’ in semiotics (Liu, 2000).

In such a distributed system (called a ‘representation system [8]’ in particular), the semantic relations and the conceptual schema can be seen as components. Using the notions of CT, the conceptual model is our source classification (not to be confused with the ‘information source’ in the aforementioned SBR framework), and the semantic relation is the target classification. The tokens of the schema consist of all the instances of data constructs, i.e., the topological connections between data values. The types of this classification are different data construct types. An entity, a relationship or a path in a conceptual model can all be different ‘types’.

As the process of ‘constructing a schema’ is modelled as a distributed system, the classification that serves as the core of our information channel is made up of connections, which are causal links between our conceptual model (i.e., the schema) and what it models, and types, which are ways of classifying these links. The logic on this classification captures the reasoning behind the schema construction. That is, the constraints that make up the logic on the ‘core’ of the system represent the rules employed by the database designer about how to model real world objects in the conceptual model. The normal tokens of the logic are those links, among all possible tokens of the system, that must satisfy the constraints.
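The relationship between the designer's rules and the normal tokens can be illustrated with a short Python sketch. It is a simplification under our own assumptions: the core is represented as a mapping from each connection to the set of types it is classified under, a rule is a pair (Γ, Δ) of type sets, and a connection counts as normal when it violates none of the rules. All names and sample data are hypothetical.

```python
def token_satisfies(token_types, gamma, delta):
    """A token satisfies Gamma |- Delta unless it is of every type in gamma
    and of no type in delta."""
    return not gamma <= token_types or bool(delta & token_types)


def normal_tokens(classification, constraints):
    """Return the tokens that satisfy every constraint.

    classification: dict mapping each token to the set of its types.
    constraints: iterable of (gamma, delta) pairs of type sets.
    """
    return {
        tok
        for tok, types in classification.items()
        if all(token_satisfies(types, g, d) for g, d in constraints)
    }


# Hypothetical links between schema instances and real world relations.
links = {
    "link1": {"t_path", "s_awarded"},  # path in schema, award in real world
    "link2": {"t_path"},               # path in schema, no award: abnormal
    "link3": set(),                    # neither
}
# Hypothetical designer's rule: a path instance represents an award.
rules = [({"t_path"}, {"s_awarded"})]

print(sorted(normal_tokens(links, rules)))
```

In this toy setting link2 is excluded because it carries the schema type without the real world type, which is exactly the situation the designer's rule is meant to forbid.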

Now let us use the example in Figure 1 to illustrate our ideas. Assume that there is a semantic relation s, say ‘After having successfully passed all required modules, a student receives a new qualification (a degree or diploma)’. We show how the topological connection t (as shown in Figure 1) comes to represent the semantic relation s. The information channel and infomorphisms for justifying this are shown in Figure 2.