Part 2: Data Flow Modeling

ESE Module 3-3

Structured Analysis

All software, whether it's used to produce paychecks

or control a supersonic aircraft, is an information trans-

former. That is, all software accepts input, transforms it

in some way, and produces output as a result. In

essence, data flow into a system, get changed, and

ultimately flow out.

As we create the analysis model, our challenge is

to understand this transformation. How are data

changed as they flow through a system? What func-

tions are required to transform input into output?

These questions are answered during the second

analysis modeling activity: the creation of a data

flow model.

The data flow model kills two birds with one stone.

It indicates the flow of data and it also provides us

with a mechanism for functional decomposition.

Hence we address two important analysis principles:

(1) understanding the flow of data through a system,

and (2) partitioning (decomposing) function into pro-

gressively lower levels of abstraction. (See ESE Module

3-2 for details.)

Flow Modeling Notation

Data flow notation is surprisingly simple, and yet the

models that it produces can be quite sophisticated.

The basic data flow icons address only four things:

producers and consumers of data; processes that

transform data; the data flow itself; and places where

data are stored.

Armed with these icons, the software engineer

(analyst) can produce a picture of the software (and

its external environment) that establishes the basis for

architectural design.

Readings

The following excerpt has been adapted from

Software Engineering: A Practitioner's Approach and

discusses data flow notation. Extensions for real-time

software (not discussed in the video portion of this ESE

Module) are presented briefly.

Information is transformed as it flows through a computer-

based system. The system accepts input in a variety of

forms; applies hardware, software and human elements to

transform input into output; and produces output in a vari-

ety of forms. Input may be a control signal transmitted by

a transducer, a series of numbers typed by a human opera-

tor, a packet of information transmitted on a network link

or a voluminous data file retrieved from secondary storage.

The transform(s) may comprise a single logical comparison,

a complex numerical algorithm or the rule-inference

approach of an expert system. Output may light a single

LED or produce a 200-page report. In effect, we can create

a flow model for any computer-based system, regardless of

size and complexity.

Structured analysis is an information flow and content

modeling technique. A computer-based system is represented

as an information transform. The overall function of the

system is shown as a single information transform,

drawn as a circle or bubble. One

or more inputs, shown as labeled arrows, originate from

external entities, each represented as a box. The input drives the

transform to produce output information (also represented

as labeled arrows) that is passed to the external entities. It

should be noted that the model may be applied to the

entire system or to the software element only. The key is to

represent the information fed into and produced by the

transform.

Data Flow Diagrams

As information moves through software, it is modified by a

series of transformations. A data flow diagram (DFD) is a

graphical technique that depicts information flow and the

transforms that are applied as data move from input to

output. The DFD is also known as a data flow graph or

a bubble chart.

The data flow diagram may be used to represent a sys-

tem or software at any level of abstraction. In fact, DFDs

may be partitioned into levels that represent increasing

information flow and functional detail. A level 0 DFD, also

called a fundamental system model or a context model,

represents the entire software element as a single bubble

with input and output data indicated by incoming and out-

going arrows, respectively. Additional processes (bubbles)

and information flow paths are represented as the level 0

DFD is partitioned to reveal more detail. For example, a

level 1 DFD might contain five or six bubbles with inter-

connecting arrows. Each of the processes represented at

level 1 is a subfunction of the overall system depicted in

the context model.

A rectangle is used to represent an external entity,

that is, a system element (e.g., hardware, a person,

another program) or another system that produces

information for transformation by the software or that

receives information produced by the software.

A circle represents a process or transform

that is applied to data (or control) items and changes them

in some way. An arrow represents one or more data items.

All arrows on a data flow diagram should be labeled. The

double line represents a data store--stored information that

is used by the software. The simplicity of DFD notation is

one reason why structured analysis techniques are the

most widely used.
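
The four notational elements described above can be sketched as a small data structure. This is a minimal illustration, not part of the original notation: the class names (Node, Flow, DFD) and the payroll example are assumptions chosen for the sketch. It enforces only the one rule stated in the text, that every arrow should be labeled.

```python
from dataclasses import dataclass, field

# The node kinds named in the text, with the icon each one uses.
EXTERNAL_ENTITY = "external entity"  # rectangle
PROCESS = "process"                  # circle (bubble)
DATA_STORE = "data store"            # double line


@dataclass(frozen=True)
class Node:
    name: str
    kind: str


@dataclass(frozen=True)
class Flow:
    """An arrow: one or more data items moving from source to target."""
    label: str
    source: str
    target: str


@dataclass
class DFD:
    nodes: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_node(self, name, kind):
        self.nodes[name] = Node(name, kind)

    def add_flow(self, label, source, target):
        # The text requires every arrow on a DFD to be labeled.
        if not label.strip():
            raise ValueError("every arrow on a DFD should be labeled")
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.flows.append(Flow(label, source, target))


# Hypothetical level 0 (context) model: one bubble, two external entities.
context = DFD()
context.add_node("operator", EXTERNAL_ENTITY)
context.add_node("printer", EXTERNAL_ENTITY)
context.add_node("payroll software", PROCESS)
context.add_flow("hours worked", "operator", "payroll software")
context.add_flow("paycheck", "payroll software", "printer")
```

Note that the structure records what flows where, but nothing about the order in which processing occurs.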

3-3p2.2 ·· Essential Software Engineering

It is important to note that no explicit indication of the

sequence of processing is supplied by the diagram.

Procedure or sequence may be implicit in the diagram, but

explicit procedural representation is generally delayed

until software design.

As we noted earlier, each of the bubbles may be

refined or layered to depict more detail. Figure 1 illustrates

this concept. A fundamental model for system F indicates

the primary input is A and ultimate output is B. We refine

the F model into transforms f1 to f7. Note that information

flow continuity must be maintained, that is, input and out-

put to each refinement must remain the same. This con-

cept, sometimes called balancing, is essential for the devel-

opment of consistent models. Further refinement of f4

depicts detail in the form of transforms f41 to f45. Again,

the input (X,Y) and output (Z) remain unchanged.
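
The balancing rule lends itself to a mechanical check. The sketch below assumes that a refinement is given as a list of (label, source, target) arrows plus the set of bubbles that belong to it. The function names, the placeholder name "outside", and the internal flows p, q, r and s are invented for illustration; only the boundary flows X, Y and Z come from the example above.

```python
def boundary_flows(flows, internal):
    """Return (inputs, outputs) crossing the boundary of a refinement.

    flows    -- iterable of (label, source, target) tuples
    internal -- set of bubble names that belong to the refinement
    """
    inputs = {label for label, src, dst in flows
              if src not in internal and dst in internal}
    outputs = {label for label, src, dst in flows
               if src in internal and dst not in internal}
    return inputs, outputs


def is_balanced(parent_inputs, parent_outputs, flows, internal):
    """True when the refinement has the same inputs and outputs as its parent."""
    inputs, outputs = boundary_flows(flows, internal)
    return inputs == set(parent_inputs) and outputs == set(parent_outputs)


# Refinement of f4: input (X, Y) and output (Z) must remain unchanged.
f4_children = {"f41", "f42", "f43", "f44", "f45"}
f4_flows = [
    ("X", "outside", "f41"),
    ("Y", "outside", "f42"),
    ("p", "f41", "f43"),   # internal wiring is assumed, not taken from Figure 1
    ("q", "f42", "f43"),
    ("r", "f43", "f44"),
    ("s", "f44", "f45"),
    ("Z", "f45", "outside"),
]

print(is_balanced({"X", "Y"}, {"Z"}, f4_flows, f4_children))
```

Adding an arrow that enters the refinement from outside without a matching input on the parent bubble makes the check fail, which is the inconsistency that balancing is meant to catch.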

The data flow diagram is a graphical tool that can be

very valuable during software requirements analysis.

However, the diagram can cause confusion if its function is

confused with that of a flowchart. A data flow diagram depicts

information flow without explicit representation of proce-

dural logic (e.g., conditions or loops). It is not a flowchart

with rounded edges!

The basic notation used to develop a DFD is not in

itself sufficient to describe requirements for software. For

example, an arrow shown in a DFD represents a data item

that is input to or output from a process. A data store rep-

resents some organized collection of data. But what is the

content of the data implied by the arrow or depicted by the

store? If the arrow (or the store) represents a collection of

items, what are they? These questions are answered by

applying another component of the basic notation for

structured analysis--the data dictionary. [The format and

use of the data dictionary are presented in the third part of

this ESE Module.]

The graphical notation for DFDs must be augmented

with descriptive text. A processing narrative--a paragraph

that describes a process bubble--can be used to specify the

processing details implied by the bubble within a DFD. The

processing narrative describes the input to the bubble, the

algorithm that is applied to the input and the output that is

produced. In addition, the narrative indicates restrictions

and limitations imposed on the process, performance char-

acteristics that are relevant to the process, and design con-

straints that may influence the way in which the process

will be implemented.

Extensions for Real-Time Systems

Many software applications are time dependent, and

process as much or more control-oriented information as

data. [For a detailed discussion of these real-time systems

see Chapter 15 of Software Engineering: A Practitioner's

Approach.] For now, suffice it to say that a real-time sys-

tem must interact with the real world in a timeframe dictat-

ed by the real world. Aircraft avionics, manufacturing

process control, consumer products and industrial instru-

mentation are but a few of hundreds of real-time software

applications.

Ward and Mellor Extensions

Ward and Mellor [1] extend basic structured analysis nota-

tion to accommodate the following demands imposed by a

real-time system:

· information flow that is gathered or produced on a

time-continuous basis;

· control information passed throughout the system and

associated control processing;

· multiple instances of the same transformation, as

sometimes encountered in multitasking situations;

· system states and the mechanism that causes transition

between states.

In a significant percentage of real-time applications,

the system must monitor time-continuous information gen-

erated by some real world process. For example, a real-

time test monitoring system for gas turbine engines might

be required to monitor turbine speed, combustor tempera-

ture, and a variety of pressure probes on a continuous

basis. Conventional data flow notation does not make a

distinction between discrete data and time-continuous

data. An extension to basic structured analysis notation,

shown in Figure 2, provides a mechanism for representing

time-continuous data flow. The double-headed arrow is

used to represent time-continuous flow; a single-headed

arrow is used to indicate discrete data flow. In the figure,

monitored temperature is measured continuously and a

single value for temperature set-point is also provided.

The process shown in the figure produces a time-continu-

ous output, corrected value.

The distinction between discrete and time-continuous

data flow has important implications for both the system

engineer and the software designer. During the creation of

the system model, a system engineer will be better able to

isolate those processes that may be performance critical. (It

is often likely that the input and output of time-continuous

data will be performance sensitive.) As the physical or

implementation model is created, the designer must estab-

lish a mechanism for collection of time-continuous data.

Obviously, the digital system collects data in a quasi-con-

tinuous fashion using techniques such as high-speed

polling. The notation indicates where analog-to-digital

hardware will be required and which transforms are likely

to demand high-performance software.
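
Quasi-continuous collection by polling can be sketched as follows. This is an illustration under stated assumptions: the sensor function, the sampling period, and the correction rule (set-point minus sampled value) are all hypothetical, and a simulated clock stands in for real hardware timing.

```python
import math


def poll(signal, start, stop, period):
    """Sample a time-continuous signal at fixed intervals (simulated clock)."""
    samples = []
    steps = int(round((stop - start) / period))
    for i in range(steps + 1):
        t = start + i * period
        samples.append((t, signal(t)))
    return samples


def monitored_temperature(t):
    # Hypothetical sensor: 100 degrees with a slow sinusoidal drift.
    return 100.0 + 5.0 * math.sin(t)


def corrected_values(samples, set_point):
    # Assumed correction rule: deviation of each sample from the set-point.
    return [(t, set_point - value) for t, value in samples]


# The set-point is a single discrete value; the temperature is sampled repeatedly.
samples = poll(monitored_temperature, start=0.0, stop=1.0, period=0.25)
corrections = corrected_values(samples, set_point=100.0)
```

The shorter the polling period, the closer the sampled sequence comes to the time-continuous flow, and the greater the demand placed on the software that performs the transform.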

In conventional data flow diagrams, control items or

event flows are not represented explicitly. In fact, the ana-

lyst is cautioned specifically to exclude the representation

of control flow from the data flow diagram. This exclusion

is overly restrictive when real-time applications are consid-

ered; for this reason, a specialized notation for representing

event flows and control processing has been developed.

Continuing the convention established for data flow dia-

grams, data flow is represented using a solid arrow.

Control flow, however, is represented using a dashed or

shaded arrow. A process that handles only control flows,

called a control process, is similarly represented using a

dashed bubble.

Control flow can be input directly to a conventional

process or into a control process. Figure 3 illustrates control

flow and processing as it would be represented using

Ward and Mellor notation. The figure illustrates a

top-level view of data and control flow for a manufacturing

cell. [A manufacturing cell is used in factory automation

applications. It contains computers and automated

machines (e.g., robots, NC machines, specialized fixtures)

and performs one discrete manufacturing operation under

computer control.] As components to be assembled by a

robot are placed on fixtures, a status bit is set within a parts

status buffer (a control store) that indicates the presence or

absence of each component. Event information contained

within the parts status buffer is passed as a bit string to a

process, monitor fixture and operator interface. The

process will read operator commands only when the con-

trol information, bit string, indicates that all fixtures con-

tain components. An event flag, start/stop flag, is sent to

robot initiation control, a control process that enables fur-

ther command processing. Other data flows occur as a con-

sequence of the process activate event that is sent to

process robot commands.
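
The control logic of the manufacturing cell can be sketched in a few functions. The function names follow the process names in the text. The encoding of the bit string as a string of "0" and "1" characters, the "start" command, and the return values are assumptions made for the sketch.

```python
def all_fixtures_loaded(bit_string):
    """True when every fixture bit in the parts status buffer is set."""
    return len(bit_string) > 0 and all(bit == "1" for bit in bit_string)


def monitor_fixture_and_operator_interface(bit_string, operator_command):
    """Return the start/stop flag, or None when commands are not read."""
    if not all_fixtures_loaded(bit_string):
        return None  # control information blocks command reading
    return operator_command == "start"  # assumed command vocabulary


def robot_initiation_control(start_stop_flag):
    """Control process: raise the process activate event only on a start flag."""
    return start_stop_flag is True


# All four fixtures hold components, so the operator command is read.
flag = monitor_fixture_and_operator_interface("1111", "start")
process_activate = robot_initiation_control(flag)

# One fixture is empty, so the command is ignored and nothing is activated.
blocked = robot_initiation_control(
    monitor_fixture_and_operator_interface("1011", "start"))
```

Here the bit string and the start/stop flag carry no data to be transformed; they only decide whether processing happens, which is why the notation draws them as control flows rather than data flows.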

In some situations multiple instances of the same con-

trol or data transformation process may occur in a real-

time system. This can occur in a multitasking environment

when tasks are spawned as a result of internal processing

or external events.

[1] Ward, P.T., and S. Mellor, Structured Development for

Real-Time Systems (3 volumes), Yourdon Press, 1985.

Exercise 3-7

Data Flow Modeling

Recall the electronic checkbook problem introduced

in ESE Module 3-2. Assume that you work for a con-

sumer products company that is about to build an

electronic checkbook, called ElectroChex. The product,

about the size and shape of a standard checkbook,

will print checks that you insert into a slot at the

end. The product stores up to 256 payee names, cat-

egorizes payments, allows you to enter numeric and

alpha information via a QWERTY keyboard, and can

communicate with PCs.

1. Develop Level 0, 1 and 2 data flow diagrams for

ElectroChex.

2. Be sure to identify all external entities.

3. Be sure to label all bubbles (transforms) and

arrows in your DFDs.

4. Compare your results with those developed by your

colleagues.

Hint: If you have trouble starting this exercise, think

back to our discussion of the grammatical parse in ESE

Module 3-3, Part 1. You'll recall that each of the

active verbs in the statement of scope represents a

potential function for the system. The functions trans-

late into bubbles at level 1.

Creating DFDs

Although DFD notation is quite simple, the creation of

a DFD is more difficult than you might think. The rea-

son: most software people have been trained to think

procedurally. Data flow modeling has a procedural

component to it, but it is a process of refinement

based on how data flow, not on how program logic

progresses. For this reason, it may seem a bit odd to

you if this is your first introduction.

Readings

The following excerpt has been adapted from

Software Engineering: A Practitioner's Approach and

discusses the mechanics for creating a data flow dia-

gram.

The data flow diagram (DFD) enables the software engi-