ESE Module 3-3
Structured Analysis
Part 2: Data Flow Modeling
All software, whether it's used to produce paychecks
or control a supersonic aircraft, is an information trans-
former. That is, all software accepts input, transforms it
in some way, and produces output as a result. In
essence, data flow into a system, get changed, and
ultimately flow out.
As we create the analysis model, our challenge is
to understand this transformation. How are data
changed as they flow through a system? What func-
tions are required to transform input into output?
These questions are answered during the second
analysis modeling activity: the creation of a data
flow model.
The data flow model kills two birds with one stone.
It indicates the flow of data and it also provides us
with a mechanism for functional decomposition.
Hence we address two important analysis principles:
(1) understanding the flow of data through a system,
and (2) partitioning (decomposing) function into pro-
gressively lower levels of abstraction. (See ESE Module
3-2 for details.)
Flow Modeling Notation
Data flow notation is surprisingly simple, and yet the
models that it produces can be quite sophisticated.
The basic data flow icons address only four things:
producers and consumers of data; processes that
transform data; the data flow itself; and places where
data are stored.
Armed with these icons, the software engineer
(analyst) can produce a picture of the software (and
its external environment) that establishes the basis for
architectural design.
Readings
The following excerpt has been adapted from
Software Engineering: A Practitioner's Approach and
discusses data flow notation. Extensions for real-time
software (not discussed in the video portion of this ESE
Module) are presented briefly.
Information is transformed as it flows through a computer-
based system. The system accepts input in a variety of
forms; applies hardware, software and human elements to
transform input into output; and produces output in a vari-
ety of forms. Input may be a control signal transmitted by
a transducer, a series of numbers typed by a human opera-
tor, a packet of information transmitted on a network link
or a voluminous data file retrieved from secondary storage.
The transform(s) may comprise a single logical compari-
son, a complex numerical algorithm or the rule-inference
approach of an expert system. Output may light a single
LED or produce a 200-page report. In effect, we can create
a flow model for any computer-based system, regardless of
size and complexity.
Structured analysis is an information flow and content
modeling technique. A computer-based system is repre-
sented as an information transform. The overall function of
the system is shown as a single information transform,
drawn as a circle or bubble. One
or more inputs, shown as labeled arrows, originate from
external entities, each represented as a box. The input drives the
transform to produce output information (also represented
as labeled arrows) that is passed to the external entities. It
should be noted that the model may be applied to the
entire system or to the software element only. The key is to
represent the information fed into and produced by the
transform.
Data Flow Diagrams
As information moves through software, it is modified by a
series of transformations. A data flow diagram (DFD) is a
graphical technique that depicts information flow and the
transforms that are applied as data move from input to
output. The DFD is also known as a data flow graph or
a bubble chart.
The data flow diagram may be used to represent a sys-
tem or software at any level of abstraction. In fact, DFDs
may be partitioned into levels that represent increasing
information flow and functional detail. A level 0 DFD, also
called a fundamental system model or a context model,
represents the entire software element as a single bubble
with input and output data indicated by incoming and out-
going arrows, respectively. Additional processes (bubbles)
and information flow paths are represented as the level 0
DFD is partitioned to reveal more detail. For example, a
level 1 DFD might contain five or six bubbles with inter-
connecting arrows. Each of the processes represented at
level 1 is a subfunction of the overall system depicted in
the context model.
A rectangle is used to represent an external entity,
that is, a system element (e.g., hardware, a person,
another program) or another system that produces
information for transformation by the software or that
receives information produced by the software.
A circle represents a process or transform
that is applied to data (or control) items and changes them
in some way. An arrow represents one or more data items.
All arrows on a data flow diagram should be labeled. The
double line represents a data store--stored information that
is used by the software. The simplicity of DFD notation is
one reason why structured analysis techniques are the
most widely used.
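The four notational elements can also be held in a small data structure. The sketch below is one possible Python representation and is not part of the original notation. The names DFD, Flow, and build_context_model, along with the payroll example, are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

# The three node symbols: rectangle, circle (bubble), double line.
KINDS = ("entity", "process", "store")


@dataclass(frozen=True)
class Flow:
    """A labeled arrow carrying one or more data items."""
    label: str
    source: str
    target: str


@dataclass
class DFD:
    """A data flow diagram: named nodes plus the labeled flows between them."""
    nodes: dict = field(default_factory=dict)   # node name -> kind
    flows: list = field(default_factory=list)

    def add_node(self, name, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def add_flow(self, label, source, target):
        # Every arrow on a DFD should be labeled.
        if not label:
            raise ValueError("all arrows on a DFD should be labeled")
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"undefined node: {end}")
        self.flows.append(Flow(label, source, target))


def build_context_model():
    """Level 0 model of a hypothetical payroll system: one bubble, two entities."""
    dfd = DFD()
    dfd.add_node("timekeeper", "entity")
    dfd.add_node("payroll system", "process")
    dfd.add_node("employee", "entity")
    dfd.add_flow("hours worked", "timekeeper", "payroll system")
    dfd.add_flow("paycheck", "payroll system", "employee")
    return dfd
```

Note that the structure records only what flows where. Like the diagram itself, it says nothing about the order in which processing occurs.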
3-3p2.2 ·· Essential Software Engineering
It is important to note that no explicit indication of the
sequence of processing is supplied by the diagram.
Procedure or sequence may be implicit in the diagram, but
explicit procedural representation is generally delayed
until software design.
As we noted earlier, each of the bubbles may be
refined or layered to depict more detail. Figure 1 illustrates
this concept. A fundamental model for system F indicates
the primary input is A and ultimate output is B. We refine
the F model into transforms f1 to f7. Note that information
flow continuity must be maintained, that is, input and out-
put to each refinement must remain the same. This con-
cept, sometimes called balancing, is essential for the devel-
opment of consistent models. Further refinement of f4
depicts detail in the form of transforms f41 to f45. Again,
the input (X,Y) and output (Z) remain unchanged.
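Balancing can be checked mechanically: the flows that cross the boundary of a refinement must match the inputs and outputs of the parent bubble. The following Python sketch illustrates the idea under stated assumptions. The function names are hypothetical, and the internal wiring of f41 to f45 (flows p through t) is invented for the example, since only the external flows X, Y and Z are given in the text.

```python
def external_flows(flows, internal):
    """Return (inputs, outputs) that cross the boundary of a refined diagram.

    flows    -- iterable of (label, source, target) tuples
    internal -- set of process names that belong to the refinement
    """
    inputs = {label for label, src, dst in flows
              if src not in internal and dst in internal}
    outputs = {label for label, src, dst in flows
               if src in internal and dst not in internal}
    return inputs, outputs


def is_balanced(parent_inputs, parent_outputs, child_flows, child_processes):
    """True when the refinement has exactly the parent's inputs and outputs."""
    inputs, outputs = external_flows(child_flows, set(child_processes))
    return inputs == set(parent_inputs) and outputs == set(parent_outputs)


# Hypothetical refinement of f4. Only X, Y and Z come from the text;
# the internal flows p..t are assumptions made for illustration.
F4_PROCESSES = ["f41", "f42", "f43", "f44", "f45"]
F4_FLOWS = [
    ("X", "outside", "f41"),
    ("Y", "outside", "f42"),
    ("p", "f41", "f43"),
    ("q", "f42", "f43"),
    ("r", "f43", "f44"),
    ("s", "f43", "f45"),
    ("t", "f44", "f45"),
    ("Z", "f45", "outside"),
]
```

With these definitions, is_balanced(["X", "Y"], ["Z"], F4_FLOWS, F4_PROCESSES) holds. Adding an extra external input to any of f41 to f45 without also adding it to f4 makes the check fail.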
The data flow diagram is a graphical tool that can be
very valuable during software requirements analysis.
However, the diagram can cause confusion if its function is
mistaken for that of a flowchart. A data flow diagram depicts
information flow without explicit representation of proce-
dural logic (e.g., conditions or loops). It is not a flowchart
with rounded edges!
The basic notation used to develop a DFD is not in
itself sufficient to describe requirements for software. For
example, an arrow shown in a DFD represents a data item
that is input to or output from a process. A data store rep-
resents some organized collection of data. But what is the
content of the data implied by the arrow or depicted by the
store? If the arrow (or the store) represents a collection of
items, what are they? These questions are answered by
applying another component of the basic notation for
structured analysis--the data dictionary. [The format and
use of the data dictionary are presented in the third part of
this ESE Module.]
The graphical notation for DFDs must be augmented
with descriptive text. A processing narrative--a paragraph
that describes a process bubble--can be used to specify the
processing details implied by the bubble within a DFD. The
processing narrative describes the input to the bubble, the
algorithm that is applied to the input and the output that is
produced. In addition, the narrative indicates restrictions
and limitations imposed on the process, performance char-
acteristics that are relevant to the process, and design con-
straints that may influence the way in which the process
will be implemented.
Extensions for Real-Time Systems
Many software applications are time dependent and
process as much control-oriented information as data, or
more. [For a detailed discussion of these real-time systems
see Chapter 15 of Software Engineering: A Practitioner's
Approach.] For now, suffice it to say that a real-time sys-
tem must interact with the real world in a timeframe dictat-
ed by the real world. Aircraft avionics, manufacturing
process control, consumer products and industrial instru-
mentation are but a few of hundreds of real-time software
applications.
Ward and Mellor Extensions
Ward and Mellor [1] extend basic structured analysis nota-
tion to accommodate the following demands imposed by a
real-time system:
· information flow that is gathered or produced on a
time-continuous basis;
· control information passed throughout the system and
associated control processing;
· multiple instances of the same transformation, which are
sometimes encountered in multitasking situations;
· system states and the mechanism that causes transition
between states.
In a significant percentage of real-time applications,
the system must monitor time-continuous information gen-
erated by some real world process. For example, a real-
time test monitoring system for gas turbine engines might
be required to monitor turbine speed, combustor tempera-
ture, and a variety of pressure probes on a continuous
basis. Conventional data flow notation does not make a
distinction between discrete data and time-continuous
data. An extension to basic structured analysis notation,
shown in Figure 2, provides a mechanism for representing
time-continuous data flow. The double-headed arrow is
used to represent time-continuous flow; a single-headed
arrow is used to indicate discrete data flow. In the figure,
monitored temperature is measured continuously and a
single value for temperature set-point is also provided.
The process shown in the figure produces a time-continu-
ous output, corrected value.
The distinction between discrete and time-continuous
data flow has important implications for both the system
engineer and the software designer. During the creation of
the system model, a system engineer will be better able to
isolate those processes that may be performance critical. (It
is often likely that the input and output of time-continuous
data will be performance sensitive.) As the physical or
implementation model is created, the designer must estab-
lish a mechanism for collection of time-continuous data.
Of course, a digital system collects data in a quasi-con-
tinuous fashion using techniques such as high-speed
polling. The notation indicates where analog-to-digital
hardware will be required and which transforms are likely
to demand high-performance software.
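To make the idea of quasi-continuous collection concrete, the Python sketch below polls a simulated sensor at a fixed period and applies a transform to each sample. Everything in it is an assumption made for illustration: the sensor is a mathematical stand-in for analog hardware, time is simulated rather than measured, and the correction (set-point minus monitored value) is only one plausible reading of the process in the figure, which the text does not define.

```python
import math


def monitored_temperature(t):
    """Stand-in for an analog sensor: temperature at time t (in seconds)."""
    return 500.0 + 20.0 * math.sin(t)


def poll(signal, period, count):
    """Sample a time-continuous signal at a fixed period.

    Returns a list of (time, value) pairs: the quasi-continuous view
    of the signal that a digital system actually works with.
    """
    return [(i * period, signal(i * period)) for i in range(count)]


def corrected_values(samples, set_point):
    """Assumed transform: deviation of each sample from the discrete set-point."""
    return [(t, set_point - value) for t, value in samples]
```

A shorter polling period yields a closer approximation of the continuous flow, but it also raises the processing load. This is why transforms fed by time-continuous data tend to be performance sensitive.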
In conventional data flow diagrams, control items or
event flows are not represented explicitly. In fact, the ana-
lyst is cautioned specifically to exclude the representation
of control flow from the data flow diagram. This exclusion
is overly restrictive when real-time applications are consid-
ered; for this reason, a specialized notation for representing
event flows and control processing has been developed.
Continuing the convention established for data flow dia-
grams, data flow is represented using a solid arrow.
Control flow, however, is represented using a dashed or
shaded arrow. A process that handles only control flows,
called a control process, is similarly represented using a
dashed bubble.
Control flow can be input directly to a conventional
process or into a control process. Figure 3 illustrates control
flow and processing as it would be represented using
Ward and Mellor notation. The figure illustrates a top-
level view of a data and control flow for a manufacturing
cell. [A manufacturing cell is used in factory automation
applications. It contains computers and automated
machines (e.g., robots, NC machines, specialized fixtures)
and performs one discrete manufacturing operation under
computer control.] As components to be assembled by a
robot are placed on fixtures, a status bit is set within a parts
status buffer (a control store) that indicates the presence or
absence of each component. Event information contained
within the parts status buffer is passed as a bit string to a
process, monitor fixture and operator interface. The
process will read operator commands only when the con-
trol information, bit string, indicates that all fixtures con-
tain components. An event flag, start/stop flag, is sent to
robot initiation control, a control process that enables fur-
ther command processing. Other data flows occur as a con-
sequence of the process activate event that is sent to
process robot commands.
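The control logic implied by this description can be sketched in a few lines of Python. This is an implementation-level illustration only, since the diagram itself carries no procedural detail. The encoding of the bit string as a text string of '0' and '1' characters, the command names, and the function names are all assumptions.

```python
def all_fixtures_loaded(bit_string):
    """True when every status bit is '1', i.e. every fixture holds a component."""
    return len(bit_string) > 0 and set(bit_string) == {"1"}


def monitor_fixture_and_operator_interface(bit_string, operator_command):
    """Read the operator command only when all fixtures contain components.

    Returns the start/stop flag (True for start, False for stop),
    or None when the command is not read at all.
    """
    if not all_fixtures_loaded(bit_string):
        return None
    return operator_command == "start"


def robot_initiation_control(start_stop_flag):
    """Control process: raise the process activate event only on a start flag."""
    return start_stop_flag is True
```

For example, with the bit string "1011" the operator command is ignored and no process activate event is raised. With "1111" and a "start" command, the flag is set and command processing is enabled.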
In some situations multiple instances of the same con-
trol or data transformation process may occur in a real-
time system. This can occur in a multitasking environment
when tasks are spawned as a result of internal processing
or external events.
[1] Ward, P.T., and S. Mellor, Structured Development for
Real-Time Systems (3 volumes), Yourdon Press, 1985.
Exercise 3-7
Data Flow Modeling
Recall the electronic checkbook problem introduced
in ESE Module 3-2. Assume that you work for a con-
sumer products company that is about to build an
electronic checkbook, called ElectroChex. The prod-
uct, about the size and shape of a standard check-
book, will print checks that you insert into a slot at the
end. The product stores up to 256 payee names, cat-
egorizes payments, allows you to enter numeric and
alphabetic information via a QWERTY keyboard, and can
communicate with PCs.
1. Develop Level 0, 1 and 2 data flow diagrams for
ElectroChex.
2. Be sure to identify all external entities.
3. Be sure to label all bubbles (transforms) and
arrows in your DFDs.
4. Compare your results with those developed by your
colleagues.
Hint: If you have trouble starting this exercise, think
back to our discussion of the grammatical parse in ESE
Module 3-3, Part 1. You'll recall that each of the
active verbs in the statement of scope represents a
potential function for the system. The functions trans-
late into bubbles at level 1.
Creating DFDs
Although DFD notation is quite simple, the creation of
a DFD is more difficult than you might think. The rea-
son: most software people have been trained to think
procedurally. Data flow modeling has a procedural
component to it, but it is a process of refinement
based on how data flow, not on how program logic
progresses. For this reason, it may seem a bit odd to
you if this is your first introduction.
Readings
The following excerpt has been adapted from
Software Engineering: A Practitioner's Approach and
discusses the mechanics for creating a data flow dia-
gram.
The data flow diagram (DFD) enables the software engi-