Parametric Fault Trees with Dynamic Gates and Repair Boxes

Andrea Bobbio, Università del Piemonte Orientale, Alessandria

Daniele Codetta R., Università del Piemonte Orientale, Alessandria

Key Words: Parametric fault tree, modularization, dynamic gate, repair box, Colored Petri net.

SUMMARY & CONCLUSIONS

A new approach is proposed to include s-dependencies in Fault Tree (FT) models. With respect to previous techniques, the approach presented in this paper is based on two peculiar powerful features. First, we adopt a parameterization technique, referred to as Parametric FT (PFT), to fold equal subtrees (or basic events) in order to resort to a more compact FT representation. It is shown that parameterization can be conveniently adopted as well for dynamic gates. Second, PFT can be modularized and each module translated into a High Level Colored Petri net in the form of a Stochastic Well-formed Net (SWN). SWN generate a lumped Markov chain and the saving in the dimension of the state space can be very substantial with respect to standard (non colored) Petri nets. Translation of PFT modules into SWN has proved to be very flexible, and various kinds of new dependencies can be easily accommodated. In order to exploit this flexibility a new primitive, called repair box, is introduced. A repair box, attached to an event, causes the starting of a repair activity of all the components that are failed as the event occurs. In contrast to all the previous FT based models, the addition of repair boxes enables our approach to model cyclic behaviors. We refer to the proposed approach as Dynamic Repairable PFT (DRPFT). A tool supporting DRPFT is briefly described and the tool is validated by analyzing a benchmark proposed recently in the literature for quantitative comparison (Ref. 12).

1. INTRODUCTION

Traditional Fault-trees (FT) have gained a widespread acceptance for the dependability and safety analysis of complex and critical systems, since they are simple to manipulate and are supported by powerful software tools for the qualitative and quantitative analysis. However, traditional FT suffer from the main limitation that basic components must be assumed as s-independent. S-dependence in the failure process arises when the failure behavior of a component depends on the state of the system. This kind of s-dependence has been recently tackled by many authors (Refs. 1, 2, 3, 4, 5). In the Dynamic FT (DFT) approach (Refs. 1, 3), the FT is decomposed in independent modules and each module is analyzed by generating its state space and solving the underlying Markov chain (CTMC), or by solving the local dependencies by means of numerical techniques (Ref. 5). All the above FT models are acyclic models and no previous technique has addressed the problem of including in a FT actions, taken after a fault, that restore the system to a previous condition (repair, recovery, roll-back, rejuvenation), converting an acyclic model into a cyclic one.

In order to alleviate the largeness problem, a more compact FT representation has been proposed in Refs. 6,7. This compact representation (referred to as Parametric FT (PFT)) is based on the observation that often, due to redundancies, the system to be modeled contains similar replicated units or subtrees. Similar subtrees may be folded and parameterized, so that only one representative is explicitly included in the model. PFT can still be modularized starting from an algorithm presented in Ref. 8, and each module can be automatically converted into a high level Colored Petri Net formalism called SWN (Ref. 9). SWN’s have the property that they generate symbolic states (markings) that may be viewed as a high level description of sets of actual markings. The definition of symbolic markings allows us to exploit symmetry properties in the model and to generate the underlying Markov chain in lumped form. The degree of saving in the state space generation depends on the redundancies present in the system and can be very consistent (Ref. 6).

The aim of the present paper is to present an extended version of PFT that we call DRPFT (Dynamic Repairable Parametric FT). DRPFT implements dynamic gates in compact parametric form. Moreover, DRPFT is extended to include dependencies arising from the repair process, by adding a new primitive called repair box. The solution procedure for a DRPFT is presented and a software tool developed for the analysis is briefly described. Finally, the DRPFT is validated through a benchmark example taken from Ref. 12. The quantitative results obtained from DRPFT coincide with those published in Ref. 12, but the example emphasizes that the dimension of the state space that is achieved using the DRPFT approach is more than two orders of magnitude lower that the one obtained by previous non parametric techniques.

2. ACRONYMS

FT fault tree

P(D)FT parametric (dynamic) fault tree

DRPFT dynamic repairable parametric fault tree

FDEP functional dependency gate

PAND priority and gate

SEQ sequence enforcing gate

WSP warm spare gate

SWN stochastic well-formed nets

3. DYNAMIC REPAIRABLE PARAMETRIC FT

PFT have been extensively discussed in Refs. 6 and 7. By means of the introduction of a new event, called replicator event (and drawn as a dashed rectangle) similar subtrees (or basic events) can be folded and parameterized, by defining in the replicator event a parameter with its range of variation. Replicator events are, thus, a compact construct to generate as many identical subtrees as the cardinality of the declared parameter. PFT with AND, OR, and K:N gates can be automatically converted (Ref. 6) into a Colored Petri net in the form SWN (Ref. 9). Notice that a PFT with no replicator events becomes a standard FT, and its automatic translation in a SWN provides a standard (non colored) PN. In the following, we extend the PFT formalism to include the dynamic gates proposed in Refs. 1 and 3. In particular, we consider the dynamic gates FDEP (functional dependency gate), PAND (priority and gate), SEQ (sequence enforcing gate) and WSP (warm spare gate). We show that the dynamic gates can be parameterized (when the proper conditions arise) and translated into a SWN. Finally, we introduce a new primitive, called repair box. A repair box is assigned a constant repair rate, and can be connected to any event in the PFT, with the meaning of indicating the repair (with the assigned repair rate) of all the basic components that are failed when the event occurs. The PFT formalism, augmented with dynamic gates and repair boxes, is referred to as Dynamic Repairable PFT (DRPFT). The analysis of a DRPFT follows a classical hierarchical scheme (Refs. 2,10). The DRPFT structure is first modularized, i.e. partitioned in s-independent subtrees, called modules. Each module is converted into a Petri net in the form SWN and analyzed in isolation by resorting to the underlying lumped CTMC. The module failure probability, computed from the resulting CTMC, is cast back into the original DRPFT, by replacing the whole module with a single basic event. All the above steps are automatized in a software tool and hidden to the modeler.

4. DYNAMIC GATES IN DRPFT AND THEIR TRANSLATION IN SWN

When suitable symmetry conditions arise, dynamic gates can be represented in compact parametric form, and then automatically translated into a SWN. The way in which this procedure is implemented in DRPFT is illustrated in the following paragraphs. The graphical symbols adopted for the dynamic gates are those introduced in Ref. 3.

4.1 FDEP

A FDEP gate is characterized by a trigger event and a set of dependent events. Dependent events may fail by their own or by the effect of the trigger event failure. In Fig. 1a, T is the trigger event while A and B are the dependent events. Since FDEP in Fig. 1a has no replicator events its translation is in the form of the standard PN of Fig. 1b. Failure of component A is represented by a token in place A that can be determined by the firing of transition A_f (failure of A by its own) or by the firing of transition fdep_2 (failure of the trigger event T). In the case the dependent components are identical, they can be folded and parameterized as in Fig. 2a. T is the trigger event while D(i) is a replicator event providing the parametric representation of the set of dependent components. If D(i) has the cardinality equal to 2, the DRPFT of Fig. 2a is coincident with the DFT of Fig 1a. However, the parameter i can have any cardinality, and, hence, can represent a FDEP gate with any number of multiple identical dependent components. Fig. 2b shows the corresponding SWN. The failure of one of the dependent components is represented by a colored token in place D and may be caused either by the firing of transition D_f (failure of one of the D(i)’s) or by the firing of transition fdep_2 (failure of T). Notice again that, in contrast to the FT of Fig. 1a, the complexity of the DRPFT structure of Fig. 2a does not depend on the number of dependent components that only influence the cardinality of the set D(i).

4.2 PAND

PAND gate fails if all of its input fail in a specified order (from left to right). Let us consider the gate in Fig. 3a: its input events are A and B and they have to fail in this order. Since PAND in Fig. 3a has no replicator events its translation is in the form of the standard PN of Fig. 3b. Transition pand_2 fires if B fails and A is still working; this transition puts a token in place Oper to indicate that the order has not been respected and the gate failure did not occur. Otherwise, if A fails before B, transition pand_1 fires putting a token in the failure place PAND_fail. A PFT construct can be envisaged if the PAND gate has more than two identical ordered input events.

4.3 SEQ

SEQ gate forces its input to occur in a specified order (we assume from left to right). The translation of the SEQ gate of Fig. 4a into a Petri net is shown in Fig. 4b where the transition B_f, representing the failure of B, is enabled and fires in the presence of a token in place A (A is failed). In a similar way, the failure of C is enabled by the failure of B.

4.4 WSP

A WSP gate is characterized by a main component, and a set of ordered spare components. When the main component fails it is replaced by the first component available in the spare list. A spare may be in one of the following states:

· dormant or stand-by (it is not working, but ready to replace the main component);

· working (it is working in place of the main component which is failed);

· failed.

The failure rate of a spare component when in a working condition is l. Denoting by a (0£a £1) the dormancy factor, the failure rate of the spare in the dormant condition is al. Notice that a =0 models a cold stand-by, a=1 models the hot s-independent case (the WSP behaves as an AND gate). The WSP gate fails when the main component fails and there are no available spares. Assuming that there are m identical spares, we can model the spares by means of the replicator node SP(i), in which the parameter i of cardinality m is defined (see Fig. 5a). Hence, in the DRPFT representation (Fig. 5a) the WSP gate has two inputs:

- a basic event P representing the failure of the main component;

- a replicator basic event SP(i) that is the parametric representation of the set of spares; the cardinality m of this set is equal to the number of spares.

The translation of the WSP gate of Fig. 5a is given in the SWN of Fig. 5b. Place SP_na contains the coloured tokens of the spares which are not available because failed or already working; SP_curr contains the token relative to the spare which is currently replacing the main component. Transition SP_fail models the fault of a spare when in dormant condition putting the relative token in SP_na.

When the main component P fails (token in place P_dn), transition P_spare fires putting the token relative to the spare to be used in SP_curr and SP_na. If later the spare fails (firing of transition SP_fail), if place SP_na contains a number of tokens equal to the number of spares (there are no more available spares at the moment) transition P_fail fires modeling the general failure of the gate, else another spare starts working by means of P_spare transition.

The SWN of Fig 5b is actually more general than the corresponding WSP gate of Fig. 5a. Indeed, assigning a color class to place P_dn we can model a situation in which there are n main components with m shared (or non-shared) spares.

5. MODULES DETECTION AND CLASSIFICATION

A module is a subtree that is s-independent from the rest of the FT. In a DRPFT, a subtree is a module when it has no nodes in common with other modules, does not descend from a dynamic gate or does not contain a repair box. However, the parameterization of the FT hinders the search for shared basic events, since additional conditions on the parameter definition and propagation need to be satisfied (Ref. 7). The example in Fig. 6 clarifies this point.

Structural

Module /

STEP

1 Shared

nodes /

STEP

2

Shared

Param.
nodes / STEP
3
Dyn.
Gate
Descendant /

Mod.

Type

Min.

SYS1 / no / no / no / yes / stat. / yes
SUB(i) / no / yes / no / no / - / -
SYS2 / no / no / no / yes / dyn. / yes
D_F / no / no / yes / no / - / -
SYS3 / no / no / no / yes / dyn. / no
Q_F(i) / no / no / no / yes / dyn. / yes

A module may be classified as static or dynamic. Static modules contain common basic events (possibly in parameterized form) and can be analyzed by means of suitable combinatorial techniques. Dynamic modules contain dynamic gates or repair boxes and require a state-space analysis which, in the DRPFT methodology, is obtained by translating the dynamic module into a SWN. Dynamic modules are analyzed in isolation, and replaced in the original FT by a single basic event to which the module Top event pro-