A Project to Develop a Handbook of DOE Operational Safety Event and Accident Investigation Techniques

September 16, 2010

The objective of this project is to convert the existing DOE AI “Workbook” into a DOE Technical Standard as defined in DOE-TSPP-5, dated July 1, 2009, and update the material to include current thinking, methods, and approaches for analysis and the conduct of investigations. The DOE AI Workbook has not been updated since 1999.

Note: The existing, current chapters of the DOE AI “Workbook” are in BLACK; the proposed additions to create the handbook are in RED.

Outline: Handbook of U.S. Department of Energy (DOE) Operational Safety Event and Accident Investigation Techniques

Foreword

This section provides an overview and describes the changes to the current “Workbook” that will be made to create the new Technical Standard “Handbook”, as well as the new or updated techniques and approaches to be included.

Introduction and Accident Prevention Philosophy

This section includes a discussion of Accident Prevention, Highly Reliable Organizations (HRO)/Organizational Learning, Human Performance Improvement (HPI), the history of the Accident Investigation (AI) “Handbook”, and the role of stakeholders in owning, developing, and using the “Handbook” as a living document, to be updated as knowledge matures from using the techniques.

Part I - The Context of DOE's Accident Investigation and Prevention Program

Section 1 - Accidents: General Principles Framework of DOE Accident Investigations and Operational Safety Event Reviews

1.1 Nature of Accidents and their Prevention (Update with new Prevention theory/understanding)

1.2 Human Performance and Reliability Factors Considerations

This section will be significantly updated to include: “Accident Prevention” philosophy, approaches and methods for human performance improvement, and promotion of high reliability organizations. Introductory portions and an overview of “Volume 1 – HPI Handbook Concepts and Principles” will be excerpted and referenced in this section.

1.2.1 Human Error Event Precursors

1.2.2 Human-Machine Interface

1.2.3 Human Capabilities

1.2.4 Equipment/Design Considerations

1.2.5 Physical Work Environment

1.3 The Integrated Safety Management (ISM) Safety System Framework

Additional emphasis will be placed on the flow of the DOE approach from the Integrated Safety Management (ISM) perspective, to the human and organizational management systems interfaces.

1.3.1 Organizational Work Environment (Modify Existing Section 1.2.5, update with HRO/HPI Concepts)

1.3.2 Imagining work as believed to be performed versus how it is actually performed

1.3.3 The High Reliability Organization

1.3.4 Latent Organizational Deficiencies

Key Points to Remember

Section 2 - DOE's Accident Prevention and Investigation Program

The focus of updating this section will be mostly on: 1) incorporating the changes that will come from the issuance of the new version of the AI Order 225.1B, and 2) providing lessons learned and best practices discussed by both DOE and DOE contractor personnel who have utilized these approaches in the years since the “Workbook” was published.

2.1 Overall Management of the Program

2.2 Roles and Responsibilities of Key Participants

2.2.1 Appointing Officials and Line Management Participants

2.2.2 Accident Investigation Board or Operational Safety Event Review Team (OS-ERT)

2.3 Site Readiness

2.3.1 Readiness - What Is It?

2.3.2 Establishing Written Procedures and Responsibilities

2.3.3 Maintaining Resources to Support an Accident Investigation or Operational Safety Event Review

2.3.4 Training for Site Readiness

2.3.5 Conducting Periodic Practices and Evaluations

2.4 Accident Investigation Process or Operational Safety Event Review Overview

2.5 Waivers

2.6 Limited Scope Accident Investigations and Operational Safety Event Reviews

Key Points to Remember

Part II - The Accident Investigation Board or the Operational Safety Event Review Team Process

Section 3 - Appointing the Investigation Board or the Operational Safety Event Review Team

This section will be sub-divided into two distinct yet similar approaches:

1) DOE Federally Led Accident Investigation Boards.

2) Team-based, Contractor-Led Operational Safety Event Reviews of “information rich” events of management concern, causal factors, latent organizational weaknesses, and near-miss events or areas of management concern.

3.1 Establishing the DOE Federally Led Accident Investigation Board or the Contractor-Led Operational Safety Event Review Team and Its Authority

3.2 Briefing the Board or Team

3.3 DOE Federally Led Accident Investigations

3.4 Contractor-Led Operational Safety Event Reviews – Occurrence Reporting and Processing System (ORPS) Level or Management Initiated (*Capture and discuss various approaches and methods being used)

Key Points to Remember

Section 4 - Implementing Site Readiness

This section will not undergo a major change. The focus of updating this section will be to provide lessons learned and best practices discussed by both DOE and DOE contractor personnel who have utilized these approaches over the past years.

4.1 Immediate Post-Accident Actions

4.2 Preserving and Documenting the Accident Scene

4.2.1 Securing and Preserving the Scene

4.2.2 Documenting the Scene

4.3 Collecting, Preserving, and Controlling Evidence

4.4 Obtaining Initial Witness Statements

4.5 Transferring Information to the Board

Key Points to Remember

Section 5 - Managing the Accident Investigation or Operational Safety Event Review

A focus of the revision of this section will be digging below the surface into accident and event trend data, near misses, ORPS data, surveillance data, human performance interviews, and self-identified reports of management system and human performance weaknesses. This section will be updated based on past experience and lessons learned from application of the AI analysis methods by DOE and DOE Contractors.

The general framework will be rewritten to highlight two approaches:

1) Federally Led Accident Investigation Boards.

2) Contractor-Led, Team-based Operational Safety Event Reviews.

5.1 Project Planning

5.1.1 Collecting Initial Site Information

5.1.2 Team-based approach - Determining Task Assignments

5.1.3 Preparing a Schedule

5.1.4 Acquiring Resources

5.1.5 Addressing Potential Conflicts of Interest

5.1.6 Establishing Information Access and Release Protocols

5.2 Managing the Investigation or Operational Safety Event Review Process

5.2.1 Taking Control of the Accident Scene – Site Transition from Emergency Response to an Accident Investigation

5.2.2 Initial Meeting of the Investigation Board

5.2.3 Promoting Teamwork

5.2.4 Managing Information Collection

5.2.5 Coordinating Internal and External Communication

5.2.6 Managing the Analysis – Role of the Analyst

5.2.7 Managing Report Writing

5.2.8 Managing Onsite Closeout Activities

5.2.9 Managing Post-Investigation Activities

5.3 Controlling the Investigation Process

5.3.1 Monitoring Performance and Providing Feedback

5.3.2 Controlling Cost and Schedule

5.3.3 Assuring Quality

Key Points to Remember

Section 6 - Collecting Data

This section will not undergo a major change. The focus of updating this section will be to provide lessons learned and best practices discussed by both DOE and DOE contractor personnel who have utilized these approaches over the past years.

6.1 Collecting Human Evidence

6.1.1 Locating Witnesses

6.1.2 Conducting Interviews

6.2 Collecting Physical Evidence

6.2.1 Documenting Physical Evidence

6.2.2 Inspecting Physical Evidence

6.2.3 Removing Physical Evidence

6.3 Collecting Documentary Evidence

6.4 Examining Organizational Concerns, Management Systems, and Line Management Oversight

6.5 Preserving and Controlling Evidence

Key Points to Remember

Section 7 - Analyzing Data

This section will be updated to include discussions of the various ways to approach event analysis and determination of facts and associated causal factors.

This includes both of the following, where they have been found useful:

1) Accident Prevention/Operational Safety Event Review/Analysis methods, and

2) Accident Investigation methods.

The section will also include lessons learned from industry (e.g., the Institute of Nuclear Power Operations (INPO)) and from DOE and Contractor use of various analyses. The section will also discuss the methodology of how HRO/Latent Organizational Weaknesses and HPI/Human Error are linked in the event chain, and the derivation of causal factors.

7.1 Determining Facts

7.2 Methods for Event and Causal Factor Analysis - Overview

7.2.1 Historical Event Analysis and Latent Organizational Weaknesses

Focus will be on a how-to methodology based upon “Volume 2 - DOE HPI Handbook Human Performance Tools”. This section will include approaches or methods that may be used to look at historical trend and causal factor data from ORPS, near misses, occurrence reports, surveillance reports, error event precursors, and latent organizational weaknesses.

7.2.2 Determining HPI/Human Error Event Precursors

Focus will be on a how-to methodology, to include “INPO 05-002 Human Performance Tools for Engineers and Other Knowledge Workers” and its methods, and “Volume 2 - DOE HPI Handbook Human Performance Tools”.

This section will also be updated based on past experience and lessons learned from application of the AI analysis methods by DOE and DOE Contractors.

7.2.3 Direct Cause

7.2.4 Contributing Causes

7.2.5 Root Causes

7.3 Using the Core Analytical Techniques

7.3.1 Events and Causal Factors Charting

7.3.2 Barrier Analysis

7.3.3 Change Analysis

7.3.5 Root Cause Analysis

7.4 Using Advanced Analytic Methods

7.4.1 Analytic Trees

7.4.2 Management Oversight and Risk Tree Analysis (MORT)

7.4.3 Project Evaluation Tree (PET) Analysis

7.5 Other Analytic Techniques

7.5.1 Time Loss Analysis

7.5.2 Human Factors Analysis

7.5.3 Integrated Accident Event Matrix

7.5.4 Failure Modes and Effects Analysis

7.5.5 Software Hazards Analysis

7.5.6 Common Cause Failure Analysis

7.5.7 Sneak Circuit Analysis

7.5.8 Materials and Structural Analysis

7.5.9 Design Criteria Analysis

7.5.10 Accident Reconstruction

7.5.11 Scientific Modeling

Key Points to Remember

Section 8 - Developing Conclusions and Judgments of Need

This section will be updated based on past experience and lessons learned from application of the AI analysis methods by DOE and DOE Contractors.

8.1 Conclusions

8.2 Judgments of Need

8.3 Minority Opinions

Key Points to Remember

Section 9 - Reporting the Results

This section will be updated based on past experience and lessons learned from application of the AI analysis methods by DOE and DOE Contractors.

9.1 Writing the Report

9.2 Report Format and Content

9.2.1 Disclaimer

9.2.2 Appointing Official's Statement of Report Acceptance

9.2.3 Table of Contents

9.2.4 Acronyms and Initialisms

9.2.5 Prologue-Interpretation of Significance

9.2.6 Executive Summary

9.2.7 Introduction

9.2.8 Facts and Analysis

9.2.9 Conclusions and Judgments of Need

9.2.10 Minority Report

9.2.11 Board Signatures

9.2.12 Board Members, Advisors, Consultants, and Staff

9.2.13 Appendices

9.3 Performing Quality Review and Validation of Conclusions

9.4 Conducting the Factual Accuracy Review

9.5 Reviews by the Assistant Secretary for Environment, Safety and Health

9.6 Submitting the Report

9.7 Briefing the Board's Report to DOE or Senior Contractor Management

Key Points to Remember

Appendices

Appendix A - Glossary

Appendix B - References

Appendix C - Specific Administrative Needs

Appendix D - Safety Management System

Appendix E - Subject Index

List of Tables

Table 1-1. Human Performance Aspects capabilities that Contribute to Accidents work performance (*Include HPI tables and charts from “Volume 1 – HPI Handbook Concepts and Principles” and “Volume 2 - DOE HPI Handbook Human Performance Tools”.)

Table 1-2. Equipment design can affect human performance

Table 2-1. Appointing officials and line management participants in accident investigations or operational safety event reviews have clearly defined responsibilities

Table 2-2. The accident investigation board has these major responsibilities

Table 2-3. The timeline for a Type A or Type B Accident Investigation or Operational Safety Event Review requires conducting multiple simultaneous tasks

Table 3-1. Board or Team members must meet these criteria

Table 4-1. Several types of witnesses should provide preliminary statements

Table 5-1. These activities should be included on an Accident Investigation or Operational Safety Event Review schedule

Table 5-2. The chairperson establishes protocols for controlling information

Table 5-3. The chairperson should use these guidelines in managing information collection activities

Table 5-4. The Price-Anderson Amendments Act of 1988

Table 6-1. These sources are useful for locating witnesses

Table 6-2. It is important to prepare for interviews

Table 6-3. Group and individual interviews have different advantages

Table 6-4. Interviewing do's

Table 6-5. Interviewing don'ts

Table 6-6. Use these universal precautions when handling potential blood borne pathogens

Table 6-7. These are typical questions for addressing the five core functions of integrated safety management

Table 6-8. These are typical questions for addressing the seven guiding principles of integrated safety management

Table 7-1. Case study introduction

Table 7-2. Benefits of events and causal factors charting

Table 7-3. Guidelines and symbols for preparing an events and causal factors chart

Table 7.3.4.1. Sample Historical ORPS, Near Miss, and Information Rich Event Analysis.

Table 7.3.4.2 Sample Analysis Determining HPI/Human Error Event Precursors

Table 7.3.4.3 Sample Analysis Determining HRO/Latent Organizational Weaknesses

Table 7-4. Sample barrier analysis worksheet

Table 7-5. Sample change analysis worksheet

Table 7-6. Case Study: Change analysis summary

Table 7-7. Tier diagram worksheet for a contractor organization

Table 7-8. Example tier diagram approach

Table 7-9. Compliance/noncompliance root cause model categories

Table 7-10. MORT color coding system

Table 8-1. These guidelines are useful for writing judgments of need

Table 8-2. Case Study: Judgments of need

Table 9-1. Useful strategies for drafting the Accident Investigation Report or the Operational Safety Event Review Report

Table 9-2. Example outline of an Accident Investigation Report

Table 9-3. Example outline of an Operational Safety Event Review Report

Table 9-4. Facts differ from analysis

Table 9-5. Example format for a Board or Team closeout briefing PowerPoint presentation to DOE or Senior Contractor Management

List of Figures

Figure 1-1. Human Performance-machine “activity model”

Figure 2-1. The process used to conduct an accident investigation involves many activities

Figure 2-2. The three primary activity phases in an accident investigation overlap significantly

Figure 5-1. A typical schedule of accident investigation activities covers 30 days

Figure 7-1. Simplified events and causal factors chart

Figure 7-2. Sample of an events and causal factors chart (in progress)

Figure 7-3. Barriers are intended to protect personnel and property against hazards

Figure 7-4. Barriers to protect workers from hazards

Figure 7-5. Summary results from a barrier analysis reveal the types of barriers involved

Figure 7-6. Summary results from a barrier analysis can highlight the role of the core functions in an accident

Figure 7-7. The change analysis process is relatively simple

Figure 7-8. Events and causal factors analysis; driving events to causal factors

Figure 7-9. Grouping root causes on the events and causal factor chart

Figure 7-10. Identifying the linkages on the tier diagram