Appendix C

COCOMO Suite: Data Collection Forms and Guidelines

C.1 Introduction

Appendix C provides a set of forms and procedures for collecting effort and schedule data for a given software project throughout its life cycle, in a form compatible with the following COCOMO Suite models: COCOMO II, and its emerging extensions COCOTS, COPSEMO, COQUALMO and CORADMO. These data collection or Software Project Data (SPD) forms have been kept brief, with minimal definitions, explanations, etc. Please refer to the index and glossary for the definitions of any terms that are unfamiliar. The procedures are oriented around the collection of information and the updating of estimates at the project's life cycle anchor points (see Appendix A). Revising project estimates as each anchor point is achieved provides immediate benefits by furnishing 1) estimates that are more accurate, and 2) current cost-to-complete and schedule-to-complete information. Revised estimates also provide the data needed to perform up-to-date sensitivity, risk and parametric analyses.

Such data collection activities are an integral part of an effective project management process. The information gathered is used to determine whether a project is on track relative to the original plans, which were built upon initial estimates. When actual cost and schedule performance deviates from plans, new estimates may be in order.

Data collection should not be an additional burden for management. Thus, we have organized COCOMO II data collection to be management-relevant and easy to implement via the electronic forms found on the accompanying CD. For on-going projects, data collection allows you to determine whether or not your performance is on track relative to plans. For completed projects, data collection allows you to develop a database that you can use to more precisely calibrate COCOMO II and the other Suite models to your actual experience. For both types of projects, data collection permits you to use existing knowledge to improve the accuracy of your estimating capabilities.

The data collection forms and procedures provided here enable an organization to develop the core capabilities needed to satisfy the new Level 2 Measurement and Analysis process area called Activities Performed, found in the Integrated Capability Maturity Model (CMMI) recently issued at:

http://www.sei.cmu.edu/cmm/cmmi/

The activities include: establish measurement objectives; define measures; define data collection and storage procedures; define analysis procedures; collect measurement data; analyze measurement data; store data and results; and communicate results.

C.2 Procedure for Projects

The Software Project Data forms (Figures) and corresponding instructions (Tables) described below are provided herein as well as on the accompanying CD-ROM. The first form applies to all the Suite models, while numbers two through five apply to COCOMO II; these are also needed, however, as a base for all of the emerging extensions. The remaining forms, six through nine, are specific to each of the extensions of COCOMO II. All the forms can be used either for on-going or completed projects:

Form SPD-1: General Information (All Models) (Figure C-1/Table C-1). Originated at the start of the project, updated at intermediate milestones and completed at the end of the project.

Form SPD-2a: Phase Summaries (Waterfall-based process) (Figure C-2a/Table C-2a). Estimated or actual phase information entered at the end of each major phase of the project following a Waterfall-based process. Finalized at the end of the development.

Form SPD-2b: Phase Summaries (MBASE/RUP-based process) (Figure C-2b/Table C-2b). Estimated or actual phase information entered at the end of each major phase of the project following an MBASE/RUP-based process. Finalized at the end of the development.

Form SPD-3: Component Summaries (Figure C-3/Table C-3). Component data entered at the start of the project. Completed at the end of the development.

Form SPD-4: COCOMO II Progress Runs (Figure C-4/Table C-4). Estimated project cost and schedule data, and ratings for estimating parameters, entered at the end of each major phase of the project.

Form SPD-5: COCOMO II Project Actuals (Figure C-5/Table C-5). Actual project cost and schedule data, and final ratings for estimating parameters, collected at the end of the project.

Form SPD-5a: COCOMO II Project Actuals: Simple Completed Project (Figure C-5a/Table C-5a). Actual project cost and schedule data, and final ratings for estimating parameters, for simple completed projects; collected at the end of the project.

Form SPD-6a: COCOTS Project Level Data (Figure C-6a/Table C-6a). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6b: COCOTS Assessment Data (Figure C-6b/Table C-6b). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6c: COCOTS Tailoring Data (Figure C-6c/Table C-6c). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6d: COCOTS Glue Code Data (Figure C-6d/Table C-6d). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-6e: COCOTS Volatility Data (Figure C-6e/Table C-6e). Estimated or actual project cost and schedule data, and ratings for estimating parameters; can be entered at the end of each major phase of the project. Finalized at the end of the development.

Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries (Figure C-7/Table C-7). Phase cycles and activity breakdowns.

Form SPD-8: COQUALMO Defect Summaries (Figure C-8/Table C-8). Defect introduction and removal data collected by artifact and life cycle phase.

Form SPD-9: CORADMO RAD Details Summaries (Figure C-9/Table C-9). Rapid Application Development parameters (CORADMO Driver Ratings). Project ratings entered at the start of the project. Final ratings re-assessed at the end of the development, relying on the COPSEMO detailed effort and schedule actuals for calibration.

C.3 Guidelines for Data Collection

C.3.1 New Projects

Projects starting out should consider collecting cost, schedule and error data at the following times during the project's life:

§  Project Start - develop your initial estimates using the following set of forms:

§  Form SPD-1: General Information

§  Form SPD-3: Component Summaries

§  Form SPD-6a: COCOTS Project Level Data

§  At the end of Major Project Phases - update your estimates using the following set of forms:

§  Form SPD-2a: Phase Summaries (Waterfall-based process)

§  Form SPD-2b: Phase Summaries (MBASE/RUP-based process)

§  Form SPD-3: Component Summaries

§  Form SPD-4: COCOMO II Progress Runs

§  Form SPD-6b: COCOTS Assessment Data

§  Form SPD-6c: COCOTS Tailoring Data

§  Form SPD-6d: COCOTS Glue Code Data

§  Form SPD-6e: COCOTS Volatility Data

§  Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries

§  Form SPD-8: COQUALMO Defect Summaries

§  Form SPD-9: CORADMO RAD Details Summaries

§  At the end of the development - capture your project actuals using the following forms:

§  Form SPD-2a: Phase Summaries (Waterfall-based process)

§  Form SPD-2b: Phase Summaries (MBASE/RUP-based process)

§  Form SPD-5: COCOMO II Project Actuals

§  Form SPD-6b: COCOTS Assessment Data

§  Form SPD-6c: COCOTS Tailoring Data

§  Form SPD-6d: COCOTS Glue Code Data

§  Form SPD-6e: COCOTS Volatility Data

§  Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries

§  Form SPD-8: COQUALMO Defect Summaries

§  Form SPD-9: CORADMO RAD Details Summaries

New projects should view data collection as an opportunity. They can use the data to benchmark their progress, develop business cases and calibrate their cost models.
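The milestone-to-form schedule above can be captured as a small lookup table, which may be convenient if the forms are tracked electronically. The following is a minimal sketch only: the names `FORMS_BY_MILESTONE`, `forms_due`, and the milestone and process keys are hypothetical and not part of the COCOMO Suite; the form lists are transcribed from the bullets in this section.

```python
# Hypothetical lookup table transcribed from the lists in Section C.3.1.
# The dictionary keys and function name are illustrative assumptions.
FORMS_BY_MILESTONE = {
    "project_start": ["SPD-1", "SPD-3", "SPD-6a"],
    "end_of_major_phase": [
        "SPD-2a", "SPD-2b", "SPD-3", "SPD-4",
        "SPD-6b", "SPD-6c", "SPD-6d", "SPD-6e",
        "SPD-7", "SPD-8", "SPD-9",
    ],
    "end_of_development": [
        "SPD-2a", "SPD-2b", "SPD-5",
        "SPD-6b", "SPD-6c", "SPD-6d", "SPD-6e",
        "SPD-7", "SPD-8", "SPD-9",
    ],
}

# SPD-2a is the Waterfall phase summary; SPD-2b is the MBASE/RUP one.
PHASE_FORM_TO_SKIP = {"waterfall": "SPD-2b", "mbase_rup": "SPD-2a"}


def forms_due(milestone, process="waterfall"):
    """Return the forms to complete at a milestone for the given process.

    Only the phase-summary variant matching the process is kept.
    """
    if process not in PHASE_FORM_TO_SKIP:
        raise ValueError(f"unknown process: {process!r}")
    skip = PHASE_FORM_TO_SKIP[process]
    return [form for form in FORMS_BY_MILESTONE[milestone] if form != skip]


if __name__ == "__main__":
    print(forms_due("project_start"))
    print(forms_due("end_of_development", process="mbase_rup"))
```

A project that does not use COTS components or the other extensions would simply ignore the SPD-6 through SPD-9 entries.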

C.3.2 Completed Projects

In general, it is not possible to reconstruct COCOMO II and other COCOMO Suite milestone runs and detailed phase/activity data from completed projects. If the project was estimated using an earlier version of the model, we suggest that you use our Rosetta stone [Boehm, Reifer, Chulani, ????] to convert the data. If it was not, we suggest that you try to capture as much cost-related data as possible using the following forms:

§  Form SPD-1: General Information - Fill out this form as best you can.

§  Form SPD-2a: Phase Summaries (Waterfall-based process) - Complete this form for each major delivery of a Waterfall-based process.

§  Form SPD-2b: Phase Summaries (MBASE/RUP-based process) - Complete this form for each major delivery of an MBASE/RUP-based process.

§  Form SPD-3: Component Summaries - Do the best you can with whatever data you can gather. Use a code counter to collect actuals whenever possible.

§  Form SPD-4: COCOMO II Progress Runs - Fill out this form using any cost- and schedule-to-complete information at your disposal. If no such information exists or is readily available, don't waste your time.

§  Form SPD-5: COCOMO II Project Actuals - Complete this form by sifting through your accounting reports and by inspecting the final product.

§  Form SPD-5a: COCOMO II Project Actuals: Simple Completed Project - Preferably, this form should be accompanied by Form SPD-1, but it can be used as a one-page total-completed-project data collection form compatible with the data provided for a COCOMO II estimation run.

§  Form SPD-6a: COCOTS Project Level Data - Do the best you can with whatever data you can gather.

§  Form SPD-6b: COCOTS Assessment Data - Do the best you can with whatever data you can gather.

§  Form SPD-6c: COCOTS Tailoring Data - Do the best you can with whatever data you can gather.

§  Form SPD-6d: COCOTS Glue Code Data - Do the best you can with whatever data you can gather. Use a code counter to collect actuals whenever possible.

§  Form SPD-6e: COCOTS Volatility Data - Do the best you can with whatever data you can gather.

§  Form SPD-7: COPSEMO Detailed MBASE Effort and Schedule Summaries - Complete this form by sifting through your accounting reports and applying engineering judgement based on personnel and their tasks or roles.

§  Form SPD-8: COQUALMO Defect Summaries - Fill out this form as completely as you can using inspection reports, technical review reports, testing results and reports, and software trouble report records as your sources.

§  Form SPD-9: CORADMO RAD Details Summaries - Fill out this form with Rapid Application Development parameters (CORADMO Driver Ratings).

C.3.3 Maintenance Projects

COCOMO II also provides you with the capability to develop annual or other periodic maintenance cost estimates based upon the modification of the original COCOMO 81 maintenance model, described in Chapter 2, Section 2.5. We suggest that you use the forms provided when using this model. However, you will want to use actuals and re-rate project attributes collected on Form SPD-5 when computing the numbers.

C.4 Data Conditioning

Data conditioning is an essential activity in the software data collection and analysis process. Even when people try to provide the best data they can, there are a number of known problems and subtle sources of misunderstanding that can inject bias into their data. Use of such data to calibrate cost models can lead to erroneous results should these and other sources of data contamination not be removed.

C.4.1 Sources of Data Contamination

Besides the problems of missing data and clerical errors, some of the most common and frequent sources of software data collection problems include:

1.  Inconsistent definitions - The COCOMO II model defines terms differently from previous models. For example, it uses SLOC (Source Lines of Code) instead of DSI (Delivered Source Instructions), which was used in the original COCOMO model (see Section 2.2.1). An "IF-THEN-ELSE, ELSE IF" pair will now count as a single SLOC instead of two DSI when the counting convention is based on the terminal semicolon. As another example, COCOMO uses 152 person hours per person month and assumes casual overtime is not included as part of the burden. If you used something different, the model would generate erroneous answers.

2.  Improper scope - The COCOMO II model assumes that the project's scope includes certain activities and excludes others. For example, software testing is included while software support to system integration and test is not. As another example, software documentation that is normally generated during the software development life cycle is included while customer-unique documentation is not. Again, you would generate erroneous answers if you used the model outside of its proper scope. Appendix A, Section 6, records the major COCOMO II scoping assumptions.

3.  Double Counting - Sometimes items are double counted or taken into account twice using several factors within the model. For example, REVL breakage is used to take into account volatile requirements. However, some people double dip by improperly rating the Precedentedness or Architecture/Risk Resolution scale factors lower than they should be to take volatility into account. You should understand what the factor ratings involve prior to rating them to avoid making this mistake.

4.  Averaging - Often, people use average ratings for groupings that extend across subsystems and the project. Because they haven't taken the time to get into the details, they consolidate their estimate and lose fidelity because little differentiation is made between different types of software. You can avoid this problem and greatly improve the accuracy of your estimates by breaking down the project into finer grained components.

5.  Garbage In, Garbage Out - Another common problem is the use of erroneous assumptions. People often use models to generate quick-and-dirty estimates. They make all sorts of simplifying assumptions in their quest for numbers. One way to avoid problems of this sort is to take a little more time to develop realistic but simplifying assumptions. This often takes some interaction with both the developer and customer communities.

6.  Observational Bias - Finally, many people tend to be overly optimistic or pessimistic when they estimate. Biases either way should be avoided, especially when they can become a self-fulfilling prophecy. Use of Wideband Delphi, in which groups of experts reach consensus on their estimates, reduces such biases. However, such group estimates take more time to achieve and may not be practical under some circumstances.
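As an illustration of the first item in the list above, effort reported under a different person-month definition can be rescaled to the 152 person hours per person month that COCOMO assumes before it is used for calibration. This is a minimal sketch: the function names are hypothetical, and the 160-hour local convention in the example is an assumed value, not one taken from this appendix.

```python
# COCOMO assumes 152 person hours per person month (see item 1 above).
COCOMO_HOURS_PER_PM = 152.0


def hours_to_person_months(hours):
    """Convert raw labor hours (e.g. from time cards) to COCOMO person months."""
    return hours / COCOMO_HOURS_PER_PM


def to_cocomo_person_months(effort_pm, local_hours_per_pm):
    """Rescale effort recorded under a local person-month definition.

    local_hours_per_pm is whatever the reporting organization counts as
    one person month; it is an input, not a COCOMO constant.
    """
    if local_hours_per_pm <= 0:
        raise ValueError("local_hours_per_pm must be positive")
    return effort_pm * local_hours_per_pm / COCOMO_HOURS_PER_PM


if __name__ == "__main__":
    # Assumed example: 100 person months reported at 160 hours each.
    print(round(to_cocomo_person_months(100, 160), 2))
    print(hours_to_person_months(1520))
```

Applying the conversion once, at the point where data enters the database, avoids mixing person-month definitions across projects.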

C.4.2 Data Conditioning Guidelines

The best defense against these problems is to provide those involved in the data collection with a clear set of definitions, automated procedures and examples. Build self-checks whenever possible into your data collection system. For example, you can ask a question in two different ways on two related forms to test the consistency of the answer (e.g., effort, schedule, average staff size). Be careful not to overdo this, however. In addition, such problems can be further avoided by collecting the data close to its source. For example, try to collect labor hours using your time card system. Finally, make data collection a natural part of the way you implement your processes. For example, collect error data as part of your software trouble reporting process. This eliminates the need to use multiple forms and makes it easier to collect the data.
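One way to sketch the self-check on effort, schedule, and average staff size mentioned above is to compare the reported average staff with the value implied by the other two figures. The relation used here (average staff equals effort in person months divided by schedule in months), the function name, and the 10 percent tolerance are assumptions made for illustration, not definitions taken from the forms.

```python
def staffing_is_consistent(effort_pm, schedule_months, avg_staff, tolerance=0.10):
    """Cross-check three related answers collected on different forms.

    Assumes average staff = effort (person months) / schedule (months).
    Returns True when the reported average staff is within `tolerance`
    (as a fraction) of the implied value.
    """
    if schedule_months <= 0:
        raise ValueError("schedule_months must be positive")
    implied_staff = effort_pm / schedule_months
    return abs(implied_staff - avg_staff) <= tolerance * implied_staff


if __name__ == "__main__":
    # 120 person months over 12 months implies an average staff of 10.
    print(staffing_is_consistent(120, 12, 10))
    print(staffing_is_consistent(120, 12, 14))
```

A failed check does not say which of the three values is wrong; it only flags the record for follow-up with the data source.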