INF5180 Product and Process Improvement in Software Development


Table of Contents

1 Introduction

2 Improvement strategy

2.1 Capability Maturity Models

2.2 Agile project management

2.3 Epistemology of SPI

2.4 The psychology of measurements and control

2.5 An algorithmic summary of the improvement strategy

2.5.1 Complex Adaptive Systems (CAS)

2.5.2 Action research

3 Research approach and setting

4 The evolution of the SPI system so far

4.1 First iteration: Creating the framework for doing SPI

4.1.1 Diagnosis

4.1.2 Action planning

4.1.3 Action taking

4.1.4 Evaluation

4.1.5 Specified learning

4.2 Second iteration: Starting to predict the improvement rates

4.2.1 Diagnosis

4.2.2 Action planned

4.2.3 Action taken

4.2.4 Evaluation

4.2.5 Specified learning

4.3 Third iteration: Trying to improve the SPI system

4.3.1 Diagnosis

4.3.2 Action planned

4.3.3 Action taken

4.3.4 Evaluation

4.3.5 Specified learning

4.4 Fourth iteration: Improving the standard (“double loop learning”)

4.5 Fifth iteration: Calibrating old data due to revised standard

4.6 Sixth iteration: Insights from information infrastructure theory

4.6.1 Diagnosis

4.6.2 Action planned

4.6.3 Action taken

4.6.4 Evaluation

4.6.5 Specified learning

5 Discussion

6 Conclusion

References

1 Introduction

At the Norwegian Directorate of Taxes (NTAX) one of the COBOL programmers died in 1997, and other programmers had to step in. Due to a lack of a standard way of programming, this caused major problems, and everybody quickly realized that there was a severe need for a way of programming that would make the software maintainable. A standard was suggested by the programmers, it was accepted by the management, it was monitored by quality assurance personnel, and it has now been running for six years producing statistical trends that show continuous improvement (Øgland, 2006).

Although this particular process of producing maintainable software has been a success on the outside, it has continuously been marred by conflicts on the inside. While most of the programmers agree with the need for a common standard, few are willing to apply the standard to themselves. Every programmer would like his or her colleagues to follow strict rules, but for himself or herself there should be total flexibility.

The situation seems to reflect a type of discussion among software academics and professionals that emerged soon after the introduction of the early software process improvement (SPI) models and software capability maturity models (CMM). Critics claimed that adoption of the ideas would result in increased bureaucratization and control, leading to decreased developer creativity and process innovation capability; cf. (Bach, 1994; Bollinger and McGowan, 1991; Curtis, 1998; Bach, 1995). Proponents argued that the predictability and transparency of development and management practices, and the continuous systematic reflection of an organization’s software processes associated with higher maturity levels, would actually decrease management control and release the software developer’s creative potential (Curtis, 1994).

One way of seeing this conflict could be to identify the goals of productivity, quality and predictability of the group as a management view, while the “individual hero” perspective is a view that benefits individual programmers who hold no responsibility for the group as a whole. Which view is the “right view” depends on whether one is a systematic software manager or a heroic software developer.

The problem of the SPI people, however, is to design an SPI system that is sufficiently rigid to aid the systematic improvement of quality, productivity and predictability, while at the same time being sufficiently flexible to prevent programmers from losing their ability to do creative work.

The document is structured as follows. Chapter two presents a vision for such a flexible and standardized design (“improvement plan”). Chapters three to five go into more detail on the NTAX case: how we selected and analyzed data (chapter three), the case itself (chapter four), and a discussion of the results as compared with the improvement strategy in chapter two (chapter five). The total experience is summarized in chapter six.

2 Improvement strategy

Our vision for a sustainable SPI system consists of four simple ideas, corresponding to Deming’s four components of a “system of profound knowledge” (Deming, 1992):

  • Appreciation of a system
  • Understanding variation
  • Theory of knowledge
  • Psychology

We chose to translate these themes from total quality management (TQM) into SPI as (1) capability maturity models, (2) agile project management, (3) knowledge management, and (4) the psychology of measurements and control. In the next four subsections we will describe what we mean by each of these components, and in the fifth subsection we will put the components together by presenting the improvement strategy as an algorithm.

2.1 Capability Maturity Models

According to the capability maturity model entry on Wikipedia (2006), a capability maturity model (CMM) may broadly refer to a process improvement approach that is based on a process model, or it may more specifically refer to the first such model, developed by the Software Engineering Institute (SEI) in the mid-1980s, as well as the family of process models that followed.

Based on interviews with an Indian company that has been certified to SEI-CMM level 5 for many years, we were told that a good way of starting the quality improvement process would be to get certified to the ISO 9001:2000 standard for quality management systems first, before adopting the more software-process-specific capability maturity models (NTAX, 2002).

Figure 1 – ISO 9000:2000 process model ( 9000)

The ISO 9001:2000 standard contains requirements for quality management systems, and is as such not a capability maturity model. In appendix A of ISO 9004:2000, however, there is a five-level maturity model for assessing quality management systems according to the structure given in ISO 9001:2000.

Maturity level / Performance level / Guidance
1 / No formal approach / No systematic approach evident, no results, poor results or unpredictable results
2 / Reactive approach / Problem- or corrective-based systematic approach; minimum data on improvement results available
3 / Stable formal system approach / Systematic process-based approach; early stage of systematic improvements; data available on conformance to objectives and existence of improvement trends.
4 / Continual improvement emphasized / Improvement process in use; good results and sustained improvement trends.
5 / Best-in-class performance / Strongly integrated improved process; best-in-class benchmarked results demonstrated.

If one would like to make the model compliant with ISO/IEC 15504 “SPICE”, then one could also use a maturity level 0 to indicate incomplete process or unknown status (NTAX, 2002).

2.2 Agile project management

The Toyota Production System has gradually evolved since the 1950s, based on a continuous series of small changes that make it a better fit for the problems it is supposed to solve (Fujimoto, 1999; Womack, Jones and Roos, 1990; Womack and Jones, 2003).

<… must write more about the customer’s definition of “value”, flow and push/pull… based on the first 3 of the 5 principles in Womack & Jones… link Agile and Lean via a reference to Poppendieck… link Lean to ISO 9000 via Seddon… link Lean to Taylor via Shingo and Tsutsui… emphasize the principle of improvement through removing “muda” and use this to streamline the implementation of ISO 9000 etc.… the goal is to make quality control fully integrated into the development process (agility) so that the programmers do not notice that they are being controlled…>

If we look at the diagram in figure 1, we would like to implement this by focusing on stable flow in all the processes and also make sure that the production is pull-driven. Although the production system should satisfy all the requirements of ISO 9001:2000 and be continuously improved by the use of ISO 9004:2000, improvement should be done by increasing stability in processes, increasing the speed in processes and eliminating all unnecessary aspects that do not contribute to the value of the product.

2.3 Epistemology of SPI

When Deming (1992) talks of theory of knowledge, he refers to the philosophy of C. I. Lewis (1929). What appears to make Lewis different from fellow American pragmatists, like Peirce, Dewey and James, is that Lewis was concerned with Kant’s solution of the mind/body problem, and interpreted this into an epistemology that emphasized prediction as the key essence of knowledge.

A rough interpretation of the Deming/Lewis epistemology seems to be that process knowledge may be described in terms of flowcharts and evaluated through the use of statistical process control (SPC) diagrams. As pointed out by Kjersti Halvorsen (2006), this type of understanding also seems to be the epistemological foundation of Taylor’s “Principles of Scientific Management” (1911), although it was only with Shewhart’s invention of the SPC diagram (1926) that it was possible to talk of “scientific management” being scientific in the way that Shewhart and Deming interpreted the philosophy of science.

For the practical purpose of this study, however, when talking of continuous improvement, as illustrated in the arrow curling out from the measurements box towards the improvement box in figure 1, we expect to be able to identify this learning in terms of changes in flowcharts or source code of the quality management system, and that the validity of the improvements can be detected by finding corresponding changes in patterns on SPC diagrams.
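As an illustration of what “changes in patterns on SPC diagrams” could mean in practice, the following is a minimal sketch of an individuals (XmR) control chart computation. The choice of an XmR chart, the function names and the sample numbers are assumptions made for illustration only; they are not taken from the NTAX measurements or from the quality management system itself.

```python
from statistics import mean

# Standard XmR constant: 3 / d2, where d2 = 1.128 for moving ranges of size 2.
XMR_FACTOR = 2.66


def xmr_limits(baseline):
    """Return (lower limit, centre line, upper limit) for an individuals chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    centre = mean(baseline)
    spread = XMR_FACTOR * mean(moving_ranges)
    return centre - spread, centre, centre + spread


def out_of_control(values, limits):
    """Indices of observations that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical numbers: deviations from the standard per 1000 lines of code.
baseline = [50, 52, 48, 51, 49, 50]
limits = xmr_limits(baseline)

# Hypothetical observations after a change to the process.
after_change = [49, 51, 38, 37]
signals = out_of_control(after_change, limits)
```

With these made-up numbers the last two observations fall below the lower limit, which is the kind of signal one would look for when checking whether a change in a flowchart or in source code corresponds to a real change in the process.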

2.4 The psychology of measurements and control

Deming (1992) was concerned with internal motivation for work, and was frustrated by the fact that measurement systems tended to make people more interested in getting good scores than doing good work. Although this mostly resulted in him attacking management by objectives (MBO) for not understanding statistical variations, or complaining about management implementing SPC wrongly, sometimes he also questioned the use of numbers at all, making controversial statements against the use of marks and “gold stars” in school etc.

More interesting than what Deming was actually saying on this topic is perhaps the reason why he was so concerned with it, namely that “what gets measured gets done”. Measurement is a powerful tool for creating change. One of the most important issues in the improvement strategy we propose is thus always to measure.

As explained in the background and research question in chapter one, the human components of the socio-technical SPI system play an important role in making it sustainable, and careful analysis of political, social and psychological issues may thus be important in order to prevent the system from destroying itself.

2.5 An algorithmic summary of the improvement strategy

We choose to call the approach an improvement strategy rather than an improvement plan, as the aim is long-term and the choices for what to do on regular assessment dates depend on issues that will not be possible to see until the assessment date has been reached.

2.5.1 Complex Adaptive Systems (CAS)

If we look at complex adaptive systems (CAS) as applied in organizational theory (e.g. Axelrod and Cohen, 2004), insights from evolutionary biology, computer science (agent based artificial intelligence) and social theory based on game theory, seem to provide a framework for designing management systems that are loosely coupled and locally controlled, and constantly being subjected to reviews for making them better. In the theory of CAS, there are three fundamental processes at work:

  • variation
  • interaction
  • selection

In any adaptive process there has to be a variation of the species, there has to be some interaction in order to produce different “children”, and there is a selection deciding which of the “children” will grow up. However, in order to draw insights from these simple principles into the world of SPI, it is more useful to think of how human ideas and beliefs evolve than to look at evolution in the jungle. A selection of books provides a variation of ideas. Somebody may read two books and link an idea from the first book with a different idea from the other book (interaction), and by testing this newly created idea in practice he is performing a type of selection. The new idea may survive in his mind as interesting or it may be discarded as unfruitful.
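The three processes can be sketched as a toy evolutionary loop. This is only an illustration under stated assumptions: an “idea” is represented as a tuple of bits, the scoring function is arbitrary, and the function name, the mutation rate and the population size are all hypothetical rather than part of the CAS theory or of the NTAX case.

```python
import random


def evolve(population, fitness, generations, rng):
    """Run a toy variation / interaction / selection loop over bit-tuples."""
    size = len(population)
    for _ in range(generations):
        children = []
        for _ in range(size):
            # Interaction: two existing ideas are combined into a new one.
            first, second = rng.sample(population, 2)
            cut = rng.randrange(1, len(first))
            child = list(first[:cut] + second[cut:])
            # Variation: occasionally one element of the idea is altered,
            # so that new variants keep appearing in the population.
            if rng.random() < 0.2:
                i = rng.randrange(len(child))
                child[i] = 1 - child[i]
            children.append(tuple(child))
        # Selection: only the best-scoring ideas survive to the next round.
        population = sorted(population + children, key=fitness, reverse=True)[:size]
    return population


# Hypothetical usage: eight random six-bit "ideas", scored by their number of ones.
rng = random.Random(0)
start = [tuple(rng.randint(0, 1) for _ in range(6)) for _ in range(8)]
final = evolve(start, fitness=sum, generations=20, rng=rng)
```

Because parents compete with their children in the selection step, the best score in the population can never get worse from one generation to the next. Whether that is a desirable property for an SPI system is a design choice, not something the sketch settles.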

2.5.2 Action research

The idea of how ideas evolve according to some evolutionary algorithm could be used as a description of how the scientific method works, and if we look at action research (e.g. Susman and Evered, 1978) we get a description of how the scientific model (for creating organizational change) runs through a five-step circular algorithm.

Figure 2 – Phases within an action research cycle (adapted from Susman & Evered 1978)

The first step of the model is called DIAGNOSING and consists of doing an analysis of the organization, finding out what the improvement…

Unlike other process improvement models, such as QIP (Basili, ref?) or IDEAL (ref?), the aim of the action research model is not only to produce change but also to develop knowledge about efficient ways of creating change. In other words, the action research model fits with the epistemological ideas of Shewhart and Deming as described in section 2.3 above, and it is also a framework necessary for making Taylor’s scientific management scientific, in terms of producing new knowledge to be published in academic outlets.
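The circular structure of the action research cycle can be sketched as a small state machine. The phase names follow the subsection headings used for each iteration in chapter four; the class and function names are hypothetical, and the sketch says nothing about what is actually done within each phase.

```python
from enum import Enum


class Phase(Enum):
    DIAGNOSING = 1
    ACTION_PLANNING = 2
    ACTION_TAKING = 3
    EVALUATING = 4
    SPECIFYING_LEARNING = 5


def next_phase(phase):
    """Return the phase that follows, wrapping around after the last one."""
    phases = list(Phase)
    return phases[(phases.index(phase) + 1) % len(phases)]


def run_iterations(count):
    """List the (iteration number, phase) steps for a number of full cycles."""
    steps = []
    phase = Phase.DIAGNOSING
    for iteration in range(1, count + 1):
        for _ in Phase:
            steps.append((iteration, phase))
            phase = next_phase(phase)
    return steps


# Two full cycles: the learning specified in iteration 1 feeds the diagnosis of iteration 2.
steps = run_iterations(2)
```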

3 Research approach and setting

Our approach for collecting data and analyzing…

<…the approach is a reconstruction of the process improvement programme by going through old project reports and reading the process improvement programme as if it had been carried out as action research… which it partly was, and which it has in fact become in the latest iterations…>

4 The evolution of the SPI system so far

As the process of collecting and analyzing data from the COBOL programmers has been going on for several years, it seems reasonable to present the case by explaining the development within each iteration.

4.1 First iteration: Creating the framework for doing SPI

4.1.1 Diagnosis

In 1997, one of the programmers died, and others had to take over the software. Those who took over realised that they were dealing with “spaghetti code” that was difficult to understand, as there had been no requirements or standards on how to program in a way that would make the software maintainable for the community at large. The IT strategy plan of 1998 thus stated that standards should be designed and implemented, and one of the programmers had started writing a draft of such a standard, but as of late 2000, nothing had happened. The first research question thus seemed to be how to get the standard finalized and accepted by the programming community and the managers, how to define metrics for making sure the standard was being followed, and how to make the metrics system sustainable.

4.1.2 Action planning

The SPI change agent (“action researcher”) believed the best way to make the metrics system work would be to have the programmers themselves complete the standard, present it to management for acceptance, define the metrics and create the software needed for producing the statistics. The role of the SPI change agent would be restricted to doing statistical analysis of the measurements, as this was the only task that required a competence (statistical competence) not found among the programmers.

4.1.3 Action taking

A draft standard was completed and circulated among the programmers for comments. It was then revised and presented to management for acceptance. IT management then decided to ask the director general of the Directorate of Taxes to give a lecture to the programmers on the importance of following standards. This was followed by one of the programmers developing the metrics software, in order to produce data on how the various software packages deviated from the requirements of the standard. The SPI change agent was given the data, analysed them, and discussed the results with the programmers.

Figure 3 – Generic NTAX software lifecycle model (adapted from NTAX, 1998)

In figure 3 we illustrate how measurements of software maintainability are done as part of quality control phase IV, among the measurements that are supposed to be done with quality control procedure V10 at step 4 to evaluate system documentation.

4.1.4 Evaluation

As it had been possible to get some historical data as well as new data, the results showed that most software projects had been producing software that got more and more filled up with gotos, long paragraphs and other issues that the programmers themselves considered “bad practices”. The major achievement of the first iteration was that the current measurements defined a baseline for further improvement and that an SPI system was now up and running. The results were documented in a project report (NTAX, 2001) that was distributed among programmers and management.

4.1.5 Specified learning

<…>

4.2 Second iteration: Starting to predict the improvement rates

4.2.1 Diagnosis

The SPI change agent believed in the …

4.2.2 Action planned

<…>

4.2.3 Action taken

<…>

4.2.4 Evaluation

<…>

4.2.5 Specified learning

<…>

4.3 Third iteration: Trying to improve the SPI system

4.3.1 Diagnosis

<…>

4.3.2 Action planned

<…>

4.3.3 Action taken

Regression analysis, based on linear regression and exponential regression, was carried out in order to see if the improvement predictions could be improved. The theory from Sommerville (2004) was used for evaluating the metrics program and getting ideas on how to improve it.
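A comparison between a linear and an exponential trend could be sketched as follows. This is a minimal illustration, not the analysis actually performed at NTAX: the function names are hypothetical, the data are invented, and the exponential model is fitted by ordinary least squares on the logarithm of the measurements, which is one common way of doing it but minimizes the error in log space rather than in the original units.

```python
import math


def fit_linear(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_exponential(xs, ys):
    """Fit y = c * exp(k * x) by linear regression on log(y); needs y > 0."""
    log_c, k = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(log_c), k


def sum_squared_error(predict, xs, ys):
    """Sum of squared differences between predictions and observations."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical yearly defect counts (a 20 % reduction per year), NOT the NTAX data.
years = [0, 1, 2, 3, 4, 5]
defects = [400.0, 320.0, 256.0, 204.8, 163.84, 131.072]

a, b = fit_linear(years, defects)
c, k = fit_exponential(years, defects)

linear_error = sum_squared_error(lambda x: a + b * x, years, defects)
exponential_error = sum_squared_error(lambda x: c * math.exp(k * x), years, defects)
```

For data that shrink by a constant percentage each year, as in this invented series, the exponential model fits better than the linear one. With real measurements the two errors would have to be compared case by case.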

4.3.4 Evaluation

<…>

4.3.5 Specified learning

<…>

4.4 Fourth iteration: Improving the standard (“double loop learning”)

4.5 Fifth iteration: Calibrating old data due to revised standard

4.6 Sixth iteration: Insights from information infrastructure theory

4.6.1 Diagnosis

<…>

4.6.2 Action planned

<…>

4.6.3 Action taken

<… here are some diagrams… data for 2006 have not been fully collected, so the measurements are approximations… although I choose to include the whole data series in all the diagrams, it may be better to comment on them in the years where something happens; for example, the diagrams below show interesting jumps for 2003/2004 as a result of the revision of the standard, and the diagrams should probably be presented in connection with the fifth iteration… the issue now is that the improvement rate is not as good as it was before, we have got more programmers who sabotage the scheme, and management no longer shows commitment to the process… the scheme may possibly terminate unless we come up with something clever…>