Introduction to Monte-carlo analysis Report
Mihir Karnik.
This report covers Monte-carlo analysis for software development. It discusses in detail how Monte-carlo analysis differs from the Agile development model and what advantages it offers.
Introduction
Monte-carlo analysis is a mathematical technique that finds the most likely patterns in the result of an equation when it is given random input values constrained to the likely real-world ranges of those inputs. (Troy, 2011) (Monte Carlo method)
In place of an equation, a spreadsheet or software model of the real-world process is built most of the time.
The three questions that any developer has to be prepared to answer when dealing with a customer are:
- How much will this product cost to develop and deliver?
- What is the likelihood of releasing by date x?
- What resources do you need to hit date x?
Consider an example. Suppose we know that there are one hundred software product features (stories) to develop and, from company history (or an educated estimate), that each feature takes at least one day and at most three days. A Monte-carlo analysis would then simulate in software the completion of these one hundred features, giving each a random work time of between one and three days, and it would repeat this thousands of times.
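The one-hundred-feature example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: work times are drawn uniformly between the shortest and longest duration, features are completed one after another, and the function names and the trial count are hypothetical choices, not taken from (Troy, 2011).

```python
import random

def simulate_project(rng, num_features=100, min_days=1.0, max_days=3.0):
    """Total work days for one simulated pass over all features."""
    # Assumption: each feature's duration is uniform between the two bounds.
    return sum(rng.uniform(min_days, max_days) for _ in range(num_features))

def run_monte_carlo(trials=10_000, seed=42, **kwargs):
    """Repeat the simulation many times and return the sorted totals."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    return sorted(simulate_project(rng, **kwargs) for _ in range(trials))

def percentile(sorted_values, fraction):
    """Value below which roughly `fraction` of the simulations fall."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]

if __name__ == "__main__":
    totals = run_monte_carlo()
    print(f"median total: {percentile(totals, 0.50):.1f} days")
    print(f"85% of simulations finish within {percentile(totals, 0.85):.1f} days")
```

Reading a percentile from the sorted totals is what turns the thousands of runs into a statement such as "85% of simulations finish within N days".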
This contrasts with the general tendency in today’s software engineering projects to carry out work incrementally: the project is delivered to the end users in small increments, each an executable that improves on the last version delivered. Simulation also helps us find the most likely time required once a large number of disruptions are added, such as defects, added scope, environment downtime, and other blocking events such as staff availability.
The need for an Agile approach has arisen because no company will allocate a large amount of time and resources (mainly money) to a project without first expecting some return on that investment. The developer therefore has to promise some functionality within a specified amount of time and resources before the company will believe that the project is actually feasible.
This approach, however, has a few drawbacks.
According to benchmarking data and the definition of project failure used by The IPA Institute, a staggering 56% of major projects fail (The IPA Institute, 2009) due to
- budget overspending by more than 25%, and/or
- schedule slipping by more than 25%, and/or
- severe and continuing operational problems lasting at least one year. (Raydugin)
Problems in Agile software development
- Customer Requirements: Most of the time the developer is given only a one-line, at best approximate, customer requirement statement, and has to rely on it to estimate the minimum time and resources required to deliver an executable for the project.
- Estimation of the time and resource component: The developer generally works in a large organization that has undertaken the project on the basis of the customer requirements. This leads the developer to pad the estimate to cover unforeseen delays, sometimes to twice the amount of time actually needed. Management can then add its own padding, and as a result the estimated time and resources become artificially bloated, causing many projects to be cancelled.
- Non-inclusion of certain project team members: Sometimes some members of the project are not asked for their input when the time and resource components are calculated. As a result, no room may be left for checking and testing some critical areas of the project, which leads to the software failing to meet the basic user requirements and thus to the project being cancelled.
Because of the problems mentioned above, Agile software development can sometimes cause a project to fail or to be cancelled by the user.
We will look at two Agile methodologies and present a Monte-carlo model for each of them.
- Scrum Modeling with Excel
Steps for building the Monte-carlo model:
- Build a model of a Scrum process using Excel formulae that cascade into a final number of feature points for each simulation row.
- From the feature points, determine the number of iterations required, and from that determine a completion date.
The inputs required for this model are:
- Number of features of each estimate “size”, as sized by the developers' estimates.
- The lower-bound, average and upper-bound adjustments to apply to each estimate size.
- Defect rate (expressed as 1 defect for x points of y size)
- Added scope rate (expressed as 1 feature for every x points of the medium feature size for the project)
- Start date
- Days per iteration (work days, for example 10 for a two week iteration cycle)
- Number of feature points per iteration targets
(Troy, 2011)
These inputs allow a simulation model to be built. Almost all of the calculations required at each step are simple, except for the random number generation. Over many generations, the random numbers provided by the random number generator follow a near-Normal distribution. Once a random number has been generated for each estimate size bin, it is multiplied by the total story size of that estimate, and the results are summed.
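The multiply-and-sum step can be illustrated as follows. This is a hedged sketch rather than the spreadsheet formula from (Troy, 2011): the bin names, story counts, point values and multipliers are hypothetical, and a triangular distribution over the lower, average and upper adjustments stands in for the near-Normal draw described above.

```python
import random

# Hypothetical inputs, one entry per estimate size bin:
# size -> (story count, points per story, low, average, high adjustment)
ESTIMATE_BINS = {
    "small":  (40, 1, 0.8, 1.0, 1.5),
    "medium": (30, 3, 0.8, 1.1, 2.0),
    "large":  (10, 8, 0.9, 1.3, 2.5),
}

def adjusted_points(bins, rng):
    """Draw one adjustment per size bin, scale that bin's points, and sum."""
    total = 0.0
    for count, points, low, average, high in bins.values():
        # Assumption: triangular(low, high, mode) approximates the draw.
        multiplier = rng.triangular(low, high, average)
        total += multiplier * count * points
    return total

if __name__ == "__main__":
    rng = random.Random(1)
    print(f"adjusted story points: {adjusted_points(ESTIMATE_BINS, rng):.1f}")
```

Each call corresponds to one simulation row; repeating it thousands of times gives the spread of total adjusted points.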
We use the following equations to calculate the Monte-carlo model.
- Calculating the adjusted story points for an estimate size using a random number within the chosen boundaries
(Troy, 2011)
- Calculating the adjustments to add for defects
(Troy, 2011)
- Calculating the adjustment to add for introduced scope
(Troy, 2011)
- Calculating the number of iterations required to compute ALL points
(Troy, 2011)
- Simple Vacation Adjustment Equation
- Finding the date given the number of workdays
(Troy, 2011)
- Count the number of simulations that complete within a target and convert to percent
(Troy, 2011)
- The following are the results for the calculations showing the first 5 simulations of many thousands in the Monte-carlo introduction paper
(Troy, 2011)
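The later steps in this list (defect and scope adjustments, iterations required, the finish date and the percentage of simulations on target) can be sketched in Python. The original equations are not reproduced here, so everything below is an assumption made for illustration: the parameter names, the default values, the uniform adjustment and the Monday-to-Friday calendar are hypothetical, and the vacation adjustment is left out.

```python
import math
import random
from datetime import date, timedelta

def iterations_needed(total_points, points_per_iteration):
    """Whole iterations required to complete all points."""
    return math.ceil(total_points / points_per_iteration)

def add_workdays(start, workdays):
    """Advance `start` by Monday-to-Friday work days (holidays ignored)."""
    current, remaining = start, workdays
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current

def simulate_finish(rng, base_points, start, points_per_iteration=30,
                    days_per_iteration=10, low=0.8, high=1.6,
                    points_per_defect=20, defect_points=2,
                    points_per_added_feature=50, added_feature_points=5):
    """One simulation row: adjusted points -> iterations -> finish date."""
    points = base_points * rng.uniform(low, high)
    # "1 defect for every x points", each defect costing a fixed size.
    defects = points / points_per_defect * defect_points
    # "1 added feature for every x points", each of a fixed size.
    scope = points / points_per_added_feature * added_feature_points
    iterations = iterations_needed(points + defects + scope,
                                   points_per_iteration)
    return add_workdays(start, iterations * days_per_iteration)

def likelihood_percent(finish_dates, target):
    """Percentage of simulations that finish on or before the target."""
    on_time = sum(1 for d in finish_dates if d <= target)
    return 100.0 * on_time / len(finish_dates)

if __name__ == "__main__":
    rng = random.Random(2011)
    start = date(2024, 1, 1)  # hypothetical start date
    finishes = [simulate_finish(rng, 300, start) for _ in range(5000)]
    target = date(2024, 8, 30)  # hypothetical target date
    print(f"likelihood of releasing by {target}: "
          f"{likelihood_percent(finishes, target):.1f}%")
```

The final percentage is the simulated answer to "What is the likelihood of releasing by date x?".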
- Kanban
Modeling a Kanban project using Excel is not easy. When features interact, at least one column per feature is needed in each simulation row, and the spreadsheet quickly becomes unmaintainable.
Kanban divides the steps of delivering a single feature into columns, each known as a Status. For example, a feature might pass from Design, to Development, to Testing, to Release.
The time taken for each feature in each Status is recorded. Work is limited in each Status, and a new feature can only be pulled from left to right when a vacant position is available.
(Troy, 2011)
To simulate, the application takes as inputs the number of Status columns, a lower bound and an upper bound for the time taken to complete stories in each status, and the limit on the number of stories allowed in each status at one time. These inputs are enough for a simple simulation, in which the application loops, simulating a given time interval (for example 1 day) on each pass.
For each story a random time within that status’ boundaries is calculated, and a story only moves to the right when a) that time has elapsed and b) there is an open position that keeps the number of stories within the WIP limit for the next status. This process continues until all cards have traversed from the imaginary backlog to the completed stories pile, and the time taken to do this is recorded.
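The loop described above can be sketched in Python. This is a minimal illustration, not the application from (Troy, 2011): the column names, time bounds and WIP limits are hypothetical, times are drawn uniformly, and defects, added scope and blocked time are not modelled.

```python
import random

# Hypothetical board: (status name, min days, max days, WIP limit)
STATUSES = [
    ("Design", 1, 2, 2),
    ("Development", 2, 5, 3),
    ("Testing", 1, 3, 2),
]

def simulate_kanban(num_stories, statuses, rng):
    """Days needed for every story to travel from backlog to done."""
    backlog, done, day = num_stories, 0, 0
    columns = [[] for _ in statuses]  # remaining days of each story
    last = len(statuses) - 1

    def draw(i):
        _, low, high, _ = statuses[i]
        return rng.uniform(low, high)

    while done < num_stories:
        day += 1
        # Pull from the backlog while the first status has a free slot.
        while backlog > 0 and len(columns[0]) < statuses[0][3]:
            columns[0].append(draw(0))
            backlog -= 1
        # Work one day on every story on the board.
        columns = [[t - 1 for t in col] for col in columns]
        # Move finished stories, rightmost status first, so that slots
        # freed downstream can be filled on the same day.
        for i in range(last, -1, -1):
            staying = []
            for t in columns[i]:
                if t > 0:
                    staying.append(t)       # still being worked on
                elif i == last:
                    done += 1               # leaves the board
                elif len(columns[i + 1]) < statuses[i + 1][3]:
                    columns[i + 1].append(draw(i + 1))
                else:
                    staying.append(t)       # finished, blocked by WIP limit
            columns[i] = staying
    return day

if __name__ == "__main__":
    rng = random.Random(3)
    results = sorted(simulate_kanban(50, STATUSES, rng) for _ in range(1000))
    print(f"median: {results[len(results) // 2]} days, "
          f"range: {results[0]} to {results[-1]} days")
```

Changing a WIP limit in `STATUSES` and rerunning is one simple way to see how capacity in a given status affects the overall duration.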
The actual ranges can be mined from any work tracking tool, and are often easy to read from a Cumulative Flow Diagram, which is a graphical representation of how many cards are in each status at any given moment.
Defects, added scope and the time stories spend in a “Blocked” state are represented by adding more stories to the backlog at rates specified by the user, and by extending story times at user-specified rates.
Kanban simulation is carried out with the specified setup either visually for a single pass, or many hundreds or thousands of times for Monte-carlo results. Kanban simulation can also answer another key question: if you had to add staff, how many would you need and with what skills?
Sensitivity Analysis
Sensitivity analysis answers the question of which input factor has the greatest impact on the final result. In essence, if each input were increased and decreased by 10%, how much would each change affect the final result?
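A one-at-a-time version of this idea can be sketched in Python. The model below is a deliberately simple, deterministic stand-in with hypothetical input names and values; in practice the model function would be a summary of the Monte-carlo output, such as the median forecast.

```python
def forecast_days(inputs):
    """Toy model: work days needed to deliver all story points."""
    points = inputs["story_points"] * (
        1 + inputs["defect_rate"] + inputs["scope_rate"])
    iterations = points / inputs["points_per_iteration"]
    return iterations * inputs["days_per_iteration"]

def sensitivity(model, inputs, change=0.10):
    """Relative swing in the result when each input moves by +/- change."""
    baseline = model(inputs)
    impact = {}
    for name, value in inputs.items():
        up = model({**inputs, name: value * (1 + change)})
        down = model({**inputs, name: value * (1 - change)})
        impact[name] = (up - down) / baseline
    # Largest absolute swing first.
    return dict(sorted(impact.items(), key=lambda kv: abs(kv[1]),
                       reverse=True))

# Hypothetical baseline inputs
BASE_INPUTS = {
    "story_points": 300,
    "defect_rate": 0.2,
    "scope_rate": 0.1,
    "points_per_iteration": 30,
    "days_per_iteration": 10,
}

if __name__ == "__main__":
    for name, swing in sensitivity(forecast_days, BASE_INPUTS).items():
        print(f"{name:22s} {swing:+.1%}")
```

Inputs at the top of the ranking are the ones worth measuring or managing most carefully, since the forecast moves most when they change.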
Relevant Random Number Generation
Random number generation is a complex field of mathematics. For truly random numbers, a computer is the last thing you want, because the numbers it generates are not truly random; they come from algorithms that only approximate randomness.
In the Scrum model covered in the report, we forced the random number generator to follow a Normal distribution.
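Both points can be illustrated with Python's standard `random` module. This is a sketch under assumptions, not the method used in the report's Scrum model: the helper name `bounded_normal` is hypothetical, the mean is placed midway between the bounds, the bounds are assumed to lie three standard deviations from the mean, and out-of-range draws are simply redrawn.

```python
import random

def bounded_normal(rng, low, high, sigmas=3.0):
    """Normal draw centred between low and high, kept within the bounds."""
    mean = (low + high) / 2
    std_dev = (high - low) / (2 * sigmas)
    while True:
        value = rng.gauss(mean, std_dev)
        if low <= value <= high:  # redraw the rare out-of-range value
            return value

if __name__ == "__main__":
    # The generator is an algorithm: the same seed gives the same sequence.
    first = random.Random(7)
    second = random.Random(7)
    print(first.random() == second.random())

    rng = random.Random(7)
    draws = [bounded_normal(rng, 1.0, 3.0) for _ in range(10_000)]
    print(f"min {min(draws):.2f}, max {max(draws):.2f}, "
          f"mean {sum(draws) / len(draws):.2f}")
```

Seeding is useful in simulation work precisely because the generator is deterministic: a run that produced a surprising result can be repeated exactly.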
Conclusion
Thus, we can conclude with reasonable confidence that Monte-carlo analysis enables the user to answer the questions raised earlier, namely:
- How much will this product cost to develop and deliver?
- What is the likelihood of releasing by date x?
- What resources do you need to hit date x?
References
Monte Carlo method. (n.d.). Retrieved September 30, 2011, from Wikipedia:
Raydugin, Y. (n.d.). Unknown Unknowns in Project Probabilistic Cost and Schedule Risk Models.
Troy, M. (2011). Introduction to Monte-carlo Analysis for Software Development.
