stability of the development team. For projects with reused software, there will be an early contribution to total SLOC and thus immediate progress and quality metrics.
• Defects (SCO0 and SCO1). Changes to resolve software errors constitute an important statistic from which the reliability and maturity of a baseline can be derived. The expectation is that the incidence of uncovered errors is highest immediately after a release and decreases over time as the software matures.
• Improvements (SCO2). Another stimulus for changing a baseline, improvements are also key to assessment of the quality and the progress toward producing quality. The expectation for improvements is inversely proportional to defects. Because the defect rate starts high and damps out, improvements start low (the focus is on defects) and increase. This phenomenon is loosely based on the assumption that a fixed team is working the test and maintenance activities. It is captured by the following relationship:
Effort (defects) + Effort (improvements) = Constant
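The fixed-team assumption behind this relationship can be sketched as a small model. All the numbers below are illustrative, not from the source; the only real constraint is that defect effort and improvement effort sum to the team's constant capacity.

```python
# Sketch of the fixed-effort assumption: a constant-size test/maintenance
# team splits its capacity between defect fixes and improvements, so as
# defect effort damps out, improvement effort rises to fill the gap.
# TEAM_CAPACITY and the weekly figures are hypothetical.

TEAM_CAPACITY = 100.0  # staff-hours per week (illustrative)

def improvement_effort(defect_effort: float) -> float:
    """Effort(defects) + Effort(improvements) = Constant."""
    return TEAM_CAPACITY - defect_effort

# Defect effort starts high after a release and decays over time.
for week, defects in enumerate([90.0, 70.0, 45.0, 25.0, 10.0], start=1):
    improvements = improvement_effort(defects)
    print(f"week {week}: defects={defects:5.1f}  improvements={improvements:5.1f}")
```

The point of the sketch is only the complementary trend: early in a release the team's hours go mostly to defects, and the improvement share grows as the defect rate damps out.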
Differentiation between defects and improvements is somewhat subjective. The change metrics defined herein are not particularly sensitive to either type because they rely on the sum of the impacts from both types. However, the difference between defects and improvements can have a significant impact on the maturity measures described in Section C.2.2.
• New features (SCO3). Type 3 changes reflect an update to the stakeholder expectations for new features or capabilities outside the scope of the current contract. The statistics for type 3 changes are analyzed separately because they reflect new work rather than rework.
• Number of SCOs (N). Because an SCO is a discrete unit of change, it is important for its definition to be consistent across all domains where the metrics will be compared. At what level are changes documented and tracked? Most projects converge on a fairly loose definition of an SCO based on size, breadth of impact on individuals and teams, and CCB culture. This loose approach works for an individual project, but if every project uses a different definition, comparability across projects is compromised. In general, an SCO should affect a single component and should be allocated to a single individual or team leader. With this simple standard, more-precise definitions of these primitives are unnecessary; imprecise primitives work fine, and greater precision adds little value. As more and more metrics collection is supported by automated tools, the overall measurement techniques and primitive units will become further homogenized.
• Open rework (B). Theoretically, all rework corresponds to an increase in quality. The rework is necessary either to remove an instance of "bad" quality (SCO0 and SCO1) or to enhance a component to improve life-cycle cost effectiveness (SCO2). To assess quality trends accurately, the dynamics of the rework must be evaluated in the context of the life-cycle phase. A certain amount of rework is necessary on a large software engineering effort; early rework is considered a sign of healthy progress in a modern process model. Continuous rework, late rework, or zero rework due to the absence of a configured baseline are generally indicators of negative quality. Interpretation of this statistic requires project context. In general, however, rework should ultimately approach zero at product delivery. To provide a consistent collection process that can be automated, rework can be defined as the number of SLOC estimated to change due to an SCO. The absolute accuracy of the estimates is generally unimportant. Because open rework is tracked with an estimate and closed rework is tracked separately with actuals, the values continually correct themselves and remain consistent.
• Closed rework (F). Whereas the breakage statistics estimate the damage done, the repair statistics identify the actual damage that was fixed. Upon resolution, the corresponding breakage estimate is updated to reflect the actual required repair that remains in the baseline. Although the actual SLOC fixed (F) will never be absolutely accurate, it will be relatively accurate for assessing trends. Because "fixed" can take on several different meanings depending on what is added, deleted, or changed, a consistent set of guidelines is necessary. Changed SLOC will increase B and F without a change to total SLOC. Added code will increase B, F, and total SLOC, although not in the same proportions. Deleted code (an infrequent occurrence) with no corresponding addition could increase B and reduce total SLOC. Given the volume of changes and the need for only roughly accurate data for identifying trends, the accuracy and precision of the raw data are relatively unimportant.
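The bookkeeping rules for B, F, and total SLOC described above can be illustrated with a short sketch. The class and field names are assumptions introduced for illustration, not a defined standard.

```python
# Sketch of open/closed rework bookkeeping: B = open rework (estimated
# SLOC still to change), F = closed rework (actual SLOC fixed), and
# total product SLOC. Names and structure are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Baseline:
    sloc: int               # total product SLOC
    open_rework: int = 0    # B: estimated SLOC still to be changed
    closed_rework: int = 0  # F: actual SLOC fixed to date

    def open_sco(self, estimated_sloc: int) -> None:
        """Raising an SCO adds its estimated breakage to B."""
        self.open_rework += estimated_sloc

    def close_sco(self, estimated_sloc: int, changed: int = 0,
                  added: int = 0, deleted: int = 0) -> None:
        """Resolving an SCO retires the estimate and records actuals."""
        self.open_rework -= estimated_sloc       # estimate leaves B
        self.closed_rework += changed + added + deleted  # actuals enter F
        self.sloc += added - deleted             # changed lines leave total SLOC as-is

b = Baseline(sloc=100_000)
b.open_sco(500)                           # defect raised, ~500 SLOC estimated
b.close_sco(500, changed=350, added=80)   # actual fix: 350 changed, 80 added
```

Note how the update rules mirror the text: changed lines raise B and F but not total SLOC, added lines raise all three, and deletions with no corresponding addition would shrink total SLOC.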
• Rework effort (E). The total effort expended in resolving SCOs is another necessary perspective for tracking the complexity of rework. Activities should be limited to technical requirements, software engineering, design, development, and functional test. Higher level systems engineering, management, configuration control, verification testing, and system testing should be excluded, because these activities tend to be more a function of company, customer, or project attributes, independent of quality. The goal here is to normalize the widely varying bureaucratic activities out of the metrics.
• Usage time (UT). This important statistic corresponds to the number of hours that a given baseline has been operating under realistic usage scenarios. For some systems, this statistic corresponds to straight time measurements; for many others, automated tests can simulate one day of usage in a one-hour test. For example, most transaction processing systems have an expected average load that they process daily. If this average load can be packaged in a test scenario and executed against the product baseline in one hour, it counts as 24 hours of usage time. As another example, consider a development tool that is used by humans operating at human speeds of several keystrokes per second. If automated GUI test tools can support scripts of interactions that can be tested against the product at a tenfold higher rate, then every hour of test time counts as 10 hours of usage time. Defining the mapping of test time to usage time is generally straightforward. This is also a great requirements analysis exercise that frequently uncovers ambiguities in the understanding of usage scenarios among different stakeholders.
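The test-time-to-usage-time mapping described above reduces to a per-scenario multiplier: a one-hour test replaying a full day of transactions counts as 24 hours of usage, and GUI scripts run at ten times human speed count each test hour as 10 usage hours. A minimal sketch, with scenario names and rates as illustrative assumptions:

```python
# Sketch of mapping automated test time to usage time (UT) via a
# per-scenario multiplier. Scenario names and rates are hypothetical.

USAGE_RATE = {                       # usage hours credited per test hour
    "daily_transaction_load": 24.0,  # 1-hour test = 1 day of average load
    "gui_script_10x": 10.0,          # scripted GUI input at 10x human speed
    "soak_real_time": 1.0,           # straight-time operation
}

def usage_time(test_runs: list[tuple[str, float]]) -> float:
    """Sum usage hours over (scenario, test_hours) pairs."""
    return sum(hours * USAGE_RATE[scenario] for scenario, hours in test_runs)

ut = usage_time([("daily_transaction_load", 1.0), ("gui_script_10x", 2.0)])
print(ut)  # 1*24 + 2*10 = 44.0 hours of usage time
```

Defining the multiplier table is where the requirements analysis happens: agreeing on what one hour of each test scenario is worth forces stakeholders to make their usage assumptions explicit.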
The end-product quality metrics (Table C-2) provide insight into the maintainability of the software products with respect to type 0, 1, and 2 SCOs. Type 3 SCOs are explicitly not included, because they redefine the inherent target quality of the system and tend to require more global system and software engineering as well as some major reverification of system-level requirements. Because these types of changes are dealt with in extremely diverse ways by different customers and projects, they would tend to cloud the meanings and comparability of the data.
The following metrics data should be very helpful in determining and planning the amount of effort necessary to implement type 3 SCOs. The metrics are also useful when applied against subsets of the product, such as components or releases. The word product is used here to refer to whatever is being measured.