Software Metrics

Note: This is a copy of a page I wrote for the software engineering course I taught at the University of Illinois Urbana-Champaign. I am reposting it here on my blog in the hope that it will be found useful by others in the future.

What are software metrics?

  • Quantitative measurements distilled from data
  • Distilled by measuring software development processes and actual source code
  • Highlight areas that need work in specific nodes of code and support generalizations about your code overall
  • “You can’t control what you can’t measure” -Tom DeMarco

Limitations of metrics

  • Software metrics are intended to help programmers control and monitor software production, but…
  • It’s difficult to determine “how much” software there is in a given program
  • Can give a skewed impression of software, especially when calculated early in the software development process
  • Can be difficult or complex to calculate, especially as the volume of code grows

Examples of metrics

  • Lines of code
  • Number of classes & interfaces
  • Code to comment ratio
  • Cyclomatic complexity
  • Code coverage
  • Bugs to lines of code ratio
  • Cohesion
  • Coupling
  • Failed tests per build
  • Version control commits per day
  • Lines of code per commit

Terminology

  • Node
    • A block of source code, usually either a single line, a function/method, class, or package. A node can have multiple children, but only one direct parent
  • Program
    • A graph of all of the nodes that comprise the source code
  • Flow graph
    • A directed graph of all of the single-line nodes, connected by edges wherever the flow of execution might proceed

Some Specifics

Lines of Code

  • A key size attribute of software
  • Can be a good measure of software volatility, especially when tracked over the entire development process
  • Can be used as the basis for other metrics, such as the bugs:code and tests:code ratios

Code to comment ratio

  • We’ve already seen how important commenting is to developing quality code
  • This metric puts a numerical value on the amount of inline documentation in a piece of software
  • Warns developers when code needs more documentation
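As a rough illustration of the two metrics above, the sketch below counts code lines and comment lines and divides one by the other. The function names and the `SAMPLE` snippet are made up for this example, and the counting rule is an assumption: only full-line `#` comments count as comments, and blank lines are ignored. Real tools also handle trailing comments, block comments, and docstrings.

```python
def code_comment_counts(source: str) -> tuple[int, int]:
    """Return (code_lines, comment_lines) for '#'-style full-line comments."""
    code = comments = 0
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither code nor comment
        if line.startswith("#"):
            comments += 1
        else:
            code += 1  # assumption: a line with a trailing comment is still code
    return code, comments


def code_to_comment_ratio(source: str) -> float:
    """Lines of code per comment line; infinite when there are no comments."""
    code, comments = code_comment_counts(source)
    return float("inf") if comments == 0 else code / comments


# Hypothetical input used only to exercise the functions.
SAMPLE = """\
# Add two numbers.
def add(a, b):
    # Delegate to the built-in operator.
    return a + b

print(add(1, 2))
"""
```

Tracking `code_comment_counts` over time gives the lines-of-code trend, and a rising `code_to_comment_ratio` is one possible signal that documentation is falling behind.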

Cyclomatic Complexity

  • Directly measures the number of linearly independent paths through source code
  • CC = E – N + 2p
    • where E = the number of edges of the program’s flow graph
    • N = the number of nodes of the graph
    • p = the number of connected components of the graph
  • If code contains no decisions, then CC = 1; if it contains a single binary if statement, CC = 2; and so on
  • Upper bound on the number of test cases required to achieve complete branch coverage of a given piece of code
  • Commonly used thresholds (cyclomatic complexity – risk evaluation):
    • 1-10 – A simple program without much risk
    • 11-20 – More complex, moderate risk
    • 21-50 – Complex and high risk
    • > 50 – Practically untestable, very high risk
  • Lower CC contributes to a program’s understandability and indicates that it is more easily modifiable
  • Generally, the greater CC becomes, the more complex and unmaintainable the code becomes
  • Greater cyclomatic complexity indicates a greater learning curve for new developers
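To make the calculation concrete, here is a minimal sketch that computes cyclomatic complexity from a flow graph given as a list of nodes and a list of directed edges. It uses the E − N + 2p form of McCabe's formula, which applies when each connected component has a single entry and a single exit. The function name and the two example graphs are invented for illustration.

```python
def cyclomatic_complexity(nodes, edges):
    """Return E - N + 2p for a flow graph (nodes: list, edges: list of pairs)."""
    # Union-find over the nodes to count connected components (p),
    # treating edges as undirected for the purpose of connectivity.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    components = len({find(n) for n in nodes})
    return len(edges) - len(nodes) + 2 * components


# Three statements executed in sequence, no decisions.
STRAIGHT_LINE = (["a", "b", "c"], [("a", "b"), ("b", "c")])

# A single binary if statement: the condition either runs the body or skips it.
SINGLE_IF = (
    ["cond", "then", "end"],
    [("cond", "then"), ("cond", "end"), ("then", "end")],
)
```

Calling `cyclomatic_complexity(*STRAIGHT_LINE)` yields 1 and `cyclomatic_complexity(*SINGLE_IF)` yields 2, matching the "no decisions" and "binary if statement" cases described above.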

Code Coverage

  • A metric that describes the extent to which the source code of a program has been tested
  • Different degrees of code coverage:
    • Function coverage – Has each function in the program been executed?
    • Statement coverage – Has each line of the source code been executed?
    • Condition coverage – Has each evaluation point (such as a true/false decision) been executed?
    • Path coverage – Has every possible route through a given part of the code been executed?
    • Entry/exit coverage – Has every possible call and return of the function been executed?
  • Some of the above are connected: for example, full path coverage implies full statement coverage
  • Code Coverage and Unit Tests
    • Indicator of how well your tests actually test your code
    • Lets you know if you have enough tests in place
    • Allows you to maintain the quality of your test suite over the lifetime of the project
  • How Code Coverage works (in Java)
    1. Compile the source code
    2. Instrument the compiled class files, excluding the compiled test cases. This adds the necessary information to allow for…
    3. Collect runtime data
    4. Merge the runtime data into an auditable report
    5. When the tests are executed, the extra information added during instrumentation writes exact coverage data to disk
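The steps above describe instrumenting compiled Java class files. As a loose analogy only, and not the Java toolchain itself, the sketch below shows the same idea of recording which statements ran, using Python's `sys.settrace` hook instead of bytecode instrumentation. The `classify` function, the helper name, and the hand-counted statement total are all assumptions made for this example.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


# Hand-counted number of statements in classify's body (an assumption of this sketch).
CLASSIFY_STATEMENTS = 3


def executed_line_offsets(func, *args):
    """Run func(*args); return (result, line offsets executed inside func).

    Offsets are relative to the 'def' line, so offset 1 is the first body line.
    """
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace frames of other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            if offset > 0:  # skip the def line itself
                executed.add(offset)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, executed


def statement_coverage(executed, total_statements):
    """Fraction of statements that were executed at least once."""
    return len(executed) / total_statements
```

Running `classify(5)` alone touches two of the three statements. Adding a second run with a negative argument and taking the union of the two result sets covers all three, which is the "merge the runtime data" step in miniature.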

Cohesion

  • Cohesion is a measure of how strongly related the various responsibilities of a software module are
  • A node is usually deemed to have “high cohesion” or “low cohesion”
  • High Cohesion can indicate many things about code, including the extent of reuse of code and readability
  • Disadvantages of low cohesion:
    • Increased difficulty in understanding nodes of source code
    • Increased difficulty in maintaining source code – a change in one node will require changes in many other nodes
    • Increased difficulty in reusing a node of source code, since most other nodes will not need the functionality that a node with low cohesion provides

Coupling

  • Coupling is the extent to which a node relies on the other nodes in the source code
  • Nodes can be called either “loosely/weakly coupled” or “strongly/tightly coupled”
  • Loose coupling usually goes hand in hand with high cohesion
  • Loose coupling refers to a relationship between nodes such that one node interacts with the other nodes via a stable interface and does not need to be concerned with the internal implementation of the other nodes
  • Types of coupling:
    • Content coupling (tightest) – When one node modifies or relies on the internal workings of another node
    • Common coupling – When nodes share the same global data
    • External coupling – When nodes rely on an external data format
    • Data coupling – When nodes share data through parameters
    • Message coupling (loosest) – When nodes are not dependent on each other and use a public interface to communicate
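A few of the coupling types above are easier to recognize in code. The sketch below is illustrative only. `TAX_RATE`, `Account`, and the function names are hypothetical and exist just to contrast common, content, and data coupling.

```python
# Common coupling: any function that reads this global depends on shared state.
TAX_RATE = 0.25


def price_with_global_tax(price):
    return price * (1 + TAX_RATE)


# Data coupling: everything the function needs arrives through parameters.
def price_with_tax(price, tax_rate):
    return price * (1 + tax_rate)


class Account:
    def __init__(self, balance):
        self._balance = balance  # internal detail, private by convention

    def deposit(self, amount):
        self._balance += amount

    def balance(self):
        return self._balance


# Content coupling: reaches into another node's internals and modifies them.
def content_coupled_bonus(account):
    account._balance += 10


# Looser alternative: the same effect through the public interface only.
def interface_bonus(account):
    account.deposit(10)
```

Both bonus functions leave the account in the same state, but only `content_coupled_bonus` breaks if `Account` renames or restructures its internal field.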

Methods for decreasing coupling and increasing cohesion

  • Transmit messages between nodes in a flexible format (such as XML)
  • Use public interfaces to communicate messages between nodes where a file format is not required
  • Separate code into nodes that perform logical chunks of work (example: MVC pattern)
  • Write code such that the implementation of a given node of code is independent from how it is used by other nodes
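One way to apply the last two points is to have callers depend on a small public interface and not on a concrete class. The following is a minimal sketch, assuming Python's `typing.Protocol`. `MessageSink`, `ListSink`, and `report_build` are names invented for this example.

```python
from typing import Protocol


class MessageSink(Protocol):
    """The public interface: anything with a send(str) method qualifies."""

    def send(self, message: str) -> None: ...


class ListSink:
    """One possible implementation; keeps messages in memory."""

    def __init__(self) -> None:
        self.messages: list[str] = []

    def send(self, message: str) -> None:
        self.messages.append(message)


def report_build(sink: MessageSink, failed_tests: int) -> None:
    """Depends only on the interface, not on how the sink stores or ships messages."""
    status = "FAILED" if failed_tests else "OK"
    sink.send(f"build {status}: {failed_tests} failed tests")
```

Because `report_build` knows nothing about `ListSink`, the sink can later be swapped for one that writes to a file or a network socket without touching the reporting code.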

Free tools for auditing software
