On Code Coverage in Software Testing

What it is, why it matters, and how to get started

What is code coverage?

Code coverage refers to software testing metrics that show how much of a software system’s source code has been executed during testing. Represented as a percentage, code coverage indicates how well your code has been tested and identifies parts of your application that were missed in the testing process.

This article explores code coverage in depth, explaining what it is, why it matters, and how to effectively measure and improve it.

Types of code coverage

  • Statement Coverage: Measures the percentage of executed statements in the code.
  • Branch Coverage: Measures the percentage of executed branches in control structures like if-else and switch statements.
  • Function Coverage: Measures the percentage of functions or methods that have been executed.
  • Path Coverage: Measures the percentage of possible execution paths taken in the code.
  • Patch Coverage: Measures the percentage of changed code (for example, the lines added or modified in new commits) that has been tested.
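
The difference between statement and branch coverage is easiest to see on a small example. The sketch below is illustrative only: `member_price` and both tests are hypothetical names, not taken from any real project.

```python
def member_price(price_cents, is_member):
    """Return the price in cents, with a flat 500-cent discount for members."""
    total = price_cents
    if is_member:
        total = price_cents - 500
    return total


def test_member_gets_discount():
    # Alone, this test executes every statement in member_price,
    # but it only exercises the "true" outcome of the if.
    assert member_price(2000, True) == 1500


def test_non_member_pays_full_price():
    # This test exercises the remaining "false" outcome of the if.
    assert member_price(2000, False) == 2000
```

With only the first test, statement and function coverage are both complete, yet branch coverage is half: the case where the `if` is skipped never runs. Adding the second test covers both branch outcomes and, for this small function, both possible paths.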

Sample diagram showing where code coverage fits in the broader software testing process.

Why does code coverage matter?

The purpose of software testing—whether it’s smoke testing, regression testing, unit testing, etc.—is to improve the quality and reliability of your software. In theory, the more lines of code you test (i.e., the higher your code coverage), the more likely you’ll find harmful defects. Although high code coverage doesn’t guarantee high quality, it’s an important part of any risk mitigation strategy. And generally speaking, low code coverage is associated with a higher risk of software instability.

Benefits of high code coverage

  1. Improved quality: High code coverage often correlates with fewer bugs and higher software quality. Aiming for high coverage forces you to impose better structure on your code.
  2. Easier maintenance: Well-tested code is easier to refactor and maintain.
  3. Risk mitigation: Identifying untested parts of the code reduces the risk of undetected bugs.
  4. Release confidence: Increases the odds of finding bugs before a release, thus increasing your confidence.

How code coverage is calculated

Calculating code coverage involves running a suite of tests and measuring how much, and which parts, of the code those tests execute. This is typically done with code coverage tools and frameworks that produce detailed reports. The result, sometimes called test coverage, is expressed as a percentage, such as 80% or 90%.
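
To make the mechanics concrete, here is a minimal sketch of how a line-level tracer might record which lines run, using Python's standard `sys.settrace` hook. The function `classify`, the helper `record_executed_lines`, and the hand-counted set of executable lines are all assumptions made for illustration; real coverage tools do considerably more work.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


def record_executed_lines(func, *args):
    """Run func and return the line offsets (relative to its def line) that executed."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        # Only record 'line' events that belong to the function under test.
        if event == "line" and frame.f_code is code:
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return executed


# Assumption: the three body lines of classify (offsets 1, 2, 3), counted by hand.
EXECUTABLE = {1, 2, 3}

one_test = record_executed_lines(classify, 5)
two_tests = one_test | record_executed_lines(classify, -5)

for label, lines in (("one test", one_test), ("two tests", two_tests)):
    percent = 100 * len(lines & EXECUTABLE) / len(EXECUTABLE)
    print(f"{label}: {percent:.0f}% of lines executed")
```

With only the first call, two of the three body lines run, because the `return "negative"` line is never reached. Adding the second call brings the total to all three.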

Why don’t automated tests cover 100% of your code every time?

Automated tests often fall short of 100% coverage because some code is simply hard to test. Code with high cyclomatic complexity, for example, has many execution paths, and each one needs its own test. Coverage can also drop when a downstream dependency is neither running nor stubbed during unit tests, so the code paths that call it never execute.
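
One common way to cover code that calls a downstream dependency, without spinning that dependency up, is to substitute a stub. The sketch below uses `unittest.mock.Mock` from Python's standard library; `display_name`, `user_service`, and `get_user` are hypothetical names chosen for illustration.

```python
from unittest.mock import Mock


def display_name(user_service, user_id):
    """Return a formatted name, or a fallback when the service has no record."""
    record = user_service.get_user(user_id)
    if record is None:
        return "unknown"
    return record["name"].title()


def test_known_user():
    # The Mock stands in for the real service, so no network call is made.
    service = Mock()
    service.get_user.return_value = {"name": "ada lovelace"}
    assert display_name(service, 1) == "Ada Lovelace"
    service.get_user.assert_called_once_with(1)


def test_missing_user():
    # A second stubbed response exercises the fallback branch.
    service = Mock()
    service.get_user.return_value = None
    assert display_name(service, 2) == "unknown"
```

Passing the dependency in as an argument is a design choice that makes this substitution straightforward. Code that constructs its own client internally is harder to cover this way.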

Unreasonable standards can also stall progress, because chasing perfection slows delivery. The challenge is to release new functionality to the market without creating a mountain of tech debt.

Industry benchmarks

What is considered “good” code coverage?

Generally, code coverage between 80 and 90% is considered strong. The amount of code in question influences how hard that is to achieve: covering a large percentage of 700 lines of code (LoC) is easier than covering the same percentage of 15,000.

As a frame of reference, one study of 47 software projects spanning seven programming languages found an average code coverage of 74-76%.[1] The average project duration was 20 months, and the average LoC across the software projects was 8,178.

In another study, a participant whose team achieved 100% code coverage in a large space software project remarked: “When you have reached more than 90%, you are doing well.”[2]

As a rule of thumb, Google considers 60% “acceptable,” 75% “commendable,” and 90% “exemplary.” (source)

Is 100% code coverage necessary or desirable?

Though possible, it’s unusual to achieve 100% code coverage for your whole codebase. And if you were to reach 100% coverage, it’d require enormous effort.

When dealing with highly critical software, which affects human health and safety or can irreparably destroy the machine it powers (e.g., a spacecraft), you may want—or be required—to achieve near 100% test coverage. For example, in one study, a development team managing a laser communication software project for satellites was required to achieve 100% coverage. And they did. But accomplishing this feat was extremely difficult: two developers worked painstakingly for two years to reach 100% test coverage for 25,000 LoC.[3] 

Interestingly, the authors of the study observed that getting from 95 to 100% coverage required a great deal more effort (proportionally) than getting from 0 to 95%. Due to the high effort in the last mile of coverage, you actually introduce risk when aiming for excessively high coverage. Such risks might include a delayed time to market.

If your development team is strapped for resources, your software systems are not “highly” critical, and you need to move fast, then it’s probably best to avoid pursuing 100% coverage.

That said, achieving near 100% coverage is more feasible for specific “patches”, i.e., new lines of code.
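
As a rough sketch of the idea, patch coverage can be computed as the share of changed lines that the tests executed. The function name, the file name `billing.py`, and the line numbers below are hypothetical. In practice the changed lines would come from a diff and the executed lines from a coverage report.

```python
def patch_coverage(changed_lines, executed_lines):
    """Percentage of changed (file, line) pairs that were executed by tests."""
    if not changed_lines:
        # Convention assumed here: an empty patch counts as fully covered.
        return 100.0
    covered = changed_lines & executed_lines
    return 100.0 * len(covered) / len(changed_lines)


# Hypothetical patch touching four lines of one file.
changed = {("billing.py", 41), ("billing.py", 42), ("billing.py", 43), ("billing.py", 57)}

# Hypothetical test run that executed lines 1 through 49 of the same file.
executed = {("billing.py", n) for n in range(1, 50)}

print(f"patch coverage: {patch_coverage(changed, executed):.0f}%")
```

Here three of the four changed lines were executed, so the patch is 75% covered even if the codebase as a whole sits at a very different figure.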

Getting started with code coverage

Setting up code coverage tools

To get started, integrate a code coverage tool appropriate for your programming language into your development environment. Configure it to run alongside your tests.

Code coverage tools and frameworks

Establishing goals and thresholds

Set realistic coverage goals and thresholds. While 100% coverage is ideal, as mentioned, it’s often impractical. Focus on critical and frequently used parts of the codebase. Adjust your coverage goals based on the:

  • Potential consequences of a software failure for the application being tested - e.g., if you’re testing a less critical piece of software, then you may aim for modest coverage goals (80%).
  • Broader goals of the software project - e.g., if the primary goal of a project is to get the new software product to market as quickly as possible, then you should set modest coverage goals to accommodate the need for speed.
  • Other software testing and risk mitigation measures you’ve employed - e.g., if you have strong, holistic testing procedures in place, then you may be able to justify lowering your coverage goals, at least slightly.

“100%: it sounds really good. But I think those who demand it do not know what they are asking for.”

(Source)

Coverage criteria

When setting code coverage parameters, select the criteria that best fit your project’s needs: statement, branch, function, or path coverage, or a combination of them. For example, critical applications might require higher branch and path coverage.

Aim for a balance between achieving high coverage and low testing overhead. Avoid excessive focus on coverage at the expense of overall software engineering performance and business objectives. 

Interpreting coverage reports

Coverage reports usually highlight executed and unexecuted parts of the code, providing percentages and visualizations that help in understanding the extent of coverage.

In this sample dashboard of Codecov, a code coverage tool, we see a branch coverage of 95.4%. (Source)

In this sample coverage report in Coverage.py, another code coverage tool, we see two files containing a total of 76 statements, with 10 statements missed during testing, resulting in 87% test coverage. (Source)
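
The arithmetic behind such a report can be sketched in a few lines. Only the totals (76 statements, 10 missed) come from the report described above. The per-file split and the file names are hypothetical, chosen so that they add up to those totals.

```python
def percent_covered(statements, missed):
    """Coverage percentage for a given statement count and missed count."""
    if statements == 0:
        # Convention assumed here: a file with no statements counts as fully covered.
        return 100.0
    return 100.0 * (statements - missed) / statements


# Hypothetical per-file numbers that sum to 76 statements and 10 missed.
report = {
    "app.py": {"statements": 40, "missed": 4},
    "helpers.py": {"statements": 36, "missed": 6},
}

total_statements = sum(f["statements"] for f in report.values())
total_missed = sum(f["missed"] for f in report.values())

for name, f in report.items():
    print(f"{name:12} {percent_covered(f['statements'], f['missed']):.0f}%")
print(f"{'TOTAL':12} {percent_covered(total_statements, total_missed):.0f}%")
```

Note that the total is computed from the summed statement counts, not by averaging the per-file percentages, so larger files weigh more heavily in the overall figure.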

Best practices for improving code coverage

  • Write testable code - Develop code with testability in mind. Avoid complex and tightly coupled code structures.
  • Identify and address gaps - Regularly review coverage reports to identify untested parts of the code and write additional tests to cover these gaps.
  • Continuous integration and reporting - Automate code coverage reporting in your CI pipeline to monitor coverage trends over time and catch regressions early.

Code coverage in practice

Real-world examples

Here are a few real-world examples of organizations that improved their software reliability as a result of increasing their code coverage:

Challenges and limitations

Be aware of challenges such as the overhead of maintaining tests and the false sense of security high coverage might provide if tests are not well-designed. High coverage does not guarantee high reliability. 

Combining with other techniques

Combine code coverage with other testing techniques like static analysis, peer reviews, and even controlled testing in production for comprehensive quality assurance.

Usage in industry

  • Standards and guidelines - Adopt industry standards and guidelines for code coverage, such as those from ISO.
  • Organizational approaches - Different organizations might have varying approaches. For instance, startups might prioritize rapid development over high coverage, while financial institutions might enforce strict coverage policies.
  • Part of quality metrics - Include code coverage as a part of a broader set of quality metrics, alongside code complexity and defect density.

Maintaining high code coverage

  • Sustain coverage over time - Regularly update tests to cover new code and refactored parts of the application to maintain high coverage levels.
  • Deal with legacy code - In this context, legacy code refers to code that was written before modern software testing practices emerged. It’s advised to address coverage in legacy code incrementally. Start with the most critical parts and gradually increase coverage. Also, you can leverage tools that automate the process of increasing code coverage for legacy code.
  • Automate tracking and reporting - Automate the process of tracking and reporting coverage to reduce manual effort and ensure consistency.
  • Follow SOLID principles - Following design guidelines such as the SOLID principles tends to produce small, loosely coupled units that are easier to test.
  • Pay attention to cyclomatic complexity - Rising cyclomatic complexity can indicate that code is degrading in quality, which in turn makes it harder to test.
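
For a rough sense of how cyclomatic complexity might be tracked, the sketch below counts decision points in Python source using the standard `ast` module. It is a simplified approximation (one plus the number of branching constructs), not a full implementation of the metric, and `approximate_complexity` and the sample function are hypothetical.

```python
import ast

# Constructs treated as decision points in this simplified count.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit outcomes.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def shipping_cost(weight, express):
    if weight <= 0:
        raise ValueError("weight must be positive")
    if weight > 10 and express:
        return 25
    return 5
'''

print(approximate_complexity(SAMPLE))
```

The sample has two `if` statements and one `and`, so this count comes to four. A function whose score keeps climbing over time is a candidate for splitting into smaller pieces before it becomes hard to cover.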

Code coverage and testing frameworks

Popular testing frameworks

Use testing frameworks like JUnit for Java, pytest for Python, and Mocha for JavaScript. These frameworks are commonly paired with separate coverage tools (for example, JaCoCo, Coverage.py via the pytest-cov plugin, and Istanbul’s nyc, respectively) rather than measuring coverage on their own. Integrate coverage tools with these frameworks to streamline the process and enhance reporting. Use the data from coverage reports to optimize tests, focusing on high-risk and frequently changed parts of the code.

Code coverage in CI/CD

Integrating into your development workflow and CI/CD pipeline

You can integrate code coverage tools into your continuous integration (CI) pipeline, making code coverage a standard part of every build. How you integrate code coverage into your automated testing process depends on the programming language and specific code coverage tool. Implementing code coverage checks in your CI/CD pipelines enables you to catch issues early in the development cycle. 

  • Set thresholds - Set minimum coverage thresholds that must be met for code to be merged or released. 
  • Use code coverage as a gate - Utilize coverage metrics as a gate in your deployment process, allowing only code that meets the required coverage to be deployed.
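
A coverage gate can be as simple as comparing the measured figure against the minimum and returning a failing exit code when it falls short. The sketch below assumes hypothetical names and values. In a real pipeline the measured percentage would be read from the coverage tool's report.

```python
def coverage_gate(measured_percent, minimum_percent):
    """Return a process exit code: 0 if the threshold is met, 1 otherwise."""
    if measured_percent >= minimum_percent:
        print(f"OK: {measured_percent:.1f}% meets the {minimum_percent:.1f}% minimum")
        return 0
    print(f"FAIL: {measured_percent:.1f}% is below the {minimum_percent:.1f}% minimum")
    return 1


# Hypothetical values for illustration only.
exit_code = coverage_gate(measured_percent=78.4, minimum_percent=80.0)
```

A CI step could pass the returned value to `sys.exit` so that a non-zero code fails the build. Some tools also offer this natively: Coverage.py, for instance, accepts a `--fail-under` option on `coverage report`.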

Code coverage and software quality

  • Measure of quality - Use code coverage as one of the indicators of software quality, but not the sole determinant.
  • Combined metrics - Combine coverage with other metrics like change fail rates and user satisfaction to get a holistic view of software quality.
  • Establishing standards - Set organizational standards for code coverage to maintain consistency and high quality across projects.

To reiterate, testing a large portion of your codebase does not mean you’ll necessarily avert an outage. Bugs may still lurk in your code. You must, therefore, employ multiple risk mitigation strategies to ensure high stability.

Limits of pre-production testing and code coverage

No matter how much pre-production testing you do, you’re merely approximating production. Such tests cannot perfectly replicate a real-world environment with large-scale production traffic. In some cases, despite extensive pre-production testing, when you release changes to production, they cause your app to go down for an unknown reason.

This is why, in addition to pre-production testing, we advise using feature flags to test in production with a small subset of users and environments. Doing so gives you a more accurate idea of how software changes will affect operational health before they are exposed to a large volume of production traffic.
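
One way such a limited rollout might work is to hash each user into a stable bucket and enable the flag only for buckets below the rollout percentage. This is a minimal sketch under that assumption, not a description of any particular feature flag product. `is_enabled` and the flag name `new-checkout` are hypothetical.

```python
import hashlib


def is_enabled(flag_name, user_id, rollout_percent):
    """Deterministically decide whether a user falls inside the rollout."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    # Map the hash to a bucket from 0 to 99; the same user always lands in the same bucket.
    bucket = int(hashlib.sha256(key).hexdigest()[:8], 16) % 100
    return bucket < rollout_percent


# Expose the hypothetical "new-checkout" change to roughly 5% of users.
users = [f"user-{i}" for i in range(1000)]
exposed = [u for u in users if is_enabled("new-checkout", u, 5)]
print(f"{len(exposed)} of {len(users)} users see the new code path")
```

Because the decision depends only on the flag name and the user ID, a given user gets a consistent experience across requests, and raising the percentage only ever adds users to the exposed group.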

Footnotes

[1] Michael Hilton, Jonathan Bell, and Darko Marinov. 2018. A Large-Scale Study of Test Coverage Evolution. In Proceedings of the 2018 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE ’18), September 3–7, 2018, Montpellier, France. ACM, New York, NY, USA, 11 pages. https://doi.org/10.1145/3238147.3238183

[2] Christian Prause, Jürgen Werner, Kay Hornig, Sascha Bosecker, and Marco Kuhrmann. 2017. Is 100% Test Coverage a Reasonable Requirement? Lessons Learned from a Space Software Project. In International Conference on Product-Focused Software Process Improvement. https://doi.org/10.1007/978-3-319-69926-4_25. https://www.researchgate.net/publication/319141355_Is_100_Test_Coverage_a_Reasonable_Requirement_Lessons_Learned_from_a_Space_Software_Project

[3] Ibid. https://www.researchgate.net/publication/319141355_Is_100_Test_Coverage_a_Reasonable_Requirement_Lessons_Learned_from_a_Space_Software_Project


July 30, 2024