
Introducing Crap4J

“Hell is other people’s code” – T-Shirt in Mountain View, Ca.

Detect and Sanitize CRAPpy Java Code with Crap4j

by Alberto Savoia

There is no fool-proof, 100% accurate and objective way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and empirical evidence – is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit the “This is crap!” response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into “Oh crap!”

Since writing automated tests (e.g., using JUnit) for complex code is particularly hard to do, crappy code usually comes with few, if any, automated tests. The presence of automated tests implies not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but it also means that the developers cared enough and had enough time to write tests – another good sign.

Since the combination of complexity and lack of tests is a key contributing factor in making code crappy – and a maintenance challenge – my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (CRAP) score uses cyclomatic complexity and code coverage from automated tests to help estimate the effort and risk associated with maintaining legacy code. We are working on an open-source experimental tool called “crap4j” that calculates the CRAP score for Java code. We need more experience and time to fine-tune it, but the initial results are encouraging and we have started to experiment with it in-house.

Crap4J is currently a prototype and it’s implemented as an Eclipse plug-in using JUnit. If you are interested in contributing to crap4j’s open-source effort to support other environments and test frameworks (e.g. TestNG) please let us know. Instructions for installing the crap4j plug-in are below, but first let’s re-introduce the CRAP formula.

The CRAP Formula Version 0.1

Given a Java method m, CRAP for m is calculated as follows:

CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric and it’s calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage.
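As a rough illustration, the formula can be written as a few lines of Java. This is only a sketch: the class name CrapScore and the method name crap are made up for this example and are not part of crap4j, and the input checks are an assumption about what counts as a sensible argument.

```java
// Sketch of the CRAP formula, version 0.1.
// CrapScore and crap(...) are hypothetical names, not crap4j's API.
final class CrapScore {

    /**
     * @param complexity      cyclomatic complexity of the method, comp(m); at least 1
     * @param coveragePercent coverage from automated tests, cov(m); from 0 to 100
     * @return the CRAP score of the method
     */
    static double crap(int complexity, double coveragePercent) {
        // Assumed sanity checks; the formula itself does not prescribe them.
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be at least 1");
        }
        if (coveragePercent < 0.0 || coveragePercent > 100.0) {
            throw new IllegalArgumentException("coverage must be between 0 and 100");
        }
        double uncovered = 1.0 - coveragePercent / 100.0;
        // comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
        return complexity * complexity * uncovered * uncovered * uncovered + complexity;
    }

    public static void main(String[] args) {
        System.out.println(crap(5, 0));   // 30.0
        System.out.println(crap(8, 50));  // 16.0
        System.out.println(crap(5, 100)); // 5.0
    }
}
```

Note that with full coverage the first term vanishes, so the score can never drop below the complexity of the method itself.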

Low CRAP numbers indicate code with relatively low change and maintenance risk – because it’s not too complex and/or it’s well-protected by automated and repeatable tests. High CRAP numbers indicate code that’s risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage to make sure you have not introduced any unintentional changes.

Generally speaking, you can lower your CRAP index either by adding automated tests or by refactoring to reduce complexity. Preferably both, and it’s a good idea to write the tests first so you can refactor more safely.
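To make the two levers concrete, here is a small sketch that applies the formula to a made-up method. The numbers are hypothetical: it assumes a method of complexity 12 with no tests, and it assumes that refactoring leaves a method of complexity 6. The class name CrapLevers is invented for this example.

```java
// Illustrative only: the method under study and its numbers are hypothetical.
final class CrapLevers {

    // CRAP formula version 0.1; coveragePercent is in the range 0..100.
    static double crap(int complexity, double coveragePercent) {
        double uncovered = 1.0 - coveragePercent / 100.0;
        return complexity * complexity * uncovered * uncovered * uncovered + complexity;
    }

    public static void main(String[] args) {
        // Starting point: complexity 12, no automated tests.
        System.out.println("as is:            " + crap(12, 0));  // 156.0
        // Lever 1: add tests until half of the method is covered.
        System.out.println("tests only:       " + crap(12, 50)); // 30.0
        // Lever 2: refactor down to complexity 6, still without tests.
        System.out.println("refactoring only: " + crap(6, 0));   // 42.0
        // Both levers together.
        System.out.println("both:             " + crap(6, 50));  // 10.5
    }
}
```

In this made-up case either lever alone helps a lot, but combining them gives by far the lowest score.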

Like all software metrics, CRAP is not perfect. We know very well, for example, that you can have great code coverage and lousy tests, or that sometimes complex code is unavoidable and a higher-complexity method might be easier to understand than three simpler ones. We are also aware that the CRAP formula doesn’t currently take into account higher-order, more design-oriented metrics that are relevant to maintainability (such as cohesion and coupling). But the perfect software metric does not exist and, regardless of design issues, overly complex methods and lack of tests are usually bad things, so we decided that – even in its current state – the crap4j metric provides useful information that we should start experimenting with. This way we have something concrete to give us experience and data for further refinement.

Interpreting CRAP Results

For a given method, the CRAP number ranges from 1 (for a method of complexity 1 and 100% code coverage) to a very large number (e.g. a method of complexity 100 with 0% code coverage – we have seen such beasts – would score 10,100).

Individual Method Interpretation

Bob Evans and I have looked at a lot of examples (using our code and many open source projects) and listened to a LOT of opinions. After much debate, we decided to **INITIALLY** use a CRAP score of 30 as the threshold for crappiness. This means that you can have methods with cyclomatic complexity of 10 (which is pretty high in my opinion) – provided you have 75% code coverage for it – and not have it labeled as CRAP. Or you can have a method of complexity 2 without any tests (not that we’d recommend it) and not have the method labeled as crappy.
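The two examples above can be checked with a short sketch. The names CrapThreshold and isCrappy are hypothetical, and the check treats a score of exactly 30 as crappy, matching the "30 or higher" wording used for the project-level report.

```java
// Sketch of the per-method threshold check; names are illustrative, not crap4j's API.
final class CrapThreshold {

    // Initial threshold for crappiness, as described in the text.
    static final double CRAP_THRESHOLD = 30.0;

    // CRAP formula version 0.1; coveragePercent is in the range 0..100.
    static double crap(int complexity, double coveragePercent) {
        double uncovered = 1.0 - coveragePercent / 100.0;
        return complexity * complexity * uncovered * uncovered * uncovered + complexity;
    }

    // A method counts as crappy when its score is 30 or higher.
    static boolean isCrappy(int complexity, double coveragePercent) {
        return crap(complexity, coveragePercent) >= CRAP_THRESHOLD;
    }

    public static void main(String[] args) {
        // The two cases from the text: both stay below the threshold.
        System.out.println(crap(10, 75) + " crappy? " + isCrappy(10, 75)); // 11.5625 crappy? false
        System.out.println(crap(2, 0) + " crappy? " + isCrappy(2, 0));     // 6.0 crappy? false

        // The same complexity-10 method with less coverage.
        System.out.println(crap(10, 50) + " crappy? " + isCrappy(10, 50)); // 22.5 crappy? false
        System.out.println(crap(10, 25) + " crappy? " + isCrappy(10, 25)); // 52.1875 crappy? true
        System.out.println(crap(10, 0) + " crappy? " + isCrappy(10, 0));   // 110.0 crappy? true
    }
}
```

Because the score is never lower than the complexity, a method with cyclomatic complexity of 30 or more stays at or above this threshold no matter how well it is tested.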

Aggregate (Project-Level) Interpretation

At the project level, we report the percentage of methods above the CRAP threshold (i.e., all methods with a CRAP score of 30 or higher). We understand that nobody is perfect and that in some cases people have good excuses for having a few methods that are more complex, or have fewer tests, than the ideal. Project wide, we allow up to 5% crappy methods before labeling the entire project as crappy. Some people will think that this is too generous, others will think that it’s too draconian – as we gain experience we’ll adjust accordingly or let people set their own thresholds.
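A minimal sketch of the project-level rule, assuming the per-method CRAP scores have already been computed. The class and method names are hypothetical, the sample scores are invented, and treating an empty project as 0% crappy is an assumption of this example.

```java
import java.util.Arrays;

// Sketch of the aggregate rule; names are illustrative, not crap4j's API.
final class ProjectCrap {

    static final double CRAP_THRESHOLD = 30.0;    // a method is crappy at 30 or higher
    static final double MAX_CRAPPY_PERCENT = 5.0; // up to 5% crappy methods is tolerated

    // Percentage of methods whose CRAP score is at or above the threshold.
    static double crappyPercent(double[] methodScores) {
        if (methodScores.length == 0) {
            return 0.0; // assumption: a project without methods has nothing crappy in it
        }
        long crappy = Arrays.stream(methodScores)
                .filter(score -> score >= CRAP_THRESHOLD)
                .count();
        return 100.0 * crappy / methodScores.length;
    }

    // The project is labeled crappy once more than 5% of its methods are crappy.
    static boolean isProjectCrappy(double[] methodScores) {
        return crappyPercent(methodScores) > MAX_CRAPPY_PERCENT;
    }

    public static void main(String[] args) {
        // Invented data: 40 methods, most of them simple and tested.
        double[] scores = new double[40];
        Arrays.fill(scores, 4.0);
        scores[0] = 110.0;
        scores[1] = 30.0;
        // 2 of 40 methods are crappy: exactly 5%, which is still allowed.
        System.out.println(crappyPercent(scores) + "% crappy, project crappy: "
                + isProjectCrappy(scores)); // 5.0% crappy, project crappy: false

        scores[2] = 52.1875;
        // 3 of 40 methods are crappy: 7.5%, which is over the limit.
        System.out.println(crappyPercent(scores) + "% crappy, project crappy: "
                + isProjectCrappy(scores)); // 7.5% crappy, project crappy: true
    }
}
```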

CRAP Load

Crap4j also reports CRAP load, an estimate of the amount of work required to address crappy methods. It takes into account the amount of testing (with a small refactoring component) required for bringing a crappy method back into non-CRAP territory. Generally speaking, a CRAP load of N indicates that you have to write N tests to bring the project within the CRAP guidelines. This is even more experimental than the rest so we will not spend too much time on it at this point. I’ll blog more about it in the future.

That’s it for now. If this is of interest to you, it’s time to download it and start experimenting. Let us know what you think, how you’d change the metrics, improve the plug-in, etc.

Download and Installation Instructions

If you already have Eclipse, you can install the plug-ins from our update site at http://crap4j.org/downloads/update/ (if you don't have Eclipse, you can get it from our downloads page).

Follow the steps below to install crap4j and the JUnit Factory runner with built-in code coverage.

  1. In Eclipse, select Help > Software Updates > Find and Install
  2. Choose Search For New Features to Install and select Next
  3. Select New Remote Site
  4. Enter a name for this server: Crap4J
  5. Enter (or copy/paste) this url: http://crap4j.org/downloads/update/
  6. Install all plug-ins in the JUnit Factory category and restart Eclipse

Usage Instructions

Once installed you should see a distinctive – if not exactly tasteful – toilet-paper crap4J icon in the Eclipse toolbar.

  1. Select an open Eclipse project (i.e. click on the top-level project icon).
  2. Click on the crap4j icon.

Crap4j will automatically identify and run all the JUnit tests in the project, record the coverage information, and calculate the cyclomatic complexity for each method. After it’s done (it may take a while if you have a lot of tests to run), it will display the results in a new window. The results page shows high-level information and has links to more detailed pages (e.g. all methods sorted by complexity, coverage, CRAP, or CRAP load).