Understandability metrics

Why do we need understandable code? Projects with low understandability are error-prone and hard to maintain. Writing understandable code involves things like comment density, naming standards, structured or object-oriented programming, and programming style.

Several metrics exist for evaluating understandability.

LEN* Length of names

The longer the names of procedures, variables, constants, controls etc., the more descriptive they probably are. 'a' is not a good variable name, 'Age' is better, but 'EmployeeAge' is much more descriptive. Generally, names consisting of 1 or 2 letters are not good. What is enough depends on your language and the application you're making.

The following "length of names" metrics are defined by Project Metrics. They all consider only those names that are defined in VB source code files. Names defined in binary libraries are never taken into account for these metrics. The optional type definition character ($%&@#^!) is not counted in the length of a name, nor are any parentheses () for array types.
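The counting rule above can be sketched in a few lines. This is only an illustration, not Project Metrics' own implementation: the helper names name_length and average_length are hypothetical, and the sketch assumes each name arrives as the raw declared text (for example "Names$()").

```python
# Type definition characters that are not counted in the length of a name.
TYPE_CHARS = "$%&@#^!"


def name_length(declared_name: str) -> int:
    """Length of a name, ignoring array parentheses and a trailing type character.

    Hypothetical helper: assumes the input is the name as written in a declaration.
    """
    name = declared_name.strip()
    # Drop array parentheses such as "Totals()" or "Grid(10, 10)".
    paren = name.find("(")
    if paren != -1:
        name = name[:paren]
    # Drop one optional trailing type definition character.
    if name and name[-1] in TYPE_CHARS:
        name = name[:-1]
    return len(name)


def average_length(names) -> float:
    """Average name length over a collection of declared names (0.0 if empty)."""
    lengths = [name_length(n) for n in names]
    return sum(lengths) / len(lengths) if lengths else 0.0
```

With this sketch, "Count%" and "Names$()" both have length 5, and the names 'a', 'Age' and 'EmployeeAge' average to 5 characters.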

LEN Length of names

LEN = Average length of all names

LEN considers all names declared in the program. It includes everything that's in LENV, LENC and LENP, plus Types, Type fields, Enums, Enum constants, controls, classes, structures, modules, projects and compiler constants.

LENV Length of variable names

LENV = Average length of all variable names

LENV includes all variables, arrays and parameters with the following exceptions:

LENVgm Length of global and module-level variable names

LENVgm = Length of a global or module-level variable name

LENVgm/var = Average length of global and module-level variable names

This version reports the length of global and module-level variable names. Unlike LENV, LENVgm does not count parameters or local variables.

LENC Length of constant names

LENC = Average length of constant names

LENC considers the names of regular Const constants but not Enum constants nor compiler constants (#Const).

LENP Length of procedure names

LENP = Average length of all procedure names

LENP considers all procedure names with the following exceptions:

LENP and LENP/proc. LENP is available in two versions: 1) the project-level average LENP (defined above) and 2) the length of an individual procedure name, which is also called LENP. You can find the former in the project-level metrics and the latter on the Procedures tab. From the latter you can calculate the average length LENP/proc. Although LENP and LENP/proc are both average lengths of procedure names, their values can differ because they cover different sets of procedures: LENP excludes some procedures that LENP/proc includes, and vice versa. LENP is specifically designed to reflect the naming that is under the programmer's own control, whereas LENP/proc includes procedures whose names are "fixed" in one way or another. We recommend that you use LENP, not LENP/proc.

How long is long enough?

Names should be descriptive. The longer the name, the more descriptive it is likely to be.

As to the average length of variable names, optimal lengths such as 9..15 and 10..16 characters have been found. This suggests that a good value of LENV is between 9 and 16. Similarly, it also suggests that the project-level average LENVgm/VARS should be between 9 and 16. Local variables tend to be temporary, and their naming requirements are less strict than those of global and module-level variables. Since LENV also considers local variables, it can be lower than LENVgm/VARS, which doesn't consider them.

Procedures are more complex, so their names should be more descriptive. Object-oriented programming allows short procedure names, though. A good method name can be a single verb and a good property name can be a single noun. Therefore it's hard to say whether procedure names should be shorter or longer than variable names.

Name Uniqueness Ratio (UNIQ)

When two program entities have the same name, they can get mixed up. UNIQ measures the uniqueness of all names.

UNIQ = Number of unique names / total number of names

All the names in LEN are also counted in UNIQ.
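As a rough sketch of the UNIQ formula, the ratio can be computed from a flat list of declared names. The function name uniq is hypothetical, and the case-insensitive comparison is an assumption made here because VB identifiers are not case-sensitive; the tool's exact matching rules may differ.

```python
def uniq(names) -> float:
    """Number of unique names divided by the total number of names.

    Assumption: names are compared case-insensitively, as VB identifiers are.
    Returns 0.0 for an empty list to avoid dividing by zero.
    """
    names = list(names)
    if not names:
        return 0.0
    unique_names = {name.lower() for name in names}
    return len(unique_names) / len(names)
```

For example, the four names Username, Count, username and Total contain three unique names, so UNIQ is 0.75. A value of 1 means that no name is declared twice.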

It's acceptable to use the same name at many locations. However, the name should refer to the same logical thing. For example, variable Username should always contain the same type of username in the same data type (string). If Username can mean one thing in one procedure and another thing somewhere else, the likelihood of confusion increases.

Complexity

Complex code isn't likely to be understandable.

Lines of code

The longer a procedure gets, the harder it probably is to understand. Therefore, watch the lines of code measures. Because comments and whitespace contribute to understandability, total line counts aren't as useful as logical code line counts.

Comment density (MCOMM%)

The more comments in your code, the easier it is to read and understand. Whitespace is also important for legibility.

Not all comments describe the code. Some comments such as '---------- are mere separators. It makes sense to pay attention to just those comments that mean something, that is, meaningful comments. For the details, see the definition of comments and meaningful comments.
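The idea of filtering out separator comments can be sketched as follows. Both rules in this sketch are assumptions made for illustration, not the tool's definitions: a comment is treated as meaningful when it contains at least one letter or digit, and the density is taken as meaningful comments per 100 lines of code. The function names are hypothetical.

```python
def is_meaningful_comment(comment: str) -> bool:
    """Assumed rule: a comment is meaningful if it contains a letter or digit."""
    text = comment.lstrip()
    # Remove the leading VB comment marker, if present.
    if text.startswith("'"):
        text = text[1:]
    return any(ch.isalnum() for ch in text)


def comment_density(comments, code_line_count: int) -> float:
    """Meaningful comments as a percentage of code lines (assumed definition)."""
    if code_line_count == 0:
        return 0.0
    meaningful = sum(1 for c in comments if is_meaningful_comment(c))
    return 100.0 * meaningful / code_line_count
```

Under these assumptions, a separator such as '---------- is not counted, so a 10-line procedure with one separator and one descriptive comment has a density of 10%.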

To increase the comment density of your code, you can watch for the Uncommented code problem with the problem detection feature of Project Analyzer.

Comment recommendations

Every procedure should have at least one meaningful comment, to briefly describe the function of the procedure. It's also advisable to describe each parameter and the return value of a function. For each parameter, indicate the range of values expected and also the range returned, if the parameter is passed by reference. If the procedure accesses global or module-level data, it's advisable to note this too. Sometimes it's advisable to include a short list of the procedure's callers or callees.

The standard location for a procedure comment is immediately after the procedure declaration line (Sub/Function/Property). Project Analyzer treats comment lines attached to the procedure declaration as the procedure description according to the following rule: Comment lines immediately before and after the procedure declaration up to the next whitespace or code line.
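The attachment rule above can be sketched as a small function. This is an illustration only, not Project Analyzer's parser: the function name procedure_description is hypothetical, the caller is assumed to supply the index of the declaration line, and only apostrophe comments are recognized (Rem comments and line continuations are ignored).

```python
def is_comment_line(line: str) -> bool:
    """True if the line consists only of an apostrophe comment."""
    return line.lstrip().startswith("'")


def procedure_description(lines, decl_index: int):
    """Collect comment lines attached to the declaration at lines[decl_index].

    Walks upward and downward from the declaration and stops at the first
    blank line or code line in each direction.
    """
    before = []
    i = decl_index - 1
    while i >= 0 and is_comment_line(lines[i]):
        before.append(lines[i].strip())
        i -= 1
    before.reverse()  # restore source order

    after = []
    i = decl_index + 1
    while i < len(lines) and is_comment_line(lines[i]):
        after.append(lines[i].strip())
        i += 1
    return before + after
```

A comment separated from the declaration by a blank line, or placed after the first code line of the body, is not part of the description in this sketch.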

Project Analyzer supports a special notation for writing comment headers. The comments can be used in Project Printer to generate a Comment Manual.

Besides comment headers, it's advisable to write comments inside the procedure body, especially if the procedure is long or complex. These comments should describe the internals of the procedure, and not its features or the calling interface.

How many comments are enough?

When MCOMM% is below 10%, you might want to start worrying. Of course, this isn't an exact limit. If your code is self-descriptive, you don't need so many comments.

IBM studies have found that clarity seemed to peak at a density of roughly one comment per 10 statements. Both lower and higher densities reduced understandability (Capers Jones 2000). In terms of Project Metrics, this approximates to an optimal MCOMM% of 10%.

© Project Analyzer Help