Lines of code metrics (LOC)

The simplest way to measure the size of a program is to count the lines. This is the oldest and most widely used size metric.

Many ways to count the lines

Lines of code, or LOC, looks like a simple concept. However, it's not. There are several ways to count the lines. Depending on what you count, you get a low or a high line count. In the table below you can see various alternatives. The "Supported as" column shows which metrics Project Metrics supports.

Metric | Supported as | Description
Physical lines | LINES | This metric counts the physical lines, but excludes classic VB form definitions and attributes.
Physical lines of code | (not supported) | This type of metric counts the lines but excludes empty lines and comments. This is sometimes referred to as the source lines of code (sLOC) metric.
Logical lines | LLINES | A logical line covers one or more physical lines. Two or more physical lines can be joined as one logical line with the line continuation sequence " _". The LLINES metric counts a joined line just once regardless of how many physical lines there are in it.
Logical lines of code | LLOC | A logical line of code is one that contains actual source code. An empty line or a comment line is not counted in LLOC.
Statements | STMT | This is not a line count, but a statement count. Visual Basic programs typically contain one statement per line of code. However, it's possible to put several statements on one line by using the colon ":" or writing single-line If..Then statements.

The use of line counts

The physical line count (LINES) is a simple but imperfect way to measure code size. Since a logical line can span several physical lines, the physical line count exaggerates code size. Another common problem is that empty (or whitespace) lines, as well as comments, are included in the count. With improper line counts, you can appear really productive by hitting the Enter key, or alternatively, pretend that you are writing tighter code by deleting all comments.

The logical lines of code metric (LLOC) has both advantages and disadvantages. It is a simple measure, easy to understand, and widely used. You can use it to measure productivity, although you need to be cautious, because programming style can have an impact on the values. You can also estimate the number of defects per 1000 LLOC.

Line counts are notorious in that they vary between programming languages and coding styles. A line of VB code is not the same as a line of C++ code. Implementing a feature in VB6 may require more effort (or maybe less) than it would take in VB.NET. Line counts are especially imperfect for measuring programmers' performance. One programmer may produce a large number of lines, while another spends a long time and succeeds in squeezing the same function into a small space. Developers also work on other things than just producing more and more code, such as documentation, planning and testing. Also be careful when paying for delivered code lines, as there are many ways to bloat the figure.

LINES Physical lines

LINES = Number of lines

This is the simplest line count. Each line ends with a line break, usually CR+LF. LINES counts every line, be it code, a comment or an empty line.

For classic VB, the LINES metric, along with every other line count, excludes the (invisible) class and form declaration lines at the start of .frm and .cls files. These lines are not code, but contain declarations for forms, controls and properties. The source files may also include (invisible) Attribute statements containing various attributes for procedures and variables. These statements are counted as code if they exist among your code. There's an exception: Attribute statements are not counted when they're part of a module header, that is, they exist at the start of a file before any source code. In VB.NET, <attribute> definitions are counted just like normal code.

Only source files are included in the line counts. A source file is one that has source code in it. Some of the file types excluded are project files, solution files, binary files, resource files, HTML files and other related files.

Some simple line count utilities may count the invisible declarative code at the start of .frm and .cls files. One should not use such a utility to measure the code size of classic VB projects.

Maximum procedure length?

To avoid overly long procedures, you might want to set a maximum limit to LINES for procedures. There are several recommendations for the maximum. Pick your preference.

Max 66 lines
LINES <= 66. The procedure fits on one page when printed.
Max 150 lines
LINES <= 150. A recommendation for Java.
Max 200 lines
LINES <= 200. The procedure fits on 3 pages.

Some problems are easier to solve with a long procedure instead of several shorter procedures. You may wish to use two limits: a lower warning limit (such as 66) and a higher maximum limit (such as 200). The idea is to review the longish procedures in the middle range. If the procedure can be split, do it, but it can also be left alone if it works better as a long procedure. Only if a procedure exceeds the maximum limit should it be split in any case.
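
The two-limit idea can be sketched in a few lines of Python. This is only an illustration, not how Project Metrics implements its limits: the function name, the constant names and the sample procedures are hypothetical, and the thresholds are the example values 66 and 200 mentioned above.

```python
# Hypothetical thresholds, taken from the example limits discussed above.
WARNING_LIMIT = 66    # the procedure fits on one printed page
MAXIMUM_LIMIT = 200   # the higher maximum limit


def review_procedure_length(lines: int) -> str:
    """Classify a procedure by its LINES value using two limits."""
    if lines > MAXIMUM_LIMIT:
        return "split"    # exceeds the maximum: split in any case
    if lines > WARNING_LIMIT:
        return "review"   # middle range: split only if it works better
    return "ok"


# Hypothetical procedure names and LINES values, for illustration only.
procedures = {"LoadForm": 40, "ParseReport": 120, "DoEverything": 350}
for name, lines in procedures.items():
    print(f"{name}: LINES={lines} -> {review_procedure_length(lines)}")
```

Only the "review" range calls for human judgment; the other two outcomes are mechanical.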

Maximum file length?

To avoid overly long files, you might want to set a maximum limit to LINES for files. Again, you need to pick your preference.

Max 1000 lines
LINES <= 1000. This file size accommodates 15 one-page procedures or 100 short 10-line procedures.
Max 2000 lines
LINES <= 2000. A recommendation for Java. This limit also ensures the file can be reasonably flowcharted with Visustin.

Instead of limiting the file length, you may consider limiting the number of procedures in it by setting a maximum limit on the PROCS metric.

Minimum file length?

You can also have a minimum limit so as to avoid empty or nearly empty files. Some useful limits are:

Ban empty files
LINES >= 1
Ban very short files
LINES >= 5

Logical vs. physical lines

A physical line count is straightforward: it is simply the regular line count. What is a logical line then?

Where a statement or a comment is split over two or more physical lines, the physical lines together count as one logical line. A logical line ends where the statement or comment ends.

In Visual Basic one splits a logical line using the " _" line continuation sequence. In VB2008 and later, it is also possible to leave out the " _" sequence in certain cases.

To be exact, a whitespace line can be continued with " _" too, even though this is silly coding and rarely seen.
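
As a rough illustration, the following Python sketch joins physical lines into logical lines using the explicit " _" sequence only. It is a simplification under stated assumptions: the function name is hypothetical, the implicit continuation allowed by newer VB.NET versions is not handled, and a " _" that happens to end a line inside a comment or string is not treated specially.

```python
def join_logical_lines(source: str) -> list[str]:
    """Join physical lines that end in " _" into logical lines.

    Simplified sketch: only the explicit " _" sequence is recognized.
    """
    logical = []
    parts = []  # pieces of the logical line collected so far
    for physical in source.splitlines():
        stripped = physical.rstrip()
        if stripped.endswith(" _"):
            parts.append(stripped[:-2])  # drop the " _" and keep collecting
        else:
            parts.append(physical)
            logical.append(" ".join(p.strip() for p in parts).strip())
            parts = []
    if parts:  # the file ended in the middle of a continued line
        logical.append(" ".join(p.strip() for p in parts).strip())
    return logical


sample = (
    "Dim total As Long\n"
    "total = first + _\n"
    "        second\n"
    "\n"
    "' done\n"
)
logical = join_logical_lines(sample)
print("LINES  =", len(sample.splitlines()))
print("LLINES =", len(logical))
```

In the sample, the continued assignment occupies two physical lines but yields a single logical line.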

LLOC Logical lines of code

LLOC = Number of logical lines of code

A logical line is a logical line of code if it has any other content than just a comment or whitespace. Thus, all executable lines, as well as declarative lines, are counted in LLOC. One or more statements followed by an end-of-line comment is a line of code. A full-line comment is not a line of code. An empty line (or a line with just whitespace characters) is not a line of code either.

Compiler directives (#Const, #If etc.) are counted as code. However, code that is excluded by a False condition in an #If .. Then .. #ElseIf .. #Else .. #End If block is not counted as code. In fact, it's not counted as whitespace or comment either. It's not a part of your program in the analyzed configuration, so it doesn't really have any meaning. It is included in the physical line count (LINES), though.

In summary, LLOC counts all logical lines except the following:

  1. Full comment lines (LLOC')
  2. Whitespace lines (LLOW)
  3. Lines excluded by compiler conditional directives
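
The first two exclusions can be sketched in Python as a simple classifier. This is a minimal illustration, not the tool's actual algorithm: the function names are hypothetical, the input is assumed to be already joined into logical lines, and lines excluded by conditional compilation are not detected at all.

```python
from collections import Counter


def classify_logical_line(line: str) -> str:
    """Return "LLOW", "LLOC'" or "LLOC" for one logical line."""
    text = line.strip()
    if not text:
        return "LLOW"                      # empty or whitespace only
    lowered = text.lower()
    if (text.startswith("'") or lowered == "rem"
            or lowered.startswith(("rem ", "rem\t"))):
        return "LLOC'"                     # full comment line
    return "LLOC"                          # anything else is code


def count_logical_lines(logical_lines: list[str]) -> dict[str, int]:
    """Count LLOC, LLOC' and LLOW, plus their sum LLINES."""
    counts = Counter({"LLOC": 0, "LLOC'": 0, "LLOW": 0})
    counts.update(classify_logical_line(line) for line in logical_lines)
    result = dict(counts)
    result["LLINES"] = result["LLOC"] + result["LLOC'"] + result["LLOW"]
    return result


sample_lines = [
    "Sub Hello()",
    "' Greets the user",
    "",
    'MsgBox "Hi" \' end-of-line comment',
    "10 ' numbered line with only a comment",
    "End Sub",
]
counts = count_logical_lines(sample_lines)
print(counts)
```

A line with code followed by an end-of-line comment lands in LLOC, and so does a numbered line whose remainder is only a comment, because neither starts with a comment marker.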

LLOC is a good measure of the size of your program. What is more, it is a good estimate of the complexity of a single file, class or procedure. Since LLOC is not affected by comments, blanks or line continuation, it's a handy way to measure the amount of the actual programming work. A program with a higher LLOC almost certainly "does more" than a program with a lower LLOC. When you add features, LLOC increases. When you delete features, LLOC should decrease. If you delete features and LLOC stays the same, you may have forgotten to delete the code that was left unused.

As a special case, line numbers and line labels count as code. If your code uses line numbering, all numbered lines count as lines of code, even if the rest of the line is blank or a comment. In fact, there are no comment or whitespace lines in line numbered code. For this reason, the logical line counts (LLOC, LLOC', LLOW, any metrics derived from these metrics) are not suitable for measuring fully line numbered code.

Maximum acceptable LLOC?

It's a matter of coding style to define a maximum limit for LLOC. Since LLOC excludes empty and comment lines, the maximum acceptable LLOC is lower than the maximum LINES. The following limits have been suggested for Java:

Procedure LLOC <= 50
Class LLOC <= 1500
File LLOC <= 2000
Source: Checkstyle utility for Java

Minimum acceptable LLOC?

A procedure, class or file should not be empty. It should contain at least some code to be of any use. If it's totally empty (or if it contains just empty lines or comments), it doesn't serve a purpose. Here are the minimum limits:

Procedure LLOC >= 3
Class LLOC >= 3
File LLOC >= 1

A minimum useful procedure contains 3 lines of code. A regular procedure already consists of at least 2 lines of code: the procedure header line and the end line (Sub and End Sub). To make the procedure do any work, it should thus have at least 3 lines of code.

Exception. In classic VB, an interface class can have empty methods containing just 2 lines of code.

There are specific procedure types that consist of just one line. Examples are Declare statements, Event statements, and VB.NET abstract procedure declarations, such as procedure definitions in an Interface and MustInherit declarations in a class. These "codeless" procedures are not listed in Project Metrics, so they don't really count as an exception to the minimum rules above.

On the other hand, a VB.NET property accessor consists of 3 or 4 lines of code even when the accessor body (Get/Set) is empty.

A minimum useful class contains 3 lines of code. In classic VB, the minimum useful class has a procedure with one statement (thus 3 lines of code). In VB.NET, the minimum useful class also consists of 3 lines: Class, End Class and a variable declaration with an initializer.

Exception. The minimum useful classic VB class (.cls file) can consist of just 1 or 2 lines of code. It can be an interface class (class with one empty method, or a class with some Public variables). Alternatively, it can represent a user-defined data type (class containing some Public variables). A regular class, on the other hand, should always have at least 3 lines.

A minimum useful file contains one line of code. The line can be a constant or a global variable declaration, for example.

LLOW Logical Lines of Whitespace (blank line count)

LLOW = Number of logical lines that are either empty or contain whitespace characters only

A whitespace line is either 1) an empty line, or 2) a line with nothing else than spaces, tabs or other whitespace characters. Each empty or whitespace-only logical line is counted in LLOW. That means spaces and tabulation don't affect the counting.

LLOW is almost exactly the same as "the number of blank lines", or Physical Lines of Whitespace. There's a pathological case where LLOW differs from the number of physical blank lines. That's when you join two empty lines with the line continuation character, as in the following example:

' Pathological whitespace line follows:
  _

' Pathological whitespace line above

When you write "_" to join two empty lines, it counts as two physical lines (LINES=2), but just one logical line of whitespace (LLOW=1). It's not a code line (LLOC=0). LLOW is calculated from logical lines to make it comparable to LLOC and LLOC'.

One should use whitespace wisely to add readability to code. You can use the whitespace percentage (LLOW%, see below) as an indication of whether you have enough empty lines in your code. You should set your own target values for this metric based on what you feel is readable.

LLOC' and MCOMM – Counting the comments

A comment in VB is a statement that starts with an apostrophe (') or the Rem keyword. Project Metrics defines the following comment metrics.

Logical Lines of Comment LLOC' = Number of full comment lines

LLOC' does not include any end-of-line comments, only the full comment lines. A line with both code and commentation is counted in LLOC, not in LLOC'.

By contrast, the Meaningful Comments metric (MCOMM) considers both the full comment lines and the end-of-line comments.

Meaningful Comments MCOMM = Number of meaningful full-line and end-of-line comments

MCOMM counts only meaningful comments and ignores meaningless comments. A meaningful comment is a comment with textual content, even if as short as three consecutive characters. A blank comment or a comment with only punctuation doesn't have a meaning, so it is not counted as a meaningful comment. In addition, comments starting with a dollar sign ($) are not counted as meaningful comments since they are interpreted as special Comment directives in Project Analyzer. In VB.NET, comments starting with UPGRADE_ are not meaningful: they have been generated by the Upgrade Wizard and should be removed eventually. Comments consisting of a single repeated letter are taken as banners and not counted in MCOMM.

Examples of meaningful comments
' ABC
' --- ABC ---
' Return value = y + 2000
Examples of meaningless comments
'
' ---------------
' -=-=-=-=-=-=-=-
' xxxxxxxxxxxxxxx
' $PROBHIDE ALL
' AB
' x = y + 2000
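
The rules and examples above can be approximated with a small Python check. It is a sketch under assumptions, not the exact rule set used by Project Analyzer: the function name is hypothetical, and "three consecutive characters" is read here as three consecutive letters, which is the interpretation that agrees with every example listed above.

```python
import re


def is_meaningful_comment(comment: str) -> bool:
    """Approximate the MCOMM rules for one comment, marker included."""
    text = comment.strip()
    # Remove the comment marker: an apostrophe or the Rem keyword.
    marker = re.match(r"'|rem(?=\s|$)", text, flags=re.IGNORECASE)
    if marker:
        text = text[marker.end():].strip()
    if text.startswith("$"):           # special comment directive
        return False
    if text.startswith("UPGRADE_"):    # Upgrade Wizard output (VB.NET)
        return False
    if re.fullmatch(r"([^\W\d_])\1*", text):
        return False                   # banner: one letter repeated
    # Assumption: textual content means three consecutive letters.
    return re.search(r"[^\W\d_]{3}", text) is not None


examples = [
    "' ABC", "' --- ABC ---", "' Return value = y + 2000",
    "'", "' ---------------", "' -=-=-=-=-=-=-=-", "' xxxxxxxxxxxxxxx",
    "' $PROBHIDE ALL", "' AB", "' x = y + 2000",
]
for example in examples:
    print(repr(example), is_meaningful_comment(example))
```

The first three examples come out as meaningful and the remaining seven as meaningless, matching the two lists above.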

MCOMM% Comment density

Comment density MCOMM% = MCOMM / LLOC

MCOMM% measures how many meaningful comments there are for each logical line of code.

As a special case, when there are no code lines, MCOMM% is defined as zero.

We suggest that MCOMM% be at least 20%. This means one comment for every 5 code lines. Naturally, the amount of commentation is not the only issue; what you write in the comments matters too. If you use comment templates with information on copyright, developer, last modified date and other non-technical information, you should require a high MCOMM%, since the comments should also describe the code, not just the development process. On the other hand, if your code is simple, uses consistent naming and is easy to read, you can probably do with fewer comments.
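
The density formula, including its zero special case, can be written as a short Python function. The function name and the sample counts are hypothetical and serve only to illustrate the arithmetic.

```python
SUGGESTED_MINIMUM = 0.20  # the 20% suggestion: one comment per 5 code lines


def comment_density(mcomm: int, lloc: int) -> float:
    """Return MCOMM% as a fraction: MCOMM / LLOC, or zero without code."""
    if lloc == 0:
        return 0.0  # special case: no code lines
    return mcomm / lloc


# Hypothetical counts for one file, for illustration only.
density = comment_density(mcomm=12, lloc=80)
print(f"MCOMM% = {density:.0%}")
print("meets the suggestion:", density >= SUGGESTED_MINIMUM)
```

With 12 meaningful comments for 80 lines of code the density is 15%, which falls short of the suggested 20%.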

Change in Project Analyzer v7.1: MCOMM and MCOMM% values reported by Project Analyzer v7.1 are not comparable to those reported by earlier versions. The counting rules have been refined to exclude some comment types that are not meaningful. Thus, newer MCOMM and MCOMM% values may be lower. A significantly lower MCOMM or MCOMM% should not be interpreted as suddenly degraded quality in this case.

Multiline comments

All the comment metrics (LLOC', MCOMM and MCOMM%) are based on logical lines. In classic VB, you could (in theory) write a multiline comment using the line continuation character "_". Writing a multiline comment is not a good practice, since you can always write two comment lines separately. In VB.NET, multiline comments are not possible.

Commented-out code

Old code that has been commented out counts as comment. What is more, it also counts as a meaningful comment. (This happens because it isn't easy to programmatically distinguish real comments from commented-out code.) Code that has been commented out exaggerates LLOC' and MCOMM. Exceptionally high LLOC' or MCOMM may indicate the presence of a large amount of commented-out code.

To prevent adverse effects on metrics, old code should be either deleted or excluded using conditional compilation:

#If old Then
   ... old code ...
#End If

Since excluded lines are not counted in LLOC, LLOC', LLOW or MCOMM, they don't affect these metrics in any way. Excluded lines are counted in LINES, though.

LLINES Logical Lines (Total)

When you sum up all the logical lines of code, comment and whitespace, you get the total number of logical lines.

LLINES = LLOC + LLOC' + LLOW

It's usual that LLINES is somewhat less than LINES. In no case should LLINES exceed LINES.

Code, comment and whitespace percentages

Code percentage LLOC% = LLOC/LLINES
Comment percentage LLOC'% = LLOC'/LLINES
Whitespace percentage LLOW% = LLOW/LLINES

These three percentages measure the relative amount of code, comments and whitespace lines. They are counted from logical lines, and they sum up to 100%.

LLOC% + LLOC'% + LLOW% = 100%
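
A brief Python sketch of the three percentages follows. The function name and the sample counts are hypothetical, and returning zeros for an input with no logical lines is an assumption made here, since that case is not defined above.

```python
def line_percentages(lloc: int, lloc_comment: int, llow: int) -> dict[str, float]:
    """Return LLOC%, LLOC'% and LLOW% computed from the logical line counts."""
    llines = lloc + lloc_comment + llow  # LLINES = LLOC + LLOC' + LLOW
    if llines == 0:
        # Assumption: report zeros when there are no logical lines at all.
        return {"LLOC%": 0.0, "LLOC'%": 0.0, "LLOW%": 0.0}
    return {
        "LLOC%": 100 * lloc / llines,
        "LLOC'%": 100 * lloc_comment / llines,
        "LLOW%": 100 * llow / llines,
    }


# Hypothetical counts for one file, for illustration only.
shares = line_percentages(lloc=70, lloc_comment=20, llow=10)
for name, value in shares.items():
    print(f"{name} = {value:.1f}%")
print("sum =", sum(shares.values()))
```

Because all three shares use the same denominator, they add up to 100% for any non-empty input.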

How many comments?

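Both LLOC'% and MCOMM% measure the amount of commentation. LLOC'% is the share of full comment lines among all logical lines, whereas MCOMM% relates the meaningful comments, end-of-line comments included, to the lines of code. Which measure to use depends on what you intend to do.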

How much whitespace?

The amount of whitespace is a matter of programming style. Adding blank lines improves legibility up to a point. Too many blanks will make reading harder, though, as one has to scroll through more pages than otherwise necessary.

A study by Gorla, Benander and Benander compared debug time against the amount of whitespace lines. The study was performed on COBOL. In this study the optimal amount of blanks was 14% to 20% in DATA DIVISION code and 8% to 16% in PROCEDURE DIVISION code. Programs with fewer or more blanks required more debug time.

Compared to Visual Basic, COBOL DATA DIVISION is roughly equivalent to data declarations in Visual Basic. That is the (declarations) section, Dim, Const and Type statements and the like. PROCEDURE DIVISION is the equivalent of executable procedural code.

Interpreting the result for Visual Basic development, it seems safe to assume that LLOW% values 8% to 16% are all right. This also suggests that more whitespace should be used in data declarations than in executable code. Slightly exceeding 16% should not be a big problem, but one should probably avoid too high values such as over 30%.

Reference

System size

What is a large project? Here is our suggestion for classification of Visual Basic project sizes. We base our classification on the total number of physical lines, excluding control definitions, as this is the easiest way to measure code size.

LINES Size
0..9,999 Small
10,000..49,999 Medium
50,000..99,999 Semi-large
100,000..499,999 Large
500,000.. Very large
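
The classification table can be expressed as a lookup in Python. The function and constant names are hypothetical; the boundaries are the ones given in the table above.

```python
from bisect import bisect_right

# First LINES value of each class after "Small", as given in the table above.
CLASS_STARTS = [10_000, 50_000, 100_000, 500_000]
SIZE_CLASSES = ["Small", "Medium", "Semi-large", "Large", "Very large"]


def classify_project_size(lines: int) -> str:
    """Map a project's physical line count (LINES) to its size class."""
    if lines < 0:
        raise ValueError("LINES cannot be negative")
    # bisect_right counts how many class boundaries have been reached.
    return SIZE_CLASSES[bisect_right(CLASS_STARTS, lines)]


for lines in (9_999, 10_000, 75_000, 499_999, 500_000):
    print(f"{lines:>7} -> {classify_project_size(lines)}")
```

Keeping the boundaries in one list makes it easy to substitute your own classification for another language.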

The classification is based on our long-time experience with Visual Basic projects. As programming languages differ in their uses and power of expression, this classification may not be directly usable for other languages.

See also LLOCt Lines in call tree

LLOCt measures the lines in a call tree.

©Aivosto Oy - Project Analyzer Help Contents