String literal analysis and report


This page describes two reports: String literal analysis and String literal report. The analysis is described first.

String literal analysis

String literal analysis reports the use of string data in a project or solution. You can use it to estimate the space required for strings in an executable and to find duplicate strings that could be merged to save space. This feature is available in the Report menu.

A large executable can store hundreds of kilobytes of string data, and a large portion of it may consist of unnecessary duplicates. Project Analyzer users have reported duplication rates of up to 55%. At that rate, the executable uses more bytes to store the duplicates than the unique strings.

Sample report: String literal analysis

What is in this analysis?

The string literal analysis goes through your source code searching for string literals. A literal is a string that is contained within double quotation marks "...". The strings are grouped by size and the amount of storage space for each size group is reported.

When the string literals are compiled into an executable, they usually take 2 bytes for each character plus a certain amount of overhead bytes. The exact number of bytes depends on your Visual Basic version. You can see the bytes required by your VB version on the report.
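The rule of thumb above can be written as a small helper. This is only a sketch: EstimatedLiteralBytes and its OverheadBytes parameter are hypothetical names, and the overhead value is an assumption you must replace with the figure the report shows for your VB version.

```vb
' Rough size estimate for one ordinary string literal.
' OverheadBytes is a placeholder: take the real figure for your VB version from the report.
Public Function EstimatedLiteralBytes(ByVal Text As String, ByVal OverheadBytes As Long) As Long
    ' 2 bytes per character, plus the per-literal overhead.
    EstimatedLiteralBytes = 2 * Len(Text) + OverheadBytes
End Function
```

With a purely illustrative overhead of 6 bytes, a 10-character literal would be estimated at 2 * 10 + 6 = 26 bytes. The estimate does not apply to the exceptional strings, such as literals in Declare statements and attributes.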

Certain strings are exceptional and take a different amount of space. Among these are the literals in Declare statements and attributes. In VB Classic, an attribute is what you see as an Attribute line. In VB.NET, an attribute is enclosed within <angle brackets>.
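For orientation, here is where such literals typically appear. This is an illustrative VB.NET fragment, not taken from the report; the module name Example and the routine OldRoutine are made up for the sketch.

```vb
Imports System

' In a VB Classic module file, an attribute literal looks like this line:
'   Attribute VB_Name = "Module1"

Module Example
    ' The library name "kernel32" is a literal inside a Declare statement.
    Private Declare Function GetTickCount Lib "kernel32" () As Integer

    ' The message is a literal inside a VB.NET attribute (angle brackets).
    <Obsolete("Use NewRoutine instead.")> _
    Public Sub OldRoutine()
    End Sub
End Module
```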

Unnecessary literals

If you use a lot of string literals in your code, you should remove any unnecessary ones. A literal is unnecessary if it is never used at run-time. This happens, for example, when the literal sits in a dead procedure, is the value of a dead constant, or is only assigned to a variable that is written but never read.

To reduce the size of your executables, remove any dead procedures, dead constants and written-only variables with the help of the Problem report or Auto-fix.

Handling the duplicates

Eliminating duplicate literals may require quite an effort if the amount of code is large. You get a list of duplicates at the end of the report, sorted by their length.

To eliminate duplicate literals, declare a new constant for each duplicated string and replace every occurrence of the literal with that constant.
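A minimal sketch of this replacement follows. The constant name MSG_OPEN_FAILED, the procedure ReportOpenFailure and the message text are invented for illustration; the fragment is meant to be placed inside a module.

```vb
' Before: the same literal is typed in two places.
'   MsgBox("The file could not be opened.")
'   LogMessage = "The file could not be opened."

' After: one constant holds the text and both places refer to it.
Private Const MSG_OPEN_FAILED As String = "The file could not be opened."

Public Sub ReportOpenFailure()
    Dim LogMessage As String
    MsgBox(MSG_OPEN_FAILED)
    LogMessage = MSG_OPEN_FAILED
End Sub
```

A side benefit is that the wording now needs to be corrected or translated in one place only.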

Eliminating the duplicates is not necessary in VB.NET because the compiler takes care of this automatically at compile time. You may still want to get rid of the duplicates in preparation for localization, to reduce the amount of translation work.

Const concatenation

String constants defined in terms of other string constants may have a surprising effect. A string constant formed by concatenating other constants with the & and + operators is stored in the executable as a separate literal, in addition to the constants it is built from (numeric constants are different). Consider the following example.

Const A = "aaaaaaaaaaaaaaaaaaaa"
Const B = "bbbbbbbbbbbbbbbbbbbb"

If you need the value of A & B, what should you do? Declaring a new constant will store all of "aaaaaaaaaaaaaaaaaaaa", "bbbbbbbbbbbbbbbbbbbb" and "aaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbb" in the executable.

Const AB = A & B ' Store in executable

Concatenation at run-time saves space in the executable but the code may run slower.

Dim AB As String
AB = A & B       ' Run-time concatenation

The alternative you choose depends on whether you value size or speed. If the strings are short, you might want to take the Const route, but if they are long, you should probably concatenate them at run-time.

Special cases

The report does not consider control properties on VB Classic forms. Neither does it analyze constants formed by concatenating other constants with the & and + operators.

Attributes and Declare statements are not included in the duplicate analysis. Their memory requirements are reported separately as they are stored differently in the executable.

Strings found in compiler directives, such as #Const, #If and #Region, are excluded from the analysis. Comments are not included either.

Notes on multi-project analysis

When analyzing multiple projects at the same time, string literal analysis reports duplicates over all the files. As a shared literal needs to be compiled into each executable file, the actual number of bytes required to store all the strings is larger than reported. Similarly, the savings achieved by eliminating duplicates may be smaller than they would be if all the code were compiled into a single executable.

Thus, if you have several related projects, consider the following. To evaluate memory requirements and to save memory by eliminating duplicates, analyze a single project at a time. To centralize all strings into constants, use multi-project analysis.

If there is a large amount of data to share between executables, consider the use of a resource DLL or external files. This way the same data does not have to be compiled into several executables.

String literal report

A related report, the String literal report, lists all string literals found in the code along with the line where they are to be found. It is intended for review, translation and spell checking purposes. The list is sorted alphabetically.

There is also an alphabetical listing of all strings. You can feed this list into a spell checker.

This report includes a section with all string literals that are not pure ASCII. Non-ASCII characters may not display properly on all systems, especially those set up for a different language or code page. This section lets you find potentially problematic characters for portability tests. When running a test, make sure these strings display correctly on your target systems. This kind of test is especially necessary for classic VB, which uses ANSI for character output. Use Unicode where possible to make sure these strings don't turn into garbage.
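If you want to make a similar check on strings at run-time, for instance on text loaded from a file, a test along the following lines may serve. This is a sketch in VB Classic syntax and not the report's own implementation; the function name HasNonAscii is hypothetical.

```vb
' VB Classic sketch: returns True if Text contains any character outside 7-bit ASCII.
Public Function HasNonAscii(ByVal Text As String) As Boolean
    Dim i As Long
    Dim Code As Long
    For i = 1 To Len(Text)
        Code = AscW(Mid$(Text, i, 1))
        ' AscW returns a signed Integer in VB Classic, so high code points come back negative.
        If Code < 0 Or Code > 127 Then
            HasNonAscii = True
            Exit Function
        End If
    Next i
    ' Falls through with the default return value False when every character is ASCII.
End Function
```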

At the end of the report you can find a list of all the different characters used in string literals. Use this to detect any special characters you might not have noticed otherwise.

Sample report: String literal report

See also


Project Analyzer Help