Loops make programs run slowly. A few unoptimal lines can make an app run at a snail's pace. This article presents performance tricks for squeezing the max speed out of your code. The focus is on processing data arrays in loops. We restructure loops, rebuild function calls, fine-tune conditionals, choose fast operators, pre-calculate values and access arrays the proper way.
This article features tips for the optimization of numeric computation, repetitive execution, recursion and arrays. If your program processes data arrays too slowly, these ideas can give your app a real performance boost. On the other hand, if your application is slow at processing strings, read Optimized string handling instead.
The optimization tips are generic and apply to many programming languages. The code examples are in Visual Basic.
In this article:
Speed up the loops
Loops are often the performance bottlenecks of an application. The key to speed up the program is to make the loops run faster. Nested loops are often where the problems lie in.
Consider an imaging application as an example. It stores pixel images in simple arrays such as Data(x,y) or data[x,y]. To read, process and save the images, it needs to process the arrays element by element, pixel by pixel. This is done in two nested loops: the y direction and the x direction. A typical piece of code looks like this:
For y = 0 to Height-1 ' Iterate through y axis For x = 0 to Width-1 ' Iterate through x axis process the pixel at (x,y) ... Next x Next y
The inner code runs through each pixel, a total of x*y times, to process the pixels one after the other. As it happens, the program runs at a snail's pace. What can we do about it?
Focus on the inner loop
With nested loops, put your efforts on the innermost loop. It's the one that executes the largest number of times. This is where you usually get the most out of your efforts. Each statement and expression really makes a difference as it executes a million times. Here you can make miracles by eliminating unnecessary statements, choosing the right operators and picking the fastest data types.
Move loop condition to end-of-loop
Most programming languages have at least two types of loops. One is the pre-condition loop where your loop condition is at the start of the loop. The other is the post-condition loop where the condition is at the end.
Note on loop #2: In some languages, such as Pascal, the post-condition loop must
be written with an "until" condition such as
Here's the optimization trick: If the loop should execute at least once, use a post-condition loop. What's the deal? You save one evaluation! Assume the loop runs n times. Loop #1 evaluates its condition n+1 times, but loop #2 evaluates only n times.
When to use this trick? It makes sense if the loop is nested inside another loop and execution enters the loop a large number of times. It also makes sense if evaluating the loop condition takes a considerable amount of time.
Be sure not to overuse this technique. You can't use a post-condition loop if the loop should be skipped sometimes (executed zero times). Then you must use the pre-condition loop.
Move loop invariants out of the loop
A loop invariant is an expression whose value doesn't change during the loop. In the following example, y*Width is evaluated on every iteration of the inner loop, even though its value is always the same. Width doesn't change during the loop, nor does y change within the inner loop. y does change at each iteration of the outer loop, but not before.
For y = 0 to Height-1 For x = 0 to Width-1 ' y*Width is invariant i = y*Width + x Process i Next x Next y
To make the loop faster, move the invariant out of the inner loop:
For y = 0 to Height-1 Temp = y*Width For x = 0 to Width-1 i = Temp + x Process i Next x Next y
Your compiler might optimize loop invariants for you. Check to see if it really does. Compile both examples and measure the performance. If the performance is the same, you're lucky since you don't need to optimize this manually.
Use pre-calculated values
This technique is a special case of loop invariants. Replace invariant expressions with pre-calculated values (constants). Consider this simple innocent-looking expression:
x = y And Not 192
What's wrong? You possibly evaluate
Const z = Not 192 x = y And z
Again, your compiler might do this for you, so check it out.
Use lookup tables
If you don't have a single constant value, but a set of alternating values, you can't use the loop invariant techniques described above. You can, however, use lookup tables for the same effect. Consider this code:
x = 2 ^ (16 - (y And 15))
We assume y is anything from 0 up to the maximum integer. As you can see, x is calculated as a of power 2, the exponent varying from 1 to 16. (y And 15) varies between 0..15, so the exponent is between 1..16. This expression is relatively slow since it contains the slow exponential operator ^. It is a great case for a lookup table. You pre-calculate the values and store them in an array. Reading from the array is faster than re-evaluating the same 16 values over and over again.
Here's how you build the lookup table:
' Powers of 2 (from 2^16 to 2^1) Dim Pow(0 To 15) As Long For i = 0 to 15 Pow(i) = 2 ^ (16 - i) Next ... ' Use the lookup table x = Pow(y And 15) ' same as x = 2 ^ (16 - (y And 15))
If you know in advance the values that y can take, you can expand the lookup table. Instead of (y And 15), calculate the table for (y). In your application, y might take values between 0..255 or 0..65535. In this case it could make sense to pre- calculate a larger table, such as Pow(0 To 255) or Pow(0 to 65535). This way you would get rid of the And 15 part and save some calculation in your loops.
Some loops run faster if you write them out. Consider this simple loop:
For y = 0 to 7 i(y) = z And 2^y Next
You can expand this short loop as:
i(0) = z And 1 i(1) = z And 2 i(2) = z And 4 i(3) = z And 8 i(4) = z And 16 i(5) = z And 32 i(6) = z And 64 i(7) = z And 128
This is a speed vs. code size issue. It can make a big difference if the formula is slow. Rewriting it "open" (instead of re-evaluating the formula over and over) can make the code a lot faster. The expression above may be too simple to justify expansion. To make the point clear, consider this artificial example:
For y = 0 to 7 i(y) = j(0) And 2^y _ + j(1) And 2^(y-1) _ + j(2) And 2^(y-2) _ + j(3) And 2^(y-3) Next
Executing this line in VB6 will take a long time. Expanding it using the above technique will boost the performance considerably. You expand each of 2^y, 2^(y-1), 2^(y-2) and 2^(y-3). Since you know these values in advance, you don't need to re-calculate them over and over.
Speed up operations on data
Eliminate unnecessary temporary variables
Programs often store intermediate results in a temporary variable before using it further. Here is a typical example:
For y = 0 to Height-1 For x = 0 to Width-1 temp = calculate(x, y) Process temp Next x Next y
Many times you can make the program faster by eliminating the temporary variable. You save the effort of unnecessarily storing the value:
For y = 0 to Height-1 For x = 0 to Width-1 Process calculate(x, y) Next x Next y
The benefits of doing this vary by the language. Some compilers automatically optimize this for you, so removing temp manually makes no difference.
Another typical example is this:
For y = 0 to Height-1 For x = 0 to Width-1 result = calculate(x, y) Next x Next y
Here you don't actually use the value of result for anything. For all your purposes, result is effectively dead. It is only consuming system resources. Again, some compilers will eliminate such an unused variable on your behalf. For languages that don't have this feature, such as the good old Visual Basic, you can use a code analyzer (see Project Analyzer) to locate the dead variables for you.
Choose faster operators
Some mathematical operators take more to run than others. Testing the actual performance in your environment and with your data types is recommended. Here are some examples in VB6:
Note: You may need to write
Choose faster data types
Picking the right data type can make a huge difference. One might assume, for example, that a smaller data type (such as Byte) is faster than a larger data type (such as 32-bit Long integer). Well, this isn't the case. In VB6, Long is the fastest integer data type. It's also the native data type of the processor. Even if the value fits in a byte, put it in a Long if maximum speed is what you are after. (This advice doesn't necessarily work for large arrays, where there is the tradeoff between memory and speed.)
If you can, use integer data instead of floating point. Unnecessary use of floating point values makes a big performance hit.
With data types, there is a classic tradeoff between memory usage and speed. The article Save memory discusses variable data types and their memory consumption.
Optimize the control flow
Control flow optimization is an important part of giving your programs a performance boost. Sometimes you need to twist your code so that it's harder to read but runs faster.
The order of the
If condition Then rare_block Else usual_block End If
Let's assume you have noticed that the program usually jumps to the usual_block. The rare_block executes only a few times. Unfortunately, jumping down to the usual_block takes too much time. Rewrite this as:
If Not condition Then usual_block Else rare_block End If
Now the usual_block is on top and you just saved someone from getting fed up with your slow program.
The same idea works also for
Short-circuiting speeds up complex conditional statements. You can use it for conditionals with AND/OR operators.
In languages where operators AND/OR are not short-circuited, executing an expression such as "x And y" and "x Or y" evaluates both x and y before deciding what the outcome is. If the operators are short-circuited, y is not necessarily evaluated at all. The following table describes this in detail.
For optimization purposes the benefit of short-circuiting is that you sometimes get away without evaluating the latter operand (y). If y happens to be a time-intensive operation, such as a function call, your program executes faster.
The potential pitfall with short-circuiting is that y is not always evaluated (function not called). This could introduce a bug that is hard to notice.
Languages differ in whether their operators are short-circuited or
not. In Visual Basic 6.0 there is no short-circuiting. In C, on the
other hand, the operators && and || are always short-circuited.
In VB.NET, you have a choice.
In languages that don't have short-circuited operators, you can simulate them with the following constructs:
Inline function calls
Repeated function calls add overhead. Each time you call a function, you spend a little time for things like passing parameters and saving the function return value.
Inlining means you copy & paste the called function into the caller. [Well, a simple copy & paste is not always enough, you need to take the parameters and local variables into account, too.] It works best if the inlined function is small and not very complex. If the function to inline is a complex one, the amount of rework (refactoring) may be considerable.
Inlining is easier with languages that support macros or special inlined functions. Macros and inlined functions are automatically inlined for you at compile-time, so you don't need to do it manually.
Recursion is a nice technique where a function calls itself repeatedly until the work is done. It's often the most elegant way to code some algorithm.
Unfortunately, recursion also adds the overhead of a function call. If the recursion tree gets deep, you spend the time with the calls instead of the real computation.
You remove recursion by replacing it with a loop. Sometimes it is enough to use GoTo to jump to the start of the function. Sometimes you need to do more, like arrange parameter values and local variables.
Visual Basic 6.0 optimization tricks
The following lists a number of VB6 specific optimization tricks. Other than VB6 coders can skip them.
VB6 file access tricks
There is no need to read/write files byte by byte. You can read/write large blocks at a time.
Read byte array from file
' Allocate space for count bytes ReDim array(1 to count) As Byte ' Read count bytes from file Get #filenum,, array()
Read entire file to byte array
' Allocate space for entire file ReDim array(1 to LOF(filenum)) As Byte ' Read entire file Get #filenum, 1, array()
Save byte array to file
Put #filenum,, array()
Copying large data blocks in VB6 is slow. You can use a Windows system call to do that. CopyMemory copies a given number of bytes from one memory location to the other.
Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _ (Destination As Any, Source As Any, ByVal Length As Long)
CopyMemory is a bit tricky to get right the first time. Here's how you use the function.
CopyMemory Destination, Source, numbytes
CopyMemory: Copy an array
' Source data is numbytes in size ReDim Source(0 to numbytes-1) ' Allocate memory for destination array ReDim Destination(0 to numbytes-1) CopyMemory Destination(0), Source(0), numbytes
It is important that there are at least numbytes bytes of data in both Destination() and Source(). Otherwise your program will crash. The equivalent VB code for array copy is:
Destination() = Source()
This the safer way to copy an array, but it's also the slower way.
CopyMemory: Assign a byte array to UDT
You can copy a byte array over a user-defined type (UDT) variable. This way you can access the various fields of the UDT directly without having to interpret the data from the raw bytes. Here is how you do it:
' Target variable Dim UDT as MyUDT ' Source data in byte array, size LenB(UDT) bytes ReDim Source(0 to LenB(UDT)-1) ' Copy source to target CopyMemory UDT, Source(0), LenB(UDT)
In plain English, you copy exactly the UDT size bytes from the byte array over the UDT. The call LenB(UDT) returns the number of bytes required. The data will automatically arrange itself into the various fields of the UDT. — Note how you need to call LenB instead of Len. Keep in mind that Len <= LenB. UDTs can contain extra alignment bytes or Unicode Strings, which occupy 2 bytes per character. Len doesn't count the extra bytes. The effect is that you get an incomplete copy with the end bytes not being copied at all.
You can also do it vice versa. Once you have populated the UDT fields, copy it to a byte array.
CopyMemory and type conversions
You can do many other unorthodox type conversions with CopyMemory. You can, for example, copy a 2-dimensional array over a 1-dimensional array. This way you can convert a 2D (x,y) image into a simple list of bytes for further processing.
VB6 ReDim Preserve – use it sparingly
When you want to reallocate more (or less) space for an array, you
You are better off allocating enough space first. When the processing
is done, deallocate the extra by a single
Read more: Allocate arrays wisely
VB6 With..End With block
You may have heard that the
With array(x) If .a = 0 Then .a = 1 .b = 2 ...
If the condition
If array(x).a = 0 Then With array(x) .a = 1 .b = 2 ...
VB6 Pass initial values as parameters
You can save time by passing initial variable values as parameters. Consider the following example procedure, which uses a local variable x with an initial value of 2:
Sub Test() Dim x As Long x = 2 ... End Sub
This doesn't look too bad, really. But wait... you can make it faster! You can pass the initial value as a parameter:
Sub Test(ByVal x as Long) ... End Sub ' Caller: Test 2
The last line calls Test giving
The trick works because VB6 doesn't have a variable initializer clause. Integer variables are initialized to zero. When you want a non-zero initial value, you need to execute an assignment statement (
This is nowhere near a good programming practice. After all, callers must pass 2 as the parameter, otherwise the program will fail. On the other hand, end-users will surely value faster execution over nice coding. Just document it well.
VB6 Use zero initial values
As an alternative to the above trick, you can pick local variables so that their initial values are zero. Instead of a 1-based array index use a 0-based index, for example.
VB6 variables and arrays
Static variables are soooo static
Avoid the use of
Module-level or local variables?
Oddly enough, the speed difference between accessing a module-level variable (class variable) vs. a local variable is not constant. Sometimes it's faster to read/write a local, sometimes a module-level variable. Experiment.
You can replace a module-level variable with a local copy. Use a local temporary variable and write the changes back at the end of the procedure. If this makes sense, that is.
Dynamic array is faster than fixed-size array
Dynamic arrays are faster to access than static arrays. Consider the following alternatives:
It is faster to access the array you allocated with
(End of Visual Basic 6.0 specific tricks)
Don't process the data unless you have to. This sounds obvious, but sometimes you see code that runs in vain. Consider a function that arranges an array of bytes. If the bytes are already in the correct arrangement, there is no reason to rearrange. An example is a sort routine that sorts data that was already sorted, or an imaging application that optimizes the image palette even if it was optimal already.
Process as little data as you can. Consider an imaging application that compresses image bytes. The fewer bytes there are to compress, the faster it runs. To keep the amount of data down, use the lowest color depth (bits per pixel) possible. The image can be represented with fewer bytes and the compression runs faster.
Pre-process the data. If your code spends its time doing lookups on data, store the data so that the lookups are faster. As an example, you can sort the data and do the lookups in the sorted data, which is generally much faster than searching in unsorted data.
Use available system functions. They often run faster than what you can possibly code. Learn the various functions of your programming language, framework and the operating system. As an example, searching for a character in a string in VB6 is faster done with the built-in InStr function than reading through the string character by character.
Optimization requires documentation
Code optimization can make the program look like spaghetti. The more
you optimize, the harder the code gets to read. When you replace
recursion by a
Therefore, remember to document your work. Even if the original code was clear enough to go without comments, the optimized version probably needs them badly. Comment out the old code and write an explanation what was changed and why. If the optimized code is too hard to understand, the original lines can make it clear how it used to work. Explain why you chose And instead of Mod and what the reason was.
Keep in mind that optimizing a piece of code could change the requirements of calling it. As an example, replacing "x Mod y" with "x And (y-1)" requires that x >= 0 and y is a power of 2. Similarly, replacing floating point division with integer division could make a difference with some special input values (due to rounding). Moving the loop condition to end-of-loop absolutely requires the loop must execute at least once. Document the changed requirements as you optimize. It will make life much easier later on.
Finding the bottlenecks
Most of the performance gains are often due to a limited number of changes in certain key locations in your code. What are these locations and how can you find them?
A critical location is executed thousands or hundreds of thousands of times. It may be inside a loop or a recursive algorithm. The location may consist of a handful of procedures or certain lines of code inside them.
It's not always possible to tell a bottleneck by just looking at the code searching for loops or recursion. A code profiler, such as VB Watch, is useful for finding the critical locations. It logs the execution times as you run your program. When ready, it tells you which procedures or lines were executed the highest number of times, and which ones took the longest time to execute. These places are the best candidates for manual optimization work.