id="info"

Developer

Memoize recursive functions to conserve resources

Memoization is an effective technique for conserving system resources, especially in languages that don't do tail call optimization.

Memoization is a form of caching that is used to reduce duplication of effort by your program. In short, it is a means of caching results so that when generating large data sets the same results do not need to be recalculated as part of an otherwise elegant algorithm.

The most common use of memoization is in recursion. Most programmers are aware of recursion; some even understand and use it. The majority of modern functional programming languages provide tail call optimization, but those languages that do not (usually object oriented or merely procedural) include some of the most widely used programming languages. Tail call optimization is typically considered in terms of an interpreter or compiler that optimizes a recursive function so that it will not perform the same operations over and over again. In languages that lack tail call optimization, a similar effect is achieved through memoization.

An example of an algorithm that could benefit greatly from tail call optimization or memoization is the recursive definition of a Fibonacci number:

F(0) = 0

F(1) = 1

F(n > 1) = F(n-1) + F(n-2)

This is a prime example of the importance of optimization in programming. The simplistic recursive source code for this in Ruby would look something like this:

def fib(n)

return n if [0,1].include? n

fib(n-1) + fib(n-2)

end

Unfortunately, the most widely used production version of Ruby today, Ruby 1.8.7, does not support tail call optimization (look for it in Ruby 1.9+). If n = 4 in the above code, it ends up being calculated like this:

fib(4)

(4 - 1)

+ (4 - 2)

(((4 - 1) - 1) + ((4 - 1) - 2))

+ (((4 - 2) - 1) + ((4 - 2) - 2))

((((4 - 1) - 1) - 1) + (((4 - 1) - 1) - 2)) + (3 - 2)

+ (2 - 1) + (2 - 2)

(((3 - 1) - 1) + ((3 - 1) - 2)) + 1

+ 1 + 0

((2 - 1) + (2 - 2) + 1

+ 1

(1 + 0) + 1

+ 1

1 + 1

+ 1

2

+ 1

3

That's a lot of effort just to get the number three. The problem is that, until the numbers start getting down to 0 or 1, every operation requires two sub-operations. With a high enough Fibonacci number to start the process, the number of operations required gets absolutely insane. Using the Ruby REPL, you can see that the fourth Fibonacci number requires nine calls to the fib() method, and the fifth Fibonacci number requires 15 calls:

> irb

irb(main):001:0> $opcount = 0

=> 0

irb(main):002:0> def fib(n)

irb(main):003:1> $opcount += 1

irb(main):004:1> return n if [0,1].include? n

irb(main):005:1> fib(n-1) + fib(n-2)

irb(main):006:1> end

=> nil

irb(main):007:0> fib 4

=> 3

irb(main):008:0> $opcount

=> 9

irb(main):009:0> $opcount = 0

=> 0

irb(main):010:0> fib 5

=> 5

irb(main):011:0> $opcount

=> 15

By the time you get to fib 20, you have 21,891 calls to fib(). fib 30 took more than 10 seconds to complete in a test run, and 2,692,537 calls to fib().

Memoization greatly reduces the number of such operations that must be performed, by only requiring each Fibonacci number to be calculated once. For the simple version, start by creating an array to hold already calculated numbers. For each calculation, store the result in that array. For each time a recursive function would normally calculate one of those numbers, check to see if the number is stored in your array; if so, use that, and if not, calculate and store it. As a jump-start to the array, set the first two elements to 0 and 1, respectively.

In Ruby, a Fibonacci number generator might be modified to look like this:

$fibno = [0,1]

def fib(n)

return n if $fibno.include? n

($fibno[n-1] ||= fib(n-1)) + ($fibno[n-2] ||= fib(n-2))

end

By caching values as you go so that fewer recursive calls are needed, you can get the result of fib 30 pretty much instantaneously. The total number of recursive calls to fib() is reduced from 2,692,537 to a mere 29. In fact, the number of calls to fib() increases linearly, so that the number of calls is always equal to the ordinal value of the Fibonacci number you want minus one. That is, fib 50 makes 49 calls to fib(), and fib 100 makes 99 calls to fib().

That assumes you reset $fibno every time. You can leave it alone, and reduce the number of calls to fib() even more on subsequent calls. For instance, try fib 100 with $fibno = [0,1], and 99 calls to fib() will be made. Try fib 40 without resetting $fibno, though, and only one call to fib() will be made, because $fibno already contains the appropriate value.

You can also use a somewhat simpler approach to caching than the above example. Instead of the number of calls to fib() only increasing by one for each increase in the ordinal Fibonacci value, it increases by two, resulting in 59 operations instead of 29 for fib 30:

$fibno = [0,1]

def fib(n)

$fibno[n] ||= fib(n-1) + fib(n-2)

end

Similar caching mechanisms can be used to achieve similar effects in other languages that do not optimize tail calls, such as C, Java, and Perl. In fact, in Perl this caching idiom has a special name: the Orcish Maneuver. The name comes from the or-equals operator, ||=, which can be pronounced "or cache" in cases such as memoization. Say "or cache" very quickly, and you get the name of something that bears the stamp of a favorite fantasy monster. Perhaps this is how an Orc would optimize a recursive function, after all.

In Perl, the term Orcish Maneuver is typically applied to sorting functions rather than recursive series generation functions as in the case of memoizing a Fibonacci number generator. The canonical Perl example of the Orcish Maneuver looks something like this:

my @words = (

'four',

'one',

'three',

'two'

);

my %numbers = (

'one' => 1,

'two' => 2,

'three' => 3,

'four' => 4,

);

my @numerically_sorted_words;

sub str2n {

my $key = shift;

$numbers{$key};

}

{ my %cache;

@numerically_sorted_words = sort {

($cache{$a} ||= str2n($a)) <=> ($cache{$b} ||= str2n($b));

} @words;

}

foreach (@numerically_sorted_words) { print "$_\n"; }

A probably more useful application of Perl's Orcish Maneuver would be for month names, but this example at least shows how the maneuver is used.

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

Editor's Picks