Should you care about closures?


If you are a frequent reader of this blog and the comments associated with it, you have probably run across the concepts of lambda calculus, closures, and anonymous functions. And if you stay strictly in the Microsoft ecosphere, you may have seen these terms suddenly springing up all over the place; the ideas are being talked about as if they were the greatest thing since the CPU. C# 3.0 brought closures to C# (2.0 had anonymous functions), and Visual Basic 2008 is getting them as well. What exactly is the point, and why should you care?

The first thing to understand is that these are not new concepts. Lambda calculus has been around for about 70 years. Closures have been around in languages since Scheme in the 1970s. C# 2.0, backwards as it may be, had anonymous functions (think of them as lightweight, limited-purpose closures). Perl, Ruby, and even ECMAScript (more commonly known as JavaScript) have had this functionality forever. For a good overview of the history and technical details of these ideas, Wikipedia's article on closures seems to be accurate (even if it is a bit difficult to follow).

In the .Net world, C# and VB are really only getting closures because they are needed for LINQ. Microsoft may not come right out and say, "Gee, we left a really useful feature out until we decided we needed it to support the LINQ gimmick," but articles like the recent one about lambda expressions in MSDN Magazine make it clear that, without LINQ, it would be 10 more years before VB.Net or C# saw closures. Here's a quote: "To support LINQ queries, a few features needed to be added; among them were Visual Basic and lambda expressions." So there you have it.

Lambda expressions, closures, etc. are now big news. So what? They have always been around. It's time to start using them; with all of the recent buzz, you can't pretend you have not heard of them anymore.

One of the things that has continuously irked me about nearly every one of the recent discussions on the topic is the brainless parroting of the usefulness of closures. "Pass them as delegates to another function." Yawn. "Now your code looks so much better, define a function where it is used!" Ho hum. "Type inference!" Isn't static, strong typing a safety feature that most VB.Net and C# developers rely upon? C'mon folks. Let's get real. It sounds to me like what developers are really saying is, "Closures let you pretend you are using a dynamic scripting language inside of a strongly typed compiled language!" Well, it is true, but it is not the point.

What I really, really like closures for is the ability to use them to create Domain Specific Languages (DSLs). The other thing I really like closures for is their use in projects where you need to be able to late bind entire portions of functionality. Of course, those two concepts are nearly synonymous. What I mean is: Closures allow the programmer to work pure, blinding magic at runtime, where the functionality of the software can be defined post-compilation with no editing of the source code. That functionality can come from configuration, a DSL (anything from code written in the same language as the program and lambda'ed or eval'ed at runtime to a full parser for a small, dynamic language), templating systems, and so on. Another interesting (albeit rare) usage of closures is in N-dimensional logic matrices.
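
To make the idea of functionality defined by configuration a little more concrete, here is a minimal sketch in Python (chosen only for brevity; the same shape works with C# lambdas). Everything in it is hypothetical: the `PRICING_CONFIG` table, the rule names, and the `make_rule` and `build_rules` helpers are invented for illustration. It assumes the configuration would normally be loaded from a file or database rather than hardcoded.

```python
from typing import Callable, Dict

# Hypothetical configuration; in practice this would be read at run time
# from a file or a database rather than written into the source.
PRICING_CONFIG = {
    "acme": {"rule": "percent_off", "amount": 10},
    "globex": {"rule": "flat_off", "amount": 5},
}


def make_rule(rule: str, amount: float) -> Callable[[float], float]:
    """Return a pricing function that closes over `amount`."""
    if rule == "percent_off":
        return lambda price: price * (100 - amount) / 100
    if rule == "flat_off":
        return lambda price: max(price - amount, 0)
    raise ValueError(f"unknown rule: {rule}")


def build_rules(config: Dict[str, dict]) -> Dict[str, Callable[[float], float]]:
    """Turn each configuration entry into a callable, keyed by customer name."""
    return {name: make_rule(c["rule"], c["amount"]) for name, c in config.items()}


rules = build_rules(PRICING_CONFIG)
print(rules["acme"](200))    # 180.0
print(rules["globex"](200))  # 195
```

Each customer ends up with its own function, and changing the configuration changes the behavior without touching the code that calls `rules[...]`.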

These are just a few of my common uses for closures. The stock list of examples and the recent press sell closures short. So go ahead -- get your hands dirty and try them out. With full support for them in C# 3.0 and VB.Net 2008, .Net developers have access to closures already (or will in a few months, depending on their language preference). Sadly, for Java programmers, it appears that closures are still not part of Java. Hopefully, Java will reach parity soon. I recommend that, at the very least, you start looking at them and seeing how they can simplify your life down the road.

J.Ja

About

Justin James is the Lead Architect for Conigent.

17 comments
Mark Miller

I know .Net does things differently, but where I've found closures to come in real handy is in dealing with collections of objects. I've probably pointed this out before, but in Smalltalk you can use:

    newCollection := collection select: [:each | each = filterCondition]

as a one-liner to get a filtered collection (the []'s go around the closure). If you wanted to, you could define a method that combines functions as well:

    newCollection := collection select: [:each | each = filter] andSort: [:first :second | first < second]

just to optimize things a bit. Typically such things will be done with Linq expressions, I assume. There is the yield keyword in .Net 2.0 that would do the same thing, but it's not as elegant, because you have to use it in a method that returns an enumerable iterator, which is implicitly called via a foreach loop outside the method.

I found this article at http://diditwith.net/2006/10/05/PerformanceOfForeachVsListForEach.aspx that says with optimization off List.ForEach(Action) is actually [i]faster[/i] than any other loop mechanism, which means:

    intList.ForEach(delegate(int i) {result += i;});

is faster than:

    foreach (int i in intList) result += i;

It's too bad the closure example is longer, but like you were saying, sometimes this would be cool to do for other implementation reasons. Now, with optimization on, he says a generic for-loop is faster than anything else.

I finally figured out what you were talking about on my blog about defining closures in member variables in classes and using them like methods. A while later I came upon a guy on the Squeak list who was storing Smalltalk code in a database, and loading it along with data. He would dynamically compile the code when the data was retrieved. It's typically best to do this with closures because if you want an object to execute methods on, the compiler has to know what receiver you want it to generate from the expression (that's my understanding anyway), and closures are a nice, generic receiver no matter what the expression is. I assume he was doing this for the same reasons you've seen of putting business rules into a database, which you and I have both seen in other venues. Every example I've seen of closures in .Net 3.0 code, however, seems to be hardcoded. How would you do this dynamically from a database?
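
For readers who do not know Smalltalk, the filter-and-sort and ForEach ideas above can be sketched in Python. This is only an illustration under assumptions: `make_at_least`, `sum_with_callback`, and the sample list are hypothetical names that do not come from the comment or the linked article.

```python
def make_at_least(threshold):
    """Return a predicate that remembers `threshold` from the enclosing scope."""
    return lambda value: value >= threshold


def sum_with_callback(values):
    """Mimic the ForEach(delegate ...) pattern: the callback updates `result`."""
    result = 0

    def accumulate(i):
        nonlocal result  # write to the variable captured from the enclosing scope
        result += i

    for v in values:
        accumulate(v)
    return result


numbers = [7, 2, 9, 4, 1]
at_least_four = make_at_least(4)

filtered = list(filter(at_least_four, numbers))
filtered_sorted = sorted(filter(at_least_four, numbers))

print(filtered)                    # [7, 9, 4]
print(filtered_sorted)             # [4, 7, 9]
print(sum_with_callback(numbers))  # 23
```

The predicate carries its threshold with it, so the code doing the filtering never needs to know where that value came from.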

dawgit

Is MS going to do LISP any time soon? :0 ( I can see the xkcd site having fun with this already. ) :p -d

apotheon

"[i]Wikipedia's article on closures seems to be accurate (even if it is a bit difficult to follow).[/i]"

I'll try to provide a simpler explanation: A lexical closure occurs when you close lexical scope on a function.

1. Lexical scope is also known as [url=http://en.wikipedia.org/wiki/Lexical_scope#Static_scoping][b]static scope[/b][/url].

2. In lexical scoping, there are very clear, specific rules for when something goes out of scope. When a function ends, for instance, anything declared inside it goes out of scope. Closures are basically the one and only exception to this.

3. A [url=http://en.wikipedia.org/wiki/First-class_function][b]first-class function[/b][/url] is a function that can be treated like variables, parameters, and return values -- essentially, like data.

4. To define a closure in the common case, you first create a scope with some lexically scoped declaration inside it, second create a function within that scope that refers to that lexically scoped declaration (usually a variable), and third return that function as a value to be assigned to a variable.

5. That function, now residing inside that variable, refers to the "parent" scope -- the scope within which it was defined. That scope has been "closed", in that it has ended [i]except for the closure's reference to it[/i]. This means that, short of bit-twiddling memory addresses, there's now no way to access the data from that parent scope except via the closure's interface (which may consist of nothing more than a simple function invocation). The scope is "closed" to the world for purposes of direct manipulation of the data it contains (again, short of dirty tricks like memory address bit-twiddling).

6. This provides greater encapsulation (i.e. [url=http://en.wikipedia.org/wiki/Information_hiding]information hiding[/url] and [url=http://en.wikipedia.org/wiki/Separation_of_concerns]separation of concerns[/url]) and protection (which doesn't seem to have a Wikipedia article, for some reason). In other words, it does much of what object oriented programming is designed to do, better than most object oriented languages can do it.

7. Finally . . . state (generally, "data") contained in a closure's parent scope is persistent. In other words, if you define a closure whose only purpose is to increment (by one) and output a value in its parent scope, you'll get a number one higher than last time as output every time you call that closure.

The canonical example of a closure is an incrementing function. Here's a Perl example:

[pre]sub closure_generator {
  my $value = 0;
  return sub { $value ++; print $value; }
}

$closure = closure_generator();[/pre]

The "my" keyword in Perl (for those who don't know the language) is used to declare lexical scope for a variable, and "sub" is used to declare a function (notice the function declared inside the closure_generator() function has no name -- it's "anonymous"). The above Perl code stores a [i]reference[/i] to the anonymous function in the $closure variable. Dereferencing syntax used to execute the function looks something like this:

[pre]$closure->();[/pre]

Try it, if you have Perl installed on your system. From a bash prompt, I executed the code I just posted thusly:

[pre]$ perl -le 'sub closure_generator {
> my $value = 0;
> return sub { $value ++; print $value; }
> }
> $closure = closure_generator();
> $closure->();
> $closure->();'
1
2[/pre]

The "1" and "2" lines are the output generated by the $closure->() invocations. The -l option sent to the perl binary (along with the -e option, which tells it to execute the code) inserts a newline character at the end of every print() statement.
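
For comparison, here is roughly the same incrementing closure as a Python sketch. It is an assumed translation rather than part of the original comment, and it differs from the Perl in one respect: the inner function returns the new value instead of printing it, so the caller decides what to do with it.

```python
def closure_generator():
    value = 0  # lives in the enclosing scope and outlives this call

    def increment():
        nonlocal value  # rebind the captured variable rather than a new local
        value += 1
        return value

    return increment


closure = closure_generator()
print(closure())  # 1
print(closure())  # 2

# A second call to the generator produces a closure with its own `value`.
other = closure_generator()
print(other())    # 1
```

The second generator call is the part worth noticing: each closure keeps its own private copy of the parent scope's state.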

Justin James

Are closures something you currently use or plan on using? Why or why not? J.Ja

Justin James

After all, Microsoft has F#, which is in the ML language family, and is extremely similar to OCaml. And F# has much of what is in Lisp. Are there some differences? Absolutely. But all the fundamentals are there. Microsoft is showing quite clearly that the DLR (Dynamic Language Runtime) is extremely capable. Remember, IronPython was originally an attempt to prove that .Net was weak, and the guy writing it ended up being really pro on it. :) So while Lisp in .Net (or some other Microsoft version of Lisp) would be surprising (they already have F# mainly), it would not be too surprising either. J.Ja

Justin James

That was a *much* better explanation than I could have given, thanks for posting it! The way I usually explain it glosses over the subject to the point where it seems only marginally different from an anonymous function... J.Ja

alaniane

Is your #7 use for a closure like using a static variable in a C function? For example:

    void foo(void)
    {
        static int cntr = 0;
        cntr++;
        printf("%i\n", cntr);
    }

If so, what is the advantage of using a closure over using a static variable?

jslarochelle

...from the start (or very early in the genesis of the language). When available early, closures can be integrated very well into the language and be very productive. Although I don't do anything fancy with closures (this must be the parrot in me ;-)), I am sure glad to have them well integrated with Ruby collections, database, and other classes. This is one reason why Ruby is very productive. JS

apotheon

Support for proper lexical closures is one of the first things I look for when evaluating a new language. If it doesn't support closures, that's a major failing, in my eyes. One of the reasons for this is the fact that support for closures actually indicates support for a number of other, prerequisite, capabilities -- such as lexical scoping and anonymous, first-class functions. The truth of the matter is that I don't really make use of closures all that often. When I need them, though, I'm always glad to have them.

Justin James

In your example, cntr is of type int, and you can perform any operation on it that you could perform on an int, as your example shows. In the Perl example above, the variable itself contains a function. In other words...

    TypeOf(cntr) -> int
    TypeOf($closure) -> sub
    TypeOf($closure->()) -> int

... to really munge pseudo C and pseudo Perl with some pseudo VB...

Because a function like this can be constructed from a string at run time (in languages that can eval code), it opens up all sorts of possibilities. Imagine the following code (again, pseudo VB):

    Dim sFunctionContents As String
    sFunctionContents = SomeDBConnection.ExecuteScalar("SELECT function_text FROM configuration WHERE function_id = 5 AND customer_code = 9")
    Dim fCustomerFunction As Function(sFunctionContents)
    Console.WriteLine(fCustomerFunction.Execute(oSomeParameter.ToString()))

Wow, that is not the world's best example, but I think it shows the gist. We are dynamically determining the function's contents based on something in a database, at run time. To re-phrase, we have late-bound functionality in a compiled language. Add some caching on fCustomerFunction (like a global collection with the customer and function IDs as keys) to avoid the DB lookup on subsequent calls, and you have an extremely powerful way to do things like implement multi-tenant, highly customized Web apps in a few lines of code.

I know I am off topic, in terms of answering the question, but I hope it makes the difference a bit more clear! J.Ja
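
The database-driven idea can be sketched as runnable Python, under several loudly stated assumptions. The in-memory SQLite table, its sample row, and the `load_customer_function` helper are hypothetical stand-ins for the configuration store in the pseudo VB. Strictly speaking, this uses `eval` to build a function from text rather than a closure in the lexical sense. Evaluating text from a database runs arbitrary code, so this is only reasonable when the stored text is fully trusted; emptying `__builtins__` narrows what the text can reach but is not a real sandbox.

```python
import sqlite3

# Hypothetical stand-in for the real configuration database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE configuration "
    "(function_id INTEGER, customer_code INTEGER, function_text TEXT)"
)
conn.execute(
    "INSERT INTO configuration VALUES (?, ?, ?)",
    (5, 9, 'lambda s: s.upper() + "!"'),
)

# Cache keyed by (customer_code, function_id) to avoid repeated DB lookups.
_function_cache = {}


def load_customer_function(db, function_id, customer_code):
    """Fetch function text from the database and turn it into a callable."""
    key = (customer_code, function_id)
    if key not in _function_cache:
        row = db.execute(
            "SELECT function_text FROM configuration "
            "WHERE function_id = ? AND customer_code = ?",
            (function_id, customer_code),
        ).fetchone()
        if row is None:
            raise KeyError(key)
        # WARNING: eval executes whatever text is stored; trusted input only.
        _function_cache[key] = eval(row[0], {"__builtins__": {}})
    return _function_cache[key]


customer_function = load_customer_function(conn, 5, 9)
print(customer_function("hello"))  # HELLO!
```

A second call with the same IDs returns the cached function object, which is the caching step described above.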

Justin James

You are right about the pre reqs to closures. I never thought about it, but you are right that having closures guarantees a lot of other useful features as well. J.Ja

Mark Miller

Something that puzzled me is Raman used situations like the following:

    int x = 0;
    int foo() { int x = 10; ... }

Back when I programmed in C this would have been considered a compile-time error, due to scoping rules. Since the first definition of x is "int x = 0", this definition will carry into foo(). It would be the equivalent of saying:

    int foo() { int x = 0; int x = 10; ... }

inside the function. So I'm curious why this works for him. I've seen other people do this sort of thing as well. Have the semantics of C changed to allow this?

Using what you said in another post about using a static variable inside a function, and "passing it around" using a function pointer, would this work?:

    int (*)(int) init_add(int i)
    {
        static int temp;
        temp = i;
        int add(int k) { return (temp + k); }
        return &add;
    }

    void main(void)
    {
        int (*pFn)(int) = init_add(2);
        int result = pFn(5);
        printf("%d\n", result);
    }

The static would be necessary because the value of the i parameter would disappear the moment you exited init_add().
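
One practical difference between the static-variable approach and a real closure shows up as soon as init_add() is called twice. The Python sketch below is an illustration only and does not answer the C syntax question; `init_add_static` and `_temp` are hypothetical names that imitate the single shared static with a module-level variable.

```python
def init_add(i):
    """Closure version: each returned function keeps its own `i`."""
    def add(k):
        return i + k
    return add


# Imitation of the C version: one shared "static" slot for every adder.
_temp = 0


def add_static(k):
    return _temp + k


def init_add_static(i):
    global _temp
    _temp = i  # overwrites the value every earlier adder relies on
    return add_static


add_two = init_add(2)
add_ten = init_add(10)
print(add_two(5))     # 7
print(add_ten(5))     # 15

static_two = init_add_static(2)
static_ten = init_add_static(10)
print(static_two(5))  # 15, because the shared value is now 10
print(static_ten(5))  # 15
```

With the closure, the two adders stay independent; with the shared static, the second initialization silently changes what the first adder does.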

apotheon

I figured I'd offer a URL for more information about closures in C: http://linuxgazette.net/112/ramankutty.html This guy clearly knows more about C than I do, so I'll defer to his expertise in this case. Among other important facts, he points out that ANSI C doesn't support nested functions -- to get that sort of behavior, you need GCC, and even that is not guaranteed to give you the lexical closure behavior you expect when building them with nested functions. The best (and only?) way to get proper first-class lexical closures out of C would probably be to create a closure using a nested function as he demonstrates, then return a pointer to the nested function as you exit the parent function.

apotheon

It depends on how you did it. If you had a static local variable inside a function that was passed around by way of a function pointer, you'd be a lot closer to a closure than simply having a function with a static local variable in it.

apotheon

A static variable is a variable whose scope is specific to where it was declared, and (in the case of one static to a given function) persists between function calls. While this does sound suspiciously like the behavior of variables within the parent scope of a closure, it is not the same as a closure. The closure is more closely related to the function within which a static variable exists than to the static variable itself. It is the containing scope of the closure that makes it a closure, when that scope is "closed". A single closure can contain references to several different variables that are, in effect, static variables. Asking whether a static local variable in C is like a closure is a bit like asking whether a high-performance tire is a sports car. In addition, the function that contains a static local variable in C is not the same as a closure, either -- though it's closer than the static local variable. For instance, a standard function in C can contain a static local variable, and every time you call that function you get its access to the static local variable, similar to the behavior of a closure to its lexically closed variable. You must, however, define a new function to contain a new static local variable every time you want an additional, separate example of that behavior, or similarly jump through some unwieldy hoops (like huge code factory functions, or a single function containing multiple static local variables for different use cases). The case of a closure is different from this in that a single, very simple function can be created as a closure generator (and a closure itself can be a closure generator, for that matter). For another example, consider the fact that a closure is a first-class function. This means that you can assign your closure to some label, and pass it around as such, in much the same way you would an integer value in C. 
While you can get similar behavior with pointers to the functions containing your static variables, this is a separate (and sometimes dangerous) means of achieving similar effects, with a different set of caveats and side-effects (in a casual use of the term "side-effect"). Closures of a sort can be achieved in C/C++, of course. The major difference between C/C++ closures and, say, Perl closures, is in the relative unwieldiness of closure generation and manipulation. I'm not 100% certain whether a C/C++ closure fits the strictest definition of a closure (I don't know C as well as I'd like, alas), but it's definitely close enough for corporate (or government) work.
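
Two of the points above, that a closure can refer to several variables and that one simple generator can stamp out many independent instances, can be sketched in Python. The `make_counter` name and its `start` and `step` parameters are hypothetical, chosen only for illustration.

```python
def make_counter(start, step):
    """Return two closures that share `current` and also capture `start` and `step`."""
    current = start

    def next_value():
        nonlocal current
        current += step
        return current

    def reset():
        nonlocal current
        current = start

    return next_value, reset


next_even, reset_even = make_counter(0, 2)
next_ten, _ = make_counter(100, 10)

print(next_even())  # 2
print(next_even())  # 4
print(next_ten())   # 110

reset_even()
print(next_even())  # 2
```

Getting the same effect with static locals in C would take a separate function (or a separate static) for every counter, which is the unwieldiness described above.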

The family Jules

what about using a function pointer instead of the int? would that basically be the same thing, then?

alaniane

What I was trying to understand is whether closures were similar to static variables used in C. It was based on what I had read in the previous comment. I am not that familiar with dynamic languages, so I was trying to place the concept of closures within a context that I was familiar with. Your comments do help me see where closures could be useful, and they help clarify the concept.
