
Understand how to use and implement Ruby blocks

The ubiquity, power, and elegance of the Ruby block make it an important feature of the language, one that any Rubyist should know how to use and implement.

Ruby has the concept of a special type of code "block." These Ruby blocks are essentially syntactic sugar for passing a lambda (an anonymous function) to a method as an argument. Possibly the most common example of a Ruby block is the each block iterator:

foo = [0,1,2,3,4,5,6,7,8,9]
vals = [0,1]

foo.each do |n|
  print "#{n}: "
  puts vals[n] ||= vals[n-1] + vals[n-2]
end

In this example, which iteratively generates a Fibonacci sequence from an array of integers, each number in the array foo is passed to the block in turn as the parameter n, and the code in the block operates on that number. The syntax gives the code a clean appearance that makes the operation look much more direct than it actually is behind the scenes. To show the minimum that each has to do to make that example work, consider this naive implementation:

class Array
  def my_each
    for element in self
      yield element
    end
  end
end

foo = [0,1,2,3,4,5,6,7,8,9]
vals = [0,1]

foo.my_each do |n|
  print "#{n}: "
  puts vals[n] ||= vals[n-1] + vals[n-2]
end
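
Because yield evaluates to whatever the block returns, the same pattern can collect results instead of only running side effects. The following is a minimal sketch of that idea, not a description of how Ruby's built-in map is implemented; my_map is a hypothetical name chosen for illustration.

```ruby
class Array
  # Hypothetical helper for illustration; Ruby already provides Array#map.
  def my_map
    results = []
    for element in self
      # yield passes element to the block and evaluates to the block's return value
      results << yield(element)
    end
    results
  end
end

squares = [1, 2, 3].my_map { |n| n * n }
p squares   # prints [1, 4, 9]
```

The only difference from my_each is that the value of each yield is kept rather than discarded.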

Not all blocks in Ruby are iterators, however. For instance, the File class comes with an open method that can take a block as an argument. This is used to neatly group file operations with the code that opens the file and provides several benefits, including automatically closing the open file when the block is finished executing. An example of this method in action follows:

File.open('/home/username/foo.txt') do |file|
  # the block goes to File.open itself, so the file is closed when the block ends
  file.each_line do |line|
    if line.match(/foo/)
      puts line
    else
      puts 'fooless line'
    end
  end
end

The alternative approach, which is more common in other languages, is to do something like this Ruby example:

foo_file = File.open('/home/username/foo.txt')

for line in foo_file
  if line.match(/foo/)
    puts line
  else
    puts 'fooless line'
  end
end

foo_file.close

While this is a fairly trivial example, a more complex series of operations on an opened file makes it easier for a programmer using the second approach to forget to close the file when finished. This is especially problematic when a loop performs the same operation over and over with a new file each time: thousands of files may be opened and never closed if the closing operation never executes, whether because of an error in the code or because the programmer never wrote it.
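
A method can offer the same guarantee to its callers by wrapping yield in begin/ensure. The sketch below shows the general pattern only; open_then_close is a hypothetical helper, not part of Ruby, and the example uses a temporary file so that it does not depend on any real path.

```ruby
require 'tempfile'

# Hypothetical helper, not part of Ruby; it sketches the open-with-a-block pattern.
def open_then_close(path)
  file = File.open(path)
  return file unless block_given?   # without a block, the caller must close the file

  begin
    yield file
  ensure
    # ensure runs whether the block finishes normally or raises an exception
    file.close
  end
end

Tempfile.create('foo') do |tmp|
  tmp.write("foo\nbar\n")
  tmp.flush

  handle = nil
  open_then_close(tmp.path) do |f|
    handle = f
    f.each_line { |line| puts line if line.match(/foo/) }
  end
  puts handle.closed?   # prints true
end
```

Putting the close call in ensure is what protects against the "never executes" case: the cleanup happens even if the code inside the block fails partway through.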

This File.open example shows a method that optionally takes a block, whereas the my_each method above fails with a LocalJumpError as soon as it reaches yield without one. Luckily for someone who wants to create a method that optionally takes a block, the method block_given? offers a simple solution:

class Array
  def my_each
    for element in self
      if block_given?
        yield element
      else
        puts element
      end
    end
  end
end
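
Printing each element is only one possible fallback. Ruby's own each returns an Enumerator when it is called without a block, and a custom iterator can do the same with to_enum. This is a sketch of that alternative, assuming that behavior is what you want from my_each.

```ruby
class Array
  def my_each
    # With no block, return an Enumerator built on this same method
    return to_enum(:my_each) unless block_given?

    for element in self
      yield element
    end
    self
  end
end

enum = [1, 2, 3].my_each        # no block given, so an Enumerator comes back
p enum.map { |n| n * 2 }        # prints [2, 4, 6]
```

The benefit of this choice is that callers can chain other Enumerable methods onto the result instead of being limited to whatever the fallback branch does.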

The yield keyword keeps code succinct and clear, but there may be times you want more from your Ruby blocks: the ability to pass them around in your code like a variable. In such circumstances, a block can be captured as a named method argument by prefixing the final parameter with an ampersand:

class Array
  def my_each(&block)
    if self.size < 10
      for element in self
        yield element
      end
    else
      self.even_each(&block)
    end
  end

  def even_each
    for element in self
      if element.modulo(2) == 0
        yield element
      end
    end
  end
end

If you want to use idiomatic Ruby code, you might want to use some blocks within your block-using method implementations:

class Array
  def my_each(&block)
    if self.size < 10
      self.each {|element| yield element }
    else
      self.even_each(&block)
    end
  end

  def even_each
    self.reject do |element|
      element.modulo(2) != 0
    end.each {|element| yield element }
  end
end

In this example, the even_each method operates only on every even number in the array. The my_each method will operate on every element, but only if the entire array has fewer than 10 elements in it; otherwise, it passes the block off to even_each. Thus, you might get the following with these methods when executing them in irb:

>> foo = [0,1,2,3,4,5,6,7,8,9]
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>> foo.my_each {|n| print "#{n} " }; puts '!'
0 2 4 6 8 !
=> nil
>> bar = [0,1,2,3]
=> [0, 1, 2, 3]
>> bar.my_each {|n| print "#{n} " }; puts '!'
0 1 2 3 !
=> nil
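
A block captured with an ampersand parameter arrives as an ordinary Proc object, so it can be stored and called later as well as forwarded to another method. The sketch below illustrates that; EventList, on_event, and fire are hypothetical names invented for this example.

```ruby
# Hypothetical class for illustration only.
class EventList
  def initialize
    @handlers = []
  end

  # &block converts the attached block into a Proc object
  def on_event(&block)
    @handlers << block
    self
  end

  def fire(value)
    # Proc#call runs a stored block, much as yield would have at capture time
    @handlers.map { |handler| handler.call(value) }
  end
end

events = EventList.new
events.on_event { |n| n + 1 }
events.on_event { |n| n * 10 }
p events.fire(4)   # prints [5, 40]
```

This is something yield alone cannot do, because yield can only invoke the block while the method that received it is still running.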

Coming up with more sophisticated uses of the Ruby block is left as an exercise for the reader.

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

7 comments
Mark Miller

self.reject do |element|
  element.modulo(2) != 0
end.each {|element| yield element }

This code segment looked confusing, but I think I know what Chad's doing here. What threw me was the "end.each" construction. I kept thinking "end" was an object, but it isn't. self.reject returns a new array with the selected elements (the elements that were not rejected) from the original array. "end" represents the end of the block, and thereby the point at which the new array object exists. The "each" method call is actually to the new array, passing in a new block that does a yield on each element to the "print" block that was passed in earlier. To illustrate, if I were to use curly braces to show where the block in this code is (I realize this probably isn't legal Ruby syntax), it would look like this:

(self.reject do {| element | element.modulo(2) != 0}).each {| element | yield element}

apotheon

I notice you essentially use the words "block" and "closure" interchangeably in your write-up about Ruby blocks and closures. You also define a closure as code that can be passed around as arguments. This is not exactly true. While being able to pass around a function is necessary to making a closure, it is not sufficient. What you describe -- a function passed as an argument -- is a callback. A closure is essentially a callback that maintains persistent access to its original context after that context's scope has closed. Thus, Ruby blocks are syntactic sugar for callbacks, but those callbacks are not necessarily closures. There are those who would argue with my definition, of course. Their argument basically boils down to a statement that if something is capable of being a closure, it is a closure, but I buy that no more than I buy that someone capable of being a criminal necessarily is a criminal. By the way, I like the enso you use for your user icon. It's a bit small to make out the characters in the middle, and my knowledge of Kanji is pretty scant (to put it kindly) anyway. What does it say in there?

apotheon

Ruby is a very, very easy language to use. Your example is actually valid Ruby code, and should do exactly what you expect. In fact, you don't even need the parentheses you added in there for grouping. You're correct that the end in that code is basically just a block delimiter, just as is a closing brace. The key to my example you quoted is that everything in Ruby that can be evaluated as a stand-alone unit of code is an expression. There are no "statements" in the sense of something that is executed but has no return value; everything is not just executed but evaluated, meaning it produces a return value. Every value is an object, and therefore any return value of a known type can be sent messages to execute the methods its type "contains". More to the point, a message can be chained to the end of any expression, because the expression is evaluated, yielding a value that then receives the message you sent it. Hopefully that's clear. I might have gotten a little verbosely carried away. Anyway, thanks for clarifying for me. I realize now, in retrospect, that I probably should have explained that detail within the article.

Mark Miller

I see what you mean. The form of the language is kind of confusing from my vantage point, but it does lend itself to conciseness. It looks similar to languages that have statements, since it has imperative keywords, like "class," "def," "if," "for-in," etc., yet it acts like a language that doesn't. It was confusing for me to see do-end used synonymously with { }. They are the same. I found an article on style usage for these two. It suggests using braces for the case where you're using the resulting value of a call with a block for a subsequent call, and to use do-end when you're just executing a sequence of actions. Sounds good to me.

This is just personal taste, but IMO in Smalltalk the structure of things is (usually) clearer. Since Ruby borrowed some features from it, it might be useful to note the similarities and differences. You can probably see why I was confused at first from this. In standard Smalltalk, blocks are denoted by [ ] (and this notation is "owned" exclusively by the BlockClosure class). Your first example would be written as such in Smalltalk:

foo := #(1 2 3 4 5 6 7 8 9 10). "1-based array"
vals := #(0 1) asOrderedCollection.
foo do: [:n | Transcript show: n asString, ': ', (vals at: n ifAbsentPut: [(vals at: n - 1) + (vals at: n - 2)]) asString; cr]

"alternately, we could get rid of 'foo' and just say:
1 to: 10 do: [ ... ]

In the Ruby case, this would be equivalent to doing:
0..9.each { ... }"

The "do:" message in the above Smalltalk code is sent to the array, containing the block as a parameter value. It does the same thing as "each" in Ruby. So in Smalltalk "do" is a message, but in Ruby, do-end is a construct that is translated into a block. The reason I thought of adding the parentheses in my first comment is I'm used to blocks being objects in and of themselves that can receive their own messages, particularly in an OOP language.
I realize you didn't put parentheses around the "reject" call, with the do-end construct (that would've looked pretty ugly), but as I was writing out my "translation" of what you did, it just didn't seem right to put ".each" right after { }. I thought it might look nearly as confusing as your example. Seeing this would normally cause me to assume that I'm sending "each" to the block, not to the array that came as a result of "reject". After all, the dot syntax suggests sending a message to the preceding object, as in "self.reject". A way one can deal with situations in Smalltalk where you want to sequence or combine actions in a single statement, particularly if it's a common pattern, is to create a method for it. For example, to do what you did in the last example, one can do this with a collection: self reject: [:element | element \\ 2 ~= 0] thenDo: [:element | aBlock value: element] The "value" message is equivalent of doing a "yield" on a block, at least the way I see it used here. These same actions could be done without a method, but it would involve doing what I did in my first comment, wrapping the "reject" call in parentheses, and then sending a "do:" message on the temporary object, with the 2nd block. One might wonder, "Why create a method," but it's like your File.open().each_line example. It makes code more concise, readable, and reliable. In Smalltalk it extends into control structures. For example, if, while, and range loops (which are like for loops, or for-each loops) are all implemented in a similar way to this--using message passing. The implication of this is that by using blocks you can create your own control structures. Very cool! Anyway, I put this out for inspiration.

Mark Miller

> > "alternately, we could get rid of 'foo' and just say:
> > 1 to: 10 do: [ ... ]
> >
> > In the Ruby case, this would be equivalent to doing:
> > 0..9.each { ... }"
>
> Are you sure that wouldn't be more equivalent to this in Ruby?
>
> 1.upto(10) {|n| puts "do stuff with #{n}" }

You are quite right. In terms of what gets executed, to:do: is like upto(), though arrays in Smalltalk are 1-based, and it looks like the arrays in Ruby are 0-based. So I just translated between the two. With foo, you went from 0 to 9. The idea was to create a range that could be used to cause something to happen 10 times, and where the index could be used. When I wrote that, I was thinking of it creating what's called an Interval in Smalltalk (equivalent to a range in Ruby), but I just looked at the implementation, and it just iterates on the block in situ.

> Does that mean that to: is a range constructor while do is an iterator, the way .. is a range constructor in Ruby while upto and each are iterators?

In this case, no, but it would be possible to carry out the same actions in concept using an interval, using the messages you had guessed. If I wanted to create the interval and then use that to iterate, I'd have to make that clear:

(1 to: 10) do: [:n | "do something here"]

To indicate that I want "1 to: 10" evaluated, and then the result sent another message, I have to put parentheses around "1 to: 10" which act as a delimiter between the messages (and cause the inner expression to be evaluated before the outer one), or assign the interval to a variable and then send do: to the instance the variable holds. In this case to: creates the interval, and do: is a message to the interval instance, giving it a block to iterate over. There's no real reason to do this in this case. I'm just illustrating a distinction. There are two methods I could've used in the Number class: one called to:, which creates an interval, and to:do:, which iterates on the block in place.
To clarify further, the parameterized message construction syntax works like this: obj keyword1: param1 keyword2: param2 keyword3: param3 ... That's considered all one message, with however many parameters, to "obj". If there wasn't a to:do: method in the Number class, and I said: 1 to: 10 do: [ ... ] I'd get a "does not understand" exception, because Smalltalk would think I was trying to send a single message with two parameters, called to:do:, to the SmallInteger called "1" (Number is a base class to SmallInteger). In many instances these rules for messages work out really nicely, producing code that looks poetic in its elegance. In some cases it gets really ugly... I think it depends on how much work has been done on a library, and in some cases the Smalltalk parser, to really take into account how the functionality is going to be used.

apotheon

> I found an article on style usage for these two. It suggests using braces for the case where you're using the resulting value of a call with a block for a subsequent call, and to use do-end when you're just executing a sequence of actions.

The idiomatic way to use them is to use braces when your block is a one-liner and do/end when it spans multiple lines. Thus:

foo.each {|e| puts e }

bar.each do |line|
  print 'Enter the current color: '
  puts line.gsub(/shade/, gets)
end

The article you linked mentions that in the first paragraph. The second just brings up someone's suggestion to use braces when the value is being used, which does not really reflect the general consensus about how to use the language's syntax. While there are times when I would choose to diverge from such a consensus just because the alternative is a better idea, I don't think I agree this is one of those cases. The only benefit I really see to taking the "braces for using the return value" approach is making things clearer to people used to other languages' syntax, but the goal I think should be to make things clear for people used to the current language's syntax. I find that using do/end for multiline blocks makes things clearer in the general case, which is why I tend to agree with that approach.

Early on in my usage of Ruby, I liked to alternate do/end with braces because I was not initially as used to the do/end syntax and I thought alternating with braces in block nesting made things clearer -- but as I got comfortable with Ruby's syntax, I came to realize that was a result of my biases rather than any actual inherent improvement in the ease of eyeball-parsing. I also find that proliferation of special cases for style tends to make things a bit messy, in both my earlier usage and the usage Rick DeNatale advocates of braces when using the return value.
DeNatale is a much better Rubyist than me in general (I'm somewhat familiar with his work), but that doesn't mean I agree with this point of his (repeating Jim Weirich's point).

> "alternately, we could get rid of 'foo' and just say:
> 1 to: 10 do: [ ... ]
>
> In the Ruby case, this would be equivalent to doing:
> 0..9.each { ... }"

Are you sure that wouldn't be more equivalent to this in Ruby?

1.upto(10) {|n| puts "do stuff with #{n}" }

The difference is perhaps subtle, and merely conceptual, but your Smalltalk seems more like the upto example than the range.each example at first glance.

> The "do:" message in the above Smalltalk code is sent to the array, containing the block as a parameter value. It does the same thing as "each" in Ruby. So in Smalltalk "do" is a message, but in Ruby, do-end is a construct that is translated into a block.

This does seem to suggest that your Ruby example is more equivalent to the Smalltalk example than my Ruby example. Does that mean that to: is a range constructor while do is an iterator, the way .. is a range constructor in Ruby while upto and each are iterators?

> The reason I thought of adding the parentheses in my first comment is I'm used to blocks being objects in and of themselves that can receive their own messages, particularly in an OOP language.

In Ruby, the block itself is not an object. An object representing it is sent to the block-using method as an argument, though -- at least as I understand it. That makes it similar to a method or, more directly, a lambda in Ruby; it is not itself an object, but an object can be instantiated that represents it, and sometimes that happens implicitly.
