
Using cryptographic hashes with Ruby

The uses of cryptographic hash functions for security capabilities in software are many and varied. Ruby provides simple and easy options for generating and comparing cryptographic hashes.

Last week, in Use cryptographic hashes for validation, I provided a general overview of the use and importance of cryptographic hashes. Today, I'll offer some specifics on how to generate and compare MD5, SHA-1, and SHA-2 cryptographic hashes using the Ruby programming language.

If you're familiar with programming in Ruby, you know the syntax for loading a library from a Ruby script is simple. MD5, SHA-1, and SHA-2 hashing is accomplished in Ruby via functionality provided by the language's standard library. The respective require statements for MD5, SHA-1, and SHA-2 look like this:

require 'digest/md5'
require 'digest/sha1'
require 'digest/sha2'

You only need to require the one you intend to use, of course. Generating a cryptographic hash with any of them is almost as simple as requiring the library in the first place. In the following examples, I will use SHA-2 for simplicity's sake so I don't have to repeat everything for each algorithm. If you want to use MD5 or SHA-1 in place of SHA-2, just replace any instance of SHA2 with MD5 or SHA1, as needed.

I ran the following examples using Ruby's interactive interpreter, irb. In some cases, I have included the return value of a method as shown in irb, which is marked by the characters =>. Also keep in mind that line-wrapping may alter the appearance of some of these examples, particularly where a hash is shown, because they tend to be longer than the text width of an article at TechRepublic.

To create a cryptographic hash object from a string:

h = Digest::SHA2.new << 'string'

=> #<Digest::SHA2:256 473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8>
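As a rough sketch of the substitution described earlier, the same pattern can be run with all three classes side by side. The input 'abc' and the variable names are chosen purely for illustration; #hexdigest is used here as an explicit way of asking for the hexadecimal string.

```ruby
require 'digest/md5'
require 'digest/sha1'
require 'digest/sha2'

# Same pattern as above, with only the class name swapped.
md5_h  = Digest::MD5.new  << 'abc'
sha1_h = Digest::SHA1.new << 'abc'
sha2_h = Digest::SHA2.new << 'abc'

# #hexdigest returns the hash as a hexadecimal string.
puts md5_h.hexdigest   # 32 hex characters (128 bits)
puts sha1_h.hexdigest  # 40 hex characters (160 bits)
puts sha2_h.hexdigest  # 64 hex characters (256 bits, the default)
```

The differing output lengths are a quick way to tell at a glance which algorithm produced a given hash.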

To compare your cryptographic hash with the contents of a variable -- in this example, a string called foo and a second hash object called bar -- use the #to_s method, which returns the hash as a hexadecimal string:

foo = '473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8'

=> "473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8"

foo == h.to_s

=> true

bar = Digest::SHA2.new << 'blah blah blah'

=> #<Digest::SHA2:256 a74f733635a19aefb1f73e5947cef59cd7440c6952ef0f03d09d974274cbd6df>

bar == h.to_s

=> false
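The comparison above can be wrapped in a small helper if you need it in more than one place. This is only a sketch; the method name matches_sha2? is hypothetical, and the downcase call is an assumption that you may receive hex strings in upper case.

```ruby
require 'digest/sha2'

# Hypothetical helper: hash the input, then compare hexadecimal strings.
def matches_sha2?(input, expected_hex)
  (Digest::SHA2.new << input).to_s == expected_hex.downcase
end

expected = (Digest::SHA2.new << 'string').to_s

puts matches_sha2?('string', expected)          # true
puts matches_sha2?('blah blah blah', expected)  # false
```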

With significantly longer strings, it may be difficult to load the entire thing into memory without running out of RAM and crashing your program (or worse, introducing a security vulnerability). This is the case with particularly large files, for instance, such as when you want to generate a cryptographic hash of a file whose size is greater than the amount of RAM and swap space you may have available on your computer when you run the program. Luckily, in Ruby it is a trivial operation to iterate through a file one line at a time and update the cryptographic hash as you go, so that no more than one line of text is ever loaded into memory at any one time. In the following example, the variable path_to_file contains the path to the file from whose contents you want to generate a hash.

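file_h = Digest::SHA2.new

# Open in binary mode so the bytes are hashed exactly as stored on disk.
File.open(path_to_file, 'rb') do |fh|
  # Feed the file to the hash object one line at a time.
  fh.each_line do |l|
    file_h << l
  end
end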
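To check that the line-by-line approach produces the same result as hashing the whole content in one go, something like the following sketch can be used. The helper name sha2_of_file_by_line and the temporary file are illustrative assumptions, not part of the original example.

```ruby
require 'digest/sha2'
require 'tempfile'

# Hypothetical helper wrapping the loop shown above.
def sha2_of_file_by_line(path_to_file)
  file_h = Digest::SHA2.new
  File.open(path_to_file, 'rb') do |fh|
    fh.each_line { |l| file_h << l }
  end
  file_h.to_s
end

# Demonstration on a small temporary file.
content = "first line\nsecond line\n"
Tempfile.create('digest-demo') do |tmp|
  tmp.binmode
  tmp.write(content)
  tmp.flush

  incremental = sha2_of_file_by_line(tmp.path)
  one_shot    = (Digest::SHA2.new << content).to_s
  puts incremental == one_shot  # true
end
```

Because each line is appended to the hash object in order, the final value depends only on the bytes of the file, not on how they were split up along the way.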

It is left as an exercise for the reader to look up information on how to specify an alternate record separator when reading in the file one record at a time, if a "line" ending in a newline character isn't appropriate. If the file has no regular record separator you can use, you may also use the #read method instead of #each_line, which allows you to specify the number of bytes to treat as an individual record. Using the #read method is quite simple as well:

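file_h = Digest::SHA2.new

# Open in binary mode so the bytes are hashed exactly as stored on disk.
File.open(path_to_file, 'rb') do |fh|
  # Read 1024 bytes at a time; #read returns nil at end of file.
  while buffer = fh.read(1024)
    file_h << buffer
  end
end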
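For the alternate record separator mentioned above, one possible approach is to pass the separator string to #each_line. The sketch below assumes a hypothetical helper name and leaves the choice of separator to the caller.

```ruby
require 'digest/sha2'

# Hypothetical helper: hash a file one record at a time, where records
# end with the given separator string instead of a newline.
def sha2_of_file_by_record(path_to_file, separator)
  file_h = Digest::SHA2.new
  File.open(path_to_file, 'rb') do |fh|
    # Each record is yielded with its trailing separator included,
    # so every byte of the file still reaches the hash object.
    fh.each_line(separator) { |record| file_h << record }
  end
  file_h.to_s
end
```

Calling sha2_of_file_by_record(path_to_file, ';') would, for instance, read a semicolon-delimited file one field at a time.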

The << syntax may be replaced with the #update method, if that works better for you as a mnemonic aid. Instead of file_h << buffer, then, you would use file_h.update(buffer). The #update method can be substituted for << in the previous #each_line example as well.
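Here is a sketch of the chunked-read loop written with #update instead of <<. The helper name and the chunk_size parameter are assumptions added for illustration; 1024 is simply the value used in the example above.

```ruby
require 'digest/sha2'

# Hypothetical helper: hash a file in fixed-size chunks using #update.
def sha2_of_file_by_chunk(path_to_file, chunk_size = 1024)
  file_h = Digest::SHA2.new
  File.open(path_to_file, 'rb') do |fh|
    # #read returns nil at end of file, which ends the loop.
    while (buffer = fh.read(chunk_size))
      file_h.update(buffer)
    end
  end
  file_h.to_s
end

# #update returns the hash object itself, so calls can be chained.
chained = Digest::SHA2.new.update('a').update('bc').to_s
puts chained == (Digest::SHA2.new << 'abc').to_s  # true
```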

As mentioned in Use cryptographic hashes for validation last week, cryptographic hashes can be used for secure password authentication. It takes a bit more than a simple hashing function to make something like that secure for remote logins, where one must be concerned with man-in-the-middle attacks and other problems of privacy across network connections, but there's enough here to build a secure local password authentication routine for a desktop application (assuming you use a strong enough cryptographic hash algorithm to avoid predictable hash collisions).

If you use a cryptographic hash comparison for password authentication to avoid having to store plaintext passwords where an unauthorized user can see them, make sure you don't accept a cryptographic hash directly from user input to compare with the stored hash. Apply a hash function to whatever your program receives, then compare that to the stored hash, or the very purpose of using a cryptographic hash in the first place will be violated.
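The hash-then-compare step described above might look something like this. It is a minimal sketch of that one step only; the method name password_ok? and the example password are hypothetical.

```ruby
require 'digest/sha2'

# Hypothetical helper: always hash what the user typed, then compare
# the result with the stored hash. Never compare raw input directly.
def password_ok?(typed_password, stored_hash)
  (Digest::SHA2.new << typed_password).to_s == stored_hash
end

# The value saved when the password was set (illustrative only).
stored = (Digest::SHA2.new << 'correct horse').to_s

puts password_ok?('correct horse', stored)  # true
puts password_ok?('wrong guess', stored)    # false
puts password_ok?(stored, stored)           # false
```

The last call shows why the input must be hashed: someone who has seen the stored hash cannot log in by submitting the hash itself.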

When developing secure software, you may also want to use cryptographic hash algorithms stronger than MD5 and SHA-1. A number of other libraries for Ruby in the form of gems -- Ruby's library modules that can be installed via the language's own software package management system -- provide cryptographic hash capabilities stronger than MD5 and SHA-1, though many of them are not part of Ruby's standard library and must be installed before use.

Luckily, MD5 and SHA-1 are not alone in the Ruby standard library, as SHA-2 is conveniently distributed with the Ruby standard library as well and provides a far stronger set of cryptographic hash algorithm options. If it were up to me, nobody would use MD5 or SHA-1 at all except for purposes of legacy compatibility. OpenPGP digital signatures and SHA-2 hashes (with a digest length of at least 256 bits) are both significant improvements over the deeply flawed SHA-1 and MD5 algorithms. If at all possible, use stronger algorithms than MD5 and SHA-1; they should be regarded only as toys, and as necessary tools for compatibility with older systems -- because they're better than nothing, but not much better.

One final note: the SHA-2 family supports several digest lengths (often loosely called key lengths, although a hash function takes no key). The default for the Digest::SHA2 class is a 256 bit digest. Shorter digests should be avoided for security reasons; the longer the digest, the more work it takes to find a collision or a matching input by brute force. Ruby's Digest::SHA2 implementation supports 384 bit and 512 bit digests in addition to 256 bit digests. You may specify a longer digest length by passing an argument to the new method:

h = Digest::SHA2.new(512)
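A quick way to see the effect of that argument is to hash the same string at each supported length and look at the size of the result. This is just an illustrative sketch; the input 'string' is arbitrary.

```ruby
require 'digest/sha2'

# Hash the same input at each bit length Digest::SHA2 accepts.
[256, 384, 512].each do |bits|
  h = Digest::SHA2.new(bits) << 'string'
  # Each hexadecimal character encodes 4 bits of the hash.
  puts "#{bits}: #{h.to_s.length} hex characters"
end
# Prints:
# 256: 64 hex characters
# 384: 96 hex characters
# 512: 128 hex characters
```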

Hashes generated using different digest lengths do not validate against each other, just as those generated using entirely different algorithms do not validate against each other -- e.g., an MD5 hash of a given string will not validate against an SHA-2 hash of the same string, and a 256 bit hash will not validate against a 512 bit hash of the same string. Most people who use SHA-2 use the 256 bit variant, which is usually called SHA-256; keep that in mind when dealing with SHA-2 hash comparisons.
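The mismatch is easy to confirm in a few lines. In this sketch the variable names are illustrative, and Digest::SHA256 is the class that digest/sha2 provides for the 256 bit variant by name.

```ruby
require 'digest/md5'
require 'digest/sha2'

md5    = (Digest::MD5.new << 'string').to_s
sha256 = (Digest::SHA2.new(256) << 'string').to_s
sha512 = (Digest::SHA2.new(512) << 'string').to_s

puts md5 == sha256     # false: different algorithms
puts sha256 == sha512  # false: different lengths

# Digest::SHA2 at 256 bits and Digest::SHA256 give the same result.
puts sha256 == Digest::SHA256.hexdigest('string')  # true
```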

About

Chad Perrin is an IT consultant, developer, and freelance professional writer. He holds both Microsoft and CompTIA certifications and is a graduate of two IT industry trade schools.

6 comments
Justin James

I am actually fulfilling a promise I made to myself ages ago, and I am starting to learn Ruby. Nice to be reading about its usage for things like this! J.Ja

Sterling chip Camden

I love how it includes these hash algorithms in the core. In almost any other language, you'd have to go hunt up a library, or even write it yourself.

apotheon

I might write more about security software development in Ruby in the coming months -- but probably not very often, considering the underwhelming popularity of this article.

apotheon

Technically, this stuff is in the standard library, and not the official language core -- but I understand what you mean (that these classes ship with the standard Ruby distribution). I don't have much experience writing anything like security software in any language other than Ruby or Perl, so I'll just have to take your word for it, with regard to what libraries are available in most other languages. That being the case, it's pretty disappointing there isn't at least basic hashing functionality included in the standard libraries of most major languages.

Neon Samurai

I've only had time to briefly skim it so far, though, due to work. For me, I can understand it but don't have enough Ruby skills to make use of it right away. Mind you, I'm also not seeing the stats out of the cluster, but the short discussion may be indicative of the hit stats you're looking at.
