Last week, in Use cryptographic hashes for validation, I provided a general overview of the use and importance of cryptographic hashes. Today, I’ll offer some specifics on how to generate and compare MD5, SHA-1, and SHA-2 cryptographic hashes using the Ruby programming language.
If you’re familiar with programming in Ruby, you know the syntax for calling a library from a Ruby script is simple. MD5, SHA-1, and SHA-2 hashing is accomplished in Ruby via functionality provided by the language’s standard library. The respective library calls for MD5, SHA-1, and SHA-2 look like this:
require 'digest/md5'
require 'digest/sha1'
require 'digest/sha2'
You only need to require one of them if you only intend to use one, of course. Generating a cryptographic hash from any of them is almost as simple as calling the library in the first place. In the following examples, I will use SHA-2 for simplicity’s sake so I don’t have to do everything three times. If you want to use MD5 or SHA-1 in place of SHA-2, just replace any instance of SHA2 with MD5 or SHA1, as needed.
I ran the following examples using Ruby’s interactive interpreter, irb. In some cases, I have included the return value of a method as shown in irb, which is marked by the characters =>. Also keep in mind that line-wrapping may alter the appearance of some of these examples, particularly where a hash is shown, because they tend to be longer than the text width of an article at TechRepublic.
To create a cryptographic hash object from a string:
h = Digest::SHA2.new << 'string'
=> #<Digest::SHA2:256 473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8>
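As a side note, the same digest can be reached in a few equivalent ways with the standard Digest library. The following is only a minimal sketch: the helper name sha2_hex_variants is made up for illustration and is not part of the library. It shows that appending with <<, calling #update, and using the class-level hexdigest shortcut all produce the same hexadecimal string.

```ruby
require 'digest/sha2'

# Hypothetical helper (the name is an assumption for illustration only):
# computes the SHA-256 hex digest of a string three equivalent ways.
def sha2_hex_variants(text)
  appended = (Digest::SHA2.new << text).hexdigest     # the << form
  updated  = Digest::SHA2.new.update(text).hexdigest  # #update, same effect
  one_shot = Digest::SHA2.hexdigest(text)             # class-level shortcut
  [appended, updated, one_shot]
end

variants = sha2_hex_variants('string')
puts variants.uniq.length  # 1, because all three values are identical
puts variants.first        # the hex digest itself
```

The one-shot form is handy for short strings, while the object form matters once input has to be fed in piece by piece.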
To compare the contents of a variable with your cryptographic hash, use the == operator. This example uses one variable called foo and another called bar:
foo = '473287f8298dba7163a897908958f7c0eae733e25d2e027992ea2edc9bed2fa8'
foo == h.to_s
=> true
bar = Digest::SHA2.new << 'blah blah blah'
=> #<Digest::SHA2:256 a74f733635a19aefb1f73e5947cef59cd7440c6952ef0f03d09d974274cbd6df>
bar == h.to_s
=> false
With significantly longer input, it may be difficult to load the entire thing into memory without running out of RAM and crashing your program (or worse, introducing a security vulnerability). This is the case with particularly large files, for instance when you want to generate a cryptographic hash of a file whose size is greater than the RAM and swap space available on your computer when you run the program. Luckily, in Ruby it is a trivial operation to iterate through a file one line at a time and update the cryptographic hash as you go, so that no more than one line of text is loaded into memory at any one time. In the following example, the variable path_to_file contains the path to the file from whose contents you want to generate a hash.
file_h = Digest::SHA2.new
File.open(path_to_file, 'r') do |fh|
  fh.each_line do |l|
    file_h << l   # feed one line at a time into the digest
  end
end
It is left as an exercise for the reader to look up information on how to specify an alternate record separator when reading in the file one record at a time, if a “line” ending in a newline character isn’t appropriate. If there are no regular record separator characters you can use, you may also use the #read method instead of #each_line, which allows you to specify the number of bytes to treat as an individual record. Using the #read method is quite simple as well:
file_h = Digest::SHA2.new
File.open(path_to_file, 'r') do |fh|
  while buffer = fh.read(1024)   # read returns nil at end of file
    file_h << buffer
  end
end
The << syntax may be replaced with the #update method, if that works better for you as a mnemonic aid. Instead of file_h << buffer, then, you would use file_h.update(buffer). The #update method can be substituted for << in the previous #each_line example as well.
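To see that chunked hashing gives the same result as hashing everything at once, here is a minimal sketch that wraps the #read loop in a method. The method name file_sha2_hex, its chunk_size parameter, and the throwaway temp file are all assumptions made for illustration. Opening the file in 'rb' mode is also an assumption, chosen so that the digest covers the file's exact bytes on every platform.

```ruby
require 'digest/sha2'
require 'tempfile'

# Hypothetical helper: hash a file in fixed-size chunks so that only
# chunk_size bytes are held in memory at any one time.
def file_sha2_hex(path, chunk_size = 1024)
  digest = Digest::SHA2.new
  File.open(path, 'rb') do |fh|
    # read returns nil at end of file, which ends the loop
    while (buffer = fh.read(chunk_size))
      digest.update(buffer)
    end
  end
  digest.hexdigest
end

# Usage: write a throwaway file, then confirm the chunked digest
# matches a one-shot digest of the same content.
Tempfile.create('digest-demo') do |tmp|
  content = 'blah blah blah ' * 500
  tmp.write(content)
  tmp.flush
  puts file_sha2_hex(tmp.path) == Digest::SHA2.hexdigest(content)
end
```

The chunk size only affects memory use and speed, not the resulting hash, so any positive value should give the same digest.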
As mentioned in Use cryptographic hashes for validation last week, cryptographic hashes can be used for secure password authentication. It takes a bit more than a simple hashing function to make something like that secure for remote logins, where one must be concerned with man-in-the-middle attacks and other problems of privacy across network connections, but there’s enough here to build a secure local password authentication routine for a desktop application (assuming you use a strong enough cryptographic hash algorithm to avoid predictable hash collisions).
If you use a cryptographic hash comparison for password authentication to avoid having to store plaintext passwords where an unauthorized user can see them, make sure you don’t accept a cryptographic hash directly from user input to compare with the stored hash. Apply a hash function to whatever your program receives, then compare that to the stored hash, or the very purpose of using a cryptographic hash in the first place will be violated.
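A minimal sketch of that rule might look like the following. The constant STORED_HASH, the method name password_matches?, and the sample password are all hypothetical; in a real program the stored hash would come from a file or database rather than a literal. The sketch leaves out salting and other hardening, and is meant only to show the order of operations: hash the input first, then compare.

```ruby
require 'digest/sha2'

# Hypothetical stored value: the SHA-256 hex digest of the real password.
STORED_HASH = Digest::SHA2.hexdigest('correct horse')

# Hash whatever the user typed, then compare digests. The caller never
# supplies a hash directly, so knowing the stored hash is not enough.
def password_matches?(candidate, stored_hash = STORED_HASH)
  Digest::SHA2.hexdigest(candidate) == stored_hash
end

puts password_matches?('correct horse')  # true
puts password_matches?('wrong guess')    # false
puts password_matches?(STORED_HASH)      # false: a pasted hash gets hashed again
```

The last call shows why the input is always hashed: someone who has read the stored hash still cannot log in by submitting it.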
When developing secure software, you may also want to use stronger cryptographic hash algorithms than MD5 and SHA-1. A number of other libraries for Ruby come in the form of gems (Ruby’s library modules, which can be installed via the language’s own software package management system) and provide cryptographic hash capabilities stronger than MD5 and SHA-1, though many of them are not part of Ruby’s standard library and must be installed before use.
Luckily, MD5 and SHA-1 are not alone in the Ruby standard library: SHA-2 is distributed with it as well and provides a far stronger set of cryptographic hash algorithm options. If it were up to me, nobody would use MD5 or SHA-1 at all except for purposes of legacy compatibility. OpenPGP digital signatures and SHA-2 hashes (with a digest length of at least 256 bits) are both significant improvements over the deeply flawed SHA-1 and MD5 algorithms. If at all possible, use stronger algorithms than MD5 and SHA-1; they should be regarded only as toys, and as necessary tools for compatibility with older systems, because they’re better than nothing, but not much better.
One final note: the SHA-2 algorithm supports varying digest lengths, meaning the size of the hash output in bits (a plain hash function takes no key). The default digest length for the Digest::SHA2 class is 256 bits. Shorter digests should be avoided for security reasons; the longer the digest, the more work it takes to find a collision by brute force. Ruby’s Digest::SHA2 implementation supports 384-bit and 512-bit digests in addition to 256-bit digests. You may specify a longer digest length by passing an argument to the new method:
h = Digest::SHA2.new(512)
Hashes generated using different digest lengths do not validate against each other, just as those generated using entirely different algorithms do not. For example, an MD5 hash of a given string will not validate against an SHA-2 hash of the same string, and a 256-bit SHA-2 hash will not validate against a 512-bit SHA-2 hash. Most people who use SHA-2 use the 256-bit variant, usually called SHA-256; keep that in mind when dealing with SHA-2 hash comparisons.
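The effect of the bit-length argument can be checked directly. In this minimal sketch, the helper name sha2_hex and its bits parameter are assumptions for illustration; the only library call relied on is Digest::SHA2.new with 256, 384, or 512 as its argument.

```ruby
require 'digest/sha2'

# Hypothetical helper: hex digest of text at the given SHA-2 bit length.
def sha2_hex(text, bits = 256)
  Digest::SHA2.new(bits).update(text).hexdigest
end

[256, 384, 512].each do |bits|
  hex = sha2_hex('string', bits)
  # Each hexadecimal character encodes 4 bits of the digest.
  puts "#{bits} bits: #{hex.length} hex characters"
end

# The same input hashed at two bit lengths does not compare equal.
puts sha2_hex('string', 256) == sha2_hex('string', 512)  # false
```

When a comparison fails unexpectedly, checking the length of the two hex strings is a quick way to spot a mismatch in bit length.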