By Beth Bowden
A great function usually does only one thing but does it supremely well. Once you learn to use a great function, you can use it immediately and frequently without ever thinking about how to use it or how it works—much like breathing. A poor function, on the other hand, withers away from disuse.
If your function is contrary or hard to use, your users may reinvent the wheel rather than use it. Let’s explore what makes a function great by looking at a hypothetical function, which I’ll call dtDF.
Function or method?
For the sake of readability, I’ll be using the term function to refer to both functions and methods. The ideas I’ll be discussing will apply to both equally well.
Give it a great name and make it flexible
The purpose of the dtDF function is to calculate the difference in days between two dates. A very easy way to improve dtDF would be to give it a better name—dtDF is cryptic and hard to remember. Let’s be verbose and call it differenceInDays instead. Now you can just look at the name and know exactly what the function does, which is a big improvement.
What else can make a function great? It is forgiving: You can call it with different data types and have no doubts about what it will do. Overloading a function to accept different parameter types is a great way to increase ease of use, so when you create a function, take advantage of overloading whenever it makes sense.
So, how could I overload differenceInDays to increase usability and make it more intuitive to use? It currently takes two dates, so I could overload it to accept two datetime objects as well. That would potentially save the function’s users from having to do conversions and from having to remember whether differenceInDays accepts date or datetime parameters.
Determine your users' expectations and meet them
So overloading my function is good, but it leads to a hairy question: What should the following call return?
differenceInDays(‘01/01/2003 23:59’,’01/02/2003 00:01’)
Should it return 1, because the datetimes fall on different days? Or should it return 0, because the difference between the two days is not 24 hours? Which option will make the most sense to the users of differenceInDays? What will they expect to happen?
What about accepting a single date instead of two dates? How would differenceInDays handle a single date object? Perhaps it would subtract the date from today’s date, or perhaps it would return zero days. Should it accept a single date parameter at all? Again, keep the users’ expectations in mind.
One method of determining user expectations is to look at commonly used software to see how it handles a similar situation. Your users will expect your function to behave the same way in similar circumstances. The Microsoft Excel function DATEDIF returns 1 if called with the same parameters, because it ignores the time portion of the datetime parameters it receives. On the other hand, Oracle SQL would return a fraction of a day if given the same arguments, so either approach is reasonable. So there’s precedent for both possibilities.
To decide which is the best option, you need to determine who will be using this function, how they will use it, and what special rules may apply. For instance, your company might have business rules that will favor Excel’s approach over the Oracle approach.
If you’re in truly unexplored territory, another way of determining your users’ expectations is to ask several potential users of your function their opinions and expectations. If you get a unanimous answer, seriously consider designing your function to fulfill those needs.
Don't overdo overloading
You can take overloading too far. While accepting dates as strings would probably be a good idea, should differenceInDays accept a two-item datetime array as a parameter? Would that make it any easier to use? Probably not, so there’s no need to add it.
Handle boundary cases without breaking
A great function is hard to break and handles edge cases. Going back to the Excel DATEDIF function, let’s see what happens if you switch the dates, so that instead of passing the greater value last, like this:
=DATEDIF("1/1/2002 23:59","1/2/2002 00:01","D")
youpass it first, instead, like this:
=DATEDIF("1/2/2002 00:01","1/1/2002 23:59","D")
Excel doesn’t handle that situation very well; it returns an ugly #NUM!result. I would argue that returning a negative number of days would be much more intuitive, so I coded differenceInDays that way.
What about null parameters? How should differenceInDays handle them? The two logical choices would be raising an error or returning a null. Again, you need to consider your users’ expectations. If the environment they’ll be using it in, Oracle, would return a null in that situation, you'd write differenceInDays to be consistent with that behavior.
A great function…does what, exactly?
What rules do you have for creating intuitive, easy-to-use functions? Share them in our discussion below.
When designing your next function, I’d encourage you remember the humble differenceInDays function discussed here. What it does is readily apparent from its name, it doesn’t surprise the users with unexpected behavior, and it accepts as many different data types as are reasonable. It also handles boundary cases and nulls with grace and consistency. To be great, a function must do all of these things.