A colleague once encountered an unusual problem concerning sort orders. The most concise way to explain the problem is to show you the following code. Simply paste it into Query Analyzer and run it against any test database:
create table TestSort(CodeToSort
insert into TestSort(CodeToSort) values('a1');
insert into TestSort(CodeToSort) values('a2');
insert into TestSort(CodeToSort) values('a21');
insert into TestSort(CodeToSort) values('a10x');
insert into TestSort(CodeToSort) values('a2a');
insert into TestSort(CodeToSort) values('a10');
A sort on this column yields the following result set:
select * from TestSort order by CodeToSort;
However, the application's requirements weren't that simple. Management wanted the results in the following sort order:
Since it may not be apparent what the rules are, I'll describe the sample strings:
Now let's reverse-engineer the rules governing the sort order (the sample doesn't make rules four and five below explicitly clear, but we can infer them):
Our biggest problem is rule number four. We can't just use the SUBSTRING() function to grab the numeric portion. Instead, we have to grab all the contiguous numeric characters, turn them into an integer, and sort on that value.
My solution is to use a user-defined-function (UDF) that extracts the numeric digits from the character string and returns them to us as an integer. Here is a function that performs this task:
CREATE FUNCTION dbo.InnerNumeric_fn
@Source Varchar(20) = 'a20x'
declare @Result varchar(10)
declare @i int
declare @j int
declare @c char(1)
declare @flag int
set @j = len(@Source)
set @i = 2
set @Result = ''
while @i <= @j
set @c = substring(@Source,@i,1)
set @flag = isnumeric(@c)
if (@flag = 1)
set @Result = @Result + @c
set @i = @i+1
RETURN CAST(@Result AS INT)
We could generalize the pattern somewhat by supposing that the initial alphabetic portion of the string might be more than one character. In that case, we wouldn't assign the value 2 to the variable I; we would have to step over the alphabetic characters. In addition, the alphabetic suffix might consist of more than one character.
In a worse case, we could also assume that there might be more than one numeric string in the sequence (i.e., axyz200bcd300xy), but then we would need much more clearly defined rules as to the desired sort order. I'll leave these variations as exercises to the interested reader.
This tip demonstrates that you can deliver "unusual" sort orders, as long as you can describe their rules. The trick isn't in the code but in the accurate and complete description of the rules. Make sure that the person who dictates the rules supplies you with an adequate test set.
The suppliers of said rules will almost always overlook something, so look for holes and exceptions in supplied sets because, if you don't, they may come back to bite you. Before you code a line, make sure that all the stakeholders agree on the rules.
TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!