A common advanced computer science problem
can be described under the heading of “directed graphs.” A directed
graph is a finite set of nodes connected by a set of vectors or
edges. For example, a node can be imagined as a “city” and each
vector as a “flight” between two cities.

There are many algorithms and papers about how
to solve problems of traversing along each possible route and
finding the route that is the shortest or costs the least. Most of
these algorithms are procedural or resort to recursion to solve.
However, SQL’s declarative language makes solving complex directed
graph problems easier and doesn’t require much coding.

Let’s take the example of flights between
cities and create a table to hold some imaginary data:

create table airports
(
    code        char(3)
constraint airports_pk primary key,
    description varchar2(200)
);

insert into airports values (‘LHR’,’London Heathrow, UK’);
insert into airports values (‘JFK’,’New York-Kennedy, USA’);
insert into airports values (‘GRU’,’Sao Paulo, Brazil’);

create table fares
(
    depart      char(3),

    arrive      char(3),

    price       number,

    constraint fares_pk primary key
(depart,arrive),
    constraint fares_depart_fk foreign key
(depart) references airports,
    constraint fares_arrive_fk foreign key
(arrive) references airports
);

insert into fares values(‘LHR’,’JFK’,700);
insert into fares values(‘JFK’,’GRU’,600);
insert into fares values(‘LHR’,’GRU’,1500);
insert into fares values(‘GRU’,’LHR’,1600);

You cannot use CONNECT BY syntax to resolve how
to get from London to Sao Paulo because there is data that creates
a loop in the graph (flying back to Sao Paulo):

select * from fares connect by prior arrive =
depart start with depart = ‘LHR’;
ERROR:
ORA-01436: CONNECT BY loop in user data

To solve directed graph problems, we need to
create a temporary table to hold all the possible paths between two
nodes. We have to be careful not to duplicate paths we already
processed and, in this case, we don’t want paths that lead to the
same place we started. I’d also like to trace the number of hops it
took to get to the destination, as well as a description of the
route taken.

The temporary table is created using the
following:

create global temporary table faretemp
(
    depart      char(3),

    arrive      char(3),

    hops        integer,

    route       varchar2(30),

    price       number,

    constraint faretemp_pk primary key
(depart,arrive)
);

A simple view can somewhat simplify the code
used in this example. The view can calculate the data for the next
hop from a path in faretemp based on a single hop in fares:

create or replace view nexthop
as
    select src.depart,
           dst.arrive,

           src.hops+1
hops,
           src.route||’,’||dst.arrive
route,
           src.price
+ dst.price price
      from faretemp src,fares dst
     where src.arrive = dst.depart
       and dst.arrive !=
src.depart;
/
show errors;

The algorithm is fairly straightforward. First,
populate the faretemp table with the data from the fare table as
the initial hop. Then, take all the values we just inserted and use
them to build all the possible two-hop paths. Keep repeating as
long as a new path can be created between two nodes. The loop will
exit when all possible paths between nodes are described. We could
also restrict the first insert to reduce the amount of data being
loaded if we were only interested in a certain starting condition.
Here is the code for just discovering paths:

truncate table faretemp;
begin
    — initial connections
    insert into faretemp
     select
depart,arrive,1,depart||’,’||arrive,price from fares;
    while sql%rowcount > 0 loop
        insert into
faretemp
            select
depart,arrive,hops,route,price from nexthop
             where
(depart,arrive)
                   not
in (select depart,arrive from faretemp);
    end loop;
end;
/
show errors;

select * from faretemp order by depart,arrive;

You can view the output in Table A.

There’s one small problem with the above data.
The data is the set of shortest routes between points (least number
of hops). However, the hop from London to Sao Paulo isn’t the
cheapest possibility.

To solve for the least expensive fare, we need
to add an update into our loop that replaces a route with a cheaper
route when it’s found in a hop. The code for that looks like
this:

truncate table faretemp;
declare
    l_count integer;
begin
    — initial connections
    insert into faretemp
        select
depart,arrive,1,depart||’,’||arrive,price from fares;
    l_count := sql%rowcount;
    while l_count > 0 loop
        update faretemp
           set
(hops,route,price) =
              (select
hops,route,price from nexthop
                where
depart = faretemp.depart
                  and
arrive = faretemp.arrive)
         where
(depart,arrive) in
             (select
depart,arrive from nexthop
               where
price < faretemp.price);
        l_count :=
sql%rowcount;
        insert into
faretemp
            select
depart,arrive,hops,route,price from nexthop
             where
(depart,arrive)
               not
in (select depart,arrive from faretemp);
        l_count := l_count
+ sql%rowcount;
    end loop;
end;
/
show errors;

select * from faretemp order by depart,arrive;

You can view the output in Table B.

The algorithm noticed that the LHR,JFK,GRU
route was cheaper than the LHR,GRU route and replaced it. The loop
will exit once there are no cheaper fares and no other possible
routes.

TechRepublic’s Oracle newsletter covers automating Oracle utilities, generating database alerts, solving directed graph problems, and more. Automatically subscribe today!