Data Centers

Oracle Tip: Serialize data with ANYDATA columns

With the ANYDATA data type and some dynamic SQL, it's possible to serialize as many tables as you wish into a single backup table using a single serialization stored procedure. Find out how in this hands-on tip.

This article originally appeared in the Oracle e-newsletter. Click here to subscribe automatically.

Oracle version 9 introduced interesting new data types, which allow developers to declare a variable that can contain any type of data. The data types are ANYDATA for a single item, ANYDATASET for a TABLE or VARRAY of data, and ANYTYPE, which describes the type of data stored in ANYDATA or ANYDATASET variables and columns. These data types are important for processing XML data stored in the database, or for Advanced Queues. The documentation mentions that the ANYDATA data type can be used to serialize objects, but there are very few examples of this concept.

Serialization is the ability to take some structure, made up of data values and other structures, and write out all the components to a stream. This stream can be read back into a structure to recover information from a previous session. Often, the act of saving and opening files in an application is merely a form of serialization.

An Oracle database might use serialization to store a backup of some version of table data that can be viewed or manipulated beyond the use of database commits, rollbacks, or flashback queries. Many applications require having features similar to source control, such as being able to compare current and previous versions and merge and undo changes at the application level. Many of these kinds of applications are designed by creating a backup table for every table in the schema. Maintaining this many backup tables and relationships, in addition to production data, can be a burden on the database and development time.

With the ANYDATA data type and some dynamic SQL, it's possible to serialize as many tables as you wish into a single backup table using a single serialization stored procedure. One advantage of using ANYDATA, rather than an easily converted data type, such as VARCHAR2, is that the original data type isn't lost. A DATE can be stored in an ANYDATA column or variable without losing any precision or depending on the current NLS semantics for converting between DATE and VARCHAR2. Numbers can be stored without losing precision during the conversion.

An ANYDATA object can be constructed using any of the Convert* methods to construct a simple value, or the "piecewise" construction methods for creating more complex values, such as objects and datasets. For this example, I'll focus on using the Convert* methods only.

To create a serialization procedure, I use dynamic SQL to generate a query of all the data in the table, including the ROWID. I parse and describe the query to get a list of columns and their data type, define the columns to be fetched, and then fetch each column from each row and insert it into a serialization table. I used DBMS_SQL in this example because "native dynamic SQL" doesn't currently support describing dynamic queries. Most of the work of the procedure is the process of converting the DBMS_SQL data type codes to the appropriate data type methods and functions. To get a list of these codes, you can look at the OCI include file, ocidfn.h, or the definition of views such as USER_TAB_COLUMNS. In this example, I use simple data types (found in EMP and DEPT tables) that could be directly converted.

drop table serialized_data;

create table serialized_data
(
    tablename varchar2(30) not null,
    row_id rowid not null,
    colseq integer not null,
    item anydata
);

create or replace procedure serialize(p_tablename varchar2)
is
    l_tablename varchar2(30) := upper(p_tablename);
    c           pls_integer;        — cursor
    x           pls_integer;        — dummy
    col_cnt     pls_integer;
    dtab        dbms_sql.desc_tab;
    l_rowid     char(18);
    l_anydata   anydata;
    l_vc2       varchar2(32767);
    l_number    number;
    l_vc        varchar(32767);
    l_date      date;
    l_raw       raw(32767);
    l_ch        char;
    l_clob      clob;
    l_blob      blob;
    l_bfile     bfile;
begin
    c := dbms_sql.open_cursor;
    dbms_sql.parse(c,'select rowid,'||p_tablename||'.* from '||p_tablename,
        dbms_sql.native);
    dbms_sql.describe_columns(c,col_cnt,dtab);
    dbms_sql.define_column(c,1,l_rowid,18);
    for i in 2 .. col_cnt loop
        case dtab(i).col_type
        when 1 then
            dbms_sql.define_column(c,i,l_vc2,dtab(i).col_max_len);
        when 2 then
            dbms_sql.define_column(c,i,l_number);
        when 9 then
            dbms_sql.define_column(c,i,l_vc,dtab(i).col_max_len);
        when 12 then
            dbms_sql.define_column(c,i,l_date);
        when 23 then
            dbms_sql.define_column_raw(c,i,l_raw,dtab(i).col_max_len);
        when 96 then
            dbms_sql.define_column_char(c,i,l_ch,dtab(i).col_max_len);
        when 112 then
            dbms_sql.define_column(c,i,l_clob);
        when 113 then
            dbms_sql.define_column(c,i,l_blob);
        when 114 then
            dbms_sql.define_column(c,i,l_bfile);
        end case;
    end loop;
    x := dbms_sql.execute(c);
    while dbms_sql.fetch_rows(c) != 0 loop
        dbms_sql.column_value(c,1,l_rowid);
        for i in 2 .. col_cnt loop
            case dtab(i).col_type
            when 1 then
                dbms_sql.column_value(c,i,l_vc2);
                l_anydata := ANYDATA.ConvertVarchar2(l_vc2);
            when 2 then
                dbms_sql.column_value(c,i,l_number);
                l_anydata := ANYDATA.ConvertNumber(l_number);
            when 9 then
                dbms_sql.column_value(c,i,l_vc);
                l_anydata := ANYDATA.ConvertVarchar(l_vc);
            when 12 then
                dbms_sql.column_value(c,i,l_date);
                l_anydata := ANYDATA.ConvertDate(l_date);
            when 23 then
                dbms_sql.column_value(c,i,l_raw);
                l_anydata := ANYDATA.ConvertRaw(l_raw);
            when 96 then
                dbms_sql.column_value(c,i,l_ch);
                l_anydata := ANYDATA.ConvertChar(l_ch);
            when 112 then
                dbms_sql.column_value(c,i,l_clob);
                l_anydata := ANYDATA.ConvertClob(l_clob);
            when 113 then
                dbms_sql.column_value(c,i,l_blob);
                l_anydata := ANYDATA.ConvertBlob(l_blob);
            when 114 then
                dbms_sql.column_value(c,i,l_bfile);
                l_anydata := ANYDATA.ConvertBFile(l_bfile);
            end case;
            insert into serialized_data (tablename,row_id,colseq,item)
                values (l_tablename,l_rowid,i,l_anydata);
        end loop;
    end loop;
    dbms_sql.close_cursor(c);
end;
/
show errors;

If I want to serialize 'EMP' and 'DEPT' tables, I can do this from SQL*Plus with the following:

exec serialize('emp');
exec serialize('dept');
select t.item.gettypename() from serialized_data t;

One disadvantage to using ANYDATA, which is an object, is that very little information can be retrieved via direct SQL. The table data must be accessed using a PL/SQL procedure. To show that I can go back and compare a column from the original table with my serialized table, I can write this anonymous PL/SQL block:

Declare
    l_anydata   anydata;
    l_vc2   varchar2(32767);
    x       pls_integer;
begin
    for row in
    (
        select emp.ename,sd.item
          from emp,serialized_data sd
         where emp.rowid = sd.row_id
           and sd.colseq = 3
           and sd.tablename = 'EMP'
    )
    loop
        l_anydata := row.item;
        if l_anydata.GetVarchar2(l_vc2) = DBMS_TYPES.SUCCESS then
            dbms_output.put_line(row.ename||' <=> '||l_vc2);
        else
            dbms_output.put_line('error!');
        end if;
    end loop;
end;
/
show errors;

SMITH <=> SMITH
ALLEN <=> ALLEN
WARD <=> WARD
JONES <=> JONES
MARTIN <=> MARTIN
BLAKE <=> BLAKE
CLARK <=> CLARK
SCOTT <=> SCOTT
KING <=> KING
TURNER <=> TURNER
ADAMS <=> ADAMS
JAMES <=> JAMES
FORD <=> FORD
MILLER <=> MILLER

PL/SQL procedure successfully completed.

The output shows a comparison between the original value and the serialized value of ENAME in the EMP table.

Scott Stephens worked for Oracle for more than 13 years in technical support, e-commerce, marketing, and software development. For more of his Oracle tips, visit our Oracle Dev Tips Library.

Editor's Picks

Free Newsletters, In your Inbox