Data Management

Understand when to use user-defined functions in SQL Server

In the simplest terms, a user-defined function (UDF) in SQL Server is a programming construct that accepts parameters, does work that typically makes use of the accepted parameters, and returns a type of result. This article will cover two types of UDFs: table-valued and scalar-valued.

In the simplest terms, a user-defined function (UDF) in SQL Server is a programming construct that accepts parameters, does work that typically makes use of the accepted parameters, and returns a type of result. This article will cover two types of UDFs: table-valued and scalar-valued. (I will not be covering aggregate functions.)

Types of UDFs

Table-valued functions

A table-valued UDF is a function that accepts parameters and returns the results in the form of a table. This type of function is special because it returns a table that you can query the results of and join with other tables. In SQL Server 2005, field values from other tables may be passed into the function during a join operation to return a record based result. To accomplish this, you must use SQL Server 2005's APPLY operator.

It can be difficult to know when it is appropriate to use a VIEW vs. when it is appropriate to use a table-valued UDF. VIEWs are a great tool for data abstraction, combining data, and logically using subsets of data. I like to use table-valued UDFs when I need to use one or more values from different tables in a join operation where some type of calculation needs to be done and an aggregation returned.

Scalar-valued functions

A scalar-valued UDF accepts parameters and, ultimately, returns a single, atomic value. There are seven reasons why these types of functions are different than stored procedures in the database engine.

  • You cannot modify data inside of a UDF.
  • A scalar-valued UDF returns only one value, where a stored procedure can have numerous OUTPUT parameters.
  • You can use scalar-valued UDFs as the default value for a column in a table.
  • Scalar-valued UDFs are an easy way to define constant values to use in your database environment.
  • You can pass field values as parameters into UDFs.
  • You can nest scalar function calls. This means that you can pass a call to a scalar-valued function to another function or stored procedure.
  • You can use the results from scalar-valued UDFs as criteria in a WHERE statement. Although you can do it, this is typically not a good idea. (Later in the article, I'll explain why I try to avoid this common practice.)

There are two types of scalar-valued UDFs: deterministic and non-deterministic. Recognizing the determinism of the functions that are created is important. An example of the importance is the creation of indexed views. One of the many restrictions of creating an index on a view is that the view definition cannot use a non-deterministic function.

Deterministic

A deterministic UDF always returns the same result with the same set of input parameters. Some examples of deterministic functions are the system functions MONTH(), YEAR(), and ISNULL().

Non-deterministic

A non-deterministic UDF is can potentially return a different value each time it is called with the same set of input parameters. Some examples of non-deterministic functions are the system functions GETDATE(), NEWID(), and @@CONNECTIONS.

Two examples of UDFs

Before presenting the examples, I will set up my SalesHistory table and load data into it:

IF OBJECT_ID('SalesHistory')>0   

      DROP TABLE SalesHistory;

CREATE TABLE [dbo].[SalesHistory]

(         

      [SaleID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,         

      [Product] [varchar](10) NULL,               

      [SaleDate] [datetime] NULL,               

      [SalePrice] [money] NULL

)

DECLARE @i SMALLINT

SET @i = 1         

WHILE (@i <=1000)

BEGIN                        

      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)                     

      VALUES ('Computer', DATEADD(mm, @i, '3/11/1919'), DATEPART(ms, GETDATE()) + (@i + 57))               

      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)         

      VALUES('BigScreen', DATEADD(mm, @i, '3/11/1927'), DATEPART(ms, GETDATE()) + (@i + 13))                    

      INSERT INTO SalesHistory(Product, SaleDate, SalePrice)             

      VALUES('PoolTable', DATEADD(mm, @i, '3/11/1908'), DATEPART(ms, GETDATE()) + (@i + 29))                           

      SET @i = @i + 1

END

GO

The first UDF I will look at is the scalar-valued UDF. The script below defines a function named dbo.udf_GetProductSales that accepts three parameters and returns a MONEY value. The function uses the three input parameters as criteria in calculating the total sales from the SalesHistory table.

CREATE FUNCTION dbo.udf_GetProductSales

(

      @Product VARCHAR(10),

      @BeginDate DATETIME,

      @EndDate DATETIME

)

RETURNS MONEY

AS

BEGIN

      DECLARE @Sales MONEY

     

      SELECT @Sales = SUM(SalePrice)

      FROM SalesHistory

      WHERE

            Product = @Product AND 

SaleDate BETWEEN @BeginDate AND @EndDate

      RETURN(@Sales)

END
The script below calls the UDF created in the above script. Note: The schema the function belongs to must be used in the call. In this case, the function belongs to the dbo schema.
SELECT dbo.udf_GetProductSales('PoolTable', '1/1/1990', '1/1/2000')

I usually discourage using scalar-valued UDFs in a WHERE criteria statement because, for every record considered in the query, the scalar-valued function will be called. This means that a function used in the WHERE criteria will cause a scan of the values being searched, which is going to be slower than if an index is able to be used. (I will provide more details on this concept in a future article.)

Although the use of a correlated sub-query is sometimes confusing and complicated, the use of them can help solve some of the more challenging query problems. While using these special queries is useful, they only return one column of data. You can use the upgraded table-valued UDFs in SQL Server 2005 to overcome this shortcoming. I'll show you how to use the APPLY operator to accept column values from a table and return a table-result of correlated values.

CREATE FUNCTION dbo.udf_GetProductSalesTable

(

      @Product VARCHAR(10),

      @SaleID INT

)

RETURNS @SalesTable TABLE

(

      SalesTotal MONEY,

      SalesCount INT

)

BEGIN

     

      INSERT INTO @SalesTable(SalesTotal, SalesCount)

      SELECT

            SUM(SalePrice), COUNT(SaleID)

      FROM

            SalesHistory

      WHERE

            Product = @Product AND

            SaleID <= @SaleID

      RETURN

END

GO

The above function accepts the particular product for which we were searching, along with the SaleID from the SalesHistory table. From the function definition, you can see that the function returns a table named @SalesTable that contains two columns: SalesTotal and SalesCount. The body of the function inserts aggregate values into the @SalesTable table variable based upon the input parameters.

The following code uses the APPLY operator to invoke the table-valued function with the values from the SalesHistory table. (Note: Logically, you may want to use a JOIN operator here, but it is not necessary. The APPLY operator essentially does the "JOIN" for us by applying the values from the SalesHistory table to the table-valued function. In a sense, this code works the same way a correlated sub-query does, except that it can return multiple correlated values.)
SELECT * FROM SalesHistory sh

CROSS APPLY dbo.udf_GetProductSalesTable(sh.Product, sh.SaleID)

ORDER BY sh.SaleID ASC

Additional TechRepublic resources on UDFs

Tim Chapman is a SQL Server database administrator who works for a bank in Louisville, KY, and has more than 7 years of IT experience. He is also Microsoft certified in SQL Server 2000 and SQL Server 2005. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

————————————————————————————————————————————-

Get SQL tips in your inbox

TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

Editor's Picks

Free Newsletters, In your Inbox