SolutionBase: Making Excel and SQL Server talk to each other

Create reports from SQL Server data in Microsoft Excel.

Many companies today are using SQL Server to handle data about products, services, and support calls—just about anything you can imagine. Usually companies that have SQL Server have very large and expensive software applications dedicated to entering and retrieving data from the databases. While these applications are useful, and required for getting data into the SQL Server environment in many cases, reporting can sometimes be problematic.

For many users, generating reports from SQL Server is not the first thing they think of as user friendly. With a little work, you can create documents that present SQL Server data in such a way that your end users don't realize the data they're playing with lives in a huge database. Instead, you can present such data in a more familiar format such as an Excel spreadsheet. Here's how it's done.

Author's note
The focus of this piece will be on two programs—SQL Server and Excel. This article will help you show your users how to use the two programs seamlessly to view data. The reports and information you'll be able to provide will help them do their jobs more efficiently and reduce calls to the help desk or the company DBA.

For the purposes of this article, I'll use SQL Server 2000 Evaluation Edition and Office XP. You'll need at least the runtime version of Microsoft SQL Server and Microsoft Excel or Office. I'll be using the Northwind database, an example database included with the Standard and Enterprise versions of SQL Server. This database is also available with Microsoft Access.

Simple prerequisites
Most office environments use Microsoft Office products, making Excel easy to come by. However, SQL Server can be a bit pricey, and if your organization hasn’t already shelled out or isn’t willing to shell out the usage fee (Microsoft licensing fees and other expenses) for the supercharged version of SQL Server, don't worry. Microsoft has created a desktop edition that is free to download. It will hold data just as well as the bigger versions of the product but doesn’t come with all the helpful tools for designing databases.

Your first step in creating useful reports in Excel is to poll your user community for things they use everyday. For example, you might take note when users from a certain department come by to ask questions and get help; if they are having problems with a report or getting at some data, then this solution might provide both of you with a quick and easy fix to a potentially ongoing problem. In the following scenario, I'll create a quick report for Phil, a marketing department employee, that details which customers purchased Sasquatch Ale and the quantities they purchased.

When preparing the customer report, talk to the users who might get additional use from it. Maybe it begins as a report to satisfy a need for Phil in the marketing department, but in the long run, a report created for one user will probably fit the needs of several others, with very few changes, if any.

Obtaining data natively in SQL Server
After you've collected some information, begin testing ideas to extract this data. Through my own experience, I've found that the best ways to write a report destined for Excel is to create it in Microsoft Query Analyzer or to use the tool included with Office, Microsoft Query, and test it for validity rather than relying on the Query Wizard.

Getting data from SQL Server can be very overwhelming if the report you're generating looks to several tables and needs piles and piles of data. If you're accessing multiple tables, then the Query Wizard may be a better choice to create the report. It allows you to include comments in SQL statements so that you can allow someone else to pick up where you left off, or in some cases, continue to build on a query later.

The report to show which customers purchased Sasquatch Ale and how much of this product was purchased per customer should contain only one query and reference four tables: Customers, Orders, Products, and Order Details. This three-table query might sound a bit complex at first, especially if you have yet to create your own SQL statement; but as you'll see, it really isn’t so bad.

If you look at each element individually and piece it in with the other elements, it will fall into place quickly. To retrieve the customer information, company name, and contact name, you'll need to select them from the Customers table. This table also contains the customer ID, which will provide access to the orders placed by that customer.

The Orders table, which contains the order ID for each order placed and the customer ID for the customer who placed the order, should be looked at as the gateway to the information you wish to select. While you do not need to display any of the information from the Orders table, you must make a pit stop there to get the order ID. Once this is done through a simple JOIN statement, you can select the quantity ordered for each customer, as well as the order date from the Order Details table using another simple JOIN statement.

One final JOIN statement tosses you into the Products table for the name of the product ordered. The code for this SELECT statement is shown below with comments:
-- Query to select the Customers who ordered Sasquatch Ale, the quantity they purchased and the date the order
-- was placed
-- tell SQL Server which Database it should use for your query. By default the Master database is selected.
use northwind
-- create a query selecting the columns needed from their respective tables
select cust.companyname, cust.contactname, orddet.quantity, ord.orderdate, prod.productname
from customers cust inner join orders ord on cust.customerid = ord.customerid inner join [order details] orddet on ord.orderid = orddet.orderid inner join products prod on orddet.productid = prod.productid
where prod.productname = 'Sasquatch Ale'

The code sample above will create the output shown in Figure A.

Figure A
The result set from the above query, shown in Query Analyzer’s grid format, shows 19 customers who ordered Sasquatch Ale.

The tables are joined on a common column. For example, Customers and Orders are joined where the customer IDs for both tables are equal using the command:
inner join orders ord on cust.customerid = ord.customerid

The tables also have additional syntax following their name. This is called an alias. It allows you to reference joined tables in shorthand format to simplify code. I used the cust alias for customers and the ord alias for orders. This allows me to use the syntax cust.contactname and ord.orderdate when referring to the contactname column in the Customers table and the orderdate column in the Orders table. Doing so makes the size of the query a little smaller. When you're creating large queries, fewer keystrokes might be very helpful.

This query is not the simplest query to write using only transactional SQL. It involves joining four tables to retrieve the desired information. Also, in the above statement following the Where clause, I could have added order by ord.orderdate to put the results in order by the date the order was placed, but that is not necessary to make the statement work.

How to do it in Excel
Now I'll look at the same query through the eyes of Excel. In Excel, go to Data | Import External Data | Select New Database Query. When you select the New Database Query menu option, you'll be presented with the screen shown in Figure B, which asks you to choose a DSN.

Figure B
The data source name selection window with new data source highlighted: This is the default.

What if there is no SQL Server DSN created? Simply click the OK button on the right-hand side of the window, with the default selection highlighted.

DSN, DNS, ODBC: What are all these acronyms?
When you connect to SQL Server using Excel, you're creating an Open Database Connectivity (ODBC) connection to the database. These connections are referred to as Data Source Names (DSNs), which sometimes gets mistyped as DNS, the domain name system used on the Internet. I have often made this typo, which passes the spell checker in Word and can easily be overlooked.

Click the OK button at the right of the window. You'll then see the Create New Data Source Wizard begin, as shown in Figure C.

Figure C
The first screen of the Create New Data Source Wizard

You can specify a custom name for the DSN and then select the type of data source you're connecting to. In this case, click the drop-down box, scroll to the last type of data source (SQL Server), and then click the Connect button.

On the SQL Server Login box, you'll need to specify the server name of your SQL Server 2000 computer, along with the login credentials allowing your DSN to connect to the SQL Server successfully. If your database server uses Windows authentication, select the Use Trusted Connection box and click OK. This method of connection is shown in Figure D.

Figure D
Enter the login credentials required for the database.

If your SQL Server uses SQL logins, or user accounts created within SQL Server, you should consider creating a read-only SQL login account for use by all reporting applications. If you have an account created already, enter the login name and password for that account on this page and click OK.

If you want to ensure that your data source always connects to the right database on the login box shown in Figure D, you can click the Options button and select the default database. In this case, that database would be Northwind. Then click OK and proceed with the remainder of the wizard. While this step isn’t required, it might make the process a little smoother. The screen to specify a database is shown in Figure E.

Figure E
Choosing a default database for this DSN

The wizard will then make a connection to the database using the credentials you specified and ask you if you want to specify a table for this DSN, as shown in Figure C. Leave this box blank. Because your query will use more than one table, there isn’t a need for a default table. Saving the username and password as part of the DSN isn't a bad idea if you have created a SQL login for read-only use; that way, you wouldn’t have to field calls about the username and password for the report.

Once you have everything filled in as desired, click OK. This will return you to the Choose A Data Source box, but this time your newly created DSN will appear in the list. Click on your DSN to select it and click OK.

On the next screen that comes up, you'll see the Query Wizard shown in Figure F. Here, you can choose the columns within the tables of the default database (if you selected one previously) by expanding a table's plus sign and highlighting the column you want to choose. Click the single arrow in the middle of the screen to add the column to the list.

Figure F
Choosing columns for use by the wizard

After you've selected all the columns you want to use, click Next. You'll then be asked to filter the result set. When you apply filters to the result set, you are essentially building the WHERE clause of the query. If you want to see only results where order ID is equal to 3, you would complete this page as shown in Figure G.

Figure G
Choose the order in which you would like the results sorted.

When you have filtered the data as desired, click Next and tell the Query Wizard the order in which you would like your results sorted. This screen shows the columns you've selected in each drop-down box. If you select a column, the results will be sorted by each column that you selected.

On the final screen of the wizard, you can continue to edit your query within Microsoft Query, allowing you to further customize the results you receive, or you can create an OLAP query or return the results to Excel. For this portion of the wizard, I'll add the query shown above to the wizard and display the results in Microsoft Query and then in Excel (Figures H and I).

Figure H
Results displayed in Microsoft Query

Figure I
Results displayed in Microsoft Excel

When you actually do return results to Excel, either from the wizard or from Microsoft Query, you'll be asked to choose a cell to put the data in. If you plan to include the title of the report, you might want to leave a little extra room and begin the data around row 6. If you choose to view the query and further edit it in Microsoft Query, the icon on the toolbar showing a door with an arrow entering it is the button to return that data to Excel.

A sometimes easier alternative
When you're using a complex query, as I have in this example, it might be easier to compose the query in Query Analyzer (shown above) or in Microsoft Query rather than using the wizard. To accomplish this, select any column when asked to do so by the wizard. Then click Next three times and select View Data or Edit Query in Microsoft Query. Click Finish, and you'll see the record you chose in Microsoft Query. Highlight the table shown graphically at the top of the Microsoft Query window. Then click Table | Remove Table. Click the SQL button on the toolbar and manually type your query or paste a pretyped query into this box, and click OK.