CGI crash course: How to lock and load CGIs on your Web server

Welcome to CGI ground school. Today, our instructor describes CGI scripts and explains how they are loaded on Web servers.

It’s three in the morning, and your beeper goes off. Your Web hosting firm’s newest client is having a problem getting a CGI-based shopping cart system running. And, of course, tomorrow is the “Grand Opening.”

If you’re like me and you run a tight operation and outsource most tasks, you’ve probably had a similar experience. Your natural reaction is likely to be to pass the problem on to Kenny, your contract programmer. But, wouldn’t you know it, he’s on his honeymoon, and his new bride wouldn’t let him bring even his Palm VII.

After cursing for about five minutes and then banging your shin on your bed-frame, you put on a big pot of java (pun intended) and plop down at your computer. While you Telnet to your server, a wave of horror overcomes you as you suddenly realize that you have no idea what the hell CGI actually is!

Fear not, for you are about to get a crash course.

So, what the heck is CGI?
CGI stands for Common Gateway Interface, a name you don't really need to remember. Basically, CGI defines how a Web server hands a request (for example, the information from an HTML form on a Web page) to a separate program and how that program's output gets back to the Web browser. That's simplifying it a bit, but you get the point.

What can I do with CGI?
A Web server spends most of its time answering requests, loading the requested HTML pages, and sending the pages to users. Nothing too complicated.

But what if you want users to see something different every time they load a page? What if you want to ask a user for information and save it to a database? What if you want to display information from a file that may change five times a day?

In situations like this, loading a static (non-changing) page just isn't sufficient. We need the Web server to run a program, take some action, and then send the results page back to the user's Web browser. Further complicating matters is the fact that the results page might be different every time the program is run.
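To make that concrete, here is a minimal sketch of a program that produces a different page every time it runs. It assumes your server can run Python scripts as CGI programs; the function name render_page is made up for this illustration. The one hard rule it shows is that a CGI program prints a header line, then a blank line, then the page itself.

```python
import datetime


def render_page(now=None):
    """Build a complete CGI response: header, blank line, then the HTML body."""
    now = now or datetime.datetime.now()
    body = "<HTML><BODY>The server time is {}.</BODY></HTML>".format(
        now.strftime("%H:%M:%S")
    )
    # The Content-Type header tells the browser that HTML follows.
    # The empty line (\r\n\r\n) marks the end of the headers.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI program simply writes its response to standard output.
    print(render_page(), end="")
```

Load the same URL twice and the time in the page changes, which a static file can never do.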

How about an example?
Let’s say you need to create a Web page that contains a form asking the user's name and e-mail address. There is also a Submit button. When users click Submit, their information should be saved on the Web server in a file you can view later, and they should get a Thank You screen back.

In your original HTML document, you will have a <FORM> and some <INPUT> tags. For example:
Name: <INPUT NAME="name"><BR>
Email: <INPUT NAME="email"><BR>

This is a basic form. If you don't know about forms, read up on them first. You can't really tackle CGI scripts without understanding forms.

The <FORM> tag has two parameters that are important for us. The METHOD parameter defines how the browser will send the information to the server and how the Web server will pass it on to your program. It can be either "POST" or "GET" (you will most often see "POST"). In short, GET tacks the form data onto the end of the URL, while POST sends it separately in the body of the request. For a full explanation of the difference, you need a longer tutorial or a book.
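Whichever METHOD you pick, the form values travel as one encoded string of name=value pairs. Here is a small sketch of what that string looks like for our two fields, using Python's standard urllib.parse module. The sample name and address are invented for the illustration.

```python
from urllib.parse import parse_qs, urlencode

# Invented sample values for the "name" and "email" fields of the form.
fields = {"name": "Ada Lovelace", "email": "ada@example.com"}

# This is the string the browser builds from the form.
# With GET it goes after a "?" in the URL; with POST it goes in the request body.
encoded = urlencode(fields)
print(encoded)  # name=Ada+Lovelace&email=ada%40example.com

# The program on the server reverses the encoding to get the values back.
# parse_qs returns a list per field because a field name can appear more than once.
decoded = parse_qs(encoded)
print(decoded["name"][0])  # Ada Lovelace
```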

The other parameter, ACTION, is the URL of the program on the server that will process the information sent from the form and do something with it.

When the user hits Submit, the Web browser makes a connection to the server, requests the URL in the ACTION parameter, and sends all the form values the user entered. The Web server looks at the URL, realizes it is a program rather than a static file, and runs it. The program then grabs all the data sent to it, does something, and returns HTML back to the browser as the response. That's it! That's the basic process that almost all CGI scripts will go through.
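Here is a rough sketch of what the program on the receiving end might look like for our name and e-mail form. It is one possible approach, not the only one. It assumes the server follows the usual CGI conventions: the REQUEST_METHOD, CONTENT_LENGTH, QUERY_STRING, and GATEWAY_INTERFACE environment variables are set, and POST data arrives on standard input. The file name submissions.txt and the function names are made up for this example.

```python
import html
import os
import sys
from urllib.parse import parse_qs

SAVE_FILE = "submissions.txt"  # hypothetical file name; pick your own location


def read_form(environ, stdin):
    """Return the submitted fields as a dict with one value per field."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the server feeds the form data to the program on standard input.
        # Form data is URL-encoded ASCII, so characters and bytes line up here.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the form data is whatever followed the "?" in the URL.
        raw = environ.get("QUERY_STRING", "")
    return {key: values[0] for key, values in parse_qs(raw).items()}


def handle(environ, stdin, save_path=SAVE_FILE):
    """Save the name and e-mail address, then build the Thank You page."""
    form = read_form(environ, stdin)
    name = form.get("name", "")
    email = form.get("email", "")
    with open(save_path, "a", encoding="utf-8") as f:
        f.write("{}\t{}\n".format(name, email))
    # Escape the name so a visitor cannot inject HTML into the page.
    body = "<HTML><BODY>Thank you, {}!</BODY></HTML>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


# Only act when a Web server actually launched us as a CGI program.
if __name__ == "__main__" and "GATEWAY_INTERFACE" in os.environ:
    sys.stdout.write(handle(os.environ, sys.stdin))
```

The script grabs the data, does something with it (appends a line to a file), and returns HTML, which is exactly the cycle described above.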

How do I set up my server for CGI?
In order for all this to happen, your Web server has to be configured to run CGI programs. By default, some are and some aren't. You will probably need to check with your Webmaster or ISP to see whether your server is set up correctly and whether you are allowed to run CGI programs. Or, if you are the Webmaster, you need to read the CGI section of your server's documentation. Either way, it's important to understand why it works when it does and why it fails when it doesn't.

When you (your browser, actually) request a URL from a server, the server needs to do some checking to find out what to do. How does the server know whether the URL you are requesting is a static file that it should just load and send or a program that it should run, sending you the output? This is typically decided by two factors: which directory the file is in and its file extension.

Getting to the bottom of the cgi-bin
First, let's look at the directory part. You've no doubt heard of a cgi-bin directory and noticed that most CGI scripts need to be in this directory. Why? Well, this is a server configuration issue. The server is set up to know that any file in this directory is a program to run, not a static file to send to the browser. Usually, you can't even put a regular HTML file in this directory, because when the server tries to load it, it will try to run it as a program rather than just send it as a file.

What's in a name?
Are you curious where the name cgi-bin came from? Well, it goes back to the early days of the NCSA Web server. By default, this Web server had two directories: cgi-src and cgi-bin. The first contained source code for CGI programs that could run on the server. The second contained the binaries (compiled executables) of the programs, which could be run on the server. Web servers typically don't have the cgi-src directory anymore, but the name cgi-bin has stuck around as the default place to house executable CGI programs on a Web server.

The extension factor
Now let's look at the second factor that determines whether a Web server runs a file or loads it as a static file: the file extension.

The extension of a file on the server, whether it be .html, .cgi, .pl, .txt, and so on, tells the server what kind of file it is and how to handle it. For instance, it knows that .html and .txt files are plain text static files that should just be sent to the browser.

You can add your own file extensions through the Web server's configuration options and tell it how to handle those files. The .cgi extension is one example of an extension that the Web server is configured to recognize as a program it should run.
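The two factors, directory and extension, boil down to a simple test. The sketch below is a toy version of that decision and not how any particular server implements it. The directory and extension lists stand in for your server's real configuration, and treating either factor alone as enough to run the file is an assumption of this example.

```python
import posixpath

# Assumed settings for illustration; a real server reads these from its configuration.
CGI_DIRECTORIES = ("/cgi-bin/",)
CGI_EXTENSIONS = (".cgi",)


def should_run_as_program(url_path):
    """Decide whether a requested URL path names a program or a static file."""
    # Factor 1: is the file inside a directory set aside for programs?
    in_cgi_directory = any(url_path.startswith(d) for d in CGI_DIRECTORIES)
    # Factor 2: does the file extension mark it as a program?
    _, extension = posixpath.splitext(url_path)
    has_cgi_extension = extension.lower() in CGI_EXTENSIONS
    return in_cgi_directory or has_cgi_extension


print(should_run_as_program("/index.html"))          # False
print(should_run_as_program("/cgi-bin/page.html"))   # True
print(should_run_as_program("/tools/search.cgi"))    # True
```

Note the second result: an ordinary HTML file dropped into cgi-bin gets treated as a program, which is why it usually fails to load.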

Okay, now let's take another look at the <FORM> tag from our example above, specifically its ACTION parameter.

When the Web browser sends its request to the ACTION URL, the Web server sees that it is in the cgi-bin directory and its extension is .cgi. Thus, it knows that this is a program it should run. So it hands off a request to the operating system, telling it to run the program and passing all the form data to the program. Makes perfect sense, doesn't it?
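If you are wondering what that hand-off looks like, here is a simplified sketch from the server's side, assuming a POST request. A real Web server sets many more environment variables and handles errors and timeouts; the function name run_cgi is invented for this example.

```python
import os
import subprocess
import sys
from urllib.parse import urlencode


def run_cgi(command, form_fields):
    """Run a CGI program the way a server would for a POST; return its raw output."""
    body = urlencode(form_fields).encode("ascii")
    # The request details travel in environment variables...
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": "POST",
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
        "CONTENT_LENGTH": str(len(body)),
    })
    # ...and the form data itself is fed to the program on standard input.
    result = subprocess.run(
        command, input=body, env=env, stdout=subprocess.PIPE, check=True
    )
    # Whatever the program printed is what goes back to the browser.
    return result.stdout


if __name__ == "__main__":
    # A stand-in "CGI program": a one-line Python script that echoes what it received.
    echo = (
        "import os, sys; n = int(os.environ['CONTENT_LENGTH']); "
        "sys.stdout.write(os.environ['REQUEST_METHOD'] + ' ' + sys.stdin.read(n))"
    )
    print(run_cgi([sys.executable, "-c", echo], {"name": "Ada"}))
```

In real life the command would be the path to the script named in the ACTION URL rather than an inline one-liner.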

Tomorrow, I’ll show you how to run the CGI scripts.
