CGI ground school is over, and it's time to fly! Prepare for full-fledged CGI combat. Today, our instructor explains how CGI scripts should be run and how you can troubleshoot problematic scripts.
As any seasoned Web server administrator will tell you, loading CGI scripts on a server is only half the battle. You’ve still got to get them to run.
And how, pray tell, do I run a CGI script?
We're now at the point where the Web server has decided it should run the CGI program, and it’s made a request to the operating system to execute the file. This is where many problems begin occurring, because there are numerous items that must be exactly right in order for the program to run successfully and send the output back to the Web browser properly.
Some of these potential problems are specific to UNIX, and some are specific to Windows NT (I won't go into other operating systems because these two are the most common). Here’s a list of things that need to be correct in order for the process to work as intended.
1. The file needs to be executable (UNIX only)
In UNIX, files have attributes that don't exist in the Windows NT world. One of these is the executable bit. Each file has settings that tell the operating system whether it can be executed as a program and whether it can be run only by the file's owner, only by members of the file's group, or by everyone on the server. In order for the operating system to run the file on behalf of the Web server, it needs to be marked as executable by everyone. This is what the chmod command does.
I won't go into detail about how chmod works, but when you see an instruction that says something like “do a chmod 755 on the program.cgi file,” what it is telling you is to make the file executable by everyone so that it can be run from the Web server.
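As a rough illustration of what "chmod 755" does, here is a minimal Python sketch. The function names are hypothetical and chosen only for this example, and it assumes a UNIX-style file system where permission bits behave as described above.

```python
import os
import stat


def make_executable_by_everyone(path):
    """Same effect as `chmod 755 path`: rwx for the owner, r-x for group and others."""
    os.chmod(path, 0o755)


def is_executable_by_everyone(path):
    """Return True if the execute bit is set for owner, group, and others."""
    mode = os.stat(path).st_mode
    needed = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return mode & needed == needed
```

Calling is_executable_by_everyone("program.cgi") before and after the chmod is a quick way to confirm the change took effect.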
2. The file needs to point to a valid executable (UNIX only)
The server knows to run a .cgi file as a program, but it also needs to know how to run it. If it's a compiled executable, there's no problem: the server just runs it. But if it's a script using a language like Perl, the server needs to know where to find the Perl program. This is the function of the first line of the file. For example:
#!/usr/bin/perl
In UNIX, this points to an executable file (in this case, the program is named perl) that will run your script.
The first two characters, #!, are called a shebang, which is a common UNIX convention. If your script starts with the line above, and your server doesn't have a program called /usr/bin/perl, the whole thing will die, and you'll get an error back. For Perl scripts, the line above is typical, and most servers have /usr/bin/perl. But in some rare instances, things are configured differently, and you need to edit this first line to point to a valid program to run.
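The shebang check can be automated. The sketch below, in Python, reads a script's first line and tests whether the program it names exists and is executable. The function names are hypothetical, and the check only approximates what the operating system does when it launches the script.

```python
import os


def shebang_interpreter(script_path):
    """Return the program named on the script's shebang line, or None if there is none."""
    with open(script_path, "rb") as f:
        first_line = f.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    # Anything after the program path (such as "-w") is an argument, not part of the path.
    parts = first_line[2:].split()
    return parts[0] if parts else None


def shebang_is_valid(script_path):
    """Return True if the shebang names an existing, executable program."""
    interpreter = shebang_interpreter(script_path)
    return (interpreter is not None
            and os.path.isfile(interpreter)
            and os.access(interpreter, os.X_OK))
```

If shebang_is_valid returns False for a script, the first line is the place to start editing.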
3. The file needs to have an executable extension (NT only)
In the Windows NT world, execute permissions don't exist, and the shebang doesn't apply.
Some Web servers on NT are "smart" and have been designed to use the shebang line to act like UNIX.
So you don't need to worry about using chmod on servers running Windows NT, and you can ignore the first line of the file. What you do need to be concerned about, however, is the file extension.
Windows decides which program to open a file with based on its extension. If you want to run a Perl script on NT, for example, you need to give it an extension of .pl so the operating system will know to use Perl to open and run the file.
4. The program needs to return a valid response
Any CGI program that runs must return a valid response to the browser. However, if the program encounters a problem while running and dies, it could output an error message. If you were running the program in a normal window on NT or UNIX, you would simply see the error message.
But in the Web world, the program needs to hand its response back to the Web server, which then packages it to send back to the browser. If the program outputs an error message, the Web server does not get the response it expects and instead returns a general error (a 500 Internal Server Error, for example) back to the browser, saying there was a problem running the program.
How does a CGI script actually work?
Now that you understand how things need to be set up, it's a good time to step through the whole process and see exactly what happens when a CGI script is run. Going back to our original example, here is the sequence of events (assuming a UNIX server):
- The browser requests the URL in the form's ACTION attribute and passes all the data along with the request.
- The server recognizes that .cgi means this file should be run.
- It checks to make sure that CGI programs are allowed to run on the server.
- It checks to make sure that CGI programs are allowed to run in the /cgi-bin directory.
- The server launches a sub-process to run the program in the operating system.
- The operating system opens the file and looks at the first line to see which program to use with the script.
- It runs this program and passes it the filename to run.
- The script runs, does whatever it needs to, and then returns an HTML response, using print() statements, for example.
- The whole response is passed back to the server, which then packages it in an HTTP response, including content length, etc.
- The server passes the whole response back to the browser, which displays it.
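The middle of that sequence can be imitated in a few lines of Python. This is only a sketch under stated assumptions, not how any particular Web server is implemented: the function name run_cgi is hypothetical, the script is launched with the current Python interpreter rather than the program named on its shebang line, and only a GET request with a query string is modeled.

```python
import os
import subprocess
import sys


def run_cgi(script_path, query_string=""):
    """Run a CGI program roughly as a server would and return (headers, body)."""
    # The server passes request data to the program through environment variables.
    env = dict(os.environ,
               GATEWAY_INTERFACE="CGI/1.1",
               REQUEST_METHOD="GET",
               QUERY_STRING=query_string)
    # Assumption: launch with the current Python interpreter instead of
    # reading the shebang line the way the operating system would.
    result = subprocess.run([sys.executable, script_path],
                            env=env, capture_output=True, check=True)
    # Normalize line endings (fine for text output, not for binary bodies),
    # then split the headers from the body at the first empty line.
    output = result.stdout.replace(b"\r\n", b"\n")
    head, _, body = output.partition(b"\n\n")
    headers = {}
    for line in head.decode("ascii").splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # The server adds details such as the content length before replying.
    headers["content-length"] = str(len(body))
    return headers, body
```

Something like run_cgi("hello.cgi", "name=Ann"), with a hypothetical script name, returns the headers the script printed plus the added content length, along with the body that would go to the browser.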
And if it doesn’t work that way?
Of course, that whole process doesn't always go as planned, and there are some things that can stand in the way of your program running correctly. Like what, you ask?
When the Web server launches a sub-process to run the program (the fifth step above), it performs a trick: It changes the user ID to one with very few or no permissions. This is done for security purposes and prevents a script from accidentally overwriting important files or deleting whole directory trees.
But this also creates a problem when your program tries to access files to read and write. If you want your program to write to a file, you need to make sure it has permissions set up correctly for this user (usually a user named “nobody”) to write to it. Once again, you need to use the “chmod” command. A command like “chmod 777 filename.txt” will give Read, Write, and Execute permissions for the file to anyone on the server machine, so even when the server changes to the new user, it will still have access to the file.
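To see whether a data file is open to the low-privilege user, you can inspect its permission bits. The Python sketch below uses hypothetical function names and assumes a UNIX-style file system. It checks only the file itself, not the directories that contain it.

```python
import os
import stat


def writable_by_everyone(path):
    """Return True if the 'other' write bit is set, so a user like "nobody" can write."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)


def describe_mode(path):
    """Return the permission bits in the octal form chmod uses, for example '777'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "03o")
```

After "chmod 777 filename.txt", describe_mode should report 777 and writable_by_everyone should return True.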
File permissions are important to remember when trying to set up someone's CGI script, and they are often the cause of it not working correctly. Be sure to follow instructions on which file permissions are needed for which files in order to set up the script correctly.
The first thing a CGI script needs to output, assuming it's giving an HTML response back to the user, is the line "Content-type: text/html" followed by two returns (which creates an empty line). This is needed in any CGI script so that the Web server and the browser know what kind of data is being sent back and can process it appropriately. The CGI script could actually return any type of response it chose, whether it is plain text, a PDF document, or a Microsoft Word file. But 99% of the time, the result of any CGI script is going to be plain HTML.
If the script runs and outputs anything else before the Content-type line, the Web server will return an error message to the browser saying the script returned an invalid response.
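Here is a minimal Python sketch of that rule. Both function names are hypothetical. The checker tests only the two things described above, a leading Content-type line and the empty line after the headers, and is far looser than a real Web server's parsing.

```python
def html_response(body):
    """Build CGI output: the Content-type line, an empty line, then the HTML."""
    return "Content-type: text/html\n\n" + body


def has_valid_cgi_header(output):
    """Return True if the output starts with a Content-type line and has the empty line."""
    text = output.replace("\r\n", "\n")
    head, separator, _body = text.partition("\n\n")
    if not separator:
        return False  # no empty line marks the end of the headers
    first_line = head.split("\n", 1)[0]
    name, colon, value = first_line.partition(":")
    return bool(colon) and name.strip().lower() == "content-type" and bool(value.strip())
```

A script that dies and prints an error message before its header fails this check, which is the situation that makes the server report an invalid response.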
OK, so how do I fix it?
Any time you have a script that doesn't work, you need to go through a series of steps to figure out what is wrong. Most of the time, you'll be setting up a script that someone else wrote, which makes the cause harder to track down. But if you follow a few steps, it should be easier.
Start by ensuring CGI scripts are allowed. If you aren't the Webmaster in charge of the Web server, the very first thing to check is that you are allowed to run CGI scripts. Some ISPs don't allow users to run CGI scripts on their Web sites at all. Others allow it only in certain directories, while some allow it anywhere, with no restrictions. It really depends on your host, which Web server they are running, and what they allow.
Also, make sure the file is executable. On UNIX, if the script isn't marked as executable by everyone, the server can't run it, so do a chmod 755 on it.
Then, check the shebang line. Look at the first line of the file and see if the program that's supposed to run it actually exists. If it says #!/usr/bin/perl, you need to make sure this program exists on your Web server. If you don't know how to check, you can ask your ISP or Webmaster and they should be able to tell you. For Perl scripts, some people may need to change it to /usr/local/bin/perl or /usr/bin/perl5 or possibly other locations.
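If you have shell access, a short Python sketch like the following can look for Perl in the locations mentioned above. The function name find_perl is hypothetical, and the candidate list is just the three paths from this article, so a given server may keep Perl somewhere else.

```python
import os
import shutil

# The three locations named in the text; other servers may use different ones.
CANDIDATES = ["/usr/bin/perl", "/usr/local/bin/perl", "/usr/bin/perl5"]


def find_perl(candidates=CANDIDATES):
    """Return the first candidate that exists and is executable, else search the PATH."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Fall back to whatever "perl" the command search path provides, if any.
    return shutil.which("perl")
```

Whatever path it returns is the one to put after #! on the script's first line. A result of None suggests Perl isn't installed where this sketch looks.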
The next step is to try running the script locally. Hopefully, you have access to a command prompt or terminal on your Web server. In other words, you can Telnet into your UNIX machine or sit down at a DOS prompt on your NT server. If you don't have this kind of access, it will be harder to determine what’s wrong.
Log into the machine, navigate to the directory where the script is actually located, and try running it from the command line. It may give you an error message if the script has a syntax error in it, for example, which will help you to edit the file and fix it. If it runs perfectly, you know that the problem is most likely in your Web server's configuration or that it's possibly a security and file permissions problem.
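The same test can be scripted. The Python sketch below runs a command and reports whether it failed. The function name run_locally and the wording of its messages are made up for this example. It relies on the script exiting with a non-zero status when it dies, which is the usual behavior but not guaranteed for every program.

```python
import subprocess
import sys


def run_locally(command):
    """Run a script from the command line and summarize the outcome."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # The error text is what you would have seen in the terminal.
        return "script failed: " + result.stderr.strip()
    return "script ran cleanly; check server configuration and file permissions next"
```

For a Perl script, the command might be ["perl", "program.cgi"]. For a script written in Python, [sys.executable, "program.cgi"] works.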
Don’t forget to check the error log. Web servers keep a log of all errors, which includes problems with trying to run CGI programs. If there is a problem and your program cannot run correctly, looking at the error log might show you an error message like "access to filename.txt not allowed," which means your program tried to access a file that didn't have file permissions set correctly.
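Error logs can be long, so it helps to pull out only the lines that mention your script. This Python sketch uses a hypothetical function name and made-up sample lines. Real log formats differ from server to server, and the script name guestbook.cgi is invented for the example.

```python
# Made-up sample lines for illustration; not the format of any real server.
SAMPLE_LOG = [
    "[error] guestbook.cgi: access to filename.txt not allowed\n",
    "[error] File does not exist: /missing.html\n",
    "[error] guestbook.cgi: malformed header from script\n",
]


def errors_mentioning(log_lines, script_name):
    """Return the log lines that mention the given script, without trailing newlines."""
    return [line.rstrip("\n") for line in log_lines if script_name in line]
```

With a real log, you could pass an open file object as log_lines, since the function only needs something it can loop over line by line.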
If you don't have access to your Web server's error log on your ISP or host, ask the administrators to check it for you and see if the script is generating errors. If you've checked all these items and the script still fails, e-mail your Webmaster and ask for help. There may be some special setup, or the Webmaster might be able to give you some clues. Be sure to mention that you've tried the above steps, and the Webmaster will love you! Nothing is more aggravating than receiving a request for help from someone who has apparently done nothing to help themselves.
If you're still stuck after these steps, go to the source: contact the script author for help. I recommend you use this as a last resort. In all my experience with CGI scripts and helping people install them, I would say 95% of the problems come from Web server setup or other issues that I simply couldn't help with. Be sure to tell the author which steps you've tried to resolve the issue, too. I know that I am much more willing to help someone who has gone through some debugging steps on their own before asking me. Unfortunately, not every client knows much about CGI.
Finally, you should remember that if you’re faced with a problem beyond your scope of understanding, just say no. It’s better to avoid tackling a problem you can’t fix than it is to make it worse.