GET /cis24/class7_notes.html HTTP/1.1
The first part of the line is the request method. When you want to view an HTML document, your browser automatically performs a GET. The other method that's generally supported is POST, which is often used to perform form submissions. We'll talk more about GET vs. POST as we go on.
The second part of the line is the path of the resource you want. In the above example, it is a request for an HTML file in the cis24 directory. If the request was a form submission using the GET method, all of the form data would be appended to the URL as a query string that is URL encoded. For example, if I submitted a form to a script located at /cgi-bin/myscript.pl and I provided inputs to fields that asked for my first name, last name, and birthday, the request line would look like this:
GET /cgi-bin/myscript.pl?firstname=Mike&lastname=Toppa&birthday=6%2F21%2F70 HTTP/1.1
The query string is all of the data following the question mark. URL encoding is how the data is formatted: an equal sign separates the input field's name from the value I entered for it, and the ampersand separates the name/value pairs. Also, special characters that might be misinterpreted by the web server (such as a slash) are converted to their hexadecimal equivalents (for the slash, this is %2F).
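The encoding rules can be sketched in Perl as follows. This is a minimal illustration, not what any particular browser does internally: the names url_encode and build_query_string are made up for this example, the set of characters left unencoded is an assumption, and the input is assumed to be plain single-byte text. It also turns spaces into plus signs, which is how form submissions encode them.

```perl
use strict;
use warnings;

# Encode one name or value. Letters, digits, and - _ . ~ pass through,
# a space becomes +, and any other character becomes % followed by its
# two-digit hexadecimal code. (Assumes single-byte characters.)
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~ ])/sprintf("%%%02X", ord($1))/eg;
    $text =~ tr/ /+/;
    return $text;
}

# Join name/value pairs with = and &, in the order they are given.
sub build_query_string {
    my @fields = @_;    # flat list: name, value, name, value, ...
    my @pairs;
    while (@fields) {
        my ($name, $value) = splice(@fields, 0, 2);
        push @pairs, url_encode($name) . "=" . url_encode($value);
    }
    return join("&", @pairs);
}

my $query = build_query_string(
    firstname => "Mike",
    lastname  => "Toppa",
    birthday  => "6/21/70",
);
print "GET /cgi-bin/myscript.pl?$query HTTP/1.1\n";
```

Run on the form values from the example above, this prints the same request line shown there.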
Although the POST method does not use a query string to send form data, it still URL encodes it. So, regardless of the method used for the form submission, your script will have to decode the form data before you can use it. More on this in a few minutes.
The third part of the request line indicates that you are using the HTTP 1.1 standard to make the request.
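To make the three parts concrete, here is a small sketch that takes the form-submission request line from above apart. The variable names are chosen for this example only, and a real web server does this work for you before your script ever runs.

```perl
use strict;
use warnings;

my $request_line = "GET /cgi-bin/myscript.pl?firstname=Mike&lastname=Toppa&birthday=6%2F21%2F70 HTTP/1.1";

# The three parts of the request line are separated by single spaces.
my ($method, $target, $protocol) = split(/ /, $request_line, 3);

# Anything after the first question mark is the query string.
my ($path, $query_string) = split(/\?/, $target, 2);
$query_string = "" unless defined $query_string;

print "method:       $method\n";
print "path:         $path\n";
print "query string: $query_string\n";
print "protocol:     $protocol\n";
```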
User-Agent: Mozilla/4.05(WinNT; I)
Accept: image/gif, image/jpeg
After your browser has sent all its header fields, it sends a blank line (i.e. two line breaks in a row) to indicate the end of the header.
HTTP/1.1 200 OK
The first field indicates that HTTP 1.1 is the protocol being used to communicate. The second is a code indicating the status of the request. "200" means that the request was successful and that the requested data will be sent. The third field is a brief description of the meaning of the status code. In this case, "OK".
Date: Thu, 23 Mar 2000 08:20:33 GMT
Server: NCSA/1.5.2
Last-modified: Tue, 21 Mar 2000 12:15:22 GMT
Content-type: text/html
Content-length: 2482
A blank line ends the header.
The body: assuming the request is successful, the requested data is sent. The data may be a copy of an HTML file or a response from a CGI program.
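The three pieces of a response (status line, header fields, body) can be pulled apart with a few calls to split. The sketch below uses a shortened sample response built from the lines above. The body is only a placeholder, and the variable names are assumptions made for this example.

```perl
use strict;
use warnings;

# A shortened sample response. Servers normally end each header line
# with a carriage return and linefeed. The body here is a placeholder.
my $response = join("\r\n",
    "HTTP/1.1 200 OK",
    "Date: Thu, 23 Mar 2000 08:20:33 GMT",
    "Server: NCSA/1.5.2",
    "Content-type: text/html",
    "",
    "<HTML><BODY>placeholder body</BODY></HTML>",
);

# The first blank line separates the header from the body.
my ($head, $body) = split(/\r?\n\r?\n/, $response, 2);

# The first line of the header is the status line.
my ($status_line, @field_lines) = split(/\r?\n/, $head);
my ($protocol, $code, $reason) = split(/ /, $status_line, 3);

# Each remaining line is a "Name: value" header field.
my %header;
foreach my $line (@field_lines) {
    my ($name, $value) = split(/:\s*/, $line, 2);
    $header{lc $name} = $value;    # field names are not case-sensitive
}

print "status code:  $code ($reason)\n";
print "content type: $header{'content-type'}\n";
print "body:         $body\n";
```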
One of the first jobs of a script that processes form data is to undo the URL encoding. Fortunately, Perl has some built-in functions that make this fairly easy. The first one to look at is called hex. hex accepts one argument, a string of hexadecimal digits, and converts it to the equivalent decimal number. For example, the number sign (#) has the character code 23 in hexadecimal notation. If you pass 23 to hex, it returns 35, which is the decimal character code for #.
So now you're halfway done. To turn the number 35 to a #, you use the pack function. pack accepts two arguments. Using "C" as the first argument tells pack to convert the second argument (which is your decimal number) to its character equivalent. So, pack will take the number 35 and turn it into a #. Here's some example code:
$hexNum = "23";   # hex expects a string of hexadecimal digits
print "the hex value is: $hexNum\n";
$decNum = hex($hexNum);
print "the equivalent decimal value is: $decNum\n";
$char = pack("C", $decNum);
print "the equivalent character is: $char\n";
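The same two functions can decode a whole encoded string in one pass when combined with a substitution. The sketch below is a minimal version of that idea. The name url_decode is made up for this example, and it also converts plus signs back into spaces, since that is how form submissions encode spaces.

```perl
use strict;
use warnings;

# Decode one URL-encoded name or value.
sub url_decode {
    my ($text) = @_;
    # A plus sign stands for a space.
    $text =~ tr/+/ /;
    # Replace each % followed by two hex digits with the matching character.
    $text =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;
    return $text;
}

print url_decode("6%2F21%2F70"), "\n";     # prints 6/21/70
print url_decode("Joe+Smith%23"), "\n";    # prints Joe Smith#
```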
<HTML>
<HEAD>
<TITLE>A simple form</TITLE>
</HEAD>
<BODY>
<FORM ACTION="/cgi-bin/birthday.pl" METHOD="GET">
<P>First Name: <BR><INPUT TYPE="Text" NAME="firstname" SIZE="20">
<P>Last Name: <BR><INPUT TYPE="Text" NAME="lastname" SIZE="20">
<P>Birthday: <BR><INPUT TYPE="Text" NAME="birthday" SIZE="10">
<P>Email Address: <BR><INPUT TYPE="Text" NAME="email" SIZE="20">
<P><INPUT TYPE="Submit" NAME="submit" VALUE="Send Information">
</FORM>
</BODY>
</HTML>
#!perl
# The shebang line has to be the first line in your script!
# More on this later in tonight's class.
# If the data was sent via GET, we'll decode it from the query string.
# If it was sent via POST, we'll read it from the body of the request,
# which is considered STDIN by your web server.
if ($ENV{'REQUEST_METHOD'} eq "GET") {
    # Split the name-value pairs
    @pairs = split(/&/, $ENV{'QUERY_STRING'});
}
elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
    # The "read" function is used to read data into a scalar variable.
    # The first argument is the file to read from.
    # The second argument is the variable to assign the results to.
    # The third argument is the number of bytes to read from the file,
    # (the content_length tells us how many bytes were sent)
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    # Now split the name-value pairs
    @pairs = split(/&/, $buffer);
}
# @pairs contains all the name value pairs that were submitted, but they're
# still joined as single pieces of information, and they're still URL
# encoded. That is, the array looks something like this:
# (firstname=Joe, lastname=Smith, birthday=1%2F1%2F85)
# We need to get this data into a usable form by getting the key/value
# pairs into an associative array, and we need to decode the
# special characters.
foreach $pair (@pairs) {
    # For each element of the @pairs array, we'll split it into two
    # variables: $name and $value
    ($name, $value) = split(/=/, $pair);
    # The next four lines decode the data. First we convert +
    # characters into spaces. Spaces are the one type of character
    # which are not put in a hexadecimal format. For all other
    # special characters, we do the hexadecimal conversion as
    # described above.
    $name =~ tr/+/ /;
    $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    # Now we'll add each name/value pair to the associative array %in.
    $in{$name} = $value;
}
# Now that we have our data in a useful form, we can dynamically create a
# web page and send it back to the user. The page will display the information
# he or she entered into the form.
# First we have to create a line for the header that tells the user's browser
# what kind of data we are sending, so that the browser will know what to do
# with it. In this case, it's an HTML page. You'll notice that this is an
# ordinary "print" function. For the web server, standard output (STDOUT) is
# the header and body of an HTTP response. We indicate that we're done with
# the header by printing a blank line (two linefeeds)
print "Content-type: text/html\n";
print "\n";
# Now we can print our HTML document. You can take advantage of Perl's
# variable interpolation to print variable values directly in your HTML.
print qq^
<HTML>
<HEAD>
<TITLE>Your name and birthday</TITLE>
</HEAD>
<BODY>
<P><B>Your first name:</B> $in{'firstname'}
<P><B>Your last name:</B> $in{'lastname'}
<P><B>Your birthday:</B> $in{'birthday'}
<P><B>Your email address:</B> $in{'email'}
</BODY>
</HTML>
^
#!perl
# Parse the incoming form data
if ($ENV{'REQUEST_METHOD'} eq "GET") {
    @pairs = split(/&/, $ENV{'QUERY_STRING'});
}
elsif ($ENV{'REQUEST_METHOD'} eq "POST") {
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    @pairs = split(/&/, $buffer);
}
# Add a check to make sure they didn't try to use some other
# method.
else {
    print "Content-type: text/html\n\n";
    print <<"end_tag";
<HTML>
<HEAD>
<TITLE>Form Error</TITLE>
</HEAD>
<BODY>
<P><B>The METHOD of your request for this script must be either GET or POST</B>
</BODY>
</HTML>
end_tag
    exit;
}
foreach $pair (@pairs) {
    ($name, $value) = split(/=/, $pair);
    $name =~ tr/+/ /;
    $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    # We should handle the possibility of the form containing INPUT
    # elements with the same name. With the code below, we'll
    # create a comma-separated list of values for any form fields
    # that have the same name. We test with "defined" so that an
    # earlier value of 0 or an empty string is not overwritten.
    if (defined $in{$name}) {
        $in{$name} = $in{$name} . "," . $value;
    }
    else {
        $in{$name} = $value;
    }
}
# Check to see if the inputs are valid
push (@errors, "your first name") unless ($in{'firstname'} =~ /\w+/);
push (@errors, "your last name") unless ($in{'lastname'} =~ /\w+/);
push (@errors, "a valid date for your birthday") unless ($in{'birthday'} =~ m:^\d{1,2}/\d{1,2}/\d{1,2}$:);
push (@errors, "a valid email address") unless ($in{'email'} =~ /^\S+\@\S+\.\S+$/);
# If there's any invalid data, print a page with the error messages, and exit the script.
if (@errors) {
    print "Content-type: text/html\n\n";
    print <<"end_tag";
<HTML>
<HEAD>
<TITLE>Input Error</TITLE>
</HEAD>
<BODY>
<P>Please click your browser's <I>back</I> button and enter:
<UL>
end_tag
    foreach $error (@errors) {
        print "<LI>$error\n";
    }
    print <<"end_tag";
</UL>
</BODY>
</HTML>
end_tag
    exit;
}
print qq^Content-type: text/html

<HTML>
<HEAD>
<TITLE>Your name and birthday</TITLE>
</HEAD>
<BODY>
<P><B>Your first name:</B> $in{'firstname'}
<P><B>Your last name:</B> $in{'lastname'}
<P><B>Your birthday:</B> $in{'birthday'}
<P><B>Your email address:</B> $in{'email'}
</BODY>
</HTML>
^
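Both versions of the script print the form values straight into the page, so a value containing characters such as < or & would be treated as HTML by the browser. One way to guard against that, sketched below, is to escape those characters before printing. The name escape_html and the sample values are made up for this example. It is a minimal sketch of the idea rather than a complete defense.

```perl
use strict;
use warnings;

# Replace the characters that have special meaning in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must come first, or it would re-escape the others
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Sample decoded form data (hypothetical values).
my %in = (
    firstname => "<B>Joe</B>",
    lastname  => "Smith & Sons",
);

# Build an escaped copy of every value before printing.
my %safe = map { $_ => escape_html($in{$_}) } keys %in;

print "<P><B>Your first name:</B> $safe{'firstname'}\n";
print "<P><B>Your last name:</B> $safe{'lastname'}\n";
```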
Apache should already be configured properly for our use on the Lab computers. After the lecture we'll test them to make sure.
An HTML form is typically the "front-end" of a CGI application. These forms, and other HTML documents, should be saved in the "htdocs" directory under the "Apache" directory.
Your Perl/CGI scripts need to be placed in a special directory named "cgi-bin". The Apache servers in the Lab are configured so that scripts will work only if they are stored in this location.
Your HTML form must have the correct location of the Perl/CGI script indicated in the ACTION attribute of the FORM tag - see the "birthday" form above for an example.
So far you've been able to debug your scripts simply by running them in an MS-DOS window. If there was an error in the script, you'd see a detailed error message printed to your screen when you tried to run the script. As our scripts become longer and more complex, you should get out of this habit. For example, you may have a script that opens a text file and makes changes to it. If the script contains an error that causes it to crash partway through executing, it could conceivably ruin your text file. Also, when you use Perl for CGI, you won't see detailed error messages in your browser when there's a problem: all you'll get is a page that says "Server Error." So, what are the options for debugging a CGI Perl script?
perl -c your_script.pl
The "-c" switch tells Perl to read through the script and look for syntax errors, but not to actually run the script. It will report back with the message "your_script.pl syntax OK" if it doesn't find any problems, or it will list the errors it finds.
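A syntax check does not catch errors that only happen while the script runs. Since the browser shows nothing more than "Server Error" in that case, one possible approach is to trap the failure with eval and send the message back as an ordinary page. The sketch below assumes the main work of the script is wrapped in a code reference that returns the page to print. The names error_response and run_safely, and the file name events.txt, are hypothetical. You would probably want this only while developing, since error details should not be shown to regular visitors.

```perl
use strict;
use warnings;

# Build a complete CGI response (header, blank line, body) that shows
# an error message, with HTML special characters escaped.
sub error_response {
    my ($message) = @_;
    chomp $message;
    $message =~ s/&/&amp;/g;
    $message =~ s/</&lt;/g;
    $message =~ s/>/&gt;/g;
    return "Content-type: text/html\n\n"
         . "<HTML><BODY><P><B>Script error:</B> $message</BODY></HTML>\n";
}

# Run a code reference that returns the page to send. If the code
# dies, return an error page instead of letting the script crash.
sub run_safely {
    my ($code) = @_;
    my $page;
    my $ok = eval { $page = $code->(); 1 };
    return $ok ? $page : error_response($@);
}

# Simulate a script that fails partway through.
print run_safely(sub { die "could not open events.txt\n" });
```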
#ServerName new.host.name
ServerName localhost
http://localhost
ScriptAlias /cgi-bin/ "f:/Apache/cgi-bin/"
You could add an additional line:
ScriptAlias /scripts/ "f:/Apache/scripts/"
#AddHandler cgi-script .cgi
Once the leading # is removed, this directive allows any script with the file extension .cgi to be executed. You can change this or specify additional file extensions if you wish.
<A HREF="/cgi-bin/test.pl">CGI test</A>
Save the file with the name "test.html" in the "htdocs" directory under the "Apache" directory. To see it, type this address into your browser: http://127.0.0.1/test.html
print "howdy!\n";
Save the file with the name "test.pl" in the "cgi-bin" directory under the "Apache" directory.
There are two problems with this script that are preventing it from executing. One is that we need to tell Apache where to find Perl on the computer's hard drive. This is because Apache can't execute your script by itself - it needs to send your script to the Perl interpreter - perl.exe - in order for the script to run. Add the "shebang" line as the first line in your script:
#!perl
Save the script, and return to your test.html page and click the link again. You should get another server error. Let's open the error.log file again and see what it says.
It should say something like "malformed header from script. Bad header." This is happening because anything that your server is going to send to a user must first identify what kind of file it is - this is done with the "header", as described earlier. The server sends this header to your web browser so that the browser knows whether to display the file (as it does with HTML files) or to start a download of the file (as it does with, for example, Zip files).
To create the header, add the following line after the shebang line:
print "content-type: text/html\n\n";
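Putting the pieces together, the finished test script might look like the sketch below. The $header and $body variables and the "use strict" and "use warnings" lines are additions made for this example. The important point is the order: the header comes first, then a blank line, then the body.

```perl
#!perl
use strict;
use warnings;

# The header identifies the kind of data being sent. The second \n
# produces the blank line that ends the header.
my $header = "content-type: text/html\n\n";

# Everything after the blank line is the body that the browser displays.
my $body = "howdy!\n";

print $header, $body;
```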
Copy-and-paste the "birthday" form from the lecture notes into a file and save it to the htdocs directory. Then do the same with the CGI script that receives the form submission (but save the script to the cgi-bin directory). Test it to see if it works, and try making some changes to the script (whatever you like) so you can get comfortable working in the CGI environment.
Last week you worked on the HTML forms for your Project.
If you're doing the Calendar, tonight you can work on the script that receives the form submissions and performs input validation on the data from the Event Entry page (see Section III of the Project description). For now, you can display the form inputs to a web page, since we haven't yet discussed saving data to files.
If you're doing the Camera Shopper, tonight you can work on the script that receives the form submission from the Search page and generates the results page (see the Project description). We haven't yet discussed working with files, so for now your script can simply display the search results in a web page.