CGI Programming Class: Lesson 5

Lesson 5

This lesson covers reading and writing files with your CGI. Inevitably, as you start to program more advanced CGI applications, you'll want to store data in a file: maybe a guestbook program that keeps a log of the names and email addresses of visitors, a counter program that must update a count file, or a program that scans a flat-file database and pulls info from it. This is very easy to do; usually the biggest problem with CGIs writing to files is file permissions in Unix.

Most web servers run with very limited permissions, so when your CGI is run, it runs with those same permissions and can't do very much. The catch is that in order to write to a data file, you must usually make it world-writable via the chmod command (mode 666 grants read and write to everyone; a data file doesn't need the execute bits that 777 would add):

  chmod 666 myfile.dat

The bad part about this is that anyone else with an account on the server can go in and mess up your data file, or even wipe out its contents, and there's not much you can do about it.

One alternative is to use the cgiwrap program on your web server, which forces CGIs to run under the owner's id and permissions. Also, the most recent version of the Apache httpd server supports SetUID CGI execution, which likewise forces CGIs to run under the owner's userid and permissions. If you are administering your own web server, this is by far the best option, because it allows your CGIs to safely modify your own files, without having to worry about other users overwriting or deleting your data.

For the sake of this lesson, let's assume you have neither cgiwrap nor SetUID, so you'll have to chmod any file you want to write to.
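
If a write fails, it can help to have a script report whether it is actually able to write to the file. What follows is only a minimal sketch using Perl's built-in -e and -w file tests; the helper name check_writable is made up for illustration and is not part of the lesson's scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# check_writable is a hypothetical helper (an assumption, not from the lesson).
# It reports whether the user this script runs as can write to $file.
sub check_writable {
    my ($file) = @_;
    return "missing"   unless -e $file;   # the file does not exist
    return "read-only" unless -w $file;   # it exists, but we lack write permission
    return "writable";
}

print "data.txt is ", check_writable("data.txt"), "\n";
```

Run from the command line, this tests your own permissions; run as a CGI, it tests the web server's, which is the case that matters here.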

Let's start by creating a simple data file. Enter the following data into a file, and call it data.txt.

  1101|Book of Ra|24.95|IN
  1102|Fnord Hunting|12.95|OUT
  1210|Illuminatus|14.95|IN
  1215|Conspiracy Theories|19.95|IN
  1422|Principia Discordia|9.95|IN

This is what's called a flat-file database - a text file containing data, with each line of the file being a new record in the database. The fields are separated by the pipe symbol (vertical bar |), though it could be any character that will not appear in the data itself.
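
As a small illustration of the record format, here is a sketch that takes one line of the sample data apart with split and puts it back together with join. The variable names are just for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One record from the sample data file.
my $line = "1102|Fnord Hunting|12.95|OUT";

# The pipe is special inside a regular expression, so it is escaped as \|.
my ($stock, $name, $price, $status) = split(/\|/, $line);

print "Stock number: $stock\n";
print "Name:         $name\n";
print "Price:        $price\n";
print "Status:       $status\n";

# join() reverses the split and rebuilds the original record.
my $rebuilt = join("|", $stock, $name, $price, $status);
print "Round trip OK\n" if $rebuilt eq $line;
```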

The above data is a hypothetical database with fields for stock number, book name, price, and status (in stock or out of stock). Let's say you want to write a CGI that will display all of the items in the database, such as for a product catalog or order form. The advantage of this is you can modify the database whenever a product's price or status changes, without having to edit any HTML pages. A simple example would be a script like this:

  #!/usr/bin/perl

  $filename = "data.txt";

  print "Content-type:text/html\n\n";
  print <<HTMLHead;
  <html><head><title>Catalog Page</title></head>
  <body>
  HTMLHead

  # read the whole file into @indata, one record per element
  open(INF,$filename) or die "Can't open $filename: $!";
  @indata = <INF>;
  close(INF);

  print "<table border=1>";
  print "<tr><th>Stock#</th><th>Description</th><th>Price</th><th>In Stock?</th></tr>\n";
  foreach $i (@indata) {
      chomp($i);                   # strip the trailing newline
     ($stock,$name,$price,$status) = split(/\|/,$i);
      print "<tr>";
      print "<td>$stock</td>";
      print "<td>$name</td>";
      print "<td>$price</td>";
      print "<td>$status</td>";
      print "</tr>\n";
  }
  print "</table>";
  print "</body></html>";


This script opens the file only briefly: it dumps the entire file into an array called @indata, closes the file, then prints the data out in a table. The script also could have been done without the @indata array, by reading one record at a time from the file:

  open(INF,$filename) or die "Can't open $filename: $!";
  print "<table border=1>";
  print "<tr><th>Stock#</th><th>Description</th><th>Price</th><th>In Stock?</th></tr>\n";
  while ($i = <INF>) {             # reads one line per pass through the loop
      chomp($i);
     ($stock,$name,$price,$status) = split(/\|/,$i);
      print "<tr>";
      print "<td>$stock</td>";
      print "<td>$name</td>";
      print "<td>$price</td>";
      print "<td>$status</td>";
      print "</tr>\n";
  }
  print "</table>";
  print "</body></html>";
  close(INF);

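For a small file like this one, whether you read the file all at once or one line at a time is largely a matter of personal preference; I prefer to read it all at once, to minimize the amount of time the file is open, and free up the file for other processes that may want to access it. For a very large file, reading one line at a time means the whole file never has to fit in memory.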
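
One case where reading a line at a time pays off is looking up a single record, because the script can stop as soon as it finds a match. The sketch below assumes the pipe-delimited layout of data.txt; the helper name find_item is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# find_item is a hypothetical helper: it scans $file one record at a time
# and returns the fields of the first record whose stock number matches.
sub find_item {
    my ($file, $wanted) = @_;
    my @found;
    open(INF, '<', $file) or die "Can't open $file: $!";
    while (my $line = <INF>) {
        chomp($line);
        my @fields = split(/\|/, $line);
        if ($fields[0] eq $wanted) {
            @found = @fields;
            last;                 # stop reading; the rest of the file is skipped
        }
    }
    close(INF);
    return @found;                # empty list if nothing matched
}

# Only attempt the lookup if the sample file is present.
if (-e "data.txt") {
    my @item = find_item("data.txt", "1210");
    print "Found: $item[1] at \$$item[2]\n" if @item;
}
```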

Note that inside the loop, the first thing the script does is strip the newline character off the end of each line. This sometimes isn't critical, but if you fail to do it, then the last field of each line (in this case the $status field) will have a \n on the end, and if you do any if-elsif or other string tests on that field, you'll have to take the newline character into account. (There've been a number of times I've forgotten to do this, and been terribly frustrated by a script when I knew the value of that field was some string, but a test like if ($foo eq "IN") would fail. A numeric test with == happens to tolerate the trailing newline; a string test with eq does not. A few diagnostic print statements are usually all it takes for me to find the problem, and then I feel dumb for forgetting to strip the newline.) Of the two functions that can do the job, chomp() is the safer one: it removes only a trailing newline, whereas chop() removes the last character whatever it is, and so eats real data if the last line of the file has no newline.
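
Here is a short sketch of the newline problem, and of how chomp and chop differ on a line that has no trailing newline. The sample strings are made up from the data file above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $with_newline    = "1422|Principia Discordia|9.95|IN\n";
my $without_newline = "1422|Principia Discordia|9.95|IN";   # e.g. an unterminated last line

# Before stripping, the last field still carries the newline.
my $raw_status = (split(/\|/, $with_newline))[3];
print "Raw status matches IN? ", ($raw_status eq "IN" ? "yes" : "no"), "\n";

# chomp removes a trailing newline and nothing else.
my $chomped = $without_newline;
chomp($chomped);

# chop removes the last character no matter what it is.
my $chopped = $without_newline;
chop($chopped);

print "After chomp: $chomped\n";
print "After chop:  $chopped\n";
```

The raw status fails the comparison, chomp leaves the unterminated line alone, and chop cuts the status field down to a single "I".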

The actual process of opening and reading a file consists of just these few lines:

  open(FILEHANDLE,$filename);
  $single_variable = <FILEHANDLE>;
  @array = <FILEHANDLE>;
  close(FILEHANDLE);

Reading data in from the file is done by assigning <FILEHANDLE> to a variable, either a single-value scalar variable or an array. If you use a scalar variable, you're reading the file one line at a time. If you use an array, you're stuffing all of the remaining lines of the file into that array. If I were to actually use the code from the above example, I'd have the first line of the file stored in $single_variable, and the rest of the file stored in @array. Once you've read the file, you can do whatever you need to do with the data.
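
To see the scalar and array reads side by side without needing a file on disk, the sketch below opens a filehandle on a string instead, which Perl 5.8 and later allow. The $contents variable is only a stand-in for a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for a file on disk: Perl can open a filehandle on a string.
my $contents = "1101|Book of Ra|24.95|IN\n"
             . "1102|Fnord Hunting|12.95|OUT\n"
             . "1210|Illuminatus|14.95|IN\n";

open(FILEHANDLE, '<', \$contents) or die "Can't open in-memory file: $!";
my $single_variable = <FILEHANDLE>;   # scalar context: reads one line
my @array           = <FILEHANDLE>;   # list context: reads every remaining line
close(FILEHANDLE);

print "First line: $single_variable";
print "Lines left: ", scalar(@array), "\n";
```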

Let's say you want to write some data now. You've already had some experience with this in the form-to-email script in lesson 3. The difference here is that instead of doing something like

  print MAIL "Subject: Form Data\n\n";

you'll actually be printing to a file rather than to the MAIL program.

Let's use the same file as above (the product file) and write a form to administer it by adding new items. Here's the HTML for the form:

  <html><head><title>product db</title></head>
  <body>
  <form action="http://slsq1b/cgi-bin/form8.cgi" method="POST">
  This form adds a new product to the database.<p>
  <pre>
  Stock Number: <input type="text" name="stock">
     Item Name: <input type="text" name="name">
         Price: <input type="text" name="price">
        Status: <input type="text" name="status" value="IN">
  <input type="submit">
  </pre></form>
  </body></html>


For this example, the script that processes this form only needs to decode the form data, and append it to the existing file. There are three main ways you can open a file, and how you open it depends on what you want to do with the file. If you open the file as read-only, then you can only read from it, not write to it. If you open the file to append, then everything you write to the file gets added after the existing data in the file. If you open the file to write, then any existing data in the file is erased as soon as the file is opened, and what you print replaces it. (Both write mode and append mode will create the file if it doesn't exist.)

Here are examples of these three ways to open files:

  open(FILE1,"somefile.dat");        # Opens "somefile.dat" as read-only
  open(FOO,">>log.txt");             # Opens "log.txt" for appending
  open(BLEE,">newdata.out");         # Opens "newdata.out" for writing

Another note here. You can use just about anything for the filehandle. In the above example, the filehandles are FILE1, FOO, and BLEE, respectively. It doesn't matter what you call them; just remember that when writing to them, you want to do print FOO "whatever\n"; - i.e. you must specify the filehandle in order to write to the file.
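
The difference between the modes is easy to check for yourself. The sketch below works in a throwaway directory (created with the standard File::Temp module, an assumption of this example) so that no real data is touched, and it uses the three-argument form of open, which takes the mode as its own argument instead of a prefix on the filename.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real data is touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";

# Write mode creates the file (or empties an existing one).
open(OUT, '>', $file) or die "Can't write $file: $!";
print OUT "first\n";
close(OUT);

# Append mode adds to the end of what is already there.
open(OUT, '>>', $file) or die "Can't append to $file: $!";
print OUT "second\n";
close(OUT);

# Read mode: the file now holds both lines.
open(IN, '<', $file) or die "Can't read $file: $!";
my @after_append = <IN>;
close(IN);

# Opening for write again discards both lines.
open(OUT, '>', $file) or die "Can't write $file: $!";
print OUT "third\n";
close(OUT);

open(IN, '<', $file) or die "Can't read $file: $!";
my @after_write = <IN>;
close(IN);

print "Lines after append: ", scalar(@after_append), "\n";
print "Lines after write:  ", scalar(@after_write), "\n";
```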

Here is the CGI script for decoding the above form. In this script, we're just appending to the existing file.

  #!/usr/bin/perl

  # read and decode the POSTed form data
  read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
  @pairs = split(/&/, $buffer);

  foreach $pair (@pairs) {
     ($name, $value) = split(/=/, $pair);
     $value =~ tr/+/ /;
     $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
     # strip the field delimiter and line breaks so input can't corrupt a record
     $value =~ s/[|\r\n]//g;
     $FORM{$name} = $value;
  }
  # remove the $ sign on the price, if any
  $FORM{'price'} =~ s/\$//g;

  # append the new record to the data file
  open(OUTF,">>data.txt") or die "Can't open data.txt: $!";
  print OUTF "$FORM{'stock'}|$FORM{'name'}|$FORM{'price'}|$FORM{'status'}\n";
  close(OUTF);
  print "Location: http://slsq1b/cgi-bin/form7.cgi\n\n";

Also in this example, rather than bringing up a thank-you page after the form runs, the script prints a redirect location to the CGI that displays the entire data file, so you can see whether the new-record form worked. (If it doesn't work, check that you set the permissions of data.txt correctly, by doing chmod 666 data.txt, which gives everyone the read and write access a data file needs.)
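
One thing the script above does not guard against is two visitors submitting the form at the same moment. A possible refinement, sketched here under the assumption that your system supports Perl's flock function, is to take an exclusive lock before appending. The helper name append_record is hypothetical, and the lock is only advisory: it helps only if every script that writes the file asks for it too.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# append_record is a hypothetical helper: it appends one pipe-delimited
# record to $file while holding an exclusive (advisory) lock.
sub append_record {
    my ($file, @fields) = @_;
    # strip the delimiter and line breaks so a field can't break the record
    s/[|\r\n]//g for @fields;
    open(OUTF, '>>', $file) or die "Can't open $file: $!";
    flock(OUTF, LOCK_EX)    or die "Can't lock $file: $!";
    print OUTF join("|", @fields), "\n";
    close(OUTF)             or die "Can't close $file: $!";   # closing releases the lock
}

# Example call (commented out so that running this file changes nothing):
# append_record("data.txt", $FORM{'stock'}, $FORM{'name'}, $FORM{'price'}, $FORM{'status'});
```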