At its birth, the World Wide Web was designed to transfer flat, static pages from a server to a client browser. If your data wasn't in a form that easily lent itself to this data model (for example, a customer database), then you either wrote your own custom web server for that specific application, or you were out of luck. Needless to say, most people didn't enjoy this state of affairs.
Pretty early in the history of the web, CERN and the NCSA worked together to create a standard method of creating plug-in programs for web servers. This standard, called the Common Gateway Interface (CGI), leapfrogged the server/client systems of the time and became the de facto standard for creating web applications that didn't depend on flat, static HTML pages.
The major reason for the popularity of CGI was its ease of use - anybody who already knew a common programming language didn't have to learn how to do anything special to web-enable their programs. CGI used standard input and output systems, just like traditional terminal-based programs always had. If you had a nasty old terminal-based application that took in user input, queried a database, then sent back results to the terminal, all you had to do to web-enable it was create an HTML page that would send the data to the program's input procedure, use exactly the same database query code, and slightly rewrite the output procedure to dump out HTML rather than plain text.
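As a rough illustration of that idea, here is a minimal sketch of a Perl CGI script. It assumes the usual CGI conventions: the server puts GET form data in the QUERY_STRING environment variable, sends POST data on standard input, and passes whatever the script prints on standard output back to the browser. The form field "name" and the helper names (url_decode, html_escape, read_form_data, build_response) are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode one URL-encoded component ("+" means space, %XX is a hex byte).
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Escape characters that are special in HTML so user input is safe to echo.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Fetch the raw form data: standard input for POST, QUERY_STRING otherwise.
sub read_form_data {
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    if ($method eq 'POST') {
        my $length = $ENV{CONTENT_LENGTH} || 0;
        my $body   = '';
        read(STDIN, $body, $length) if $length > 0;
        return $body;
    }
    return defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
}

# Build the whole response (header, blank line, HTML body) from raw
# form data such as "name=Alice". The "name" field is hypothetical.
sub build_response {
    my ($form_data) = @_;
    my %param;
    for my $pair (split /[&;]/, $form_data) {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $param{ url_decode($key) } = url_decode(defined $value ? $value : '');
    }
    my $name = (defined $param{name} && length $param{name})
             ? $param{name}
             : 'stranger';
    return "Content-Type: text/html\r\n\r\n"
         . "<html><body><p>Hello, " . html_escape($name) . "!</p></body></html>\n";
}

# The "output procedure": plain print to standard output, as in a terminal program.
print build_response(read_form_data());
```

Apart from the Content-Type header and the HTML markup, nothing here differs from an ordinary terminal program, which is the point the paragraph above makes.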
These dynamic pages have become so pervasive that most web surfers expect any worthwhile website to be dynamic. Using dynamic pages makes your site more interactive and interesting from a user's perspective, and it can allow you to gather better information about your site's usage to help you target different audiences.
A site doesn't have to be massively dynamic and go overboard like some sites do, but most effective sites have incorporated dynamic pages to some extent. Amazon.com is perhaps the best example out there - a user logs in, and the site then makes suggestions based on the user's past purchases. The user can see a real-time sales ranking of a book they're interested in, see what comments other people have made about the book, and leave their own comments about the book if they desire. Imagine Amazon without the level of interactivity that it provides, and you'll see why dynamic pages can be so important.
All of this interactivity is great, but it comes at a huge cost in server resources. Since a dynamic page is created by one or more CGI programs on the server, one or more programs must be run on the server each time a user looks at a page, on top of the server resources required to serve the page to the client. Needless to say, even a moderate number of users can quickly bring a high-end server to its knees.
mod_perl is a partial solution to the problem of the high server load caused by dynamic pages. Created by the same team of programmers that created the Apache web server, mod_perl streamlines the process of creating dynamic pages as much as possible to squeeze every last drop of performance out of server hardware. mod_perl's primary strength is that it's a drop-in solution to high server load - it requires little or no change in the CGI scripts that are already used to create dynamic pages. In fact, you can install mod_perl on a currently running copy of the Apache web server with no downtime required, and see an immediate performance benefit.
How Traditional Perl CGI Scripts Work
To understand how mod_perl dramatically increases the performance of Perl-based CGI scripts, you first have to understand the execution lifecycle of a standard Perl CGI script. By dimly recalling the basics of how processes work from an Operating Systems class in the distant past, we know that the following steps occur when a standard Perl CGI script is executed:
1. The client browser makes a request for the dynamic page to the web server.
2. The web server locates the CGI script used to create the dynamic page, and figures out that it needs to use the Perl interpreter to execute the script.
3. The server reads the Perl interpreter from disk, loads it into memory, and starts it as its own independent process (meaning that the Perl interpreter has to fight for processor time and memory with the other 250 or so processes on a busy web server).
4. The Perl interpreter parses the script, and compiles it into executable form.
5. The executable form of the script is executed, and its output is dumped to the server, which in turn dumps it to the client.
6. The compiled form of the script is removed from memory.
7. The Perl interpreter is shut down, and removed from memory.
8. The server completes the transaction.
In short, a lot of things have to happen for a simple CGI script to execute. Most of these steps require a significant amount of time, and the problem only gets worse as the server gets busier. Anybody who's been on a heavily loaded website knows how annoying it is to wait 30 seconds for a page to load. mod_perl helps fix this problem.
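To get a feel for the cost of the start-up and shut-down steps, the following sketch times the same tiny piece of Perl two ways: once by launching a fresh interpreter process for every run (roughly what the CGI lifecycle above does), and once by compiling it a single time and calling it repeatedly inside one process. It is only an illustration under simple assumptions: the workload, the request count of 20, and the helper names are invented, no web server is involved, and the numbers will vary from machine to machine.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# A tiny stand-in for a CGI script body: sum the numbers 1..100.
my $script = 'my $total = 0; $total += $_ for 1 .. 100; $total';

# Run a piece of work $count times and return the elapsed wall-clock seconds.
sub time_runs {
    my ($count, $work) = @_;
    my $start = time;
    $work->() for 1 .. $count;
    return time - $start;
}

# CGI style: start a brand-new Perl interpreter process for every request.
# $^X is the path of the perl binary running this script.
sub run_in_new_process {
    return system($^X, '-e', $script);    # returns 0 on success
}

# Persistent style: compile the script once, then just call it.
my $compiled = eval "sub { $script }" or die "compile failed: $@";

my $requests        = 20;
my $cgi_time        = time_runs($requests, \&run_in_new_process);
my $persistent_time = time_runs($requests, $compiled);

printf "new process per request: %.4f s for %d runs\n", $cgi_time, $requests;
printf "compiled once, reused:   %.4f s for %d runs\n", $persistent_time, $requests;
```

On most systems the first figure should be far larger than the second, because each run pays again for process creation, interpreter start-up, and compilation.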
mod_perl helps reduce the load on a server by reducing and streamlining the steps required to execute a Perl CGI script. The two major things mod_perl does are placing a copy of the Perl interpreter in the web server process itself and caching the compiled versions of commonly used scripts. This reduces the 8 steps required for a standard CGI transaction to the following 4 steps:
1. The client browser makes a request for the dynamic page to the web server.
2. The web server finds the cached copy of the compiled script, and executes it. (The Perl interpreter that is already loaded quickly compiles the script if it isn't already in the cache.)
3. The output from the script is dumped to the client (without having to be passed from the CGI program to the server, and then to the client since the script is being executed as part of the server).
4. The server completes the transaction.
As you can see, mod_perl works by eliminating or reducing the processor time and I/O transfers required to execute a CGI script. The primary speed gain comes from dramatically reducing the required I/O transfers, since I/O is usually what ends up slowing down a web server. It also reduces the load on the processor by cutting the work that must be repeated every time the script is executed.
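The caching idea can be sketched in a few lines of plain Perl. This is not mod_perl's actual code, just a simplified model of the technique: a long-lived process keeps a hash of already-compiled scripts and only compiles a script the first time it is requested. The names %compiled, get_compiled, and handle_request, and the sample script hello.pl, are hypothetical.

```perl
use strict;
use warnings;

my %compiled;             # script name => code reference, kept for the life of the process
my $compile_count = 0;    # how many times we actually had to compile something

# Return a code reference for the script, compiling its source only
# the first time this name is seen.
sub get_compiled {
    my ($name, $source) = @_;
    unless ($compiled{$name}) {
        $compile_count++;
        my $code = eval "sub { $source }";
        die "Could not compile $name: $@" unless $code;
        $compiled{$name} = $code;
    }
    return $compiled{$name};
}

# Handle one "request" by running the cached, already-compiled code.
sub handle_request {
    my ($name, $source, @args) = @_;
    return get_compiled($name, $source)->(@args);
}

# A stand-in for the contents of a script file on disk.
my $source = 'my ($user) = @_; return "<p>Hello, $user!</p>";';

print handle_request('hello.pl', $source, 'Alice'), "\n";
print handle_request('hello.pl', $source, 'Bob'),   "\n";
print "compiled $compile_count time(s)\n";    # prints "compiled 1 time(s)"
```

The second request skips the parse and compile steps entirely, which is where the savings described above come from. A real implementation would also need to notice when the script file changes on disk and recompile it, which this sketch leaves out.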
Lest mod_perl be accused of being an unstable low-grade hack to get a little extra performance out of some webserver under special circumstances, the following sites are just a sampling of the websites that use mod_perl extensively: