Introduction to mod_python

This article will provide a brief introduction to mod_python, a tour of what you can do with it, and some pointers to further resources should you want to explore it in more depth. I will assume that the reader is comfortable programming in Python (although no specific knowledge is required) and is familiar with Apache and basic web concepts. This article is not intended to be a complete reference for mod_python. Instead it is meant to consolidate information available from other sources, to hopefully whet your appetite and encourage you to read more from the official documentation (links are in the Resources section).

Prerequisites and Assumptions

I am going to assume that you have a working installation of Apache 2.0.47 or higher and have Python 2.2.1 or later installed. To make things simple, I’m going to assume you are working from the same machine that Apache is installed on, so all URLs will have ‘localhost’ as the server name. Replace this with your server name if this is not the case.

What is mod_python?

In the bad old days, most web development was done using CGI (Common Gateway Interface). Writing a CGI program meant creating an executable (script or binary) that the web server called to handle a request. The output generated by the CGI program would then be returned to the user via their browser. Think about that process for a second: a) the user requests a page from the web server, perhaps with some arguments sent via GET or POST, b) the web server recognizes that the requested page is handled by a CGI script and invokes the CGI process, c) the CGI program collects information from the web server using some mechanism (usually environment variables), does some processing and prints out a bunch of HTML (usually), and d) if everything went all right, the web server takes the output and sends it to the user.
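To make the steps above concrete, here is a minimal sketch of what a CGI program does, written in Python. The function name cgi_response and the ‘name’ query parameter are illustrative assumptions, not taken from any real application. The imports are the Python 3 spellings; older Python versions kept these helpers in other modules.

```python
import os
from html import escape
from urllib.parse import parse_qs

def cgi_response(environ):
    """Build the text a CGI program would print for one request."""
    # The web server passes request details through environment variables.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]  # hypothetical parameter
    body = "<h1>Hello, %s!</h1>" % escape(name)
    # A CGI response is headers, then a blank line, then the body.
    return "Content-Type: text/html\n\n" + body

if __name__ == "__main__":
    # The server starts a fresh process like this one for every request.
    print(cgi_response(os.environ))
```

Every request pays the cost of starting a new interpreter and re-importing modules, which is the overhead mod_python is meant to remove.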

Sounds pretty cumbersome when you think about it, doesn’t it? It’s not only cumbersome, it’s also slow (a new process has to be started for every request) and very error-prone. mod_python saves us from having to go through this process by integrating the Python programming language right into the Apache HTTP server. This provides a much faster way for Apache to execute Python handlers, and as an added bonus, gives us complete access to the Apache internals. Imagine mod_python as a little guy (or … a Python?) stuck inside your web server intercepting certain requests and allowing you to do really cool things with them. Okay, so that may not be a very good technical description, but we’ll get to that. Sound interesting? It is!

It’s important to understand that writing applications with mod_python is not the same as writing applications with a server-side scripting language like PHP. Instead, with mod_python you specify handlers in the Apache configuration file(s) that allow you to customize how a request is handled. This allows you to do a variety of neat things like implement protocols other than HTTP, filter the request and response, determine a document’s content-type, etc.

So let’s get mod_python installed and take a tour of how it works. In order to follow along with this tutorial, you’ll need to have Apache and Python installed, and then install mod_python.

Installation

There are a few different ways to install mod_python, depending on what operating system you are running. If you are using a distribution of Linux that has a decent package management system (any Red Hat / Fedora, Debian or Gentoo based system for instance) then there will likely already be a package available for you to install. On Gentoo I just emerge the mod_python ebuild with the USE flags I want and portage automatically adds configuration files to be included into my Apache configuration. On Red Hat Enterprise Linux I use a mod_python RPM that depends on having the python-devel and httpd-devel packages. For the sake of brevity I’m only going to cover installing from source here. Consult your operating system’s documentation to see if there is an easier way for you. (There is a way to install mod_python on Windows but I am not going to cover that here.)

Compiling from Source

In order to compile mod_python, you’ll need to grab the latest stable source release from the mod_python website. At the time of this writing, the latest stable release was 3.2.10.

I’m going to assume you have Apache2 already installed (if not you can get it from the Apache website). I am using Apache 2.0.59 but the process should be the same for any version of Apache above 2.0.47. I have Apache installed in /usr/local/apache2. You will need to adjust the path to match your installation.

Once you’ve downloaded the source tarball for mod_python, untar it and run the ‘configure’ script (feel free to run ./configure --help to see what other configuration options are available):

pike:/usr/local/src paul$ tar xfz mod_python-3.2.10.tgz 
pike:/usr/local/src paul$ cd mod_python-3.2.10
pike:/usr/local/src/mod_python-3.2.10 paul$ ./configure --with-apxs=/usr/local/apache2/bin/apxs
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
...

If everything went okay, and you didn’t get any error messages from configure, you should then run ‘make’ to compile mod_python and ‘make install’. You will likely need to run ‘make install’ as root, so you can use “su -c ‘make install’”:

pike:/usr/local/src/mod_python-3.2.10 paul$ make
... 
(a whole bunch of output you can ignore unless you know what you’re doing)
pike:/usr/local/src/mod_python-3.2.10 paul$ su -c 'make install'
...
(more output that we won’t concern ourselves with)

Safety Note: Never run a configure script as root. It’s always possible that the host you retrieved the source distribution from has been compromised and therefore you don’t know for sure what could be hiding in that script.

If the compilation process didn’t report any errors, you should be ready to go. Next we have to open the Apache2 configuration file (usually called httpd.conf) and add the following:

LoadModule python_module      modules/mod_python.so

Note: the path to mod_python.so may vary. Check your Apache2 installation root and find out if it was actually put there or somewhere else.

As with any time you edit the Apache configuration, you will have to restart Apache before the changes take effect. Now we want to verify that our installation went smoothly. There are a variety of ways to do this; I like to grab the default headers from Apache to see what is installed. I do this by telnetting into port 80 and typing ‘HEAD / HTTP/1.0’ followed by a blank line:

pike:/usr/local/src/mod_python-3.2.10 paul$ su -c '/usr/local/apache2/bin/apachectl restart'
pike:/usr/local/src/mod_python-3.2.10 paul$ telnet localhost 80
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HEAD / HTTP/1.0

HTTP/1.1 200 OK
Date: Mon, 01 Jan 2007 04:28:24 GMT
Server: Apache/2.0.59 (Unix) mod_python/3.2.10 Python/2.3.5 PHP/5.2.0
Last-Modified: Sat, 20 Nov 2004 20:16:24 GMT
ETag: "ab5da-2c-4c23b600"
Accept-Ranges: bytes
Content-Length: 44
Connection: close
Content-Type: text/html

Connection closed by foreign host.
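
The same check can be scripted instead of typed into telnet. The following is a rough sketch using only the Python standard library; the function names server_header and mod_python_version are my own, and http.client is the Python 3 module name (it sends an HTTP/1.1 request rather than HTTP/1.0, which makes no difference for this purpose).

```python
import http.client

def server_header(host="localhost", port=80):
    """Send 'HEAD /' and return the Server response header (or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server")
    finally:
        conn.close()

def mod_python_version(server_value):
    """Extract the mod_python version from a Server header value, if present."""
    for token in (server_value or "").split():
        if token.startswith("mod_python/"):
            return token.split("/", 1)[1]
    return None
```

Calling mod_python_version(server_header()) against a running server should return the version string, or None if the module is not advertised. Keep in mind that a server configured to send a shortened Server header will not list its modules even when they are loaded.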

As you can see from the “Server” header, mod_python version 3.2.10 is installed. Now to really test it! Open the Apache2 configuration file again and add the following location directive:

<Location /mpinfo>
    SetHandler mod_python
    PythonHandler mod_python.testhandler
</Location>

Restart Apache again and point your browser to http://localhost/mpinfo and you should see a test page with a lot of useful information about your server environment. This information can come in very handy when debugging problems so I tend to keep it around. If you don’t see this page, something must have gone wrong. Go over the instructions again or consult your operating system documentation. If you see the page, congratulations, you have installed mod_python! Now let’s move on and start learning about mod_python.

Handlers

In order to truly understand the power offered by mod_python, a basic understanding of Apache handlers is required. Essentially you can think of a handler as a processing “phase”. Apache gets a request, and then initiates a number of handlers (or “phases”) to do something. The handlers can either be built into Apache, or included as modules. Apache handlers may be configured explicitly, based on either filename extensions or location. Examples of functionality that takes place in handlers may include authenticating a user, invoking a CGI script, getting the server’s status, etc. mod_python allows you to tap into any handler used by Apache. mod_python also provides a few standard handlers to help you with some common tasks.
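
As a mental model only, the idea of phases can be sketched in a few lines of plain Python. This is a toy, not how Apache is implemented: the phase list is heavily abbreviated, the request is just a dictionary, and the names process_request, check_user and send_page are invented for illustration. The numeric status values follow Apache’s conventions (0 for OK, -1 for DECLINED, 401 for an unauthorized response).

```python
# Status values mirroring Apache's conventions.
OK = 0
DECLINED = -1
HTTP_UNAUTHORIZED = 401

# A heavily simplified subset of the request phases, in order.
PHASES = ["authentication", "content", "logging"]

def process_request(handlers, req):
    """Run the registered handler for each phase; stop on the first failure."""
    for phase in PHASES:
        handler = handlers.get(phase)
        if handler is None:
            continue  # nothing registered for this phase
        status = handler(req)
        if status not in (OK, DECLINED):
            return status  # an error status ends processing early
    return OK

def check_user(req):
    # Hypothetical authentication-phase handler.
    return OK if req.get("user") == "john" else HTTP_UNAUTHORIZED

def send_page(req):
    # Hypothetical content-phase handler.
    req["body"] = "Hello from the content phase"
    return OK

handlers = {"authentication": check_user, "content": send_page}
print(process_request(handlers, {"user": "john"}))
print(process_request(handlers, {"user": "mallory"}))
```

With mod_python, the equivalent of the handlers dictionary is your Apache configuration: directives such as PythonAuthenHandler and PythonHandler, both used later in this article, attach a Python module to a particular phase.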

Publisher Handler

The publisher handler is available so you don’t always have to worry about writing your own handlers and can instead focus on developing your application. It’s very handy. In order to use the publisher handler, you have to add something like this to your Apache configuration:

<Directory /usr/local/apache2/htdocs/PublisherExample>
    AddHandler mod_python .py
    PythonDebug On
    PythonHandler mod_python.publisher
</Directory>

Note: The “PythonDebug On” line makes mod_python send error output to the browser when possible, in addition to the Apache error log. This is useful while we’re developing.

Now create a file called ‘ModPythonExample.py’ in the directory ‘PublisherExample’ and type the following:

from time import strftime, localtime

def publisher_example(req):
    req.content_type = 'text/html'
    time_str = strftime("%a %b %d %H:%M:%S %Y", localtime())
    message = "<h1>Hello from mod_python!</h1>"
    message += "<p>The time on this server is %s</p>" % (time_str)
    return message

Because we have modified the Apache configuration we will need to restart it once again. Now point your browser to http://localhost/PublisherExample/ModPythonExample.py/publisher_example and you should see a message telling you what time it is.

As we can see from this example, the publisher handler calls a function and just sends the return value to the client. The function receives a request object as an argument. We set the request object’s content_type to ‘text/html’ because we want the output to be handled as HTML. We then construct a message including the current time and date and return it. Notice the structure of the URL. The first part after the directory (ModPythonExample.py) is the name of our file, and the second part (“publisher_example”) is the name of the function to call.
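
The publisher handler can also pass query string or form fields to your function as keyword arguments of the same name. The sketch below assumes a hypothetical function called greet added to ModPythonExample.py, reachable at a URL along the lines of http://localhost/PublisherExample/ModPythonExample.py/greet?name=Paul. The FakeRequest class is not part of mod_python; it is only a stand-in so the function can be tried outside Apache.

```python
try:
    from html import escape   # Python 3
except ImportError:
    from cgi import escape    # Python 2

def greet(req, name="world"):
    # The publisher fills in 'name' from the request's form or
    # query-string field of the same name; the default is used otherwise.
    req.content_type = 'text/html'
    # Escape user input before putting it into HTML.
    return "<h1>Hello, %s!</h1>" % escape(name)

class FakeRequest(object):
    """Minimal stand-in for the request object, for local experiments only."""
    content_type = None

print(greet(FakeRequest(), name="Paul"))
```

Because the function only touches req.content_type, it can be exercised with a plain object like this before it is ever deployed.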

Obviously this is only a simple example. There are many great things you can do with the publisher handler. See the documentation for more information (Resources).

CGI Handler

The CGI Handler is provided as a stepping stone away from traditional CGI. It is not intended as a final solution for using mod_python. Basically, the CGI Handler emulates a CGI environment from within mod_python, allowing you to migrate CGI based Python applications to mod_python with little or no modification. Read the documentation to learn about limitations of this handler. To use it, just add a Directory directive to your Apache config (like we did for the Publisher Handler) and include the following:

SetHandler mod_python
PythonHandler mod_python.cgihandler

Once again, this should not be considered a final solution but can certainly improve the performance of your existing CGI code without too much modification.

Custom Handlers

Using a standard handler may be appropriate in many scenarios, but sometimes you might find it more appropriate to write your own handler. mod_python allows you to do this (of course) and in fact makes it pretty easy! Let’s create the following directory directive in our Apache configuration:

<Directory /usr/local/apache2/htdocs/CustomExample>
    AddHandler mod_python .py
    PythonDebug On
    PythonHandler customexample
</Directory>

This tells Apache that any requests for a file with a .py extension will be served by mod_python. (It is worth noting that the file does not actually have to exist, and that in fact a request for /CustomExample/myfile.py and /CustomExample/myotherfile.py will both be handled the same with this configuration). The “PythonHandler customexample” line tells Apache to hand requests for files with a .py extension to this module (which we will be writing). The actual process goes something like this: a) Apache receives a request for a file in the CustomExample directory that has an extension of .py. b) Apache recognizes that this request is to be handled by mod_python and attempts to import a module called “customexample”. Apache looks for this module in sys.path (with our directory prepended to it so anything in there will be found first). c) Apache will then look for a function called “handler” in the module and execute it, passing the request object as an argument. Okay, enough details, let’s write our handler module. Create a file called “customexample.py” in the “CustomExample” directory:

from mod_python import apache

def handler(req):
    req.content_type = 'text/plain'
    req.write('Hello from mod_python!')
    return apache.OK

A few things to notice here: a) we use the request object to write output to the client instead of just returning content, b) we return a constant from the apache module. Returning apache.OK tells Apache that the handler completed successfully, which normally results in an HTTP 200 response. Other constants are defined for HTTP status codes such as 404 and 302 (for example, apache.HTTP_NOT_FOUND). Of course, this example doesn’t really do anything new, so to demonstrate the real power of writing our own handlers we are now going to create a super simple (and not very secure) MySQL authentication handler. If you have MySQL installed, create a database called “mptutorial” and add the following table and records:

CREATE TABLE `users` (
    `id` int(11) unsigned NOT NULL auto_increment,
    `username` varchar(50) NOT NULL,
    `password` varchar(50) NOT NULL,
    PRIMARY KEY  (`id`),
    UNIQUE KEY `username` (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO `users` (`username`, `password`) VALUES ('john', MD5('secret'));
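
The MD5() call in the INSERT above stores a 32-character hexadecimal digest rather than the password itself. If you want to see what ends up in the password column, a small sketch like the following computes the same kind of digest in Python. The helper name mysql_md5 is my own, and hashlib is only available in newer Python versions than the minimum assumed by this article.

```python
import hashlib

def mysql_md5(password):
    """Return the hex MD5 digest of a password, as MySQL's MD5() would."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

digest = mysql_md5("secret")
print(digest)
print(len(digest))
```

An unsalted MD5 digest is one of the reasons this example is described as not very secure; it is fine for a tutorial but not a scheme to copy into a real application.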

Then add the following to your Apache configuration:

<Directory /usr/local/apache2/htdocs/AuthenticateExample>
    AddHandler mod_python .py
    PythonDebug On
    PythonAuthenHandler authuser
    AuthType Basic
    AuthName "Secure Area"
    Require valid-user
</Directory>

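Now let’s write our example code. We need to write an authentication handler that retrieves the username and password entered by the user and checks them against the users table. Because the configuration above names the module “authuser”, save the code as “authuser.py” in the “AuthenticateExample” directory.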

Note: This example uses the MySQLdb module.

import MySQLdb
from mod_python import apache

def verify_user(username, password):
    # Return True if a row matches the username and the MD5 of the password.
    db = MySQLdb.connect(host='localhost', user='mpuser', passwd='mppassword', db='mptutorial')
    try:
        cur = db.cursor()
        # Pass the values as parameters so the driver quotes them for us.
        sql = "SELECT id FROM users WHERE username = %s AND password = MD5(%s)"
        cur.execute(sql, (username, password))
        return cur.fetchone() is not None
    finally:
        db.close()


def authenhandler(req):
    # get_basic_auth_pw() must be called first; req.user is not set until then.
    password = req.get_basic_auth_pw()
    username = req.user

    if verify_user(username, password):
        return apache.OK
    else:
        return apache.HTTP_UNAUTHORIZED

Restart Apache and point your browser to http://localhost/AuthenticateExample/. You should be prompted with a username and password dialog. Try entering a wrong password, then enter the username ‘john’ and the password ‘secret’. See how mod_python handled the authentication? Neat huh?
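
It may help to know what the browser actually sends when you fill in that dialog. With Basic authentication, the username and password travel in an Authorization header as base64-encoded text, which the server decodes before our handler ever sees them. The sketch below decodes such a header by hand; the function name decode_basic_auth is illustrative and this is not the code mod_python runs internally.

```python
import base64

def decode_basic_auth(header_value):
    """Split an 'Authorization: Basic ...' value into (username, password)."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded.strip()).decode("utf-8")
    # Only the first colon is a separator; passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password

# Build the header a browser would send for john / secret, then decode it.
header = "Basic " + base64.b64encode(b"john:secret").decode("ascii")
print(decode_basic_auth(header))
```

Since base64 is an encoding and not encryption, anyone who can observe the request can recover the password, so Basic authentication is best combined with HTTPS.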

Python Server Pages (PSP)

There is one standard handler that I did not cover in the previous section. The PSP Handler allows you to use the PSP class in the mod_python.psp module. PSP stands for Python Server Pages. Python Server Pages allow you to inline Python code in HTML (or any other kind of document) as you would if you were using PHP, ASP (Active Server Pages), JSP (Java Server Pages) or something similar. Some people argue against the practice of mixing markup and code, and I’d be one of them. I personally advocate the use of server pages as a view mechanism with very little control logic. Of course, to demonstrate the functionality of PSPs I will likely break my own rule.

As with any mod_python handler, we have to edit our Apache configuration before we can use Python Server Pages. Open your Apache configuration and add a Directory directive similar to the following:

<Directory /usr/local/apache2/htdocs/PSPExample>
    AddHandler mod_python .psp
    PythonHandler mod_python.psp
    PythonDebug On
</Directory>

This should all look pretty familiar by now! Basically we tell Apache that any files with a .psp extension are handled by mod_python. We then tell Apache that the mod_python.psp module will be the generic handler for these files. Also, because we’ve told Apache that PSP files will have a .psp extension, let’s add index.psp to our DirectoryIndex in Apache’s configuration. Open your Apache configuration file again and find the line that starts with “DirectoryIndex” and add index.psp. Depending on what was there before, that line should now look like this:

DirectoryIndex index.html index.html.var index.php index.psp

We’ve now set up the PSP handler and told Apache to serve index.psp files as directory indexes. All that is left to do is to write some actual code. So restart Apache and let’s start exploring Python Server Pages. Create a file called index.psp in the PSPExample directory and type the following:

<html>
<head>
    <title>Python Server Pages (PSP)</title>
</head>
<body>
<%
import time
%>
Hello world, the time is: <%=time.strftime("%Y-%m-%d, %H:%M:%S")%>
</body>
</html>

Save the file and point your browser to http://localhost/PSPExample. Your server should give you the index.psp file, because we added it to the list of files in DirectoryIndex. If all went well, you should get a message with the current time on your server. If you are familiar with JSP, ASP, etc., then the above code should look very familiar. Basically, anything in between the <% %> tags is interpreted as Python code. Whatever is between the <%= %> tags is replaced with the result of the expression. This saves you from typing a lot of write() or print statements.
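
To get a feel for what the <%= %> tags do, here is a deliberately tiny toy that handles only expression tags by evaluating each one and substituting the result. It is not how mod_python implements PSP, and the function name render_expressions is made up for this illustration.

```python
import re
import time

def render_expressions(template, namespace):
    """Replace each <%= expr %> with str() of the evaluated expression."""
    def substitute(match):
        # Evaluate the expression text with the names we were given.
        return str(eval(match.group(1).strip(), namespace))
    return re.sub(r"<%=(.*?)%>", substitute, template)

page = 'Hello world, the time is: <%=time.strftime("%Y-%m-%d, %H:%M:%S")%>'
print(render_expressions(page, {"time": time}))
```

The real PSP handler also deals with <% %> code blocks, indentation and caching, none of which this toy attempts.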

Indentation can be pretty tricky in Python Server Pages. Because PSP allows you to mix Python code and HTML / XML / Anything else, you often find that you need a way to terminate a for iteration or if statement. There’s a simple, albeit cumbersome way to do this:

    <%
    a = [1,2,3,4,5,6]
    for number in a:
    %>
    This is a number: <%=number %> <br />
    <%
    # this terminates the iteration
    %>
    <h1>Hello</h1>

If we left the comment out of the above example, the Hello header element would be written to the output for each iteration of the list. That’s obviously not what we want, so add a comment to tell PSP that the for iteration has ended.
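
One way to reason about this is to write the same page as ordinary Python. The sketch below is only an analogy, with a made-up render function, but it shows where the loop body ends and why something has to mark that point in the PSP version.

```python
def render(numbers):
    out = []
    for number in numbers:
        # Text that follows the for statement belongs to the loop body.
        out.append("This is a number: %s <br />\n" % number)
    # The comment-only block in the PSP page plays the role of this dedent.
    out.append("<h1>Hello</h1>\n")
    return "".join(out)

print(render([1, 2, 3, 4, 5, 6]))
```

In plain Python the dedent ends the loop; in a PSP page there is no visible dedent between blocks of markup, so the comment-only code block stands in for it.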

Conclusion

We’ve just taken a whirlwind tour of some of the features of mod_python. Leveraging the power of the Python programming language and the Apache HTTP server, mod_python offers an incredible amount of flexibility to a web developer. In the next article in this series, I’m going to start covering some of the more framework-oriented approaches to Python web development. Until then.

Resources

I have covered a lot of material in this article, but there is still a lot more to mod_python. In order to become truly proficient, you should really take the time to read the official documentation and become more familiar with Apache. The mod_python documentation can be found on the mod_python website. You can also find more examples in the mod_python section of the Apache wiki.