WebKit User's Guide

Version 0.4.1
Webware for Python 0.4.1

Table of Contents


Synopsis
Feedback
Introduction
      Overview
      Compared to CGI "apps"
Errors / Uncaught Exceptions
Configuration
Administration
Debugging
      print
      Raising Exceptions
      Restarting the Server
      Assertions
Naming Conventions
Actions
Plug-ins
Files
How do I develop an app?
Known Bugs
Credit

Synopsis

WebKit provides Python classes for generating dynamic content from a web-based, server-side application. It is a significantly more powerful alternative to CGI scripts for application-oriented development.

Feedback

You can e-mail webware-discuss@lists.sourceforge.net to give feedback, discuss features and get help using WebKit.

Introduction

Overview

The core concepts of the WebKit are the Application, Servlet, Request, Response and Transaction, for which there are one or more Python classes.

The application resides on the server-side and manages incoming requests in order to deliver them to servlets which then produce responses that get sent back to the client. A transaction is a simple container object that holds references to all of these objects and is accessible to all of them.

Content is normally served in HTML or XML format over an HTTP connection. However, applications can provide other forms of content and the framework is designed to allow new classes for supporting protocols other than HTTP.

To connect the web server and the application server, a small CGI script, WebKit.cgi, bundles the web browser's request and sends it to the application server. The application server processes the request and sends the response back to WebKit.cgi, which then outputs the results for use by the web server. See the Install Guide for more information.

At a more detailed level, the process looks like this:

  1. At some point, someone has configured and run both a web server (such as Apache) and the WebKit app server (WebKit/AppServer.py).
  2. A user requests a web page by typing a URL or submitting a form.
  3. The user's browser sends the request to the remote web server.
  4. The web server detects a CGI script, WebKit.cgi, in the URL and invokes it.
  5. WebKit.cgi simply collects information about the request and sends it to the WebKit app server which is ready and waiting.
  6. The app server asks the Application object to dispatch the raw request.
  7. The application instantiates an HTTPRequest object and asks the appropriate Servlet (as determined by examining the URL) to process it.
  8. The servlet generates content into a given HTTPResponse object, whose content is then sent back by the app server to WebKit.cgi.
  9. WebKit.cgi prints the content which the web server then delivers to the user's web browser.
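
Steps 6 through 8 can be pictured with a toy model. This is only a sketch in plain, modern Python: the class bodies, the respond() and dispatch_raw_request() names and the shape of the raw request dictionary are all assumptions made for illustration, not WebKit's actual implementation.

```python
# Toy model of the request flow; every class here is a hypothetical
# stand-in and not WebKit's real implementation.

class HTTPRequest:
    def __init__(self, raw):
        self.path = raw['path']              # e.g. '/Hello'
        self.fields = raw.get('fields', {})

class HTTPResponse:
    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

    def contents(self):
        return ''.join(self.chunks)

class Transaction:
    """Simple container holding references to the other objects."""
    def __init__(self, application, request, response):
        self.application = application
        self.request = request
        self.response = response
        self.servlet = None

class HelloServlet:
    def respond(self, trans):
        name = trans.request.fields.get('name', 'world')
        trans.response.write('<p>Hello, %s!</p>' % name)

class Application:
    def __init__(self, servlets):
        self.servlets = servlets             # maps URL path -> servlet

    def dispatch_raw_request(self, raw):
        # Steps 6-8: build the request, pick a servlet by URL, collect output.
        trans = Transaction(self, HTTPRequest(raw), HTTPResponse())
        trans.servlet = self.servlets[trans.request.path]
        trans.servlet.respond(trans)
        return trans.response.contents()

app = Application({'/Hello': HelloServlet()})
# The adaptor (step 5) would hand over a dictionary roughly like this one.
html = app.dispatch_raw_request({'path': '/Hello', 'fields': {'name': 'Ann'}})
print(html)
```

The transaction is created once per request and passed to the servlet, so the servlet can reach the request, the response and the application through a single object.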

Compared to CGI "apps"

The alternative to a server-side application is a set of CGI scripts. However, a CGI script must always be launched from scratch, so many common tasks, such as loading libraries, opening database connections and reading configuration files, are performed again for each request.

With the server-side application, the majority of these tasks can be done once at launch time and important results can be easily cached. This makes the application significantly more efficient.
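
The difference can be illustrated with a small, self-contained sketch in modern Python. The names expensive_setup, handle_cgi_style and LongRunningApp are hypothetical and exist only for this comparison; they are not part of WebKit.

```python
# Hypothetical illustration: the names below are not WebKit APIs.
stats = {'setups': 0}

def expensive_setup():
    """Stands in for loading libraries, reading config files, etc."""
    stats['setups'] += 1
    return {'greeting': 'Hello'}

def handle_cgi_style(name):
    # A CGI script starts from scratch, so setup runs on every request.
    config = expensive_setup()
    return '%s, %s' % (config['greeting'], name)

class LongRunningApp:
    def __init__(self):
        # A long-running application does the setup once, at launch.
        self.config = expensive_setup()

    def handle(self, name):
        return '%s, %s' % (self.config['greeting'], name)

for name in ('a', 'b', 'c'):
    handle_cgi_style(name)
cgi_setups = stats['setups']      # one setup per request

stats['setups'] = 0
app = LongRunningApp()
for name in ('a', 'b', 'c'):
    app.handle(name)
app_setups = stats['setups']      # setup ran only at launch
```

With three requests, the CGI-style handler performs the setup three times while the long-running application performs it once.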

Of course, CGIs can still be appropriate for "one shot" deals or simple applications. Webware includes a CGI Wrapper if you'd like to encapsulate your CGI scripts with robust error handling, e-mail notifications, etc.

Errors / Uncaught Exceptions

One of the conveniences provided by WebKit is the handling of uncaught exceptions. The response to an uncaught exception is:

  1. Log the time, error, script name and traceback to AppServer's console.
  2. Display a web page containing an apologetic message to the user.
  3. Save a technical web page with debugging information so that developers can look at it after-the-fact. These HTML-based error messages are stored one-per-file, if the SaveErrorMessages setting is true (the default). They are stored in the directory named by the ErrorMessagesDir (defaults to 'ErrorMsgs').
  4. Add an entry to the error log, found by default in Logs/Errors.csv.
  5. E-mail the error message if the EmailErrors setting is true, using the settings ErrorEmailServer and ErrorEmailHeaders. You'll need to configure these to activate this feature.
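
As a rough illustration of steps 1 through 4, the sketch below handles an exception in a similar way using only the standard library of modern Python. The function handle_uncaught, its parameters and the file naming are assumptions made for this example; WebKit's own handler may be organized quite differently, and the e-mail step is left out.

```python
import csv
import html
import os
import tempfile
import time
import traceback

USER_MESSAGE = 'The site is having technical difficulties with this page.'

def handle_uncaught(exc, script_name, error_dir, log_path):
    """Hypothetical handler mirroring steps 1-4; not WebKit's own code."""
    stamp = time.strftime('%Y-%m-%d %H:%M:%S')
    tb = ''.join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Step 1: log to the console.
    print('[%s] Error in %s\n%s' % (stamp, script_name, tb))
    # Step 3: save a technical page for developers.
    os.makedirs(error_dir, exist_ok=True)
    html_name = os.path.join(error_dir, 'Error-%s.html' % script_name)
    with open(html_name, 'w') as f:
        f.write('<p>%s</p><pre>%s</pre>' % (USER_MESSAGE, html.escape(tb)))
    # Step 4: append an entry to the CSV error log.
    with open(log_path, 'a', newline='') as f:
        csv.writer(f).writerow(
            [stamp, script_name, type(exc).__name__, str(exc), html_name])
    # Step 2: return the apologetic page shown to the user.
    return '<p>%s</p>' % USER_MESSAGE

work_dir = tempfile.mkdtemp()
try:
    1 / 0
except Exception as exc:
    page = handle_uncaught(exc, 'Demo',
                           os.path.join(work_dir, 'ErrorMsgs'),
                           os.path.join(work_dir, 'Errors.csv'))
```

The user only ever sees the short apology, while the traceback goes to the console, the saved page and the log.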

Here is a sample error page.

Archived error messages can be browsed through the administration page.

Error handling behavior can be configured as described in Configuration.

Configuration

There are several configuration parameters through which you can alter how WebKit behaves. They are described below, including their default values. Note that you can override the defaults by placing config files in the Configs/ directory. A config file simply contains a Python dictionary containing the items you wish to override. For example:

{
      'ServletsDir': 'MyApp',
      'ShowDebugInfoOnErrors': 1
}
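
The override mechanism can be sketched as follows in modern Python. The load_config helper is hypothetical and the DEFAULTS dictionary lists only a few of the settings documented in this guide; how WebKit itself reads its config files is not shown here and may differ.

```python
import ast

# A few defaults taken from the settings documented in this guide.
DEFAULTS = {
    'ServletsDir': 'Examples',
    'ShowDebugInfoOnErrors': 1,
    'LogActivity': 1,
}

def load_config(text, defaults=DEFAULTS):
    """Overlay the dictionary found in a config file onto the defaults.

    Hypothetical helper: WebKit's own loader may differ.
    """
    overrides = ast.literal_eval(text.strip())
    if not isinstance(overrides, dict):
        raise ValueError('a config file must contain a dictionary')
    merged = dict(defaults)      # copy, so the defaults stay untouched
    merged.update(overrides)     # items in the file win over defaults
    return merged

config_text = """
{
      'ServletsDir': 'MyApp',
      'ShowDebugInfoOnErrors': 0
}
"""
config = load_config(config_text)
```

Settings that the file does not mention keep their default values. Note that ast.literal_eval accepts only literals, so this particular sketch would reject an expression such as 60*60; that is a deliberate choice for safety in the example, not a statement about WebKit's behavior.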


Configs/AppServer.config:

PrintConfigAtStartUp
    = 1
Does pretty much what it says. It's generally a good idea to leave this on.
Verbose
    = 1
If true, then additional messages are printed while the AppServer runs, most notably information about each request such as size and response time.
Port
    = 8086
The port that the application server runs on. Change this if there is a conflict with another application on your server.
Multitasking
    = threading
Determines if and how the application server will multitask requests. Must be one of: threading, forking, sequencing.
PlugIns
    = ['../PSP']
Loads the plug-ins from the given locations when the application server starts up.


Configs/Application.config:

PrintConfigAtStartUp
    = 1
Does pretty much what it says. It's generally a good idea to leave this on.
ServletsDir
    = 'Examples'
This is where the application always looks for the servlets. This location does not appear in URLs; it's implicit. The path can be relative to the WebKit location, or an absolute path.
LogActivity
    = 1
If true, then the execution of each servlet is logged with useful information such as time, duration and whether or not an error occurred.
ActivityLogFilename
    = 'Logs/Activity.csv'
This is the name of the file that servlet executions are logged to. This value makes no difference if LogActivity is false/0. The path can be relative to the WebKit location, or an absolute path.
ActivityLogColumns
    -->
['request.remoteAddress', 'request.method', 'request.uri', 'response.size', 'servlet.name', 'request.timeStamp', 'transaction.duration', 'transaction.errorOccurred']

Specifies the columns that will be stored in the activity log. Each column can refer to an object from the set [application, transaction, request, response, servlet, session] and then refer to its attributes using "dot notation". The attributes can be methods or instance attributes and can be qualified arbitrarily deep.
ShowDebugInfoOnErrors
    = 1
If true, then uncaught exceptions will not only display a message for the user, but debugging information for the developer as well. This includes the traceback, HTTP headers, form fields, environment and process ids.
UserErrorMessage
    -->
'The site is having technical difficulties with this page. An error has been logged, and the problem will be fixed as soon as possible. Sorry!'

This is the error message that is displayed to the user when an uncaught exception escapes the servlet.
ErrorLogFilename
    = 'Logs/Errors.csv'
This is the name of the file where exceptions are logged. Each entry contains the date & time, filename, pathname, exception name & data, and the HTML error message filename (assuming there is one).
SaveErrorMessages
    = 1
If true, then errors (e.g., uncaught exceptions) will produce an HTML file with both the user message and debugging information. Developers/administrators can view these files after the fact, to see the details of what went wrong.
ErrorMessagesDir
    = 'ErrorMsgs'
This is the name of the directory where HTML error messages get stored.
EmailErrors
    = 0
If true, error messages are e-mailed out according to the ErrorEmailServer and ErrorEmailHeaders settings. This setting defaults to false because the other settings need to be configured first.
ErrorEmailServer
    = 'mail.-.com'
The SMTP server to use for sending e-mail error messages.
ErrorEmailHeaders
    -->
{
    'From':         '-@-.com',
    'To':           ['-@-.com'],
    'Reply-to':     '-@-.com',
    'Content-type': 'text/html',
    'Subject':      'Error'
}

The e-mail MIME headers used for e-mailing error messages. Be sure to configure 'From', 'To' and 'Reply-to' before using this feature.
Contexts
    -->
{ 'default':'Examples', 'Examples':'Examples'}

Provides the directory locations for various 'contexts' which are determined from the URL. The key is the context name and the value is the directory, which can be a relative (to WebKit) or absolute path. The special key 'default' is used when a URL is not context specific.
SessionTimeout
    = 60*60
Determines the amount of time (expressed in seconds) that passes before a user's session times out. When a session times out, all data associated with that session is lost.
InstanceCacheSize
    = 10
Determines the maximum number of servlet instances that will be kept in memory to serve a particular URL. This limit keeps the server from being overwhelmed by objects under the stress of many simultaneous requests.
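
The "dot notation" used by the ActivityLogColumns setting can be resolved along the following lines. This is a sketch in modern Python under stated assumptions: resolve_column, FakeRequest and FakeTransaction are hypothetical names, and the real objects expose many more attributes than shown.

```python
def resolve_column(name, objects):
    """Resolve a dotted column name such as 'request.method'.

    Hypothetical helper, not WebKit's implementation. Each step may be an
    instance attribute or a method taking no arguments.
    """
    parts = name.split('.')
    value = objects[parts[0]]
    for attr in parts[1:]:
        value = getattr(value, attr)
        if callable(value):
            value = value()      # methods are called to obtain the value
    return value

class FakeRequest:
    def __init__(self):
        self.remoteAddress = '127.0.0.1'

    def method(self):
        return 'GET'

    def uri(self):
        return '/WebKit.cgi/Welcome'

class FakeTransaction:
    def __init__(self, request):
        self._request = request

    def request(self):
        return self._request

    def errorOccurred(self):
        return 0

request = FakeRequest()
objects = {'request': request, 'transaction': FakeTransaction(request)}
columns = ['request.remoteAddress', 'request.method',
           'transaction.request.uri', 'transaction.errorOccurred']
row = [resolve_column(c, objects) for c in columns]
```

Because each step is looked up with getattr and called when it is a method, a column such as transaction.request.uri can be qualified as deeply as needed.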

Administration

WebKit has a built-in administration page that you can access by specifying _admin in the URL. This gives basic information and provides links to other administrative features like viewing the logs and configuration.

The error log display also contains links to the archived error messages so that you can browse them.

The administration scripts provide further examples of writing pages with WebKit, so you may wish to examine their source.

Here's an example of the admin page.

The URL might look something like:

http://www.host.com/WebKit.cgi/_admin

Debugging

As with all software development, you will need to debug your web application. The most popular techniques are detailed below.

print

The most common technique is the infamous print statement. The results of print statements go to the console where the WebKit application server was started (not to the HTML page as would happen with CGI). Prefixing the debugging output with a special tag (such as >>) is useful because it stands out on the console and you can search for the tag in source code to remove the print statements after they are no longer useful. For example:

print '>> fields =', self._request.fields()

Raising Exceptions

Uncaught exceptions are trapped at the application level where a useful error page is saved with information such as the traceback, environment, fields, etc. You can configure the application to automatically e-mail you this information. Here is an example error page.

When an application isn't behaving correctly, raising an exception can be useful because of the additional information that comes with it. Exceptions can be coupled with messages, thereby turning them into more powerful versions of the print statement. For example:

raise Exception, 'self = %s' % self

Restarting the Server

Servlet super classes don't get reloaded by the AppServer. Also, Webware can still be unstable due to its early age. If a problem becomes unsolvable, restart the server and see if it still occurs on the first attempt.

Assertions

Assertions are used to ensure that the internal conditions of the application are as expected. An assertion is equivalent to an if statement coupled with an exception. For example:

assert shoppingCart.total()>=0.0, 'shopping cart total is %0.2f' % shoppingCart.total()
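
To make the equivalence concrete, here is a short sketch in modern Python that writes the same check both ways. The ShoppingCart class is a hypothetical stand-in used only for illustration.

```python
class ShoppingCart:
    """Hypothetical cart used only for illustration."""
    def __init__(self, prices):
        self.prices = prices

    def total(self):
        return sum(self.prices)

def check_with_assert(cart):
    assert cart.total() >= 0.0, 'shopping cart total is %0.2f' % cart.total()

def check_with_if(cart):
    # The long-hand form of the assertion above.
    if not cart.total() >= 0.0:
        raise AssertionError('shopping cart total is %0.2f' % cart.total())

messages = []
for check in (check_with_assert, check_with_if):
    try:
        check(ShoppingCart([5.0, -7.5]))
    except AssertionError as exc:
        messages.append(str(exc))
```

Both functions raise AssertionError with the same message for a cart with a negative total. One practical difference is that Python removes assert statements when run with the -O option, whereas the explicit if statement always executes.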

Naming Conventions

Cookies and form values that are named with surrounding underscores (such as _sid_ and _action_) are reserved by WebKit for its own internal purposes. If you refrain from using surrounding underscores in your own names, then [a] you won't accidentally clobber an already existing internal name and [b] when new names are introduced by future versions of WebKit, they won't break your application.

Actions

Suppose you have a web page with a form and one or more buttons. Normally, when the form is submitted, a method such as Servlet's respondToPost() or Page's writeHTML() will be invoked. However, you may find it more useful to bind each button to a specific method of your servlet, such as new() or remove(), to implement the command, and reserve writeHTML() for displaying the page. Note that your "command methods" can then invoke writeHTML() after performing their task.

The action feature of Page lets you do this. The process goes like this:

1. Add buttons to your HTML form of type submit and name _action_. For example:

<input name=_action_ type=submit value=New>
<input name=_action_ type=submit value=Delete>

2. Add an actions() method to your class to state which actions are valid. This security requirement is important. Without it, hackers could invoke any servlet method they wanted! For example:

def actions(self): return SuperClass.actions(self) + ['New', 'Delete']

3. Unfortunately, the HTML submit button does not separate its value from its title/label. If your button labels don't match your method names, you will need to implement methodNameForAction() to provide the mapping. You could simply use a dictionary to create the mapping, or if you know there is some relationship you could write the logic for it. For example,

def methodNameForAction(self, name): return string.lower(name)

4. Now you implement your action methods.
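
Putting the four steps together, the dispatch might work roughly as in the sketch below, written in modern Python. The Page class here is a hypothetical, much-simplified stand-in, and its respond() method taking a dictionary of fields is an assumption made for the example; the real Page class does considerably more.

```python
class Page:
    """Hypothetical, much-simplified stand-in for WebKit's Page class."""
    def actions(self):
        return []

    def methodNameForAction(self, name):
        return name

    def writeHTML(self):
        return 'page'

    def respond(self, fields):
        action = fields.get('_action_')
        # Only names listed by actions() may be dispatched to methods.
        if action in self.actions():
            return getattr(self, self.methodNameForAction(action))()
        return self.writeHTML()

class ItemPage(Page):
    def actions(self):
        return Page.actions(self) + ['New', 'Delete']

    def methodNameForAction(self, name):
        return name.lower()

    def new(self):
        return 'created'

    def delete(self):
        return 'deleted'

page = ItemPage()
results = [
    page.respond({'_action_': 'New'}),
    page.respond({'_action_': 'Delete'}),
    page.respond({'_action_': 'respond'}),   # not a listed action
    page.respond({}),
]
```

The third request names a method that exists but is not listed by actions(), so it is ignored and the page is simply displayed. This is the protection that step 2 describes.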

The ListBox example shows the use of actions.

Plug-ins

A plug-in is a software component that is loaded by WebKit in order to provide additional WebKit functionality without necessarily having to modify WebKit's source.

The most infamous plug-in is PSP (Python Server Pages) which ships with Webware.

Plug-ins often provide additional servlet factories, servlet subclasses, examples and documentation. Ultimately, it is the plug-in author's choice as to what to provide and in what manner.

Technically, plug-ins are Python packages that follow a few simple conventions in order to work with WebKit. More information can be found in PlugIn.py's doc strings. You can learn more about Python packages in the Python Tutorial, 6.4: "Packages" at http://www.python.org/doc/current/tut/node8.html#SECTION008400000000000000000.

Files

*.py - Mostly the implementation of the various classes of the framework, except as noted below.

WebKit.cgi - A simple script that imports and runs WebKitCGIAdaptor. This also results in a cached byte code version of WebKitCGIAdaptor (e.g., a .pyc), provided you have write permission, of course.

CGIAdaptor.py - A program that collects HTTP requests together in a dictionary and passes them to the app server for processing. This allows WebKit applications to work with any web server that supports CGI.

FCGIAdaptor.py - Like CGIAdaptor, but in accordance with the FastCGI protocol for improved performance.

address.text - The hostname and port of the AppServer are written to this file when it starts up. The CGI and FastCGI adaptors read this in order to communicate with the AppServer.

Configs/AppServer.config - configuration file for the AppServer

Configs/Application.config - configuration file for the Application

Logs/Activity.csv - The log of servlets/pages accessed through the server as described above.

Logs/Errors.csv - The log of uncaught exceptions including date & time, exception and link to debugging page if one was saved.

ErrorMsgs/Error-scriptname-YYYY-MM-DD-*.html - Archived error messages.

How do I develop an app?

The answer to that question might not seem clear after being deluged with all the details. Here's a summary:

  1. Make sure you can run the WebKit AppServer. See the Install Guide for more information.
  2. Read the source to the examples (in WebKit/Examples), then modify one of them to get your toes wet.
  3. Create your own new example from scratch. Ninety-nine percent of the time you will be subclassing the Page class.
  4. Familiarize yourself with the class docs in order to take advantage of classes like Page, HTTPRequest, HTTPResponse and Session. Unfortunately, I couldn't get generated class docs working for this release, so you'll have to resort to breezing through the source code which is coupled with documentation strings. Read the examples first.
  5. With this additional knowledge, create more sophisticated pages.
  6. Contribute enhancements and bug fixes back to the project.   :-)
  7. Make sure you find out about new versions when they're released: Subscribe to the announcements list at http://lists.sourceforge.net/mailman/listinfo/webware-announce.

Known Bugs

Known bugs, and future work in general, are documented in Future.html.

Credit

Authors: Chuck Esterbrook, Jay Love

Several people, mostly on the webware-discuss mailing list, have provided feedback and testing.

The design was inspired by both Apple's WebObjects and Java's Servlets.

The Python SocketServer module was useful in creating the app server.