
Shared object scripts

Shared object scripts are a possibly unique feature of rws. A shared object script is a CGI script, written in C, which is loaded into the address space of the server at runtime. Thus shared object scripts are very fast because they are written in C, loaded just once, and able to run without needing a fork(2).

On the other hand, the penalty for speed is security, although competent C programmers who use all the features of c2lib and pthrlib should be able to write code which is free of buffer overflows and some other common security issues. (However, if you allow your server to run shared object scripts from untrusted third parties, then you have essentially no security at all, since shared object scripts can interfere with the internal workings of the webserver in arbitrary ways.)

The anatomy of a shared object script

A shared object script is a .so file (in other words, a shared library or DLL). It should contain a single external symbol called handle_request, prototyped as:

int handle_request (rws_request rq);

The rws_request object is defined in rws_request.h.

The first time that any client requests the shared object script, rws calls dlopen(3) on the file. As noted in the dlopen(3) manual page, this will cause _init and any constructor functions in the file to be run. Then rws creates the rws_request object (see below) and calls handle_request. The shared object script remains loaded in memory after handle_request has returned, ready for the next invocation.

On subsequent invocations, dlopen(3) is not called, so constructors only run once.

However, on each invocation, rws checks the modification time of the file on disk, and if it has changed, then it will attempt to reload the file. To do this, it calls dlclose(3) first, which will cause _fini and destructors in the library to run, and unloads the library from memory. It then reopens (dlopen(3)) the new file on disk, as above.

Beware that there are some occasions when rws cannot reload a shared object script, even though it notices that the file has changed on disk. rws keeps a use count of the number of threads currently using the shared object script, and for safety reasons it cannot reload the file until this use count drops to zero. This means that in some cases (e.g. under very heavy load) a shared object script might never be reloaded, even if it changes on disk.
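The reload rule described above can be sketched in a few lines of C. This is only an illustration of the decision, not the code rws actually uses: the so_entry structure and the can_reload_now function are hypothetical names invented for this example.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for one loaded script. This is NOT the
   real rws data structure, just enough state to show the rule. */
struct so_entry {
  time_t loaded_mtime;  /* mtime of the file when it was last opened */
  int use_count;        /* threads currently inside handle_request */
};

/* Return true if the library should be closed and reopened now. */
static bool
can_reload_now (const struct so_entry *e, time_t mtime_on_disk)
{
  if (mtime_on_disk == e->loaded_mtime)
    return false;             /* file unchanged: nothing to do */
  return e->use_count == 0;   /* changed: reload only when idle */
}
```

If the use count never reaches zero, the second test never succeeds, which is why a busy script might never be reloaded.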

Configuring rws to recognise shared object scripts

rws will not try to run shared object scripts unless the exec so flag has been set on the alias, and the shared object script itself is executable (mode 0755). Here is an example shared object scripts directory:

alias /so-bin/
	path: /usr/share/rws/so-bin
	exec so: 1
end alias

Make sure that the so-bin directory is only writable by trusted users, and make sure each shared object script is executable, mode 0755.

If you can't make your shared object scripts run, then here is a checklist before you email me:

I have quite successfully used gdb on a running server to debug and diagnose problems in shared object scripts. Note, however, that by default gdb may have trouble loading the symbol table for your script. Use the sharedlibrary script.so command to load symbols instead.

Shared object scripts vs. Monolith applications

If you've been looking at the Monolith application framework pages, then you may be confused about how shared object scripts relate to Monolith.

Shared object scripts are directly analogous to CGI scripts. The differences are that CGI scripts are usually written in very high level languages like Perl and PHP, whereas shared object scripts are written in C and loaded into the server process for efficiency. (Perl CGI scripts can also be loaded into the Apache server process using mod_perl, and this is done for similar reasons of efficiency.)

Monolith programs are entire applications, the sort of thing which normally would be written using dozens of cooperating CGI scripts. In the case of Monolith, however, the entire application compiles down to a single .so file which happens to be (you guessed it) a shared object script.

Imagine that you are going to write yet another web-based email client. For some reason you want to write this in C (please don't try this at home: I wrote one in Perl at my last job and that was hard enough). Here are three possible approaches using C and rws:

  1. Write forty or so shared object scripts. Some display a single frame of the application each, one might generate the frameset, and a couple of dozen implement specific operations like emptying trash or moving a message between folders.

    This is very much the normal way of writing CGI-based applications.

  2. Write a Monolith application. This will probably be in lots of C files, but will compile down and be linked into a single .so file (e.g. email.so) which is dropped into the so-bin directory.
  3. Write a Monolith email super-widget. This is going to exist in a shared library called /usr/lib/libmyemail.so with a corresponding header file defining the interface called myemail.h.

    Write a tiny Monolith application which just instantiates a window and an email widget, and embeds the email widget in the window. This will compile into email.so (it'll be very tiny) which is dropped into so-bin.

    The advantage of this final approach is that you can reuse the email widget in other places, or indeed sell it to other Monolith users.

So Monolith is good when you want to build applications from widgets, as you would if you were building a Java/Swing, Windows MFC, gtk or Tcl/Tk graphical application. It's also good if code re-use is important to you. Shared object scripts are good when you are familiar with CGI-based techniques for building websites.

Of course, the same rws server can serve shared object scripts, multiple Monolith applications, flat files, and directory listings, all at the same time.

Tutorial on writing shared object scripts

In this tutorial I will explain how the two shared object script examples supplied with rws work. You will also need to have read the tutorials for c2lib and pthrlib which you can find by going to their respective web pages.

The first example, hello.c, is very simple indeed. It's just a "hello world" program. The program starts by including rws_request.h:

#include <rws_request.h>

Following this is the handle_request function. This is the function which rws will call every time a user requests the script:

int
handle_request (rws_request rq)
{
  pseudothread pth = rws_request_pth (rq);
  http_request http_request = rws_request_http_request (rq);
  io_handle io = rws_request_io (rq);

  int close;
  http_response http_response;

  /* Begin response. */
  http_response = new_http_response (pth, http_request, io,
				     200, "OK");
  http_response_send_headers (http_response,
			      /* Content type. */
			      "Content-Type", "text/plain",
			      /* End of headers. */
			      NULL);
  close = http_response_end_headers (http_response);

  if (http_request_is_HEAD (http_request)) return close;

  io_fprintf (io, "hello, world!");

  return close;
}

We first extract some fields from the rws_request object. rws has already taken the time to parse the HTTP headers from the client, but we need to generate the reply headers (shared object scripts are always "nph" -- no parsed headers). The pthrlib functions new_http_response, http_response_send_headers and http_response_end_headers do this. Note that we send a Content-Type: text/plain header. You must always generate a correct Content-Type header.

If the original request was a HEAD request, then the client only wants to see the headers, so we stop here.

Otherwise we generate our message and return.
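To make the "nph" idea concrete, here is a rough sketch of the bytes that end up on the wire for the hello example. It uses only the standard C library. The format_plain_response helper is hypothetical and is not part of pthrlib, and the output is a simplification: the HTTP version in the status line is an assumption, and the real library may well send additional headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a pthrlib function): format a minimal
   response into buf. For a HEAD request the body is left out.
   Returns what snprintf returns, i.e. the length of the full
   response excluding the terminating NUL. */
static int
format_plain_response (char *buf, size_t size,
                       const char *body, bool is_head)
{
  return snprintf (buf, size,
                   "HTTP/1.1 200 OK\r\n"            /* status line */
                   "Content-Type: text/plain\r\n"   /* our one header */
                   "\r\n"                           /* end of headers */
                   "%s",
                   is_head ? "" : body);
}
```

The blank line produced by the lone "\r\n" is what marks the end of the headers. Everything after it is the body, which is why a HEAD request stops at that point.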

NB. Don't call io_fclose on the I/O handle! If you really want to force the connection to close, set the close variable to 1 and return it. This is because the client (or proxy) might be issuing several separate HTTP requests over the same kept-alive TCP connection.
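The reason the return value matters is easier to see from the server's side. The sketch below is a deliberately simplified model of a keep-alive loop, not the real rws code: handler_fn, serve_connection and the two sample handlers are names invented for this example, and a request is reduced to a plain number.

```c
/* Hypothetical handler type. Like handle_request, it returns
   non-zero to ask for the connection to be closed. */
typedef int (*handler_fn) (int request_nr);

/* Serve up to nr_requests requests arriving on one connection.
   Stops early if a handler asks for the connection to be closed.
   Returns the number of requests actually handled. */
static int
serve_connection (handler_fn handler, int nr_requests)
{
  int handled = 0;

  for (int i = 0; i < nr_requests; ++i)
    {
      int close = handler (i);
      handled++;
      if (close)
        break;        /* the server closes the connection, not the script */
    }
  return handled;
}

/* Sample handlers: one keeps the connection alive, one closes it. */
static int keep_alive_handler (int request_nr) { (void) request_nr; return 0; }
static int closing_handler (int request_nr) { (void) request_nr; return 1; }
```

In this model, a script that closed the I/O handle itself would break every later request in the loop, whereas returning a non-zero value lets the server finish cleanly.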

The second example, show_params.c, is just slightly more complex, but demonstrates how to do parameter parsing. After reading this you should have enough knowledge to go away and write your own shared object scripts that actually do useful stuff.

As before, we start by including a few useful headers:

#include <pool.h>
#include <vector.h>
#include <pthr_cgi.h>

#include <rws_request.h>

The handle_request function starts the same way as before:

int
handle_request (rws_request rq)
{
  pool pool = rws_request_pool (rq);
  pseudothread pth = rws_request_pth (rq);
  http_request http_request = rws_request_http_request (rq);
  io_handle io = rws_request_io (rq);

Then we define some variables that we're going to use:

  cgi cgi;
  int close, i;
  http_response http_response;
  vector headers, params;
  struct http_header header;
  const char *name, *value;

The actual job of parsing out the CGI parameters is simplified because pthrlib contains a CGI library (similar to Perl's CGI.pm):

  /* Parse CGI parameters. */
  cgi = new_cgi (pool, http_request, io);

The response phase begins by sending the HTTP headers as before:

  /* Begin response. */
  http_response = new_http_response (pth, http_request, io,
				     200, "OK");
  http_response_send_headers (http_response,
			      /* Content type. */
			      "Content-Type", "text/plain",
			      /* End of headers. */
			      NULL);
  close = http_response_end_headers (http_response);

  if (http_request_is_HEAD (http_request)) return close;

Now we print out the actual contents of both the http_request object and the cgi object. HTTP headers first:

  io_fprintf (io, "This is the show_params shared object script.\r\n\r\n");
  io_fprintf (io, "Your browser sent the following headers:\r\n\r\n");

  headers = http_request_get_headers (http_request);
  for (i = 0; i < vector_size (headers); ++i)
    {
      vector_get (headers, i, header);
      io_fprintf (io, "\t%s: %s\r\n", header.key, header.value);
    }

  io_fprintf (io, "----- end of headers -----\r\n");

Next we print the full URL (including the query string), the path alone, and the query string:

  io_fprintf (io, "The URL was: %s\r\n",
	      http_request_get_url (http_request));
  io_fprintf (io, "The path component was: %s\r\n",
	      http_request_path (http_request));
  io_fprintf (io, "The query string was: %s\r\n",
	      http_request_query_string (http_request));
  io_fprintf (io, "The query arguments were:\r\n");

Finally we print out the CGI parameters from the cgi object:

  params = cgi_params (cgi);
  for (i = 0; i < vector_size (params); ++i)
    {
      vector_get (params, i, name);
      value = cgi_param (cgi, name);
      io_fprintf (io, "\t%s=%s\r\n", name, value);
    }

  io_fprintf (io, "----- end of parameters -----\r\n");

  return close;
}
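If you are curious what parsing CGI parameters involves, the following standalone sketch splits a query string into name/value pairs using only the standard C library. It is not how pthrlib's cgi object is implemented: split_query, struct param and the size limits are invented for this example, and it skips URL-decoding and POST bodies, which a real CGI library has to handle.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PARAMS 16
#define MAX_LEN 64

/* Hypothetical fixed-size name/value pair. */
struct param { char name[MAX_LEN]; char value[MAX_LEN]; };

/* Split "a=1&b=2" into pairs. No URL-decoding is done, and
   over-long names or values are truncated. Returns the pair count. */
static int
split_query (const char *query, struct param *out, int max)
{
  int n = 0;
  const char *p = query;

  while (*p != '\0' && n < max)
    {
      size_t len = strcspn (p, "&");           /* one "name=value" item */
      const char *eq = memchr (p, '=', len);   /* '=' inside this item */
      size_t name_len = eq ? (size_t) (eq - p) : len;
      size_t value_len = eq ? len - name_len - 1 : 0;

      if (name_len > 0)                        /* skip empty items */
        {
          snprintf (out[n].name, MAX_LEN, "%.*s", (int) name_len, p);
          snprintf (out[n].value, MAX_LEN, "%.*s",
                    (int) value_len, eq ? eq + 1 : "");
          n++;
        }

      p += len;
      if (*p == '&')
        p++;                                   /* step over the separator */
    }
  return n;
}
```

A parameter with no '=' at all, such as debug in name=rws&lang=c&debug, comes out with an empty value.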

Further examples

That's the end of this tutorial. I hope you enjoyed it. Please contact the author about corrections or to obtain more information.


Richard Jones
Last modified: Wed Oct 9 20:02:40 BST 2002