<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<title>rws documentation index</title>
<style type="text/css"><!--
pre {
  background-color: #eeeeff;
}
--></style>
<body bgcolor="#ffffff">
<h1>rws documentation index</h1>

<h2>Shared object scripts</h2>
<p>
Shared object scripts are a possibly unique feature of <code>rws</code>.
A shared object script is a CGI script, written in C, which
is loaded into the address space of the server at runtime.
Shared object scripts are therefore very fast: they are
written in C, loaded just once, and run without
needing a <code>fork(2)</code>.
</p>
<p>
The price of this speed is reduced security, although
competent C programmers who use all the features of
<a href="http://www.annexia.org/freeware/c2lib/">c2lib</a> and
<a href="http://www.annexia.org/freeware/pthrlib/">pthrlib</a>
should be able to write code which is free of buffer overflows
and some other common security issues. However, if you allow
your server to run shared object scripts from untrusted
third parties, then you have essentially no security at all, since
shared object scripts can interfere with the internal workings
of the webserver in arbitrary ways.
</p>
<h3>The anatomy of a shared object script</h3>

<p>
A shared object script is a <q><code>.so</code></q>
file (in other words, a shared library or <q>DLL</q>).
It should contain a single external symbol called
<code>handle_request</code>, prototyped as:
</p>

<pre>
int handle_request (rws_request rq);
</pre>
<p>
The <code>rws_request</code> object is defined in
<code>rws_request.h</code>.
</p>
<p>
The first time that any client requests the shared
object script, <code>rws</code> calls <code>dlopen(3)</code>
on the file. As noted in the <code>dlopen(3)</code>
manual page, this will cause <code>_init</code> and any
constructor functions in the file to be run.
Then <code>rws</code> creates the <code>rws_request</code>
object (see below) and calls <code>handle_request</code>.
The shared object script remains loaded in memory
after <code>handle_request</code> has returned, ready
for the next invocation.
</p>
<p>
On subsequent invocations, <code>dlopen(3)</code> is
<em>not</em> called, so constructors only run once.
</p>
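<p>
To make the run-once behaviour concrete, here is a minimal sketch using the
GCC/Clang <code>__attribute__((constructor))</code> extension. The names
<code>init_count</code>, <code>script_init</code> and <code>serve_request</code>
are hypothetical and not part of <code>rws</code>; <code>serve_request</code>
merely stands in for <code>handle_request</code>. In a real shared object
script the constructor would run when the file is loaded with
<code>dlopen(3)</code>; in a standalone program it runs once before
<code>main</code>.
</p>

```c
#include <assert.h>

/* Hypothetical counter: how many times the constructor has run. */
static int init_count = 0;

/* GCC/Clang extension: called automatically when the object is loaded
 * (at dlopen(3) time for a shared object script). */
__attribute__((constructor))
static void
script_init (void)
{
  init_count++;
}

/* Stand-in for handle_request: per-request work which relies on the
 * one-off initialisation having already happened exactly once. */
int
serve_request (void)
{
  assert (init_count == 1);
  return init_count;
}
```

<p>
However many times <code>serve_request</code> is called, the counter stays
at 1, because nothing re-runs the constructor until the object is unloaded
and loaded again.
</p>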
<p>
However, on each invocation, <code>rws</code> checks the
modification time of the file on disk, and if it has
changed, then it will attempt to reload the file. To
do this, it calls <code>dlclose(3)</code> first, which
will cause <code>_fini</code> and destructors in the
library to run, and unloads the library from memory. It
then reopens (<code>dlopen(3)</code>) the new file on
disk, as above. Beware that there are some occasions when
<code>rws</code> cannot reload a shared object
script, even though it notices that the file has changed
on disk. <code>rws</code> keeps a count of the number
of threads currently using the shared object script, and
for safety reasons it cannot reload the file until this
use count drops to zero. This means that in some cases
(e.g. under very heavy load) a shared object script might
never be reloaded, even if it changes on disk.
</p>
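<p>
The reload rule described above can be sketched as a small decision
function. This is only an illustration of the logic, not the code
<code>rws</code> actually uses: <code>struct so_entry</code>,
<code>enum reload_action</code> and <code>check_reload</code> are
hypothetical names, and the real server would follow a "reload now"
decision with <code>dlclose(3)</code> and <code>dlopen(3)</code>.
</p>

```c
#include <assert.h>
#include <time.h>

/* Hypothetical bookkeeping for one loaded shared object script. */
struct so_entry {
  time_t loaded_mtime;  /* mtime of the file when it was last loaded */
  int use_count;        /* threads currently running the script */
};

enum reload_action { KEEP_LOADED, RELOAD_NOW, RELOAD_DEFERRED };

/* Decide what to do, given the file's current mtime on disk. */
enum reload_action
check_reload (const struct so_entry *e, time_t disk_mtime)
{
  assert (e != NULL);
  if (disk_mtime == e->loaded_mtime)
    return KEEP_LOADED;       /* file unchanged */
  if (e->use_count > 0)
    return RELOAD_DEFERRED;   /* changed, but still in use */
  return RELOAD_NOW;          /* changed and idle: safe to reload */
}
```

<p>
Under sustained load the use count may never reach zero at the moment of
the check, which is why a changed script can stay in the deferred state
indefinitely.
</p>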
<h3>Configuring rws to recognise shared object scripts</h3>

<p>
<code>rws</code> will not try to run shared object scripts
unless the <code>exec so</code> flag has been set on the
alias, and the shared object script itself is executable (mode 0755).
Here is an example shared object scripts directory:
</p>

<pre>
path: /usr/share/rws/so-bin
</pre>
<p>
Make sure that the <code>so-bin</code> directory is only
writable by trusted users, and make sure each shared object
script is executable, mode 0755.
</p>
<p>
If you can't make your shared object scripts run, then here
is a checklist to work through before you email me:
</p>

<ul>
<li> Have you put the above alias section into
the correct host file?
<li> Is the <code>exec so</code> option set?
<li> Have you restarted <code>rwsd</code>?
<li> Is the directory world readable and executable (mode 0755)?
<li> Is the shared object script world readable and executable (mode 0755)?
<li> Are there any unresolved symbols (<code>ldd -r script.so</code>), apart
from the <code>rws_request_*</code> symbols which will be resolved
when the library is loaded into <code>rws</code>?
<li> Is the <code>handle_request</code> function missing?
<li> Is <code>handle_request</code> exported in the dynamic
symbol table (<code>nm -D script.so</code>)?
<li> Check the contents of your error_log file to see
if any error messages were reported.
</ul>
<p>
I have successfully used <code>gdb</code> on a running
server to debug and diagnose problems in shared object
scripts. Note, however, that by default <code>gdb</code> may
have trouble loading the symbol table for your script. Use
the <code>sharedlibrary script.so</code>
command to load symbols instead.
</p>
<h3>Shared object scripts vs. Monolith applications</h3>

<p>
If you've been looking at the
<a href="http://www.annexia.org/freeware/monolith/">Monolith
application framework</a> pages, then you may be confused
about how shared object scripts relate to Monolith.
</p>
<p>
Shared object scripts are directly analogous to CGI scripts.
The differences are that CGI scripts are usually written
in very high-level languages like Perl and PHP, whereas shared
object scripts are written in C and loaded into the server process
for efficiency.
(Perl CGI scripts can also be loaded into the Apache
server process using <code>mod_perl</code>, and this is done
for similar reasons of efficiency.)
</p>
<p>
Monolith programs are entire applications, the sort of
thing which normally would be written using dozens of
cooperating CGI scripts. In the case of Monolith, however,
the entire application compiles down to a single <code>.so</code>
file which happens to be (you guessed it) a shared object script.
</p>
<p>
Imagine that you are going to write yet another web-based email
client. For some reason you want to write this in C (please
don't try this at home: I wrote one in Perl at my last job and
that was hard enough). Here are three possible approaches
using C and <code>rws</code>:
</p>
<ol>
<li>
<p>
Write forty or so shared object scripts. Each displays
a single frame of the application: one might generate
the frameset, and a couple of dozen might implement specific
operations like emptying the trash or moving a message between
folders.
</p>
<p>
This is very much the normal way of writing CGI-based
applications.
</p>
<li> Write a Monolith application. This will probably be
in lots of C files, but will compile down and be linked
into a single <code>.so</code> file (e.g. <code>email.so</code>)
which is dropped into the <code>so-bin</code> directory.
<li>
<p>
Write a Monolith email super-widget. This is going
to exist in a shared library called
<code>/usr/lib/libmyemail.so</code>
with a corresponding header file defining the interface
called <code>myemail.h</code>.
</p>
<p>
Write a tiny Monolith application which just instantiates
a window and an email widget, and embeds the email widget
in the window. This will compile into <code>email.so</code>
(it'll be very tiny) which is dropped into <code>so-bin</code>.
</p>
<p>
The advantage of this final approach is that you can
reuse the email widget in other places, or indeed sell
it to other Monolith users.
</p>
</ol>
<p>
So Monolith is good when you want to build applications
from widgets, as you would if you were building a
Java/Swing, Windows MFC, gtk or Tcl/Tk graphical application.
It's also good if code reuse is important to you.
Shared object scripts are good when you are familiar with
CGI-based techniques for building websites.
</p>
<p>
Of course, the same <code>rws</code> server can serve
shared object scripts, multiple Monolith applications,
flat files, and directory listings, all at the same time.
</p>
<h3>Tutorial on writing shared object scripts</h3>

<p>
In this tutorial I will explain how the two shared object
script examples supplied with <code>rws</code> work. You
will also need to have read the tutorials for
<a href="http://www.annexia.org/freeware/c2lib/">c2lib</a> and
<a href="http://www.annexia.org/freeware/pthrlib/">pthrlib</a>,
which you can find on their respective web pages.
</p>
<p>
The first example, <code>hello.c</code>, is very simple indeed.
It's just a "hello world" program. The program starts by
including <code>rws_request.h</code>:
</p>

<pre>
#include &lt;rws_request.h&gt;
</pre>
<p>
Following this is the <code>handle_request</code>
function. This is the function which <code>rws</code>
will call every time a user requests the script:
</p>

<pre>
int
handle_request (rws_request rq)
{
  pseudothread pth = rws_request_pth (rq);
  http_request http_request = rws_request_http_request (rq);
  io_handle io = rws_request_io (rq);

  int close;
  http_response http_response;

  /* Begin response. */
  http_response = new_http_response (pth, http_request, io,
                                     200, "OK");
  http_response_send_headers (http_response,
                              "Content-Type", "text/plain",
                              /* End of headers. */
                              NULL);
  close = http_response_end_headers (http_response);

  if (http_request_is_HEAD (http_request)) return close;

  io_fprintf (io, "hello, world!");

  return close;
}
</pre>
<p>
We first extract some fields from the <code>rws_request</code>
object. <code>rws</code> has already taken the time to
parse the HTTP headers from the client, but we need to
generate the reply headers ourselves (shared object scripts
are always "nph", meaning "no parsed headers"). The
<code>pthrlib</code> functions
<code>new_http_response</code>,
<code>http_response_send_headers</code> and
<code>http_response_end_headers</code> do this. Note
that we send a <code>Content-Type: text/plain</code>
header. You must always generate a correct
<code>Content-Type</code> header.
</p>
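<p>
To show roughly what "generating the reply headers" amounts to on the wire,
here is a self-contained sketch which formats a minimal header block into a
buffer. It is <em>not</em> how <code>pthrlib</code> implements
<code>new_http_response</code> and friends: the function name
<code>format_nph_headers</code> is hypothetical, the <code>HTTP/1.1</code>
version string is an assumption, and a real response would normally carry
further headers as well.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a status line, one Content-Type header and the blank line
 * which ends the headers into buf.  Returns the number of bytes
 * written, or -1 if buf is too small. */
int
format_nph_headers (char *buf, size_t len, int code,
                    const char *msg, const char *content_type)
{
  int n;

  assert (buf != NULL && msg != NULL && content_type != NULL);
  n = snprintf (buf, len,
                "HTTP/1.1 %d %s\r\n"
                "Content-Type: %s\r\n"
                "\r\n",
                code, msg, content_type);
  if (n < 0 || (size_t) n >= len)
    return -1;
  return n;
}
```

<p>
The empty line at the end is what separates the headers from the body, so
anything the script prints afterwards is treated as content.
</p>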
<p>
If the original request was a <code>HEAD</code> request, then
the client only wants to see the headers, so we stop here.
</p>

<p>
Otherwise we generate our message and return.
</p>
<p>
NB. Don't call <code>io_fclose</code> on the I/O handle, because
the client (or proxy) might be issuing several
separate HTTP requests over the same kept-alive TCP connection.
If you really want to force the connection to close, set the
<code>close</code> variable to 1 and return it.
</p>
<p>
The second example, <code>show_params.c</code>, is only slightly
more complex, but demonstrates how to do parameter parsing.
After reading this you should know enough to
go away and write your own shared object scripts that
actually do something useful.
</p>
<p>
As before, we start by including a few useful headers:
</p>

<pre>
#include &lt;pool.h&gt;
#include &lt;vector.h&gt;
#include &lt;pthr_cgi.h&gt;

#include &lt;rws_request.h&gt;
</pre>
<p>
The <code>handle_request</code> function starts the same way
as before:
</p>

<pre>
int
handle_request (rws_request rq)
{
  pool pool = rws_request_pool (rq);
  pseudothread pth = rws_request_pth (rq);
  http_request http_request = rws_request_http_request (rq);
  io_handle io = rws_request_io (rq);
</pre>
<p>
Then we define some variables that we're going to use:
</p>

<pre>
  cgi cgi;
  http_response http_response;
  vector headers, params;
  struct http_header header;
  const char *name, *value;
  int close, i;
</pre>
<p>
The actual job of parsing out the CGI parameters is simplified
because <code>pthrlib</code> contains a CGI library
(similar to Perl's <code>CGI.pm</code>):
</p>

<pre>
  /* Parse CGI parameters. */
  cgi = new_cgi (pool, http_request, io);
</pre>
<p>
The response phase begins by sending the HTTP
headers:
</p>

<pre>
  /* Begin response. */
  http_response = new_http_response (pth, http_request, io,
                                     200, "OK");
  http_response_send_headers (http_response,
                              "Content-Type", "text/plain",
                              /* End of headers. */
                              NULL);
  close = http_response_end_headers (http_response);

  if (http_request_is_HEAD (http_request)) return close;
</pre>
<p>
Now we print out the actual contents of both the
<code>http_request</code> object and the <code>cgi</code>
object. HTTP headers first:
</p>

<pre>
  io_fprintf (io, "This is the show_params shared object script.\r\n\r\n");
  io_fprintf (io, "Your browser sent the following headers:\r\n\r\n");

  headers = http_request_get_headers (http_request);
  for (i = 0; i &lt; vector_size (headers); ++i)
    {
      vector_get (headers, i, header);
      io_fprintf (io, "\t%s: %s\r\n", header.key, header.value);
    }

  io_fprintf (io, "----- end of headers -----\r\n");
</pre>
<p>
The full URL (including the query string), the path alone,
and the query string come next:
</p>

<pre>
  io_fprintf (io, "The URL was: %s\r\n",
              http_request_get_url (http_request));
  io_fprintf (io, "The path component was: %s\r\n",
              http_request_path (http_request));
  io_fprintf (io, "The query string was: %s\r\n",
              http_request_query_string (http_request));
  io_fprintf (io, "The query arguments were:\r\n");
</pre>
<p>
Finally we print out the CGI parameters from the <code>cgi</code>
object:
</p>

<pre>
  params = cgi_params (cgi);
  for (i = 0; i &lt; vector_size (params); ++i)
    {
      vector_get (params, i, name);
      value = cgi_param (cgi, name);
      io_fprintf (io, "\t%s=%s\r\n", name, value);
    }

  io_fprintf (io, "----- end of parameters -----\r\n");

  return close;
}
</pre>
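<p>
For readers curious about what a CGI library has to do behind the scenes,
the following sketch splits a query string into name/value pairs. It is an
illustration only and is not taken from <code>pthrlib</code>:
<code>struct param</code>, <code>MAX_PARAMS</code> and
<code>split_query</code> are hypothetical names, the input string is
modified in place, and no <code>%XX</code> or <code>+</code> decoding is
attempted.
</p>

```c
#include <assert.h>
#include <string.h>

#define MAX_PARAMS 16

struct param { char *name; char *value; };

/* Split "a=1&b=2" in place into name/value pairs.  Modifies qs.
 * Returns the number of pairs stored (at most MAX_PARAMS).
 * A name without '=' gets an empty value; empty names are skipped. */
int
split_query (char *qs, struct param out[MAX_PARAMS])
{
  int n = 0;
  char *p = qs;

  assert (qs != NULL && out != NULL);
  while (*p != '\0' && n < MAX_PARAMS)
    {
      char *amp = strchr (p, '&');
      char *eq;

      if (amp) *amp = '\0';          /* terminate this pair */
      eq = strchr (p, '=');
      if (eq) *eq++ = '\0';          /* terminate the name */
      if (*p != '\0')
        {
          out[n].name = p;
          out[n].value = eq ? eq : p + strlen (p);
          n++;
        }
      if (!amp) break;               /* that was the last pair */
      p = amp + 1;
    }
  return n;
}
```

<p>
In a shared object script you would normally let <code>new_cgi</code> and
<code>cgi_param</code> do this work, as shown above, rather than parsing the
query string by hand.
</p>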
<h2>Further examples</h2>

<p>
That's the end of this tutorial. I hope you enjoyed it. Please
contact the author about corrections or to obtain more information.
</p>
<h2>Links to manual pages</h2>

<ul>
<li> <a href="rws_request_canonical_path.3.html"><code>rws_request_canonical_path(3)</code></a> </li>
<li> <a href="rws_request_file_path.3.html"><code>rws_request_file_path(3)</code></a> </li>
<li> <a href="rws_request_host_header.3.html"><code>rws_request_host_header(3)</code></a> </li>
<li> <a href="rws_request_http_request.3.html"><code>rws_request_http_request(3)</code></a> </li>
<li> <a href="rws_request_io.3.html"><code>rws_request_io(3)</code></a> </li>
<li> <a href="rws_request_pool.3.html"><code>rws_request_pool(3)</code></a> </li>
<li> <a href="rws_request_pth.3.html"><code>rws_request_pth(3)</code></a> </li>
</ul>
<address><a href="mailto:rich@annexia.org">Richard Jones</a></address>
<!-- Created: Wed May 1 19:36:16 BST 2002 -->
Last modified: Wed Oct 9 20:02:40 BST 2002