c2lib is a library of basic tools for use by C programmers. It contains features heavily influenced by both Perl's string handling and C++'s Standard Template Library (STL).

The primary aims of c2lib are:
```c
#include <pool.h>
#include <pstring.h>

const char *strings[] = { "John", "Paul", "George", "Ringo" };

int main (void)
{
  pool pool = global_pool;
  vector v = pvectora (pool, strings, 4);
  printf ("Introducing the Beatles: %s\n", pjoin (pool, v, ", "));
}
```
When run, this program prints:
```
Introducing the Beatles: John, Paul, George, Ringo
```
Compare this to the equivalent Perl code:
```perl
#!/usr/bin/perl

printf "Introducing the Beatles: %s\n",
       join(", ", "John", "Paul", "George", "Ringo");
```
The pjoin(3) function on line 10 is equivalent to the plain join function in Perl. It takes a list of strings and joins them with a separator string (in this case ", "), and creates a new string which is returned and printed.
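To make the behaviour of a join concrete, here is a minimal sketch in plain C that does not use c2lib at all. The helper name `join_strings` is hypothetical, and it allocates with malloc(3), so the caller must free the result; pjoin(3) instead allocates in the pool you pass it. This illustrates the operation only and is not how pjoin(3) is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of c2lib: join n strings with sep.
 * Returns a malloc'd string which the caller must free, or NULL
 * if the allocation fails. */
char *
join_strings (const char *const *strs, size_t n, const char *sep)
{
  size_t i, len = 1;            /* 1 for the terminating '\0' */
  size_t sep_len;
  char *result, *p;

  assert (strs != NULL && sep != NULL);
  sep_len = strlen (sep);

  /* First pass: work out how much space is needed. */
  for (i = 0; i < n; ++i)
    len += strlen (strs[i]) + (i > 0 ? sep_len : 0);

  result = malloc (len);
  if (result == NULL)
    return NULL;

  /* Second pass: copy the pieces into place. */
  p = result;
  for (i = 0; i < n; ++i)
    {
      size_t l = strlen (strs[i]);

      if (i > 0)
        {
          memcpy (p, sep, sep_len);
          p += sep_len;
        }
      memcpy (p, strs[i], l);
      p += l;
    }
  *p = '\0';

  return result;
}
```

Joining the four names from the program above with ", " as the separator gives the same string that pjoin(3) produced there.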
The pvectora(3) function (line 9) takes a normal C array of strings and converts it into a c2lib vector. You will find out more about vectors later.

In this case all our allocations are done in a standard pool which is created automatically before main is called and deleted after main returns. This pool is called global_pool(3). You will find out more about pools below.

Notice that, as with most c2lib programs, there is no need to explicitly deallocate (free) objects once you have finished using them. Almost all of the time, objects are freed automatically for you when the pool they were allocated in is deleted.
```c
#include <pool.h>
#include <vector.h>
#include <pstring.h>

int main (void)
{
  pool pool = global_pool;
  vector v = new_vector (pool, int);
  int i, prod = 1;

  for (i = 1; i <= 10; ++i)
    vector_push_back (v, i);

  for (i = 0; i < vector_size (v); ++i)
    {
      int elem;

      vector_get (v, i, elem);
      prod *= elem;
    }

  printf ("product of integers: %s = %d\n",
          pjoin (pool, pvitostr (pool, v), " * "),
          prod);
}
```
When run:
```
product of integers: 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9 * 10 = 3628800
```
The call to new_vector(3) on line 8 creates a new vector object (an abstract data type). In this case the vector is allocated in the global pool and you have told it that each element of the vector will be of type int. Vectors are arrays which automatically expand when you push elements onto them. This vector behaves very much like a C++ STL vector<int> or a Perl array.

On lines 11-12, we push the numbers 1 through 10 onto the vector. The vector_push_back(3) function pushes an element onto the end of the vector. There are also vector_pop_back(3) (which removes and returns the last element of a vector), vector_push_front(3) and vector_pop_front(3) operations.

Lines 14-20 show the general pattern for iterating over the elements in a vector. The call to vector_get(3) (line 18) copies the ith element of vector v into the variable elem.

Finally, lines 22-24 print out the result. We use the pjoin(3) function again to join the numbers with the string " * " between each pair.

Also note the use of the strange pvitostr(3) function. pjoin(3) is expecting a vector of strings (i.e. a vector of char *), but we have a vector of int, which is incompatible. The pvitostr(3) function converts a vector of integers into a vector of strings.
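The idea behind this conversion can be sketched in plain C. The helpers `ints_to_strings` and `free_strings` below are hypothetical, and they use malloc(3) and snprintf(3) on ordinary arrays rather than pools and vectors, so they illustrate the conversion only, not how pvitostr(3) is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Free the first n strings of strs, and strs itself. */
void
free_strings (char **strs, size_t n)
{
  size_t i;

  if (strs == NULL)
    return;
  for (i = 0; i < n; ++i)
    free (strs[i]);
  free (strs);
}

/* Hypothetical helper, not part of c2lib: convert an array of n ints
 * into a newly allocated array of n decimal strings.  Returns NULL
 * if any allocation fails.  Free the result with free_strings. */
char **
ints_to_strings (const int *values, size_t n)
{
  size_t i;
  char **strs;

  assert (values != NULL || n == 0);

  strs = calloc (n > 0 ? n : 1, sizeof *strs);
  if (strs == NULL)
    return NULL;

  for (i = 0; i < n; ++i)
    {
      /* With a zero-sized buffer, snprintf writes nothing and
       * returns the number of characters the output needs. */
      int len = snprintf (NULL, 0, "%d", values[i]);

      strs[i] = malloc ((size_t) len + 1);
      if (strs[i] == NULL)
        {
          free_strings (strs, i);
          return NULL;
        }
      snprintf (strs[i], (size_t) len + 1, "%d", values[i]);
    }

  return strs;
}
```

The resulting array of char * is the kind of data a join function can consume, which is why the program above converts the vector before joining it.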
The c2lib library stores vectors as arrays and reallocates them using prealloc(3) whenever it needs to expand them. This means that some operations on vectors are efficient and others are less so. Getting an element of a vector or replacing an element in the middle of a vector are both fast O(1) operations, equivalent to the ordinary C index ([]) operator. vector_push_back(3) and vector_pop_back(3) are also fast. However, vector_push_front(3) and vector_pop_front(3) are O(n) operations because they require the library to shift all the elements in the array by one place. Normally, if your vectors are short (say, fewer than 100 elements), the speed difference will not be noticeable, whereas the productivity gains from using vectors over hand-rolled linked lists or other structures will be large.

The vector type also allows you to insert and remove elements in the middle of the array, as shown in the next example:
```c
#include <pool.h>
#include <vector.h>
#include <pstring.h>

int main (void)
{
  pool pool = global_pool;
  vector v = pvector (pool,
                      "a", "b", "c", "d", "e",
                      "f", "g", "h", "i", "j", 0);
  const char *X = "X";

  printf ("Original vector contains: %s\n",
          pjoin (pool, v, ", "));

  vector_erase_range (v, 3, 6);

  printf ("After erasing elements 3-5, vector contains: %s\n",
          pjoin (pool, v, ", "));

  vector_insert (v, 3, X);
  vector_insert (v, 4, X);
  vector_insert (v, 5, X);

  printf ("After inserting 3 Xs, vector contains: %s\n",
          pjoin (pool, v, ", "));

  vector_clear (v);
  vector_fill (v, X, 10);

  printf ("After clearing and inserting 10 Xs, vector contains: %s\n",
          pjoin (pool, v, ", "));
}
```
When run:
```
Original vector contains: a, b, c, d, e, f, g, h, i, j
After erasing elements 3-5, vector contains: a, b, c, g, h, i, j
After inserting 3 Xs, vector contains: a, b, c, X, X, X, g, h, i, j
After clearing and inserting 10 Xs, vector contains: X, X, X, X, X, X, X, X, X, X
```
This example demonstrates the following functions:

- vector_erase_range(3), which erases a range of elements from the middle of a vector. As the output shows, the end index is exclusive: vector_erase_range (v, 3, 6) erases elements 3, 4 and 5.
- vector_insert(3), which inserts a single element into a vector.
- vector_clear(3), which clears the vector completely, removing all elements.
- vector_fill(3), which fills a vector with identical elements.

For more information, see the respective manual pages.
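The cost model described earlier (cheap operations at the end of the vector, O(n) operations at the front or in the middle) follows directly from the array representation. The sketch below is a minimal growable array of int in plain C. The names `intvec`, `intvec_insert` and `intvec_push_back` are hypothetical, the doubling growth policy is an assumption made for the example, and it uses realloc(3) where c2lib uses prealloc(3).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of int, for illustration only. */
struct intvec
{
  int *data;
  size_t used;       /* number of elements stored */
  size_t allocated;  /* number of elements there is room for */
};

/* Make room for at least one more element.  Returns 0 on success,
 * -1 if out of memory. */
static int
intvec_reserve_one (struct intvec *v)
{
  size_t n;
  int *p;

  if (v->used < v->allocated)
    return 0;

  n = v->allocated ? v->allocated * 2 : 8;
  p = realloc (v->data, n * sizeof *p);
  if (p == NULL)
    return -1;
  v->data = p;
  v->allocated = n;
  return 0;
}

/* Insert elem before index i (0 <= i <= used).  Everything from i
 * onwards is shifted up by one place, so this is O(n) in general. */
int
intvec_insert (struct intvec *v, size_t i, int elem)
{
  assert (v != NULL && i <= v->used);

  if (intvec_reserve_one (v) == -1)
    return -1;
  memmove (&v->data[i + 1], &v->data[i], (v->used - i) * sizeof (int));
  v->data[i] = elem;
  v->used++;
  return 0;
}

/* Appending is the special case where nothing has to be shifted,
 * so it is O(1) apart from the occasional reallocation. */
int
intvec_push_back (struct intvec *v, int elem)
{
  return intvec_insert (v, v->used, elem);
}
```

A caller starts with `struct intvec v = { NULL, 0, 0 };` and frees `v.data` when finished. Pushing at the front is simply `intvec_insert (&v, 0, elem)`, which moves every existing element.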
You can store just about anything in a vector: strings, pointers, wide integers, complex structures, etc. If you do want to directly store large objects in a vector, you must remember that the vector type actually copies those objects into and out of the vector each time you insert, push, get, pop and so on. For some large structures, you may want to store a pointer instead (in fact with strings you have no choice: you are always storing a pointer in the vector itself).
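The difference between storing a structure and storing a pointer to it can be seen without any library at all. In the sketch below a single slot stands in for one element of a vector, and `struct record` is a made-up example type: storing by value copies all of the structure's bytes, while storing a pointer copies only the address.

```c
#include <assert.h>
#include <string.h>

/* A made-up "large" structure, for illustration only. */
struct record
{
  int id;
  char payload[1024];
};

/* Store r by value: the whole structure is copied into the slot,
 * so later changes to *r do not affect the stored copy. */
void
store_by_value (struct record *slot, const struct record *r)
{
  assert (slot != NULL && r != NULL);
  memcpy (slot, r, sizeof *r);
}

/* Store r by pointer: only the address is copied, so the slot and
 * the caller share a single object. */
void
store_by_pointer (struct record **slot, struct record *r)
{
  assert (slot != NULL);
  *slot = r;
}
```

The trade-off is the usual one: a stored copy is independent of the original but costs a full copy on every insert and get, whereas a stored pointer is cheap but requires the pointed-to object to stay alive for as long as the container refers to it.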
char *
c2lib doesn't have a fancy string type. Instead we just use plain old char *. This is possible because pools (see below) mean that we don't need to worry about when to copy or deallocate specific objects.

The great benefit of using plain char * for strings is that we can continue to use the familiar libc functions such as strcmp(3), strcpy(3), strlen(3), printf(3) and so on, as in the next example.
```c
#include <assert.h>
#include <pstring.h>

char *given_name = "Richard";
char *family_name = "Jones";
char *email_address = "rich@annexia.org";

int main (void)
{
  pool pool = global_pool;
  char *email, *s;
  vector v;

  email =
    psprintf (pool, "%s %s <%s>", given_name, family_name, email_address);

  printf ("full email address is: %s\n", email);

  v = pstrcsplit (pool, email, ' ');

  printf ("split email into %d components\n", vector_size (v));

  vector_get (v, 0, s);
  printf ("first component is: %s\n", s);
  assert (strcmp (s, given_name) == 0);

  vector_get (v, 1, s);
  printf ("second component is: %s\n", s);
  assert (strcmp (s, family_name) == 0);

  vector_get (v, 2, s);
  printf ("third component is: %s\n", s);
  s = pstrdup (pool, s);
  s++;
  s[strlen(s)-1] = '\0';
  assert (strcmp (s, email_address) == 0);
}
```
When run:
```
full email address is: Richard Jones <rich@annexia.org>
split email into 3 components
first component is: Richard
second component is: Jones
third component is: <rich@annexia.org>
```
Line 15 demonstrates the psprintf(3) function, which is like the ordinary sprintf(3), but is (a) safe, because it does not write into a fixed-size buffer that could overflow, and (b) allocates the string in the pool provided, ensuring that it will be safely deallocated later.
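One common way to build such a function is to ask vsnprintf(3) for the required length first and then allocate exactly that much. The sketch below does this with malloc(3). The name `xasprintf` is hypothetical, and this is not necessarily how psprintf(3) is implemented; in particular the caller must free the result here, whereas psprintf(3) leaves that to the pool.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of c2lib: like sprintf, but returns
 * a malloc'd string of exactly the right size, or NULL on error. */
char *
xasprintf (const char *fmt, ...)
{
  va_list args;
  int len;
  char *s;

  assert (fmt != NULL);

  /* First call: measure.  With a zero-sized buffer, vsnprintf writes
   * nothing and returns the number of characters needed. */
  va_start (args, fmt);
  len = vsnprintf (NULL, 0, fmt, args);
  va_end (args);
  if (len < 0)
    return NULL;

  s = malloc ((size_t) len + 1);
  if (s == NULL)
    return NULL;

  /* Second call: format into the correctly sized buffer. */
  va_start (args, fmt);
  vsnprintf (s, (size_t) len + 1, fmt, args);
  va_end (args);

  return s;
}
```

Because the buffer is sized from the formatted output itself, no input can make the second call write past the end of the allocation.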
The pstrcsplit(3) function is similar to the Perl split. It takes a string and splits it into a vector of strings, in this case on the space character. There are also other functions for splitting on a string or on a regular expression.
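Splitting on a single character can be sketched in plain C as follows. The helpers `split_char` and `free_split` are hypothetical: the result is a NULL-terminated, malloc'd array of malloc'd strings rather than a vector, all deallocation is left to the caller, and the treatment of edge cases such as empty fields is a choice made for this sketch that may differ from pstrcsplit(3).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Free an array returned by split_char. */
void
free_split (char **parts)
{
  size_t i;

  if (parts == NULL)
    return;
  for (i = 0; parts[i] != NULL; ++i)
    free (parts[i]);
  free (parts);
}

/* Hypothetical helper, not part of c2lib: split str at every
 * occurrence of c.  Returns a NULL-terminated array of newly
 * allocated strings, or NULL if out of memory. */
char **
split_char (const char *str, char c)
{
  size_t n = 1, i = 0;
  const char *p, *start;
  char **parts;

  assert (str != NULL);

  /* Count the fields: one more than the number of separators. */
  for (p = str; *p != '\0'; ++p)
    if (*p == c)
      n++;

  parts = calloc (n + 1, sizeof *parts);    /* + 1 for the NULL */
  if (parts == NULL)
    return NULL;

  start = str;
  for (p = str; ; ++p)
    if (*p == c || *p == '\0')
      {
        size_t len = (size_t) (p - start);

        parts[i] = malloc (len + 1);
        if (parts[i] == NULL)
          {
            free_split (parts);
            return NULL;
          }
        memcpy (parts[i], start, len);
        parts[i][len] = '\0';
        i++;
        if (*p == '\0')
          break;
        start = p + 1;
      }

  return parts;
}
```

Every field is a fresh copy, so the caller may modify the pieces without touching the original string.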
The final part of the code, lines 21-36, prints out the components of the split string. The vector_get(3) function is used to pull the strings out of the vector object.

Notice on line 33 that before we remove the leading < and trailing > from around the email address, we first duplicate the string using pstrdup(3). In this case it is not strictly necessary to duplicate the string s because we know that pstrcsplit(3) actually allocates new copies of the strings in the vector which it returns. However, in general this is good practice because otherwise we would be modifying the contents of the original vector v.
Hashes give you all the power of Perl's "%" hashes. In fact the way they work is very similar (but more powerful: unlike Perl's hashes the key does not need to be a string).
In c2lib there are three flavors of hash. However, they all work in essentially the same way, and all have exactly the same functionality. The reason for having the three flavors is just to work around an obscure problem with the ANSI C specification!
The three flavors are:

| Flavor | Description |
|---|---|
| hash | A hash of any non-string type to any non-string type. |
| sash | A hash of char * to char *. |
| shash | A hash of char * to any non-string type. |
As with vectors, "any non-string type" can mean simple integers or chars, pointers, or large complex structures if you wish.
Here is a short program showing you how to use a sash (but note that the same functions are available for all of the other flavors):
```c
#include <stdio.h>
#include <hash.h>
#include <pstring.h>

int main (void)
{
  pool pool = global_pool;
  sash h = new_sash (pool);
  char *fruit;
  const char *color;

  sash_insert (h, "banana", "yellow");
  sash_insert (h, "orange", "orange");
  sash_insert (h, "apple", "red");
  sash_insert (h, "kiwi", "green");
  sash_insert (h, "grapefruit", "yellow");
  sash_insert (h, "pear", "green");
  sash_insert (h, "tomato", "red");
  sash_insert (h, "tangerine", "orange");

  for (;;)
    {
      printf ("Please type in the name of a fruit: ");
      fruit = pgetline (pool, stdin, 0);

      if (sash_get (h, fruit, color))
        printf ("The color of that fruit is %s.\n", color);
      else
        printf ("Sorry, I don't know anything about that fruit.\n");
    }
}
```
When run:
```
Please type in the name of a fruit: orange
The color of that fruit is orange.
Please type in the name of a fruit: apple
The color of that fruit is red.
Please type in the name of a fruit: dragon fruit
Sorry, I don't know anything about that fruit.
```
The sash is allocated on line 8 using the new_sash(3) function.

We populate the sash using the simple sash_insert(3) function (lines 12-19).

The sash_get(3) function retrieves a value (color) from the sash using the key given (fruit). It returns true if a value was found, or false if there was no matching key.
There are many potentially powerful functions available for manipulating hashes, sashes and shashes (below, * stands for either "h", "s" or "sh"):

- *ash_exists(3) tells you if a key exists. It is equivalent to the Perl exists function.
- *ash_erase(3) removes a key. It is equivalent to the Perl delete function.
- *ash_keys(3) returns all of the keys of a hash in a vector. It is equivalent to the Perl keys function.
- *ash_values(3) returns all of the values of a hash in a vector. It is equivalent to the Perl values function.
- *ash_size(3) counts the number of keys.
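This document does not describe how c2lib implements its hashes, but the insert and get operations used above can be illustrated with a generic separate-chaining table. Everything in the sketch below is hypothetical (the `strhash` names, the fixed bucket count, the hash function, and the choice to overwrite the value when a key is inserted twice); it is a string-to-string table in plain C using malloc(3), not the sash type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

/* Hypothetical string-to-string hash, for illustration only.  Keys
 * and values are not copied, so they must outlive the table. */
struct entry
{
  const char *key, *value;
  struct entry *next;          /* next entry in the same bucket */
};

struct strhash
{
  struct entry *buckets[NBUCKETS];
};

/* Map a string to a bucket number. */
static unsigned
hash_string (const char *s)
{
  unsigned h = 5381;

  while (*s != '\0')
    h = h * 33 + (unsigned char) *s++;
  return h % NBUCKETS;
}

/* Look up key.  Returns the value, or NULL if there is no such key. */
const char *
strhash_get (const struct strhash *h, const char *key)
{
  const struct entry *e;

  assert (h != NULL && key != NULL);
  for (e = h->buckets[hash_string (key)]; e != NULL; e = e->next)
    if (strcmp (e->key, key) == 0)
      return e->value;
  return NULL;
}

/* Insert key, or overwrite its value if it is already present.
 * Returns 0 on success, -1 if out of memory. */
int
strhash_insert (struct strhash *h, const char *key, const char *value)
{
  unsigned b;
  struct entry *e;

  assert (h != NULL && key != NULL);
  b = hash_string (key);
  for (e = h->buckets[b]; e != NULL; e = e->next)
    if (strcmp (e->key, key) == 0)
      {
        e->value = value;
        return 0;
      }

  e = malloc (sizeof *e);
  if (e == NULL)
    return -1;
  e->key = key;
  e->value = value;
  e->next = h->buckets[b];
  h->buckets[b] = e;
  return 0;
}

/* Free every entry, leaving an empty table. */
void
strhash_free (struct strhash *h)
{
  size_t b;

  assert (h != NULL);
  for (b = 0; b < NBUCKETS; ++b)
    while (h->buckets[b] != NULL)
      {
        struct entry *e = h->buckets[b];

        h->buckets[b] = e->next;
        free (e);
      }
}
```

A table is started with `struct strhash h = { { NULL } };`. Note how much bookkeeping the explicit `strhash_free` call represents; with pools that step disappears.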
So far we have only touched upon pools, and it may not be clear in the examples above why they don't in fact leak memory. There appears to be no deallocation being done, which is quite counter-intuitive to most C programmers!
Pools are collections of related objects (where an "object" is some sort of memory allocation).
In C you are normally responsible for allocating and deallocating every single object, like so:
```c
p = malloc (size);
/* ... use p ... */
free (p);
```
However, in c2lib we first allocate a pool, then use pmalloc(3) and prealloc(3) to allocate lots of related objects in the pool. At the end of the program, all of the objects can be deleted in one go just by calling delete_pool(3).
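The essence of this scheme fits in a few lines of plain C. The sketch below is a hypothetical miniature pool (the names `mini_pool`, `mini_pmalloc` and `mini_delete_pool` are made up) which simply remembers every pointer it hands out so that they can all be freed together. It is meant to show why the examples above do not leak, and makes no claim about how c2lib's pools are implemented.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature pool, for illustration only. */
struct mini_pool
{
  void **ptrs;       /* every allocation made in this pool */
  size_t used;       /* number of pointers recorded */
  size_t allocated;  /* capacity of the ptrs array */
};

/* Allocate n bytes and record the allocation in the pool.
 * Returns NULL if out of memory. */
void *
mini_pmalloc (struct mini_pool *p, size_t n)
{
  void *ptr;

  assert (p != NULL);

  /* Grow the bookkeeping array if it is full. */
  if (p->used == p->allocated)
    {
      size_t na = p->allocated ? p->allocated * 2 : 16;
      void **np = realloc (p->ptrs, na * sizeof *np);

      if (np == NULL)
        return NULL;
      p->ptrs = np;
      p->allocated = na;
    }

  ptr = malloc (n);
  if (ptr != NULL)
    p->ptrs[p->used++] = ptr;
  return ptr;
}

/* Free every allocation made in the pool in one go.  Returns the
 * number of allocations that were freed. */
size_t
mini_delete_pool (struct mini_pool *p)
{
  size_t i, n;

  assert (p != NULL);
  n = p->used;
  for (i = 0; i < n; ++i)
    free (p->ptrs[i]);
  free (p->ptrs);
  p->ptrs = NULL;
  p->used = p->allocated = 0;
  return n;
}
```

Code that allocates through `mini_pmalloc` never calls free(3) on individual objects; the single call to `mini_delete_pool` at the end releases them all.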
There is one special pool, called global_pool(3). This pool is created for you before main is called, and it is deleted for you after main returns (or if exit(3) is called). You don't ever need to worry about deallocating global_pool(3) (in fact, if you try to, your program might core dump). Thus most short programs like the ones above should just allocate all objects in global_pool(3), and never need to worry about deallocating the objects or the pool.
For larger programs, and programs that are expected to run for a long time like servers, you will need to learn about pools.
Pools are organised in a hierarchy. This means that you often allocate one pool inside another pool. Here is a common pattern:
```c
int main (void)
{
  /* ... use global_pool for allocations here ... */

  for (;;) /* for each request: */
    {
      pool pool = new_subpool (global_pool);

      /* ... process the request using pool ... */

      delete_pool (pool);
    }
}
```
Here pool is created as a subpool of global_pool(3) for the duration of the request. At the end of the request the pool (and therefore all objects inside it) is deallocated.

The advantage of creating pool as a subpool of global_pool(3) is that if the request processing code calls exit(3) in the middle of the request, then global_pool(3) will be deallocated in the normal way and, as a consequence, pool will also be properly deallocated.
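The parent and child relationship can be sketched in plain C as a tree of pools, where deleting a pool first deletes its subpools. All of the names below (`tpool`, `tpool_new`, `tpool_malloc`, `tpool_delete`) are hypothetical, and the data structures are chosen for brevity rather than to mirror c2lib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pool hierarchy, for illustration only. */
struct alloc
{
  void *ptr;
  struct alloc *next;
};

struct tpool
{
  struct tpool *parent;
  struct tpool *children;      /* first subpool */
  struct tpool *next_sibling;  /* next subpool of the same parent */
  struct alloc *allocs;        /* allocations made in this pool */
};

/* Create a pool.  parent may be NULL for a top-level pool. */
struct tpool *
tpool_new (struct tpool *parent)
{
  struct tpool *p = calloc (1, sizeof *p);

  if (p == NULL)
    return NULL;
  p->parent = parent;
  if (parent != NULL)
    {
      p->next_sibling = parent->children;
      parent->children = p;
    }
  return p;
}

/* Allocate n bytes in pool p.  Returns NULL if out of memory. */
void *
tpool_malloc (struct tpool *p, size_t n)
{
  struct alloc *a;

  assert (p != NULL);
  a = malloc (sizeof *a);
  if (a == NULL)
    return NULL;
  a->ptr = malloc (n);
  if (a->ptr == NULL)
    {
      free (a);
      return NULL;
    }
  a->next = p->allocs;
  p->allocs = a;
  return a->ptr;
}

/* Delete p, all of its subpools, and everything allocated in them.
 * Returns the number of allocations that were freed. */
size_t
tpool_delete (struct tpool *p)
{
  size_t freed = 0;
  struct tpool **link;

  assert (p != NULL);

  /* Deleting a subpool unlinks it from p->children, so this loop
   * ends when no subpools are left. */
  while (p->children != NULL)
    freed += tpool_delete (p->children);

  while (p->allocs != NULL)
    {
      struct alloc *a = p->allocs;

      p->allocs = a->next;
      free (a->ptr);
      free (a);
      freed++;
    }

  /* Unlink p from its parent's list of subpools. */
  if (p->parent != NULL)
    for (link = &p->parent->children; *link != NULL;
         link = &(*link)->next_sibling)
      if (*link == p)
        {
          *link = p->next_sibling;
          break;
        }

  free (p);
  return freed;
}
```

With this structure, deleting a per-request pool frees only that request's objects, while deleting the top-level pool also sweeps up any request pools that were never deleted explicitly.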
You can also use new_pool(3) to create a completely new top-level pool. There are some rare circumstances when you will need to do this, but generally you should avoid creating pools which are not subpools. If in doubt, always create subpools of global_pool(3) or of the pool immediately "above" you.
Pools don't just store memory allocations. You can attach other types of objects to pools, or register functions which are run when the pool is deallocated. pool_register_fd(3) attaches a file descriptor to a pool, meaning that the file descriptor is closed when the pool is deleted (note however that there is no way to detach a file descriptor from a pool, so don't call close(2) on the file descriptor once you've attached it to a pool). pool_register_cleanup_fn(3) registers your own clean-up function which is called when the pool is deleted. Although you should normally use pmalloc(3) and/or prealloc(3) to allocate objects directly in pools, you can also allocate them normally using malloc(3) and attach them to the pool using pool_register_malloc(3). The object will be freed automatically when the pool is deallocated.
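All three kinds of registration can be seen as instances of one mechanism: a list of functions to call when the pool goes away. The sketch below shows that mechanism in plain C. The names `cpool`, `cpool_register_cleanup`, `cpool_delete` and `bump_counter` are hypothetical, and the order in which cleanups run here (most recently registered first) is a choice made for the sketch, not a statement about c2lib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cleanup list, for illustration only: a pool-like
 * object that remembers functions to call when it is deleted. */
struct cleanup
{
  void (*fn) (void *);
  void *data;
  struct cleanup *next;
};

struct cpool
{
  struct cleanup *cleanups;
};

/* Arrange for fn (data) to be called when the pool is deleted.
 * Returns 0 on success, -1 if out of memory. */
int
cpool_register_cleanup (struct cpool *p, void (*fn) (void *), void *data)
{
  struct cleanup *c;

  assert (p != NULL && fn != NULL);
  c = malloc (sizeof *c);
  if (c == NULL)
    return -1;
  c->fn = fn;
  c->data = data;
  c->next = p->cleanups;
  p->cleanups = c;
  return 0;
}

/* Run every registered cleanup, most recently registered first,
 * leaving the pool empty. */
void
cpool_delete (struct cpool *p)
{
  assert (p != NULL);
  while (p->cleanups != NULL)
    {
      struct cleanup *c = p->cleanups;

      p->cleanups = c->next;
      c->fn (c->data);
      free (c);
    }
}

/* Example cleanup function: increment the int that data points to. */
void
bump_counter (void *data)
{
  ++*(int *) data;
}
```

In this sketch, registering free(3) itself as the cleanup for a malloc'd pointer gives the same effect as attaching a malloc'd object to a pool, and a cleanup that closes a file descriptor would play the role of the file descriptor registration.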
Pools become very important when writing multi-threaded servers using the pthrlib library. Each thread processes a single request or command. A pool is created for every thread, and is automatically deleted when the thread exits. This assumes of course that threads (and hence requests) are short-lived, which is a reasonable assumption for most HTTP-like services.
(These manual pages are not always up to date. For the latest documentation, always consult the manual pages supplied with the latest c2lib package!)
delete_pool(3)
global_pool(3)
new_pool(3)
new_subpool(3)
pcalloc(3)
pmalloc(3)
pool_get_stats(3)
pool_register_cleanup_fn(3)
pool_register_fd(3)
pool_register_malloc(3)
pool_set_bad_malloc_handler(3)
prealloc(3)
copy_vector(3)
new_vector(3)
vector_allocated(3)
vector_clear(3)
vector_compare(3)
vector_element_size(3)
vector_erase(3)
vector_erase_range(3)
vector_fill(3)
vector_get(3)
vector_get_ptr(3)
vector_grep(3)
vector_grep_pool(3)
vector_insert(3)
vector_insert_array(3)
vector_map(3)
vector_map_pool(3)
vector_pop_back(3)
vector_pop_front(3)
vector_push_back(3)
vector_push_front(3)
vector_reallocate(3)
vector_replace(3)
vector_replace_array(3)
vector_reverse(3)
vector_size(3)
vector_sort(3)
vector_swap(3)
copy_hash(3)
copy_sash(3)
copy_shash(3)
hash_erase(3)
hash_exists(3)
hash_get(3)
hash_get_buckets_allocated(3)
hash_get_buckets_used(3)
hash_get_ptr(3)
hash_insert(3)
hash_keys(3)
hash_keys_in_pool(3)
hash_set_buckets_allocated(3)
hash_size(3)
hash_values(3)
hash_values_in_pool(3)
new_hash(3)
new_sash(3)
new_shash(3)
sash_erase(3)
sash_exists(3)
sash_get(3)
sash_get_buckets_allocated(3)
sash_get_buckets_used(3)
sash_insert(3)
sash_keys(3)
sash_keys_in_pool(3)
sash_set_buckets_allocated(3)
sash_size(3)
sash_values(3)
sash_values_in_pool(3)
shash_erase(3)
shash_exists(3)
shash_get(3)
shash_get_buckets_allocated(3)
shash_get_buckets_used(3)
shash_get_ptr(3)
shash_insert(3)
shash_keys(3)
shash_keys_in_pool(3)
shash_set_buckets_allocated(3)
shash_size(3)
shash_values(3)
shash_values_in_pool(3)
pchomp(3)
pchrs(3)
pconcat(3)
pdtoa(3)
pgetline(3)
pgetlinec(3)
pgetlinex(3)
pitoa(3)
pjoin(3)
pmatch(3)
pmatchx(3)
pmemdup(3)
psort(3)
psprintf(3)
pstrcat(3)
pstrcsplit(3)
pstrdup(3)
pstrlwr(3)
pstrncat(3)
pstrndup(3)
pstrresplit(3)
pstrs(3)
pstrsplit(3)
pstrupr(3)
psubst(3)
psubstr(3)
psubstx(3)
ptrim(3)
ptrimback(3)
ptrimfront(3)
pvdtostr(3)
pvector(3)
pvectora(3)
pvitostr(3)
pvsprintf(3)
pvxtostr(3)
pxtoa(3)
collision_moving_sphere_and_face(3)
face_translate_along_normal(3)
identity_matrix(3)
make_identity_matrix(3)
make_zero_vec(3)
new_identity_matrix(3)
new_matrix(3)
new_subvector(3)
new_vec(3)
new_zero_vec(3)
plane_coefficients(3)
plane_translate_along_normal(3)
point_distance_to_face(3)
point_distance_to_line(3)
point_distance_to_line_segment(3)
point_distance_to_plane(3)
point_face_angle_sum(3)
point_is_inside_plane(3)
point_lies_in_face(3)
vec_angle_between(3)
vec_dot_product(3)
vec_magnitude2d(3)
vec_magnitude(3)
vec_magnitude_in_direction(3)
vec_normalize2d(3)
vec_normalize(3)
zero_vec(3)