(** Bitmatch persistent patterns. *)
(* Copyright (C) 2008 Red Hat Inc., Richard W.M. Jones
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *)

(**
   {b Warning:} This documentation is for ADVANCED USERS ONLY.
   If you are not an advanced user, you are probably looking
   for {{:Bitmatch.html}the Bitmatch documentation}.

   {{:#reference}Jump straight to the reference section for
   documentation on types and functions}.

   Bitmatch allows you to name sets of fields and reuse them
   elsewhere. For example, if you frequently need to parse
   Pascal-style strings in the form length byte + string, then you
   could name the [{ strlen : 8 : int; str : strlen*8 : string }]
   pattern and reuse it everywhere by name.

   These are called {b persistent patterns}.

{v
(* Create a persistent pattern called 'pascal_string' which
 * matches Pascal-style strings (length byte + string).
 *)
let bitmatch pascal_string =
  \{ strlen : 8 : int;
    str : strlen*8 : string }

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
v}

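   As a rough illustration of what the [pascal_string] pattern matches,
   here is a hand-written sketch in plain OCaml, without the syntax
   extension. It assumes byte-aligned input held in an ordinary [string]
   and it accepts trailing data after the string; both are assumptions of
   this sketch, not statements about how bitmatch behaves. The function
   name [pascal_string_of_bytes] is hypothetical.

```ocaml
(* Hand-written counterpart of
   { strlen : 8 : int; str : strlen*8 : string }
   for byte-aligned data in a plain string (an assumption of this sketch). *)
let pascal_string_of_bytes (s : string) : (int * string) option =
  if String.length s < 1 then None
  else
    (* The first byte is the length field. *)
    let strlen = Char.code s.[0] in
    (* The next strlen bytes are the string data, if they are all present. *)
    if String.length s < 1 + strlen then None
    else Some (strlen, String.sub s 1 strlen)

let () =
  match pascal_string_of_bytes "\005hello, world" with
  | Some (strlen, str) ->
    Printf.printf "matches a Pascal string %s, len %d bytes\n" str strlen
  | None -> print_endline "not a Pascal string"
```

   The sketch returns [None] when the input is shorter than the length
   byte promises, which plays the role of a failed pattern match.
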
   There are some important things you should know about
   persistent patterns before you decide to use them:

   'Persistent' refers to the fact that they can be saved into binary
   files. However, these binary files use the OCaml [Marshal] module and
   depend (sometimes) on the version of OCaml used to generate them
   and (sometimes) on the version of bitmatch used. So your build system
   should rebuild these files from source when your code is rebuilt.

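   To make the dependence on [Marshal] concrete, here is a minimal sketch
   of a round trip using only the standard library. The record type
   [saved] and the functions [save] and [load] are hypothetical stand-ins;
   they are not the representation that bitmatch writes to its files.

```ocaml
(* Hypothetical stand-in for a saved pattern: a name plus field widths. *)
type saved = { name : string; widths : int list }

(* Serialize a value to a string with no special flags. *)
let save (v : saved) : string = Marshal.to_string v []

(* The type annotation is the only thing that gives the result a type:
   Marshal does not check the bytes against it. *)
let load (s : string) : saved = Marshal.from_string s 0

let () =
  let original = { name = "pascal_string"; widths = [ 8 ] } in
  let copy = load (save original) in
  assert (copy = original)
```

   Because [load] trusts its annotation, data written by a program with a
   different definition of [saved] would be read back without any error
   being reported at that point. Rebuilding the files together with the
   code avoids that situation.
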
   Persistent patterns are syntactic. They work in the same way
   as cutting and pasting (or [#include]-ing) code. For example,
   if a persistent pattern binds a field named [len], then any
   uses of [len] following in the surrounding pattern could
   be affected.

   Programs which generate and manipulate persistent patterns have to
   link to camlp4. Since camlp4 in OCaml >= 3.10 is rather large, we
   have placed this code into this separate submodule, so that
   programs which just use bitmatch don't need to pull in the whole of
   camlp4. This restriction does not apply to generated code which
   only uses persistent patterns. If the distinction isn't clear,
   use [ocamlobjinfo] to look at the dependencies of your [*.cmo]
   files.

   Persistent patterns can be generated in several ways, but they
   can only be {i used} by the [pa_bitmatch] syntax extension.
   This means they are purely compile-time constructs. You
   cannot use them to make arbitrary patterns and run those
   patterns (not unless your program runs [ocamlc] to make a [*.cmo]
   file then dynamically links to the [*.cmo] file).

   A named pattern is a way to name a pattern and use it later
   in the same source file. To name a pattern, use:

   [let bitmatch name = { fields ... } ;;]

   and you can then use the name later on inside another pattern,
   by prefixing the name with a colon:

   [bitmatch bits with { :name } -> ...]

   You can use named patterns within named patterns.

   Currently the use of named patterns is somewhat limited.
   The restrictions are:

   Named patterns can only be used within the same source file, and
   the names occupy a completely separate namespace from anything
   else in the source file.

   The [let bitmatch] syntax only works at the top level. We may
   add a [let bitmatch ... in] for inner levels later.

   Because you cannot rename the bound identifiers in named
   patterns, you can effectively only use them once in a
   pattern. For example, [{ :name; :name }] is legal, but
   any bindings in the first name would be overridden by
   the second.

   There are no "named constructors" yet, but the machinery
   is in place to do this, and we may add them later.

   {2 Persistent patterns in files}

   More useful than just naming patterns, you can load
   persistent patterns from external files. The patterns
   in these external files can come from a variety of sources:
   for example, in the [cil-tools] subdirectory are some
   {{:http://cil.sf.net/}Cil-based} tools for importing C
   structures from header files. You can also generate
   your own files or write your own tools, as described below.

   To use the persistent patterns from a file, write:

   [open bitmatch "filename.bmpp" ;;]

   A list of zero or more {!named} patterns is read from the file
   and each is bound to a name (as contained in the file),
   and then the patterns can be used with the usual [:name]
   syntax described above.

   The standard extension is [.bmpp]. This is just a convention
   and you can use any extension you want.

   {3 Directory search order}

   If the filename is an absolute or explicit path, then we try to
   load it from that path and stop if it fails. See the [Filename]
   module in the standard OCaml library for the definitions of
   "absolute path" and "explicit path". Otherwise we use the
   following directory search order:

   - Relative to the current directory
   - Relative to the OCaml library directory

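   The search rule above can be sketched with the standard [Filename]
   module. This is only an illustration of the rule as described here,
   not the code used by the syntax extension. The function names
   [candidate_paths] and [find_file] are hypothetical, and the library
   directory is passed in as a parameter because how it is located is
   not specified in this document.

```ocaml
(* List the paths to try, in order, for a given file name.
   [libdir] stands for the OCaml library directory (an assumption:
   the caller supplies it). *)
let candidate_paths ~(libdir : string) (name : string) : string list =
  let absolute = not (Filename.is_relative name) in
  (* An explicit path is a relative path that starts with ./ or ../ *)
  let explicit = Filename.is_relative name && not (Filename.is_implicit name) in
  if absolute || explicit then [ name ]        (* try this path only *)
  else [ name; Filename.concat libdir name ]   (* current dir, then libdir *)

(* Return the first candidate that exists on disk, if any. *)
let find_file ~(libdir : string) (name : string) : string option =
  List.find_opt Sys.file_exists (candidate_paths ~libdir name)

let () =
  List.iter print_endline
    (candidate_paths ~libdir:"/usr/lib/ocaml" "patterns.bmpp")
```

   Note that ["./patterns.bmpp"] yields a single candidate, since an
   explicit path is never searched for in the library directory.
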
   The [bitmatch-objinfo] command can be run on a file in order
   to print out the patterns in the file.

   We haven't implemented persistent constructors yet, although
   the machinery is in place to make this happen. Any constructors
   found in the file are ignored.

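   As an illustration of "constructors are ignored", the following
   sketch filters a list of named items down to the patterns. The types
   mirror the shape of {!named} in the reference section, but [field] is
   a simplified hypothetical stand-in (a plain string), and
   [patterns_only] is a hypothetical name, not a function of this module.

```ocaml
(* Simplified stand-ins: the real fields carry camlp4 syntax trees. *)
type field = string
type alt = Pattern of field list | Constructor of field list
type named = string * alt

(* Keep the named patterns and drop any constructors. *)
let patterns_only (items : named list) : (string * field list) list =
  List.filter_map
    (function
      | name, Pattern fields -> Some (name, fields)
      | _, Constructor _ -> None)
    items

let () =
  let items =
    [ ("pascal_string", Pattern [ "len"; "str" ]);
      ("make_header", Constructor [ "len" ]) ]
  in
  List.iter (fun (name, _) -> print_endline name) (patterns_only items)
```
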
   {2 Creating your own persistent patterns}

   If you want to write a tool to import bitstrings from an
   exotic location or markup language, you will need
   to use the functions found in the {{:#reference}reference section}.

   The following example shows how you would
   programmatically create a persistent pattern which
   matches Pascal-style "length byte + data" strings.
   First, note that there are two fields, so our pattern
   will be a list of length 2 and type {!pattern}.

   You will need to create a camlp4 location object ([Loc.t])
   describing the source file. This location is used
   to generate useful error messages for the user, so
   you may want to set it to the name of, and position in,
   the file that your tool reads for input. By convention,
   locations are bound to the name [_loc]:

{v
   let _loc = Loc.move_line 42 (Loc.mk "input.xml")
v}

   Create a pattern field representing a length field which is 8 bits wide,
   bound to the identifier [len]:

{v
   let len_field = create_pattern_field _loc
   let len_field = set_length_int len_field 8
   let len_field = set_lident_patt len_field "len"
v}

   Create a pattern field representing a string of [len*8] bits.
   Note that the use of the [<:expr< >>] quotation requires
   you to preprocess your source with [camlp4of]
   (see {{:http://brion.inria.fr/gallium/index.php/Reflective_OCaml}this
   page on Reflective OCaml}).

{v
   let str_field = create_pattern_field _loc
   let str_field = set_length str_field <:expr< len*8 >>
   let str_field = set_lident_patt str_field "str"
   let str_field = set_type_string str_field
v}

   Join the two fields together and name the result:

{v
   let named_pattern = "pascal_string", Pattern [len_field; str_field]
v}

   Save the named pattern to a file:

{v
   let chan = open_out "output.bmpp" in
   named_to_channel chan named_pattern;
   close_out chan
v}

   You can now use this pattern in another program like this:

{v
   open bitmatch "output.bmpp" ;;
   let parse_pascal_string bits =
     bitmatch bits with
     | \{ :pascal_string } -> str, len
     | \{ _ } -> invalid_arg "not a Pascal string"
v}

   You can write more than one named pattern to the output file, and
   they will all be loaded at the same time by [open bitmatch ".."]
   (obviously you should give each pattern a different name). To do
   this, just call {!named_to_channel} as many times as needed.

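   The idea of writing several values to one file can be sketched with
   the standard [Marshal] module alone: values written back to back on a
   channel can be read back one at a time until [End_of_file]. The
   [entry] type and the functions [write_all] and [read_all] are
   hypothetical stand-ins. It is an assumption that loading a pattern
   file works along these lines; the real file contents are produced by
   {!named_to_channel}.

```ocaml
(* Hypothetical stand-in for a named pattern: a name plus field widths. *)
type entry = string * int list

(* Write each entry to the file, one marshalled value after another. *)
let write_all (path : string) (entries : entry list) : unit =
  let chan = open_out_bin path in
  List.iter (fun (e : entry) -> Marshal.to_channel chan e []) entries;
  close_out chan

(* Read marshalled values until the end of the file is reached. *)
let read_all (path : string) : entry list =
  let chan = open_in_bin path in
  let rec loop acc =
    match (Marshal.from_channel chan : entry) with
    | e -> loop (e :: acc)
    | exception End_of_file -> List.rev acc
  in
  let entries = loop [] in
  close_in chan;
  entries

let () =
  let path = Filename.temp_file "sketch" ".bmpp" in
  write_all path [ ("pascal_string", [ 8 ]); ("header", [ 8; 16 ]) ];
  List.iter (fun (name, _) -> print_endline name) (read_all path);
  Sys.remove path
```

   The entries come back in the order in which they were written.
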
   {2:reference Reference}
*)

type patt = Camlp4.PreCast.Syntax.Ast.patt
type expr = Camlp4.PreCast.Syntax.Ast.expr
type loc_t = Camlp4.PreCast.Syntax.Ast.Loc.t
(** Just short names for the camlp4 types. *)

type 'a field
(** A field in a persistent pattern or persistent constructor. *)

type pattern = patt field list
(** A persistent pattern (used in the [bitmatch] operator) is just a
    list of pattern fields. *)

type constructor = expr field list
(** A persistent constructor (used in the [BITSTRING] operator) is just a
    list of constructor fields. *)

type named = string * alt
and alt =
  | Pattern of pattern (** Pattern *)
  | Constructor of constructor (** Constructor *)
(** A named pattern or constructor.

    The name is used when binding a pattern from a file, but
    is otherwise ignored. *)

val string_of_pattern : pattern -> string
val string_of_constructor : constructor -> string
val string_of_field : 'a field -> string
(** Convert patterns, constructors or individual fields
    into printable strings for debugging purposes.

    The strings look similar to the syntax used by bitmatch, but
    some things cannot be printed fully, e.g. length expressions. *)

(** {3 Persistence} *)

val named_to_channel : out_channel -> named -> unit
(** Save a pattern/constructor to an output channel. *)

val named_to_string : named -> string
(** Serialize a pattern/constructor to a string. *)

val named_to_buffer : string -> int -> int -> named -> int
(** Serialize a pattern/constructor to part of a string, return the length. *)

val named_from_channel : in_channel -> named
(** Load a pattern/constructor from an input channel.

    Note: This is not type safe. The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

val named_from_string : string -> int -> named
(** Load a pattern/constructor from a string at offset within the string.

    Note: This is not type safe. The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

(** {3 Create pattern fields}

    These fields are used in pattern matches ([bitmatch]). *)

val create_pattern_field : loc_t -> patt field
(** Create a pattern field.

    The pattern is unbound, the type is set to [int], bit length to [32],
    endianness to [BigEndian], signedness to unsigned ([false]),
    and source code location to the [_loc] parameter.

    To create a complete field you need to call the [set_*]
    functions. For example, to create [{ len : 8 : int }]:
{v
    let field = create_pattern_field _loc in
    let field = set_lident_patt field "len" in
    let field = set_length_int field 8 in
    field
v}
*)

val set_lident_patt : patt field -> string -> patt field
(** Sets the pattern to one that binds the identifier named by the string.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_patt field "len"]. *)

val set_int_patt : patt field -> int -> patt field
(** Sets the pattern field to the pattern which matches an integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_patt field 2]. *)

val set_string_patt : patt field -> string -> patt field
(** Sets the pattern field to the pattern which matches a string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_patt field "MAGIC"]. *)

val set_unbound_patt : patt field -> patt field
(** Sets the pattern field to the unbound pattern (usually written [_]).

    The effect is that the field [{ _ : 8 : int }] could
    be created by calling [set_unbound_patt field]. *)

val set_patt : patt field -> patt -> patt field
(** Sets the pattern field to an arbitrary OCaml pattern match. *)

val set_length_int : 'a field -> int -> 'a field
(** Sets the length in bits of a field to a constant integer.

    The effect is that the field [{ len : 8 : string }] could
    be created by calling [set_length_int field 8]. *)

val set_length : 'a field -> expr -> 'a field
(** Sets the length in bits of a field to an OCaml expression.

    The effect is that the field [{ len : 2*i : string }] could
    be created by calling [set_length field <:expr< 2*i >>]. *)

val set_endian : 'a field -> Bitmatch.endian -> 'a field
(** Sets the endianness of a field to the constant endianness.

    The effect is that the field [{ _ : 16 : bigendian }] could
    be created by calling [set_endian field Bitmatch.BigEndian]. *)

val set_endian_expr : 'a field -> expr -> 'a field
(** Sets the endianness of a field to an endianness expression.

    The effect is that the field [{ _ : 16 : endian(e) }] could
    be created by calling [set_endian_expr field e]. *)

val set_signed : 'a field -> bool -> 'a field
(** Sets the signedness of a field to a constant signedness.

    The effect is that the field [{ _ : 16 : signed }] could
    be created by calling [set_signed field true]. *)

val set_type_int : 'a field -> 'a field
(** Sets the type of a field to [int].

    The effect is that the field [{ _ : 16 : int }] could
    be created by calling [set_type_int field]. *)

val set_type_string : 'a field -> 'a field
(** Sets the type of a field to [string].

    The effect is that the field [{ str : 16 : string }] could
    be created by calling [set_type_string field]. *)

val set_type_bitstring : 'a field -> 'a field
(** Sets the type of a field to [bitstring].

    The effect is that the field [{ _ : 768 : bitstring }] could
    be created by calling [set_type_bitstring field]. *)

val set_location : 'a field -> loc_t -> 'a field
(** Sets the source code location of a field. This is used when
    pa_bitmatch displays error messages. *)

(** {3 Create constructor fields}

    These fields are used in constructors ([BITSTRING]). *)

val create_constructor_field : loc_t -> expr field
(** Create a constructor field.

    The defaults are the same as for {!create_pattern_field}
    except that the expression is initialized to [0]. *)

val set_lident_expr : expr field -> string -> expr field
(** Sets the expression in a constructor field to an expression
    which uses the identifier.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_expr field "len"]. *)

val set_int_expr : expr field -> int -> expr field
(** Sets the expression to the value of the integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_expr field 2]. *)

val set_string_expr : expr field -> string -> expr field
(** Sets the expression to the value of the string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_expr field "MAGIC"]. *)

val set_expr : expr field -> expr -> expr field
(** Sets the expression field to an arbitrary OCaml expression. *)

val get_patt : patt field -> patt
(** Get the pattern from a pattern field. *)

val get_expr : expr field -> expr
(** Get the expression from an expression field. *)

val get_length : 'a field -> expr
(** Get the length in bits from a field. Note that what is returned
    is an OCaml expression, since lengths can be non-constant. *)

type endian_expr =
  | ConstantEndian of Bitmatch.endian
  | EndianExpr of expr

val get_endian : 'a field -> endian_expr
(** Get the endianness of a field. This is an {!endian_expr} which
    could be a constant or an OCaml expression. *)

val get_signed : 'a field -> bool
(** Get the signedness of a field. *)

type field_type = Int | String | Bitstring

val get_type : 'a field -> field_type
(** Get the type of a field, [Int], [String] or [Bitstring]. *)

val get_location : 'a field -> loc_t
(** Get the source code location of a field. *)