(** Bitmatch persistent patterns. *)
(* Copyright (C) 2008 Red Hat Inc., Richard W.M. Jones
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
 *)
   {b Warning:} This documentation is for ADVANCED USERS ONLY.
   If you are not an advanced user, you are probably looking
   for {{:Bitmatch.html}the Bitmatch documentation}.

   {{:#reference}Jump straight to the reference section for
   documentation on types and functions}.

   Bitmatch allows you to name sets of fields and reuse them
   elsewhere.  For example, if you frequently need to parse
   Pascal-style strings in the form length byte + string, then you
   could name the [{ strlen : 8 : int; str : strlen*8 : string }]
   pattern and reuse it everywhere by name.

   These are called {b persistent patterns}.
   {[
   (* Create a persistent pattern called 'pascal_string' which
    * matches Pascal-style strings (length byte + string).
    *)
   let bitmatch pascal_string =
     \{ strlen : 8 : int;
       str : strlen*8 : string }

   let is_pascal_string bits =
     bitmatch bits with
     | \{ :pascal_string } ->
       printf "matches a Pascal string %s, len %d bytes\n"
         str strlen
   ]}
   There are some important things you should know about
   persistent patterns before you decide to use them:

   'Persistent' refers to the fact that they can be saved into binary
   files.  However, these binary files use the OCaml [Marshal] module and
   depend (sometimes) on the version of OCaml used to generate them
   and (sometimes) the version of bitmatch used.  So your build system
   should rebuild these files from source when your code is rebuilt.
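
   The version dependence comes from [Marshal], which stores values
   without any type information.  The following is a minimal sketch of
   that behaviour using only the standard library.  The [demo] type and
   the [save] and [load] functions are made up for illustration and are
   not part of bitmatch:

   {[
   (* Stand-in for a named pattern: a name plus a list of field widths.
    * This is NOT the real Bitmatch_persistent.named type. *)
   type demo = string * int list

   let save filename (v : demo) =
     let chan = open_out_bin filename in
     Marshal.to_channel chan v [];
     close_out chan

   (* The type annotation is the only thing telling OCaml what was
    * stored; Marshal does not check it against the file contents. *)
   let load filename : demo =
     let chan = open_in_bin filename in
     let v = (Marshal.from_channel chan : demo) in
     close_in chan;
     v

   let () =
     let file = Filename.temp_file "demo" ".bmpp" in
     save file ("pascal_string", [8; 64]);
     let name, widths = load file in
     assert (name = "pascal_string" && widths = [8; 64]);
     Sys.remove file
   ]}

   Because the annotation on [load] is never checked, reading a file
   that was written for a different type, or in an incompatible
   marshalling format, can fail or misbehave at run time.  Regenerating
   the files as part of the build avoids relying on that.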

   Persistent patterns are syntactic.  They work in the same way
   as cutting and pasting (or [#include]-ing) code.  For example
   if a persistent pattern binds a field named [len], then any
   uses of [len] following in the surrounding pattern could

   Programs which generate and manipulate persistent patterns have to
   link to camlp4.  Since camlp4 in OCaml >= 3.10 is rather large, we
   have placed this code into this separate submodule, so that
   programs which just use bitmatch don't need to pull in the whole of
   camlp4.  This restriction does not apply to generated code which
   only uses persistent patterns.  If the distinction isn't clear,
   use [ocamlobjinfo] to look at the dependencies of your [*.cmo]

   Persistent patterns can be generated in several ways, but they
   can only be {i used} by the [pa_bitmatch] syntax extension.
   This means they are purely compile-time constructs.  You
   cannot use them to make arbitrary patterns and run those
   patterns (not unless your program runs [ocamlc] to make a [*.cmo]
   file then dynamically links to the [*.cmo] file).

   A named pattern is a way to name a pattern and use it later
   in the same source file.  To name a pattern, use:

   [let bitmatch name = { fields ... } ;;]

   and you can then use the name later on inside another pattern,
   by prefixing the name with a colon.

   [bitmatch bits with { :name } -> ...]

   You can use named patterns within named patterns.

   Currently the use of named patterns is somewhat limited.
   The restrictions are:

   Named patterns can only be used within the same source file, and
   the names occupy a completely separate namespace from anything
   else in the source file.

   The [let bitmatch] syntax only works at the top level.  We may
   add a [let bitmatch ... in] for inner levels later.

   Because you cannot rename the bound identifiers in named
   patterns, you can effectively only use them once in a
   pattern.  For example, [{ :name; :name }] is legal, but
   any bindings in the first name would be overridden by

   There are no "named constructors" yet, but the machinery
   is in place to do this, and we may add them later.

   {2 Persistent patterns in files}

   More useful than just naming patterns, you can load
   persistent patterns from external files.  The patterns
   in these external files can come from a variety of sources:
   for example, in the [cil-tools] subdirectory are some
   {{:http://cil.sf.net/}Cil-based} tools for importing C
   structures from header files.  You can also generate
   your own files or write your own tools, as described below.

   To use the persistent pattern(s) from a file, do:

   [open bitmatch "filename.bmpp" ;;]

   A list of zero or more {!named} patterns is read from the file
   and each is bound to a name (as contained in the file),
   and then the patterns can be used with the usual [:name]
   syntax described above.

   The standard extension is [.bmpp].  This is just a convention
   and you can use any extension you want.

   {3 Directory search order}

   If the filename is an absolute or explicit path, then we try to
   load it from that path and stop if it fails.  See the [Filename]
   module in the standard OCaml library for the definitions of
   "absolute path" and "explicit path".  Otherwise we use the
   following directory search order:

   - Relative to the current directory
   - Relative to the OCaml library directory
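
   The search order above can be sketched with the standard [Filename]
   and [Sys] modules.  This is only an illustration of the rule, not
   the code used by [pa_bitmatch]: the function name [find_pattern_file]
   and the [libdir] parameter (standing in for the OCaml library
   directory) are assumptions made for this example.

   {[
   (* Return the first existing candidate path, or None.
    * [libdir] is a hypothetical parameter; how the real syntax
    * extension locates the library directory is not shown here. *)
   let find_pattern_file ~libdir filename =
     let candidates =
       if Filename.is_implicit filename then
         (* Plain relative name: current directory, then libdir. *)
         [ Filename.concat Filename.current_dir_name filename;
           Filename.concat libdir filename ]
       else
         (* Absolute or explicit path: try only that path. *)
         [ filename ]
     in
     try Some (List.find Sys.file_exists candidates)
     with Not_found -> None
   ]}

   [Filename.is_implicit] is true only for relative names that do not
   start with an explicit reference to the current or parent directory,
   so it separates the two cases described above.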

   The [bitmatch-objinfo] command can be run on a file in order
   to print out the patterns in the file.

   We haven't implemented persistent constructors yet, although
   the machinery is in place to make this happen.  Any constructors
   found in the file are ignored.

   {2 Creating your own persistent patterns}

   If you want to write a tool to import bitstrings from an
   exotic location or markup language, you will need
   to use the functions found in the {{:#reference}reference section}.

   The following example shows how you would
   programmatically create a persistent pattern which
   matches Pascal-style "length byte + data" strings.
   First, note that there are two fields, so our pattern
   will be a list of length 2 and type {!pattern}.

   You will need to create a camlp4 location object ([Loc.t])
   describing the source file.  This location is used
   to generate useful error messages for the user, so
   you may want to set it to be the name and location in
   the file that your tool reads for input.  By convention,
   locations are bound to the name [_loc]:

   {[
   let _loc = Loc.move_line 42 (Loc.mk "input.xml")
   ]}

   Create a pattern field representing a length field which is 8 bits wide,
   bound to the identifier [len]:

   {[
   let len_field = create_pattern_field _loc
   let len_field = set_length_int len_field 8
   let len_field = set_lident_patt len_field "len"
   ]}

   Create a pattern field representing a string of [len*8] bits.
   Note that the use of [<:expr< >>] quotation requires
   you to preprocess your source with [camlp4of]
   (see {{:http://brion.inria.fr/gallium/index.php/Reflective_OCaml}this
   page on Reflective OCaml}).

   {[
   let str_field = create_pattern_field _loc
   let str_field = set_length str_field <:expr< len*8 >>
   let str_field = set_lident_patt str_field "str"
   let str_field = set_type_string str_field
   ]}

   Join the two fields together and name the result:

   {[
   let named_pattern = "pascal_string", Pattern [len_field; str_field]
   ]}

   Save it to a file:

   {[
   let chan = open_out "output.bmpp" in
   named_to_channel chan named_pattern;
   close_out chan
   ]}

   You can now use this pattern in another program like this:

   {[
   open bitmatch "output.bmpp" ;;
   let parse_pascal_string bits =
     bitmatch bits with
     | \{ :pascal_string } -> str, len
     | \{ _ } -> invalid_arg "not a Pascal string"
   ]}

   {2:reference Reference}

type patt = Camlp4.PreCast.Syntax.Ast.patt
type expr = Camlp4.PreCast.Syntax.Ast.expr
type loc_t = Camlp4.PreCast.Syntax.Ast.Loc.t
(** Just short names for the camlp4 types. *)

type 'a field
(** A field in a persistent pattern or persistent constructor. *)

type pattern = patt field list
(** A persistent pattern (used in [bitmatch] operator), is just a
    list of pattern fields. *)

type constructor = expr field list
(** A persistent constructor (used in [BITSTRING] operator), is just a
    list of constructor fields. *)

type named = string * alt
and alt =
  | Pattern of pattern                  (** Pattern *)
  | Constructor of constructor          (** Constructor *)
(** A named pattern or constructor.

    The name is used when binding a pattern from a file, but
    is otherwise ignored. *)

val string_of_pattern : pattern -> string
val string_of_constructor : constructor -> string
val string_of_field : 'a field -> string
(** Convert patterns, constructors or individual fields
    into printable strings for debugging purposes.

    The strings look similar to the syntax used by bitmatch, but
    some things cannot be printed fully, e.g. length expressions. *)

(** {3 Persistence} *)

val named_to_channel : out_channel -> named -> unit
(** Save a pattern/constructor to an output channel. *)

val named_to_string : named -> string
(** Serialize a pattern/constructor to a string. *)

val named_to_buffer : string -> int -> int -> named -> int
(** Serialize a pattern/constructor to part of a string, return the length. *)

val named_from_channel : in_channel -> named
(** Load a pattern/constructor from an input channel.

    Note: This is not type safe.  The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

val named_from_string : string -> int -> named
(** Load a pattern/constructor from a string at offset within the string.

    Note: This is not type safe.  The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)
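
(** The offset and length arguments of {!named_to_buffer} and
    {!named_from_string} have the same shape as those of
    [Marshal.to_buffer] and [Marshal.from_string].  Assuming they
    behave the same way (the signatures suggest this, but it is an
    assumption), the following standard-library sketch shows how a
    value is written at an offset and read back.  The [demo] type is
    a made-up stand-in, not the real {!named} type.

    {[
    (* Stand-in for a named pattern; NOT Bitmatch_persistent.named. *)
    type demo = string * int list

    let () =
      let buf = Bytes.create 256 in
      let ofs = 16 in
      (* Write into buf starting at ofs, using at most the remaining
       * bytes.  The result is the number of bytes written. *)
      let len =
        Marshal.to_buffer buf ofs (Bytes.length buf - ofs)
          (("pascal_string", [8; 64]) : demo) [] in
      (* The marshalled header records the same total size. *)
      assert (len = Marshal.total_size buf ofs);
      (* Read back from the same offset; the annotation is unchecked. *)
      let name, widths = (Marshal.from_bytes buf ofs : demo) in
      assert (name = "pascal_string" && widths = [8; 64])
    ]}

    Current OCaml versions use [bytes] for the mutable buffer, whereas
    the signatures above use [string].
*)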

(** {3 Create pattern fields}

    These fields are used in pattern matches ([bitmatch]). *)

val create_pattern_field : loc_t -> patt field
(** Create a pattern field.

    The pattern is unbound, the type is set to [int], bit length to [32],
    endianness to [BigEndian], signedness to unsigned ([false]),
    and source code location to the [_loc] parameter.

    To create a complete field you need to call the [set_*]
    functions.  For example, to create [{ len : 8 : int }]:

    {[
    let field = create_pattern_field _loc in
    let field = set_lident_patt field "len" in
    let field = set_length_int field 8 in
    field
    ]}
*)

val set_lident_patt : patt field -> string -> patt field
(** Sets the pattern field to a pattern which binds the identifier
    named by the string.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_patt field "len"]. *)

val set_int_patt : patt field -> int -> patt field
(** Sets the pattern field to the pattern which matches an integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_patt field 2]. *)

val set_string_patt : patt field -> string -> patt field
(** Sets the pattern field to the pattern which matches a string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_patt field "MAGIC"]. *)

val set_unbound_patt : patt field -> patt field
(** Sets the pattern field to the unbound pattern (usually written [_]).

    The effect is that the field [{ _ : 8 : int }] could
    be created by calling [set_unbound_patt field]. *)

val set_patt : patt field -> patt -> patt field
(** Sets the pattern field to an arbitrary OCaml pattern match. *)

val set_length_int : 'a field -> int -> 'a field
(** Sets the length in bits of a field to a constant integer.

    The effect is that the field [{ len : 8 : string }] could
    be created by calling [set_length_int field 8]. *)

val set_length : 'a field -> expr -> 'a field
(** Sets the length in bits of a field to an OCaml expression.

    The effect is that the field [{ len : 2*i : string }] could
    be created by calling [set_length field <:expr< 2*i >>]. *)

val set_endian : 'a field -> Bitmatch.endian -> 'a field
(** Sets the endianness of a field to the constant endianness.

    The effect is that the field [{ _ : 16 : bigendian }] could
    be created by calling [set_endian field Bitmatch.BigEndian]. *)

val set_endian_expr : 'a field -> expr -> 'a field
(** Sets the endianness of a field to an endianness expression.

    The effect is that the field [{ _ : 16 : endian(e) }] could
    be created by calling [set_endian_expr field e]. *)

val set_signed : 'a field -> bool -> 'a field
(** Sets the signedness of a field to a constant signedness.

    The effect is that the field [{ _ : 16 : signed }] could
    be created by calling [set_signed field true]. *)

val set_type_int : 'a field -> 'a field
(** Sets the type of a field to [int].

    The effect is that the field [{ _ : 16 : int }] could
    be created by calling [set_type_int field]. *)

val set_type_string : 'a field -> 'a field
(** Sets the type of a field to [string].

    The effect is that the field [{ str : 16 : string }] could
    be created by calling [set_type_string field]. *)

val set_type_bitstring : 'a field -> 'a field
(** Sets the type of a field to [bitstring].

    The effect is that the field [{ _ : 768 : bitstring }] could
    be created by calling [set_type_bitstring field]. *)

val set_location : 'a field -> loc_t -> 'a field
(** Sets the source code location of a field.  This is used when
    pa_bitmatch displays error messages. *)

(** {3 Create constructor fields}

    These fields are used in constructors ([BITSTRING]). *)

val create_constructor_field : loc_t -> expr field
(** Create a constructor field.

    The defaults are the same as for {!create_pattern_field}
    except that the expression is initialized to [0].
*)

val set_lident_expr : expr field -> string -> expr field
(** Sets the expression in a constructor field to an expression
    which uses the identifier.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_expr field "len"]. *)

val set_int_expr : expr field -> int -> expr field
(** Sets the expression to the value of the integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_expr field 2]. *)

val set_string_expr : expr field -> string -> expr field
(** Sets the expression to the value of the string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_expr field "MAGIC"]. *)

val set_expr : expr field -> expr -> expr field
(** Sets the expression field to an arbitrary OCaml expression. *)

val get_patt : patt field -> patt
(** Get the pattern from a pattern field. *)

val get_expr : expr field -> expr
(** Get the expression from an expression field. *)

val get_length : 'a field -> expr
(** Get the length in bits from a field.  Note that what is returned
    is an OCaml expression, since lengths can be non-constant. *)

type endian_expr =
  | ConstantEndian of Bitmatch.endian

val get_endian : 'a field -> endian_expr
(** Get the endianness of a field.  This is an {!endian_expr} which
    could be a constant or an OCaml expression. *)

val get_signed : 'a field -> bool
(** Get the signedness of a field. *)

type field_type = Int | String | Bitstring

val get_type : 'a field -> field_type
(** Get the type of a field, [Int], [String] or [Bitstring]. *)

val get_location : 'a field -> loc_t
(** Get the source code location of a field. *)