(** Bitmatch persistent patterns. *)
(* Copyright (C) 2008 Red Hat Inc., Richard W.M. Jones
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *)

(**
   {b Warning:} This documentation is for ADVANCED USERS ONLY.
   If you are not an advanced user, you are probably looking
   for {{:Bitmatch.html}the Bitmatch documentation}.

   {{:#reference}Jump straight to the reference section for
   documentation on types and functions}.

   Bitmatch allows you to name sets of fields and reuse them
   elsewhere.  For example, if you frequently need to parse
   Pascal-style strings in the form length byte + string, then you
   could name the [{ strlen : 8 : int; str : strlen*8 : string }]
   pattern and reuse it everywhere by name.

   These are called {b persistent patterns}.

{v
(* Create a persistent pattern called 'pascal_string' which
 * matches Pascal-style strings (length byte + string).
 *)
let bitmatch pascal_string =
  \{ strlen : 8 : int;
    str : strlen*8 : string }

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
v}

{v
(* Load a persistent pattern from a file. *)
open bitmatch "pascal.bmpp"

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
v}

   There are some important things you should know about
   persistent patterns before you decide to use them:

   'Persistent' refers to the fact that they can be saved into binary
   files.  However, these binary files use the OCaml [Marshal] module
   and depend (sometimes) on the version of OCaml used to generate them
   and (sometimes) on the version of bitmatch used.  So your build system
   should rebuild these files from source when your code is rebuilt.
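
   As a minimal sketch of why this matters, the following uses only the
   standard [Marshal] module.  The [demo] type is a hypothetical stand-in,
   not the real {!named} type.  [Marshal] does not check the loaded data
   against the type the caller asks for, so data written by a different
   program version cannot be reliably detected when it is loaded:

   {[
   (* Hypothetical stand-in for a saved pattern: a name plus field widths. *)
   type demo = string * int list

   let () =
     let saved = Marshal.to_string (("pascal_string", [8]) : demo) [] in
     (* The result type comes from the annotation; it is not checked
        against the bytes at run time. *)
     let (name, widths) : demo = Marshal.from_string saved 0 in
     assert (name = "pascal_string" && widths = [8])
   ]}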

   Persistent patterns are syntactic.  They work in the same way
   as cutting and pasting (or [#include]-ing) code.  For example,
   if a persistent pattern binds a field named [len], then any
   uses of [len] following in the surrounding pattern could
   be affected.

   Programs which generate and manipulate persistent patterns have to
   link to camlp4.  Since camlp4 in OCaml >= 3.10 is rather large, we
   have placed this code into this separate submodule, so that
   programs which just use bitmatch don't need to pull in the whole of
   camlp4.  This restriction does not apply to code which only uses
   persistent patterns but does not generate them.  If the distinction
   isn't clear, use [ocamlobjinfo] to look at the dependencies of your
   [*.cmo] files.

   Persistent patterns can be generated in several ways, but they
   can only be {i used} by the [pa_bitmatch] syntax extension.
   This means they are purely compile-time constructs.  You
   cannot use them to make arbitrary patterns and run those
   patterns (not unless your program runs [ocamlc] to make a [*.cmo]
   file then dynamically links to the [*.cmo] file).

   A named pattern is a way to name a pattern and use it later
   in the same source file.  To name a pattern, use:

   [let bitmatch name = { fields ... } ;;]

   and you can then use the name later on inside another pattern,
   by prefixing the name with a colon:

   [bitmatch bits with { :name } -> ...]

   You can nest named patterns within named patterns to any depth.
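
   As a hedged sketch of nesting, the following defines one named
   pattern in terms of another.  The names [header], [packet],
   [version], [payload_len] and [describe] are hypothetical, and the
   code has to be preprocessed with the [pa_bitmatch] syntax extension:

{v
(* A named pattern with a single 8-bit field. *)
let bitmatch header =
  \{ version : 8 : int }

(* A second named pattern which reuses the first one by name. *)
let bitmatch packet =
  \{ :header; payload_len : 16 : int }

(* Both [version] and [payload_len] are bound in the match arm. *)
let describe bits =
  bitmatch bits with
  | \{ :packet } -> Printf.sprintf "v%d, %d bytes" version payload_len
  | \{ _ } -> "unknown"
v}

   Because the patterns are expanded syntactically, the identifiers
   bound inside [header] are visible wherever [packet] is used.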

   Currently the use of named patterns is somewhat limited.
   The restrictions are:

   Named patterns can only be used within the same source file, and
   the names occupy a completely separate namespace from anything
   else in the source file.

   The [let bitmatch] syntax only works at the top level.  We may
   add a [let bitmatch ... in] for inner levels later.

   Because you cannot rename the bound identifiers in named
   patterns, you can effectively only use them once in a
   pattern.  For example, [{ :name; :name }] is legal, but
   any bindings in the first name would be overridden by
   the second name.

   There are no "named constructors" yet, but the machinery
   is in place to do this, and we may add them later.

   {2 Persistent patterns in files}

   More useful than just naming patterns, you can load
   persistent patterns from external files.  The patterns
   in these external files can come from a variety of sources:
   for example, in the [cil-tools] subdirectory are some
   {{:http://cil.sf.net/}Cil-based} tools for importing C
   structures from header files.  You can also generate
   your own files or write your own tools, as described below.

   To use the persistent pattern(s) from a file do:

   [open bitmatch "filename.bmpp" ;;]

   A list of zero or more {!named} patterns is read from the file
   and each is bound to a name (as contained in the file),
   and then the patterns can be used with the usual [:name]
   syntax described above.

   The standard extension is [.bmpp].  This is just a convention
   and you can use any extension you want.

   {3 Directory search order}

   If the filename is an absolute or explicit path, then we try to
   load it from that path and stop if it fails.  See the [Filename]
   module in the standard OCaml library for the definitions of
   "absolute path" and "explicit path".  Otherwise we use the
   following directory search order:

   - Relative to the current directory
   - Relative to the OCaml library directory
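
   The distinction between the two cases can be sketched with the
   standard [Filename] module.  This is only an illustration: it
   assumes that every name which is not "implicit" in the [Filename]
   sense counts as absolute or explicit, [uses_search_path] is a
   hypothetical helper, and the example paths assume a Unix-like system:

   {[
   (* Hypothetical helper: true if the name would go through the
      directory search order, false if it is loaded from the path as given. *)
   let uses_search_path filename =
     Filename.is_implicit filename

   let () =
     assert (uses_search_path "pascal.bmpp");
     assert (not (uses_search_path "./pascal.bmpp"));
     assert (not (uses_search_path "/usr/lib/ocaml/pascal.bmpp"))
   ]}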

   The [bitmatch-objinfo] command can be run on a file in order
   to print out the patterns in the file.
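
   A tool of this kind could be sketched with the functions from the
   {{:#reference}reference section}.  This is not the actual
   implementation of [bitmatch-objinfo].  It assumes that this module
   is called [Bitmatch_persistent] and that {!named_from_channel}
   raises [End_of_file] at the end of the file, as the underlying
   [Marshal] functions do; [dump] is a hypothetical name:

   {[
   open Bitmatch_persistent

   (* Print every named pattern or constructor stored in [filename]. *)
   let dump filename =
     let chan = open_in_bin filename in
     (try
        while true do
          match named_from_channel chan with
          | name, Pattern patt ->
              Printf.printf "pattern %s:\n%s\n" name (string_of_pattern patt)
          | name, Constructor cons ->
              Printf.printf "constructor %s:\n%s\n" name
                (string_of_constructor cons)
        done
      with End_of_file -> ());
     close_in chan
   ]}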

   We haven't implemented persistent constructors yet, although
   the machinery is in place to make this happen.  Any constructors
   found in the file are ignored.

   {2 Creating your own persistent patterns}

   If you want to write a tool to import bitstrings from an
   exotic location or markup language, you will need
   to use the functions found in the {{:#reference}reference section}.

   The following example shows how you would
   programmatically create a persistent pattern which
   matches Pascal-style "length byte + data" strings.
   Firstly note that there are two fields, so our pattern
   will be a list of length 2 and type {!pattern}.

   You will need to create a camlp4 location object ([Loc.t])
   describing the source file.  This location is used
   to generate useful error messages for the user, so
   you may want to set it to be the name and location in
   the file that your tool reads for input.  By convention,
   locations are bound to the name [_loc]:

   {[
   let _loc = Loc.move_line 42 (Loc.mk "input.xml")
   ]}

   Create a pattern field representing a length field which is 8 bits wide,
   bound to the identifier [len]:

   {[
   let len_field = create_pattern_field _loc
   let len_field = set_length_int len_field 8
   let len_field = set_lident_patt len_field "len"
   ]}

   Create a pattern field representing a string of [len*8] bits.
   Note that the use of [<:expr< >>] quotation requires
   you to preprocess your source with [camlp4of]
   (see {{:http://brion.inria.fr/gallium/index.php/Reflective_OCaml}this
   page on Reflective OCaml}).

   {[
   let str_field = create_pattern_field _loc
   let str_field = set_length str_field <:expr< len*8 >>
   let str_field = set_lident_patt str_field "str"
   let str_field = set_type_string str_field
   ]}

   Join the two fields together and name it:

   {[
   let pattern = [len_field; str_field]
   let named_pattern = "pascal_string", Pattern pattern
   ]}

   Save it to a file:

   {[
   let chan = open_out "output.bmpp" in
   named_to_channel chan named_pattern;
   close_out chan
   ]}

   You can now use this pattern in another program like this:

{v
open bitmatch "output.bmpp" ;;
let parse_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } -> str, len
  | \{ _ } -> invalid_arg "not a Pascal string"
v}

   You can write more than one named pattern to the output file, and
   they will all be loaded at the same time by [open bitmatch ".."]
   (obviously you should give each pattern a different name).  To do
   this, just call {!named_to_channel} as many times as needed.
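
   Writing several patterns could look like the following sketch.
   It assumes that this module is called [Bitmatch_persistent];
   [magic_pattern] and [save_all] are hypothetical names, and the
   file is opened in binary mode because the data is marshalled:

   {[
   open Bitmatch_persistent

   (* Hypothetical input location, used only for error messages. *)
   let _loc = Camlp4.PreCast.Loc.mk "input.xml"

   (* A one-field pattern matching the 5-byte string "MAGIC". *)
   let magic_pattern : named =
     let f = create_pattern_field _loc in
     let f = set_string_patt f "MAGIC" in
     let f = set_length_int f (8 * 5) in
     let f = set_type_string f in
     "magic", Pattern [f]

   (* Write every pattern in [patterns] to [filename], one after another. *)
   let save_all filename (patterns : named list) =
     let chan = open_out_bin filename in
     List.iter (named_to_channel chan) patterns;
     close_out chan
   ]}

   Calling [save_all "output.bmpp" [named_pattern; magic_pattern]]
   would then store both the Pascal string pattern built above and
   the magic pattern in one file.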

   {2:reference Reference}
*)

type patt = Camlp4.PreCast.Syntax.Ast.patt
type expr = Camlp4.PreCast.Syntax.Ast.expr
type loc_t = Camlp4.PreCast.Syntax.Ast.Loc.t
(** Just short names for the camlp4 types. *)

type 'a field
(** A field in a persistent pattern or persistent constructor. *)

type pattern = patt field list
(** A persistent pattern (used in [bitmatch] operator), is just a
    list of pattern fields. *)

type constructor = expr field list
(** A persistent constructor (used in [BITSTRING] operator), is just a
    list of constructor fields. *)

type named = string * alt
and alt =
  | Pattern of pattern                  (** Pattern *)
  | Constructor of constructor          (** Constructor *)
(** A named pattern or constructor.

    The name is used when binding a pattern from a file, but
    is otherwise ignored. *)

val string_of_pattern : pattern -> string
val string_of_constructor : constructor -> string
val string_of_pattern_field : patt field -> string
val string_of_constructor_field : expr field -> string
(** Convert patterns, constructors or individual fields
    into printable strings for debugging purposes.

    The strings look similar to the syntax used by bitmatch, but
    some things cannot be printed fully, e.g. length expressions. *)

(** {3 Persistence} *)

val named_to_channel : out_channel -> named -> unit
(** Save a pattern/constructor to an output channel. *)

val named_to_string : named -> string
(** Serialize a pattern/constructor to a string. *)

val named_to_buffer : string -> int -> int -> named -> int
(** Serialize a pattern/constructor to part of a string, return the length. *)

val named_from_channel : in_channel -> named
(** Load a pattern/constructor from an input channel.

    Note: This is not type safe.  The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

val named_from_string : string -> int -> named
(** Load a pattern/constructor from a string at offset within the string.

    Note: This is not type safe.  The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

(** {3 Create pattern fields}

    These fields are used in pattern matches ([bitmatch]). *)

val create_pattern_field : loc_t -> patt field
(** Create a pattern field.

    The pattern is unbound, the type is set to [int], bit length to [32],
    endianness to [BigEndian], signedness to unsigned ([false]),
    source code location to the [_loc] parameter, and no offset expression.

    To create a complete field you need to call the [set_*]
    functions.  For example, to create [{ len : 8 : int }]
    you would do:

    {[
    let field = create_pattern_field _loc in
    let field = set_lident_patt field "len" in
    let field = set_length_int field 8 in
    field
    ]}
*)

val set_lident_patt : patt field -> string -> patt field
(** Sets the pattern to the pattern binding an identifier
    given in the string.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_patt field "len"]. *)

val set_int_patt : patt field -> int -> patt field
(** Sets the pattern field to the pattern which matches an integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_patt field 2]. *)

val set_string_patt : patt field -> string -> patt field
(** Sets the pattern field to the pattern which matches a string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_patt field "MAGIC"]. *)

val set_unbound_patt : patt field -> patt field
(** Sets the pattern field to the unbound pattern (usually written [_]).

    The effect is that the field [{ _ : 8 : int }] could
    be created by calling [set_unbound_patt field]. *)

val set_patt : patt field -> patt -> patt field
(** Sets the pattern field to an arbitrary OCaml pattern match. *)

val set_length_int : 'a field -> int -> 'a field
(** Sets the length in bits of a field to a constant integer.

    The effect is that the field [{ len : 8 : string }] could
    be created by calling [set_length_int field 8]. *)

val set_length : 'a field -> expr -> 'a field
(** Sets the length in bits of a field to an OCaml expression.

    The effect is that the field [{ len : 2*i : string }] could
    be created by calling [set_length field <:expr< 2*i >>]. *)

val set_endian : 'a field -> Bitmatch.endian -> 'a field
(** Sets the endianness of a field to the constant endianness.

    The effect is that the field [{ _ : 16 : bigendian }] could
    be created by calling [set_endian field Bitmatch.BigEndian]. *)

val set_endian_expr : 'a field -> expr -> 'a field
(** Sets the endianness of a field to an endianness expression.

    The effect is that the field [{ _ : 16 : endian(e) }] could
    be created by calling [set_endian_expr field e]. *)

val set_signed : 'a field -> bool -> 'a field
(** Sets the signedness of a field to a constant signedness.

    The effect is that the field [{ _ : 16 : signed }] could
    be created by calling [set_signed field true]. *)

val set_type_int : 'a field -> 'a field
(** Sets the type of a field to [int].

    The effect is that the field [{ _ : 16 : int }] could
    be created by calling [set_type_int field]. *)

val set_type_string : 'a field -> 'a field
(** Sets the type of a field to [string].

    The effect is that the field [{ str : 16 : string }] could
    be created by calling [set_type_string field]. *)

val set_type_bitstring : 'a field -> 'a field
(** Sets the type of a field to [bitstring].

    The effect is that the field [{ _ : 768 : bitstring }] could
    be created by calling [set_type_bitstring field]. *)

val set_location : 'a field -> loc_t -> 'a field
(** Sets the source code location of a field.  This is used when
    pa_bitmatch displays error messages. *)

val set_offset_int : 'a field -> int -> 'a field
(** Set the offset expression for a field to the given number.

    The effect is that the field [{ _ : 8 : offset(160) }] could
    be created by calling [set_offset_int field 160]. *)

val set_offset : 'a field -> expr -> 'a field
(** Set the offset expression for a field to the given expression.

    The effect is that the field [{ _ : 8 : offset(160) }] could
    be created by calling [set_offset field <:expr< 160 >>]. *)

val set_no_offset : 'a field -> 'a field
(** Remove the offset expression from a field.  The field will
    follow the previous field, or if it is the first field will
    be at offset zero. *)

(** {3 Create constructor fields}

    These fields are used in constructors ([BITSTRING]). *)

val create_constructor_field : loc_t -> expr field
(** Create a constructor field.

    The defaults are the same as for {!create_pattern_field}
    except that the expression is initialized to [0].
*)

val set_lident_expr : expr field -> string -> expr field
(** Sets the expression in a constructor field to an expression
    which uses the identifier.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_expr field "len"]. *)

val set_int_expr : expr field -> int -> expr field
(** Sets the expression to the value of the integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_expr field 2]. *)

val set_string_expr : expr field -> string -> expr field
(** Sets the expression to the value of the string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_expr field "MAGIC"]. *)

val set_expr : expr field -> expr -> expr field
(** Sets the expression field to an arbitrary OCaml expression. *)

val get_patt : patt field -> patt
(** Get the pattern from a pattern field. *)

val get_expr : expr field -> expr
(** Get the expression from an expression field. *)

val get_length : 'a field -> expr
(** Get the length in bits from a field.  Note that what is returned
    is an OCaml expression, since lengths can be non-constant. *)

type endian_expr =
  | ConstantEndian of Bitmatch.endian
  | EndianExpr of expr

val get_endian : 'a field -> endian_expr
(** Get the endianness of a field.  This is an {!endian_expr} which
    could be a constant or an OCaml expression. *)

val get_signed : 'a field -> bool
(** Get the signedness of a field. *)

type field_type = Int | String | Bitstring

val get_type : 'a field -> field_type
(** Get the type of a field, [Int], [String] or [Bitstring]. *)

val get_location : 'a field -> loc_t
(** Get the source code location of a field. *)

val get_offset : 'a field -> expr option
(** Get the offset expression of a field, or [None] if there is none. *)