(** Bitmatch persistent patterns. *)
(* Copyright (C) 2008 Red Hat Inc., Richard W.M. Jones
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 *)

{b Warning:} This documentation is for ADVANCED USERS ONLY.
If you are not an advanced user, you are probably looking
for {{:Bitmatch.html}the Bitmatch documentation}.

{{:#reference}Jump straight to the reference section for
documentation on types and functions}.

Bitmatch allows you to name sets of fields and reuse them
elsewhere. For example, if you frequently need to parse
Pascal-style strings in the form length byte + string, then you
could name the [{ strlen : 8 : int; str : strlen*8 : string }]
pattern and reuse it everywhere by name.

These are called {b persistent patterns}.

{[
(* Create a persistent pattern called 'pascal_string' which
 * matches Pascal-style strings (length byte + string).
 *)
let bitmatch pascal_string =
  \{ strlen : 8 : int;
    str : strlen*8 : string }

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
]}
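
To make the byte layout concrete, here is a minimal sketch in plain OCaml,
without the syntax extension, of what the [pascal_string] pattern describes:
one length byte followed by that many bytes of string data. The function
name [parse_pascal_string_plain] is hypothetical and not part of bitmatch.
The sketch assumes byte-aligned input held in an ordinary [string], and it
simply ignores any bytes that follow the string.

```ocaml
(* Hypothetical helper, not part of bitmatch: decode a Pascal-style
   string (one length byte, then that many bytes) from the start of [s]. *)
let parse_pascal_string_plain (s : string) : (int * string) option =
  if String.length s < 1 then None
  else
    (* The first byte is the length of the string that follows. *)
    let strlen = Char.code s.[0] in
    if String.length s < 1 + strlen then None
    else Some (strlen, String.sub s 1 strlen)

let () =
  (* "\005" is the length byte 5; only "hello" belongs to the string. *)
  match parse_pascal_string_plain "\005hello world" with
  | Some (strlen, str) ->
    Printf.printf "matches a Pascal string %s, len %d bytes\n" str strlen
  | None -> print_endline "not a Pascal string"
(* prints: matches a Pascal string hello, len 5 bytes *)
```

A persistent pattern saves you from repeating this kind of bookkeeping at
every place where the same layout is parsed.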

There are some important things you should know about
persistent patterns before you decide to use them:

'Persistent' refers to the fact that they can be saved into binary
files. However, these binary files use the OCaml [Marshal] module and
depend (sometimes) on the version of OCaml used to generate them
and (sometimes) the version of bitmatch used. So your build system
should rebuild these files from source when your code is rebuilt.

Persistent patterns are syntactic. They work in the same way
as cutting and pasting (or [#include]-ing) code. For example,
if a persistent pattern binds a field named [len], then any
uses of [len] following in the surrounding pattern could
be affected.

Programs which generate and manipulate persistent patterns have to
link to camlp4. Since camlp4 in OCaml >= 3.10 is rather large, we
have placed this code into this separate submodule, so that
programs which just use bitmatch don't need to pull in the whole of
camlp4. This restriction does not apply to generated code which
only uses persistent patterns. If the distinction isn't clear,
use [ocamlobjinfo] to look at the dependencies of your [*.cmo]
files.

Persistent patterns can be generated in several ways, but they
can only be {i used} by the [pa_bitmatch] syntax extension.
This means they are purely compile-time constructs. You
cannot use them to make arbitrary patterns and run those
patterns (not unless your program runs [ocamlc] to make a [*.cmo]
file then dynamically links to the [*.cmo] file).

A named pattern is a way to name a pattern and use it later
in the same source file. To name a pattern, use:

[let bitmatch name = { fields ... } ;;]

and you can then use the name later on inside another pattern
by prefixing the name with a colon. For example:

[bitmatch bits with { :name } -> ...]

You can use named patterns within named patterns.

Currently the use of named patterns is somewhat limited.
The restrictions are:

Named patterns can only be used within the same source file, and
the names occupy a completely separate namespace from anything
else in the source file.

The [let bitmatch] syntax only works at the top level. We may
add a [let bitmatch ... in] for inner levels later.

Because you cannot rename the bound identifiers in named
patterns, you can effectively only use them once in a
pattern. For example, [{ :name; :name }] is legal, but
any bindings in the first name would be overridden by
the second name.

There are no "named constructors" yet, but the machinery
is in place to do this, and we may add them later.

{2 Persistent patterns in files}

More useful than just naming patterns is the ability to load
persistent patterns from external files. The patterns
in these external files can come from a variety of sources:
for example, in the [cil-tools] subdirectory are some
{{:http://cil.sf.net/}Cil-based} tools for importing C
structures from header files. You can also generate
your own files or write your own tools, as described below.

To use the persistent pattern(s) from a file, do:

[open bitmatch "filename.bmpp" ;;]

A list of zero or more {!named} patterns is read from the file
and each is bound to a name (as contained in the file),
and then the patterns can be used with the usual [:name]
syntax described above.

The standard extension is [.bmpp]. This is just a convention
and you can use any extension you want.

{3 Directory search order}

If the filename is an absolute or explicit path, then we try to
load it from that path and stop if it fails. See the [Filename]
module in the standard OCaml library for the definitions of
"absolute path" and "explicit path". Otherwise we use the
following directory search order:

- Relative to the current directory
- Relative to the OCaml library directory
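
The search rule can be sketched with the standard [Filename] module. This
is only an illustration of the rule as described, not the code used by
[pa_bitmatch]: the names [candidate_paths] and [find_pattern_file] are
hypothetical, and the [libdir] argument stands in for the OCaml library
directory, which the caller is assumed to know already.

```ocaml
(* Hypothetical helper, not part of bitmatch.  [libdir] stands in for the
   OCaml library directory. *)
let candidate_paths ~libdir filename =
  if not (Filename.is_implicit filename) then
    (* Absolute or explicit (starting with ./ or ../): try it as given. *)
    [ filename ]
  else
    (* Implicit path: current directory first, then the library directory. *)
    [ Filename.concat Filename.current_dir_name filename;
      Filename.concat libdir filename ]

(* Return the first candidate that exists on disk, if any. *)
let find_pattern_file ~libdir filename =
  List.find_opt Sys.file_exists (candidate_paths ~libdir filename)
```

[Filename.is_implicit] is true only for relative paths that do not start
with an explicit reference to the current or parent directory, so its
negation covers both the "absolute" and the "explicit" case in one test.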

The [bitmatch-objinfo] command can be run on a file in order
to print out the patterns in the file.

We haven't implemented persistent constructors yet, although
the machinery is in place to make this happen. Any constructors
found in the file are ignored.

{2 Creating your own persistent patterns}

If you want to write a tool to import bitstrings from an
exotic location or markup language, you will need
to use the functions found in the {{:#reference}reference section}.

Here is an example of how you would
programmatically create a persistent pattern which
matches Pascal-style "length byte + data" strings.
Firstly note that there are two fields, so our pattern
will be a list of length 2 and type {!pattern}.

You will need to create a camlp4 location object ([Loc.t])
describing the source file. This source file is used
to generate useful error messages for the user, so
you may want to set it to be the name and location in
the file that your tool reads for input. By convention,
locations are bound to the name [_loc]:

{[
let _loc = Loc.move_line 42 (Loc.mk "input.xml")
]}

Create a pattern field representing a length field which is 8 bits wide,
bound to the identifier [len]:

{[
let len_field = create_pattern_field _loc
let len_field = set_length_int len_field 8
let len_field = set_lident_patt len_field "len"
]}

Create a pattern field representing a string of [len*8] bits.
Note that the use of [<:expr< >>] quotation requires
you to preprocess your source with [camlp4of]
(see {{:http://brion.inria.fr/gallium/index.php/Reflective_OCaml}this
page on Reflective OCaml}).

{[
let str_field = create_pattern_field _loc
let str_field = set_length str_field <:expr< len*8 >>
let str_field = set_lident_patt str_field "str"
let str_field = set_type_string str_field
]}

Join the two fields together and name it:

{[
let pattern = [len_field; str_field]
let named_pattern = "pascal_string", Pattern pattern
]}

Save it to a file:

{[
let chan = open_out "output.bmpp" in
named_to_channel chan named_pattern;
close_out chan
]}

You can now use this pattern in another program like this:

{[
open bitmatch "output.bmpp" ;;
let parse_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } -> str, len
  | \{ _ } -> invalid_arg "not a Pascal string"
]}

You can write more than one named pattern to the output file, and
they will all be loaded at the same time by [open bitmatch ".."]
(obviously you should give each pattern a different name). To do
this, just call {!named_to_channel} as many times as needed.
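
The idea of writing several values one after another and reading them all
back can be sketched with the standard [Marshal] module alone. This is an
assumption-laden illustration, not the bitmatch implementation: the [alt]
and [named] types below are simplified stand-ins (each field is reduced to
its name), [save_all] and [load_all] are hypothetical helpers, and whether
{!named_to_channel} calls [Marshal.to_channel] in exactly this way is an
assumption.

```ocaml
(* Stand-in types for illustration only: the real [pattern] holds camlp4
   AST fragments, here each field is reduced to its name. *)
type alt = Pattern of string list | Constructor of string list
type named = string * alt

(* Write each value with a separate Marshal call, one after another. *)
let save_all file (items : named list) =
  let chan = open_out_bin file in
  List.iter (fun item -> Marshal.to_channel chan item []) items;
  close_out chan

(* Read values back until the channel is exhausted.  Marshal does not
   check types: the [named] annotation is trusted, not verified. *)
let load_all file : named list =
  let chan = open_in_bin file in
  let rec loop acc =
    match (Marshal.from_channel chan : named) with
    | item -> loop (item :: acc)
    | exception End_of_file -> close_in chan; List.rev acc
  in
  loop []

let () =
  let file = Filename.temp_file "sketch" ".bmpp" in
  save_all file
    [ "pascal_string", Pattern [ "len"; "str" ];
      "magic", Pattern [ "magic" ] ];
  let names = List.map fst (load_all file) in
  Sys.remove file;
  print_endline (String.concat ", " names)
(* prints: pascal_string, magic *)
```

Because the reader trusts the type annotation, a file written by a
different version of OCaml or of the writing program can fail or misbehave
when loaded, which is why such files should be rebuilt together with the
code that uses them.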

{2:reference Reference}

type patt = Camlp4.PreCast.Syntax.Ast.patt
type expr = Camlp4.PreCast.Syntax.Ast.expr
type loc_t = Camlp4.PreCast.Syntax.Ast.Loc.t
(** Just short names for the camlp4 types. *)

type 'a field
(** A field in a persistent pattern or persistent constructor. *)

type pattern = patt field list
(** A persistent pattern (used in the [bitmatch] operator) is just a
    list of pattern fields. *)

type constructor = expr field list
(** A persistent constructor (used in the [BITSTRING] operator) is just a
    list of constructor fields. *)

type named = string * alt
and alt =
  | Pattern of pattern                  (** Pattern *)
  | Constructor of constructor          (** Constructor *)
(** A named pattern or constructor.

    The name is used when binding a pattern from a file, but
    is otherwise ignored. *)

val string_of_pattern : pattern -> string
val string_of_constructor : constructor -> string
val string_of_field : 'a field -> string
(** Convert patterns, constructors or individual fields
    into printable strings for debugging purposes.

    The strings look similar to the syntax used by bitmatch, but
    some things cannot be printed fully, e.g. length expressions. *)

(** {3 Persistence} *)

val named_to_channel : out_channel -> named -> unit
(** Save a pattern/constructor to an output channel. *)

val named_to_string : named -> string
(** Serialize a pattern/constructor to a string. *)

val named_to_buffer : string -> int -> int -> named -> int
(** Serialize a pattern/constructor to part of a string, return the length. *)

val named_from_channel : in_channel -> named
(** Load a pattern/constructor from an input channel.

    Note: This is not type safe. The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

val named_from_string : string -> int -> named
(** Load a pattern/constructor from a string at offset within the string.

    Note: This is not type safe. The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitmatch. *)

(** {3 Create pattern fields}

    These fields are used in pattern matches ([bitmatch]). *)

val create_pattern_field : loc_t -> patt field
(** Create a pattern field.

    The pattern is unbound, the type is set to [int], bit length to [32],
    endianness to [BigEndian], signedness to unsigned ([false]),
    and source code location to the [_loc] parameter.

    To create a complete field you need to call the [set_*]
    functions. For example, to create [{ len : 8 : int }]
    you would do:
{[
    let field = create_pattern_field _loc in
    let field = set_lident_patt field "len" in
    let field = set_length_int field 8 in
    field
]}
*)

val set_lident_patt : patt field -> string -> patt field
(** Sets the pattern to the pattern binding an identifier
    given in the string.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_patt field "len"]. *)

val set_int_patt : patt field -> int -> patt field
(** Sets the pattern field to the pattern which matches an integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_patt field 2]. *)

val set_string_patt : patt field -> string -> patt field
(** Sets the pattern field to the pattern which matches a string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_patt field "MAGIC"]. *)

val set_unbound_patt : patt field -> patt field
(** Sets the pattern field to the unbound pattern (usually written [_]).

    The effect is that the field [{ _ : 8 : int }] could
    be created by calling [set_unbound_patt field]. *)

val set_patt : patt field -> patt -> patt field
(** Sets the pattern field to an arbitrary OCaml pattern match. *)

val set_length_int : 'a field -> int -> 'a field
(** Sets the length in bits of a field to a constant integer.

    The effect is that the field [{ len : 8 : string }] could
    be created by calling [set_length_int field 8]. *)

val set_length : 'a field -> expr -> 'a field
(** Sets the length in bits of a field to an OCaml expression.

    The effect is that the field [{ len : 2*i : string }] could
    be created by calling [set_length field <:expr< 2*i >>]. *)

val set_endian : 'a field -> Bitmatch.endian -> 'a field
(** Sets the endianness of a field to the constant endianness.

    The effect is that the field [{ _ : 16 : bigendian }] could
    be created by calling [set_endian field Bitmatch.BigEndian]. *)

val set_endian_expr : 'a field -> expr -> 'a field
(** Sets the endianness of a field to an endianness expression.

    The effect is that the field [{ _ : 16 : endian(e) }] could
    be created by calling [set_endian_expr field e]. *)

val set_signed : 'a field -> bool -> 'a field
(** Sets the signedness of a field to a constant signedness.

    The effect is that the field [{ _ : 16 : signed }] could
    be created by calling [set_signed field true]. *)

val set_type_int : 'a field -> 'a field
(** Sets the type of a field to [int].

    The effect is that the field [{ _ : 16 : int }] could
    be created by calling [set_type_int field]. *)

val set_type_string : 'a field -> 'a field
(** Sets the type of a field to [string].

    The effect is that the field [{ str : 16 : string }] could
    be created by calling [set_type_string field]. *)

val set_type_bitstring : 'a field -> 'a field
(** Sets the type of a field to [bitstring].

    The effect is that the field [{ _ : 768 : bitstring }] could
    be created by calling [set_type_bitstring field]. *)

val set_location : 'a field -> loc_t -> 'a field
(** Sets the source code location of a field. This is used when
    pa_bitmatch displays error messages. *)

(** {3 Create constructor fields}

    These fields are used in constructors ([BITSTRING]). *)

val create_constructor_field : loc_t -> expr field
(** Create a constructor field.

    The defaults are the same as for {!create_pattern_field}
    except that the expression is initialized to [0].
*)

val set_lident_expr : expr field -> string -> expr field
(** Sets the expression in a constructor field to an expression
    which uses the identifier.

    The effect is that the field [{ len : 8 : int }] could
    be created by calling [set_lident_expr field "len"]. *)

val set_int_expr : expr field -> int -> expr field
(** Sets the expression to the value of the integer.

    The effect is that the field [{ 2 : 8 : int }] could
    be created by calling [set_int_expr field 2]. *)

val set_string_expr : expr field -> string -> expr field
(** Sets the expression to the value of the string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_expr field "MAGIC"]. *)

val set_expr : expr field -> expr -> expr field
(** Sets the expression field to an arbitrary OCaml expression. *)

val get_patt : patt field -> patt
(** Get the pattern from a pattern field. *)

val get_expr : expr field -> expr
(** Get the expression from an expression field. *)

val get_length : 'a field -> expr
(** Get the length in bits from a field. Note that what is returned
    is an OCaml expression, since lengths can be non-constant. *)

type endian_expr =
  | ConstantEndian of Bitmatch.endian
  | EndianExpr of expr

val get_endian : 'a field -> endian_expr
(** Get the endianness of a field. This is an {!endian_expr} which
    could be a constant or an OCaml expression. *)

val get_signed : 'a field -> bool
(** Get the signedness of a field. *)

type field_type = Int | String | Bitstring

val get_type : 'a field -> field_type
(** Get the type of a field, [Int], [String] or [Bitstring]. *)

val get_location : 'a field -> loc_t
(** Get the source code location of a field. *)