1 (** Bitstring persistent patterns. *)
2 (* Copyright (C) 2008 Red Hat Inc., Richard W.M. Jones
4 * This library is free software; you can redistribute it and/or
5 * modify it under the terms of the GNU Lesser General Public
6 * License as published by the Free Software Foundation; either
7 * version 2 of the License, or (at your option) any later version,
8 * with the OCaml linking exception described in COPYING.LIB.
10 * This library is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
13 * Lesser General Public License for more details.
15 * You should have received a copy of the GNU Lesser General Public
16 * License along with this library; if not, write to the Free Software
17 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
23 {b Warning:} This documentation is for ADVANCED USERS ONLY.
24 If you are not an advanced user, you are probably looking
25 for {{:Bitstring.html}the Bitstring documentation}.
27 {{:#reference}Jump straight to the reference section for
28 documentation on types and functions}.
32 Bitstring allows you to name sets of fields and reuse them
33 elsewhere. For example if you frequently need to parse
34 Pascal-style strings in the form length byte + string, then you
35 could name the [{ strlen : 8 : int; str : strlen*8 : string }]
36 pattern and reuse it everywhere by name.
38 These are called {b persistent patterns}.
(* Create a persistent pattern called 'pascal_string' which
 * matches Pascal-style strings (length byte + string).
 *)
let bitmatch pascal_string =
  { strlen : 8 : int;
    str : strlen*8 : string }

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
(* Load a persistent pattern from a file. *)
open bitmatch "pascal.bmpp"

let is_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } ->
    printf "matches a Pascal string %s, len %d bytes\n"
      str strlen
72 There are some important things you should know about
73 persistent patterns before you decide to use them:
'Persistent' refers to the fact that they can be saved into binary
files. However, these binary files use the OCaml [Marshal] module and
depend (sometimes) on the version of OCaml used to generate them
and (sometimes) on the version of bitstring used. So your build system
should rebuild these files from source whenever your code is rebuilt.
Persistent patterns are syntactic. They work in the same way
as cutting and pasting (or [#include]-ing) code. For example,
if a persistent pattern binds a field named [len], then any
uses of [len] following in the surrounding pattern could
be affected.
87 Programs which generate and manipulate persistent patterns have to
88 link to camlp4. Since camlp4 in OCaml >= 3.10 is rather large, we
89 have placed this code into this separate submodule, so that
90 programs which just use bitstring don't need to pull in the whole of
91 camlp4. This restriction does not apply to code which only uses
92 persistent patterns but does not generate them. If the distinction
93 isn't clear, use [ocamlobjinfo] to look at the dependencies of your
96 Persistent patterns can be generated in several ways, but they
97 can only be {i used} by the [pa_bitstring] syntax extension.
98 This means they are purely compile-time constructs. You
99 cannot use them to make arbitrary patterns and run those
100 patterns (not unless your program runs [ocamlc] to make a [*.cmo]
101 file then dynamically links to the [*.cmo] file).
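
The dependence on [Marshal] noted above can be illustrated with a
minimal sketch that uses only the standard library. The [demo_field]
and [demo_named] types are hypothetical stand-ins, not the real types
from the reference section. The point is that the type annotation given
to [Marshal.from_string] is trusted rather than checked, which is why
stale files should be rebuilt rather than reused.

{[
(* Hypothetical stand-in types, NOT the real Bitstring_persistent types. *)
type demo_field = { name : string; length : string }
type demo_named = string * demo_field list

(* Serialize a value, as a tool generating a pattern file might. *)
let saved : string =
  let p : demo_named =
    "pascal_string",
    [ { name = "strlen"; length = "8" };
      { name = "str"; length = "strlen*8" } ] in
  Marshal.to_string p []

(* Marshal does not check this annotation against the bytes. It is
   only correct because the writer above used the same type. *)
let (pattern_name, fields) : demo_named = Marshal.from_string saved 0

let () =
  Printf.printf "%s has %d fields\n" pattern_name (List.length fields)
]}

If the reader's type definition differed from the writer's, the
annotation would still be accepted and the program could misbehave
at runtime.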
105 A named pattern is a way to name a pattern and use it later
106 in the same source file. To name a pattern, use:
108 [let bitmatch name = { fields ... } ;;]
110 and you can then use the name later on inside another pattern,
111 by prefixing the name with a colon.
114 [bitmatch bits with { :name } -> ...]
116 You can nest named patterns within named patterns to any depth.
118 Currently the use of named patterns is somewhat limited.
119 The restrictions are:
121 Named patterns can only be used within the same source file, and
122 the names occupy a completely separate namespace from anything
123 else in the source file.
125 The [let bitmatch] syntax only works at the top level. We may
126 add a [let bitmatch ... in] for inner levels later.
Because you cannot rename the bound identifiers in named
patterns, you can effectively only use them once in a
pattern. For example, [{ :name; :name }] is legal, but
any bindings in the first name would be overridden by
the second name.
134 There are no "named constructors" yet, but the machinery
135 is in place to do this, and we may add them later.
137 {2 Persistent patterns in files}
Beyond naming patterns within a single source file, you can load
persistent patterns from external files. The patterns
in these external files can come from a variety of sources:
for example, in the [cil-tools] subdirectory are some
{{:http://cil.sf.net/}Cil-based} tools for importing C
structures from header files. You can also generate
your own files or write your own tools, as described below.
147 To use the persistent pattern(s) from a file do:
149 [open bitmatch "filename.bmpp" ;;]
A list of zero or more {!named} patterns is read from the file
and each is bound to a name (as contained in the file),
and then the patterns can be used with the usual [:name]
syntax described above.
158 The standard extension is [.bmpp]. This is just a convention
159 and you can use any extension you want.
161 {3 Directory search order}
163 If the filename is an absolute or explicit path, then we try to
164 load it from that path and stop if it fails. See the [Filename]
165 module in the standard OCaml library for the definitions of
166 "absolute path" and "explicit path". Otherwise we use the
167 following directory search order:
169 - Relative to the current directory
170 - Relative to the OCaml library directory
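
The search order above can be sketched with the standard [Filename]
module. This is an illustration only: [candidate_paths] and its
[libdir] parameter are hypothetical names, and whether the syntax
extension uses exactly these calls is an assumption.

{[
(* [libdir] is an assumed parameter, for example the directory
   printed by "ocamlc -where". *)
let candidate_paths ~libdir filename =
  if not (Filename.is_implicit filename) then
    (* Absolute ("/a/b.bmpp") or explicit ("./b.bmpp"): one attempt. *)
    [ filename ]
  else
    (* Implicit name: current directory first, then the library directory. *)
    [ Filename.concat Filename.current_dir_name filename;
      Filename.concat libdir filename ]

let () =
  List.iter print_endline
    (candidate_paths ~libdir:"/usr/lib/ocaml" "pascal.bmpp")
]}

Returning a list of candidates keeps the decision (which paths to try,
and in what order) separate from the act of opening the file.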
172 {3 bitstring-objinfo}
174 The [bitstring-objinfo] command can be run on a file in order
175 to print out the patterns in the file.
179 We haven't implemented persistent constructors yet, although
180 the machinery is in place to make this happen. Any constructors
181 found in the file are ignored.
183 {2 Creating your own persistent patterns}
185 If you want to write a tool to import bitstrings from an
186 exotic location or markup language, you will need
187 to use the functions found in the {{:#reference}reference section}.
The following example shows how you would
programmatically create a persistent pattern which
matches Pascal-style "length byte + data" strings.
First, note that there are two fields, so our pattern
will be a list of length 2 and type {!pattern}.
You will need to create a camlp4 location object ([Loc.t])
describing the source file. This location is used
to generate useful error messages for the user, so
you may want to set it to the name of, and position within,
the file that your tool reads for input. By convention,
locations are bound to the name [_loc]:
203 let _loc = Loc.move_line 42 (Loc.mk "input.xml")
206 Create a pattern field representing a length field which is 8 bits wide,
207 bound to the identifier [len]:
210 let len_field = create_pattern_field _loc
211 let len_field = set_length_int len_field 8
212 let len_field = set_lident_patt len_field "len"
215 Create a pattern field representing a string of [len*8] bits.
216 Note that the use of [<:expr< >>] quotation requires
217 you to preprocess your source with [camlp4of]
218 (see {{:http://brion.inria.fr/gallium/index.php/Reflective_OCaml}this
219 page on Reflective OCaml}).
222 let str_field = create_pattern_field _loc
223 let str_field = set_length str_field <:expr< len*8 >>
224 let str_field = set_lident_patt str_field "str"
225 let str_field = set_type_string str_field
228 Join the two fields together and name it:
231 let pattern = [len_field; str_field]
232 let named_pattern = "pascal_string", Pattern pattern
let chan = open_out "output.bmpp" in
named_to_channel chan named_pattern;
close_out chan
243 You can now use this pattern in another program like this:
open bitmatch "output.bmpp" ;;
let parse_pascal_string bits =
  bitmatch bits with
  | \{ :pascal_string } -> str, len
  | \{ _ } -> invalid_arg "not a Pascal string"
253 You can write more than one named pattern to the output file, and
254 they will all be loaded at the same time by [open bitmatch ".."]
255 (obviously you should give each pattern a different name). To do
256 this, just call {!named_to_channel} as many times as needed.
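
As a rough illustration of several patterns sharing one file, the
sketch below writes and reads back a sequence of values using [Marshal]
directly. It is not the library's implementation: [demo_named],
[write_all] and [read_all] are hypothetical stand-ins, and the idea
that the file is simply a sequence of marshalled values is an
assumption based on the note about [Marshal] earlier in this document.

{[
(* Stand-in for named patterns, NOT the real named type. *)
type demo_named = string * string list

(* Write each value in turn to the same channel. *)
let write_all path (patterns : demo_named list) =
  let chan = open_out_bin path in
  List.iter (fun p -> Marshal.to_channel chan p []) patterns;
  close_out chan

(* Read values until the end of the file is reached. *)
let read_all path : demo_named list =
  let chan = open_in_bin path in
  let rec loop acc =
    match (Marshal.from_channel chan : demo_named) with
    | p -> loop (p :: acc)
    | exception End_of_file -> List.rev acc
  in
  let patterns = loop [] in
  close_in chan;
  patterns

let () =
  let path = Filename.temp_file "demo" ".bmpp" in
  write_all path [ "pascal_string", ["strlen"; "str"];
                   "c_string", ["str"] ];
  List.iter (fun (name, _) -> print_endline name) (read_all path);
  Sys.remove path
]}

With the real API, the writing side corresponds to repeated calls to
{!named_to_channel} on one output channel.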
258 {2:reference Reference}
263 type patt = Camlp4.PreCast.Syntax.Ast.patt
264 type expr = Camlp4.PreCast.Syntax.Ast.expr
265 type loc_t = Camlp4.PreCast.Syntax.Ast.Loc.t
266 (** Just short names for the camlp4 types. *)
269 (** A field in a persistent pattern or persistent constructor. *)
type pattern = patt field list
(** A persistent pattern (used in the [bitmatch] operator) is just a
    list of pattern fields. *)

type constructor = expr field list
(** A persistent constructor (used in the [BITSTRING] operator) is just a
    list of constructor fields. *)
type named = string * alt
and alt =
  | Pattern of pattern                  (** Pattern *)
  | Constructor of constructor          (** Constructor *)
(** A named pattern or constructor.

    The name is used when binding a pattern from a file, but
    is otherwise ignored. *)
val string_of_pattern : pattern -> string
val string_of_constructor : constructor -> string
val string_of_pattern_field : patt field -> string
val string_of_constructor_field : expr field -> string
(** Convert patterns, constructors or individual fields
    into printable strings for debugging purposes.

    The strings look similar to the syntax used by bitmatch, but
    some things cannot be printed fully, e.g. length expressions. *)
300 (** {3 Persistence} *)
302 val named_to_channel : out_channel -> named -> unit
303 (** Save a pattern/constructor to an output channel. *)
305 val named_to_string : named -> string
306 (** Serialize a pattern/constructor to a string. *)
308 val named_to_buffer : string -> int -> int -> named -> int
309 (** Serialize a pattern/constructor to part of a string, return the length. *)
val named_from_channel : in_channel -> named
(** Load a pattern/constructor from an input channel.

    Note: This is not type safe. The pattern/constructor must
    have been written out under the same version of OCaml and
    the same version of bitstring. *)
318 val named_from_string : string -> int -> named
319 (** Load a pattern/constructor from a string at offset within the string.
321 Note: This is not type safe. The pattern/constructor must
322 have been written out under the same version of OCaml and
323 the same version of bitstring. *)
325 (** {3 Create pattern fields}
327 These fields are used in pattern matches ([bitmatch]). *)
329 val create_pattern_field : loc_t -> patt field
330 (** Create a pattern field.
332 The pattern is unbound, the type is set to [int], bit length to [32],
333 endianness to [BigEndian], signedness to unsigned ([false]),
334 source code location to the [_loc] parameter, and no offset expression.
336 To create a complete field you need to call the [set_*]
337 functions. For example, to create [{ len : 8 : int }]
341 let field = create_pattern_field _loc in
342 let field = set_lident_patt field "len" in
343 let field = set_length_int field 8 in
347 val set_lident_patt : patt field -> string -> patt field
348 (** Sets the pattern to the pattern binding an identifier
351 The effect is that the field [{ len : 8 : int }] could
352 be created by calling [set_lident_patt field "len"]. *)
354 val set_int_patt : patt field -> int -> patt field
355 (** Sets the pattern field to the pattern which matches an integer.
357 The effect is that the field [{ 2 : 8 : int }] could
358 be created by calling [set_int_patt field 2]. *)
val set_string_patt : patt field -> string -> patt field
(** Sets the pattern field to the pattern which matches a string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_patt field "MAGIC"]. *)
366 val set_unbound_patt : patt field -> patt field
367 (** Sets the pattern field to the unbound pattern (usually written [_]).
369 The effect is that the field [{ _ : 8 : int }] could
370 be created by calling [set_unbound_patt field]. *)
372 val set_patt : patt field -> patt -> patt field
373 (** Sets the pattern field to an arbitrary OCaml pattern match. *)
val set_length_int : 'a field -> int -> 'a field
(** Sets the length in bits of a field to a constant integer.

    The effect is that the field [{ len : 8 : string }] could
    be created by calling [set_length_int field 8]. *)
381 val set_length : 'a field -> expr -> 'a field
382 (** Sets the length in bits of a field to an OCaml expression.
384 The effect is that the field [{ len : 2*i : string }] could
385 be created by calling [set_length field <:expr< 2*i >>]. *)
387 val set_endian : 'a field -> Bitstring.endian -> 'a field
388 (** Sets the endianness of a field to the constant endianness.
390 The effect is that the field [{ _ : 16 : bigendian }] could
391 be created by calling [set_endian field Bitstring.BigEndian]. *)
393 val set_endian_expr : 'a field -> expr -> 'a field
394 (** Sets the endianness of a field to an endianness expression.
396 The effect is that the field [{ _ : 16 : endian(e) }] could
397 be created by calling [set_endian_expr field e]. *)
399 val set_signed : 'a field -> bool -> 'a field
400 (** Sets the signedness of a field to a constant signedness.
402 The effect is that the field [{ _ : 16 : signed }] could
403 be created by calling [set_signed field true]. *)
405 val set_type_int : 'a field -> 'a field
406 (** Sets the type of a field to [int].
408 The effect is that the field [{ _ : 16 : int }] could
409 be created by calling [set_type_int field]. *)
411 val set_type_string : 'a field -> 'a field
412 (** Sets the type of a field to [string].
414 The effect is that the field [{ str : 16 : string }] could
415 be created by calling [set_type_string field]. *)
417 val set_type_bitstring : 'a field -> 'a field
418 (** Sets the type of a field to [bitstring].
420 The effect is that the field [{ _ : 768 : bitstring }] could
421 be created by calling [set_type_bitstring field]. *)
423 val set_location : 'a field -> loc_t -> 'a field
424 (** Sets the source code location of a field. This is used when
425 pa_bitstring displays error messages. *)
427 val set_offset_int : 'a field -> int -> 'a field
428 (** Set the offset expression for a field to the given number.
430 The effect is that the field [{ _ : 8 : offset(160) }] could
431 be created by calling [set_offset_int field 160]. *)
val set_offset : 'a field -> expr -> 'a field
(** Set the offset expression for a field to the given expression.

    The effect is that the field [{ _ : 8 : offset(160) }] could
    be created by calling [set_offset field <:expr< 160 >>]. *)
439 val set_no_offset : 'a field -> 'a field
440 (** Remove the offset expression from a field. The field will
441 follow the previous field, or if it is the first field will
442 be at offset zero. *)
444 val set_check : 'a field -> expr -> 'a field
445 (** Set the check expression for a field to the given expression. *)
447 val set_no_check : 'a field -> 'a field
448 (** Remove the check expression from a field. *)
450 val set_bind : 'a field -> expr -> 'a field
451 (** Set the bind-expression for a field to the given expression. *)
453 val set_no_bind : 'a field -> 'a field
454 (** Remove the bind-expression from a field. *)
456 val set_save_offset_to : 'a field -> patt -> 'a field
457 (** Set the save_offset_to pattern for a field to the given pattern. *)
val set_save_offset_to_lident : 'a field -> string -> 'a field
(** Set the save_offset_to pattern for a field to the given identifier. *)
462 val set_no_save_offset_to : 'a field -> 'a field
463 (** Remove the save_offset_to from a field. *)
465 (** {3 Create constructor fields}
467 These fields are used in constructors ([BITSTRING]). *)
val create_constructor_field : loc_t -> expr field
(** Create a constructor field.

    The defaults are the same as for {!create_pattern_field}
    except that the expression is initialized to [0]. *)
476 val set_lident_expr : expr field -> string -> expr field
477 (** Sets the expression in a constructor field to an expression
478 which uses the identifier.
480 The effect is that the field [{ len : 8 : int }] could
481 be created by calling [set_lident_expr field "len"]. *)
483 val set_int_expr : expr field -> int -> expr field
484 (** Sets the expression to the value of the integer.
486 The effect is that the field [{ 2 : 8 : int }] could
487 be created by calling [set_int_expr field 2]. *)
val set_string_expr : expr field -> string -> expr field
(** Sets the expression to the value of the string.

    The effect is that the field [{ "MAGIC" : 8*5 : string }] could
    be created by calling [set_string_expr field "MAGIC"]. *)
495 val set_expr : expr field -> expr -> expr field
496 (** Sets the expression field to an arbitrary OCaml expression. *)
500 val get_patt : patt field -> patt
501 (** Get the pattern from a pattern field. *)
503 val get_expr : expr field -> expr
504 (** Get the expression from an expression field. *)
506 val get_length : 'a field -> expr
507 (** Get the length in bits from a field. Note that what is returned
508 is an OCaml expression, since lengths can be non-constant. *)
511 | ConstantEndian of Bitstring.endian
514 val get_endian : 'a field -> endian_expr
515 (** Get the endianness of a field. This is an {!endian_expr} which
516 could be a constant or an OCaml expression. *)
518 val get_signed : 'a field -> bool
519 (** Get the signedness of a field. *)
521 type field_type = Int | String | Bitstring
523 val get_type : 'a field -> field_type
524 (** Get the type of a field, [Int], [String] or [Bitstring]. *)
526 val get_location : 'a field -> loc_t
527 (** Get the source code location of a field. *)
529 val get_offset : 'a field -> expr option
530 (** Get the offset expression of a field, or [None] if there is none. *)
532 val get_check : 'a field -> expr option
533 (** Get the check expression of a field, or [None] if there is none. *)
535 val get_bind : 'a field -> expr option
536 (** Get the bind expression of a field, or [None] if there is none. *)
538 val get_save_offset_to : 'a field -> patt option
539 (** Get the save_offset_to pattern of a field, or [None] if there is none. *)