First one: a huge memory leak when bitmatching on int64. The reason is
that on my arch (MIPS), which is a ARCH_ALIGN_INT64 arch, the function
to extract int64 calls caml_copy_int64 but you are not supposed to do
that since all functions are declared noalloc (should segfault but
merely leaks memory on my program). See this message of Xavier for
more explanations as to why it's a bug:
http://caml.inria.fr/pub/ml-archives/caml-list/2010/03/
2e386c384c04f4a424acc44a1
fae3abd.en.html
So the easy fix would be to remove the noallocs for the int64 functions,
but there is a better way around the issue: instead of assigning a int64
onto a possibly non 8 bytes aligned memory location we can merely memcpy
the value in there, thus avoiding the extra allocation and the
caml_c_call wrapper.
While looking at this I stumbled upon the other bug: bitstring_c.c
extract values by assigning a casted rvalue into a properly
stack-allocated local variable. You cannot do that, this is undefined
behavior since it assumes that unaligned reads are permitted. Again, on
my MIPS such unaligned access are (by default) diverted to an exception
handler which (slowly!) fix them on the fly (one can see the errorcount in
/proc/kernel/debug/mips/unaligned_instructions), but on other archs it
could as well result in a sigbus. I believe this would be a problem on
ARM as well. Again, memcpy comes to the rescue.
The attached patch fixes both issues.
#include <stdlib.h>
#include <stdint.h>
#include <byteswap.h>
+#include <string.h>
#include <caml/mlvalues.h>
#include <caml/fail.h>
{ \
type *ptr = (type *) ((void *) String_val (strv) + Int_val (offv)); \
type r; \
- r = *ptr; \
+ memcpy(&r, ptr, sizeof(r)); \
swap_##endian(size,r); \
return Val_int (r); \
}
{ \
type *ptr = (type *) ((void *) String_val (strv) + Int_val (offv)); \
type r; \
- r = *ptr; \
+ memcpy(&r, ptr, sizeof(r)); \
swap_##endian(size,r); \
rval(rv) = r; \
return rv; \
ocaml_bitstring_extract_fastpath_int##size##_##endian##_##signed \
(value strv, value offv, value rv) \
{ \
- CAMLparam3(strv, offv, rv); \
type *ptr = (type *) ((void *) String_val (strv) + Int_val (offv)); \
type r; \
- r = *ptr; \
+ memcpy(&r, ptr, sizeof(r)); \
swap_##endian(size,r); \
- CAMLreturn(caml_copy_int64(r)); \
+ memcpy(Data_custom_val(rv), &r, sizeof(r)); \
+ return rv; \
}
#else