#!/usr/bin/perl -w
-# Copyright (C) 2010 Red Hat Inc.
+# Copyright (C) 2010-2011 Red Hat Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
=head2 ENCODING
-C<hivexregedit> expects that regedit files have already been reencoded
+C<hivexregedit> expects that regedit files have already been re-encoded
in the local encoding. Usually on Linux hosts, this means UTF-8 with
Unix-style line endings. Since Windows regedit files are often in
-UTF-16LE with Windows-style line endings, you may need to reencode the
+UTF-16LE with Windows-style line endings, you may need to re-encode the
whole file before or after processing.
-To reencode a file from Windows format to Linux (before processing it
+To re-encode a file from Windows format to Linux (before processing it
with the C<--merge> option), you would do something like this:
iconv -f utf-16le -t utf-8 < win.reg | dos2unix > linux.reg
Registry keys like C<CurrentControlSet> don't really exist in the
Windows Registry at the level of the hive file, and therefore you
-cannot modify these. Replace this with C<ControlSet001>, and
-similarly for other C<Current...> keys.
+cannot modify these.
+
+C<CurrentControlSet> is usually an alias for C<ControlSet001>. In
+some circumstances it might refer to another control set. The way
+to find out is to look at the C<HKLM\SYSTEM\Select> key:
+
+ $ hivexregedit --export SYSTEM '\Select'
+ [\Select]
+ "Current"=dword:00000001
+ "Default"=dword:00000001
+ "Failed"=dword:00000000
+ "LastKnownGood"=dword:00000002
+
+The "Current" value is the number of the control set which Windows
+will choose when it boots (here 1, meaning C<ControlSet001>).
+
+Similarly, other C<Current...> keys in the path may need to
+be replaced.
=head1 EXAMPLE
The default is to use UTF-16LE, which should work with recent versions
of Windows.
+=cut
+
+my $unsafe_printable_strings;
+
+=item B<--unsafe-printable-strings>
+
+When exporting (only), assume strings are UTF-16LE and print them as
+strings instead of hex sequences. Remove the final zero codepoint
+from strings if present.
+
+This is unsafe and does not preserve the fidelity of strings in the
+original hive for various reasons:
+
+=over 4
+
+=item *
+
+Assumes the original encoding is UTF-16LE. ASCII strings and strings
+in other encodings will be corrupted by this transformation.
+
+=item *
+
+Assumes that everything which has type 1 (C<REG_SZ>) or 2
+(C<REG_EXPAND_SZ>) is really a string and that everything else is
+not a string, but the type field in real hives is not reliable.
+
+=item *
+
+Loses information about whether a zero codepoint followed the string
+in the hive or not.
+
+=back
+
+This all happens because the hive itself contains no information about
+how strings are encoded (see
+L<Win::Hivex::Regedit(3)/ENCODING STRINGS>).
+
+You should only use this option for quick hacking and debugging of the
+hive contents, and I<never> use it if the output is going to be passed
+into another program or stored in another hive.
+
=back
=cut
"export" => \$export,
"prefix=s" => \$prefix,
"encoding=s" => \$encoding,
+ "unsafe-printable-strings" => \$unsafe_printable_strings,
) or pod2usage (2);
pod2usage (1) if $help;
print "Windows Registry Editor Version 5.00\n\n";
- reg_export ($h, $key, \*STDOUT, prefix => $prefix);
+ reg_export ($h, $key, \*STDOUT,
+ prefix => $prefix,
+ unsafe_printable_strings => $unsafe_printable_strings);
}
=head1 SEE ALSO