I could use some advice: I'm parsing a binary file in PHP, specifically a Sega Genesis ROM file. According to the table I have made, certain bytes correspond to characters, while others control various features of the game's text engine.
Some bytes are used for characters and others are "controller" bytes for line breaks, conditions, color and a bunch of other things, so a typical sentence looks like this:
FC 03 E7 05 D3 42 79 20 64 6F 69 6E 67 20 73 6F 2C BC BE 08 79 6F 75 20 6A 75 73 74 20 61 63 71 75 69 72 65 64 BC BE 04 61 20 74 65 73 74 61 6D 65 6E 74 20 74 6F 20 79 6F 75 72 BC 73 74 61 74 75 73 20 61 73 20 61 20 77 61 72 72 69 6F 72 21 BD BC
which I can translate to:
<FC><03><E7><05><D3>By doing so,<NL><BE><08>you just acquired<NL><BE><04>a testament to your<NL>status as a warrior!<CURSOR>
I want to specify properties, such as length, for each of these controller-byte strings and write my own values to certain positions.
Bytes that translate into characters (00 to 7F) or line breaks (BC) consist of a single byte, while others consist of two (BE XX). Conditions (FC) even consist of five bytes: FC XX YY (where XX and YY are offsets that I need to calculate while I put my translated strings together).
I want my parser to recognize such bytes and let me write XX YY dynamically. Using strtr I can only replace fixed "groups", e.g. when I put the static byte string into an array.
How would you do this while keeping the parser flexible? Thanks!
Assuming your hex values are available as one string with no separators, you can use the regex below to split it into the sequences you mentioned. If you identify more rules besides FC plus four bytes or BE plus one byte, add them as further alternatives ahead of the catch-all group so that they are extracted too.
(?<fc>FC(\w\w){4})|(?<be>BE(\w\w))|(?<any>(\w\w))
The named groups fc, be and any make it easy to pick the results out of the match array, e.g. $matches['fc'].
Regex Demo: https://regex101.com/r/kR9kdP/5
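$re = '/(?<fc>FC(\w\w){4})|(?<be>BE(\w\w))|(?<any>(\w\w))/';
$str = 'FC03E705D3FC0006042842616D20626162612062';
// PREG_PATTERN_ORDER groups the results by capture group, so $matches['fc'] lists every FC match
preg_match_all($re, $str, $matches, PREG_PATTERN_ORDER);
// Each named group holds an empty string wherever a different alternative matched;
// array_filter() drops those empty entries while preserving the original keys
print_r(array_filter($matches['fc'])); // all FC sequences (FC plus 4 bytes)
print_r(array_filter($matches['be'])); // all BE sequences (BE plus 1 byte)
print_r(array_filter($matches['any'])); // all remaining single bytes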
PHP Demo: http://ideone.com/qWUaob
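Sample results (truncated; the BE entries come from a longer input string than the $str shown above, which contains no BE sequences):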
Array
(
    [0] => FC03E705D3
    [1] => FC00060428
)
Array
(
    [50] => BE08
    [59] => BE04
    [113] => BE08
    [132] => BE04
)
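To write the XX YY values dynamically, as asked in the question, the same kind of pattern could be passed to preg_replace_callback. The following is only a sketch under a few assumptions: the rule table `$rules`, the helpers `buildPattern` and `patchConditions`, and the `$offsetsFor` callback are hypothetical names, and the four operand bytes of FC are assumed to be two big-endian 16-bit offsets (the question only says they are offsets). It needs PHP 7.4 or later and upper-case hex input without separators.

```php
<?php
// Hypothetical rule table: control byte => number of operand bytes that follow.
// FC and BE are taken from the question; add further control bytes here.
$rules = ['FC' => 4, 'BE' => 1];

// Build one named alternative per rule, then a catch-all for single bytes.
function buildPattern(array $rules): string
{
    $alternatives = [];
    foreach ($rules as $opcode => $operandBytes) {
        $alternatives[] = sprintf('(?<op%s>%s(?:[0-9A-F]{2}){%d})', $opcode, $opcode, $operandBytes);
    }
    $alternatives[] = '(?<any>[0-9A-F]{2})';
    return '/' . implode('|', $alternatives) . '/';
}

// Rewrite the operands of every FC sequence; all other tokens pass through.
// ASSUMPTION: the operands are two big-endian 16-bit offsets.
// $offsetsFor is a hypothetical callback returning [x, y] for the n-th FC sequence.
function patchConditions(string $hex, array $rules, callable $offsetsFor): string
{
    $n = 0;
    return preg_replace_callback(
        buildPattern($rules),
        function (array $m) use (&$n, $offsetsFor): string {
            if (($m['opFC'] ?? '') === '') {
                return $m[0]; // not an FC sequence: keep it unchanged
            }
            [$x, $y] = $offsetsFor($n++);
            return sprintf('FC%04X%04X', $x, $y);
        },
        $hex
    );
}

$patched = patchConditions(
    'FC03E705D3BE08426162',
    $rules,
    fn (int $n): array => [0x0010, 0x0123] // placeholder offsets for illustration
);
echo $patched, PHP_EOL; // prints FC00100123BE08426162
```

Keeping the lengths in a table means a newly identified control byte only needs one extra array entry, and the callback is the single place where the real offsets would be calculated.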
Hope this helps!