assembly unicode reverse-engineering binaryfiles ghidra

Ghidra: Automagically set Bytes to Unicode / CString


I have a Ghidra question for you. I am dissecting an executable binary and I am noticing a TON of data that is clearly Unicode in the "Listing" pane but is showing up as undefined bytes.

I am aware that I can click on the first address and then select "Data" > "TerminatedUnicode", but there are hundreds of these byte runs that need to be converted to Unicode.

Is there an automated way to perform this tedious task?

Data in Byte Format instead of Unicode

Manually casting individual Byte Data Types to Unicode


Solution

  • That should be a fairly simple script: basically call createUnicodeString(Address) and getUndefinedDataAfter(Address) in a loop. The tricky part is deciding when to stop, but if you know where this memory range ends, that's just one additional check against the end address. Handling padding/alignment is another slight pitfall, but it should be enough to repeat getUndefinedDataAfter until the byte at the current address isn't a null byte.
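
To make the stop condition and the padding handling concrete, here is a minimal Python 3 sketch of the same loop. It runs on a plain byte buffer, not inside Ghidra. The function name find_utf16_strings and its start/end parameters are made up for illustration. It assumes the strings are UTF-16LE, terminated by a 16-bit null, and aligned to 2 bytes, so it skips padding in 16-bit units rather than byte by byte. In a real Ghidra script, recording a hit would roughly correspond to createUnicodeString(Address), and moving on to the next candidate to getUndefinedDataAfter(Address).

```python
def find_utf16_strings(buf, start, end):
    """Return (offset, text) for each null-terminated UTF-16LE string in buf[start:end].

    Hypothetical helper for illustration only; it models the loop on a raw
    byte buffer and does not call any Ghidra API.
    """
    found = []
    pos = start
    while pos + 1 < end:
        # Padding/alignment: skip ahead while the current 16-bit unit is null.
        if buf[pos] == 0 and buf[pos + 1] == 0:
            pos += 2
            continue
        # Walk forward to the 16-bit null terminator, never past `end`.
        stop = pos
        while stop + 1 < end and not (buf[stop] == 0 and buf[stop + 1] == 0):
            stop += 2
        if stop + 1 >= end:
            break  # no terminator inside the range, so stop here
        text = bytes(buf[pos:stop]).decode("utf-16-le", errors="replace")
        found.append((pos, text))
        pos = stop + 2  # continue right after the terminator
    return found


# Two strings separated by one terminator plus two bytes of null padding.
data = ("Hi".encode("utf-16-le") + b"\x00\x00" + b"\x00\x00"
        + "Ok!".encode("utf-16-le") + b"\x00\x00")
print(find_utf16_strings(data, 0, len(data)))  # [(0, 'Hi'), (8, 'Ok!')]
```

The end check is what keeps the loop from running into unrelated data: if the range is cut short so that the last string has no terminator inside it, that string is not reported.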