awk, refactoring, automated-refactoring

Smart search and replace


I had a few thousand lines of code containing pieces like this

opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt()

that I needed to convert to another library that uses syntax like this

ReadFromOD<int>(0x1234, 1)


Basically, I need to search for [whatever1]opencanmanager.GetObjectDict()->ReadDataFrom([whatever2]).toInt()[whatever3] across all the lines of a text file, replace every occurrence with [whatever1]ReadFromOD<int>([whatever2])[whatever3], and then do the same for a few other data types.

Doing that manually would have meant a few days of terribly dull work, but none of the editors I know of offers an automatic function for this kind of smart code refactoring.

Now I have solved the problem using GNU AWK with the script below

#!/usr/bin/awk -f
BEGIN {
    spl1 = "opencanmanager.GetObjectDict()->ReadDataFrom("
    spl2 = ").to"
    spl2_1 = ").toString()"
    spl2_2 = ").toUInt()"
    spl2_3 = ").toInt()"
    min_spl2_len = length(spl2_3)

    repl_start = "ReadFromOD<"
    repl_mid1 = "QString"
    repl_mid2 = "uint"
    repl_mid3 = "int"
    repl_end = ">("
    repl_after = ")"
}

function replacer(str)
{
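    pos1 = index(str, spl1)
    if (!pos1) {
        return str
    }

    # awk strings are 1-indexed, so substr() must start at 1, not 0
    strbegin = substr(str, 1, pos1-1)
    mid_start_pos = pos1+length(spl1)

    # look for ").to" only after the ReadDataFrom( call, not anywhere on the line
    pos2 = index(substr(str, mid_start_pos), spl2)
    if (!pos2) {
        return str
    }
    pos2 += mid_start_pos-1

    strkey = substr(str, pos2, min_spl2_len)
    key1 = substr(spl2_1, 1, min_spl2_len)
    key2 = substr(spl2_2, 1, min_spl2_len)
    key3 = substr(spl2_3, 1, min_spl2_len)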

    strmid = substr(str, mid_start_pos, pos2-mid_start_pos)
    if (strkey == key1) {
        repl_mid = repl_mid1; spl2_fact = spl2_1;
    } else if (strkey == key2) {
        repl_mid = repl_mid2; spl2_fact = spl2_2;
    } else if (strkey == key3) {
        repl_mid = repl_mid3; spl2_fact = spl2_3;
    } else {
        print "ERROR!!! Found", spl1, "but not any of", spl2_1, spl2_2, spl2_3 "!" > "/dev/stderr"
        # awk has no EXIT_FAILURE constant; an unset variable would exit with status 0
        exit 1
    }
    str_remainder = substr(str, pos2+length(spl2_fact))
    return strbegin repl_start repl_mid repl_end strmid repl_after str_remainder
}

{
    resultstr = $0
    do {
        prevstr = resultstr
        resultstr = replacer(resultstr)
        # stop once a pass changes nothing; this also avoids looping forever
        # on lines that contain only one of the two markers
    } while (resultstr != prevstr)
    print(resultstr)
}

and everything works fine, but the thing still bugs me somewhat. My solution feels too complicated for a job that must be very common and must have an easy standard solution that I just don't know about for some reason.

I am prepared to just let it go, but if you know a more elegant and quick one-liner solution, or some specific tool for this kind of smart code modification, then I would definitely like to know.


Solution

  • If sed is an option, you can try the following, which should produce the desired output for input such as this:

    $ cat input_file
    opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt()
    
    power1 = opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt() * opencanmanager.GetObjectDict()->ReadDataFrom(0x5678, 1).toUInt() * FACTOR1;
    power2 = opencanmanager.GetObjectDict()->ReadDataFrom(0x5678, 1).toUInt() / 2;
    
    $ sed -E 's/ReadDataFrom/ReadFromOD<int>/g;s/int/uint/2;s/(.*= )?[^>]*>([^\.]*)[^\*|/]*?(\*|\/.{2,})?[^\.]*?[^>]*?>?([^\.]*)?[^\*]*?(.*)?/\1\2 \3 \4 \5/' input_file
    ReadFromOD<int>(0x1234, 1)
    
    power1 = ReadFromOD<int>(0x1234, 1) * ReadFromOD<uint>(0x5678, 1) * FACTOR1;
    power2 = ReadFromOD<int>(0x5678, 1) / 2;
    

    s/ReadDataFrom/ReadFromOD<int>/g - The first part of the command is a simple global substitution that replaces every occurrence of ReadDataFrom with ReadFromOD<int>.

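    s/int/uint/2 - The second part replaces only the second occurrence of int on a line with uint, if there is one. Because it goes by position rather than by the original .toUInt() call, a line whose only call is .toUInt(), such as power2 above, still ends up as ReadFromOD<int>.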

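    s/(.*= )?[^>]*>([^\.]*)[^\*|/]*?(\*|\/.{2,})?[^\.]*?[^>]*?>?([^\.]*)?[^\*]*?(.*)?/\1\2 \3 \4 \5/ - The third part uses sed capture groups and back-references: \1 through \5 keep the pieces of the line that should survive, and everything matched outside the groups is dropped.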
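
A type-aware variant of the same grouping idea can be sketched with one substitution per conversion method. This is only a sketch and not part of the answer above: it assumes the arguments of ReadDataFrom(...) never contain nested parentheses, and convert_reads is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite each ReadDataFrom(...).toX() call to ReadFromOD<T>(...).
# Assumption: the argument list contains no parentheses, so [^()]* can capture it.
convert_reads() {
    sed -E \
        -e 's/opencanmanager\.GetObjectDict\(\)->ReadDataFrom\(([^()]*)\)\.toString\(\)/ReadFromOD<QString>(\1)/g' \
        -e 's/opencanmanager\.GetObjectDict\(\)->ReadDataFrom\(([^()]*)\)\.toUInt\(\)/ReadFromOD<uint>(\1)/g' \
        -e 's/opencanmanager\.GetObjectDict\(\)->ReadDataFrom\(([^()]*)\)\.toInt\(\)/ReadFromOD<int>(\1)/g' \
        "$@"
}

# Usage: reads the named files, or standard input when no file is given.
printf '%s\n' 'power1 = opencanmanager.GetObjectDict()->ReadDataFrom(0x1234, 1).toInt() * opencanmanager.GetObjectDict()->ReadDataFrom(0x5678, 1).toUInt() * FACTOR1;' | convert_reads
# prints: power1 = ReadFromOD<int>(0x1234, 1) * ReadFromOD<uint>(0x5678, 1) * FACTOR1;
```

Each expression captures the argument list in \1 and picks the template type from the method name rather than from its position on the line, so a lone .toUInt() call becomes ReadFromOD<uint>. Text before and after the call is left untouched because it is never part of the match.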