java regex regex-group

Regex to find methods that have an 'inner' method


I'm working on some upkeep for a monolithic Java codebase, where it was discovered that some of the @GET methods actually start a write session and should therefore be @POST methods. I wrote the following regex to aid my search:

@GET[^}]+startWrite

This gave me all occurrences of the @GET annotation that reach the string 'startWrite' (part of the names of the methods that start a write session) before reaching a '}', which closes a method in Java. This solution is not perfect, as a } may appear inside a method before a write session is started (for instance, one closing an if-statement), but it proved effective enough to work with.

However, it has since come to my attention that a lot of the methods follow this format:

@GET
@Path("/methodName")
public ObjectName methodName(...){
    ...
    return methodNameInner(...);
}
private methodNameInner(...){
    startWriteSession();
    ...
}

In other words, the write session command is moved to another method, which always bears the same name as the original method (and the path name), followed by 'Inner'. This inner method is always below the original method. I tried to write a regex that searches for occurrences of @GET, followed by any characters up to either the path name or the method name (which I isolated in a separate group), followed by more characters, followed by \1Inner, followed by the same [^}]+startWrite, meaning the inner method reaches a 'startWrite' string before it reaches the end of the method. But I could not get it to work.

Could someone please assist me?


Solution

  • Personal thoughts

    I completely agree with the comments on your question that a regular expression isn't the right tool: it will not handle every case, and it only works if the code you are analysing is written the way you expect.

    But in some situations, where you just want to correct or adapt existing code in a limited number of files and methods, a regex can quickly do the job.

    Regular expression, without warranty

    If and only if your code is written as you described, you could have a go with this commented pattern:

    "
    ^@GET\r?\n                     # A line with @GET
    @Path\(\"(?<path>[^\"]*)\"\)   # Followed by @Path, and capture this path.
    .*?                            # Match anything, in an ungreedy way.
    # Capture the name of the method and its parameters.
    public\s+ObjectName\s+(?<method>\w+)\((?<params>[^)]*)\)\s*\{
    .*?                            # Match anything, in an ungreedy way.
    \bstartWriteSession\(\);       # match a call to startWriteSession().
    "gmxsi
    

    A little test online: https://regex101.com/r/4Jpg1B/2
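
The pattern above stops at the first startWriteSession() after the @GET method, wherever it is. If you want to require that the call sits inside the matching ...Inner method, a named backreference can tie the two together. Here is a minimal Java sketch of that idea, not a definitive solution. The class name, the method findSuspectGetMethods and the sample source are made up for illustration, and it assumes the Inner method is declared somewhere below the @GET method and that the annotations contain no nested parentheses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original codebase.
final class GetWriteSessionFinder {

    // MULTILINE lets ^ match at every line start; DOTALL lets . match newlines.
    static final Pattern GET_WITH_WRITING_INNER = Pattern.compile(
          "^[ \\t]*@GET\\b"                                    // @GET at the start of a line
        + "(?:\\s*@\\w+(?:\\([^)]*\\))?)*"                    // further annotations, e.g. @Path("/x")
        + "\\s*public\\s+[^(;{}]+?\\s+(?<method>\\w+)\\s*\\(" // public method; capture its name
        + ".*?"                                                // lazily skip ahead
        + "\\b\\k<method>Inner\\s*\\([^)]*\\)[^;{}]*\\{"      // declaration of <method>Inner (a call ends in ';')
        + "[^}]*?\\bstartWrite",                               // 'startWrite' before the first '}'
        Pattern.MULTILINE | Pattern.DOTALL);

    // Made-up sample: only the second @GET method reaches a write session.
    static final String SAMPLE = String.join("\n",
        "@GET",
        "@Path(\"/readOnly\")",
        "public ObjectName readOnly(String id) {",
        "    return readOnlyInner(id);",
        "}",
        "private ObjectName readOnlyInner(String id) {",
        "    return lookup(id);",
        "}",
        "@GET",
        "@Path(\"/methodName\")",
        "public ObjectName methodName(String id) {",
        "    return methodNameInner(id);",
        "}",
        "private ObjectName methodNameInner(String id) {",
        "    startWriteSession();",
        "    return update(id);",
        "}");

    // Returns the names of @GET methods whose Inner method reaches 'startWrite'.
    static List<String> findSuspectGetMethods(String source) {
        List<String> names = new ArrayList<>();
        Matcher matcher = GET_WITH_WRITING_INNER.matcher(source);
        while (matcher.find()) {
            names.add(matcher.group("method"));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(findSuspectGetMethods(SAMPLE));
    }
}
```

Run against the sample, this reports methodName only: readOnly is skipped because readOnlyInner hits its closing } before any startWrite. The same caveat as in the question applies, since a } inside the Inner method (an if-block, for instance) that comes before the write session will hide a real hit.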