regex perl

Perl regex for a street number and street name: at least one digit, one uppercase letter, two lowercase letters, and one whitespace character


I know this won't be perfect, but I am looking for a regex to guess whether a line in a file is a street address (without city, state, or zip).

For example:

112 N Main

3525 Webster

1 Stone Ave

Morrison Number 34

So the recipe I am shooting for is: at least one digit, at least one upper case letter, at least two lowercase letters and at least one white space somewhere in the middle of the string. So far what I have is unsuccessful:

/[1-9][A-Z]{1}[a-z]{2}\s/

Solution

  • The pattern that you tried only matches the characters consecutively and in that exact order (a digit, an uppercase letter, two lowercase letters, then whitespace). It does not handle the mandatory characters being present in any order.

    You might use:

    ^(?=[^\n\d]*\d)(?=\h*\S+\h+\S)(?=[^A-Z\n]*[A-Z])(?:[^a-z\n]*[a-z]){2}.*
    

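    The pattern matches:

      • ^ Start of the string
      • (?=[^\n\d]*\d) Positive lookahead: the line contains at least one digit
      • (?=\h*\S+\h+\S) Positive lookahead: after optional leading horizontal whitespace, one or more non-whitespace characters are followed by horizontal whitespace and then another non-whitespace character
      • (?=[^A-Z\n]*[A-Z]) Positive lookahead: the line contains at least one uppercase letter A-Z
      • (?:[^a-z\n]*[a-z]){2} Match up to and including a lowercase letter a-z, twice, so the two lowercase letters need not be adjacent
      • .* Match the rest of the line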

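    Note that in your regex you don't have to use {1}, as matching once is the default. If you want to require a digit 1-9 instead of 0-9, you can change \d to [1-9]. In that case the negated class in front of it should change as well, giving (?=[^\n1-9]*[1-9]); otherwise a 0 that comes before the first 1-9 digit makes the lookahead fail.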

    Note also that \s can match a newline. I have used \h to match only horizontal whitespace characters, but you can change that if needed.
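
As a rough way to try the pattern from the answer, here is a minimal Perl sketch that runs it against the sample lines from the question plus a few lines that should be rejected. The helper name looks_like_street_address and the three negative samples are assumptions made for illustration and are not part of the original post. The pattern is the same as above, only spread over several lines with the /x flag; \h needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Same pattern as in the answer, laid out with the x flag for readability.
my $address_re = qr/
    ^
    (?=[^\n\d]*\d)          # at least one digit somewhere on the line
    (?=\h*\S+\h+\S)         # horizontal whitespace between two non-blank chunks
    (?=[^A-Z\n]*[A-Z])      # at least one uppercase letter
    (?:[^a-z\n]*[a-z]){2}   # at least two lowercase letters, adjacent or not
    .*                      # consume the rest of the line
/x;

# Hypothetical helper: returns 1 when the line passes all four checks, else 0.
sub looks_like_street_address {
    my ($line) = @_;
    return $line =~ $address_re ? 1 : 0;
}

my @samples = (
    '112 N Main',
    '3525 Webster',
    '1 Stone Ave',
    'Morrison Number 34',
    'Main Street',    # assumed negative sample: no digit
    '112 MAIN',       # assumed negative sample: no lowercase letters
    '12345',          # assumed negative sample: no letters, no whitespace
);

for my $line (@samples) {
    printf "%-20s %s\n", $line,
        looks_like_street_address($line) ? 'match' : 'no match';
}
```

The four lines from the question should be reported as a match and the three added lines as no match. Since this is only a guess at what an address looks like, any line that happens to contain a digit, an uppercase letter, two lowercase letters, and inner whitespace will pass as well.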