java regex

Regular expression: match a sequence of characters and a repeating group


I am trying to set up a regular expression to validate variable names. The rules are relatively simple, but I am no regex jedi :)

It must start with a letter or an underscore:

[a-zA-Z_]

It can be followed by any number of letters, digits, dashes "-", underscores "_", or dots ".":

[a-zA-Z_0-9\-\.]*

So far so good:

[a-zA-Z_][a-zA-Z_0-9\-\.]*

BUT it may also include any number of repetitions of the pattern \[\w+\]. Like any of the other characters, this pattern can appear any number of times, anywhere except at the first position.

So I tried a few options, such as

[a-zA-Z_][a-zA-Z_0-9\-\.]*(\[\w+\])*

or

[a-zA-Z_][a-zA-Z_0-9\-\.(\[\w+\])]*

I never managed to find the right syntax.

The idea is quite simple: it's a pattern, and it can appear anywhere the characters of the second sequence can, one or more times, in any order.

If this is not clear, see the examples below.

### valid examples
valid_var_name[a]
valid3_va[x].[y].name
valid_myVar[w].x
valide_myVar[a1].sub[b].SUB1_2[c].sub[d1].sub[d1].sub[d1].sub[d1].sub[d1].sub[d1].isValid

### invalid examples
2invalid     // starts with a digit
.invalid     // starts with a dot
invalid[]    // nothing between the brackets
invalid].name      // a lone ] is invalid
invalid[.]name     // [.] a dot is not a word
invalid[].name[1]  // [] without a word is not valid

Solution

  • ^[a-zA-Z_]([\w\-.]|\[\w+\])*$
    

    This starts with a letter or an underscore, then allows any sequence of word characters (letters, digits, underscore), -, ., or [name], where name is one or more word characters.
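
A minimal Java sketch of the pattern above, run against some of the question's examples. The class name `VariableNameValidator` and the method `isValid` are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class VariableNameValidator {
    // A letter or underscore, then any mix of word characters, '-', '.',
    // or bracketed words such as [a1]. Backslashes are doubled because
    // this is a Java string literal.
    static final Pattern NAME =
            Pattern.compile("^[a-zA-Z_]([\\w\\-.]|\\[\\w+\\])*$");

    static boolean isValid(String name) {
        return NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
                "valid_var_name[a]", "valid3_va[x].[y].name", "valid_myVar[w].x",
                "2invalid", ".invalid", "invalid[]", "invalid].name",
                "invalid[.]name", "invalid[].name[1]");
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

The first three samples should print `true` and the remaining six `false`. The two alternatives inside the group cannot both match at the same position (one starts with `[`, the other never does), so the repetition does not introduce ambiguous backtracking.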

    The anchors are needed so that it won't match a variable name in the middle of a string, such as "variable" in "2variable".
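
To illustrate the point about anchors, here is a small Java sketch. The class `AnchorDemo` and its method names are hypothetical. It relies on the documented behavior that `Matcher.find()` searches for a match anywhere in the input, while `Matcher.matches()` requires the whole input to match.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class AnchorDemo {
    // The pattern body from the answer, without ^ and $.
    static final String BODY = "[a-zA-Z_]([\\w\\-.]|\\[\\w+\\])*";

    // find() without anchors: succeeds if any substring matches.
    static boolean findsUnanchored(String s) {
        return Pattern.compile(BODY).matcher(s).find();
    }

    // find() with anchors: the match must start at the beginning of the input.
    static boolean findsAnchored(String s) {
        return Pattern.compile("^" + BODY + "$").matcher(s).find();
    }

    // matches() must consume the entire input, even without anchors.
    static boolean matchesWhole(String s) {
        return Pattern.compile(BODY).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(findsUnanchored("2variable")); // true: finds "variable"
        System.out.println(findsAnchored("2variable"));   // false
        System.out.println(matchesWhole("2variable"));    // false
    }
}
```

So in Java the anchors matter when searching with `find()`. With `matches()` (or `String.matches`) the whole input has to match anyway, and keeping the anchors is harmless.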

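
For comparison, a sketch of how the two patterns tried in the question behave in Java, which may help show why the alternation is needed. The class `AttemptComparison` and its method names are invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class, for illustration only.
class AttemptComparison {
    // First attempt: bracket groups are only allowed at the very end,
    // so nothing may follow a [word].
    static final Pattern TRAILING_GROUPS =
            Pattern.compile("[a-zA-Z_][a-zA-Z_0-9\\-\\.]*(\\[\\w+\\])*");

    // Second attempt: inside [...] the characters ( + ) are plain literals,
    // so this is one large character class, not a nested group.
    static final Pattern ONE_BIG_CLASS =
            Pattern.compile("[a-zA-Z_][a-zA-Z_0-9\\-\\.(\\[\\w+\\])]*");

    static boolean firstAttempt(String s) {
        return TRAILING_GROUPS.matcher(s).matches();
    }

    static boolean secondAttempt(String s) {
        return ONE_BIG_CLASS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(firstAttempt("valid_var_name[a]")); // true
        System.out.println(firstAttempt("valid_myVar[w].x"));  // false: ".x" follows the group
        System.out.println(secondAttempt("invalid[]"));        // true: '[' and ']' are just allowed characters
    }
}
```

The first attempt is too strict (it rejects valid names with text after a bracket group) and the second is too loose (it accepts stray brackets). Putting `\[\w+\]` next to the character class as an alternative inside one repeated group avoids both problems.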