I want to match street names followed by their house numbers, where a house number may carry a single letter suffix and may also be a range of house numbers.
For example:
Birkenstraße 22b
Birkenstraße 22b-23a
Birkenstraße 22b/23z
For this, I have the following rule in a Ruta script:
(Street PERIOD? ((NUM "b"? (("/"|"-") NUM "b"?)?) {-> MARK(HouseNumber)}));
The "b" is the place where I want to match any letter, as a regex would with [a-zA-Z]. But when I replaced "b" with "[a-zA-Z]", no HouseNumber was recognized at all, whereas with "b" I can recognize the first of my examples, Birkenstraße 22b.
How can I use such a regular expression within a rule in UIMA Ruta?
I declared a type and assigned it like this at the beginning of my script:
DECLARE CHARS;
W{REGEXP("[a-zA-Z]") -> MARK(CHARS)};
After that, I added the type CHARS to my rule like this, and it worked:
(Street PERIOD? ((NUM CHARS? (("/"|"-") NUM CHARS?)?) {-> MARK(HouseNumber)}));
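As far as I can tell, the original attempt failed because a quoted string in a Ruta rule is matched literally, so "[a-zA-Z]" looks for exactly that text instead of being read as a pattern; the REGEXP condition is what applies a regular expression to the covered text. Putting the pieces together, a minimal sketch of the whole script could look like the following. It assumes that Street is already declared and annotated elsewhere (that part is not shown in my question), and the DECLARE for HouseNumber is added here only so that the sketch is complete.

```
// Assumption: Street is declared and annotated elsewhere in the script.
DECLARE CHARS, HouseNumber;

// Mark every word token whose covered text is exactly one ASCII letter.
W{REGEXP("[a-zA-Z]") -> MARK(CHARS)};

// Street, optional period, then a number with an optional letter,
// optionally followed by "/" or "-" and a second number with optional letter.
(Street PERIOD? ((NUM CHARS? (("/"|"-") NUM CHARS?)?) {-> MARK(HouseNumber)}));
```

The order matters: the CHARS rule has to run before the HouseNumber rule, otherwise there are no CHARS annotations for the second rule to match.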