I'm trying to understand the Snowball stemming algorithm. HW90 asked a similar question that included examples, but it doesn't cover my case. The algorithm uses two regions, R1 and R2, which are defined as follows:
R1 is the region after the first non-vowel following a vowel, or is the null region at the end of the word if there is no such non-vowel.
R2 is the region after the first non-vowel following a vowel in R1, or is the null region at the end of the word if there is no such non-vowel.
I don't understand what "the null region at the end of the word" is. Could anybody give me some examples of it, please?
A null region is an empty region: it contains no letters. You may have missed the examples on the documentation page:
Below, R1 and R2 are shown for a number of English words,
    b   e   a   u   t   i   f   u   l
                      |<------------->|    R1
                              |<----->|    R2
Letter t is the first non-vowel following a vowel in beautiful, so R1 is iful. In iful, the letter f is the first non-vowel following a vowel, so R2 is ul.
    b   e   a   u   t   y
                      |<->|    R1
                        ->|<-    R2
In beauty, the last letter y is classed as a vowel. Again, letter t is the first non-vowel following a vowel, so R1 is just the last letter, y. R1 contains no non-vowel, so R2 is the null region at the end of the word.
    b   e   a   u
                ->|<-    R1
                ->|<-    R2
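If it helps to see the definitions as code, here is a minimal Python sketch that applies them literally. The names `VOWELS`, `region_start` and `r1_r2` are made up for this illustration and are not part of Snowball. It assumes the vowels are a, e, i, o, u and y, as in the examples above, and it ignores any extra special cases that a real stemmer for a given language may apply.

```python
# Assumption for this sketch: the vowel set used in the English examples above.
VOWELS = set("aeiouy")


def region_start(word, start=0):
    """Return the index just past the first non-vowel that follows a vowel,
    looking only at word[start:]. Return len(word) if there is none, which
    corresponds to the null (empty) region at the end of the word."""
    for i in range(start + 1, len(word)):
        if word[i - 1] in VOWELS and word[i] not in VOWELS:
            return i + 1
    return len(word)


def r1_r2(word):
    """Return the pair (R1, R2) as strings; an empty string is a null region."""
    r1 = region_start(word)
    r2 = region_start(word, r1)  # the same rule, applied inside R1
    return word[r1:], word[r2:]


for w in ("beautiful", "beauty", "beau"):
    print(w, r1_r2(w))
```

This prints `('iful', 'ul')` for beautiful, `('y', '')` for beauty and `('', '')` for beau. An empty string here is exactly the "null region at the end of the word": the region starts at the end of the word, so it contains no letters.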