I am building rules for my Snort IDS and am trying to detect the Billion Laughs attack. The attack is nothing more than the recursive expansion of predefined entities. Snort rules may contain a pcre option, so I am trying to build a smart rule for this attack. Here is a simple form of the attack, with random lines in between the ENTITY lines:
<!DOCTYPE data [
<!ENTITY a0 "dos" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;">
<!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;">
<!ENTITY a1 "&a2;&a2;&a2;&a2;&a2;&a2;">
test
<!ENTITY a1 "&a2;&a2;&a2;&ertertert;&a2;&a2;">
<!ENTITY a1 "&a2;&a2;&a2;&ertertert;&a2;&a2;">
<!ENTITY a1 "&a2;&a2;&a2;&a2;&a2;&a2;">
d
dd
<html abc>
a
<!ENTITY a2 "&a3;&a3;&a3;&a3;&a3;">
<!ENTITY a1 "&a0;&a0;&a0;&a0;&d5;">
]>
<data>&a2;</data>
And this is my current rule:
(<!ENTITY\s[a-zA-Z0-9]*\s"(&[a-zA-Z0-9]+;){4,}">(\s?)[^]]*){5,}
To explain what I want to achieve:
The rule has to trigger whenever there are at least 5 ENTITY lines, each with at least 4 &-references. If all 5 lines directly follow one another, there is no problem, but the ENTITY lines do not need to be consecutive. So I have to consume everything else in between two ENTITY lines, which turns the whole thing into a big termination problem: [^]]* matches everything except a ], so it also swallows whole ENTITY lines and makes my {5,} quantifier totally useless. So far I can't find a good solution for this.
Thanks for your help guys!
You may use
(?s)<!ENTITY\s[a-z0-9]*\s"(&[a-zA-Z0-9]+;){4,}">(?:.*?<!ENTITY\s[a-z0-9]*\s"(&[a-zA-Z0-9]+;){4,}">){4,}
See the regex demo
Details
(?s) - DOTALL mode on, . now matches any chars
<!ENTITY - a literal <!ENTITY substring
\s - a whitespace
[a-z0-9]* - 0+ letters / digits
\s - a whitespace
" - a "
(&[a-zA-Z0-9]+;){4,} - 4 or more repetitions of &, 1+ alphanumeric chars and then ;
"> - a "> substring
(?: - start of a non-capturing group matching...
.*? - any 0+ chars, as few as possible
<!ENTITY\s[a-z0-9]*\s"(&[a-zA-Z0-9]+;){4,}"> - same pattern as above
){4,} - ... 4 or more times.
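To sanity-check the pattern outside Snort, here is a minimal sketch using Python's re module. Snort's pcre option uses PCRE rather than Python's engine, so this only exercises the regex logic; the constructs involved ((?s), the lazy .*? and the non-capturing group) behave the same way in both. The helper name looks_like_billion_laughs and both sample payloads are made up for illustration.

```python
import re

# The pattern from the answer, verbatim: one qualifying ENTITY declaration,
# then at least four more, with any text (matched lazily) in between.
PATTERN = re.compile(
    r'(?s)<!ENTITY\s[a-z0-9]*\s"(&[a-zA-Z0-9]+;){4,}">'
    r'(?:.*?<!ENTITY\s[a-z0-9]*\s"(&[a-zA-Z0-9]+;){4,}">){4,}'
)


def looks_like_billion_laughs(payload: str) -> bool:
    """Hypothetical helper: True if the payload holds at least 5 ENTITY
    declarations that each contain 4 or more &name; references."""
    return PATTERN.search(payload) is not None


# Five qualifying ENTITY lines, separated by unrelated lines.
attack = '''<!DOCTYPE data [
<!ENTITY a0 "dos" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;">
<!ENTITY a2 "&a1;&a1;&a1;&a1;&a1;&a1;">
test
<!ENTITY a3 "&a2;&a2;&a2;&a2;&a2;&a2;">
<html abc>
<!ENTITY a4 "&a3;&a3;&a3;&a3;&a3;">
<!ENTITY a5 "&a4;&a4;&a4;&a4;&a4;">
]>
<data>&a5;</data>'''

# Only two qualifying ENTITY lines: below the threshold of five.
benign = '''<!DOCTYPE data [
<!ENTITY a0 "dos" >
<!ENTITY a1 "&a0;&a0;&a0;&a0;">
some text
<!ENTITY a2 "&a1;&a1;&a1;&a1;">
]>
<data>&a2;</data>'''

if __name__ == "__main__":
    print(looks_like_billion_laughs(attack))
    print(looks_like_billion_laughs(benign))
```

Because .*? is lazy, each repetition of the group stops at the nearest following ENTITY declaration that qualifies, so the {4,} quantifier really counts declarations instead of letting one repetition swallow several of them.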