I have a few lines like this:
tag1:
 line1word1 lineoneanychar
 line2word1
tag2:
 line1word1 ....
 line2word1 .....
I am trying to build a Java regex that extracts all the data under each tag, i.e.:
String parsed1 = "line1word1 lineoneanychar\nline2word1";
String parsed2 = "line1word1 ....\nline2word1 .....";
I believe the right way to do this is using something like this, but I haven't quite got it right:
Pattern p = Pattern.compile("tag1:\n( {1}.*)\n(?!\\w+)", Pattern.DOTALL);
Matcher m = p.matcher(clean_data);
if (m.find()) {
    System.out.println(m.group(1));
}
Any help would be appreciated!
It could be something like this:
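import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagParser {
    public static void main(String[] args) {
        String input = "tag1:\n"
                + " line1word1 lineoneanychar\n"
                + " line2word1\n"
                + "tag2:\n"
                + " line1word1 ....\n"
                + " line2word1 .....\n";
        // A tag line, then one or more lines that start with whitespace (captured as group 1).
        Pattern p = Pattern.compile("tag\\d+:$\\n((?:^\\s.*?$\\n)+)", Pattern.DOTALL | Pattern.MULTILINE);
        Matcher m = p.matcher(input);
        // Each find() advances to the next tag block.
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}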
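Remember to escape backslashes inside a Java string literal: the regex \d has to be written as "\\d".
\d matches a digit
\s matches a whitespace character (space, tab, newline, ...)
(?:something) is a non-capturing group: it groups the expression without creating a numbered group in the matcher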
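If you want exactly the strings from the question (no leading space on each line, no trailing newline), here is a minimal sketch of one way to post-process the match. The class name TagBlocks and the method parse are made up for illustration. It assumes that tag names consist of word characters and that every data line starts with a space or tab.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBlocks {
    // Group 1: the tag name. Group 2: the following lines that start with a space or tab.
    // DOTALL is not needed here because '.' should stop at the end of each line.
    private static final Pattern BLOCK = Pattern.compile(
            "^(\\w+):\\n((?:[ \\t].*\\n?)+)", Pattern.MULTILINE);

    // Hypothetical helper: returns tag name -> block text, in input order.
    static Map<String, String> parse(String input) {
        Map<String, String> blocks = new LinkedHashMap<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            // Strip the indentation of every line, then the trailing newline.
            String body = m.group(2).replaceAll("(?m)^[ \\t]+", "").trim();
            blocks.put(m.group(1), body);
        }
        return blocks;
    }

    public static void main(String[] args) {
        String input = "tag1:\n line1word1 lineoneanychar\n line2word1\n"
                + "tag2:\n line1word1 ....\n line2word1 .....\n";
        parse(input).forEach((tag, body) -> System.out.println(tag + " -> " + body));
    }
}
```

Using [ \t] instead of \s keeps a blank line from being treated as part of a block, since \s also matches a newline.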