I am trying to split a C program into its function blocks. I tried using the regex library to split on (){ but that did not work, and I am not sure where to begin. For example, given:
string = """
int first(){
    if () {
    }
}
customtype second(){
    if () {
    }
    for(){
    }
}
fdfndfndfnlkfe
"""
I want the result to be a list that has each function block as an element: ['int first(){ ... }', 'customtype second(){ ... }']
I tried the following, but I am getting None:
import regex
import re
reg = r"""^[^()\n]+\([^()]*\)\s*
\{
(?:[^{}]*|(?R))+
\}"""
print(regex.match(reg, string))
First of all: don't. Use a parser instead.
Second, if you insist, and to see why you should use a parser instead, have a look at this recursive approach (which only works with the newer regex module, not the built-in re):
^[^()\n]+\([^()]*\)\s*
\{
(?:[^{}]*|(?R))+
\}
See a demo on regex101.com. Note that this breaks as soon as a comment or string literal contains curly braces.
In Python this would be:
import regex as re
reg = re.compile(r"""^[^()\n]+\([^()]*\)\s*
\{
(?:[^{}]*|(?R))+
\}""", re.VERBOSE | re.MULTILINE)
for function in reg.finditer(string):
    print(function.group(0))
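If you want to see roughly what "use a parser" buys you without pulling in a full C parser, here is a minimal sketch of a different technique: a hand-written brace-depth scanner using only the standard library. It is not the regex approach above and not a real C parser. The name split_functions is made up for this example, and it assumes that anything at the top level whose text before the opening brace ends in ")" is a function definition. Unlike the regex, it skips comments and string/char literals, so braces inside them do not confuse it.

```python
def split_functions(source):
    """Return top-level `header(...) { ... }` blocks found in C-like source.

    Hypothetical helper: a sketch, not a C parser. It ignores the
    preprocessor and treats any top-level block whose header ends in ")"
    as a function definition.
    """
    blocks = []
    depth = 0      # current brace nesting level
    start = 0      # index where the current top-level item begins
    open_at = 0    # index of the "{" that opened the current top-level block
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == "//":                      # line comment: skip to end of line
            end = source.find("\n", i)
            i = n if end == -1 else end
            continue
        if two == "/*":                      # block comment: skip past "*/"
            end = source.find("*/", i + 2)
            i = n if end == -1 else end + 2
            continue
        if ch in "\"'":                      # string or char literal
            i += 1
            while i < n and source[i] != ch:
                i += 2 if source[i] == "\\" else 1   # step over escapes
            i += 1                           # step over the closing quote
            continue
        if ch == "{":
            if depth == 0:
                open_at = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:                   # a top-level block just closed
                header = source[start:open_at].rstrip()
                if header.endswith(")"):     # assumed to be a function
                    blocks.append(source[start:i + 1].strip())
                start = i + 1
        elif ch == ";" and depth == 0:
            start = i + 1                    # a top-level declaration ended
        i += 1
    return blocks


if __name__ == "__main__":
    sample = """
int first(){
    if () {
    }
}
customtype second(){
    if () {
    }
    for(){
    }
}
fdfndfndfnlkfe
"""
    for block in split_functions(sample):
        print(block)
        print("-----")
```

On the sample from the question this returns a two-element list, one string per function, and the trailing junk line is ignored. It still has obvious gaps (macros, junk text before a function header, K&R-style definitions), which is the point of recommending a real parser.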