I have expressions that can be provided by the user, such as:
a*sin(w*t)
a+b/c
x^2+y^2/2
And I would like to just get the list of variables there. I don't need to do any substitutions. So, for the first formula it's going to be {a,w,t}. For the second one {a,b,c}, and for the last one {x,y}.
The expression is primarily written to be parsed with SymPy, but I need to be able to get the list of variables in C++ for some checks. I would like to use something like muparser, but I don't know if any of these provide this functionality. What's the easiest way to do this? How would you tackle this problem?
Given the input const string input, we can collect our variables into a set<string> with a regex:
\b([a-zA-Z]\w*)(?:[^(a-zA-Z0-9_]|$)
You could use this in C++ as follows:
const regex re{ "\\b([a-zA-Z]\\w*)(?:[^(a-zA-Z0-9_]|$)" };
const set<string> output{ sregex_token_iterator(cbegin(input), cend(input), re, 1), sregex_token_iterator() };
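For anyone who wants to drop this into a project, here is a minimal sketch that wraps the two lines above in a function. The helper name extract_variables is my own invention for illustration, the std:: names are fully qualified instead of relying on using namespace std, and a raw string literal replaces the doubled backslashes; the regex itself is unchanged.

```cpp
#include <regex>
#include <set>
#include <string>

// Hypothetical helper (the name is an assumption, not part of any library):
// returns the distinct identifiers in `input` that are not immediately
// followed by '(', i.e. variable names but not function names such as sin.
std::set<std::string> extract_variables(const std::string& input) {
    // Same pattern as above, written as a raw string literal.
    static const std::regex re{ R"(\b([a-zA-Z]\w*)(?:[^(a-zA-Z0-9_]|$))" };

    // Submatch index 1 selects the captured identifier rather than the
    // whole match, which also contains the trailing delimiter.
    std::sregex_token_iterator first(input.cbegin(), input.cend(), re, 1);
    std::sregex_token_iterator last;
    return std::set<std::string>(first, last);
}
```

Calling extract_variables("a*sin(w*t)") yields a set holding a, t and w. Since std::set keeps its elements sorted and unique, a variable that appears several times in the expression is reported once.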
EDIT: regex explanation:
\b - asserts a word boundary, i.e. a \W character or the beginning or end of the string
([a-zA-Z] - captures anything beginning with an alphabetic character
\w*) - followed by any number of "word" characters
(?: - specifies the start of my non-capturing group, which holds two options
[^(a-zA-Z0-9_] - the 1st option is a non-open-parenthesis \W character
|$) - the other option is that the end of the input has been reached
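To see the difference between what the regex matches and what it captures, a small sketch like the following can help. The helper name match_details is hypothetical; it simply records, for every match, the whole match (group 0) next to the captured identifier (group 1).

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only: for each match, stores the
// pair (whole match, captured identifier).
std::vector<std::pair<std::string, std::string>>
match_details(const std::string& input) {
    static const std::regex re{ R"(\b([a-zA-Z]\w*)(?:[^(a-zA-Z0-9_]|$))" };

    std::vector<std::pair<std::string, std::string>> details;
    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        // (*it)[0] is the identifier plus the delimiter consumed by the
        // non-capturing group; (*it)[1] is the identifier alone.
        details.emplace_back((*it)[0].str(), (*it)[1].str());
    }
    return details;
}
```

For a*sin(w*t) this gives three pairs: ("a*", "a"), ("w*", "w") and ("t)", "t"). There is no entry for sin: the only character that could follow it is the open parenthesis, which the first option rejects, and shortening the identifier to si or s fails too because the next character is then a letter.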