Tags: python, c++, expression, sympy, analytical

C++: Extracting symbols/variables of an analytical mathematical expression


I have expressions that can be provided by the user, such as:

 a*sin(w*t) 
 a+b/c
 x^2+y^2/2

I would like to get just the list of variables in each expression; I don't need to do any substitutions. For the first formula that would be {a,w,t}, for the second one {a,b,c}, and for the last one {x,y}.

The expression is primarily written to be parsed with SymPy, but I need to be able to get the list of variables in C++ for some checks.

What's the easiest way to do this? How would you tackle this problem?


Solution

  • Given the input const string input, we can collect the variables into a set<string> with a regex:

    \b([a-zA-Z]\w*)(?:[^(a-zA-Z0-9_]|$)

    You could use this in C++ as follows:

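    // Needs <regex>, <set>, <string> and <iterator>; assumes using namespace std.
    const regex re{ "\\b([a-zA-Z]\\w*)(?:[^(a-zA-Z0-9_]|$)" };
    // Capture group 1 (the identifier) of every match is inserted into the set.
    const set<string> output{ sregex_token_iterator(cbegin(input), cend(input), re, 1), sregex_token_iterator() };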
    


    EDIT:

    regex explanation:
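Read piece by piece, the pattern above starts at a word boundary, captures a letter followed by any number of letters, digits or underscores, and then requires the next character to be neither an opening parenthesis nor another word character (or the end of the input). That last condition is what keeps function names such as sin out of the result. The following is a minimal sketch that wraps the same regex in a function; the name extract_variables is hypothetical and not part of the original answer.

```cpp
#include <regex>
#include <set>
#include <string>

// Hypothetical helper (name chosen for illustration only): returns every
// identifier in `input` that is not immediately followed by '('.
std::set<std::string> extract_variables(const std::string& input) {
    // \b                     start at a word boundary
    // ([a-zA-Z]\w*)          group 1: a letter, then letters, digits or '_'
    // (?:[^(a-zA-Z0-9_]|$)   next char is not '(' or a word char, or input ends
    static const std::regex re{"\\b([a-zA-Z]\\w*)(?:[^(a-zA-Z0-9_]|$)"};

    // Iterate over capture group 1 of every match and collect into a set,
    // which removes duplicates and keeps the names sorted.
    return std::set<std::string>(
        std::sregex_token_iterator(input.cbegin(), input.cend(), re, 1),
        std::sregex_token_iterator());
}
```

With the three expressions from the question, extract_variables("a*sin(w*t)") should give a, t, w; "a+b/c" should give a, b, c; and "x^2+y^2/2" should give x, y. Note that this is purely textual: a function name separated from its parenthesis by whitespace, as in "sin (x)", would be reported as a variable, and named constants such as pi are not distinguished from variables.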