Introducing a YAML string in a Python script

I am working on Python code that reads a YAML file and generates a rule-based model in PySB.

A new rule in the YAML file is specified like:

--- !rule
name: L_binds_R
reaction:
    L(unbound) + R(inactive) >> L(bound)%R(active)
rates:
    - Kf

From this I create a PyYAML object in Python (PyYAML is a package for working with YAML in Python), and the reaction attribute is stored as a string.

The rule in PySB then needs to be specified as:

# Rule(name, reaction, constant)
Rule('L_binds_R', L(unbound) + R(inactive) >> L(bound)%R(active), Kf)

My problem is that the 'reaction' field in the YAML is stored as a string in the Python object, but PySB only accepts the reaction written as a plain (unquoted) Python expression.

I have checked PySB and the reaction field cannot be a string under any circumstances, and I did not find a way to escape the formatting of variables in YAML.

Any idea how to fix the problem?

Solution

You could approach this one of two ways: restructuring your YAML file to tokenise the reaction rules, or using eval in Python.

Tokenised reaction rules

The best approach would be to structure your YAML file such that your reaction rule is already specified in individual tokens, rather than just one field for the whole reaction, e.g.

--- !rule
name: L_binds_R
reaction:
    reactants:
        - name: L
          site: b
        - name: R
          site: b
          state: inactive
    products:
        - name: L
          site: b
          bond: 1
        - name: R
          site: b
          bond: 1
          state: active
    fwd_rate: kf

You could then write a parser to translate this into the following PySB rule, building the ReactionPattern using the classes in PySB core (MonomerPattern, ComplexPattern and so on):

Rule('L_binds_R', L(b=None) + R(b='inactive') >> L(b=1) % R(b=('active', 1)), kf)
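
A parser along those lines could be sketched roughly as follows. This is only an illustration: the helper names (site_value, build_pattern, build_reaction) and the dict layout with "reactants" and "products" lists are assumptions, not part of PySB or PyYAML. It also assumes that the reactants are separate species (joined with +) and that the products form a single complex (joined with %). The monomers argument is any mapping from a monomer name to a callable that returns a pattern.

```python
from functools import reduce
import operator


def site_value(entry):
    """Map one tokenised entry to the value used for its site condition."""
    state, bond = entry.get("state"), entry.get("bond")
    if state is not None and bond is not None:
        return (state, bond)  # e.g. b=('active', 1)
    # Either b='inactive', b=1, or b=None (unbound, no state given)
    return state if state is not None else bond


def build_pattern(entry, monomers):
    """Call the named monomer with its site condition, e.g. L(b=None)."""
    return monomers[entry["name"]](**{entry["site"]: site_value(entry)})


def build_reaction(reaction, monomers):
    """Return `reactants >> products` for a tokenised reaction dict.

    Assumption: reactants are joined with + and products with %.
    """
    lhs = reduce(operator.add,
                 (build_pattern(e, monomers) for e in reaction["reactants"]))
    rhs = reduce(operator.mod,
                 (build_pattern(e, monomers) for e in reaction["products"]))
    return lhs >> rhs


# Hypothetical input, shaped like the dict a YAML loader might return
reaction = {
    "reactants": [
        {"name": "L", "site": "b"},
        {"name": "R", "site": "b", "state": "inactive"},
    ],
    "products": [
        {"name": "L", "site": "b", "bond": 1},
        {"name": "R", "site": "b", "state": "active", "bond": 1},
    ],
}
```

With PySB itself, I would expect something like Rule('L_binds_R', build_reaction(reaction, model.monomers), kf) to work once the monomers L and R and the parameter kf are defined, since the model's monomer collection can be looked up by name, but treat that as untested.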

If you have control over the code where the YAML is coming from, you might find it easier to either output PySB code directly, or perhaps write to a standard like SBML, which PySB can now read.

You might find it helpful to look at the PySB BioNetGen language (BNGL) parser I wrote, which creates a PySB model from a BioNetGen XML file, as an example of how to create a model from an external file.

Using `eval`

The alternative is to use eval. While this is the easier solution, it is strongly discouraged for security reasons*. However, if the YAML files are all generated by you or your own code and you just want a quick fix, this would do it.

Here’s an example:

# You would read these in from the YAML file, but I’ll just define
# the strings here for simplicity
reaction_name = "L_binds_R"
reaction_str = "L(b=None) + R(b='inactive') >> L(b=1) % R(b=('active', 1))"
reaction_fwd_rate = "Kf"

Rule(reaction_name, eval(reaction_str), eval(reaction_fwd_rate))
# Python output 
# (assumes Monomers L and R and parameter Kf are already defined):
# >>> Rule('L_binds_R', L(b=None) + R(b='inactive') >> L(b=1) % R(b=('active', 1)), Kf)

*Consider the case where your YAML contained something like:

reaction:
    __import__('shutil').rmtree(__import__('os').path.expanduser('~'))

Importing that YAML file and calling eval on that field would delete your home directory! (eval only accepts expressions, so a plain import statement would fail with a SyntaxError, but __import__ gets around that.) eval will execute arbitrary Python code by design. It should only be used where the source file is completely trusted. In general you should always "sanitise your inputs" (assume inputs are dangerous until proven otherwise).
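
If you do go the eval route, one way to sanitise the input is to parse the string with the standard ast module first and reject anything that is not a plain rule expression. The sketch below is a minimal illustration, not a complete sandbox: the function names and the whitelist of node types are my own assumptions, chosen to cover calls with keyword arguments, constants, tuples and the +, % and >> operators used in the rule above.

```python
import ast

# Syntax permitted in a reaction string (an assumed whitelist; extend as needed)
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.Call, ast.Name, ast.Load,
    ast.keyword, ast.Constant, ast.Tuple,
    ast.Add, ast.Mod, ast.RShift,
)


def check_reaction_string(expr, allowed_names):
    """Parse expr and raise ValueError on any syntax or name not whitelisted."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Name) and node.id not in allowed_names:
            raise ValueError(f"unknown name: {node.id}")
    return tree


def eval_reaction(expr, namespace):
    """Evaluate a checked reaction string using only the names in namespace."""
    tree = check_reaction_string(expr, set(namespace))
    code = compile(tree, "<reaction>", "eval")
    # Builtins are emptied so that only the supplied names are reachable
    return eval(code, {"__builtins__": {}}, dict(namespace))
```

Here namespace would be something like {'L': L, 'R': R} for the monomers defined in your model. Attribute access and unknown names such as __import__ are rejected before anything is evaluated.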
