Flux variability analysis only for transport reactions between compartments?

I would like to do a FVA only for selected reactions, in my case on transport reactions between compartments (e.g. between the cytosol and mitochondrion). I know that I can use selected_reactions in doFVA like this:

import cbmpy as cbm

mod = cbm.CBRead.readSBML3FBC('iMM904.xml.gz')

cbm.doFVA(mod, selected_reactions=['R_FORtm', 'R_CO2tm'])

Is there a way to get the entire list of transport reactions, not only the two I manually added? I thought about selecting the reactions based on their ending tm but that fails for 'R_ORNt3m' (and probably other reactions, too).

I want to share this model with others. What is the best way of storing the information in the SBML file? Currently, I would store the information in the reaction annotation as in this answer. For example

mod.getReaction('R_FORtm').setAnnotation('FVA', 'yes')

which could be parsed.

Solution

There is no built-in function for this kind of task. As you already mentioned, relying on the IDs is generally not a good idea as those can differ between different databases, models and groups (e.g. if someone decided just to enumerate reactions from r1 till rn and or metabolites from m1 till mm, filtering based on IDs fails). Instead, one can make use of the compartment field of the species. In CBMPy you can access a species' compartment by doing

import cbmpy as cbm
import pandas as pd

mod = cbm.CBRead.readSBML3FBC('iMM904.xml.gz')

mod.getSpecies('M_atp_c').getCompartmentId()
# will return 'c'

# run a FBA
cbm.doFBA(mod)

This can be used to find all fluxes between compartments as one can check for each reaction in which compartment their reagents are located. A possible implementation could look as follows:

def get_fluxes_associated_with_compartments(model_object, compartments, return_values=True):

    # check whether provided compartment IDs are valid
    if not isinstance(compartments, (list, set) or not set(compartments).issubset(model_object.getCompartmentIds())):
        raise ValueError("Please provide valid compartment IDs as a list!")
    else:
        compartments = set(compartments)

    # all reactions in the model
    model_reactions = model_object.getReactionIds()

    # check whether provided compartments are identical with the ones of the reagents of a reaction
    return_reaction_ids = [ri for ri in model_reactions if compartments == set(si.getCompartmentId() for si in
                           model_object.getReaction(ri).getSpeciesObj())]

    # return reaction along with its value
    if return_values:
        return {ri: model_object.getReaction(ri).getValue() for ri in return_reaction_ids}

    # return only a list with reaction IDs
    return return_reaction_ids

So you pass your model object and a list of compartments and then for each reaction it is checked whether there is at least one reagent located in the specified compartments.

In your case you would use it as follows:

# compartment IDs for mitochondria and cytosol
comps = ['c', 'm']

# you only want the reaction IDs; remove the ', return_values=False' part if you also want the corresponding values
trans_cyt_mit = get_fluxes_associated_with_compartments(mod, ['c', 'm'], return_values=False)

The list trans_cyt_mit will then contain all desired reaction IDs (also the two you specified in your question) which you can then pass to the doFVA function.

About the second part of your question. I highly recommend to store those reactions in a group rather than using annotation:

# create an empty group
mod.createGroup('group_trans_cyt_mit')

# get the group object so that we can manipulate it
cyt_mit = mod.getGroup('group_trans_cyt_mit')

# we can only add objects to a group so we get the reaction object for each transport reaction
reaction_objects = [mod.getReaction(ri) for ri in trans_cyt_mit]

# add all the reaction objects to the group
cyt_mit.addMember(reaction_objects)

When you now export the model, e.g. by using

cbm.CBWrite.writeSBML3FBCV2(mod, 'iMM904_with_groups.xml')

this group will be stored as well in SBML. If a colleague reads the SBML again, he/she can then easily run a FVA for the same reactions by accessing the group members which is far easier than parsing the annotation:

# do an FVA; fva_res: Reaction, Reduced Costs, Variability Min, Variability Max, abs(Max-Min), MinStatus, MaxStatus
fva_res, rea_names = cbm.doFVA(mod, selected_reactions=mod.getGroup('group_trans_cyt_mit').getMemberIDs())
fva_dict = dict(zip(rea_names, fva_res.tolist()))

# store results in a dataframe which makes the selection of reactions easier
fva_df = pd.DataFrame.from_dict(fva_dict, orient='index')
fva_df = fva_df.rename({0: "flux_value", 1: "reduced_cost_unscaled", 2: "variability_min", 3: "variability_max",
                       4: "abs_diff_var", 5: "min_status", 6: "max_status"}, axis='columns')

Now you can easily query the dataframe and find the flexible and not flexible reactions within your group:

# filter the reactions with flexibility
fva_flex = fva_df.query("abs_diff_var > 10 ** (-4)")

# filter the reactions that are not flexible
fva_not_flex = fva_df.query("abs_diff_var <= 10 ** (-4)")