How to create a genome scale metabolic model in cbmpy from scratch?

So the previous question was apparently not wisely asked. So here is the rephrased original question:

I am planing on building up a new GSMM from nothong (i.e. I need a empty model, 0 reaction/metabolite/compartment/GPR/annotation, but with a proper model structure that toolbox will accept).

I used to work with Matlab and COBRA toolbox but recently changed into Python and cbmpy toolbox. I know how to create a model structure in Matlab through COBRA but not in python with cbmpy. I am really not familiar with cmbpy's functions and I am here ask for help.

There are ways to meet my needs:

The most stupid way: read an exist model and remove all its contents.

import cbmpy as cbm
mod = cbm.CBRead.readSBML3FBC('example.xml')
for ri in mod.getSpeciesIds():
    mod.deleteSpecies(ri, also_delete = 'reaction')

Similarly, different functions start with 'mod.delete....' will clean all the stuff. Eventually you will have a clean empty model.

As suggested by Cleb, I looked into the source code and found the function I need

import cbmpy as cbm
mod = cbm.CBModel.Model('Name')
mod.createSpecies('M_a_c',boundary=False,value=float('nan'),name='AA',compartment='c',charge=None,chemFormula='C2H2O2')
mod.createSpecies('M_b_c',boundary=False,value=float('nan'),name='BB',compartment='c',charge=None,chemFormula='CHO')
mod.createReaction('R_AB',reversible=False)
mod.createReactionReagent('R_AB','M_a_c',-1.0)
mod.createReactionReagent('R_AB','M_b_c',2)

Then a model object with ID 'Name' is created. It is a very simple question and its answer very straightforward, if you have found the right function. Later you may gradually add reaction/metabolites/etc and assign the objective reaction. It is of course also possible to directly use cbm.CBModelTools.quickDefaultBuild but you need to prepare necessary arguments for adding reactions/species/boundary/objective.

Solution

While you can indeed build a model as you described, it might be easier (and cleaner) to use

cbm.CBModelTools.quickDefaultBuild

I will show how to use it for a minimal example which is quite self-explanatory:

import cbmpy as cbm


def define_new_model():

    model_name = 'my_great_model'

    reactions = {
        'R01': {'id': 'R01', 'reversible': True, 'reagents': [(-1., 'A'), (1., 'A_ext')], 'SUBSYSTEM': ''},
        'R02': {'id': 'R02', 'reversible': True, 'reagents': [(-1., 'B'), (1., 'B_ext')], 'SUBSYSTEM': '',
                'GENE_ASSOCIATION': '(gene_1 or gene_2)'},
        'R_EX_A': {'id': 'R_EX_A', 'reversible': True, 'reagents': [(1., 'A_b'), (-1., 'A_ext')], 'SUBSYSTEM': ''},
        'R_EX_B': {'id': 'R_EX_B', 'reversible': True, 'reagents': [(1., 'B_b'), (-1., 'B_ext')], 'SUBSYSTEM': ''},
        'R_biomass': {'id': 'R_biomass', 'reversible': False, 'reagents': [(-1., 'A'), (-1., 'B')], 'SUBSYSTEM': ''}
    }

    species = {
        'A': {'id': 'A', 'boundary': False, 'SUBSYSTEM': 'C1'},
        'B': {'id': 'B', 'boundary': False, 'SUBSYSTEM': 'C1'},
        'A_ext': {'id': 'A_ext', 'boundary': False, 'SUBSYSTEM': 'C0'},
        'B_ext': {'id': 'B_ext', 'boundary': False, 'SUBSYSTEM': 'C0'},
        'A_b': {'id': 'A_b', 'boundary': True, 'SUBSYSTEM': ''},
        'B_b': {'id': 'B_b', 'boundary': True, 'SUBSYSTEM': ''},
    }

    bounds = {'R_EX_A': {'lower': -1., 'upper': 0.},
              'R_EX_B': {'lower': -1., 'upper': 0.}}

    objective_function = {'objMaxJ1': {'id': 'biomass_obj1',
                                       'flux': 'R_biomass',
                                       'coefficient': 1,
                                       'sense': 'Maximize',
                                       'active': True}}

    return model_name, reactions, species, bounds, objective_function


model_def = define_new_model()

mod = cbm.CBModelTools.quickDefaultBuild(*model_def)

Now you can call

mod.getReactionIds()
['R01', 'R02', 'R_EX_A', 'R_EX_B', 'R_biomass']

and

mod.getSpeciesIds()
['A', 'A_b', 'A_ext', 'B', 'B_b', 'B_ext']

and

mod.getBoundarySpeciesIds()
['A_b', 'B_b']

and you can also simulate the model

cbm.doFBA(mod)
analyzeModel objective value: 1.0

When you look for the genes you defined for reaction R02, you will see an empty list:

mod.getGeneIds()
[]

You will first have to call

mod.createGeneAssociationsFromAnnotations()

which prints

INFO: used key(s) '['GENE_ASSOCIATION']'
INFO: Added 2 new genes and 1 associations to model

And then

mod.getGeneIds()

will give the desired outcome

['gene_1', 'gene_2']

as well as

mod.getGPRforReaction('R02').getAssociationStr()

which returns

'((gene_1 or gene_2))'

I highly recommend to use this approach as this is easier to read and edit later on if you want to change the model definition.

If you want to add further annotation, you can check this question.