Tags: optimization, openmdao, robustness

nesting openmdao "assemblies"/drivers - working from a 0.13 analogy, is this possible to implement in 1.X?


I am using NREL's DAKOTA_driver openmdao plugin for parallelized Monte Carlo sampling of a model. In 0.X, I was able to nest assemblies, allowing an outer optimization driver to direct the DAKOTA_driver sampling evaluations. Is it possible for me to nest this setup within an outer optimizer? I would like the outer optimizer's workflow to call the DAKOTA_driver "assembly" then the get_dakota_output component.

import pandas as pd
import subprocess
from subprocess import call
import os
import numpy as np
from dakota_driver.driver import pydakdriver
from openmdao.api import IndepVarComp, Component, Problem, Group

from mpi4py import MPI
import sys
from itertools import takewhile
sigm = .005
n_samps = 20
X_bar = [0.065, sigm]  # 2.505463e+03*.05
dacout = 'dak.sout'


class get_dak_output(Component):
    mean_coe = 0

    def execute(self):
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        nam = 'ape.net_aep'
        csize = 10000

        # count the lines in the DAKOTA output file
        with open(dacout) as f:
            for i, l in enumerate(f):
                pass
        numlines = i

        # read the file in chunks; the bogus separator keeps each line in one field
        dakchunks = pd.read_csv(dacout, skiprows=0, chunksize=csize,
                                sep='there_are_no_seperators')
        linespassed = 0
        vals = []
        for dchunk in dakchunks:
            for line in dchunk.values:
                linespassed += 1
                # skip the DAKOTA header and footer sections
                if linespassed < 49 or linespassed > numlines - 50:
                    continue
                split_line = ''.join(str(s) for s in line).split()
                # keep only "value ape.net_aep" rows that hold a valid number
                if (len(split_line) == 2 and
                        split_line[0] not in ('nan', '-nan') and
                        split_line[1] == nam):
                    vals.append(float(split_line[0]))

        self.coe_vals = sorted(vals)
        self.mean_coe = np.mean(self.coe_vals)


class ape(Component):
    def __init__(self):
        super(ape, self).__init__()
        self.add_param('x', val=0.0)
        self.add_output('net_aep', val=0.0)

    def solve_nonlinear(self, params, unknowns, resids):
        print('hello')
        x = params['x']
        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        # run the external model and grab the last number it prints
        outp = subprocess.check_output("python test/exampleCall.py %f" % float(x),
                                       shell=True)
        unknowns['net_aep'] = float(outp.split()[-1])


top = Problem()

root = top.root = Group()

root.add('ape', ape())
root.add('p1', IndepVarComp('x', 13.0))
root.connect('p1.x', 'ape.x')

drives = pydakdriver(name='top.driver')
drives.UQ('sampling', use_seed=False)
#drives.UQ()
top.driver = drives
#top.driver = ScipyOptimizer()
#top.driver.options['optimizer'] = 'SLSQP'

top.driver.add_special_distribution('p1.x', 'normal', mean=0.065, std_dev=0.01, lower_bounds=-50, upper_bounds=50)
top.driver.samples = n_samps
top.driver.stdout = dacout
#top.driver.add_desvar('p2.y', lower=-50, upper=50)
#top.driver.add_objective('ape.f_xy')
top.driver.add_objective('ape.net_aep')

top.setup()


top.run()
# post-process the DAKOTA output file to get the sample statistics
bak = get_dak_output()
bak.execute()

print('\n')
print('E(aep) is %f' % bak.mean_coe)

Solution

  • There are two different options for this situation. Both will work in parallel, and both are currently supported. But only one of them will work if you ever want to use analytic derivatives:

    1) Nested Problems: You create one problem class that has a DOE driver in it. You pass the list of cases you want run into that driver, and it runs them in parallel. Then you put that problem into a parent problem as a component.

    The parent problem doesn't know that it has a sub-problem. It just thinks it has a single component that uses multiple processors.

    This is the closest to how you would have done it in 0.x. However, I don't recommend going this route because it won't work if you ever want to use analytic derivatives.

    If you go this way, the dakota driver can stay pretty much as-is, but you'll have to use a special sub-problem class. This isn't an officially supported feature yet, but it's very doable. (A minimal sketch of this nested-problem pattern is the first example after this list.)

    2) Using a multi-point approach, you would create a Group class that represents your model. You would then create one instance of that group for each Monte Carlo run you want to do. You put all of these instances into a parallel group inside your overall problem.

    This approach avoids the sub-problem messiness. It's also much more efficient for actual execution. It will have a somewhat higher setup cost than the first method, but in my opinion it's well worth the one-time cost to get the advantage of analytic derivatives. The only issue is that it would probably require some changes to the way the dakota_driver works: you would want to get a list of evaluations from the driver, then hand them out to the individual child groups. (A sketch of this multi-point layout is the second example after this list.)
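
Below is a minimal sketch of the first option (an inner Problem wrapped as a component), written against the OpenMDAO 1.x API. The SubProbComp class and the ExecComp stand-in model are illustrative assumptions, not part of the plugin; in the real setup the inner Problem would hold the ape component plus the pydakdriver sampling driver from the question, and solve_nonlinear would post-process dak.sout (e.g. with get_dak_output) to hand a statistic back to the outer optimizer.

# sketch only: SubProbComp and the ExecComp model are illustrative stand-ins
from openmdao.api import Component, Problem, Group, IndepVarComp, ExecComp

class SubProbComp(Component):
    """Presents an entire inner Problem to the parent as a single component."""
    def __init__(self):
        super(SubProbComp, self).__init__()
        self.add_param('x', val=0.0)        # value handed down by the outer driver
        self.add_output('mean_f', val=0.0)  # statistic pulled out of the inner run

        # build and set up the inner problem once
        self._prob = Problem(root=Group())
        self._prob.root.add('p1', IndepVarComp('x', 0.0))
        self._prob.root.add('model', ExecComp('f = 2.0*x'))  # stand-in for ape()
        self._prob.root.connect('p1.x', 'model.x')
        # self._prob.driver = pydakdriver(...)  # the DAKOTA sampling driver would go here
        self._prob.setup(check=False)

    def solve_nonlinear(self, params, unknowns, resids):
        self._prob['p1.x'] = params['x']  # push the outer value into the sub-problem
        self._prob.run()                  # in the real case this runs the full sampling study
        unknowns['mean_f'] = self._prob['model.f']

# the parent problem sees a single component and can put any optimizer on top of it
outer = Problem(root=Group())
outer.root.add('des', IndepVarComp('x', 3.0))
outer.root.add('sub', SubProbComp())
outer.root.connect('des.x', 'sub.x')
outer.setup(check=False)
outer.run()
print(outer['sub.mean_f'])  # 6.0 for the stand-in model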
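
And a sketch of the second, multi-point option, also against the 1.x API. AEPGroup, the ExecComp stand-in, and the hard-coded sample array are assumptions for illustration; in the real setup each copy would contain the ape model and the sample values would come from the dakota driver's evaluation list rather than straight from numpy.

# sketch only: one model copy per sample, all inside a ParallelGroup
import numpy as np
from openmdao.api import Group, ParallelGroup, Problem, IndepVarComp, ExecComp

class AEPGroup(Group):
    """One copy of the model, evaluated at a single sample point."""
    def __init__(self, x_sample):
        super(AEPGroup, self).__init__()
        self.add('p1', IndepVarComp('x', float(x_sample)))
        self.add('model', ExecComp('f = 2.0*x'))  # stand-in for ape()
        self.connect('p1.x', 'model.x')

n_samps = 4
samples = np.random.normal(0.065, 0.005, n_samps)  # would come from the dakota driver

root = Group()
par = ParallelGroup()
root.add('par', par)
for i, xs in enumerate(samples):
    par.add('case_%d' % i, AEPGroup(xs))  # one instance per Monte Carlo sample

# aggregate the individual evaluations into a mean
agg_expr = 'mean_f = (%s)/%d.0' % (' + '.join('f%d' % i for i in range(n_samps)), n_samps)
root.add('agg', ExecComp(agg_expr))
for i in range(n_samps):
    root.connect('par.case_%d.model.f' % i, 'agg.f%d' % i)

top = Problem(root=root)
top.setup(check=False)
top.run()
print(top['agg.mean_f'])  # E(f) over the samples

Because the copies live in a ParallelGroup they can be spread across MPI processes, and since every evaluation is an ordinary component in the model, analytic derivatives can propagate through the whole stack.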