neural-network, nlp, dynet

CPU memory allocation failed in Dynet


I am not sure why I am running out of memory. I take Goldberg's parser, and all I do is change this line:

scores, exprs = self.__evaluate(conll_sentence, True)

and add a for loop around it to repeat it K times:

for k in xrange(K):
    scores, exprs = self.__evaluate(conll_sentence, True)
    # do something

Then in getExpr, I do the following:

# sample small Gaussian noise with the same shape as each trained parameter
samples_out = np.random.normal(0, 0.001, (1, self.hidden_units))
samples_FOH = np.random.normal(0, 0.001, (self.hidden_units, self.ldims * 2))
samples_FOM = np.random.normal(0, 0.001, (self.hidden_units, self.ldims * 2))
samples_Bias = np.random.normal(0, 0.001, (self.hidden_units))

# add the noise to the trained parameter expressions
XoutLayer = self.outLayer.expr()+inputTensor(samples_out)
XhidLayerFOH = self.hidLayerFOH.expr()+inputTensor(samples_FOH)
XhidLayerFOM = self.hidLayerFOM.expr()+inputTensor(samples_FOM)
XhidBias = self.hidBias.expr()+inputTensor(samples_Bias)

if sentence[i].headfov is None:
    sentence[i].headfov = XhidLayerFOH * concatenate([sentence[i].lstms[0], sentence[i].lstms[1]])
if sentence[j].modfov is None:
    sentence[j].modfov  = XhidLayerFOM * concatenate([sentence[j].lstms[0], sentence[j].lstms[1]])

output = XoutLayer * self.activation(sentence[i].headfov + sentence[j].modfov + XhidBias)
return output

Essentially, what happens in the block above is that normally distributed noise is first generated and then added to the trained values. But it seems that somewhere along the way all the generated values stay in memory, and it just runs out of memory. Does anyone know why?


Solution

  • Dynet expressions stay in memory until the next call to renew_cg().

    So the fix would be to call it after each iteration of your loop, provided that you have retrieved all the information you need from the computation graph.
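
    As a standalone sketch of that pattern (toy shapes and hypothetical names, not the parser's actual code), with the graph renewed once the needed values have been extracted:

    import numpy as np
    from dynet import *

    hidden_units = 100  # hypothetical size, purely for illustration

    for k in xrange(1000):
        noise = inputTensor(np.random.normal(0, 0.001, (1, hidden_units)))
        base = inputTensor(np.ones((1, hidden_units)))
        out = base + noise         # builds a graph node; nothing is computed yet
        result = out.npvalue()     # pull the values you still need out as plain numpy
        # ... do something with result ...
        renew_cg()                 # discard the graph so its expressions can be freed

    In your loop this also means that anything cached from the old graph (e.g. sentence[i].headfov) has to be recomputed after the renewal, since those expressions belong to the discarded graph.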

    Side note: when you do a simple addition, such as:

    XoutLayer = self.outLayer.expr()+inputTensor(samples_out)
    

    no addition is actually performed. You just create a new expression and specify how to evaluate it from other expressions. The actual computation is performed when .forward() (or .value(), etc.) is called on XoutLayer, or on an expression whose computation depends on XoutLayer. So Dynet needs to allocate memory for all expressions in the current computation graph.
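
    A tiny standalone illustration of this lazy evaluation (toy shapes, not the parser's code):

    import numpy as np
    from dynet import *

    a = inputTensor(np.random.normal(0, 0.001, (1, 4)))
    b = inputTensor(np.ones((1, 4)))

    c = a + b            # only adds a node to the computation graph
    print(c.npvalue())   # the forward pass runs here and actually computes a + b

    Every expression created between two calls to renew_cg(), including each inputTensor(...) that holds the noise samples, occupies graph memory until the graph is renewed.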