I have a question about how val_loss is calculated for a Keras model with multiple outputs. Here is an excerpt of my code.
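import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense
from keras.optimizers import Nadam
from keras.callbacks import CSVLogger
nBatchSize = 200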
nTimeSteps = 1
nInDims = 17
nHiddenDims = 10
nFinalDims = 10
nOutNum = 24
nTraLen = 300
nMaxEP = 20
nValLen = 50
sHisCSV = "history.csv"
oModel = Sequential()
oModel.add(Input(batch_input_shape=(nBatchSize, nTimeSteps, nInDims)))
oModel.add(LSTM(nHiddenDims, return_sequences=True, stateful=True))
oModel.add(LSTM(nHiddenDims, return_sequences=False, stateful=True))
oModel.add(Dense(nFinalDims, activation="relu"))
oModel.add(Dense(nOutNum, activation="linear"))
oModel.compile(loss="mse", optimizer=Nadam())
oModel.reset_states()
oHis = oModel.fit_generator(oDataGen, steps_per_epoch=nTraLen,
                            epochs=nMaxEP, shuffle=False,
                            validation_data=oDataGen, validation_steps=nValLen,
                            callbacks=[CSVLogger(sHisCSV, append=True)])
# number of cols is nOutNum(=24), number of rows is len(oEvaGen)
oPredDF = pd.DataFrame(oPredModel.predict_generator(oEvaGen, steps=len(oEvaGen)))
# oGTDF is a DataFrame of ground truth
nRMSE = np.sqrt(np.nanmean(np.array(np.power(oPredDF - oGTDF, 2))))
In history.csv, val_loss is reported as 3317.36. The RMSE calculated from the prediction result is 66.4.
As I understand the Keras specification, the val_loss written in history.csv is the mean MSE over the 24 outputs. Assuming that is correct, the RMSE computed from history.csv would be 11.76 (= sqrt(3317.36/24)), which is quite different from the value of nRMSE (= 66.4). On the other hand, sqrt(3317.36) = 57.6 is rather close to it.
Is my understanding of the Keras specification for val_loss incorrect?
Your first assumption is correct, but the derivation that follows from it goes wrong.
The MSE is the mean of the squared errors over the model's outputs, as you can see in the Keras documentation:
mean_squared_error
keras.losses.mean_squared_error(y_true, y_pred)
and in the Keras source code:
K.mean(K.square(y_pred - y_true), axis=-1)
The RMSE is therefore the square root of this value:
K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))
What you wrote divides the mean squared error by the number of outputs a second time before taking the square root, so it is not the RMSE.
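To make the reduction concrete, here is a minimal NumPy sketch of the same computation. The random arrays and the names n_samples and n_outputs are made up for illustration, and it assumes the reported loss is the plain average of the per-sample values, which will not reproduce every detail of how Keras aggregates over batches.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_outputs = 50, 24  # hypothetical sizes; n_outputs mirrors nOutNum

# Made-up ground truth and predictions, for illustration only
y_true = rng.normal(size=(n_samples, n_outputs))
y_pred = y_true + rng.normal(scale=2.0, size=(n_samples, n_outputs))

# Same reduction as K.mean(K.square(y_pred - y_true), axis=-1):
# one MSE value per sample, already averaged over the 24 outputs
per_sample_mse = np.mean(np.square(y_pred - y_true), axis=-1)

# Assumption: the scalar loss is the average of the per-sample values
loss = per_sample_mse.mean()

rmse_from_loss = np.sqrt(loss)                                # sqrt(val_loss)
rmse_direct = np.sqrt(np.mean(np.power(y_pred - y_true, 2)))  # formula from the question
rmse_divided_again = np.sqrt(loss / n_outputs)                # sqrt(val_loss / 24)

print(rmse_from_loss, rmse_direct, rmse_divided_again)
```

With equally sized rows, rmse_from_loss and rmse_direct agree, while rmse_divided_again is smaller by a factor of sqrt(24).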
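So, with the numbers from your example:
sqrt(3317.36/24) = 11.76 divides by the 24 outputs a second time, so it is not the RMSE
sqrt(3317.36) = 57.6 is the RMSE
Thus the RMSE implied by val_loss (57.6) is of the same magnitude as your nRMSE (66.4). Note that val_loss was computed on oDataGen while nRMSE was computed on oEvaGen, so the two values need not match exactly.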
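As a quick check of the arithmetic, the two candidate values can be recomputed from the reported val_loss. The variable names here are illustrative only.

```python
import math

val_loss = 3317.36  # value reported in history.csv
n_outputs = 24      # nOutNum in the question

rmse = math.sqrt(val_loss)                       # square root of the mean squared error
divided_again = math.sqrt(val_loss / n_outputs)  # divides by the outputs a second time

print(round(rmse, 1))           # 57.6
print(round(divided_again, 2))  # 11.76
```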