I am trying to build a physics-inspired machine learning model, but I have run into a strange problem while writing the cost function.
The network takes in a vector of length N and outputs a vector of length N, so it is in essence a multidimensional regression problem.
The issue is that irregular subsets of this vector need to maintain a norm of 1, which is part of the cost function. That is to say, the output of the ML model needs to be split into sections and the norm of each section computed. Say the vector is of length 100 and is split into sub-tensors of lengths 50, 30, 15, and 5. The problem is that it is also a time series, so the output is actually a list of tensors with shapes [(100, 50), (100, 30), (100, 15), (100, 5)]. To compute the norms without a loop I need to pad the second axis with zeros, which requires an efficient way to transform this list into [(100, 50), (100, 50), (100, 50), (100, 50)] so that I can stack the tensors and run the norm computation in a single vectorized operation.
It might be that I am overthinking it and that a loop is fine.
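For concreteness, here is roughly the computation I am trying to end up with, using stand-in random data and hand-padding (a minimal sketch; note that the padding itself still uses a Python loop, which is exactly what I would like to avoid):

import tensorflow as tf

# stand-in sub-tensors with shapes [(100, 50), (100, 30), (100, 15), (100, 5)]
parts = [tf.random.uniform((100, n)) for n in [50, 30, 15, 5]]

# right-pad axis 1 of each part with zeros up to the widest width, then stack
width = max(p.shape[1] for p in parts)
padded = tf.stack([tf.pad(p, [[0, 0], [0, width - p.shape[1]]]) for p in parts])
# padded shape: (4, 100, 50); the zeros do not change any norm

# squared norm of every section at every time step in one vectorized call
sq_norms = tf.reduce_sum(tf.square(padded), axis=2)  # shape: (4, 100)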
At the moment I have the following, where the inputs are of shape (100, 100) and norm_list contains the lengths of the sections whose norms have to be computed. The reason for the strange real/imaginary splitting is that the actual values are complex, but represented as pairs of reals.
import tensorflow as tf
from tensorflow.python.ops import math_ops
from tensorflow.keras import losses as ks

# norm_list (the section lengths) and f2rr_full (my helper that splits the
# packed real/imaginary columns into two tensors) are defined elsewhere
def prop_loss_full(y_true, y_pred):
    y_truer, y_truec = f2rr_full(y_true)
    y_predr, y_predc = f2rr_full(y_pred)
    # split the prediction into the sections whose norms are constrained
    split_real = tf.split(y_predr, norm_list, 1)
    split_imag = tf.split(y_predc, norm_list, 1)
    # accumulate the squared norm of each section; this Python loop over
    # the sections is what I would like to get rid of
    norm = math_ops.reduce_sum(
        math_ops.square(split_real[0]) + math_ops.square(split_imag[0]), axis=1)
    for i in range(1, len(split_real)):
        norm += math_ops.reduce_sum(
            math_ops.square(split_real[i]) + math_ops.square(split_imag[i]), axis=1)
    return norm + ks.MSE(y_true, y_pred)

prop_loss_full(data, data)  # data has shape (100, 100)
Any help would be appreciated! When I use tf.keras.utils.pad_sequences I get an error; I assume it doesn't like sequences of 2D tensors. I have looked far and wide, including at ragged tensors and other such features, but have not managed to get them to work.
You can use pad_sequences if you transpose your data first:
# random data in the same shapes as yours
x = [tf.random.uniform((100, n)) for n in [50, 30, 15, 5]]  # shapes: [(100, 50), (100, 30), (100, 15), (100, 5)]
x = [tf.transpose(x_) for x_ in x]  # shapes: [(50, 100), (30, 100), (15, 100), (5, 100)]
# pad_sequences defaults to dtype='int32', which would zero out float
# data, so pass the dtype explicitly
x = tf.keras.utils.pad_sequences(x, dtype='float32')  # shape: (4, 50, 100)
x = tf.transpose(x, [0, 2, 1])  # shape: (4, 100, 50)
You still have a loop for the transpose of each sub-tensor, but it should still be faster than looping over the sub-tensors for the norm computation itself.
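Once the data is padded like this, the per-section norm computation in your loss collapses into a single call, because the padding zeros contribute nothing to the sums. A minimal sketch for one of the two (real/imaginary) padded stacks:

# x has shape (4, 100, 50) after the snippet above
sq_norms = tf.reduce_sum(tf.square(x), axis=2)  # shape: (4, 100), one row per section
norm = tf.reduce_sum(sq_norms, axis=0)  # shape: (100,), same as the loop in your loss

Since you mention ragged tensors: an alternative sketch that avoids the padding altogether is to transpose, group the rows with tf.RaggedTensor.from_row_lengths, and reduce over the ragged axis. This assumes, as your tf.split call already does, that norm_list sums to the width of y_predr:

rt = tf.RaggedTensor.from_row_lengths(tf.transpose(y_predr), row_lengths=norm_list)  # shape: (4, None, 100)
sq_norms = tf.reduce_sum(tf.square(rt), axis=1)  # dense, shape: (4, 100)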