Below is my model:
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Dense, Add
from tensorflow.keras.optimizers import Adam

# Batch shape from the summary below; stateful=True requires a fixed batch size.
ip_shape = (104, 22050, 1)  # (batch, timesteps, features)

# Since we are predicting a value for every timestep, we set return_sequences=True
inputs = Input(batch_shape=ip_shape)
mLSTM = LSTM(units=32, return_sequences=True, stateful=True)(inputs)
mDense = Dense(units=32, activation='linear')(inputs)  # linear path from the input
mSkip = Add()([mLSTM, mDense])  # skip connection: LSTM output + linear projection
mSkip = Dense(units=1, activation='linear')(mSkip)  # one value per timestep
model = Model(inputs, mSkip)
adam = Adam(learning_rate=0.01)
model.compile(optimizer=adam, loss=total_loss)  # total_loss is defined below
model.summary()
Model Summary
Model: "model_3"
_______________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
===============================================================================================
input_5 (InputLayer) [(104, 22050, 1)] 0 []
lstm_4 (LSTM) (104, 22050, 32) 4352 ['input_5[0][0]']
dense_5 (Dense) (104, 22050, 32) 64 ['input_5[0][0]']
add_3 (Add) (104, 22050, 32) 0 ['lstm_4[0][0]',
'dense_5[0][0]']
dense_6 (Dense) (104, 22050, 1) 33 ['add_3[0][0]']
===============================================================================================
Total params: 4449 (17.38 KB)
Trainable params: 4449 (17.38 KB)
Non-trainable params: 0 (0.00 Byte)
_______________________________________________________________________________________________
I am using a custom loss function; I am not sure whether it could be messing with the shapes during backpropagation:
from tensorflow.python.ops import math_ops

def total_loss(y_true, y_pred):
    ratio = 0.5
    eps = 0.00001  # avoid division by zero
    # DC loss: squared difference of the per-timestep means over the batch axis,
    # averaged over the feature axis and normalised by the signal energy.
    dc_loss = math_ops.pow(
        math_ops.subtract(math_ops.reduce_mean(y_true, 0),
                          math_ops.reduce_mean(y_pred, 0)), 2)
    dc_loss = math_ops.reduce_mean(dc_loss, axis=-1)
    dc_energy = math_ops.reduce_mean(math_ops.pow(y_true, 2), axis=-1) + eps
    dc_loss = math_ops.divide(dc_loss, dc_energy)
    # ESR loss: mean squared error normalised by the signal energy
    # (error-to-signal ratio).
    esr_loss = math_ops.squared_difference(y_pred, y_true)
    esr_loss = math_ops.reduce_mean(esr_loss, axis=-1)
    esr_energy = math_ops.reduce_mean(math_ops.pow(y_true, 2), axis=-1) + eps
    esr_loss = math_ops.divide(esr_loss, esr_energy)
    return ratio * dc_loss + (1 - ratio) * esr_loss
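To check what shape the loss itself produces, it can be called directly on dummy tensors matching the model's output shape. This is a standalone sanity check of my own, assuming TensorFlow 2.x eager execution:

import tensorflow as tf

y_true = tf.random.normal((104, 22050, 1))  # (batch, timesteps, features)
y_pred = tf.random.normal((104, 22050, 1))
print(total_loss(y_true, y_pred).shape)  # (104, 22050): the feature axis is
                                         # reduced, batch and time axes are kept

So the loss returns one value per sample per timestep, which Keras then reduces.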
Finally, the error (let me know if the whole traceback is needed):
InvalidArgumentError: Graph execution error:
...
Incompatible shapes: [104,22050,32] vs. [32,22050,1]
[[{{node gradient_tape/total_loss/BroadcastGradientArgs}}]] [Op:__inference_train_function_9604]
Setting stateful=False seems to make it work, but I don't understand why.
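For reference, the variant that trains without the error only flips that flag; everything else stays the same:

# Same architecture, but non-stateful; per the above, this trains fine.
mLSTM = LSTM(units=32, return_sequences=True, stateful=False)(inputs)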
One suggestion I have received: it would help to flatten the prediction somewhere.
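If it helps, here is a minimal sketch of that suggestion, dropping the trailing feature axis with a Reshape so the prediction becomes (batch, timesteps); the placement after the final Dense is my assumption, not a confirmed fix:

from tensorflow.keras.layers import Reshape

# Hypothetical tweak: squeeze the trailing feature axis so the output is
# (104, 22050) instead of (104, 22050, 1).
mSkip = Dense(units=1, activation='linear')(mSkip)
mSkip = Reshape((22050,))(mSkip)
model = Model(inputs, mSkip)

With this change the targets would need shape (104, 22050) as well, so that y_true and y_pred agree inside the loss.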