For my application, I've found that a network structured as follows works well:
- K inputs
- K -> N dense layer with linear activation: K*N variables in the kernel.
- N -> N layer with N variables (one for each input to the layer) and a tanh activation.
- N -> O dense layer with linear activation: N*O variables in the kernel.
When N = 1 I can achieve this with a dense layer of size 1, but how can I do it when N is greater than one?
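For example, with N = 1 the whole network looks something like this (a sketch; I'm assuming useBias: false is the right way to keep the middle layer down to a single variable):

this.model = tf.sequential();
this.model.add(tf.layers.dense({
    inputShape: [kInputSize], units: 1, activation: 'linear'}));
// Middle layer: a single trainable variable, then tanh.
this.model.add(tf.layers.dense({
    units: 1, activation: 'tanh', useBias: false}));
this.model.add(tf.layers.dense({
    units: kOutputSize, activation: 'linear'}));

For N = 4, this is my attempt: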
this.model = tf.sequential();
this.model.add(tf.layers.dense({
    inputShape: [kInputSize], units: 4, activation: 'linear'}));
this.model.add(tf.layers.multiply({activation: 'tanh'}));
this.model.add(tf.layers.dense({
    units: kOutputSize, activation: 'linear'}));
The code above describes what I want, but tf.layers.multiply doesn't seem to be exactly what I need: it is a merge layer, so it wants several input layers rather than one input plus its own trainable weights. Maybe it would work if I could construct a symbolic tensor of trainable variables to pass in as the second input. What I want is a layer with 4 inputs, 4 trainable variables, and 4 outputs.
Instead, I get this error:
Error: A merge layer should be called on an Array of at least 2 inputs. Got 1 input(s).
So it seems I could achieve what I need if I could create a layer of N trainable variables with no input, to act as the second argument to multiply. How can I do that?
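Alternatively, I imagine a custom layer could do this directly. Something like the sketch below is what I have in mind, modeled on the tfjs custom-layer example (untested; the class and weight names are mine):

class ScaleTanh extends tf.layers.Layer {
  constructor(config) {
    super(config || {});
  }
  build(inputShape) {
    // One trainable variable per input feature.
    const n = inputShape[inputShape.length - 1];
    this.scale = this.addWeight(
        'scale', [n], 'float32', tf.initializers.ones());
    this.built = true;
  }
  computeOutputShape(inputShape) {
    return inputShape;  // N inputs -> N outputs
  }
  call(inputs) {
    return tf.tidy(() => {
      const x = Array.isArray(inputs) ? inputs[0] : inputs;
      // Elementwise multiply by the trainable vector, then tanh.
      return tf.tanh(tf.mul(x, this.scale.read()));
    });
  }
  getClassName() {
    return 'ScaleTanh';
  }
}

// Then, in place of the multiply layer above:
// this.model.add(new ScaleTanh());

Is subclassing tf.layers.Layer like this the right approach, or is there a built-in layer I'm missing?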
In my application, removing the middle layer and moving its activation function to the first layer is functionally equivalent, but in practice the network then learns much more slowly, if at all. I could go into detail, but that is beyond the scope of this question.
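(For the record, the equivalence is just that tanh(w * (W x + b)) = tanh(W' x + b') with W' = diag(w) W and b' = diag(w) b, so the middle layer's scale vector can be absorbed into the first layer's kernel and bias; the difference I see is purely in how it trains.)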