I am using a TFX pipeline for training and evaluating an autoencoder. My data consists of five arrays of shape (15, 1) that I concatenate and pass to the model together.
To keep track of the training data, I define the mean value of each of these features in my ExampleGen component, so I have both feature1 and feature1_mean among my input features. However, after the Transform component, I remove the *_mean features from the data.
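For context, removing them is just a matter of not forwarding the *_mean keys out of the Transform step. A minimal sketch (illustrative only, not my exact preprocessing_fn; feature names as above):

def preprocessing_fn(inputs):
  # Forward the real features unchanged; the *_mean bookkeeping features are
  # deliberately not copied into the outputs, so they are gone after Transform.
  outputs = {}
  for key in ['feature1', 'feature2', 'feature3', 'feature4']:
    outputs[key] = inputs[key]
  return outputs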
Now, after training my model, when I pass it to the Evaluator, the following error comes up:
unable to prepare inputs for evaluation: input_specs={
'feature1': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature2': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature3': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature4': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None)},
features={
'feature1_mean': array([3.4559317, 3.528199 , 3.3727243, 3.0274842, 3.2321723, 3.339905 , 3.3501785, 2.987716 , 3.236495 , 3.5900073, 3.1439974, 3.1659212, ...], dtype=float32),
'feature2_mean': array([1.5840595 , 1.6105878 , 1.5401138 , 1.2408142 , 1.2962327 ,], dtype=float32),
'feature3_mean': array([1.5840595 , 1.6105878 , 1.5401138 , 1.2408142 , 1.2962327 ,....]}
[while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/CombineMetricsPerSlice/WindowIntoDiscarding']
Here is the configuration that I am using for my eval_config:
eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(
            signature_name='serving_default',
            label_key='feature1_mean',
            preprocessing_function_names=['transform_features'],
        )
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount'),
            ]
        )
    ],
    slicing_specs=[
        tfma.SlicingSpec()
    ])
I am only passing feature1_mean as a dummy label key here, because I actually have no label: this is an unsupervised model.
The signatures that I am saving are:
def _get_tf_examples_serving_signature(model, tf_transform_output):
  """Returns a serving signature that accepts `tensorflow.Example`."""

  # We need to track the layers in the model in order to save it.
  # TODO(b/162357359): Revise once the bug is resolved.
  model.tft_layer_inference = tf_transform_output.transform_features_layer()

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
  ])
  def serve_tf_examples_fn(serialized_tf_example):
    """Returns the output to be used in the serving signature."""
    raw_feature_spec = tf_transform_output.raw_feature_spec()
    raw_features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)
    # Remove the *_mean features since they will not be present at serving time.
    raw_features.pop('feature1_mean')
    raw_features.pop('feature2_mean')
    raw_features.pop('feature3_mean')
    raw_features.pop('feature4_mean')

    transformed_features = model.tft_layer_inference(raw_features)
    logging.info('serve_transformed_features = %s', transformed_features)

    result = model(transformed_features)
    # TODO(b/154085620): Convert the predicted labels from the model using a
    # reverse-lookup (opposite of transform.py).
    return {'outputs': result}

  return serve_tf_examples_fn
def _get_transform_features_signature(model, tf_transform_output):
  """Returns a serving signature that applies tf.Transform to features."""

  # We need to track the layers in the model in order to save it.
  # TODO(b/162357359): Revise once the bug is resolved.
  model.tft_layer_eval = tf_transform_output.transform_features_layer()

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
  ])
  def transform_features_fn(serialized_tf_example):
    """Returns the transformed_features to be fed as input to evaluator."""
    raw_feature_spec = tf_transform_output.raw_feature_spec()
    raw_features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)
    transformed_features = model.tft_layer_eval(raw_features)
    logging.info('eval_transformed_features = %s', transformed_features)
    return transformed_features

  return transform_features_fn
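For completeness, these two functions are attached to the model at save time in run_fn, following the standard TFX pattern (a sketch; fn_args is the usual Trainer FnArgs):

signatures = {
    'serving_default':
        _get_tf_examples_serving_signature(model, tf_transform_output),
    'transform_features':
        _get_transform_features_signature(model, tf_transform_output),
}
model.save(fn_args.serving_model_dir, save_format='tf', signatures=signatures)

The 'transform_features' key here is what preprocessing_function_names=['transform_features'] in the eval_config above refers to.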
I'd really appreciate it if you could help me solve this issue.
Thanks.
One thing that I found about the TFMA library and TFX's Evaluator component in general is that the model's output has to be one-dimensional, and there always has to be a label key. To make this work for autoencoders, instead of making changes to the _input_fn, return the input twice under two different keys in the Transform component. For example, if your input key for an image is img, return both img_input and img_output from your Transform component. This way you don't need to manipulate the Trainer component's _input_fn, and in the Evaluator you can simply use the img_output key as your label. However, as mentioned earlier, this img_output has to be one-dimensional. If your model uses Conv2D layers to encode and decode the image, I'd recommend keeping the data one-dimensional at the model boundary and adding a Reshape layer to prepare it for the subsequent Conv2D layers. Example:
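A sketch of the idea (the img key names and the 28x28 image shape are illustrative assumptions, not from my actual pipeline). In the Transform component, emit the flattened image under both keys:

def preprocessing_fn(inputs):
  # Flatten once, then return the same tensor under two keys so the
  # Evaluator can use 'img_output' as its (one-dimensional) label.
  flat = tf.reshape(inputs['img'], [-1, 28 * 28])
  return {'img_input': flat, 'img_output': flat}

In the model, accept the flat vector, reshape it for the Conv2D encoder, and flatten again at the end so the output matches img_output:

inputs = tf.keras.Input(shape=(28 * 28,), name='img_input')
x = tf.keras.layers.Reshape((28, 28, 1))(inputs)  # back to 2-D for the encoder
x = tf.keras.layers.Conv2D(16, 3, strides=2, padding='same', activation='relu')(x)
x = tf.keras.layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu')(x)
x = tf.keras.layers.Conv2D(1, 3, padding='same', activation='sigmoid')(x)
outputs = tf.keras.layers.Reshape((28 * 28,), name='img_output')(x)  # flat again
model = tf.keras.Model(inputs=inputs, outputs=outputs)

With this setup, label_key='img_output' in the ModelSpec lines up with a one-dimensional model output, and the Trainer's _input_fn stays untouched.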