I am working on the CS231n assignment1 two-layer-net and I'm stuck on relu_backward. My implementation is below:
def relu_backward(dout, cache):
    """
    Computes the backward pass for a layer of rectified linear units (ReLUs).

    Input:
    - dout: Upstream derivatives, of any shape
    - cache: Input x, of same shape as dout

    Returns:
    - dx: Gradient with respect to x
    """
    dx, x = None, cache
    ###########################################################################
    # TODO: Implement the ReLU backward pass.                                 #
    ###########################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    dx = dout
    dx[x <= 0.0] = 0.0
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ###########################################################################
    #                             END OF YOUR CODE                            #
    ###########################################################################
    return dx
But the test always reports a dx error of 1.0:
np.random.seed(231)
x = np.random.randn(10, 10)
dout = np.random.randn(*x.shape)
dx_num = eval_numerical_gradient_array(lambda x: relu_forward(x)[0], x, dout)
_, cache = relu_forward(x)
dx = relu_backward(dout, cache)
# The error should be on the order of e-12
print('Testing relu_backward function:')
print('dx error: ', rel_error(dx_num, dx))
Testing relu_backward function:
dx error: 1.0
Has anyone run into the same problem?
I ran into the same problem, and I ended up solving it by debugging the relu_forward function, not relu_backward.
If you didn't use np.maximum in relu_forward, you can get exactly this kind of error. The input and the upstream derivatives are random NumPy arrays, so you need the element-wise maximum; Python's built-in max function does not operate element-wise on arrays, so it won't give you the result the test expects. A sketch of what I mean is below.
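For reference, here is a minimal sketch of a relu_forward built around np.maximum. The function name and the (out, cache) return convention follow the assignment's starter code, but the body is just my own illustration, not the official solution.

import numpy as np

def relu_forward(x):
    """
    Computes the forward pass for a layer of rectified linear units (ReLUs).

    Input:
    - x: Inputs, of any shape

    Returns:
    - out: Output, of the same shape as x
    - cache: x, saved for the backward pass
    """
    # np.maximum broadcasts the scalar 0 against every element of x,
    # so each entry of out is max(0, x_ij). Python's built-in max(0, x)
    # would instead try to evaluate the truth value of the whole
    # comparison array and raise "The truth value of an array with more
    # than one element is ambiguous".
    out = np.maximum(0, x)
    cache = x
    return out, cache

With a forward pass like this, dx_num and your analytic dx should agree to around e-12 in the test above.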
I hope this helps. Have a good day.