Issue
I am trying to train a LeNet-5 model in TensorFlow 2 while replacing all of Dense's matrix multiplications and Conv2D's convolutions with custom ones written in C. More precisely, I would like to keep the default gradients of those operations, but use my own implementations of the forward computations instead of TensorFlow's. I also cannot perform my custom convolutions and matrix multiplications with TensorFlow ops; I must go through my C code, which is called through ctypes. Is there a way of doing so?
What I have tried so far is to use TensorFlow's @tf.experimental.dispatch_for_api to call a function that uses tf.py_function, which, in turn, calls my C code. However, it seems that in doing so the gradients are lost and the model cannot be trained. Is there any other way of doing this?
@tf.experimental.dispatch_for_api(tf.matmul,
    {'a': ApproximateTensor},
    {'b': ApproximateTensor},
    {'a': tf.Tensor,           'b': tf.Tensor},
    {'a': ApproximateTensor,   'b': tf.Tensor},
    {'a': tf.Tensor,           'b': ApproximateTensor},
    {'a': ApproximateTensor,   'b': ApproximateTensor},
)
def custom_matmul(a, b, transpose_a=False, transpose_b=False,
                  adjoint_a=False, adjoint_b=False,
                  a_is_sparse=False, b_is_sparse=False, output_type=None):
    # tf.print('MATMUL')
    if not isinstance(a, ApproximateTensor):
        a = ApproximateTensor(a)
    if not isinstance(b, ApproximateTensor):
        b = ApproximateTensor(b)
    _, out_size = b.shape
    return ApproximateTensor(process.linear(a.values, b.values, tf.zeros(out_size)))
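For context, the question never shows ApproximateTensor itself. Since dispatch_for_api dispatches on tf.experimental.ExtensionType subclasses, it is presumably something along these lines (a hypothetical reconstruction, covering only the values field and shape property that the snippet above actually uses):

class ApproximateTensor(tf.experimental.ExtensionType):
    # Hypothetical sketch: the original class is not shown in the question.
    # dispatch_for_api fires on ExtensionType subclasses, and custom_matmul
    # above only touches .values and .shape.
    values: tf.Tensor

    @property
    def shape(self):
        return self.values.shape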
process.linear then does this:
def linear(input, kernel, bias):
    global c_approx

    def compute(input, kernel, bias):
        global c_approx
        # Extract dimensions
        batch_size, in_size = input.shape
        _, out_size = kernel.shape
        if batch_size is None:
            batch_size = 1
        # Create output
        output = np.zeros((batch_size, out_size), dtype=np.float32)
        # Compute one batch row at a time through the C library
        for b in range(batch_size):
            output[b] = c_approx.custom_matmul(input[b], kernel) + bias
        return tf.convert_to_tensor(output)

    output = tf.py_function(compute, [input, kernel, bias], input.dtype)
    # Manually set output dimensions (py_function loses static shape info)
    batch_size, _ = input.shape
    _, out_size = kernel.shape
    output.set_shape((batch_size, out_size))
    return output
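To see concretely why the gradients are lost, here is a minimal standalone demonstration (not from the question): once the body of a tf.py_function leaves TensorFlow, for example by calling into C through ctypes or converting to NumPy, autodiff cannot trace through it and the tape returns None:

import tensorflow as tf

def opaque_square(x):
    # the .numpy() round-trip stands in for a ctypes call:
    # TensorFlow cannot record gradients through it
    return tf.py_function(lambda t: t.numpy() ** 2, [x], x.dtype)

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)
    y = opaque_square(x)
print(tape.gradient(y, x))  # None: the gradient path is broken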
In other words, I want the opposite of what the following code would do:
@tf.custom_gradient
def custom_conv(x):
    def grad_fn(dy):
        return dy
    # filters, strides and padding were omitted in the original snippet;
    # filters is assumed to be defined in the enclosing scope
    return tf.nn.conv2d(x, filters, strides=1, padding='SAME'), grad_fn
I want to redefine Conv2D while keeping its default gradient.
Thanks in advance
Solution
I actually figured it out thanks to this answer: https://stackoverflow.com/a/43952168/9675442
The idea was to do this:
y = tf.matmul(a, b) + tf.stop_gradient(compute(a, b) - tf.matmul(a, b))
This works because the two tf.matmul terms cancel in the forward pass, so y takes the value of compute(a, b), while in the backward pass the tf.stop_gradient term contributes nothing, so the gradient is exactly that of tf.matmul(a, b). I hope this will help others.
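For completeness, here is a self-contained sketch (using plain NumPy as a stand-in for the ctypes call, so it runs without the C library) of how the trick slots into the linear wrapper from the question. The forward value comes from the custom computation, while the gradient is the one TensorFlow derives for tf.matmul:

import numpy as np
import tensorflow as tf

def linear(input, kernel, bias):
    def compute(input, kernel, bias):
        # stand-in for the c_approx.custom_matmul loop from the question
        out = input.numpy() @ kernel.numpy() + bias.numpy()
        return tf.convert_to_tensor(out.astype(np.float32))

    exact = tf.matmul(input, kernel) + bias  # differentiable reference path
    approx = tf.py_function(compute, [input, kernel, bias], input.dtype)
    approx.set_shape(exact.shape)
    # forward: exact + (approx - exact) == approx
    # backward: stop_gradient removes the py_function branch entirely
    return exact + tf.stop_gradient(approx - exact)

# quick check that gradients now flow to the kernel
a = tf.random.normal((2, 3))
w = tf.Variable(tf.random.normal((3, 4)))
b = tf.zeros((4,))
with tf.GradientTape() as tape:
    y = tf.reduce_sum(linear(a, w, b))
print(tape.gradient(y, w))  # a real gradient, not None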
Answered By - aegefel