Issue
I have two deep learning models, A and B, where B is trained separately on the prediction results of A. To train model B, I want to use the predictions of A as B's input; it can be understood as a concatenation of the two models. So when I write the Dataset for model B in PyTorch, it reads data from model A rather than from a local file. However, training model B is very slow, since model A has to be run for every input sample of model B. Is there an efficient way to combine the two models? I have tried saving model A's predictions to files and then reading those files as B's dataset input, but this takes a lot of storage and is probably not the best way. Looking forward to ideas!
Solution
Assuming that you're backpropagating the loss only through model B, the forward pass through both models should take only roughly 2x the time of model B alone, and the backward pass should take the same time regardless of whether the inputs to model B are loaded from a file cache or produced directly by model A. Importantly, you need to disable gradient computation for model A; otherwise every forward pass through A builds a computation graph, which makes both the forward and backward passes slow. Your code should look roughly like this:
modelA.eval()  # put model A in inference mode (fixes dropout/batch-norm behavior)
for batch in dataset:
    input_A, target_B = batch
    # ensure no gradient computation for model A
    with torch.no_grad():
        output_A = modelA(input_A)
    output_A = output_A.detach()  # ensure backpropagation will stop here
    # gradient computation as normal
    output_B = modelB(output_A)
    loss = loss_fn(output_B, target_B)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
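If model A is frozen throughout training, you can also avoid re-running it every epoch without writing files to disk: run A once over the whole dataset and cache its outputs in memory. The sketch below assumes the cached outputs fit in RAM; the tiny `Linear` models, shapes, and optimizer settings are placeholders, not part of the original setup.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for models A and B
modelA = torch.nn.Linear(4, 8)
modelB = torch.nn.Linear(8, 2)

inputs = torch.randn(100, 4)   # model A's inputs
targets = torch.randn(100, 2)  # model B's targets

# One-off forward pass through A with gradients disabled:
# no graph is built, so the cached tensors need no .detach()
modelA.eval()
with torch.no_grad():
    cached_A = modelA(inputs)

# Train B on the cached outputs; A is never run again
cache_loader = DataLoader(TensorDataset(cached_A, targets), batch_size=16)
optimizer = torch.optim.SGD(modelB.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

for output_A, target_B in cache_loader:
    loss = loss_fn(modelB(output_A), target_B)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

If the cache is too large for memory, the same idea works with a one-off pass that writes `cached_A` to a single tensor file (e.g. `torch.save`), which is still far cheaper than re-running A per epoch.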
Answered By - DerekG