Issue
This is a common PyTorch error, but I'm seeing it under an unusual circumstance: when reloading a model, I get a CUDA out of memory error, even though I haven't yet placed the model on the GPU.
model.load_state_dict(torch.load(model_file_path))
optimizer.load_state_dict(torch.load(optimizer_file_path))
# Error happens here ^, before I send the model to the device.
model = model.to(device_id)
Solution
The issue is that I was trying to load onto a new GPU (cuda:2) but had originally saved the model and optimizer from a different GPU (cuda:0). Even though I didn't explicitly tell it to reload to the previous GPU, the default behavior of torch.load is to deserialize tensors onto the device they were saved from, and that GPU happened to be occupied. Adding map_location=device_id to each torch.load call fixed the problem:
model.to(device_id)
# map_location remaps the saved tensors onto the target device during
# deserialization, instead of the GPU they were saved from.
model.load_state_dict(torch.load(model_file_path, map_location=device_id))
optimizer.load_state_dict(torch.load(optimizer_file_path, map_location=device_id))
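For context, here is a minimal end-to-end sketch of the save/reload pattern (the nn.Linear model, file names, and device choice below are placeholders for illustration, not from the original post):

import torch
import torch.nn as nn

# Placeholder model and optimizer, standing in for the real ones.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Saving: each tensor in a state dict remembers the device it was on,
# so a checkpoint written from cuda:0 will try to deserialize to cuda:0.
torch.save(model.state_dict(), "model.pt")
torch.save(optimizer.state_dict(), "optimizer.pt")

# Loading: map_location remaps every tensor onto the target device as it
# is deserialized, so nothing is allocated on the original GPU.
device_id = torch.device("cuda:2" if torch.cuda.device_count() > 2 else "cpu")
model.load_state_dict(torch.load("model.pt", map_location=device_id))
optimizer.load_state_dict(torch.load("optimizer.pt", map_location=device_id))
model = model.to(device_id)

Loading with map_location="cpu" and then calling .to(device) afterwards is another common pattern; it avoids any GPU allocation during deserialization, regardless of where the checkpoint was written.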
Answered By - Jacob Stern