Issue
Could someone please explain to me why this function:
def train_graph_classifier(model_name, **model_kwargs):
pl.seed_everything(42)
# Create a PyTorch Lightning trainer with the generation callback
root_dir = os.path.join('/home/predictor2', "GraphLevel" + model_name)
os.makedirs(root_dir, exist_ok=True)
trainer = pl.Trainer(default_root_dir=root_dir,
callbacks=[ModelCheckpoint(save_weights_only=True, mode="max", monitor="val_acc")],
gpus=1 if str(device).startswith("cuda") else 0,
max_epochs=500,
progress_bar_refresh_rate=0)
trainer.logger._default_hp_metric = None # Optional logging argument that we don't need
# Check whether pretrained model exists. If yes, load it and skip training
pretrained_filename = os.path.join('/home/predictor2', f"GraphLevel{model_name}.ckpt")
if os.path.isfile(pretrained_filename):
print("Found pretrained model, loading...")
model = GraphLevelGNN.load_from_checkpoint(pretrained_filename)
else:
pl.seed_everything(42)
model = GraphLevelGNN(c_in=dataset.num_node_features,
c_out=1 if dataset.num_classes==2 else dataset.num_classes, #change
**model_kwargs)
trainer.fit(model, graph_train_loader, graph_val_loader)
model = GraphLevelGNN.load_from_checkpoint(trainer.checkpoint_callback.best_model_path)
# Test best model on validation and test set
train_result = trainer.test(model, graph_train_loader, verbose=False)
test_result = trainer.test(model, graph_test_loader, verbose=False)
result = {"test": test_result[0]['test_acc'], "train": train_result[0]['test_acc']}
return model, result
Returns the error:
Traceback (most recent call last):
File "stability_v3_alternative_net.py", line 604, in <module>
dp_rate=0.2)
File "stability_v3_alternative_net.py", line 591, in train_graph_classifier
model = GraphLevelGNN.load_from_checkpoint(trainer.checkpoint_callback.best_model_path)
File "/root/miniconda3/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 139, in load_from_checkpoint
checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
File "/root/miniconda3/lib/python3.7/site-packages/pytorch_lightning/utilities/cloud_io.py", line 46, in load
with fs.open(path_or_url, "rb") as f:
File "/root/miniconda3/lib/python3.7/site-packages/fsspec/spec.py", line 1043, in open
**kwargs,
File "/root/miniconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 159, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
File "/root/miniconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 254, in __init__
self._open()
File "/root/miniconda3/lib/python3.7/site-packages/fsspec/implementations/local.py", line 259, in _open
self.f = open(self.path, mode=self.mode)
IsADirectoryError: [Errno 21] Is a directory: '/home/predictor'
where /home/predictor is the current directory i'm working in? (I made predictor2 directory because I get the same error when I replace predictor2 with predictor in the above code).
I understand that it's telling me that it's trying to write a file or something but it's finding that the location in a directory, I can get that from seeing other people's answers. But I can't see specifically here what the issue is becaues I don't name my working directory anywhere? The code was taken from this example.
Solution
This:
model=GraphLevelGNN.load_from_checkpoint(trainer.checkpoint_callback.best_model_path)
is failing because you are trying to open a directory and not a file. Check that trainer.checkpoint_callback.best_model_path
actually is the path to your .ckpt
. As it looks like it is not, then you need to figure out why your callback isn't storing the right path. You can of course hardcode it yourself for an ugly solution.
Answered By - Mikel B
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.