Issue
Currently, I'm building a new transformer-based model with huggingface-transformers, where the attention layer is different from the original one. I used run_glue.py to check the performance of my model on the GLUE benchmark. However, I found that the Trainer class of huggingface-transformers saves all the checkpoints I set, where I can set the maximum number of checkpoints to save. However, I want to save only the weights (or other state, such as the optimizer) with the best performance on the validation dataset, and the current Trainer class doesn't seem to provide such a feature. (If we set the maximum number of checkpoints, it removes the older checkpoints, not the ones with worse performance.) Someone already asked the same question on GitHub, but I can't figure out how to modify the script to do what I want. Currently, I'm thinking about making a custom Trainer class that inherits from the original one and changes the train() method, and it would be great if there's an easy and simple way to do this. Thanks in advance.
Solution
You may try the following TrainingArguments parameters for the Hugging Face Trainer:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='/content/drive/results',  # output directory
    do_predict=True,
    num_train_epochs=3,                   # total number of training epochs
    per_device_train_batch_size=4,        # batch size per device during training
    per_device_eval_batch_size=2,         # batch size for evaluation
    warmup_steps=1000,                    # number of warmup steps for the learning rate scheduler
    save_steps=1000,                      # save a checkpoint every 1000 steps
    save_total_limit=10,                  # keep at most 10 checkpoints on disk
    load_best_model_at_end=True,          # reload the best checkpoint (by eval loss) when training ends
    weight_decay=0.01,                    # strength of weight decay
    logging_dir='./logs',                 # directory for storing logs
    logging_steps=1000,                   # eval_steps defaults to logging_steps; 0 would disable evaluation
    evaluate_during_training=True,        # replaced by evaluation_strategy="steps" in newer transformers versions
)
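For reference, here is a minimal sketch of how these arguments plug into the Trainer. The names model, train_dataset, eval_dataset, and compute_metrics are placeholders for your own objects; they are not defined in the original answer:

from transformers import Trainer

trainer = Trainer(
    model=model,                      # your custom transformer model (placeholder)
    args=training_args,               # the TrainingArguments defined above
    train_dataset=train_dataset,      # training split (placeholder)
    eval_dataset=eval_dataset,        # validation split used to pick the best checkpoint (placeholder)
    compute_metrics=compute_metrics,  # optional: maps an EvalPrediction to a dict of metrics (placeholder)
)
trainer.train()  # with load_best_model_at_end=True, the best checkpoint is reloaded here

By default the best checkpoint is chosen by evaluation loss; set metric_for_best_model (and greater_is_better) in TrainingArguments to select by another metric.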
There may be better ways to avoid keeping too many checkpoints while selecting the best model. As of now the Trainer cannot save only the best model: save_total_limit removes the oldest checkpoints, not the worst ones, but load_best_model_at_end=True tracks the best evaluation result and reloads that checkpoint once training finishes. If you want to skip saving checkpoints that did not improve, you can check whether each evaluation beats the previous best yourself, for example with a custom callback as sketched below.
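As one possible approach (not part of the original answer), newer versions of transformers (>= 3.1) expose a TrainerCallback API that lets you override the save decision after each evaluation. The sketch below is a hypothetical SaveOnImprovementCallback; the class name and the monitored metric are my own choices:

from transformers import TrainerCallback

class SaveOnImprovementCallback(TrainerCallback):
    """Hypothetical callback: allow a checkpoint save only when the
    monitored evaluation metric improves on the best value seen so far."""

    def __init__(self, metric_name='eval_loss', greater_is_better=False):
        self.metric_name = metric_name
        self.greater_is_better = greater_is_better
        self.best = None

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        if metrics is None or self.metric_name not in metrics:
            return control
        value = metrics[self.metric_name]
        improved = self.best is None or (
            value > self.best if self.greater_is_better else value < self.best
        )
        if improved:
            self.best = value
            control.should_save = True   # checkpoint this step
        else:
            control.should_save = False  # skip the checkpoint scheduled by save_steps
        return control

# Usage (other Trainer arguments omitted):
# trainer = Trainer(..., callbacks=[SaveOnImprovementCallback('eval_accuracy', greater_is_better=True)])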
Answered By - Shaina Raza