Continue training

If we need to do some more training epochs but doesn’t have previously defined objects we need to do this:

# define again all from previous steps
# ...

# define FileStructureManager with parameter is_continue=True
fsm = FileStructManager(base_dir='data', is_continue=True)

# create trainer
trainer = Trainer(model, train_config, fsm, torch.device('cuda:0'))

# specify training epochs number
trainer.set_epoch_num(50)

# add TensorboardMonitor with parameter is_continue=True
trainer.monitor_hub.add_monitor(TensorboardMonitor(fsm, is_continue=True))

# set Trainer to resume mode and run training
trainer.resume(from_best_checkpoint=False).train()

Parameter from_best_checkpoint=False tell Trainer, that it need continue from last checkpoint. Neural Pipeline can save best checkpoints by specified rule. For more information about it read about enable_lr_decaying method of Trainer.

Don’t worry about incorrect training history displaying. If history also exists - monitors just add new data to it.