PyTorch Lightning Learning Rate Tuner Giving Unexpected Results

I'm trying to find an optimal learning rate with PyTorch Lightning's pl.tuner.Tuner, but the results aren't what I expected.

The model I am running is a linear classifier on top of a BertForSequenceClassification AutoModel.

I want to find the optimum learning rate while the BERT backbone is frozen.
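For reference, I freeze the backbone with the usual requires_grad toggle, roughly like this (the self.bert attribute name is just how I happen to store the pretrained model):

    # Freeze the pretrained backbone so only my linear classifier head trains.
    # self.bert is where the LightningModule stores the BertForSequenceClassification model.
    for param in self.bert.parameters():
        param.requires_grad = False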

To find the learning rate I run this code:

  
    tuner = pl.tuner.Tuner(trainer)
    results = tuner.lr_find(
        model,
        # optimizer = optimizer,
        train_dataloaders=data_module,
        min_lr=10e-8,
        max_lr=10.0,
    )
    # Plot with
    fig = results.plot(suggest=True)
    fig.show()

My optimizer is configured like this in the model:

    def configure_optimizers(self):
        """Return AdamW plus a linear warmup schedule that is stepped every batch."""
        optimizer = torch.optim.AdamW(self.parameters(), lr=self.learning_rate)

        # get_linear_schedule_with_warmup is imported from the transformers library
        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=self.n_warmup_steps,
            num_training_steps=self.n_training_steps,
        )
        return dict(optimizer=optimizer, lr_scheduler=dict(scheduler=scheduler, interval="step"))

This produces:

[Chart of loss against learning rate]

I'm confused as to why the loss is rising in the low learning rate region; this is not what I was expecting.

I have tried:

  • Removing the scheduler
  • Freezing/unfreezing the weights
  • Changing the initial learning rate

I was expecting a chart like this: https://github.com/comhar/pytorch-learning-rate-tuner/blob/master/images/learning_rate_tuner_plot.png

Any help appreciated

Many thanks

There are 2 answers

user22926078

I'm not sure whether you've solved this already, but I suggest using a larger num_training in tuner.lr_find().

According to the source code, the default value is 100:

    def _lr_find(
        trainer: "pl.Trainer",
        model: "pl.LightningModule",
        min_lr: float = 1e-8,
        max_lr: float = 1,
        num_training: int = 100,
        mode: str = "exponential",
        early_stop_threshold: Optional[float] = 4.0,
        update_attr: bool = False,
        attr_name: str = "",
    ) -> Optional[_LRFinder]:
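
For example, something along these lines; the 300 is just an illustrative value, and depending on your Lightning version the argument may be exposed on tuner.lr_find() as num_training or num_training_steps:

    results = tuner.lr_find(
        model,
        train_dataloaders=data_module,
        min_lr=10e-8,
        max_lr=10.0,
        num_training=300,  # illustrative value, larger than the default of 100
    )

Testing more points between min_lr and max_lr gives the finder a better-resolved curve to pick a suggestion from.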

Ricky

I'm getting a similar plot to yours, and there is a similar question on Stack Overflow: Unusual Learning Rate Finder Curve: Loss Lowest at Smallest Learning Rate

It may be due to the issue reported here: https://github.com/Lightning-AI/pytorch-lightning/issues/14167

That is, some moving-average smoothing may be applied whose running average starts at 0, so the first few loss values are averaged together with 0, producing the artificially low loss at the left of the plot.
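
A toy illustration of that effect (not Lightning's actual code, just a plain exponential moving average initialised at 0, with made-up loss values):

    # Toy example: an exponential moving average that starts at 0 drags the
    # first few smoothed losses well below the raw values.
    losses = [2.5, 2.4, 2.5, 2.6, 2.4]  # made-up raw losses at the smallest LRs
    beta = 0.98                         # made-up smoothing factor
    avg = 0.0                           # the running average starts at 0
    smoothed = []
    for loss in losses:
        avg = beta * avg + (1 - beta) * loss
        smoothed.append(avg)
    print(smoothed)  # first entries sit far below the raw ~2.5 and climb back up,
                     # mimicking the rise at the left of the plot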

However, that doesn't explain why there are many images online of results without this behaviour, unless they were generated with different versions of Lightning. If this is the cause, though, the practical workaround is to make the lowest learning rate tested far too low to be near the optimum, and then ignore the left-hand side of the resulting plot.