01 PyTorch workflow error: Loss function not decreasing and outputs not changing #859
Replies: 2 comments
-
Is all the code you have written here in the same code cell? You should check what the model's parameters are and maybe print them out before and after training to see whether they change.
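A quick way to do that check (a minimal sketch, assuming the `model_0`, `loss_fn`, `optimizer`, `X_train` and `y_train` from the original post are already defined):

```python
# Snapshot the parameters, take a few training steps, then compare.
# If the "after" values are identical to the "before" values, the
# optimizer is not actually updating the model.
before = [p.detach().clone() for p in model_0.parameters()]

for epoch in range(10):
    model_0.train()
    y_pred = model_0(X_train)
    loss = loss_fn(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

after = [p.detach().clone() for p in model_0.parameters()]
print("Before:", before)
print("After: ", after)
```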
-
Maybe you need to run the training loop for a bit longer to see the loss go down. In the code you posted you have set epochs to 1, so you are basically not allowing the model to look at the data multiple times to decrease the loss. I suggest increasing epochs to 100 or maybe 1000 and running the code again. Another thing to check may be the original data, X_train and y_train.
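A minimal sketch of that suggestion (assuming the `model_0`, `loss_fn`, `optimizer`, `X_train` and `y_train` from the original post):

```python
epochs = 100  # instead of 1

for epoch in range(epochs):
    model_0.train()
    y_pred = model_0(X_train)        # forward pass
    loss = loss_fn(y_pred, y_train)  # mean absolute error vs. the targets
    optimizer.zero_grad()            # clear gradients accumulated last step
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the parameters

    # Log every 10 epochs so you can watch the loss decrease
    if epoch % 10 == 0:
        print(f"Epoch {epoch} | Loss: {loss.item():.4f}")
```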
-
Hi, can someone please help me? I can't find what is wrong with my code, but when I run it multiple times the loss stays exactly the same and the parameters do not change, even though the parameters should be moving toward the desired values and the loss should be decreasing. Here is my code:
```python
# Create an instance of the model (this is a subclass of nn.Module)
model_0 = LinearRegressionModel()

# Check out the parameters
list(model_0.parameters())

# Make predictions with model
with torch.inference_mode():  # inference mode turns off gradient tracking (which we would need in training but don't need in testing)
    y_preds = model_0(X_test)

# Setup a loss function
loss_fn = nn.L1Loss()  # nn.L1Loss creates a criterion that measures the mean absolute error (MAE)

# Setup an optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(params=model_0.parameters(),
                            lr=0.01)

# An epoch is one loop through the data... (this is a hyperparameter because we've set it ourselves)
epochs = 1

### Training
# 0. Loop through the data
for epoch in range(epochs):
    # Set the model to training mode
    model_0.train()  # train mode in PyTorch sets all parameters that require gradients to require gradients

    # 1. Forward pass
    y_pred = model_0(X_train)

    # 2. Calculate the loss (compare how different the model's predictions are to the true values)
    loss = loss_fn(y_pred, y_train)
    print(f'Loss: {loss}')

    # 3. Zero the gradients of the optimizer (they accumulate by default so we must set them back to zero each time)
    optimizer.zero_grad()

    # 4. Perform backpropagation on the loss with respect to the parameters of the model (compute the gradient of every parameter with requires_grad=True)
    loss.backward()

    # 5. Step the optimizer (perform gradient descent)
    optimizer.step()  # by default how the optimizer changes will accumulate through the loop... so we have to zero them above in step 3 for the next iteration of the loop

    ### Testing
    model_0.eval()  # turns off gradient tracking
```