01 PyTorch workflow error: Loss function not decreasing and outputs not changing #859
Replies: 2 comments
-
Is all the code you have written here in the same code cell? You should check what the model's parameters are and maybe print them out before and after training to see whether they change.
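A quick way to do that check (a minimal sketch, assuming the `model_0`, `loss_fn`, `optimizer`, `X_train` and `y_train` from the original post are already defined):

```python
# Snapshot the parameters, take a few training steps, then compare.
# If the "after" values are identical to the "before" values, the
# optimizer is not actually updating the model.
before = [p.detach().clone() for p in model_0.parameters()]

for epoch in range(10):
    model_0.train()
    y_pred = model_0(X_train)
    loss = loss_fn(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

after = [p.detach().clone() for p in model_0.parameters()]
print("Before:", before)
print("After: ", after)
```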
-
Maybe you need to run the training loop for a bit longer to see the loss go down. In the code you posted you have set epochs to 1, so you are basically not allowing the model to look at the data multiple times to decrease the loss. I suggest increasing epochs to 100 or maybe 1000 and running the code again. Another thing to check may be the original data, X_train and y_train.
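A minimal sketch of that suggestion (assuming the `model_0`, `loss_fn`, `optimizer`, `X_train` and `y_train` from the original post):

```python
epochs = 100  # instead of 1

for epoch in range(epochs):
    model_0.train()
    y_pred = model_0(X_train)        # forward pass
    loss = loss_fn(y_pred, y_train)  # mean absolute error vs. the targets
    optimizer.zero_grad()            # clear gradients accumulated last step
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the parameters

    # Log every 10 epochs so you can watch the loss decrease
    if epoch % 10 == 0:
        print(f"Epoch {epoch} | Loss: {loss.item():.4f}")
```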
-
Hi, can someone please help me? I can't find what is wrong with my code, but when I run it multiple times the loss stays exactly the same and the parameters do not change, even though the parameters should be moving toward the desired values and the loss should be decreasing. Here is my code:
```python
# Create an instance of the model (this is a subclass of nn.Module)
model_0 = LinearRegressionModel()

# Check out the parameters
list(model_0.parameters())

# Make predictions with model
with torch.inference_mode():  # inference mode turns off gradient tracking (which we would need in training but don't need in testing)
    y_preds = model_0(X_test)

# Setup a loss function
loss_fn = nn.L1Loss()  # nn.L1Loss creates a criterion that measures the mean absolute error (MAE)

# Setup an optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(params=model_0.parameters(),
                            lr=0.01)

# An epoch is one loop through the data... (this is a hyperparameter because we've set it ourselves)
epochs = 1

### Training
# 0. Loop through the data
for epoch in range(epochs):
    # Set the model to training mode
    model_0.train()  # train mode in PyTorch sets all parameters that require gradients to require gradients

    # 1. Forward pass
    y_pred = model_0(X_train)

    # 2. Calculate the loss (compare how different the model's predictions are to the true values)
    loss = loss_fn(y_pred, y_train)
    print(f'Loss: {loss}')

    # 3. Zero the gradients of the optimizer (they accumulate by default so we must set them back to zero each time)
    optimizer.zero_grad()

    # 4. Perform backpropagation on the loss with respect to the parameters of the model (compute the gradient of every parameter with requires_grad=True)
    loss.backward()

    # 5. Step the optimizer (perform gradient descent)
    optimizer.step()  # by default how the optimizer changes will accumulate through the loop... so we have to zero them above in step 3 for the next iteration of the loop

    ### Testing
    model_0.eval()  # turns off gradient tracking
```