SGD gradient calculation in Chapter 1,2,3 #1034
Unanswered
surajbhv7l
asked this question in Q&A
Replies: 1 comment
-
If you don't set a batch size, the optimizer acts as batch gradient descent (it uses the entire dataset for every update). Since you haven't set batch_size anywhere in the code snippet, yes, SGD here acts as batch gradient descent.
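For illustration, here is a minimal, runnable sketch of what setting a batch size looks like in PyTorch. Nothing below comes from the book's notebook; the tensors, model and batch_size of 32 are made-up stand-ins. Wrapping the training data in a DataLoader means each optimizer.step() uses the gradient of one mini-batch rather than of the whole training set:

# mini-batch SGD sketch; all data, model and hyperparameter choices are illustrative
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)
X_train = torch.randn(1000, 2)             # stand-in features
y_train = torch.randint(0, 4, (1000,))     # stand-in labels for 4 classes

model = nn.Linear(2, 4)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

train_loader = DataLoader(TensorDataset(X_train, y_train),
                          batch_size=32,   # each step sees 32 samples
                          shuffle=True)

for X_batch, y_batch in train_loader:      # 32 updates per epoch instead of 1
    y_logits = model(X_batch)
    loss = loss_fn(y_logits, y_batch)
    optimizer.zero_grad()
    loss.backward()                        # gradient of the loss on this mini-batch only
    optimizer.step()

Setting batch_size=len(X_train) (or skipping the DataLoader and passing the full tensors, as in the notebook) recovers the one-update-per-epoch behaviour described above.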
-
Here is the code snippet:

# setting up the loss function and the SGD optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model_4.parameters(),
                            lr=0.1)  # exercise: try changing the learning rate here and seeing what happens to the model's performance

# using SGD on the full training data:
loss = loss_fn(y_logits, y_blob_train)
acc = accuracy_fn(y_true=y_blob_train,
                  y_pred=y_pred)

My question is: isn't the SGD here acting as batch gradient descent, since we are calculating the gradient on the full training data?
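To make the question concrete, here is a runnable sketch of the training step the snippet sits in. The data, model_4 and accuracy_fn below are stand-ins for what the notebook defines in earlier cells; only the structure of the update matters. Because the forward pass and the loss use every training example, loss.backward() computes the gradient over the full training set and each epoch performs exactly one parameter update:

# full-batch training step sketch; the blob data, model_4 and accuracy_fn are stand-ins
import torch
from torch import nn

torch.manual_seed(42)
X_blob_train = torch.randn(800, 2)              # stand-in features
y_blob_train = torch.randint(0, 4, (800,))      # stand-in labels (4 classes)

model_4 = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 4))

def accuracy_fn(y_true, y_pred):
    return (y_true == y_pred).float().mean().item() * 100

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model_4.parameters(), lr=0.1)

for epoch in range(100):
    model_4.train()
    y_logits = model_4(X_blob_train)                        # forward pass on ALL training samples
    y_pred = torch.softmax(y_logits, dim=1).argmax(dim=1)

    loss = loss_fn(y_logits, y_blob_train)                  # loss averaged over the whole training set
    acc = accuracy_fn(y_true=y_blob_train, y_pred=y_pred)

    optimizer.zero_grad()
    loss.backward()                                         # gradient of the full-dataset loss
    optimizer.step()                                        # exactly one update per epoch

    if epoch % 20 == 0:
        print(f"epoch {epoch} | loss {loss.item():.4f} | acc {acc:.1f}%")

In other words, torch.optim.SGD only implements the update rule; whether training is stochastic, mini-batch or full-batch depends on how much data each step sees.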