Gradient Accumulation #24

julianmack · 2020-01-15T17:16:32Z

Adds gradient accumulation so that an arbitrary effective batch size can be used.

Note that the configs do not change when using accumulation. This is because accumulating over multiple steps (or, in future, over multiple GPUs?) is equivalent to running without accumulation at a larger batch size per step. E.g. accumulation=2 with batch/step=32 is equivalent to no accumulation at batch=64.
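The equivalence can be checked numerically. The following is a minimal, framework-free sketch and not the code in this PR: the scalar linear model, the `mse_grad` and `accumulated_grad` helpers, and the synthetic data are all assumptions made for illustration. It compares the gradient of a mean-reduced loss over one batch of 64 with the gradient accumulated over two steps of 32, each scaled by 1/accumulation.

```python
import random


def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y ~ w * x with respect to scalar w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch, accumulation):
    """Accumulate per-step mean gradients, scaling each by 1 / accumulation."""
    total = 0.0
    for i in range(accumulation):
        lo, hi = i * micro_batch, (i + 1) * micro_batch
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / accumulation
    return total


# Hypothetical synthetic data: 64 samples of y = 3x plus a little noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)  # no accumulation, batch=64
accum = accumulated_grad(w, xs, ys, micro_batch=32, accumulation=2)

# The two gradients agree up to floating-point rounding.
print(abs(full - accum) < 1e-12)
```

The match relies on a mean-reduced loss, equally sized steps, and no layers whose behaviour depends on per-step batch statistics (batch normalisation, for example). If any of these does not hold, the two setups can differ.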

This is blocked by PRs:

julianmack added 9 commits January 13, 2020 11:29

Fixed failing test re state_dict loading

acaa9db

Added gradient accumulation CB infrastructure. Still need to write CB

ed8e7b4

Added Gradient accumulation

a869c5f

Updated docstrings

0927f4f

Fixed bug and added docstrings

ef97650

Removed print statements

f8f45b9

Updated StopEpochAfter to deal with minibatches

d9a2715

Added grad accum notebook

442fde2

Fixed doctest failures

d3cb11f

julianmack added the blocked label Jan 15, 2020

julianmack changed the title ~~Adds Gradient Accumulation~~ Gradient Accumulation Jan 15, 2020

Fixed pre-commit error

871b73d

julianmack changed the base branch from master to rnnt January 28, 2020 15:34

julianmack changed the base branch from rnnt to rnnt_lr_warm January 28, 2020 15:34
