This repository was archived by the owner on Dec 16, 2022. It is now read-only.
v2.7.0
#5394
What's new
Added 🎉

- Added support to evaluate multiple datasets and produce corresponding output files in the `evaluate` command.
- Moved the PyTorch learning rate scheduler wrappers to their own file, `pytorch_lr_schedulers.py`, so that they will have their own documentation page.
- Added a module `allennlp.nn.parallel` with a new base class, `DdpAccelerator`, which generalizes PyTorch's `DistributedDataParallel` wrapper to support other implementations. Two implementations of this class are provided. The default is `TorchDdpAccelerator` (registered as "torch"), which is just a thin wrapper around `DistributedDataParallel`. The other is `FairScaleFsdpAccelerator`, which wraps FairScale's `FullyShardedDataParallel`. You can specify the `DdpAccelerator` in the "distributed" section of a configuration file under the key "ddp_accelerator".
- Added a module `allennlp.nn.checkpoint` with a new base class, `CheckpointWrapper`, for implementations of activation/gradient checkpointing. Two implementations are provided. The default is `TorchCheckpointWrapper` (registered as "torch"), which exposes PyTorch's checkpoint functionality. The other is `FairScaleCheckpointWrapper`, which exposes the more flexible checkpointing functionality from FairScale.
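As a loose illustration of the base-class-plus-registry design described above (the code below is a stand-in sketched from this description, not the actual AllenNLP source), a named registry of `CheckpointWrapper` implementations might look like:

```python
# Minimal sketch of a named-registry base class, mirroring how
# CheckpointWrapper implementations such as TorchCheckpointWrapper are
# registered under a name like "torch". Stand-in code, not AllenNLP's.
class CheckpointWrapper:
    _registry: dict = {}

    @classmethod
    def register(cls, name):
        """Class decorator that records a subclass under `name`."""
        def decorator(subclass):
            cls._registry[name] = subclass
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name):
        """Look up a registered implementation by its name."""
        return cls._registry[name]

    def wrap_module(self, module):
        raise NotImplementedError


@CheckpointWrapper.register("torch")
class TorchCheckpointWrapper(CheckpointWrapper):
    def wrap_module(self, module):
        # The real implementation would apply activation/gradient
        # checkpointing (e.g. via torch.utils.checkpoint); this sketch
        # returns the module unchanged to stay dependency-free.
        return module
```

With this pattern, `CheckpointWrapper.by_name("torch")` resolves the default implementation, which is how a registered name in a configuration file can select the wrapper.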
- The `Model` base class now takes a `ddp_accelerator` parameter (an instance of `DdpAccelerator`), which will be available as `self.ddp_accelerator` during distributed training. This is useful when, for example, instantiating submodules in your model's `__init__()` method by wrapping them with `self.ddp_accelerator.wrap_module()`. See `allennlp.modules.transformer.t5` for an example.
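To make the `wrap_module()` pattern concrete, here is a hedged sketch (all classes below are simplified stand-ins for the AllenNLP ones named above, not the library's actual code) of a model wrapping a submodule in its `__init__()`:

```python
# Stand-in sketch, not the AllenNLP source: shows the shape of the
# ddp_accelerator / wrap_module pattern described in the release notes.
class DdpAccelerator:
    """Generalizes a DistributedDataParallel-style wrapper."""
    def wrap_module(self, module):
        raise NotImplementedError


class TorchDdpAccelerator(DdpAccelerator):
    def wrap_module(self, module):
        # The real TorchDdpAccelerator wraps the module in PyTorch's
        # DistributedDataParallel; returning it unchanged keeps this
        # sketch runnable without torch or a distributed setup.
        return module


class MyModel:
    def __init__(self, encoder, ddp_accelerator=None):
        # In distributed training the trainer supplies a DdpAccelerator,
        # available here as self.ddp_accelerator.
        self.ddp_accelerator = ddp_accelerator or TorchDdpAccelerator()
        # Wrap the submodule so the accelerator can parallelize/shard it.
        self.encoder = self.ddp_accelerator.wrap_module(encoder)
```

Wrapping at construction time, rather than wrapping the whole model afterwards, is what lets implementations like FairScale's `FullyShardedDataParallel` shard individual submodules.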
- Added `ScaledDotProductMatrixAttention`, and converted the transformer toolkit to use it.
- `Attention` and `MatrixAttention` implementations are now interchangeable.
- Added a `from_pretrained_transformer_and_instances` constructor to `Vocabulary`.
- `TransformerTextField` now supports `__len__`.

Fixed ✅
- `ConditionalRandomField`: the `transitions` and `tag_sequence` tensors were not initialized on the desired device, causing high CPU usage (see "Why CRF lead a high cost on CPU?" #2884).
- The misspelled `contructor_extras` parameter in `Lazy()` is now correctly called `constructor_extras`.
- Fixed broken links in the `allennlp.nn.initializers` docs.
- Fixed a bug in `BeamSearch` where `last_backpointers` was not being passed to any `Constraint`s.
- `TransformerTextField` can now take tensors of shape `(1, n)`, like the tensors produced by a HuggingFace tokenizer.
- The `tqdm` lock is now set inside `MultiProcessDataLoading` when new workers are spawned, to avoid contention when writing output.
- `ConfigurationError` is now pickleable.
- Multitask models now support `TextFieldTensor` in heads, not just in the backbone.
- Fixed the signature of `ScaledDotProductAttention` to match the other `Attention` classes.
- `allennlp` commands will now catch `SIGTERM` signals and handle them similarly to `SIGINT` (keyboard interrupt).
- The `MultiProcessDataLoader` will properly shut down its workers when a `SIGTERM` is received.
- Fixed an issue with Tango `Step` instances.

Changed ⚠️
- The `grad_norm` parameter of `GradientDescentTrainer` is now `Union[float, bool]`, with a default value of `False`. `False` means gradients are not rescaled and the gradient norm is never even calculated. `True` means the gradients are still not rescaled, but the gradient norm is calculated and passed on to callbacks. A `float` value means gradients are rescaled.
- `TensorCache` now supports more concurrent readers and writers.

Commits
48af9d3 Multiple datasets and output files support for the evaluate command (#5340)
60213cd Tiny tango tweaks (#5383)
2895021 improve signal handling and worker cleanup (#5378)
b41cb3e Fix distributed loss (#5381)
6355f07 Fix Checkpointer cleaner regex on Windows (#5361)
27da04c Dataset remix (#5372)
75af38e Create Vocabulary from both pretrained transformers and instances (#5368)
5dc80a6 Adds a dataset that can be read and written lazily (#5344)
01e8a35 Improved Documentation For Learning Rate Schedulers (#5365)
8370cfa skip loading t5-base in CI (#5371)
13de38d Log batch metrics (#5362)
1f5c6e5 Use our own base images to build allennlp Docker images (#5366)
bffdbfd Bugfix: initializing all tensors and parameters of the ConditionalRandomField model on the proper device (#5335)
d45a2da Make sure that all attention works the same (#5360)
c1edaef Update google-cloud-storage requirement (#5357)
524244b Update wandb requirement from <0.12.0,>=0.10.0 to >=0.10.0,<0.13.0 (#5356)
90bf33b small fixes for tango (#5350)
2e11a15 tick version for nightly releases
311f110 Tango (#5162)
1df2e51 Bump fairscale from 0.3.8 to 0.3.9 (#5337)
b72bbfc fix constraint bug in beam search, clean up tests (#5328)
ec3e294 Create CITATION.cff (#5336)
8714aa0 This is a desperate attempt to make TensorCache a little more stable (#5334)
fd429b2 Update transformers requirement from <4.9,>=4.1 to >=4.1,<4.10 (#5326)
1b5ef3a Update spacy requirement from <3.1,>=2.1.0 to >=2.1.0,<3.2 (#5305)
1f20513 TextFieldTensor in multitask models (#5331)
76f2487 set tqdm lock when new workers are spawned (#5330)
67add9d Fix ConfigurationError deserialization (#5319)
42d8529 allow TransformerTextField to take input directly from HF tokenizer (#5329)
64043ac Bump black from 21.6b0 to 21.7b0 (#5320)
3275055 Update mkdocs-material requirement from <7.2.0,>=5.5.0 to >=5.5.0,<7.3.0 (#5327)
5b1da90 Update links in initializers documentation (#5317)
ca656fc FairScale integration (#5242)
This discussion was created from the release v2.7.0.