This repository was archived by the owner on Jul 4, 2023. It is now read-only.

Commit e852dae

Merge pull request #68 from PetrochukM/update
Release 0.4.0 - Encoder rewrite, variable sequence collate support, reduced memory usage, doctests, removed SRU
2 parents aa50d77 + d944083 commit e852dae

98 files changed

Lines changed: 1247 additions & 1860 deletions


.style.yapf

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+[style]
+based_on_style = chromium
+indent_width = 4
+column_limit = 100

.travis.yml

Lines changed: 9 additions & 4 deletions
@@ -2,15 +2,20 @@ dist: trusty
 sudo: required
 
 language: python
-python:
-- '3.6'
-- '3.5'
+matrix:
+  include:
+    - python: 3.6
+      dist: trusty
+      sudo: false
+    - python: 3.7
+      dist: xenial
+      sudo: true
 
 cache: pip
 
 notifications:
   email: false
-
+
 before_install: source build_tools/travis/before_install.sh
 install: source build_tools/travis/install.sh
 script: RUN_DOCS=true RUN_SLOW=true RUN_FLAKE8=true bash build_tools/travis/test_script.sh

README.md

Lines changed: 11 additions & 14 deletions
@@ -19,7 +19,7 @@ Join our community, add datasets and neural network layers! Chat with us on [Git
 
 ## Installation
 
-Make sure you have Python 3.5+ and PyTorch 0.4 or newer. You can then install `pytorch-nlp` using
+Make sure you have Python 3.6+ and PyTorch 1.0+. You can then install `pytorch-nlp` using
 pip:
 
     pip install pytorch-nlp
@@ -50,35 +50,32 @@ train[0] # RETURNS: {'text': 'For a movie that gets..', 'sentiment': 'pos'}
 
 ### Apply [Neural Networks](http://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.nn.html) Layers
 
-For example, from the neural network package, apply a Simple Recurrent Unit (SRU):
+For example, from the neural network package, apply state-of-the-art LockedDropout:
 
 ```python
-from torchnlp.nn import SRU
 import torch
+from torchnlp.nn import LockedDropout
 
-input_ = torch.autograd.Variable(torch.randn(6, 3, 10))
-sru = SRU(10, 20)
+input_ = torch.randn(6, 3, 10)
+dropout = LockedDropout(0.5)
 
-# Apply a Simple Recurrent Unit to `input_`
-sru(input_)
-# RETURNS: (
-#   output [torch.FloatTensor (6x3x20)],
-#   hidden_state [torch.FloatTensor (2x3x20)]
-# )
+# Apply a LockedDropout to `input_`
+dropout(input_)
+# RETURNS: torch.FloatTensor (6x3x10)
 ```
 
-### [Encode Text](http://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.text_encoders.html)
+### [Encode Text](http://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.encoders.text.html)
 
 Tokenize and encode text as a tensor. For example, a `WhitespaceEncoder` breaks text into terms whenever it encounters a whitespace character.
 
 ```python
-from torchnlp.text_encoders import WhitespaceEncoder
+from torchnlp.encoders.text import WhitespaceEncoder
 
 # Create a `WhitespaceEncoder` with a corpus of text
 encoder = WhitespaceEncoder(["now this ain't funny", "so don't you dare laugh"])
 
 # Encode and decode phrases
-encoder.encode("this ain't funny.") # RETURNS: torch.LongTensor([6, 7, 1])
+encoder.encode("this ain't funny.") # RETURNS: torch.Tensor([6, 7, 1])
 encoder.decode(encoder.encode("This ain't funny.")) # RETURNS: "this ain't funny."
 ```
 
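For readers unfamiliar with the layer the README now showcases: LockedDropout samples one dropout mask and reuses it across every time step of a sequence. Below is a plain-Python sketch of that idea, not the torchnlp implementation — the function name and the list-of-lists layout are illustrative assumptions:

```python
import random

def locked_dropout(seq, p=0.5, seed=None):
    """Drop the same feature positions at every time step of `seq`.

    `seq` is a list of time steps, each a list of feature values. A single
    Bernoulli mask is sampled once and reused ("locked") across time; kept
    features are scaled by 1 / (1 - p) so the expected value is unchanged.
    """
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - p)
    # One mask for the whole sequence, sampled over the feature dimension.
    mask = [scale if rng.random() >= p else 0.0 for _ in range(len(seq[0]))]
    return [[x * m for x, m in zip(step, mask)] for step in seq]
```

Because the mask is shared, a feature dropped at one time step is dropped at all of them — unlike standard dropout, which resamples the mask per step.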

build_tools/travis/install.sh

Lines changed: 4 additions & 4 deletions
@@ -35,10 +35,10 @@ python -m spacy download en
 python -m nltk.downloader perluniprops nonbreaking_prefixes
 
 # Install PyTorch Dependancies
-if [[ $TRAVIS_PYTHON_VERSION == '3.6' ]]; then
-    pip install http://download.pytorch.org/whl/cpu/torch-0.4.0-cp36-cp36m-linux_x86_64.whl
+if [[ $TRAVIS_PYTHON_VERSION == '3.7' ]]; then
+    pip install https://download.pytorch.org/whl/cpu/torch-1.0.1.post2-cp37-cp37m-linux_x86_64.whl
 fi
-if [[ $TRAVIS_PYTHON_VERSION == '3.5' ]]; then
-    pip install http://download.pytorch.org/whl/cpu/torch-0.4.0-cp35-cp35m-linux_x86_64.whl
+if [[ $TRAVIS_PYTHON_VERSION == '3.6' ]]; then
+    pip install https://download.pytorch.org/whl/cpu/torch-1.0.1.post2-cp36-cp36m-linux_x86_64.whl
 fi
 pip install torchvision

build_tools/travis/test_script.sh

Lines changed: 9 additions & 8 deletions
@@ -7,22 +7,23 @@
 # Exit immediately if a command exits with a non-zero status.
 set -e
 
-python --version
-
-if [[ "$RUN_FLAKE8" == "true" ]]; then
-    flake8
-fi
+export PYTHONPATH=.
 
+python --version
 
 if [[ "$RUN_DOCS" == "true" ]]; then
     make -C docs html
 fi
 
+if [[ "$RUN_FLAKE8" == "true" ]]; then
+    flake8 torchnlp/
+    flake8 tests/
+fi
+
 run_tests() {
+    TEST_CMD="python -m pytest tests/ torchnlp/ --verbose --durations=20 --cov=torchnlp --doctest-modules"
     if [[ "$RUN_SLOW" == "true" ]]; then
-        TEST_CMD="py.test -v --durations=20 --cov=torchnlp --runslow"
-    else
-        TEST_CMD="py.test -v --durations=20 --cov=torchnlp"
+        TEST_CMD="$TEST_CMD --runslow"
     fi
     $TEST_CMD
 }

docs/index.rst

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ and text encoders. It's open-source software, released under the BSD3 license.
    source/torchnlp.datasets
    source/torchnlp.word_to_vector
    source/torchnlp.nn
-   source/torchnlp.text_encoders
+   source/torchnlp.encoders
    source/torchnlp.samplers
    source/torchnlp.metrics
    source/torchnlp.utils

docs/source/torchnlp.encoders.rst

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+torchnlp.encoders package
+===============================
+
+The ``torchnlp.encoders`` package supports encoding objects as a vector
+:class:`torch.Tensor` and decoding a vector :class:`torch.Tensor` back.
+
+.. automodule:: torchnlp.encoders
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+.. automodule:: torchnlp.encoders.text
+    :members:
+    :undoc-members:
+    :show-inheritance:

docs/source/torchnlp.text_encoders.rst

Lines changed: 0 additions & 11 deletions
This file was deleted.
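The docs changes above track the rename of `torchnlp.text_encoders` to `torchnlp.encoders`. As a rough sketch of what a whitespace text encoder in that package does — build a vocabulary from a corpus, map unseen tokens to a reserved unknown index, and decode back to text — here is an illustrative stand-in, not torchnlp's class; its reserved tokens and index order are assumptions and will differ from the real library:

```python
class WhitespaceEncoder:
    """Toy whitespace tokenizer/encoder; illustrative, not torchnlp's class."""

    RESERVED = ["<pad>", "<unk>"]  # indices 0 and 1, an assumed convention
    UNKNOWN_INDEX = 1

    def __init__(self, corpus):
        # Vocabulary in first-seen order over the corpus.
        seen = dict.fromkeys(t for text in corpus for t in text.split())
        self.index_to_token = self.RESERVED + list(seen)
        self.token_to_index = {t: i for i, t in enumerate(self.index_to_token)}

    def encode(self, text):
        # Split on whitespace; unseen tokens fall back to <unk>.
        return [self.token_to_index.get(t, self.UNKNOWN_INDEX) for t in text.split()]

    def decode(self, indices):
        return " ".join(self.index_to_token[i] for i in indices)
```

Note how, as in the README example, the trailing period makes "funny." an out-of-vocabulary token, so it round-trips to the unknown marker rather than the original word.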

examples/awd-lstm-lm/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-`awd-lstm-lm` set the state-of-the-art in word level perplexities in 2017. With PyTorch NLP, we show that in 30 minutes, we were able to reduce the footprint of this repository by 4 files (185 lines of code). We employ the use of the [datasets package](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.datasets.html), [IdentityEncoder module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.text_encoders.html#torchnlp.text_encoders.IdentityEncoder), [BPTTBatchSampler module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.samplers.html#torchnlp.samplers.BPTTBatchSampler), [LockedDropout module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.nn.html#torchnlp.nn.LockedDropout) and [WeightDrop module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.nn.html#torchnlp.nn.WeightDrop)
+`awd-lstm-lm` set the state-of-the-art in word level perplexities in 2017. With PyTorch NLP, we show that in 30 minutes, we were able to reduce the footprint of this repository by 4 files (185 lines of code). We employ the use of the [datasets package](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.datasets.html), [IdentityEncoder module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.encoders.text.html#torchnlp.encoders.text.IdentityEncoder), [BPTTBatchSampler module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.samplers.html#torchnlp.samplers.BPTTBatchSampler), [LockedDropout module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.nn.html#torchnlp.nn.LockedDropout) and [WeightDrop module](https://pytorchnlp.readthedocs.io/en/latest/source/torchnlp.nn.html#torchnlp.nn.WeightDrop)
 
 
 Below is the original README from the repository:

examples/awd-lstm-lm/main.py

Lines changed: 5 additions & 5 deletions
@@ -96,17 +96,17 @@ def model_load(fn):
 
 
 from torchnlp import datasets
-from torchnlp.text_encoders import IdentityEncoder
+from torchnlp.encoders import LabelEncoder
 from torchnlp.samplers import BPTTBatchSampler
 
 print('Producing dataset...')
 train, val, test = getattr(datasets, args.data)(train=True, dev=True, test=True)
 
-encoder = IdentityEncoder(train + val + test)
+encoder = LabelEncoder(train + val + test)
 
-train_data = encoder.encode(train)
-val_data = encoder.encode(val)
-test_data = encoder.encode(test)
+train_data = encoder.batch_encode(train)
+val_data = encoder.batch_encode(val)
+test_data = encoder.batch_encode(test)
 
 eval_batch_size = 10
 test_batch_size = 1
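The main.py change above swaps `encoder.encode` over a whole split for `encoder.batch_encode`: `encode` handles a single example, while `batch_encode` maps encoding over a sequence of examples. A minimal sketch of that interface split — a hypothetical stand-in, not torchnlp's `LabelEncoder` (which returns tensors, not lists):

```python
class LabelEncoder:
    """Toy label-to-index encoder sketching the encode/batch_encode split."""

    def __init__(self, labels):
        # Deterministic vocabulary: first-seen order over the corpus.
        self.index_to_label = list(dict.fromkeys(labels))
        self.label_to_index = {l: i for i, l in enumerate(self.index_to_label)}

    def encode(self, label):
        # One example -> one index.
        return self.label_to_index[label]

    def batch_encode(self, labels):
        # Many examples -> a sequence of indices (torchnlp stacks a tensor).
        return [self.encode(l) for l in labels]
```

Separating the two lets the batched path own concerns like stacking and padding, which is part of what the release notes mean by "variable sequence collate support".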
