Commit 488b4ce

Improve MacOS support and pin tensorflow version during testing (#383)
* Improve MacOS support
* Conditionally import tensorflow_text everywhere
* Use requirements files for continuous testing
* Fix logs
* Bug fixes and improvement for linux testing
* Typo fix
* Address review comments
1 parent 1880120 commit 488b4ce

19 files changed, with 205 additions and 58 deletions

.github/workflows/actions.yml

Lines changed: 4 additions & 4 deletions
@@ -29,8 +29,8 @@ jobs:
             ${{ runner.os }}-pip-
       - name: Install dependencies
         run: |
-          pip install tensorflow
-          pip install -e ".[tests]" --progress-bar off --upgrade
+          pip install -r requirements.txt --progress-bar off
+          pip install -e "." --progress-bar off
       - name: Test with pytest
         run: |
           pytest --cov=keras_nlp --cov-report xml:coverage.xml
@@ -57,7 +57,7 @@ jobs:
             ${{ runner.os }}-pip-
       - name: Install dependencies
         run: |
-          pip install tensorflow
-          pip install -e ".[tests]" --progress-bar off --upgrade
+          pip install -r requirements.txt --progress-bar off
+          pip install -e "." --progress-bar off
       - name: Lint
         run: bash shell/lint.sh

.github/workflows/nightly.yml

Lines changed: 2 additions & 6 deletions
@@ -30,12 +30,8 @@ jobs:
             ${{ runner.os }}-pip-
       - name: Install dependencies
         run: |
-          pip install -e ".[tests]" --progress-bar off --upgrade
-          pip uninstall keras -y
-          pip uninstall tensorflow -y
-          pip uninstall tensorflow_text -y
-          pip install tf-nightly --progress-bar off --upgrade
-          pip install tensorflow-text-nightly --progress-bar off --upgrade
+          pip install -r requirements-nightly.txt --progress-bar off
+          pip install -e "." --progress-bar off
       - name: Test with pytest
         run: |
           pytest --cov=keras_nlp --cov-report xml:coverage.xml

CONTRIBUTING.md

Lines changed: 77 additions & 27 deletions
@@ -84,25 +84,90 @@ Once the pull request is approved, a team member will take care of merging.
 Python 3.7 or later is required.

 Setting up your KerasNLP development environment requires you to fork the
-KerasNLP repository, clone the repository, create a virtual environment, and
-install dependencies.
-
-You can achieve this by running the following commands:
+KerasNLP repository and clone it locally. With the
+[GitHub CLI](https://github.com/cli/cli) installed, you can do this as follows:

 ```shell
 gh repo fork keras-team/keras-nlp --clone --remote
 cd keras-nlp
-python -m venv ~/keras-nlp-venv
-source ~/keras-nlp-venv/bin/activate
-pip install -e ".[tests]"
 ```

-The first line relies on having an installation of
-[the GitHub CLI](https://github.com/cli/cli).
+Next we must set up a Python environment with the correct dependencies. We
+recommend using `conda` to install tensorflow dependencies (such as CUDA), and
+`pip` to install Python packages from PyPI. The exact method will depend on your
+OS.
+
+### Linux (recommended)
+
+To set up a complete environment with TensorFlow, a local install of keras-nlp,
+and all development tools, run the following or adapt it to suit your needs.
+
+```shell
+# Create and activate conda environment.
+conda create -n keras-nlp python=3.9
+conda activate keras-nlp
+
+# The following can be omitted if GPU support is not required.
+conda install -c conda-forge cudatoolkit-dev=11.2 cudnn=8.1.0
+mkdir -p $CONDA_PREFIX/etc/conda/activate.d/
+echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
+echo 'export XLA_FLAGS=--xla_gpu_cuda_data_dir=$CONDA_PREFIX/' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
+source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
+
+# Install dependencies.
+python -m pip install --upgrade pip
+python -m pip install -r requirements.txt
+python -m pip install -e "."
+```
+
+### MacOS
+
+⚠️⚠️⚠️ MacOS binaries for the M1 architecture are not currently available from
+official sources. You can try an experimental development workflow leveraging the
+[tensorflow metal plugin](https://developer.apple.com/metal/tensorflow-plugin/)
+and a [community maintained build](https://github.com/sun1638650145/Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon)
+of `tensorflow-text`. These binaries are not provided by Google, so proceed at
+your own risk.
+
+#### Experimental instructions for Arm (M1)
+
+```shell
+# Create and activate conda environment.
+conda create -n keras-nlp python=3.9
+conda activate keras-nlp
+
+# Install dependencies.
+conda install -c apple tensorflow-deps=2.9
+python -m pip install --upgrade pip
+python -m pip install -r requirements-macos-m1.txt
+python -m pip install -e "."
+```

-Following these commands you should be able to run the tests using
-`pytest keras_nlp`. Please report any issues running tests following these
-steps.
+#### Instructions for x86 (Intel)
+
+```shell
+# Create and activate conda environment.
+conda create -n keras-nlp python=3.9
+conda activate keras-nlp
+
+# Install dependencies.
+python -m pip install --upgrade pip
+python -m pip install -r requirements.txt
+python -m pip install -e "."
+```
+
+### Windows
+
+For the best experience developing on Windows, please install
+[WSL](https://learn.microsoft.com/en-us/windows/wsl/install), and proceed with
+the Linux installation instructions above.
+
+To run the format and lint scripts, make sure you clone the repo with Linux
+style line endings and change any line separator settings in your editor.
+This is automatically done if you clone using git inside WSL.
+
+Note that we will not support Windows Shell/PowerShell for any scripts in this
+repository.

 ## Testing changes

@@ -143,18 +208,3 @@ the following commands manually every time you want to format your code:
 If after running these the CI flow is still failing, try updating `flake8`,
 `isort` and `black`. This can be done by running `pip install --upgrade black`,
 `pip install --upgrade flake8`, and `pip install --upgrade isort`.
-
-## Developing on Windows
-
-For Windows development, we recommend using WSL (Windows Subsystem for Linux),
-so you can run the shell scripts in this repository. We will not support
-Windows Shell/PowerShell. You can refer
-[to these instructions](https://docs.microsoft.com/en-us/windows/wsl/install)
-for WSL installation.
-
-Note that if you are using Windows Subsystem for Linux (WSL), make sure you
-clone the repo with Linux style LF line endings and change the default setting
-for line separator in your Text Editor before running the format
-or lint scripts. This is automatically done if you clone using git inside WSL.
-If there is conflict due to the line endings you might see an error
-like - `: invalid option`.
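
The setup instructions this commit adds to CONTRIBUTING.md differ per platform mainly in which requirements file is installed. A minimal sketch of that choice follows. `pick_requirements` is a hypothetical helper that is not part of the repository, and the sketch assumes Apple Silicon reports `arm64` from `platform.machine()`.

```python
import platform


def pick_requirements(system=None, machine=None):
    """Return the requirements file name for a platform.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    # Apple Silicon Macs use the experimental M1 requirements file.
    if system == "Darwin" and machine == "arm64":
        return "requirements-macos-m1.txt"
    # Linux (including WSL) and Intel Macs share the default file.
    return "requirements.txt"


print(pick_requirements())
```

Passing `system` and `machine` explicitly makes the mapping easy to check without running on each platform.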

examples/bert/README.md

Lines changed: 0 additions & 5 deletions
@@ -16,11 +16,6 @@ need to be trained for much longer on a much larger dataset.
 OUTPUT_DIR=~/bert_test_output
 DATA_URL=https://storage.googleapis.com/tensorflow/keras-nlp/examples/bert

-# Create a virtual env and install dependencies.
-mkdir $OUTPUT_DIR
-python3 -m venv $OUTPUT_DIR/env && source $OUTPUT_DIR/env/bin/activate
-pip install -e ".[tests,examples]"
-
 # Download example data.
 wget ${DATA_URL}/bert_vocab_uncased.txt -O $OUTPUT_DIR/bert_vocab_uncased.txt
 wget ${DATA_URL}/wiki_example_data.txt -O $OUTPUT_DIR/wiki_example_data.txt

keras_nlp/integration_tests/basic_usage_test.py

Lines changed: 7 additions & 3 deletions
@@ -13,13 +13,17 @@
 # limitations under the License.

 import tensorflow as tf
+from absl.testing import parameterized
 from tensorflow import keras

 import keras_nlp


-class BasicUsageTest(tf.test.TestCase):
-    def test_quick_start(self):
+class BasicUsageTest(tf.test.TestCase, parameterized.TestCase):
+    @parameterized.named_parameters(
+        ("jit_compile_false", False), ("jit_compile_true", True)
+    )
+    def test_quick_start(self, jit_compile):
         """This matches the quick start example in our base README."""

         # Tokenize some inputs with a binary label.
@@ -47,7 +51,7 @@ def test_quick_start(self):
         model = keras.Model(inputs, outputs)

         # Run a single batch of gradient descent.
-        model.compile(loss="binary_crossentropy", jit_compile=True)
+        model.compile(loss="binary_crossentropy", jit_compile=jit_compile)
         loss = model.train_on_batch(x, y)

         # Make sure we have a valid loss.
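
The change above runs the quick start test once per `jit_compile` value. The same idea can be sketched with only the standard library by swapping `parameterized.named_parameters` from `absl` for `subTest` from `unittest`. This is an alternative technique, not what the commit uses, and `fake_train_step` is a hypothetical stand-in for compiling the model and training on one batch.

```python
import unittest


def fake_train_step(jit_compile):
    """Hypothetical stand-in for model.compile(...) plus train_on_batch(...)."""
    # A real test would build and train a model; this returns a fixed loss.
    return 0.5


class QuickStartSketch(unittest.TestCase):
    def test_quick_start(self):
        # Run the same body once per setting; failures are reported per case.
        cases = [("jit_compile_false", False), ("jit_compile_true", True)]
        for name, jit_compile in cases:
            with self.subTest(name):
                loss = fake_train_step(jit_compile)
                self.assertGreater(loss, 0)
```

One difference worth noting: `named_parameters` generates a separate test method per case, while `subTest` keeps a single test method and reports each case inside it.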

keras_nlp/layers/mlm_mask_generator.py

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,9 +13,15 @@
1313
# limitations under the License.
1414

1515
import tensorflow as tf
16-
import tensorflow_text as tf_text
1716
from tensorflow import keras
1817

18+
from keras_nlp.utils.tf_utils import assert_tf_text_installed
19+
20+
try:
21+
import tensorflow_text as tf_text
22+
except ImportError:
23+
tf_text = None
24+
1925

2026
class MLMMaskGenerator(keras.layers.Layer):
2127
"""Layer that applies language model masking.
@@ -96,6 +102,8 @@ def __init__(
96102
random_token_rate=0.1,
97103
**kwargs,
98104
):
105+
assert_tf_text_installed(self.__class__.__name__)
106+
99107
super().__init__(**kwargs)
100108
self.vocabulary_size = vocabulary_size
101109
self.unselectable_token_ids = unselectable_token_ids
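
The body of `assert_tf_text_installed` is not shown in this part of the diff. The general pattern (import the optional package if present, fail only when a class that needs it is constructed) might look roughly like the sketch below. All names here (`optional_import`, `assert_installed`, `NeedsOptional`, and the deliberately missing package name) are hypothetical and chosen so the example runs without TensorFlow.

```python
import importlib


def optional_import(name):
    """Return the module called `name`, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical package name, chosen so the import fails in this sketch.
missing = optional_import("no_such_package_for_sketch")


def assert_installed(module, package, symbol_name):
    """Raise a descriptive ImportError if the optional module is absent."""
    if module is None:
        raise ImportError(
            f"{symbol_name} requires the `{package}` package. "
            f"Please install it with `pip install {package}`."
        )


class NeedsOptional:
    def __init__(self):
        # Fail at construction time rather than at import time.
        assert_installed(
            missing, "no_such_package_for_sketch", self.__class__.__name__
        )
```

Deferring the check to `__init__` means the module itself stays importable on platforms where the optional package has no official build, which appears to be the motivation for the MacOS changes in this commit.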

keras_nlp/layers/multi_segment_packer.py

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,9 +15,15 @@
1515
"""BERT token packing layer."""
1616

1717
import tensorflow as tf
18-
import tensorflow_text as tf_text
1918
from tensorflow import keras
2019

20+
from keras_nlp.utils.tf_utils import assert_tf_text_installed
21+
22+
try:
23+
import tensorflow_text as tf_text
24+
except ImportError:
25+
tf_text = None
26+
2127

2228
class MultiSegmentPacker(keras.layers.Layer):
2329
"""Packs multiple sequences into a single fixed width model input.
@@ -106,6 +112,8 @@ def __init__(
106112
truncator="round_robin",
107113
**kwargs,
108114
):
115+
assert_tf_text_installed(self.__class__.__name__)
116+
109117
super().__init__(**kwargs)
110118
self.sequence_length = sequence_length
111119
if truncator not in ("round_robin", "waterfall"):

keras_nlp/metrics/rouge_base.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@
2020
import tensorflow as tf
2121
from tensorflow import keras
2222

23-
from keras_nlp.utils.tensor_utils import tensor_to_string_list
23+
from keras_nlp.utils.tf_utils import tensor_to_string_list
2424

2525
try:
2626
import rouge_score
@@ -62,8 +62,8 @@ def __init__(
6262

6363
if rouge_score is None:
6464
raise ImportError(
65-
"ROUGE metric requires the `rouge_score` package. "
66-
"Please install it with `pip install rouge-score`."
65+
f"{self.__class__.__name__} requires the `rouge_score` "
66+
"package. Please install it with `pip install rouge-score`."
6767
)
6868

6969
if not tf.as_dtype(self.dtype).is_floating:
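
Because this file holds a base class, switching the message to `self.__class__.__name__` means the error names whichever concrete metric the user constructed. A minimal sketch of that behavior, using hypothetical class names (`MetricBase`, `MetricA`) rather than the real metric classes:

```python
class MetricBase:
    """Hypothetical stand-in for a shared metric base class."""

    def requirement_message(self):
        # self.__class__ is the runtime class, so subclasses report their own name.
        return (
            f"{self.__class__.__name__} requires the `rouge_score` "
            "package. Please install it with `pip install rouge-score`."
        )


class MetricA(MetricBase):
    """Hypothetical concrete metric."""


print(MetricA().requirement_message())
```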

keras_nlp/tokenizers/byte_tokenizer.py

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,9 +16,14 @@
1616

1717
import numpy as np
1818
import tensorflow as tf
19-
import tensorflow_text as tf_text
2019

2120
from keras_nlp.tokenizers import tokenizer
21+
from keras_nlp.utils.tf_utils import assert_tf_text_installed
22+
23+
try:
24+
import tensorflow_text as tf_text
25+
except ImportError:
26+
tf_text = None
2227

2328

2429
class ByteTokenizer(tokenizer.Tokenizer):
@@ -150,6 +155,8 @@ def __init__(
150155
replacement_char: int = 65533,
151156
**kwargs,
152157
):
158+
assert_tf_text_installed(self.__class__.__name__)
159+
153160
# Check dtype and provide a default.
154161
if "dtype" not in kwargs or kwargs["dtype"] is None:
155162
kwargs["dtype"] = tf.int32

keras_nlp/tokenizers/sentence_piece_tokenizer.py

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -17,10 +17,15 @@
1717
from typing import List
1818

1919
import tensorflow as tf
20-
import tensorflow_text as tf_text
2120

2221
from keras_nlp.tokenizers import tokenizer
23-
from keras_nlp.utils.tensor_utils import tensor_to_string_list
22+
from keras_nlp.utils.tf_utils import assert_tf_text_installed
23+
from keras_nlp.utils.tf_utils import tensor_to_string_list
24+
25+
try:
26+
import tensorflow_text as tf_text
27+
except ImportError:
28+
tf_text = None
2429

2530

2631
class SentencePieceTokenizer(tokenizer.Tokenizer):
@@ -96,6 +101,8 @@ def __init__(
96101
sequence_length: int = None,
97102
**kwargs,
98103
) -> None:
104+
assert_tf_text_installed(self.__class__.__name__)
105+
99106
# Check dtype and provide a default.
100107
if "dtype" not in kwargs or kwargs["dtype"] is None:
101108
kwargs["dtype"] = tf.int32
