-# KerasNLP
+# KerasNLP: Modular NLP Workflows for Keras
 [![](https://github.com/keras-team/keras-nlp/workflows/Tests/badge.svg?branch=master)](https://github.com/keras-team/keras-nlp/actions?query=workflow%3ATests+branch%3Amaster)
 ![Python](https://img.shields.io/badge/python-v3.7.0+-success.svg)
 ![Tensorflow](https://img.shields.io/badge/tensorflow-v2.5.0+-success.svg)
 [![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/keras-team/keras-nlp/issues)

-KerasNLP is a simple and powerful API for building Natural Language Processing
-(NLP) models within the Keras ecosystem.

-KerasNLP provides modular building blocks following
-standard Keras interfaces (layers, metrics) that allow you to quickly and
-flexibly iterate on your task. Engineers working in applied NLP can leverage the
-library to assemble training and inference pipelines that are both
-state-of-the-art and production-grade.
+KerasNLP is a natural language processing library that supports users through
+their entire development cycle. Our workflows are built from modular components
+that have state-of-the-art preset weights and architectures when used
+out-of-the-box and are easily customizable when more control is needed. We
+emphasize in-graph computation for all workflows so that developers can expect
+easy productionization using the TensorFlow ecosystem.

-KerasNLP can be understood as a horizontal extension of the Keras API —
-components are first-party Keras objects that are too specialized to be
-added to core Keras, but that receive the same level of polish as the rest of
-the Keras API.
+This library is an extension of the core Keras API; all high-level modules are
+[`Layers`](https://keras.io/api/layers/) or
+[`Models`](https://keras.io/api/models/) that receive the same level of polish
+as core Keras. If you are familiar with Keras, congratulations! You already
+understand most of KerasNLP.

-We are a new and growing project, and welcome [contributions](CONTRIBUTING.md).
+See our [Getting Started guide](https://keras.io/guides/keras_nlp/getting_started)
+for example usage of our modular API starting with evaluating pretrained models
+and building up to designing a novel transformer architecture and training a
+tokenizer from scratch.
+
+We are a new and growing project and welcome [contributions](CONTRIBUTING.md).

 ## Quick Links

@@ -27,6 +32,7 @@ We are a new and growing project, and welcome [contributions](CONTRIBUTING.md).
 - [Home Page](https://keras.io/keras_nlp)
 - [Developer Guides](https://keras.io/guides/keras_nlp)
 - [API Reference](https://keras.io/api/keras_nlp)
+- [Getting Started guide](https://keras.io/guides/keras_nlp/getting_started)

 ### For contributors

@@ -53,40 +59,37 @@ pip install git+https://github.com/keras-team/keras-nlp.git --upgrade

 ## Quickstart

-Tokenize text, build a tiny transformer, and train a single batch:
+Fine-tune BERT on a small sentiment analysis task using the
+[`keras_nlp.models`](https://keras.io/api/keras_nlp/models/) API:

 ```python
 import keras_nlp
-import tensorflow as tf
 from tensorflow import keras
+import tensorflow_datasets as tfds

-# Tokenize some inputs with a binary label.
-vocab = ["[UNK]", "the", "qu", "##ick", "br", "##own", "fox", "."]
-sentences = ["The quick brown fox jumped.", "The fox slept."]
-tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
-    vocabulary=vocab,
-    sequence_length=10,
+imdb_train, imdb_test = tfds.load(
+    "imdb_reviews",
+    split=["train", "test"],
+    as_supervised=True,
+    batch_size=16,
+)
+classifier = keras_nlp.models.BertClassifier.from_preset(
+    "bert_base_en_uncased",
+)
+classifier.compile(
+    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
+    optimizer=keras.optimizers.experimental.AdamW(5e-5),
+    metrics=keras.metrics.SparseCategoricalAccuracy(),
+    jit_compile=True,
 )
-x, y = tokenizer(sentences), tf.constant([1, 0])
-
-# Create a tiny transformer.
-inputs = keras.Input(shape=(None,), dtype="int32")
-outputs = keras_nlp.layers.TokenAndPositionEmbedding(
-    vocabulary_size=len(vocab),
-    sequence_length=10,
-    embedding_dim=16,
-)(inputs)
-outputs = keras_nlp.layers.TransformerEncoder(
-    num_heads=4,
-    intermediate_dim=32,
-)(outputs)
-outputs = keras.layers.GlobalAveragePooling1D()(outputs)
-outputs = keras.layers.Dense(1, activation="sigmoid")(outputs)
-model = keras.Model(inputs, outputs)
-
-# Run a single batch of gradient descent.
-model.compile(optimizer="adam", loss="binary_crossentropy", jit_compile=True)
-model.train_on_batch(x, y)
+classifier.fit(
+    imdb_train,
+    validation_data=imdb_test,
+    epochs=1,
+)
+
+# Predict a new example
+classifier.predict(["What an amazing movie, three hours of pure bliss!"])
 ```

 For more in depth guides and examples, visit https://keras.io/keras_nlp/.
@@ -104,7 +107,7 @@ KerasNLP provides access to pre-trained models via the `keras_nlp.models` API.
 These pre-trained models are provided on an "as is" basis, without warranties
 or conditions of any kind. The following underlying models are provided by third
 parties, and subject to separate licenses:
-DistilBERT, RoBERTa, XLM-RoBERTa, GPT-2.
+DistilBERT, RoBERTa, XLM-RoBERTa, DeBERTa, and GPT-2.

 ## Citing KerasNLP

@@ -114,7 +117,8 @@ Here is the BibTeX entry:
 ```bibtex
 @misc{kerasnlp2022,
   title={KerasNLP},
-  author={Watson, Matthew, and Qian, Chen, and Zhu, Scott and Chollet, Fran\c{c}ois and others},
+  author={Watson, Matthew, and Qian, Chen, and Bischof, Jonathan and Chollet,
+  Fran\c{c}ois and others},
   year={2022},
   howpublished={\url{https://github.com/keras-team/keras-nlp}},
 }