
Commit cced595: Merge pull request #252 from seanpmorgan/gelu-migration
RFC: Migrate gelu activation from Addons to Core
Authored by ematejska · 2 parents 5f3e443 + cff3725

1 file changed: rfcs/20200525-gelu-migration.md (80 additions, 0 deletions)

# Migrate gelu activation from TensorFlow Addons to TensorFlow Core

| Status                   | Accepted                                                 |
| :----------------------- | :------------------------------------------------------- |
| **RFC #**                | [252](https://github.com/tensorflow/community/pull/252)  |
| **Authors**              | Tzu-Wei Sung (@WindQAQ) & Sean Morgan (@seanpmorgan)     |
| **Sponsor**              | @alextp                                                  |
| **Updated**              | 2020-07-15                                               |
| **Sponsorship Deadline** | 2020-07-17 (45 Days after submission)                    |

## Rationale for Migration

* [Gaussian Error Linear Units (GELUs)](https://arxiv.org/pdf/1606.08415.pdf) cited 600+ times
* Used in BERT and other influential architectures
* Multiple approvals from TF side:
  * https://github.com/tensorflow/tensorflow/pull/33945#issuecomment-617832325
  * https://github.com/tensorflow/tensorflow/issues/32783#issuecomment-537284266

## Historical Information

* Have there been significant issues reported to Addons that need to be addressed?
  * Only ABI incompatibilities for the custom-op (not an issue if built along with core TF)
* When was it implemented in Addons?
  * C++ custom-op added **2019-08 (TFA 0.5.0)**
  * Python composite op added **2020-02-26 (TFA 0.9.0)**
* We have [performed benchmarking of the GELU activation](https://colab.research.google.com/drive/1rLb4EuydbFg9PbhboXhCDqopcl6BmphG#scrollTo=0GL2x2S4zxW3),
  which shows it may be beneficial to retain the custom-ops, but the maintenance burden has grown
  too large for us to continue supporting it in Addons.
* This migration is long overdue, but we have struggled with finalizing the migration process.

## Implementation Details

* Link to implementation in Addons:
  * Python: https://github.com/tensorflow/addons/blob/r0.10/tensorflow_addons/activations/gelu.py
  * C++: https://github.com/tensorflow/addons/blob/r0.10/tensorflow_addons/custom_ops/activations/cc/kernels/gelu_op.h
* Does this include custom-op kernels?
  * Yes, but we are currently proposing to migrate only the Python composite op. This may
    change with discussion in the RFC.
* Are they CPU/GPU/TPU compatible?
  * CPU/GPU compatible. No support for TPU.
* What is the pytest coverage of the addon?
  * `tensorflow_addons/activations/gelu.py 89%`

## Changes to Implementation (If Needed)

```
import math

import tensorflow as tf
from tensorflow_addons.utils import types


def gelu(x: types.TensorLike, approximate: bool = True) -> tf.Tensor:
    x = tf.convert_to_tensor(x)
    if approximate:
        # Tanh-based approximation from the GELU paper
        pi = tf.cast(math.pi, x.dtype)
        coeff = tf.cast(0.044715, x.dtype)
        return 0.5 * x * (1.0 + tf.tanh(tf.sqrt(2.0 / pi) * (x + coeff * tf.pow(x, 3))))
    else:
        # Exact form based on the Gaussian error function
        return 0.5 * x * (1.0 + tf.math.erf(x / tf.cast(tf.sqrt(2.0), x.dtype)))
```

The above implementation would only bring over the Python composite op. Since there is
[no way for us to build tfxla kernels](https://github.com/tensorflow/tensorflow/pull/33945#issuecomment-617842977),
we had no support for TPUs in Addons. [There were comments](https://github.com/tensorflow/tensorflow/pull/33945#issuecomment-625380208)
about using a "selector", but we would need guidance on how to implement that.

We may also want to discuss the `approximate` bool and whether it should be included in the
upstream version.

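As a point of reference for that discussion, here is a minimal sketch (assuming the `gelu`
definition above is in scope and eager execution) comparing the two branches numerically:

```
import tensorflow as tf

# Illustrative only: compares the tanh approximation to the erf-based form,
# assuming the `gelu` function defined above is available in scope.
x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])

approx = gelu(x, approximate=True)   # tanh-based approximation
exact = gelu(x, approximate=False)   # erf-based form

# The two forms agree closely for moderate inputs.
print(tf.reduce_max(tf.abs(approx - exact)).numpy())
```
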
## Transition Plan

* The activation would land in [nn_ops.py](https://github.com/tensorflow/tensorflow/blob/r2.2/tensorflow//python/ops/nn_ops.py), [keras activations](https://github.com/tensorflow/tensorflow/blob/r2.2/tensorflow/python/keras/activations.py),
  and possibly in [keras advanced_activation layers](https://github.com/tensorflow/tensorflow/blob/r2.2/tensorflow/python/keras/layers/advanced_activations.py)
* No planned changes to the parameter signatures at this time
* Addons would deprecate our activation and make a call to the core functionality (see the sketch after this list).
* After merging to TF Core:
  * Consolidate/remove https://github.com/tensorflow/models/blob/r2.2.0/official/modeling/activations/gelu.py
  * Consolidate/remove https://github.com/tensorflow/models/blob/r2.2.0/official/modeling/activations/gelu_test.py
  * Consolidate/remove https://github.com/tensorflow/models/blob/r2.2.0/official/nlp/xlnet/xlnet_modeling.py#L29

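As a rough illustration of the deprecation step, the Addons-side shim could look something like
the sketch below; the core symbol name `tf.nn.gelu` and the warning text are assumptions, not
decisions made by this RFC:

```
import warnings

import tensorflow as tf


def gelu(x, approximate: bool = True) -> tf.Tensor:
    # Sketch of a deprecated Addons wrapper: warn, then forward to the
    # (assumed) core implementation.
    warnings.warn(
        "tfa.activations.gelu is deprecated; use the TensorFlow Core "
        "implementation instead.",
        DeprecationWarning,
    )
    return tf.nn.gelu(x, approximate=approximate)
```
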
## Relevant GitHub Issues

* https://github.com/tensorflow/tensorflow/pull/33945
* https://github.com/tensorflow/addons/issues/550
* https://github.com/tensorflow/tensorflow/issues/32783

## Questions and Discussion Topics

* Who from the TF core team would sponsor this migration and take ownership of the API?
* Is it worth bringing over the custom-op kernels for CPU/GPU?

## Final Decision

TBD
