README.md (6 additions & 2 deletions)
@@ -28,7 +28,11 @@ Available models (Training ATM):
 - KoGPT2-medium(345M)
 - KoGPT2-large(774M)
 - KoGPT2-xlarge(1.5B)
-- KoGPT2-2.7B
+- KoGPT2-2.7B *TBA*
+
+Models are available as TF checkpoint files (see the [training script](https://github.com/ksjae/KoGPT2-train)) or in [Huggingface transformers](https://github.com/huggingface/transformers.git)-compatible form.
+
+Available n_ctx values: 1024, 2048, 384
 
 ## Intended uses & limitations
 Intended for **Korean** text generation with [ai-text-adventure](https://github.com/ksjae/ai-text-adventure) and PPLM.
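The n_ctx values listed above bound how many tokens the model can attend to at once. A minimal sketch of what that implies in practice (plain Python, no library dependencies; the helper name `clip_to_context` is illustrative, not part of this repo): before generation, a prompt longer than n_ctx must be truncated to its most recent n_ctx tokens.

```python
def clip_to_context(token_ids, n_ctx):
    """Keep only the last n_ctx tokens; shorter inputs pass through unchanged."""
    if n_ctx <= 0:
        raise ValueError("n_ctx must be positive")
    return token_ids[-n_ctx:]

# Example: a 3000-token prompt against a 2048-token (v0.2+) model.
prompt = list(range(3000))        # stand-in for real token ids
clipped = clip_to_context(prompt, 2048)
print(len(clipped), clipped[0])   # -> 2048 952 (tokens 0..951 fall out of context)
```

A larger n_ctx (2048 vs. 1024 or 384) simply pushes this cutoff further back, which matters for long-running text-adventure transcripts.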
@@ -48,7 +52,7 @@ or go to [KoGPT2-train](https://github.com/ksjae/KoGPT2-train) and use scripts/d
 v0.1 may have faulty tokenizers, producing bad outputs.
 
 v0.2+ will be GPT-2 with an n_ctx of 2048. A true GPT-3 implementation (alternating layers) will not be available within the year.
-v0.2-story is producing hashtags (which were not finetuned for)
+
 If other limitations or errors are found, please open an issue.