Commit 1299346

Update README.md
1 parent 33b6f7c commit 1299346

1 file changed

+14
-15
lines changed

README.md

Lines changed: 14 additions & 15 deletions
@@ -19,15 +19,15 @@ Demo available at : [아무말 대잔치](https://text.ksjit.com)
 
 ## Model description
 GPT2 and GPT3 trained on ~40GB of Korean datasets.
-see [Training data] for more details.
+see the included json files for hyperparameter details.
 
 Available models (Training ATM):
 - KoGPT2-base(117M)
 - KoGPT2-medium(345M)
 - KoGPT2-large(774M)
 - KoGPT2-xlarge(1.5B)
-- KoGPT2-2.7B
-- KoGPT2-6.7B
+- KoGPT3-2.7B
+- KoGPT3-6.7B
 - GPT3-13B if possible
 
 ## Intended uses & limitations
@@ -37,22 +37,18 @@ Intended for **Korean** text generation for ai-text-adventure(https://github.com
 Download files from links at the Releases tab.
 Alternatively, it is available from [my own server(wget-friendly)](https://static.ksjit.com)
 
-*Will be available from HuggingFace Model Hub soon*
-
 #### How to use
 
-from huggingface/transformers
-```bash
-python3 examples/text-generation/run_generation.py --model_type=gpt2 --model_name_or_path=kogpt2 --length=100 --fp16 --repetition_penalty=2 --p=0.8 --k=20
-```
+Try out on [colab](https://colab.research.google.com/drive/1s5zZZL8j2waMTkwUOmSOv6IywoBrNm1z?usp=sharing)
 
-Or try out on [colab](https://colab.research.google.com/drive/1s5zZZL8j2waMTkwUOmSOv6IywoBrNm1z?usp=sharing)
+or go to [KoGPT2-train](https://github.com/ksjae/KoGPT2-train) and use scripts/demo.py
 
 #### Limitations and bias
 
 v0.1 may have faulty tokenizers, producing bad outputs.
 
-v0.2 will be GPT2 with n_ctx of 2048. True form of GPT-3 implementation(alternating layers) will not be available within the year.
+v0.2+ be GPT2 with n_ctx of 2048. True form of GPT-3 implementation(alternating layers) will not be available within the year.
+v0.2-story is producing hashtags (which were not finetuned for)
 If other limitations or errors are found, please open an issue.
 
 ## Training data
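For readers comparing against the `run_generation.py` command removed in the hunk above, the following is a minimal pure-Python sketch of what its sampling flags (`--repetition_penalty=2`, `--p=0.8`, `--k=20`) conventionally mean. It is an illustration under assumptions, not the `transformers` implementation and not the code behind `scripts/demo.py`; the function names `apply_repetition_penalty` and `top_k_top_p_filter` are hypothetical.

```python
import math


def apply_repetition_penalty(logits, generated_ids, penalty):
    """CTRL-style penalty: lower the score of tokens that were already generated."""
    out = list(logits)
    for tok in set(generated_ids):
        # Dividing a positive logit or multiplying a negative one both reduce it.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def top_k_top_p_filter(logits, k, p):
    """Return the token ids that survive top-k, then nucleus (top-p) filtering."""
    # Top-k: keep the ids of the k highest logits, best first.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors only (subtract the max for numerical stability).
    best = logits[ranked[0]]
    weights = [math.exp(logits[i] - best) for i in ranked]
    total = sum(weights)
    kept, cumulative = [], 0.0
    for tok, weight in zip(ranked, weights):
        kept.append(tok)
        cumulative += weight / total
        # Stop once the kept tokens cover at least probability mass p.
        if cumulative >= p:
            break
    return kept


# Hypothetical 4-token vocabulary; token 3 has already been generated.
penalized = apply_repetition_penalty([3.0, 1.0, 0.0, -2.0], generated_ids=[3], penalty=2.0)
candidates = top_k_top_p_filter(penalized, k=3, p=0.9)
```

The next token would then be sampled from `candidates` only, in proportion to their renormalized probabilities.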
@@ -67,7 +63,7 @@ The following data was used, and is available for redistribution [here](https://
 - NAVER movie reviews
 - Korean news(about 1GB) from Leipzig(a German university)
 - Context data from KorSQUAD questions
-- Parsed CommonCrawl data(WIP)
+- Parsed Korean CommonCrawl data(WIP)
 
 Please note the completed dataset includes <|endoftext|> tags.
 
@@ -86,21 +82,24 @@ All hyperparameters are the same as GPT2-large
 One paragraph per line(TextDataset)
 
 Early models(GPT2-large v0.2 and prior) are trained on 2xTesla V100 for 3~4 weeks.
-Some models are trained on v3-8 TPUs.
-Larger models are trained on v3-128 TPUs.
+Models up to XL size are trained on v3-8 TPUs.
 
 ## Eval results
 > prompt >>> 나는 어두운 숲 속을 거닐고 있다.
 
+### v0.1 (xl)
 > 어느새 나는 그 작은 산들을 향해 뛰기 시작했다. 그리고 내 시야에는 이 골짜기에 대한 어떤 동경도, 혹은 동경과 환희조차 엿보였다가 사라졌다가는 사라져 버리고 말았 다. 그러나 나 역시 그것을 믿지 않았다. 아니 그것도 모른다……. 그렇다면 그것은 또 무슨 말인가? 내가 이렇게 말해도 좋을 텐데…… 하지만 어쨌든 이곳엔 왜 그리 먼 곳에서부터 찾아올까 하고 생각해 보았다. 그래서 그는 지금 어느 한 곳만을 헤매면서 자 꾸만 걸어오는 것일까?
 
+### v0.2-story (xl)
+> 나는 어두운 숲 속을 거닐고 있다. #앨리건트테이블 의 #코코넛젤리 색이 넘 예쁘고 맛나다. #그릭요거트 원물이 들어간 마지막 베이커리 #말차초코케이크 도 맛나고 겉에 초코도 두껍게 씹히고..✨💕💕👍🏻 . . #앨리건트테이블 @eleganttable_ #간식 #카페어니언베이커리베이커 #콩콩볼 #쿠키 #디저트맛집 #디저트카페 #dessert #카페스타그램 #맛스타그램 #먹스타 그램 #먹방 #일상 #daily
+
 
 ### BibTeX entry and citation info
 
 ```bibtex
 @unpublished{CitekeyUnpublished,
 author = "Seungjae Kim",
-title = "KoGPT : Larger versions of KoGPT2",
+title = "Introducing larger KoGPT2",
 year = 2020
 }
 ```
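
The README text in this diff says the dataset is stored one paragraph per line and includes `<|endoftext|>` tags. A minimal sketch of producing text in that shape might look like the following. It assumes the tag goes on its own line after each document, which the README does not state, and the helper name `to_training_lines` is hypothetical rather than taken from the repository.

```python
EOT = "<|endoftext|>"  # tag named in the README; its exact placement is an assumption here


def to_training_lines(documents):
    """Flatten raw documents into one-paragraph-per-line training text."""
    lines = []
    for doc in documents:
        for paragraph in doc.split("\n"):
            # Collapse internal whitespace so each paragraph fits on a single line.
            paragraph = " ".join(paragraph.split())
            if paragraph:
                lines.append(paragraph)
        # Assumed convention: mark the document boundary with a separate tag line.
        lines.append(EOT)
    return lines


sample = to_training_lines(["first paragraph\n\nsecond   paragraph", "another document"])
```

Joining `sample` with newlines gives a file that a line-oriented loader could read paragraph by paragraph.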
