fix (#90) split text if token larger than 4096#106

Open
jeffery9 wants to merge 6 commits into yihong0618:main from jeffery9:split_p

Conversation

@jeffery9
Contributor

@jeffery9 jeffery9 commented Mar 8, 2023

Split the text when its token count exceeds the quota.

@jeffery9
Contributor Author

jeffery9 commented Mar 8, 2023

should fix #90

@jeffery9 jeffery9 closed this Mar 8, 2023
@jeffery9 jeffery9 reopened this Mar 8, 2023
@jeffery9 jeffery9 marked this pull request as ready for review March 8, 2023 06:40
@jeffery9 jeffery9 changed the title from "fix #90" to "split if token larger than 4096, try to fix #90" Mar 8, 2023
@jeffery9 jeffery9 changed the title from "split if token larger than 4096, try to fix #90" to "[fix #90] split text if token larger than 4096" Mar 10, 2023
@jeffery9 jeffery9 changed the title from "[fix #90] split text if token larger than 4096" to "fix (#90) split text if token larger than 4096" Mar 10, 2023
Comment on lines +22 to +106

message_log = [
    {
        "role": "user",
        # english prompt here to save tokens
        "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
    }
]
count_tokens = num_tokens_from_messages(message_log)
consumed_tokens = 0
t_text = ""
if count_tokens > 4000:
    print("too long!")

    splits = count_tokens // 4000 + 1

    text_list = text.split(".")
    sub_text = ""
    t_sub_text = ""
    for n in range(splits):
        text_segment = text_list[n * splits : (n + 1) * splits]
        sub_text = ".".join(text_segment)
        print(sub_text)

        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    # english prompt here to save tokens
                    "content": f"Please help me to translate,`{sub_text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_sub_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        print(t_sub_text)
        consumed_tokens += completion["usage"]["prompt_tokens"]

        t_text = t_text + t_sub_text

else:
    try:
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    # english prompt here to save tokens
                    "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        consumed_tokens += completion["usage"]["prompt_tokens"]

    except Exception as e:
        # TIME LIMIT for open api please pay
        key_len = self.key.count(",") + 1
        sleep_time = int(60 / key_len)
        time.sleep(sleep_time)
        print(e, f"will sleep {sleep_time} seconds")
        self.rotate_key()
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {
                    "role": "user",
                    "content": f"Please help me to translate,`{text}` to {self.language}, please return only translated content not include the origin text",
                }
            ],
        )
        t_text = (
            completion["choices"][0]
            .get("message")
            .get("content")
            .encode("utf8")
            .decode()
        )
        consumed_tokens += completion["usage"]["prompt_tokens"]

print(t_text)
print(f"{consumed_tokens} prompt tokens used.")
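
One detail in the loop above may be worth a second look: the slice `text_list[n * splits : (n + 1) * splits]` takes `splits` sentences per pass, so at most `splits * splits` sentences are sent, and any sentences beyond that would be left out. The following is a minimal sketch, not the PR's code, of dividing the sentence list into roughly equal groups so that every sentence is covered. The helper name `split_sentences` is hypothetical.

```python
import math


def split_sentences(text, splits):
    """Split text on "." into at most `splits` groups covering every sentence.

    Hypothetical helper for illustration; not part of the PR.
    """
    sentences = text.split(".")
    # Sentences per group, rounded up so nothing is left over.
    per_group = math.ceil(len(sentences) / splits)
    groups = []
    for start in range(0, len(sentences), per_group):
        groups.append(".".join(sentences[start : start + per_group]))
    return groups
```

Joining the returned groups with "." reproduces the original text, which is a quick way to check that nothing was dropped.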
Owner

This function is too long; let's split it.
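
As a rough sketch of what such a split might look like (an assumption for illustration, not the refactoring that was later pushed), the prompt construction and the request could each move into a small helper shared by the long-text and short-text branches. `build_messages`, `translate_once`, and the `create` parameter are hypothetical names; `create` stands in for `openai.ChatCompletion.create` so the example runs without network access.

```python
def build_messages(text, language):
    """Build the single-message chat payload used for every request."""
    return [
        {
            "role": "user",
            # english prompt here to save tokens
            "content": f"Please help me to translate,`{text}` to {language}, please return only translated content not include the origin text",
        }
    ]


def translate_once(text, language, create):
    """Send one request and return (translated_text, prompt_tokens).

    `create` is any callable accepting `model` and `messages` keywords,
    such as openai.ChatCompletion.create (hypothetical injection point).
    """
    completion = create(
        model="gpt-3.5-turbo",
        messages=build_messages(text, language),
    )
    content = completion["choices"][0]["message"]["content"]
    return content, completion["usage"]["prompt_tokens"]


def fake_create(model, messages):
    """Stand-in for the API call: echoes the prompt in upper case."""
    prompt = messages[0]["content"]
    return {
        "choices": [{"message": {"content": prompt.upper()}}],
        "usage": {"prompt_tokens": len(prompt.split())},
    }


translated, used = translate_once("hello", "French", fake_create)
```

Passing the request function in as an argument is only one possible design; it keeps the helpers testable with a stub like `fake_create`, and the retry-and-rotate-key logic could wrap `translate_once` in one place instead of being repeated.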

Contributor Author

@jeffery9 jeffery9 left a comment

refactored.

Contributor Author

@jeffery9 jeffery9 left a comment

verified

@jeffery9
Contributor Author

Passed 'black . --check' locally, but it does not pass in CI.

@jeffery9 jeffery9 requested review from yihong0618 March 13, 2023 05:17
@yihong0618
Owner

pip install -U black

@jeffery9
Contributor Author

pip install -U black

Already formatted.


jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ python3.9 -m pip install -U black
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://mirrors.163.com/pypi/simple/
Requirement already satisfied: black in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (21.6b0)
Collecting black
  Using cached https://mirrors.163.com/pypi/packages/9b/27/b2f98b627738b02dcac06ae9e2ab13f14ab906fe6dd6366050c76883d4b5/black-21.12b0-py3-none-any.whl (156 kB)
Requirement already satisfied: click>=7.1.2 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (8.0.3)
Requirement already satisfied: mypy-extensions>=0.4.3 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.4.3)
  Using cached https://mirrors.163.com/pypi/packages/c7/24/0de05480822e5f0f2cc539fce9029bc2507b44b7f85ec1a9e23d89dea6c3/black-21.11b1-py3-none-any.whl (155 kB)
  Using cached https://mirrors.163.com/pypi/packages/3d/ad/1cf514e7f9ee4c3d8df7c839d7977f7605ad76557f3fca741ec67f76dba6/black-21.11b0-py3-none-any.whl (155 kB)
  Using cached https://mirrors.163.com/pypi/packages/12/df/0e55791b9c6ca07b4a3404eef6cee1ca42503bf16e9fc9df0247b4803cf1/black-21.10b0-py3-none-any.whl (150 kB)
  Using cached https://mirrors.163.com/pypi/packages/d2/16/a92c999103bee1236dd93f703f3522217fe00bd97bd50ae3699c2d91e320/black-21.9b0-py3-none-any.whl (148 kB)
  Using cached https://mirrors.163.com/pypi/packages/9d/11/cee7b695f95178025c428168dd75094f0e00fdcfe0fd004a0f8bc9bea3ee/black-21.8b0-py3-none-any.whl (148 kB)
  Using cached https://mirrors.163.com/pypi/packages/b6/6e/b706ab6440ebac6e0f7fb4615232216dd3bba09fa9fba6815df90601411c/black-21.7b0-py3-none-any.whl (141 kB)
Requirement already satisfied: appdirs in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (1.4.4)
Requirement already satisfied: toml>=0.10.1 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.10.2)
Requirement already satisfied: regex>=2020.1.8 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (2021.10.21)
Requirement already satisfied: pathspec<1,>=0.8.1 in /Users/jeffery/Library/Python/3.9/lib/python/site-packages (from black) (0.8.1)
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ black .
All done! ✨ 🍰 ✨
17 files left unchanged.
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ git status
On branch split_p
Your branch is up to date with 'origin/split_p'.

nothing to commit, working tree clean
jeffery@jeffery-MBP ~/repos/bilingual_book_maker (split_p) $ 

@yihong0618
Owner

No worries, I will take a look tonight or tomorrow.

wayhome pushed a commit to wayhome/bilingual_book_maker that referenced this pull request Aug 29, 2024
2 participants