added encoder to whisper function in LLMWhisperClient #11

ShoaibMajidDar · 2024-08-14T05:21:26Z

When processing documents in Arabic, the extracted Arabic text came back garbled.
I added an encoder argument to the whisper function in LLMWhisperClient. Interpreting the response as UTF-8 resolved the issue and returned the Arabic text correctly. The encoder defaults to ISO-8859-1, but it can now be set as needed.
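
To illustrate the symptom described above, here is a minimal sketch that needs no network access. The helper name decode_body and the sample string are hypothetical and are not taken from client.py; the sketch only shows why UTF-8 bytes read as ISO-8859-1 come out garbled and why choosing UTF-8 fixes it.

```python
# Hypothetical illustration; decode_body is NOT part of the LLMWhisperer client.
ARABIC_SAMPLE = "مرحبا بالعالم"

# Bytes as a server would send them when the text is UTF-8 encoded.
raw_body = ARABIC_SAMPLE.encode("utf-8")


def decode_body(raw: bytes, encoding: str = "ISO-8859-1") -> str:
    """Decode a raw response body using the caller-selected encoding."""
    return raw.decode(encoding)


# ISO-8859-1 maps every byte to one character, so decoding never fails,
# but multi-byte UTF-8 sequences turn into unrelated Latin characters.
garbled = decode_body(raw_body)

# Decoding with the encoding the bytes were actually written in
# recovers the original text.
correct = decode_body(raw_body, encoding="utf-8")

print(garbled == ARABIC_SAMPLE)
print(correct == ARABIC_SAMPLE)
```

With the requests library, the equivalent step is usually to set the encoding attribute of the response before reading its text; whether the client does exactly that is an assumption here.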

src/unstract/llmwhisperer/client.py

hari-kuriakose · 2024-10-15T12:55:31Z

@ShoaibMajidDar Thanks for the contribution!
This would really help. Just a couple of minor suggestions; the rest looks good.

jaseemjaskp

LGTM

src/unstract/llmwhisperer/client.py

chandrasekharan-zipstack · 2024-10-30T11:57:52Z

Ideally, we should forward the encoding in the request headers so that LLMWhisperer itself understands it and the requests library then handles it. The current approach handles UTF-8 correctly, which should cover most use cases. Any requirement to support other encoding schemes will be tackled properly in both the client and the server in the future.
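
As a rough sketch of the header-driven approach suggested here, a client could read the charset that the server declares in its Content-Type header and fall back to a default only when none is declared. The function name charset_from_content_type is hypothetical, and nothing below reflects how LLMWhisperer actually sets or reads its headers; it uses only the Python standard library.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = "ISO-8859-1") -> str:
    """Return the charset declared in a Content-Type value, or a default.

    Hypothetical helper for illustration only.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset parameter in lower case,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset() or default


# A server that declares its charset lets the client decode correctly
# without any extra argument.
declared = charset_from_content_type("text/plain; charset=UTF-8")

# Without a declared charset, the client has to fall back to a default.
fallback = charset_from_content_type("text/plain")
```

The design point is that the encoding travels with the response, so the caller does not need to know it in advance.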

hari-kuriakose · 2024-10-30T19:03:06Z

@chandrasekharan-zipstack Agree, let's take up the improvements as required later.

added encoder to whisper function in LLMWhisperClient

5846fc5

hari-kuriakose reviewed Oct 15, 2024

src/unstract/llmwhisperer/client.py (outdated)

hari-kuriakose reviewed Oct 15, 2024

src/unstract/llmwhisperer/client.py (outdated)

hari-kuriakose requested review from chandrasekharan-zipstack and jaseemjaskp October 15, 2024 12:55

ShoaibMajidDar added 2 commits October 17, 2024 07:06

changed encoder to encoding and added encoding to whisper_status

c6e1b50

changed encoder to encoding and added encoding to whisper_status

f681656

jaseemjaskp approved these changes Oct 17, 2024

chandrasekharan-zipstack reviewed Oct 28, 2024

src/unstract/llmwhisperer/client.py (outdated)

added encoding to whisper_retrieve and removed it from whisper_status

be02721

chandrasekharan-zipstack approved these changes Oct 30, 2024

hari-kuriakose merged commit 2929529 into Zipstack:main Oct 30, 2024
1 check failed
